store strings instead of tokens in nodes #6
I'll take a look at the CI failure a bit later today.
d26b3d8 to 65da2bb
65da2bb to 7960c48
@kddnewton I understand that you are still busy with your other work, so no rush. The PR is ready, but it can wait, of course.
I'm pretty sure that nodes shouldn't be based on tokens in general. It does make sense in many cases (especially for things like keywords or literals), but there are cases where the location should be computed dynamically, for example here:
def foo(bar:)
end
^ the token is
On the other hand, I understand that it's much easier to embed tokens, and maybe we could go with this approach at first and then slowly change it to be manual. I suppose right now there are more obvious things that must be solved first (like the absence of the whole parser lol). This is simply a thing that instantly caught my eye, but I don't know if we should focus on it right now.
So in general I'm keen to reduce the depth of the tree, which this accomplishes, so I'm definitely open to the idea. However, on the Ruby side of things, I definitely don't want to lose as much information as this PR would cause it to. As a simple example, with the
That being said, yeah, we're definitely going to run into limits with the templating at some point. I don't think it's going to be a long-term solution for all of our AST building. It gets especially hairy with the location information as things become more and more optional. I'm not sure of the right solution yet, but I definitely want to leave this open for a while so we can keep talking about it.
Sorry for not being clear. My goal is to keep all information about all locations as well. This PR is more about changing the rules of how nodes are constructed. Of course, in its current form it introduces quality degradation, for example
I've sent this PR to change templates and node constructors, and I think it makes sense to replace tokens with strings + locations (similar to how other parsers do it), but not in a single PR 😄. It's quite big as it is, and I'm not even sure it's possible to review it easily.
Yeah I definitely agree about tracking strings. We're going to need to do this eventually, I just don't think we can do it just yet.
Ok, I'm closing it for now. Honestly I think we can iterate much faster with tokens that are not 100% precise in a few cases (like those mentioned above). Later we can re-open it once we have a parser that works 😄
We need to free the current_block_exits in parse_program when we're done
with it to prevent memory leaks. This fixes the following memory leak detected
when running Ruby using `RUBY_FREE_AT_EXIT=1 ruby -nc -e "break"`:
Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x5bd3c5bc66c8 in realloc (miniruby+0x616c8) (BuildId: ba6a96e5a060aec6fd9f05ed7e95d9627e1dbd74)
#1 0x5bd3c5f91fd9 in pm_node_list_grow prism/templates/src/node.c.erb:35:40
#2 0x5bd3c5f91e9d in pm_node_list_append prism/templates/src/node.c.erb:48:9
#3 0x5bd3c6001fa0 in parse_block_exit prism/prism.c:15788:17
#4 0x5bd3c5fee155 in parse_expression_prefix prism/prism.c:19221:50
#5 0x5bd3c5fe9970 in parse_expression prism/prism.c:22235:23
#6 0x5bd3c5fe0586 in parse_statements prism/prism.c:13976:27
#7 0x5bd3c5fd6792 in parse_program prism/prism.c:22508:40
Per #5.
Also, I think the format of the AST is redundant in many places. For example, there's no need to store the actual value for constant tokens like `while` or `colon`, because they are constant. It makes more sense to simply store their location in the source input (just like whitequark/parser does).
Dynamic values should be stored, of course: things like `IDENTIFIER` must be stored both by value and by location (whitequark/parser stores them as separate fields like `name` and `name_l`).
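To make the proposal concrete, here is a minimal Ruby sketch of the idea, not the project's actual node definitions. `Location`, `WhileNode`, and `IdentifierNode` are hypothetical names invented for illustration; only the `name` / `name_l` field pair mirrors the naming mentioned above. A constant token keeps just a location, while a dynamic value keeps both its string and its location.

```ruby
# Hypothetical types for illustration only; none of these names come from the project.
Location = Struct.new(:start_offset, :end_offset) do
  # Recover the covered text from the original source on demand.
  def slice(source)
    source[start_offset...end_offset]
  end
end

# A constant token (the `while` keyword) is stored by location only:
# its text is always "while", so keeping the string would be redundant.
WhileNode = Struct.new(:keyword_loc, :predicate)

# A dynamic value is stored both by value (name) and by location (name_l).
IdentifierNode = Struct.new(:name, :name_l)

source = "while foo do end"
predicate = IdentifierNode.new("foo", Location.new(6, 9))
node = WhileNode.new(Location.new(0, 5), predicate)

puts node.keyword_loc.slice(source)      # keyword text recovered from the source
puts node.predicate.name                 # stored value
puts node.predicate.name_l.slice(source) # same text, recovered via its location
```

The trade-off in this sketch is that consumers need access to the source string to recover the text of constant tokens, in exchange for smaller nodes and a shallower tree.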