Adding to assignment in grammar #25
Lgtm, thanks! |
I am a bit concerned about the size of the generated parser. The grammar update after this PR (8d8a9c9) increased the size of Any ideas on what makes the parser so large or how we could slim it down? cc @aryx |
Good point, I'm not sure why either. tree-sitter-c# is also pretty big, but still smaller at 18 MB. |
I remember the author of tree-sitter told me to look out for the number of states in parser.c, and it seems to be |
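For anyone who wants to track that number over time, here is a minimal sketch of reading it out of the generated source. It assumes parser.c contains a line of the form `#define STATE_COUNT <number>`. The helper name `readDefine` and the sample text are made up for illustration; only the 4119 figure comes from this thread.

```javascript
// Hypothetical helper: return the numeric value of a `#define NAME <number>`
// line in generated C source, or null if no such line exists.
function readDefine(source, name) {
  const pattern = new RegExp(`^#define\\s+${name}\\s+(\\d+)`, "m");
  const match = source.match(pattern);
  return match ? Number(match[1]) : null;
}

// Illustrative stand-in for the top of a generated parser.c.
// The values other than STATE_COUNT are placeholders, not real measurements.
const sample = [
  "#define LANGUAGE_VERSION 13",
  "#define STATE_COUNT 4119",
  "#define LARGE_STATE_COUNT 10",
].join("\n");

console.log(readDefine(sample, "STATE_COUNT"));
```

Against a real checkout, the same function could be fed the file contents, for example via `fs.readFileSync("src/parser.c", "utf8")`, and the result compared before and after a grammar change.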
Maybe @maxbrunsfeld knows some tricks to reduce the number of states? Max, do you see anything obviously wrong in the grammar.js for kotlin? |
After some investigation, I found that b651419 seems to have bumped the parser size from 2.4 MB to 30.9 MB. Perhaps control structures in Kotlin are simply hard to parse? The corresponding keywords may indicate very different constructs in different contexts, since they could be either statements or (possibly nested) expressions. |
I wonder if using the 'word:' directive in grammar.js would help, |
Ok this reduces parser.c to 8MB (from 48MB):
|
and the STATE_COUNT is at 4119 now. Better! |
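For readers unfamiliar with the directive, this is roughly what a `word` entry looks like in a grammar.js. It is a toy grammar, not the actual change made to the Kotlin grammar: the grammar name, the rule names, and the `val` keyword rule are all invented for illustration. The fallback definitions at the top are only there so the sketch loads under plain Node; inside the tree-sitter CLI the DSL functions are already provided.

```javascript
// Assumption: the tree-sitter CLI supplies `grammar` and `seq` as globals.
// These fallbacks are illustration-only stand-ins for running outside the CLI.
const grammar = globalThis.grammar ?? ((definition) => definition);
const seq = globalThis.seq ?? ((...members) => ({ type: "SEQ", members }));

const toyGrammar = grammar({
  name: "toy",

  // `word` names the token that keywords are extracted from: the lexer
  // matches this token first and then checks whether it is a keyword.
  word: ($) => $.identifier,

  rules: {
    source_file: ($) => $.declaration,

    // "val" is a keyword that also matches the `identifier` pattern,
    // which is the situation keyword extraction is designed for.
    declaration: ($) => seq("val", $.identifier),

    identifier: ($) => /[a-zA-Z_][a-zA-Z0-9_]*/,
  },
});

if (typeof module !== "undefined") module.exports = toyGrammar;
```

Whether this kind of change shrinks the state count for a given grammar has to be measured after regenerating the parser; the numbers above are specific to this repository.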
Added in 456c100, thanks! |
Nice, glad you solved it! |
this is awesome! |
Hmm, that was too easy. Apparently the word: directive reduces the size of parser.c, but we now get more parsing errors on our corpus. |
Also @fwcd, would it be possible for you to run some regression testing before some commits, like running tree-sitter stat |
At R2C we were thinking at some point of adding some infrastructure to make it easy for tree-sitter projects to run those parsing regression stats in CI, but we're not there yet. |
Ideally we would add the critical snippets that currently don't parse correctly to the corpus in this repo, to ensure that we don't regress on anything. But I agree, running the bigger corpus, preferably in CI, would be great. |
I looked into this a bit more, and found two test cases that are erroring now that weren't erroring before: 1):
Before, this parsed to:
Now it parses to:
2):
Before, this parsed to:
Now it parses to:
|
@colleend The failure in the second example was a regression that I mistakenly introduced in 5f18577. Apparently, tree-sitter assigns literal strings a higher precedence than regexes, causing it to never consider the conflict between Regarding your first example, I am not sure whether that is the correct parse in the first place. Shouldn't it be a single call expression? |
Interesting! Didn't know that tree-sitter did that -- thanks for fixing that 🙏 On the topic of the first example -- good point, that should definitely be a single call expression (probably something like
I'll look into this case and see what we can change to make this parsing correct! |
Hmm something weird is this |
Fixes errors where lines like
this.foo = bar
don't parse.
To test: use
The
this assignment
test should pass.
Pad's Comment: