-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parse empty lines and comments #8
Conversation
Thanks for sending this. I'm curious how GNU bash handles new lines, Can you add some tests?
|
I did some thinking prior to making this and on line 388 they do create a NULL command and yacc accept: The "None" approach would not really be viable, because a pointer can be of specific type, but None is of no type, so I assumed this to be error-prone. Also, in parser.py around lines 600-620, the index is being set and should there be None, the position would have to be remembered differently (not to mention testing for None-ness). Perhaps this has to do with actually executing the code? |
Maybe. I tried a more naive approach:
But then started thinking about recursive parsing and indeed this fails when there's a newline in a command substitution, try: I'll still merge this, but let's see if we can fix this case too. |
Ok, I think I got it working. It deviates from bash a little because I got tired of debugging it. Can you try https://github.com/idank/bashlex/tree/newlines |
I played with it a bit and one assert (
It works without the assert, though. The two issues are barely related, so you could still combine the commits and both parse newlines and comments and fix the newline in substitution problem. I am not trying to enforce my changes, on the contrary, I'd love to see comment-parsing being handled non-trivially, it's just that each and every file that I have to parse contains comments and having a whole different version of the parser is not in my interest. Thanks for the advice, I would not dare alter your former test cases. |
What do you mean have a whole different version of the parser? |
I added my proposition for the combination of the two solutions. It should be able to parse all of the aforementioned problems: newlines, comments and newlines in command substitution. What I meant is that if you choose to go with your newlines version and disregard comments altogether I'd have to use my approach for comment-parsing and digress from the mainstream bashlex and I would like to avoid this. |
Squashed into a single commit for your convenience~ |
My changes shouldn't ignore comments. I made them in haste, I'm sure it's fixable. Does your original patch work with new lines in substitutions? Changing the grammar the way you did is a little scary and it's hard to see what else might be affected by it. |
No, my original patch does not regard new lines in substitutions as that was in no way related to what I was trying to accomplish. And I didn't know such problem existed. Your proposed version by itself also seems not to parse |
That's easy to fix. What I meant by touching the grammar is this line If you want to refine my patch, feel free. Otherwise I'll work on it when I have the time. |
FYI I tried this PR with moderate complexity multiline bash files and it doesn't work. |
Did you debug further? What was giving it trouble? |
Can you provide an example? I analysed the testsuite of the abrt project (some 100+ files) and only a handful of tests failed to parse for reasons unrelated to newline parsing. |
I'll try again, gimme a min |
btw can you merge master to your branch? You're 18 commits behind, and have conflicts here |
so it fails on this script: #!/bin/bash -xe
docker build --pull --tag s2-build .
docker run -it --rm -v `pwd`:/s2 s2-build with error:
|
fixed by ea0c135 |
Allows to parse whole files.
The idea is that I force-reduce on inputunit -> NEWLINE rule and accept by creating a dummy newline node that only retains position (needed for further parsing). These nodes are filtered in the end to prevent output such as:
NewlineNode
NewlineNode
CommandNode
...
I handpicked the relevant changes, so I hope I didn't miss anything.