Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrote parser in tril #4345

Merged
merged 1 commit into from Oct 25, 2019
Merged

Rewrote parser in tril #4345

merged 1 commit into from Oct 25, 2019

Conversation

@Yuehan-Lin
Copy link
Contributor

Yuehan-Lin commented Sep 23, 2019

Used cpp file instead of yacc and lex.
In the parser.cpp, first used lexer to scanned and tokenlized the tril input, then parsed tokens with grammar rules to build AST.

Fixes: #4489

Signed-off-by: Yuehan-Lin Yuehan.Lin@ibm.com

@Yuehan-Lin Yuehan-Lin force-pushed the Yuehan-Lin:parser branch 2 times, most recently from c9b792d to 673f85c Sep 23, 2019
@fjeremic

This comment has been minimized.

Copy link
Contributor

fjeremic commented Sep 23, 2019

We're going to need some Doxygen documentation here. There is not a single comment in the entire file that was added. We'll also need a more descriptive commit message for this rather large change.

Taking a quick glance here it seems the lexer and parser are implemented using a bottom-up approach with an LR(x) parser. What is the value of x? Looking at the code I'm a little surprised we went with an LR parser to parse Tril, given that the grammar for the language is very Lisp like so it is trivially parsable with a recursive descent LL(2) parser which is easy to read (from a code perspective) and to extend to support more features in Tril. Will the current implementation be easy to extend to parse things like node flags, etc.?

Having said that the lexer and parser here is quite concise, so kudos on that! Good work!

@fjeremic

This comment has been minimized.

Copy link
Contributor

fjeremic commented Sep 23, 2019

@genie-omr build all

@Leonardo2718

This comment has been minimized.

Copy link
Contributor

Leonardo2718 commented Sep 23, 2019

Sorry, I accidentally close this PR while writing this comment.

We're going to need some Doxygen documentation here. There is not a single comment in the entire file that was added. We'll also need a more descriptive commit message for this rather large change.

Completely agree. In particular, we should document the Tril grammar so it's easy to see what the parser is looking for.

Taking a quick glance here it seems the lexer and parser are implemented using a bottom-up approach with an LR(x) parser. What is the value of x? Looking at the code I'm a little surprised we went with an LR parser to parse Tril, given that the grammar for the language is very Lisp like so it is trivially parsable with a recursive descent LL(2) parser which is easy to read (from a code perspective) and to extend to support more features in Tril.

Actually, the parser is already in recursive decent form. Parsing is top-down, but the AST is built bottom-up. The distinction is that the decision of which production rule to apply is done in a top-down fashion. That is, we figure out which AST nodes to construct top-down. Once we know which node to construct, instead of constructing it right away, we wait for the children to be constructed. Basically instead of

if (token == '(') {
  auto node = new Node();
  node->children = parseChildren();

we do

if (token == '(') {
  auto children = parseChildren();
  auto node = new Node(Children());

The point in the code where we figure out what node to custruct is the same, it's just where we construct it that's different.

That said, I do think the current implementation has room for simplification.

Will the current implementation be easy to extend to parse things like node flags, etc.?

This shouldn't be a problem. As with the old parser, the AST is already expressive enough to be able to represent things such as node flags. What will have to change is the IL generator, which will have to recognize new patterns in the AST.

@fjeremic

This comment has been minimized.

Copy link
Contributor

fjeremic commented Sep 24, 2019

Thanks for the explanation @Leonardo2718. This makes sense. What threw me off was that I was expecting to see a more canonical recursive descent implementation with function calls implementing each reduction rule. The current implementation makes this much more concise though.

Let's fix the compile errors seen in the build and in meanwhile we can start reviewing the code.

@Yuehan-Lin Yuehan-Lin force-pushed the Yuehan-Lin:parser branch from 673f85c to 30d75a6 Sep 24, 2019
@fjeremic

This comment has been minimized.

Copy link
Contributor

fjeremic commented Sep 24, 2019

@genie-omr build all

Copy link
Contributor

Leonardo2718 left a comment

Here is my intial review. First, some general comments:

  1. Please reword the commit and PR titles to be in the imperative mood. Also having a high-level description of what this change does and why would be good to have in the commit message.
  2. Please add comments for:
    • the Tril grammar; this can just be a block comment somewhere near the start of a file or just before the first function of the parser.
    • the various functions defined using Doxygen format.
    • all occurrences of tokenIt++ so we know which token is being consumed.
  3. Flex/Lex and Bison/Yacc should be removed as dependencies from the CI build scrips.
  4. In other parts of the code, there is no space between identifier and template parameters (i.e. foo<int> instead of foo <int>). Please update the usage of templates to conform to that style.
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Show resolved Hide resolved
Copy link
Contributor

Leonardo2718 left a comment

A few more notes:

  1. I think it would be really helpful to specify in the comments of the different parse functions what the state of the iterator is after the function returns. For example, the comments for buildAST() could say something like "When the function returns, the token iterator points to the token immediately after the ')' of the parsed s-expression." Also, document what exceptions can be thrown would be helpful.
  2. There are still spaces that should be removed between the name of templates and the template arguments.
  3. There are several try-catch blocks that should be removed because there is no way to recover from the errors they are catching.
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
@Yuehan-Lin Yuehan-Lin force-pushed the Yuehan-Lin:parser branch 2 times, most recently from 79c4baa to 2f9747e Oct 8, 2019
@fjeremic

This comment has been minimized.

Copy link
Contributor

fjeremic commented Oct 9, 2019

@Yuehan-Lin a suggestion which may make it much easier for reviewers to review the code is to add individual commits addressing review comments and post them in the corresponding review thread. If you paste the SHA and say something like "Fixed in 79c4baa" for example, GitHub will hyperlink the commit which addresses the review and it gives reviewers a much smaller scope to check.

If commits are being force pushed to the PR branch it can get very difficult for a reviewer to find what exactly changed between the last time they reviewed and the next. Sometimes changes will get lost and new code not properly reviewed. If changes addressing reviews are split up into commits it makes it very easy to follow. We can then squash unneeded commits at the end before a final merge.

Copy link
Contributor

Leonardo2718 left a comment

Just a few minor comments.

fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
fvtest/tril/tril/parser.cpp Outdated Show resolved Hide resolved
@Yuehan-Lin Yuehan-Lin force-pushed the Yuehan-Lin:parser branch from fdb8b67 to 60f5d7f Oct 25, 2019
Copy link
Contributor

Leonardo2718 left a comment

I think this is a great change! Thanks @Yuehan-Lin!

There are still opportunities for further improvement but I think it's best to leave that work for subsequent PRs (I need to open issues for this). As is, the change is already a significant improvement over the existing solution and will enable other work once merged.

@fjeremic I'm planning on merging this by EOD today so I am kindly requesting for your approval before then. 🙂

@Yuehan-Lin Yuehan-Lin force-pushed the Yuehan-Lin:parser branch from 60f5d7f to 1f59b97 Oct 25, 2019
Signed-off-by: Yuehan-Lin <Yuehan.Lin@ibm.com>
@Yuehan-Lin Yuehan-Lin force-pushed the Yuehan-Lin:parser branch from 1f59b97 to 393044a Oct 25, 2019
@Leonardo2718

This comment has been minimized.

Copy link
Contributor

Leonardo2718 commented Oct 25, 2019

@genie-omr build zos

@Leonardo2718

This comment has been minimized.

Copy link
Contributor

Leonardo2718 commented Oct 25, 2019

@genie-omr build all

@Leonardo2718 Leonardo2718 self-assigned this Oct 25, 2019
Copy link
Contributor

fjeremic left a comment

Looks much better. Thanks @Leonardo2718 for the thorough review! Agreed we can address other improvements in separate PRs as merging this work unblocks other things. Great work everyone!

@Leonardo2718 Leonardo2718 merged commit 7caf559 into eclipse:master Oct 25, 2019
14 checks passed
14 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/eclipse-omr/pr/aix_ppc-64 Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_390-64 Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_aarch64 Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_arm Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_ppc-64_le_gcc Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_x86 Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_x86-64 Build finished.
Details
continuous-integration/eclipse-omr/pr/linux_x86-64_cmprssptrs Build finished.
Details
continuous-integration/eclipse-omr/pr/osx_x86-64 Build finished.
Details
continuous-integration/eclipse-omr/pr/win_x86-64 Build finished.
Details
continuous-integration/eclipse-omr/pr/zos_390-64 Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
eclipsefdn/eca The author(s) of the pull request is covered by necessary legal agreements in order to proceed!
Details
fjeremic added a commit to fjeremic/omr that referenced this pull request Oct 28, 2019
With the merge of eclipse#4345 we no longer depend on Flex and Bison being
installed for our testing. AppVeyor seems to be broken at the moment in
that checksums are failing when installing this package. We fix the
issue by removing the dependency as it is no longer needed.

Signed-off-by: Filip Jeremic <fjeremic@ca.ibm.com>
fjeremic added a commit to fjeremic/omr that referenced this pull request Oct 28, 2019
With the merge of eclipse#4345 we no longer depend on Flex and Bison being
installed for our testing. AppVeyor seems to be broken at the moment in
that checksums are failing when installing this package. We fix the
issue by removing the dependency as it is no longer needed.

Signed-off-by: Filip Jeremic <fjeremic@ca.ibm.com>
AidanHa added a commit to AidanHa/omr that referenced this pull request Nov 4, 2019
With the merge of eclipse#4345 we no longer depend on Flex and Bison being
installed for our testing. AppVeyor seems to be broken at the moment in
that checksums are failing when installing this package. We fix the
issue by removing the dependency as it is no longer needed.

Signed-off-by: Filip Jeremic <fjeremic@ca.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.