Add mini-language standalone parser for observe #1066
The generated 2000+ lines parser file
The grammar file needs to be added to the project manifest file so that it gets bundled with source distributions. It doesn't need to be added to the package data in setup.py, however.
Other than that, it looks good.
etstool.py (outdated)
"{grammar_path}"
)
with open(out_path, "w") as out_file:
Is there a command-line option to get lark to output to a file? That would be preferable to capturing stdout.
I checked and unfortunately no:
https://github.com/lark-parser/lark/blob/master/lark/tools/standalone.py#L87
Well spotted about the source distribution. I will add the file to MANIFEST.in.
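For reference, capturing stdout and writing it to a file could look roughly like the sketch below. This is not the code in etstool.py; `write_command_stdout` and `standalone_command` are hypothetical helper names, and the module invocation is an assumption based on the `lark/tools/standalone.py` path linked above.

```python
import subprocess
import sys
from pathlib import Path


def write_command_stdout(cmd, out_path):
    """Run ``cmd`` and write whatever it prints on stdout to ``out_path``.

    Hypothetical helper: the tool has no output-file option, so the
    caller has to capture stdout and write the file itself.
    """
    result = subprocess.run(
        cmd,
        check=True,              # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,  # capture stdout as bytes
    )
    # Writing bytes stores the output exactly as the tool emitted it.
    Path(out_path).write_bytes(result.stdout)
    return len(result.stdout)


def standalone_command(grammar_path):
    # Assumed invocation: run lark's standalone generator as a module.
    return [sys.executable, "-m", "lark.tools.standalone", str(grammar_path)]
```

With lark installed, something like `write_command_stdout(standalone_command(grammar_path), out_path)` would then produce the parser file; both paths are placeholders here.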
LGTM
@@ -0,0 +1,54 @@
// Grammar for Traits Mini Language used in observe
Just realized that nowhere in the file does it say what this lark file is for (it is only implicit in the filename and context), so I added this line.
Could we also record somewhere the exact command-line command needed to generate the parser from the grammar? (We probably eventually want that wrapped up in an etstool task, too, but that's separate.)
Sorry, ignore. I should have looked harder at the PR first. (/me crawls back into his hole)
Not at all, you helped me realize that I misspelt the filename for etstool.py as etstools.py on line 3. Now fixed. Waiting for CI...
etstool.py (outdated)
"{grammar_path}"
)
with open(out_path, "w") as out_file:
Please include an explicit encoding here.
(related: https://www.python.org/dev/peps/pep-0597/)
Well spotted, thanks.
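To illustrate the request, a minimal sketch of the same `open` call with the encoding spelled out. The helper name `write_generated_source` is made up for this example, and UTF-8 is an assumed choice rather than something stated in the thread.

```python
import tempfile
from pathlib import Path


def write_generated_source(out_path, source):
    """Write ``source`` to ``out_path`` as UTF-8 text (hypothetical helper)."""
    # Passing the encoding explicitly means the result does not depend
    # on the platform's locale default, the concern raised in PEP 597.
    with open(out_path, "w", encoding="utf-8") as out_file:
        out_file.write(source)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / "parser_stub.py"
        write_generated_source(target, "# generated file\n")
        print(target.read_text(encoding="utf-8"))
```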
It appears that the lark output can differ from run to run. Do you know if there's any way to control this? Reproducibility would be good.
From lark-parser/lark#278, it looks as though the answer for now is "not really, no".
Hmm. But that issue's closed, and apparently fixed. I'm not sure what's going on here.
Codecov Report
@@ Coverage Diff @@
## master #1066 +/- ##
==========================================
- Coverage 76.15% 72.80% -3.35%
==========================================
Files 54 63 +9
Lines 6493 8036 +1543
Branches 1263 1538 +275
==========================================
+ Hits 4945 5851 +906
- Misses 1205 1806 +601
- Partials 343 379 +36
I can reproduce the different outputs with lark 0.8, Python 3.8 (where dictionaries are ordered), even with PYTHONHASHSEED set to a fixed value. No idea why.
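One way to check this kind of run-to-run difference is sketched below: run the generator command several times with a fixed PYTHONHASHSEED and compare digests of stdout. The helper `stdout_digests` is hypothetical and is not part of this PR or of lark.

```python
import hashlib
import os
import subprocess
import sys


def stdout_digests(cmd, runs=3, hash_seed="0"):
    """Run ``cmd`` ``runs`` times and return the set of SHA-256 digests of stdout.

    A single digest in the result means every run produced identical output.
    """
    # Fix the hash seed for the child process so that hash randomization
    # is ruled out as the source of any difference.
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    digests = set()
    for _ in range(runs):
        result = subprocess.run(
            cmd, check=True, stdout=subprocess.PIPE, env=env
        )
        digests.add(hashlib.sha256(result.stdout).hexdigest())
    return digests
```

For the case discussed here, `cmd` would be whatever command generates the parser from the grammar; more than one digest in the returned set would confirm that the generated code is not stable.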
Merging. It is not clear if there is a trivial way to get the output consistent and it does not seem important enough to block the parser from going in...
Agreed, but please could you open an issue? We may want to report upstream too, but a local issue would be the first step for that.
I typed and removed ("will open an issue upstream") because I did not want to promise that - I will need to work out a reproducible example first.
Yep, just a local issue would be fine; that issue will act as a trigger for us to send an upstream issue when we're ready.
Opened lark-parser/lark#584. I think lark-parser/lark#278 is about the parsed syntax tree showing nondeterministic behaviour. Here the nondeterminism is in the generated code itself: it changes between runs and creates noisy diffs even though the grammar is the same.
@kitchoi Thank you! One option we might consider is not tracking the parser code in version control, but instead generating it at package install time (i.e., as a result of running the
Part of item 5 in #977
This PR adds a standalone parser for the observe mini-language, generated with lark-parser.
Tests are added to verify that the parser can distinguish valid text from invalid text. The parser converts text into a syntax tree. Further interpretation of the syntax tree will require item 4 (Expression) in #977, which has not been introduced yet. Here is an example of how it will be used: traits/traits/observers/parsing.py, lines 139 to 160 in 318f284.
The standalone parser is for internal use only. A user-facing one will follow once Expression is finalized.
Checklist
Update API reference (docs/source/traits_api_reference)
Update User manual (docs/source/traits_user_manual)
Update type annotation hints in traits-stubs