New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore SyntaxError while parsing #214
Conversation
Could you add tests for this, please? |
@sigmavirus24: Sure thing. 9615c72 adds a test for the fix in dc34871. This also required making a fix in the test harness: 839b528. |
try: | ||
return next(self._generator, None) | ||
except SyntaxError as syntax_error: | ||
self.log.warning('error generating tokens: {0}', syntax_error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of capturing the exception and passing it, you can do:
self.log.warning('error generating tokens', exc_info=True)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good shout.
toto = 0 + 0 | ||
#: | ||
""" # noqa | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this huge snippet? You can create a syntax error with a single character (like "[").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This snippet is the test file that was reported to cause issues in #168, but yes, we can minimize the test example. Turns out that the single invalid token (e.g. "[" like you mentioned) leads to a different exception: tk.TokenError so I added that as a test case too.
@Nurdok: Addressed comments and resolved merge conflicts. |
pass | ||
""" | ||
|
||
source_token_error = """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this to be a multi-line string? Why not just source_token_error = "["
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
def _next_from_generator(self): | ||
try: | ||
return next(self._generator, None) | ||
except (SyntaxError, tk.TokenError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also add this bugfix to the release notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a line to docs/release_notes.rst
.
Commit e886115 introduced some tests that intend to iterate over strings that contain definitions of invalid source code and then assert that the parser does something sensible with these sources. However, there was a typo in e886115 causing the tests to not execute as expected: the invalid sources are iterated in the `source_ucli` variable. However, the parsing is done against the `source_ucl` variable which contains a known parsable source string!
@Nurdok: Addressed the latest set of comments, added the fix to the release notes and did some commit squashing because the pull request was getting rather long. |
@@ -11,6 +11,10 @@ Major Updates | |||
|
|||
* Support for Python 2.6 has been dropped (#206, #217). | |||
|
|||
Bug Fixes | |||
|
|||
* Made parser more robust to bad source files (#168) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include the pull request as well. i.e., this should be:
* Made parser more robust to bad source files (#168, #214).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Added.
This is a partial fix for #168
Merged. Thanks! |
This is a partial fix for #168. With this change, the parser no longer crashes on test file E90.py.
Note that this is a pretty minimal fix. A potential future refactor to increase the robustness of the parser could be to use the null-object pattern and return a NullToken from
TokenStream.move
instead ofNone
. In this way, any code that consumesTokenStream.current
doesn't have to null-check the property value explicitly which should make the other issues mentioned in #168 go away.