Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process Known Single-Character Lexemes Immediately #16

Closed
wants to merge 4 commits into from

Conversation

pydsigner
Copy link

Addresses #15 by checking for the existence of terminal lexemes.

@pydsigner
Copy link
Author

Hmm looks like this is going to be more complicated than I'd hoped.

@pydsigner
Copy link
Author

It's worth noting that this PR doesn't solve this issue for binary inputs; the codecs wrapper is unfortunately greedy in its own right.

@pydsigner
Copy link
Author

Looks like the tests aren't increasing coverage as much as expected. Odd.

Another note on these tests: they've uncovered something else really oddly ugly; suppressed UnexpectedSymbol errors that only come up if I iterate over items less.

'[1, "abc"' produces said suppressed error if next() is called on backend.python.items() once, but not if it's called twice.

@rtobar
Copy link

rtobar commented Nov 21, 2019

Hi,

I'm off until next week, so I won't be able to pay close attention to this until then. Coverage is bogus (don't know why), so don't mind it too much. Is this new code effectively faster? You could try running the benchmarks to get an idea.

@pydsigner
Copy link
Author

pydsigner commented Nov 21, 2019

@rtobar I wouldn't expect much difference under normal conditions and the benchmarks seem to support this.

> git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
> python benchmark.py
#mbytes,test_case,backend,time,mb_per_sec
0.191, long_list, python, 0.681, 0.280
1.886, big_int_object, python, 1.409, 1.338
1.801, big_null_object, python, 1.264, 1.425
1.849, big_bool_object, python, 1.274, 1.451
2.649, big_str_object, python, 1.467, 1.805
8.000, big_longstr_object, python, 1.532, 5.221
19.264, object_with_10_keys, python, 15.684, 1.228

> git checkout patch-1 
Switched to branch 'patch-1'
Your branch is up to date with 'origin/patch-1'.
> python benchmark.py
#mbytes,test_case,backend,time,mb_per_sec
0.191, long_list, python, 0.669, 0.285
1.886, big_int_object, python, 1.412, 1.336
1.801, big_null_object, python, 1.274, 1.414
1.849, big_bool_object, python, 1.281, 1.443
2.649, big_str_object, python, 1.453, 1.823
8.000, big_longstr_object, python, 1.516, 5.276
19.264, object_with_10_keys, python, 15.524, 1.241

rtobar added a commit that referenced this pull request Nov 26, 2019
The new set of tests introduced in the previous commit highlighted a
corner case that wasn't properly covered, and that was mentioned in #16:
the yield expression within the try/expect block can also actually raise
an exception if the generator is being destroyed before it is consumed
fully (e.g., if manually iterated over via next() and then discarded).
In these cases an UnexpectedSymbol exception would be raised during the
destruction of the generator object, which would be ignored by the
interpreter.

The solution is simple: include *only* the number parsing logic within
the try/catch, leaving the yield expression outside to get a normal
logic flow.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
@rtobar
Copy link

rtobar commented Nov 26, 2019

Closing, although all changes made it to the repo via manual cherry picking plus some adjusting.

@rtobar rtobar closed this Nov 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants