Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ParseResults parameterizes generic types if passed as the value of toklist parameter #276

Closed
robes opened this issue Apr 22, 2021 · 6 comments

Comments

@robes
Copy link

robes commented Apr 22, 2021

If ParseResults is constructed with a type given for the toklist parameter, it will be unintentionally parameterized in the ParseResults instance if the type supports generic type hinting. Since standard collections support generics as of Python 3.9, the issue can be demonstrated with any of the collections classes. The issue also applies to any user-defined class that supports generics , for example subclasses of the collections classes.

Expected: the unaltered value passed to toklist to be present in the named value of the ParseResults.

Actual: a parameterized version of the original type passed to toklist.

Steps to reproduce, test.py:

from pyparsing import ParseResults
print(int, ParseResults(toklist=int, name='type_', asList=False)['type_'])
print(tuple, ParseResults(toklist=tuple, name='type_', asList=False)['type_'])

On Python 3.9:

% python3.9 test.py
<class 'int'> <class 'int'>
<class 'tuple'> tuple[0]

On Python 3.8:

% python3.8 test.py
<class 'int'> <class 'int'>
<class 'tuple'> <class 'tuple'>

It seems like the underlying issue is a brittle test of whether a subscript will raise a TypeError (among others).

On branch pyparsing_2.4.x:

pyparsing/pyparsing.py

Lines 587 to 591 in ad29bf9

else:
try:
self[name] = toklist[0]
except (KeyError, TypeError, IndexError):
self[name] = toklist

On branch master:

else:
try:
self[name] = toklist[0]
except (KeyError, TypeError, IndexError):
self[name] = toklist

I suspect a more explicit test could resolve the issue by adding an elif clause immediately before the above:

  elif isinstance(toklist, type):
    self[name] = toklist
  else:
@ptmcg
Copy link
Member

ptmcg commented Aug 15, 2021

I can certainly add the code you've provided, and it does work. But I've had difficulty writing a parser and/or parse action that actually reproduces this problem in normal pyparsing usage.

For the case you submitted (and for dict, list, and set as well), the problem is addressed by passing the value in a list, as:

pp.ParseResults(toklist=[tuple], name='type_', asList=False)['type_']

Can you give me a little more to go on as to how you came up with this issue? Would this workaround suffice? (The reason I ask is because the place where this fix would go is pretty performance-critical, and so adding another elif branch could negatively affect pyparsing's parser performance.)

@ptmcg
Copy link
Member

ptmcg commented Sep 9, 2021

Still looking for some sample code on where this is happening in your pyparsing parser. I could not reproduce it, short of explicitly creating a ParseResults with a type outside of the parser, which I've said earlier I am reluctant to make accommodations for, which may impair typical pyparsing usages (plus there is a workaround).

@robes
Copy link
Author

robes commented Sep 9, 2021

The issue I reported was encountered indirectly through the use of a library (pyfpm) that uses pyparsing. I created the smallest script above after isolating the issue in pyparsing, which was triggered due to the implementation of PEP 585 in python 3.9. I had a user-defined type that inherited from Iterator and was then used in a rule in the pyfpm library. It's not a simple matter to provide another sample, due to the indirect usage of pyparsing via pyfpm, and it would probably be a lot less clear than the code above that reproduces the issue in the minimal amount of code necessary. I understand how to workaround the issue, and I did workaround it through altering my usage of pyfpm, but I thought you might want to know that your API behaves differently before/after PEP 585. Thanks for looking at the issue.

@ptmcg
Copy link
Member

ptmcg commented Sep 9, 2021

Thanks for giving me a little more background. I took another look at where this occurs in the code, and there is a place in the code where this can be addressed, with what I believe will be minimal impact (expands an existing isinstance() call vs. adding another elif branch).

It didn't make it in time for yesterday's 3.0.0rc1 release, but will be included in the next.

@robes
Copy link
Author

robes commented Sep 9, 2021 via email

sthagen added a commit to sthagen/pyparsing-pyparsing that referenced this issue Sep 12, 2021
Handle types passed to ParseResults (Py3.9 behavior change) (pyparsing#276)
@ptmcg
Copy link
Member

ptmcg commented Oct 2, 2021

Released in 3.0.0rc2

@ptmcg ptmcg closed this as completed Oct 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants