Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inspect.getcallargs doesn't properly interpret set comprehension code objects. #63810

Closed
nedbat opened this issue Nov 15, 2013 · 21 comments
Closed
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@nedbat
Copy link
Member

nedbat commented Nov 15, 2013

BPO 19611
Nosy @ncoghlan, @nedbat, @bitdancer, @serhiy-storchaka, @1st1, @JelleZijlstra
Files
  • issue19611.patch: patch against master
  • issue19611.patch
  • issue19611.patch: updated patch
  • issue19611-decorators.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2016-06-04.21:45:58.142>
    created_at = <Date 2013-11-15.12:03:03.236>
    labels = ['type-bug', 'library']
    title = "inspect.getcallargs doesn't properly interpret set comprehension code objects."
    updated_at = <Date 2016-06-04.21:45:58.140>
    user = 'https://github.com/nedbat'

    bugs.python.org fields:

    activity = <Date 2016-06-04.21:45:58.140>
    actor = 'ncoghlan'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2016-06-04.21:45:58.142>
    closer = 'ncoghlan'
    components = ['Library (Lib)']
    creation = <Date 2013-11-15.12:03:03.236>
    creator = 'nedbat'
    dependencies = []
    files = ['43105', '43111', '43150', '43172']
    hgrepos = []
    issue_num = 19611
    keywords = ['patch']
    message_count = 21.0
    messages = ['202944', '202945', '266909', '266916', '266919', '266923', '266932', '266934', '266943', '266945', '266947', '266966', '266968', '266978', '266980', '266988', '267040', '267109', '267194', '267324', '267326']
    nosy_count = 7.0
    nosy_names = ['ncoghlan', 'nedbat', 'r.david.murray', 'python-dev', 'serhiy.storchaka', 'yselivanov', 'JelleZijlstra']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue19611'
    versions = ['Python 3.6']

    @nedbat
    Copy link
    Member Author

    nedbat commented Nov 15, 2013

    In 2.7, set comprehensions are compiled to code objects expecting an argument named ".0". This convention is also used for the unnamed arguments needed by tuple arguments. inspect.getcallargs understands the tuple argument case, but not the set comprehension case, and throws errors for correct arguments.

    This is also true for generator expressions and dictionary comprehensions.

    Demonstration:

    #-----

    import inspect
    import sys
    import types
    
    def make_set():
        return {z*z for z in range(5)}
    
    print(make_set())
    
    # The set comprehension is turned into a code object expecting a single
    # argument called ".0" with should be an iterator over range(5).
    if sys.version_info < (3,):
        setcomp_code = make_set.func_code.co_consts[1]
    else:
        setcomp_code = make_set.__code__.co_consts[1]
    setcomp_func = types.FunctionType(setcomp_code, {})
    
    # We can successfully call the function with the argument it expects.
    print(setcomp_func(iter(range(5))))
    
    # But inspect can't figure that out, because the ".0" argument also means
    # tuple arguments, which this code object doesn't expect.
    print(inspect.getcallargs(setcomp_func, iter(range(5))))

    #-----

    When run on Python 3.3, this produces:

    {0, 1, 4, 16, 9} 
    {0, 1, 4, 16, 9}
    {'.0': <range_iterator object at 0x10834be70>}
    

    When run on Python 2.7, it produces:

        set([0, 1, 4, 16, 9])
        set([0, 1, 4, 16, 9])
        Traceback (most recent call last):
          File "foo.py", line 17, in <module>
            print(inspect.getcallargs(setcomp_func, iter(range(5))))
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 935, in getcallargs
            assign(arg, value)
          File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 922, in assign
            raise ValueError('too many values to unpack')
        ValueError: too many values to unpack

    @nedbat nedbat added the stdlib Python modules in the Lib dir label Nov 15, 2013
    @nedbat
    Copy link
    Member Author

    nedbat commented Nov 15, 2013

    BTW: I don't hold any illusions that this bug is important enough to fix, but I would be interested in hearing ideas about how I could work around it...

    @JelleZijlstra
    Copy link
    Member

    JelleZijlstra commented Jun 2, 2016

    This doesn't work in Python 3.6 (current dev version) either. Using Ned's example, I get (snipping some of the ipython stack trace):

    /home/jelle/cpython-dev/cpython/Lib/inspect.py in __init__(self, name, kind, default, annotation)
    2399 if not name.isidentifier():
    -> 2400 raise ValueError('{!r} is not a valid parameter name'.format(name))
    2401

    ValueError: '.0' is not a valid parameter name

    print(inspect.Signature.from_callable(setcomp_func).bind(iter(range(5))))
    fails with the same error.

    However, both work if I take out the isidentifier check.

    In Python 2.7, the bug is actually in inspect.getargs:

    In [7]: inspect.getargs(setcomp_func.func_code)
    Out[7]: Arguments(args=[['z']], varargs=None, keywords=None)

    which assumes that any ".0" argument is a tuple.

    I'm not sure the bug in 2.7 is worth fixing, since it will require fragile bytecode manipulation to distinguish tuple arguments from set comprehensions, and you can get to this case only by manually creating function objects.

    I think the Python 3 level bug is worth fixing though, since it's an easy fix (just make the .isidentifier call accept names of the form .0). I'm attaching a patch against master with tests. I have signed the contributor agreement.

    @bitdancer
    Copy link
    Member

    bitdancer commented Jun 2, 2016

    The patch looks reasonable to me. I've added Yuri to nosy since he's responsible for the signature code in inspect, and he might have an opinion on the appriateness of the fix.

    I think this special case needs to be documented, since currently the name is documented as needing to be a valid python identifier.

    Sounds like we don't have a good answer to Ned's question about a 2.7 workaround. On the other hand, "fragile byte code inspection" is not really fragile, since the 2.7 bytecode will never change. The real question is if someone cares enough to do the work, since as you say (and Ned admits) the benefit is small.

    @JelleZijlstra
    Copy link
    Member

    JelleZijlstra commented Jun 2, 2016

    Thanks for the comment!

    It is fragile in the sense that I think there is a good chance that a fix (which is going to rely on details of generated bytecode) will have bugs and break someone's working 2.7 code.

    The patch's tests have a bug, which I'll fix together with a documentation patch.

    @JelleZijlstra
    Copy link
    Member

    JelleZijlstra commented Jun 2, 2016

    Fixed patch, added documentation

    @bitdancer
    Copy link
    Member

    bitdancer commented Jun 2, 2016

    I wonder if we should do the mention of .0 as a footnote, since it won't matter to most users. What do you think, Yuri?

    @1st1
    Copy link
    Member

    1st1 commented Jun 2, 2016

    Jelle, regarding your patch: it would probably be a good idea to only accept ".0"-like names for POSITIONAL_ONLY parameters. Since functions with positional-only parameters can only be created in C, that should make inspect.Parameter less error prone.

    David, maybe this is something that can be fixed using the Arguments Clinic? I'm not sure how to feel about parsing ".0" for parameter names.

    @bitdancer
    Copy link
    Member

    bitdancer commented Jun 2, 2016

    Yes, it feels ugly. I'd rather "fix" the source of the problem (the weird positional attribute name). I have no idea if argument clinic would be of any help here, I haven't looked at how these things come to be.

    @1st1
    Copy link
    Member

    1st1 commented Jun 2, 2016

    Yes, it feels ugly. I'd rather "fix" the source of the problem (the weird positional attribute name).

    +1.

    I have no idea if argument clinic would be of any help here, I haven't looked at how these things come to be.

    Me neither. Serhiy, what do you think about this?

    @JelleZijlstra
    Copy link
    Member

    JelleZijlstra commented Jun 2, 2016

    Since this only comes up with manually created functions (through calling types.FunctionType), I think it wouldn't be unreasonable to just not handle this case in the inspect module and close this as wontfix.

    Ned, is there a use case for converting comprehension code objects to functions and inspecting them?

    @ncoghlan ncoghlan self-assigned this Jun 2, 2016
    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 2, 2016

    The origin of the ".0" parameter name is the compiler's implicit symbol definition - we use the "." prefix for those to ensure we don't accidentally collide with actual identifiers used in the code.

    We can see that via dis.dis:

    >>> def make_set():
    ...     return {z*z for z in range(5)}
    ... 
    >>> import dis
    >>> dis.dis(make_set)
      2           0 LOAD_CONST               1 (<code object <setcomp> at 0x7f54c5b2cd20, file "<stdin>", line 2>)
                  3 LOAD_CONST               2 ('make_set.<locals>.<setcomp>')
                  6 MAKE_FUNCTION            0
                  9 LOAD_GLOBAL              0 (range)
                 12 LOAD_CONST               3 (5)
                 15 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 18 GET_ITER
                 19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 22 RETURN_VALUE
    >>> dis.dis(make_set.__code__.co_consts[1])
      2           0 BUILD_SET                0
                  3 LOAD_FAST                0 (.0)
            >>    6 FOR_ITER                16 (to 25)
                  9 STORE_FAST               1 (z)
                 12 LOAD_FAST                1 (z)
                 15 LOAD_FAST                1 (z)
                 18 BINARY_MULTIPLY
                 19 SET_ADD                  2
                 22 JUMP_ABSOLUTE            6
            >>   25 RETURN_VALUE

    It isn't just set comprehensions that will do this - it's all the implicit nested functions that accept a positional argument (list/dict/set comprehensions, generator expressions)

    I chatted to Brett Cannon about it, and we agree inspect should handle the implicit functions generated by the compiler, even if those signatures can't be expressed in normal Python code. The fact those may be emitted by the affected inspect functions can then be documented as a CPython implementation detail, rather than as a language level behaviour.

    @1st1
    Copy link
    Member

    1st1 commented Jun 2, 2016

    I chatted to Brett Cannon about it, and we agree inspect should handle the implicit functions generated by the compiler, even if those signatures can't be expressed in normal Python code.

    Can we make ".0" parameters positional-only then?
    And rename ".{digit}" into "arg{digit}"?

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 2, 2016

    We definitely can't use a valid identifier in the code generator, since any valid identifier we used might shadow a nonlocal, global or builtin name (and the latter two cases aren't visible to the compiler at compile time).

    They're also genuinely not positional only:

    >>> print(setcomp_func(iter(range(5))))
    {0, 1, 4, 9, 16}
    >>> print(setcomp_func(**{".0": iter(range(5))}))
    {0, 1, 4, 9, 16}

    @1st1
    Copy link
    Member

    1st1 commented Jun 2, 2016

    We definitely can't use a valid identifier in the code generator, since any valid identifier we used might shadow a nonlocal, global or builtin name (and the latter two cases aren't visible to the compiler at compile time).

    I wasn't proposing to fix code generator, but rather to transform the name to something that users will understand in inspect.signature code. ".0" will be extremely hard to google.

    They're also genuinely not positional only:
    >>> print(setcomp_func(**{".0": iter(range(5))}))
    {0, 1, 4, 9, 16}

    My opinion on this: this is a very low-level implementation detail of CPython. If users start abusing Signature.bind for binding ".0" they will write code that will be hard to port to other implementations, and that will also create a requirement for us (CPython devs) to maintain backwards compatibility here.

    What I propose, is to rename ".0" args and to classify them as "positional-only". This would still make it possible to use Signature.bind on such callables, but in a safe way.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 2, 2016

    Oh, I see what you mean, now - we can special case the compiler-generated ".N" names when extracting the signature.

    I like it - make the claimed parameter something like "implicit0" instead of ".0", mention that in the note on the docs, and folks may have some chance of working out what's going on if they manage to stumble across this behaviour.

    @serhiy-storchaka
    Copy link
    Member

    serhiy-storchaka commented Jun 3, 2016

    I have no idea what relations Argument Clinic have with this issue.

    @JelleZijlstra
    Copy link
    Member

    JelleZijlstra commented Jun 3, 2016

    This new patch makes the inspect module treat .0-type names as positional-only and renames them to implicit0. Includes a documentation patch too.

    I'm not too happy with the wording in the documentation, so suggestions for better naming are appreciated.

    @JelleZijlstra
    Copy link
    Member

    JelleZijlstra commented Jun 4, 2016

    Patch adding @cpython_only decorators to the tests.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 4, 2016

    New changeset 6247dd0f230e by Nick Coghlan in branch 'default':
    Issue bpo-19611: handle implicit parameters in inspect.signature
    https://hg.python.org/cpython/rev/6247dd0f230e

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 4, 2016

    Fix committed for 3.6 - this is obscure enough that I'm not going to worry about fixing it on maintenance branches.

    @ncoghlan ncoghlan closed this as completed Jun 4, 2016
    @ncoghlan ncoghlan added the type-bug An unexpected behavior, bug, or error label Jun 4, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants