Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifying class __dict__ inside __set_name__ #72983

Closed
Wombatz mannequin opened this issue Nov 25, 2016 · 15 comments
Closed

Modifying class __dict__ inside __set_name__ #72983

Wombatz mannequin opened this issue Nov 25, 2016 · 15 comments
Labels
3.7 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@Wombatz
Copy link
Mannequin

Wombatz mannequin commented Nov 25, 2016

BPO 28797
Nosy @rhettinger, @ncoghlan, @ned-deily, @serhiy-storchaka, @tecki, @Vgr255
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Files
  • reproduce.py: Example of how to reproduce the bug
  • set_name-dict-copy.patch
  • set_name-filtered-copy.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-12-04.07:13:13.832>
    created_at = <Date 2016-11-25.02:48:28.883>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = 'Modifying class __dict__ inside __set_name__'
    updated_at = <Date 2017-03-31.16:36:20.327>
    user = 'https://bugs.python.org/Wombatz'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:20.327>
    actor = 'dstufft'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-12-04.07:13:13.832>
    closer = 'ncoghlan'
    components = ['Interpreter Core']
    creation = <Date 2016-11-25.02:48:28.883>
    creator = 'Wombatz'
    dependencies = []
    files = ['45632', '45635', '45681']
    hgrepos = []
    issue_num = 28797
    keywords = ['patch']
    message_count = 15.0
    messages = ['281674', '281675', '281681', '281939', '281942', '281945', '281949', '281950', '281951', '281952', '281954', '281955', '281956', '281960', '282321']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'ncoghlan', 'ned.deily', 'python-dev', 'serhiy.storchaka', 'Martin.Teichmann', 'abarry', 'Wombatz']
    pr_nums = ['552']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue28797'
    versions = ['Python 3.6', 'Python 3.7']

    @Wombatz
    Copy link
    Mannequin Author

    Wombatz mannequin commented Nov 25, 2016

    This behavior occurs at least in python-3.6.0b3 and python-3.6.0b4.

    Modifying the class __dict__ inside the __set_name__ method of a descriptor that is used inside that class can lead to a bug that possibly prevents other descriptors from being initialized (the call to their __set_name__ is prevented).

    That happens because internally the cpython interpreter iterates the class __dict__ when calling all the __set_name__ methods. Changing the class __dict__ while iterating it leads to undefined behavior. This is the line in the source code https://github.com/python/cpython/blob/master/Objects/typeobject.c#L7010

    See the attached file for an example where the bug can be observed. It defines a "desc" class that implements the __set_name__ method where it prints the name and modifies the class __dict__ of the containing class. Later a class is defined that has multiple instances of the "desc" class as class attributes. Depending on the number of attributes not all of them are initialized.

    When you see the underlying C-Code the reason for this behavior is obvious but in python itself the problem might not be apparent.

    To fix this bug the class __dict__ could be cashed and then the __set_name__ methods could be called while iterating over that copy.

    @Wombatz Wombatz mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Nov 25, 2016
    @Vgr255
    Copy link
    Mannequin

    Vgr255 mannequin commented Nov 25, 2016

    Ow!

    I can confirm the bug still happens on latest trunk. Nice finding!

    @Vgr255 Vgr255 mannequin added the release-blocker label Nov 25, 2016
    @serhiy-storchaka
    Copy link
    Member

    Proposed patch iterates a copy of __dict__.

    @serhiy-storchaka serhiy-storchaka added the 3.7 only security fixes label Nov 25, 2016
    @ned-deily
    Copy link
    Member

    Nick, as the original shepherd of PEP-487 (bpo-27366), can you review Serhiy's proposed patch for 3.6.0rc1? Thanks!

    @ncoghlan
    Copy link
    Contributor

    Rather than taking a snapshot of the whole class namespace, I think it will make more sense to make a new empty dictionary of descriptors to notify, and then split the current loop into two loops:

    • the first loop adds descriptors with __set_name__ attributes to the notification dict
    • the second loop calls __set_name__ on all the descriptors in the notification dict

    Serhiy's patch effectively already does that via the initial PyDict_Copy, but that approach also redundantly copies items that don't define __set_name__ into the snapshot and then filters them out on the second pass.

    Reviewing the patch also made me realise we're currently missing a specification of the expected behaviour in https://docs.python.org/dev/reference/datamodel.html#creating-the-class-object.

    I suggest adding the following paragraph between the one about setting __class__ and the one about calling class descriptors:

    """
    When using the default metaclass :cls:`type`, or any metaclass that ultimately calls ``type.__new__``, the following additional customisation steps are invoked:

    • first, type.__new__ collects all of the descriptors in the class namespace that define a __set_name__ method
    • second, all of these __set_name__ methods are called with the class being defined and the assigned name of that particular descriptor
    • finally, the __init_subclass__ hook is called on the immediate parent of the new class in its method resolution order
      """

    @serhiy-storchaka
    Copy link
    Member

    I considered this option, but if save only descriptors with the __set_name__ attribute, this would need either double looking up the __set_name__ attribute or packing it in a tuple with a value. All this too cumbersome. I chose copying all dictionary as the simplest way.

    @ncoghlan
    Copy link
    Contributor

    I was thinking that the notification dict could consist of mappings from attribute names to already bound __set_name__ methods, so the double lookup would be avoided that way (but hadn't actually sent the review where I suggested that).

    That is, on the second loop, the actual method call would be:

        tmp = PyObject_CallFunctionObjArgs(value, type, key, NULL);

    Have I missed something that would prevent that from working?

    @serhiy-storchaka
    Copy link
    Member

    Have I missed something that would prevent that from working?

    Error message contains value->ob_type->tp_name.

    @ncoghlan
    Copy link
    Contributor

    Ah, I'd missed that.

    In that case, I think my suggestion to change the name of the local variable is still valid, while the specific approach just needs a comment saying it's handled as a full namespace snapshot so the descriptor type name can be reported in any error messages.

    @serhiy-storchaka
    Copy link
    Member

    Here is much more complex patch that implements Nick's suggestion.

    @tecki
    Copy link
    Mannequin

    tecki mannequin commented Nov 29, 2016

    I personally prefer the first patch, which iterates over a copy of __dict__. Making a copy of a dict is actually a pretty fast operation, I would even expect that it is faster than the proposed alternative, creating tuples.

    Even if the second approach should be faster, the added code complexity is not worth the effort, as this is a code path where speed shouldn't matter much.

    @ncoghlan
    Copy link
    Contributor

    I'd be OK with starting with the simpler patch, and only looking at the more complex one if the benchmark suite shows a significant slowdown (I'd be surprised if it did, as it should mainly impact start-up, and class dicts generally aren't that big)

    @ncoghlan
    Copy link
    Contributor

    (Plus, as Martin noted, the tuple creation overhead could easily make the more complex patch slower in many cases)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 29, 2016

    New changeset 6b8f7d1e2ba4 by Serhiy Storchaka in branch '3.6':
    Issue bpo-28797: Modifying the class __dict__ inside the __set_name__ method of
    https://hg.python.org/cpython/rev/6b8f7d1e2ba4

    New changeset 18ed518d2eef by Serhiy Storchaka in branch 'default':
    Issue bpo-28797: Modifying the class __dict__ inside the __set_name__ method of
    https://hg.python.org/cpython/rev/18ed518d2eef

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 4, 2016

    Regarding the docs suggestion above, I need to make some other docs changes for bpo-23722, so I'll roll that update in with those.

    Given that, I think we can mark this issue as resolved.

    @ncoghlan ncoghlan closed this as completed Dec 4, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants