Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

namedtuple should support fully qualified name for more portable pickling #62141

Closed
elibendersky mannequin opened this issue May 9, 2013 · 19 comments
Closed

namedtuple should support fully qualified name for more portable pickling #62141

elibendersky mannequin opened this issue May 9, 2013 · 19 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@elibendersky
Copy link
Mannequin

elibendersky mannequin commented May 9, 2013

BPO 17941
Nosy @gvanrossum, @warsaw, @rhettinger, @ncoghlan, @ericvsmith, @serhiy-storchaka, @sdrees, @ctismer, @MojoVampire, @ctismer
Files
  • nt_module.diff: First draft patch to collections
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2016-09-12.07:19:09.245>
    created_at = <Date 2013-05-09.03:46:56.729>
    labels = ['type-bug', 'library']
    title = 'namedtuple should support fully qualified name for more portable pickling'
    updated_at = <Date 2016-09-12.07:19:09.243>
    user = 'https://bugs.python.org/elibendersky'

    bugs.python.org fields:

    activity = <Date 2016-09-12.07:19:09.243>
    actor = 'rhettinger'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2016-09-12.07:19:09.245>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2013-05-09.03:46:56.729>
    creator = 'eli.bendersky'
    dependencies = []
    files = ['30243']
    hgrepos = ['198']
    issue_num = 17941
    keywords = ['patch']
    message_count = 19.0
    messages = ['188748', '188762', '188775', '188785', '188835', '188836', '188838', '189086', '189167', '189168', '189171', '189177', '190977', '220444', '220521', '220523', '220568', '244149', '275982']
    nosy_count = 14.0
    nosy_names = ['gvanrossum', 'barry', 'rhettinger', 'ncoghlan', 'eric.smith', 'eli.bendersky', 'BreamoreBoy', 'python-dev', 'sbt', 'serhiy.storchaka', 'dilettant', 'Christian.Tismer', 'josh.r', 'ctismer']
    pr_nums = []
    priority = 'low'
    resolution = 'fixed'
    stage = 'needs patch'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue17941'
    versions = ['Python 3.6']

    @elibendersky
    Copy link
    Mannequin Author

    elibendersky mannequin commented May 9, 2013

    [this came up as part of the Enum discussions. Full details in this thread: http://mail.python.org/pipermail/python-dev/2013-May/126076.html]

    namedtuple currently uses this code to obtain the __module__ for the class it creates dynamically so that pickling works:

      result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__')

    This may not work correctly on all Python implementations, for example IronPython.

    To support some way to pickle on all implementations, namedtuple should support a fully qualified name for the class:

      Point = namedtuple('mymodule.Point', ['x', 'y'])

    If the name is a qualified dotted name, it will be split and the first part becomes the __module__.

    @elibendersky elibendersky mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 9, 2013
    @sbt
    Copy link
    Mannequin

    sbt mannequin commented May 9, 2013

    If the name is a qualified dotted name, it will be split and the first
    part becomes the __module__.

    That will not work correctly if the module name has a dot in it.

    @ericvsmith
    Copy link
    Member

    ericvsmith commented May 9, 2013

    I agree it should work the same as Enum, and I agree it should be possible to supply the module name. But wouldn't it be cleaner as:

    Point = namedtuple('Point', 'x y z', module=__name__)

    rather than:

    Point = namedtuple(__name__ + '.Point', 'x y z')

    ?

    @gvanrossum
    Copy link
    Member

    gvanrossum commented May 9, 2013

    Agreed with Eric. We're already modifying PEP-435 to do it that way.

    @elibendersky
    Copy link
    Mannequin Author

    elibendersky mannequin commented May 10, 2013

    A question that came up while reviewing the new enum code: "module" or "module_name" for this extra argument? The former is shorter, but the latter is more correct in a way.

    @gvanrossum
    Copy link
    Member

    gvanrossum commented May 10, 2013

    Module, please. The class attribute is also called __module__ after all.

    On Friday, May 10, 2013, Eli Bendersky wrote:

    Eli Bendersky added the comment:

    A question that came up while reviewing the new enum code: "module" or
    "module_name" for this extra argument? The former is shorter, but the
    latter is more correct in a way.

    ----------


    Python tracker <report@bugs.python.org <javascript:;>>
    <http://bugs.python.org/issue17941\>


    @elibendersky
    Copy link
    Mannequin Author

    elibendersky mannequin commented May 10, 2013

    On Fri, May 10, 2013 at 7:02 AM, Guido van Rossum <report@bugs.python.org>wrote:

    Guido van Rossum added the comment:

    Module, please. The class attribute is also called __module__ after all.

    Makes sense. Thanks.

    @rhettinger rhettinger self-assigned this May 11, 2013
    @rhettinger
    Copy link
    Contributor

    rhettinger commented May 13, 2013

    Marking this a low priority. I want to see how it pans out for Enums before adding a new parameter to the namedtuple API. Also, I would like to look at the test cases for Enum so I can see how you're writing tests that would fail without this parameter.

    @gvanrossum
    Copy link
    Member

    gvanrossum commented May 13, 2013

    I'd like to propose one slight tweak to the patch. (Also to enum.py.)

    If no module name was passed and _getframe() fails, can you set the __module__ attribute to something that will cause pickling the object to fail? That would be much better than letting it pickle but then being unable to unpickle it.

    (TBH I'm not sure how this should be accomplished, and if it can't, forget that I said it. But it would be nice.)

    @sbt
    Copy link
    Mannequin

    sbt mannequin commented May 13, 2013

    When pickling a class (or instance of a class) there is already a check
    that the invariant

     getattr(sys.modules[cls.__module__], cls.__name__) == cls
    

    holds.

     >>> import pickle
     >>> class A: pass
    ...
     >>> A.__module__ = 'nonexistent'
     >>> pickle.dumps(A())
    Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
    _pickle.PicklingError: Can't pickle <class 'nonexistent.A'>: import of 
    module 'nonexistent' failed

    @gvanrossum
    Copy link
    Member

    gvanrossum commented May 13, 2013

    Great, forget I said anything then.

    LGTM to the patch, feel free to commit (with update to Misc/NEWS please).

    @warsaw
    Copy link
    Member

    warsaw commented May 13, 2013

    LGTM too. Needs test and docs.

    @ctismer
    Copy link
    Contributor

    ctismer commented Jun 11, 2013

    I would like to make an additional suggestion.
    (and I implemented this yesterday):

    Native namedtuple (not a derived class) can be made much simpler to handle
    when no module and class machinery is involved at all.

    The way I implemented it has no need for sys._getframes, and it does
    not store a reference to the class at all.

    The advantage of doing so is that this maximizes the compatibility
    with ordinary tuples. Ordinary tuples have no pickling issue at all,
    and this way the named tuple should behave as well.

    My implementation re-creates the namedtuple classes on the fly by a
    function in __reduce_ex__. There is no speed penalty for this because
    of caching the classes by their unique name and set of field names.

    This is IMHO the way things should work:
    A namedtuple replaces a builtin type, so it has the same pickling
    behavior: nothing needed.

    Rationale:
    tuples are used everywhere and dynamically. namedtuple should be as
    compatible to that as possible. By having to specify a module etc., this dynamic is partially lost.

    Limitation:
    When a class is derived from namedtuple, pickling support is no longer
    automated. This is compatible with ordinary tuples as well.

    Cheers - Chris

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jun 13, 2014

    Just slipped under the radar?

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Jun 14, 2014

    I was already thinking of the same solution Christian.Tismer proposed before I reached his post. namedtuples seem cleaner if they naturally act as singletons, so (potentially whether or not pickling is involved) declaring a namedtuple with the same name and fields twice returns the same class. Removes the need to deal with module qualified names, and if pickle can be made to support it, the namedtuple class itself could be pickled uniquely in such a way that it could be recreated by someone else who didn't even have a local definition of the namedtuple, the module that defines it, etc.

    @rhettinger
    Copy link
    Contributor

    rhettinger commented Jun 14, 2014

    Just slipped under the radar?

    Nope, I'll get to it before long.

    @ctismer
    Copy link
    Contributor

    ctismer commented Jun 14, 2014

    Ok, I almost forgot about this because I thought my idea
    was not considered, and wondered if I should keep that code online.

    It is still around, and I could put it into an experimental branch
    if someone asks for it.

    Missing pull-request on python.org.

    @serhiy-storchaka
    Copy link
    Member

    serhiy-storchaka commented May 27, 2015

    That will not work correctly if the module name has a dot in it.

    Pickling qualified names with arbitrary number of dots is supported in 3.4 with protocol 4 and in 3.5 with all protocols (backward compatibly).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 12, 2016

    New changeset c8851a0ce7ca by Raymond Hettinger in branch 'default':
    Issue bpo-17941: Add a *module* parameter to collections.namedtuple()
    https://hg.python.org/cpython/rev/c8851a0ce7ca

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants