Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python doc does not say that the state kwarg in Pickler.save_reduce can be a tuple (and not only a dict) #80114

Closed
pierreglaser mannequin opened this issue Feb 7, 2019 · 9 comments
Labels
3.8 only security fixes docs Documentation in the Doc dir

Comments

@pierreglaser
Copy link
Mannequin

pierreglaser mannequin commented Feb 7, 2019

BPO 35933
Nosy @rhettinger, @pitrou, @avassalotti, @serhiy-storchaka, @pierreglaser
PRs
  • bpo-35933: Document/fix case when __getstate__ returns a tuple of 2 dictionnaries #11955
  • Files
  • test_slots.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-05-18.12:39:24.819>
    created_at = <Date 2019-02-07.17:55:22.032>
    labels = ['3.8', 'docs']
    title = 'python doc does not say that the state kwarg in Pickler.save_reduce can be a tuple (and not only a dict)'
    updated_at = <Date 2019-05-18.12:39:24.819>
    user = 'https://github.com/pierreglaser'

    bugs.python.org fields:

    activity = <Date 2019-05-18.12:39:24.819>
    actor = 'pierreglaser'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2019-05-18.12:39:24.819>
    closer = 'pierreglaser'
    components = ['Documentation']
    creation = <Date 2019-02-07.17:55:22.032>
    creator = 'pierreglaser'
    dependencies = []
    files = ['48114']
    hgrepos = []
    issue_num = 35933
    keywords = ['patch']
    message_count = 9.0
    messages = ['335031', '335033', '335041', '335043', '335044', '335066', '336102', '336105', '336106']
    nosy_count = 6.0
    nosy_names = ['rhettinger', 'pitrou', 'alexandre.vassalotti', 'docs@python', 'serhiy.storchaka', 'pierreglaser']
    pr_nums = ['11955']
    priority = 'normal'
    resolution = None
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue35933'
    versions = ['Python 3.8']

    @pierreglaser
    Copy link
    Mannequin Author

    pierreglaser mannequin commented Feb 7, 2019

    Hello all,
    This 16-year old commit (*) allows an object's state to be updated
    using its slots instead of its __dict__ at unpickling time. To use this
    functionality, the state keyword-argument of Pickler.save_reduce (which maps to
    the third item of the tuple returned by __reduce__) should be a length-2 tuple.
    As far as I can tell, this is not mentioned in the documentation (**). I suggest having the docs updated. What do you think?

    (*) ac5b5d2
    (**) https://docs.python.org/3.8/library/pickle.html#object.\_\_reduce__

    @pierreglaser pierreglaser mannequin added the 3.8 only security fixes label Feb 7, 2019
    @pierreglaser pierreglaser mannequin assigned docspython Feb 7, 2019
    @pierreglaser pierreglaser mannequin added the docs Documentation in the Doc dir label Feb 7, 2019
    @pitrou
    Copy link
    Member

    pitrou commented Feb 7, 2019

    Does it still work? With both the C and Python pickler?
    Can you post an example?

    @pierreglaser
    Copy link
    Mannequin Author

    pierreglaser mannequin commented Feb 7, 2019

    It turns out that both pickle and _pickle implement this feature, but the behavior is inconsistent.

    • As a reminder, instances of slotted classes do not have a dict attribute (1)
    • On the other side, when pickling slotted class instances, __getstate__ can return a tuple of 2 dicts. The first dict represents the __dict__ attribute. Because of (1), this first dict should simply be a sentinel value. In pickle, the condition is that it evaluates to False, but in _pickle, it should be explicitly None.

    (- Finally, The second dict in state contains the slotted attribute. )

    Here are the lines in the two files causing the inconsistent behavior:
    https://github.com/python/cpython/blob/master/Modules/_pickle.c#L6236
    https://github.com/python/cpython/blob/master/Lib/pickle.py#L1549

    I included an example that illustrates it.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 7, 2019

    You can have both a dict and slots by subclassing:

    >>> class A: 
    ...:     __slots__ = ('x',) 
    ...:                                                                                                                                                                   
    >>> class B(A): pass                                                                                                                                                   
    >>>                                                                                                                                                                    
    >>> b = B()                                                                                                                                                            
    >>> b.x = 5                                                                                                                                                            
    >>> b.y = 6                                                                                                                                                            
    >>> b.__dict__                                                                                                                                                         
    {'y': 6}
    >>> A.x                                                                                                                                                                
    <member 'x' of 'A' objects>
    >>> A.x.__get__(b)                                                                                                                                                     
    5

    @rhettinger
    Copy link
    Contributor

    Interestingly, you can also put an instance dict in slots:

    >>> class A:
    	__slots__ = ['x', '__dict__']
    	
    >>> a = A()
    >>> a.x = 5
    >>> a.y = 6
    >>> a.__dict__
    {'y': 6}
    >>> a.x
    5

    @pierreglaser
    Copy link
    Mannequin Author

    pierreglaser mannequin commented Feb 8, 2019

    Thanks Antoine and Raymond for the feedback.

    Indeed, a subclass of a slotted class can have a dict: I enriched the script, pickling_depickling instances of such subclasses, with the length-2 tuple __getstate__ method, and made sure their attributes were properly retrieved.

    Apart from the different checks on state carried out in the c load_build and the python load_build, AFAICT, it seems like this feature works :)

    @serhiy-storchaka
    Copy link
    Member

    A slotted class will have a dict also when it inherits it from a non-slotted class. This is why the base class of slotted class should have slots if you do not want an instance dict.

    __getstate__ and __setstate__ for slotted classes are described in PEP-307. Unfortunately this was not copied to the module documentation.

    @pierreglaser
    Copy link
    Mannequin Author

    pierreglaser mannequin commented Feb 20, 2019

    I added a PR with a small patch to document this behavior and reconcile _pickle.c and pickle.py

    Some explanations on why I am pushing this forward:

    Pickling instances of classes/subclasses with slots is done natively for pickle protocol >= 2. Mentioning this behavior in the docs should *not* make the user worry about implementing custom __getstate__ methods just to preserve slots.

    Here is the reason why I think this functionality (allowing state and slotstate) is worth documenting: pickle gives us a lot of flexibility for reducing thanks to the dispatch_table. But unpickling is rather rigid: if no __setstate__ exists, we have to deal with the default state updating procedure present in load_build. Might as well document all of it ;)

    @serhiy-storchaka
    Copy link
    Member

    See also bpo-26579 which propose to add a function or method for standard implementation of __getstate__ and use it consistently in custom __getstate__ implementations. I am not sure about API yet.

    @pierreglaser pierreglaser mannequin closed this as completed May 18, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes docs Documentation in the Doc dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants