Skip to content

UnicodeDecodeError when loading pickle from ZODB that was migrated from Python 2 to 3 #285

Closed
@pbauer

Description

@pbauer

I have 18 seemingly broken OISet pickles in by ZODB after migrating it from 2.7 to 3.7.
In py2 all seems to be ok and I can load them from the DB. The same pickle throws a UnicodeDecodeError when I try to load it in Python 3.

I don't know what the actual object is or where in the tree it is used (it's from a medium-sized Plone site). The transaction-description points to plone.app.iterate (working-copies for content) and to content that no longer exists. So I think I could simply remove them.

I migrated the ZODB with zodbupdate: ./bin/zodbupdate --convert-py3 --file=var/filestorage/Data.fs --encoding=utf8. I've done this multiple times so far for various projects and never had this issue.

Below is information I gathered while debugging in py2 and py3.

Python 2:

>>> from ZODB.utils import p64
>>> from plone import api
>>> portal = api.portal.get()
>>> app = portal.__parent__
>>> oid = 0x2c0ab6
>>> oid = p64(oid)
>>> oid
'\x00\x00\x00\x00\x00,\n\xb6'
>>> obj = app._p_jar.get(oid)
>>> obj
OISet([
 <InterfaceClass OFS.EtagSupport.EtagBaseInterface>,
 <InterfaceClass Acquisition.interfaces.IAcquirer>,
 <SchemaClass plone.app.dexterity.behaviors.discussion.IAllowDiscussion>,
 <InterfaceClass zope.annotation.interfaces.IAnnotatable>,
 <InterfaceClass zope.annotation.interfaces.IAttributeAnnotatable>,
 <InterfaceClass plone.uuid.interfaces.IAttributeUUID>,
 <InterfaceClass Products.CMFDynamicViewFTI.interfaces.IBrowserDefault>,
 <InterfaceClass Products.CMFCore.interfaces.ICatalogAware>,
 <InterfaceClass Products.CMFCore.interfaces.ICatalogableDublinCore>,
 <InterfaceClass zope.location.interfaces.IContained>,
 <InterfaceClass Products.CMFCore.interfaces.IContentish>,
 <class 'grokcore.component.interfaces.IContext'>,
 <InterfaceClass OFS.interfaces.ICopySource>,
 <InterfaceClass webdav.interfaces.IDAVResource>,
 <InterfaceClass plone.dexterity.interfaces.IDexterityContent>,
 <InterfaceClass plone.app.relationfield.interfaces.IDexterityHasRelations>,
 <InterfaceClass plone.dexterity.interfaces.IDexterityItem>,
 <InterfaceClass plone.dexterity.interfaces.IDexteritySchema>,
 <InterfaceClass plone.app.contenttypes.interfaces.IDocument>,
 <InterfaceClass Products.CMFCore.interfaces.IDublinCore>,
 <InterfaceClass Products.CMFCore.interfaces.IDynamicType>,
 <SchemaClass plone.app.dexterity.behaviors.exclfromnav.IExcludeFromNavigation>,
 <InterfaceClass webdav.interfaces.IFTPAccess>,
 <InterfaceClass z3c.relationfield.interfaces.IHasIncomingRelations>,
 <InterfaceClass z3c.relationfield.interfaces.IHasOutgoingRelations>,
 <InterfaceClass z3c.relationfield.interfaces.IHasRelations>,
 <InterfaceClass plone.namedfile.interfaces.IImageScaleTraversable>,
 <InterfaceClass OFS.interfaces.IItem>,
 <InterfaceClass plone.app.iterate.interfaces.IIterateAware>,
 <InterfaceClass plone.app.contenttypes.behaviors.leadimage.ILeadImage>
])

>>> pickle, state = app._p_jar.db()._storage.load(oid)
>>> pickle
'cBTrees.OIBTree\nOISet\nq\x01.((cwebdav.EtagSupport\nEtagBaseInterface\nq\x02cAcquisition.interfaces\nIAcquirer\nq\x03cplone.app.dexterity.behaviors.discussion\nIAllowDiscussion\nq\x04czope.annotation.interfaces\nIAnnotatable\nq\x05czope.annotation.interfaces\nIAttributeAnnotatable\nq\x06cplone.uuid.interfaces\nIAttributeUUID\nq\x07cProducts.CMFDynamicViewFTI.interfaces\nIBrowserDefault\nq\x08cProducts.CMFCore.interfaces._content\nICatalogAware\nq\tcProducts.CMFCore.interfaces._content\nICatalogableDublinCore\nq\nczope.location.interfaces\nIContained\nq\x0bcProducts.CMFCore.interfaces._content\nIContentish\nq\x0ccgrokcore.component.interfaces\nIContext\nq\rcOFS.interfaces\nICopySource\nq\x0ecwebdav.interfaces\nIDAVResource\nq\x0fcplone.dexterity.interfaces\nIDexterityContent\nq\x10cplone.app.relationfield.interfaces\nIDexterityHasRelations\nq\x11cplone.dexterity.interfaces\nIDexterityItem\nq\x12cplone.dexterity.interfaces\nIDexteritySchema\nq\x13cplone.app.contenttypes.interfaces\nIDocument\nq\x14cProducts.CMFCore.interfaces._content\nIDublinCore\nq\x15cProducts.CMFCore.interfaces._content\nIDynamicType\nq\x16cplone.app.dexterity.behaviors.exclfromnav\nIExcludeFromNavigation\nq\x17cOFS.interfaces\nIFTPAccess\nq\x18cz3c.relationfield.interfaces\nIHasIncomingRelations\nq\x19cz3c.relationfield.interfaces\nIHasOutgoingRelations\nq\x1acz3c.relationfield.interfaces\nIHasRelations\nq\x1bcplone.namedfile.interfaces\nIImageScaleTraversable\nq\x1ccOFS.interfaces\nIItem\nq\x1dcplone.app.iterate.interfaces\nIIterateAware\nq\x1ecplone.app.contenttypes.behaviors.leadimage\nILeadImage\nq\x1ft(U\x08\x00\x00\x00\x00\x00,\n\xb7q h\x01tQtq!.'

>>> state
'\x03\xa8\x11W\x01h%\xcc'

>>> from io import BytesIO
>>> wrap = BytesIO(pickle)
>>> import pickletools
>>> memo = {}
>>> pickletools.dis(wrap, memo=memo)
    0: c    GLOBAL     'BTrees.OIBTree OISet'
   22: q    BINPUT     1
   24: .    STOP
>>> pickletools.dis(wrap, memo=memo)
   25: (    MARK
   26: (        MARK
   27: c            GLOBAL     'webdav.EtagSupport EtagBaseInterface'
   65: q            BINPUT     2
   67: c            GLOBAL     'Acquisition.interfaces IAcquirer'
  101: q            BINPUT     3
  103: c            GLOBAL     'plone.app.dexterity.behaviors.discussion IAllowDiscussion'
  162: q            BINPUT     4
  164: c            GLOBAL     'zope.annotation.interfaces IAnnotatable'
  205: q            BINPUT     5
  207: c            GLOBAL     'zope.annotation.interfaces IAttributeAnnotatable'
  257: q            BINPUT     6
  259: c            GLOBAL     'plone.uuid.interfaces IAttributeUUID'
  297: q            BINPUT     7
  299: c            GLOBAL     'Products.CMFDynamicViewFTI.interfaces IBrowserDefault'
  354: q            BINPUT     8
  356: c            GLOBAL     'Products.CMFCore.interfaces._content ICatalogAware'
  408: q            BINPUT     9
  410: c            GLOBAL     'Products.CMFCore.interfaces._content ICatalogableDublinCore'
  471: q            BINPUT     10
  473: c            GLOBAL     'zope.location.interfaces IContained'
  510: q            BINPUT     11
  512: c            GLOBAL     'Products.CMFCore.interfaces._content IContentish'
  562: q            BINPUT     12
  564: c            GLOBAL     'grokcore.component.interfaces IContext'
  604: q            BINPUT     13
  606: c            GLOBAL     'OFS.interfaces ICopySource'
  634: q            BINPUT     14
  636: c            GLOBAL     'webdav.interfaces IDAVResource'
  668: q            BINPUT     15
  670: c            GLOBAL     'plone.dexterity.interfaces IDexterityContent'
  716: q            BINPUT     16
  718: c            GLOBAL     'plone.app.relationfield.interfaces IDexterityHasRelations'
  777: q            BINPUT     17
  779: c            GLOBAL     'plone.dexterity.interfaces IDexterityItem'
  822: q            BINPUT     18
  824: c            GLOBAL     'plone.dexterity.interfaces IDexteritySchema'
  869: q            BINPUT     19
  871: c            GLOBAL     'plone.app.contenttypes.interfaces IDocument'
  916: q            BINPUT     20
  918: c            GLOBAL     'Products.CMFCore.interfaces._content IDublinCore'
  968: q            BINPUT     21
  970: c            GLOBAL     'Products.CMFCore.interfaces._content IDynamicType'
 1021: q            BINPUT     22
 1023: c            GLOBAL     'plone.app.dexterity.behaviors.exclfromnav IExcludeFromNavigation'
 1089: q            BINPUT     23
 1091: c            GLOBAL     'OFS.interfaces IFTPAccess'
 1118: q            BINPUT     24
 1120: c            GLOBAL     'z3c.relationfield.interfaces IHasIncomingRelations'
 1172: q            BINPUT     25
 1174: c            GLOBAL     'z3c.relationfield.interfaces IHasOutgoingRelations'
 1226: q            BINPUT     26
 1228: c            GLOBAL     'z3c.relationfield.interfaces IHasRelations'
 1272: q            BINPUT     27
 1274: c            GLOBAL     'plone.namedfile.interfaces IImageScaleTraversable'
 1325: q            BINPUT     28
 1327: c            GLOBAL     'OFS.interfaces IItem'
 1349: q            BINPUT     29
 1351: c            GLOBAL     'plone.app.iterate.interfaces IIterateAware'
 1395: q            BINPUT     30
 1397: c            GLOBAL     'plone.app.contenttypes.behaviors.leadimage ILeadImage'
 1452: q            BINPUT     31
 1454: t            TUPLE      (MARK at 26)
 1455: (        MARK
 1456: U            SHORT_BINSTRING '\x00\x00\x00\x00\x00,\n\xb7'
 1466: q            BINPUT     32
 1468: h            BINGET     1
 1470: t            TUPLE      (MARK at 1455)
 1471: Q        BINPERSID
 1472: t        TUPLE      (MARK at 25)
 1473: q    BINPUT     33
 1475: .    STOP
highest protocol among opcodes = 1

Here the same in Python 3:

>>> from ZODB.utils import p64
>>> from plone import api
>>> portal = api.portal.get()
>>> app = portal.__parent__
>>> oid = 0x2c0ab6
>>> oid = p64(oid)
>>> oid
b'\x00\x00\x00\x00\x00,\n\xb6'

>>> pickle, state = app._p_jar.db()._storage.load(oid)
>>> pickle
b'cBTrees.OIBTree\nOISet\nq\x01.((cwebdav.EtagSupport\nEtagBaseInterface\nq\x02cAcquisition.interfaces\nIAcquirer\nq\x03cplone.app.dexterity.behaviors.discussion\nIAllowDiscussion\nq\x04czope.annotation.interfaces\nIAnnotatable\nq\x05czope.annotation.interfaces\nIAttributeAnnotatable\nq\x06cplone.uuid.interfaces\nIAttributeUUID\nq\x07cProducts.CMFDynamicViewFTI.interfaces\nIBrowserDefault\nq\x08cProducts.CMFCore.interfaces._content\nICatalogAware\nq\tcProducts.CMFCore.interfaces._content\nICatalogableDublinCore\nq\nczope.location.interfaces\nIContained\nq\x0bcProducts.CMFCore.interfaces._content\nIContentish\nq\x0ccgrokcore.component.interfaces\nIContext\nq\rcOFS.interfaces\nICopySource\nq\x0ecwebdav.interfaces\nIDAVResource\nq\x0fcplone.dexterity.interfaces\nIDexterityContent\nq\x10cplone.app.relationfield.interfaces\nIDexterityHasRelations\nq\x11cplone.dexterity.interfaces\nIDexterityItem\nq\x12cplone.dexterity.interfaces\nIDexteritySchema\nq\x13cplone.app.contenttypes.interfaces\nIDocument\nq\x14cProducts.CMFCore.interfaces._content\nIDublinCore\nq\x15cProducts.CMFCore.interfaces._content\nIDynamicType\nq\x16cplone.app.dexterity.behaviors.exclfromnav\nIExcludeFromNavigation\nq\x17cOFS.interfaces\nIFTPAccess\nq\x18cz3c.relationfield.interfaces\nIHasIncomingRelations\nq\x19cz3c.relationfield.interfaces\nIHasOutgoingRelations\nq\x1acz3c.relationfield.interfaces\nIHasRelations\nq\x1bcplone.namedfile.interfaces\nIImageScaleTraversable\nq\x1ccOFS.interfaces\nIItem\nq\x1dcplone.app.iterate.interfaces\nIIterateAware\nq\x1ecplone.app.contenttypes.behaviors.leadimage\nILeadImage\nq\x1ft(U\x08\x00\x00\x00\x00\x00,\n\xb7q h\x01tQtq!.'
>>> state
b'\x03\xa8\x11W\x01h%\xcc'

>>> obj = app._p_jar.get(oid)
>>> obj
ERROR:ZODB.Connection:Couldn't load state for BTrees.OIBTree.OISet 0x2c0ab6
Traceback (most recent call last):
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/ZODB/src/ZODB/Connection.py", line 795, in setstate
    self._reader.setGhostState(obj, p)
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/ZODB/src/ZODB/serialize.py", line 633, in setGhostState
    state = self.getState(pickle)
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/ZODB/src/ZODB/serialize.py", line 626, in getState
    return unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 7: ordinal not in range(128)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/ZODB/src/ZODB/Connection.py", line 795, in setstate
    self._reader.setGhostState(obj, p)
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/ZODB/src/ZODB/serialize.py", line 633, in setGhostState
    state = self.getState(pickle)
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/ZODB/src/ZODB/serialize.py", line 626, in getState
    return unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 7: ordinal not in range(128)

>>> from io import BytesIO
>>> wrap = BytesIO(pickle)
>>> import pickletools
>>> memo = {}
>>> pickletools.dis(wrap, memo=memo)
    0: c    GLOBAL     'BTrees.OIBTree OISet'
   22: q    BINPUT     1
   24: .    STOP
highest protocol among opcodes = 1
>>> pickletools.dis(wrap, memo=memo)
   25: (    MARK
   26: (        MARK
   27: c            GLOBAL     'webdav.EtagSupport EtagBaseInterface'
   65: q            BINPUT     2
   67: c            GLOBAL     'Acquisition.interfaces IAcquirer'
  101: q            BINPUT     3
  103: c            GLOBAL     'plone.app.dexterity.behaviors.discussion IAllowDiscussion'
  162: q            BINPUT     4
  164: c            GLOBAL     'zope.annotation.interfaces IAnnotatable'
  205: q            BINPUT     5
  207: c            GLOBAL     'zope.annotation.interfaces IAttributeAnnotatable'
  257: q            BINPUT     6
  259: c            GLOBAL     'plone.uuid.interfaces IAttributeUUID'
  297: q            BINPUT     7
  299: c            GLOBAL     'Products.CMFDynamicViewFTI.interfaces IBrowserDefault'
  354: q            BINPUT     8
  356: c            GLOBAL     'Products.CMFCore.interfaces._content ICatalogAware'
  408: q            BINPUT     9
  410: c            GLOBAL     'Products.CMFCore.interfaces._content ICatalogableDublinCore'
  471: q            BINPUT     10
  473: c            GLOBAL     'zope.location.interfaces IContained'
  510: q            BINPUT     11
  512: c            GLOBAL     'Products.CMFCore.interfaces._content IContentish'
  562: q            BINPUT     12
  564: c            GLOBAL     'grokcore.component.interfaces IContext'
  604: q            BINPUT     13
  606: c            GLOBAL     'OFS.interfaces ICopySource'
  634: q            BINPUT     14
  636: c            GLOBAL     'webdav.interfaces IDAVResource'
  668: q            BINPUT     15
  670: c            GLOBAL     'plone.dexterity.interfaces IDexterityContent'
  716: q            BINPUT     16
  718: c            GLOBAL     'plone.app.relationfield.interfaces IDexterityHasRelations'
  777: q            BINPUT     17
  779: c            GLOBAL     'plone.dexterity.interfaces IDexterityItem'
  822: q            BINPUT     18
  824: c            GLOBAL     'plone.dexterity.interfaces IDexteritySchema'
  869: q            BINPUT     19
  871: c            GLOBAL     'plone.app.contenttypes.interfaces IDocument'
  916: q            BINPUT     20
  918: c            GLOBAL     'Products.CMFCore.interfaces._content IDublinCore'
  968: q            BINPUT     21
  970: c            GLOBAL     'Products.CMFCore.interfaces._content IDynamicType'
 1021: q            BINPUT     22
 1023: c            GLOBAL     'plone.app.dexterity.behaviors.exclfromnav IExcludeFromNavigation'
 1089: q            BINPUT     23
 1091: c            GLOBAL     'OFS.interfaces IFTPAccess'
 1118: q            BINPUT     24
 1120: c            GLOBAL     'z3c.relationfield.interfaces IHasIncomingRelations'
 1172: q            BINPUT     25
 1174: c            GLOBAL     'z3c.relationfield.interfaces IHasOutgoingRelations'
 1226: q            BINPUT     26
 1228: c            GLOBAL     'z3c.relationfield.interfaces IHasRelations'
 1272: q            BINPUT     27
 1274: c            GLOBAL     'plone.namedfile.interfaces IImageScaleTraversable'
 1325: q            BINPUT     28
 1327: c            GLOBAL     'OFS.interfaces IItem'
 1349: q            BINPUT     29
 1351: c            GLOBAL     'plone.app.iterate.interfaces IIterateAware'
 1395: q            BINPUT     30
 1397: c            GLOBAL     'plone.app.contenttypes.behaviors.leadimage ILeadImage'
 1452: q            BINPUT     31
 1454: t            TUPLE      (MARK at 26)
 1455: (        MARK
 1456: U            SHORT_BINSTRING '\x00\x00\x00\x00\x00,\n·'
 1466: q            BINPUT     32
 1468: h            BINGET     1
 1470: t            TUPLE      (MARK at 1455)
 1471: Q        BINPERSID
 1472: t        TUPLE      (MARK at 25)
 1473: q    BINPUT     33
 1475: .    STOP

The only noteable difference I see is the SHORT_BINSTRING where \xb7 is turned into a middle dot · but I've no idea if that means something.

I worked around the whole problem by emptying the problematic OISets manually in Python 2 and then redoing the migration to Python 3. Packing and fsrefs then worked again:

def empty_broken_sets():
    portal = api.portal.get()
    app = portal.__parent__
    broken_oids = [0x2c0ab6, 0x2c0ab8, 0x2c56c4, 0x2c56c6, 0x2c57ca, 0x2c57cc, 0x2c5896, 0x2c5898, 0x2e929e, 0x2e92a0, 0x330780, 0x330782, 0x33835f, 0x338361, 0x356df9, 0x356dfb, 0x35907c, 0x35907e]
    from ZODB.utils import p64
    for oid in broken_oids:
        oid = p64(oid)
        obj = app._p_jar.get(oid)
        obj.clear()
        obj._p_changed = 1

It would be better to fix the underlying issue but so far I don't know what it is.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions