Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POSKeyError while packing #325

Open
krissik opened this issue Aug 13, 2019 · 2 comments

Comments

@krissik
Copy link

commented Aug 13, 2019

This is related to #274. @andbag could reproduce what happend to the database. Write access while packing seems to cause the problem. What he did:

  • Create a plain plone site
  • Create 1110 folders (the problem does not occur if the site is empty)
  • Run script to stress the site (reindex random objects in a loop)
  • Run zodbpack in series (until error in stress-script occurs)

After some runs of zodbpack the stress-script ends with an error. This error was ZODB.POSException.ReadConflictError, relstorage.adapters.interfaces.UnableToLockRowsToReadCurrentError or immediately POSKeyError. After starting the plone site there are a lot of POSKeyErrors.

The first pack runs did only find a few objects to remove but the run that cause to problem did find a lot of (almost all) objects to remove.

Output of zodbpack without error in stress script:

# /opt/plone/worker0/instance/bin/zodbpack  -d 0 /opt/plone/worker0/instance/zodbpack.conf
2019-08-13 09:49:03,921 [zodbpack] INFO Opening storage (RelStorageFactory)...
2019-08-13 09:49:03,989 [relstorage] INFO Reading stored procedures from /opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/postgresql/procs and /opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/postgresql/procs/hf
2019-08-13 09:49:04,127 [zodbpack] INFO Packing storage (RelStorageFactory).
2019-08-13 09:49:04,140 [relstorage.storage.pack] INFO pack: beginning pre-pack
2019-08-13 09:49:04,154 [relstorage.storage.pack] INFO Analyzing transactions committed Tue Aug 13 09:49:01 2019 or before (TID 275204964550870272)
2019-08-13 09:49:04,692 [relstorage.adapters.packundo] INFO pre_pack: analyzing references from 187 object(s)
2019-08-13 09:49:04,941 [relstorage.adapters.packundo] INFO pre_pack: objects analyzed: 187/187
2019-08-13 09:49:04,942 [relstorage.adapters.packundo] INFO pre_pack: filling the pack_object table
2019-08-13 09:49:05,085 [relstorage.adapters.packundo] INFO pre_pack: downloading pack_object and object_ref.
2019-08-13 09:49:05,177 [relstorage.adapters.packundo] INFO pre_pack: traversing the object graph to find reachable objects.
2019-08-13 09:49:05,604 [relstorage.adapters.packundo] INFO pre_pack: marking objects reachable: 18729
2019-08-13 09:49:05,965 [relstorage.adapters.packundo] INFO pre_pack: finished successfully
2019-08-13 09:49:05,966 [relstorage.storage.pack] INFO pack: pre-pack complete
2019-08-13 09:49:05,993 [relstorage.adapters.packundo] INFO pack: will remove 46 object(s)
2019-08-13 09:49:06,045 [relstorage.adapters.packundo] INFO pack: cleaning up
2019-08-13 09:49:06,201 [relstorage.adapters.packundo] INFO pack: finished successfully
2019-08-13 09:49:07,329 [zodbpack] INFO Packed storage (RelStorageFactory).

Output of zodbpack with error in stress script:

# /opt/plone/worker0/instance/bin/zodbpack  -d 0 /opt/plone/worker0/instance/zodbpack.conf
2019-08-13 09:49:18,967 [zodbpack] INFO Opening storage (RelStorageFactory)...
2019-08-13 09:49:19,036 [relstorage] INFO Reading stored procedures from /opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/postgresql/procs and /opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/postgresql/procs/hf
2019-08-13 09:49:19,179 [zodbpack] INFO Packing storage (RelStorageFactory).
2019-08-13 09:49:19,193 [relstorage.storage.pack] INFO pack: beginning pre-pack
2019-08-13 09:49:19,207 [relstorage.storage.pack] INFO Analyzing transactions committed Tue Aug 13 09:49:16 2019 or before (TID 275204965579591253)
2019-08-13 09:49:21,027 [relstorage.adapters.packundo] INFO pre_pack: analyzing references from 174 object(s)
2019-08-13 09:49:21,584 [relstorage.adapters.packundo] INFO pre_pack: objects analyzed: 174/174
2019-08-13 09:49:21,584 [relstorage.adapters.packundo] INFO pre_pack: filling the pack_object table
2019-08-13 09:49:23,553 [relstorage.adapters.packundo] INFO pre_pack: downloading pack_object and object_ref.
2019-08-13 09:49:23,662 [relstorage.adapters.packundo] INFO pre_pack: traversing the object graph to find reachable objects.
2019-08-13 09:49:23,838 [relstorage.adapters.packundo] INFO pre_pack: marking objects reachable: 102
2019-08-13 09:49:27,227 [relstorage.adapters.packundo] INFO pre_pack: finished successfully
2019-08-13 09:49:27,227 [relstorage.storage.pack] INFO pack: pre-pack complete
2019-08-13 09:49:27,287 [relstorage.adapters.packundo] INFO pack: will remove 18592 object(s)
2019-08-13 09:49:30,377 [relstorage.adapters.packundo] INFO pack: removed 2000 (10.8%) state(s)
2019-08-13 09:49:31,984 [relstorage.adapters.packundo] INFO pack: removed 4200 (22.6%) state(s)
2019-08-13 09:49:36,357 [relstorage.adapters.packundo] INFO pack: removed 5000 (26.9%) state(s)
2019-08-13 09:49:40,230 [relstorage.adapters.packundo] INFO pack: removed 7000 (37.7%) state(s)
2019-08-13 09:49:45,810 [relstorage.adapters.packundo] INFO pack: removed 9300 (50.0%) state(s)
2019-08-13 09:49:48,329 [relstorage.adapters.packundo] INFO pack: removed 11500 (61.9%) state(s)
2019-08-13 09:49:50,802 [relstorage.adapters.packundo] INFO pack: removed 13900 (74.8%) state(s)
2019-08-13 09:49:55,454 [relstorage.adapters.packundo] INFO pack: removed 16300 (87.7%) state(s)
2019-08-13 09:50:00,589 [relstorage.adapters.packundo] INFO pack: removed 18300 (98.4%) state(s)
2019-08-13 09:50:06,011 [relstorage.adapters.packundo] INFO pack: cleaning up
2019-08-13 09:50:07,055 [relstorage.adapters.packundo] INFO pack: finished successfully

Sometimes the pack script hangs after finishing and have to be killed.

We tested this with RelStorage 2.0.0 and 3.0a7.

Example output of stress-script with RelStorage 2.0.0:

...
Traceback (most recent call last):
  File "/opt/plone/worker0/instance/parts/instance/bin/interpreter", line 375, in <module>
    exec(_val)
  File "<string>", line 1, in <module>
  File "./stress.py", line 92, in <module>
    stress(app)
  File "./stress.py", line 86, in stress
    transaction.commit()
  File "/opt/plone/buildout-cache/eggs/transaction-2.1.2-py2.7.egg/transaction/_manager.py", line 131, in commit
    return self.get().commit()
  File "/opt/plone/buildout-cache/eggs/transaction-2.1.2-py2.7.egg/transaction/_transaction.py", line 310, in commit
    reraise(t, v, tb)
  File "/opt/plone/buildout-cache/eggs/transaction-2.1.2-py2.7.egg/transaction/_transaction.py", line 301, in commit
    self._commitResources()
  File "/opt/plone/buildout-cache/eggs/transaction-2.1.2-py2.7.egg/transaction/_transaction.py", line 446, in _commitResources
    reraise(t, v, tb)
  File "/opt/plone/buildout-cache/eggs/transaction-2.1.2-py2.7.egg/transaction/_transaction.py", line 423, in _commitResources
    rm.tpc_vote(self)
  File "/opt/plone/buildout-cache/eggs/ZODB-5.3.0-py2.7.egg/ZODB/Connection.py", line 693, in tpc_vote
    s = vote(transaction)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 133, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-2.0.0-py2.7-linux-x86_64.egg/relstorage/storage.py", line 953, in tpc_vote
    resolved_by_vote = self._vote()
  File "/opt/plone/buildout-cache/eggs/RelStorage-2.0.0-py2.7-linux-x86_64.egg/relstorage/storage.py", line 1001, in _vote
    oid=oid, serials=(actual, expect))
ZODB.POSException.ReadConflictError: database read conflict error (oid 0x0be6, serial this txn started with 0x03d1a45f50bdc0bb 2019-08-09 13:51:18.923724, serial currently committed 0x00 1900-01-01 00:00:00.000000)

Example output of stress-script with RelStorage 3.0a7:

...
2019-08-13 09:49:30,923 - StressTest - INFO - Reindex brains
2019-08-13 09:49:35,856 - StressTest - INFO - Reindex brains
2019-08-13 09:49:40,897 - StressTest - INFO - Reindex brains
2019-08-13 09:49:40 WARNING relstorage.storage.load POSKeyError on oid 30169: no tid found; history-free adapter
2019-08-13 09:49:40,908 - relstorage.storage.load - WARNING - POSKeyError on oid 30169: no tid found; history-free adapter
2019-08-13 09:49:40 ERROR ZODB.Connection Couldn't load state for plone.app.contenttypes.content.Folder 0x75d9
Traceback (most recent call last):
  File "/opt/plone/buildout-cache/eggs/ZODB-5.5.0-py2.7.egg/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 127, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/load.py", line 120, in load
    raise POSKeyError(oid)
POSKeyError: 0x75d9
2019-08-13 09:49:40,908 - ZODB.Connection - ERROR - Couldn't load state for plone.app.contenttypes.content.Folder 0x75d9
Traceback (most recent call last):
  File "/opt/plone/buildout-cache/eggs/ZODB-5.5.0-py2.7.egg/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 127, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/load.py", line 120, in load
    raise POSKeyError(oid)
POSKeyError: 0x75d9
2019-08-13 09:49:40 WARNING relstorage.storage.load POSKeyError on oid 30169: no tid found; history-free adapter
2019-08-13 09:49:40,911 - relstorage.storage.load - WARNING - POSKeyError on oid 30169: no tid found; history-free adapter
2019-08-13 09:49:40 ERROR ZODB.Connection Couldn't load state for plone.app.contenttypes.content.Folder 0x75d9
Traceback (most recent call last):
  File "/opt/plone/buildout-cache/eggs/ZODB-5.5.0-py2.7.egg/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 127, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/load.py", line 120, in load
    raise POSKeyError(oid)
POSKeyError: 0x75d9
2019-08-13 09:49:40,911 - ZODB.Connection - ERROR - Couldn't load state for plone.app.contenttypes.content.Folder 0x75d9
Traceback (most recent call last):
  File "/opt/plone/buildout-cache/eggs/ZODB-5.5.0-py2.7.egg/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 127, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/load.py", line 120, in load
    raise POSKeyError(oid)
POSKeyError: 0x75d9
Traceback (most recent call last):
  File "/opt/plone/worker1/instance/parts/instance/bin/interpreter", line 313, in <module>
    exec(_val)
  File "<string>", line 1, in <module>
  File "/opt/plone/plone-notes/stress.py", line 97, in <module>
    stress(app)
  File "/opt/plone/plone-notes/stress.py", line 89, in stress
    ob.reindexObject()
  File "/opt/plone/buildout-cache/eggs/ZODB-5.5.0-py2.7.egg/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 127, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/load.py", line 120, in load
    raise POSKeyError(oid)
ZODB.POSException.POSKeyError: 0x75d9

and

...
2019-08-13 09:33:52,282 - StressTest - INFO - Reindex brains
Traceback (most recent call last):
  File "/opt/plone/worker1/instance/parts/instance/bin/interpreter", line 313, in <module>
    exec(_val)
  File "<string>", line 1, in <module>
  File "/opt/plone/plone-notes/stress.py", line 97, in <module>
    stress(app)
  File "/opt/plone/plone-notes/stress.py", line 91, in stress
    transaction.commit()
  File "/opt/plone/buildout-cache/eggs/transaction-2.4.0-py2.7.egg/transaction/_manager.py", line 252, in commit
    return self.manager.commit()
  File "/opt/plone/buildout-cache/eggs/transaction-2.4.0-py2.7.egg/transaction/_manager.py", line 131, in commit
    return self.get().commit()
  File "/opt/plone/buildout-cache/eggs/transaction-2.4.0-py2.7.egg/transaction/_transaction.py", line 311, in commit
    reraise(t, v, tb)
  File "/opt/plone/buildout-cache/eggs/transaction-2.4.0-py2.7.egg/transaction/_transaction.py", line 302, in commit
    self._commitResources()
  File "/opt/plone/buildout-cache/eggs/transaction-2.4.0-py2.7.egg/transaction/_transaction.py", line 447, in _commitResources
    reraise(t, v, tb)
  File "/opt/plone/buildout-cache/eggs/transaction-2.4.0-py2.7.egg/transaction/_transaction.py", line 424, in _commitResources
    rm.tpc_vote(self)
  File "/opt/plone/buildout-cache/eggs/ZODB-5.5.0-py2.7.egg/ZODB/Connection.py", line 692, in tpc_vote
    s = vote(transaction)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 133, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/__init__.py", line 442, in tpc_vote
    next_phase = self._tpc_phase.tpc_vote(transaction, self)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/tpc/begin.py", line 97, in tpc_vote
    next_phase.enter(storage)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/tpc/vote.py", line 124, in enter
    resolved_in_vote_oid_ints = self.__vote(storage)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/storage/tpc/vote.py", line 270, in __vote
    conflicts = adapter.lock_objects_and_detect_conflicts(cursor, self.required_tids)
  File "/opt/plone/buildout-cache/eggs/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 133, in call_with_metric
    return f(*args, **kw)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/adapter.py", line 160, in lock_objects_and_detect_conflicts
    kind
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/locker.py", line 263, in reraise_commit_lock_error
    sys.exc_info()[2])
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/adapter.py", line 150, in lock_objects_and_detect_conflicts
    return self._best_lock_objects_and_detect_conflicts(cursor, read_current_oids)
  File "/opt/plone/buildout-cache/eggs/RelStorage-3.0a7-py2.7-linux-x86_64.egg/relstorage/adapters/postgresql/adapter.py", line 289, in _best_lock_objects_and_detect_conflicts
    (read_current_oids_p, read_current_tids_p,))
relstorage.adapters.interfaces.UnableToLockRowsToReadCurrentError: Acquiring a commit lock failed: could not obtain lock on row in relation "object_state"
CONTEXT:  PL/pgSQL function lock_objects_and_detect_conflicts(bigint[],bigint[]) line 35 at RETURN QUERY

Transaction failed; no lock info available

We use Postgresql.

Please let me know if more information is needed.

@jamadden

This comment has been minimized.

Copy link
Member

commented Aug 14, 2019

Thanks for the detailed report! We are actively working on the locking strategy in RelStorage currently.

@andbag

This comment has been minimized.

Copy link

commented Aug 23, 2019

PR #325 now demonstrates the error relatively reliably. Packing with simultaneous write access at ZODB level undermines ZODB transaction logic. Although the transaction is successfully completed at ZODB level, there is no complete persistence of the changes in the history-free RelStorage backend. Consequently, the changed objects at the ZODB level are corrupt and raise a POSKeyError when they are accessed after successful commit. Apparently, the packing logic at SQL level does not harmonize with the transaction logic of the ZODB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.