[WIP] MVCCAdapter.load: Raise ReadConflictError only if pack is running simultaneously #322

Open
wants to merge 1 commit into
base: master

navytux
Contributor

@navytux navytux commented Jul 27, 2020

Currently, when load(oid) finds that the object was deleted, it raises
ReadConflictError - not POSKeyError - because a pack could be running
simultaneously and the deletion could result from the pack. In that case
we want the corresponding transaction to be retried - not failed - which,
for backward-compatibility reasons, is done by raising a ConflictError
subclass. However, from a semantic point of view it is more correct to
raise POSKeyError when an object is found to be deleted or
not-yet-created, and to raise ReadConflictError only if a pack was
actually running simultaneously and the deletion could result from that
pack.

-> Fix MVCCAdapter.load to do this: it now raises ReadConflictError
only if the MVCCAdapterInstance view is earlier than the storage pack
time, which indicates that there could indeed be a conflict between the
read access and the pack removing the object.

To detect whether a pack ran beyond the MVCCAdapterInstance view, we
need to teach storage drivers to provide a way to know what the last
pack time/transaction was. Add an optional IStorageLastPack interface
with a .lastPack() method to do so. If a storage does not implement
lastPack, we take the conservative approach and raise ReadConflictError
unconditionally, as before.
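
A minimal sketch of the decision described above, not the code from this
patch. Assumptions: tids are plain integers, the two exception classes are
stand-ins for the ones in ZODB.POSException, the helper name
missing_object_error and both toy storage classes are hypothetical, and the
strict `<` comparison is only one plausible reading of "view appears before
storage packtime".

```python
class POSKeyError(KeyError):
    """Stand-in for ZODB.POSException.POSKeyError (assumption)."""


class ReadConflictError(Exception):
    """Stand-in for ZODB.POSException.ReadConflictError (assumption)."""


def missing_object_error(storage, oid, view_tid):
    """Pick the exception to raise when load(oid) found no data.

    view_tid is the (integer) tid at which the adapter instance views
    the database.
    """
    last_pack = getattr(storage, "lastPack", None)
    if last_pack is None:
        # Storage cannot tell when it was last packed: stay conservative.
        return ReadConflictError(oid)
    if view_tid < last_pack():
        # A pack ran beyond our view, so it may have removed the object.
        return ReadConflictError(oid)
    # No pack overlaps our view: the object is deleted or not yet created.
    return POSKeyError(oid)


class PackAwareStorage:
    """Hypothetical storage that reports its last pack tid."""

    def __init__(self, packed_up_to):
        self._packed_up_to = packed_up_to

    def lastPack(self):
        return self._packed_up_to


class LegacyStorage:
    """Hypothetical storage without lastPack."""
```

With these stand-ins, a view at tid 10 against a storage packed up to tid 20
yields ReadConflictError, while a view at tid 30 yields POSKeyError.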

Add/adapt corresponding tests.

Teach FileStorage, MappingStorage and DemoStorage to implement the new
interface.

NOTE: storages that implement IMVCCStorage natively already raise
POSKeyError - not ReadConflictError - when load(oid) finds a deleted
object. This is because IMVCCStorages natively provide isolation, e.g.
via an RDBMS in the case of RelStorage. The isolation property provided
by the RDBMS guarantees that a connection's view of the database is not
affected by other changes - e.g. pack - until the connection's
transaction is complete.

/cc @jimfulton

@navytux
Contributor Author

navytux commented Jul 27, 2020

( fixed travis wrt ZOPE_INTERFACE_STRICT_IRO=1 )

navytux added a commit to navytux/ZEO that referenced this pull request Jul 29, 2020
Else those non-current entries can be used to serve a loadBefore request
with data, whereas after a pack that loadBefore request must return "data
deleted" if the requested object has current revision >= packtime.

Fixes checkPackVSConnectionGet from ZODB (zopefoundation/ZODB#322),
which without this patch fails with e.g.:

    Failure in test checkPackVSConnectionGet (ZEO.tests.testZEO.MappingStorageTests)
    Traceback (most recent call last):
      File "/usr/lib/python2.7/unittest/case.py", line 329, in run
        testMethod()
      File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/PackableStorage.py", line 636, in checkPackVSConnectionGet
        raises(ReadConflictError, conn1.get, oid)
      File "/usr/lib/python2.7/unittest/case.py", line 473, in assertRaises
        callableObj(*args, **kwargs)
      File "/usr/lib/python2.7/unittest/case.py", line 116, in __exit__
        "{0} not raised".format(exc_name))
    AssertionError: ReadConflictError not raised
@navytux
Contributor Author

navytux commented Jul 29, 2020

The checkPackVSConnectionGet test added by this patch revealed a ZEO cache coherency problem; the fix is zopefoundation/ZEO#162.

@navytux
Contributor Author

navytux commented Jul 31, 2020

Rebased on top of master since #320 was merged.

Reminder:

  1. #320 (Kill leftovers of pre-MVCC read conflicts) removed pre-MVCC leftovers without changing semantics. It is a prerequisite for this patch (change 2).
  2. This patch: let's switch MVCCAdapter to raise POSKeyError when load cannot find the object, or sees that it was deleted, and to raise ReadConflictError only if load actually detects a simultaneous pack that overlaps with the MVCCAdapter view of the database. This is implemented by teaching storages to report their lastPack time. It is a prerequisite for change 3.
  3. Let's add loadAt and switch the ZODB codebase to it. This fixes DemoStorage corruption (#318: DemoStorage does not take whiteouts into account -> leading to data corruption) and relieves storage servers of unnecessary work on every object access. In particular, it has the potential to reduce the number of SQL queries on every load by 2x for NEO (see #318 (comment)). This is #323 (loadAt), and it was the initial motivation for all this work.

/cc @jamadden, @vpelletier, @jimfulton, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82

@jmuchemb
Member

In NEO, we are about to implement partial pack, and even different ways of packing partially. This PR seems to be a dead end because lastPack won't be enough.

IStorage is a better place than MVCCAdapterInstance to raise the appropriate exception. For minimum compatibility breakage, I suggest the following changes:

  • in MVCCAdapterInstance, just do s/ReadConflictError/POSKeyError/
  • in IStorage.loadBefore, clarify that ReadConflictError should be raised if the deletion could result from some pack
  • update storage implementations

In the worst case (an old storage implementation), POSKeyError could be raised instead of ReadConflictError. Would that be a major issue?
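
The three changes suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: ToyStorage, its `_packed_up_to` attribute and adapter_load are hypothetical names, tids are plain integers, the exception classes are stand-ins for the ones in ZODB.POSException, and the toy loadBefore returns (tid, data) rather than the real IStorage result.

```python
class POSKeyError(KeyError):
    """Stand-in for ZODB.POSException.POSKeyError (assumption)."""


class ReadConflictError(Exception):
    """Stand-in for ZODB.POSException.ReadConflictError (assumption)."""


class ToyStorage:
    """Hypothetical storage: {oid: [(tid, data), ...]} plus a pack horizon."""

    def __init__(self):
        self._revs = {}
        self._packed_up_to = 0  # tid up to which history may have been packed

    def store(self, oid, tid, data):
        self._revs.setdefault(oid, []).append((tid, data))

    def loadBefore(self, oid, before_tid):
        older = [r for r in self._revs.get(oid, []) if r[0] < before_tid]
        if older:
            # Most recent revision strictly before before_tid.
            return max(older, key=lambda r: r[0])
        if before_tid <= self._packed_up_to:
            # The storage itself decides the absence may be due to a pack.
            raise ReadConflictError(oid)
        return None


def adapter_load(storage, oid, view_tid):
    """What MVCCAdapterInstance.load would reduce to in this variant."""
    r = storage.loadBefore(oid, view_tid)
    if r is None:
        # s/ReadConflictError/POSKeyError/ in the adapter.
        raise POSKeyError(oid)
    return r
```

An old storage that never raises ReadConflictError from loadBefore would simply end up in the POSKeyError branch, which is the worst case mentioned above.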

An alternative solution is to add a more generic method (rather than lastPack, which assumes too much about how pack is working). For example:

     def isBeingPacked(oid, before_tid): # same parameters as loadBefore
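
As a rough illustration of how a storage with partial pack might answer such a query: only the isBeingPacked(oid, before_tid) signature comes from the suggestion above; the class name, the per-oid bookkeeping, the pack_object helper and the integer tids are all assumptions made for the sketch.

```python
class PartiallyPackedStorage:
    """Toy storage that remembers, per oid, the tid up to which it was packed."""

    def __init__(self):
        self._packed_up_to = {}  # oid -> highest pack tid applied to that oid

    def pack_object(self, oid, pack_tid):
        # Hypothetical partial pack: only this object's history is packed.
        prev = self._packed_up_to.get(oid, 0)
        self._packed_up_to[oid] = max(prev, pack_tid)

    def isBeingPacked(self, oid, before_tid):
        # True if a pack may have removed the revision of oid that a
        # loadBefore(oid, before_tid) call would otherwise have returned.
        return before_tid <= self._packed_up_to.get(oid, 0)
```

A caller could then raise ReadConflictError when isBeingPacked returns true and POSKeyError otherwise, without assuming a single storage-wide pack time.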

@navytux
Contributor Author

navytux commented Dec 7, 2020

( @jmuchemb your feedback is not ignored - I just have not gotten back to this topic yet. Sorry for the delay in replying )

@navytux
Contributor Author

navytux commented Mar 26, 2021

I've reworked loadAt not to depend on this patch (#323 (comment)).
I would like to put this patch on hold for now and focus on merging loadAt first.

@navytux navytux changed the title from "MVCCAdapter.load: Raise ReadConflictError only if pack is running simultaneously" to "[WIP] MVCCAdapter.load: Raise ReadConflictError only if pack is running simultaneously" Mar 29, 2021