Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

interface: Require invalidations to be called with full set of objects and not to skip transactions #319

Closed
wants to merge 1 commit into from

Conversation

navytux
Copy link
Contributor

@navytux navytux commented Jul 16, 2020

Currently invalidate documentation is not clear whether it should be
called for every transaction and whether it should include full set of
objects created/modified by that transaction. Until now this was working
relatively well for the sole purpose of invalidating client ZEO cache,
because for that particular task it is relatively OK not to include just
created objects into invalidation messages, and even to completely skip
sending invalidation if transaction only create - not modify - objects.
Due to this fact the workings of the client cache was indifferent to the
ambiguity of the interface.

In 2016 skipping transactions with only created objects was reconsidered
as bug and fixed in ZEO5 because ZODB5 relies more heavily on MVCC
semantic and needs to be notified about every transaction committed to
storage to be able to properly update ZODB.Connection view:

zopefoundation/ZEO@02943acd#diff-52fb76aaf08a1643cdb8fdaf69e37802L889-R834
zopefoundation/ZEO@9613f09b

However just-created objects were not included into invalidation
messages until, hopefully, recently:

zopefoundation/ZEO#160

As ZODB is started to be used more widely in areas where it was not
traditionally used before, the ambiguity in invalidate interface and the
lack of guarantees - for any storage - to be notified with full set of
information, creates at least the following problems:

  • a ZODB client (not necessarily native ZODB/py client) can maintain
    raw cache for the storage. If such client tries to load an oid at
    database view when that object did not existed yet, gets "no object"
    reply and stores that information into raw cache, to properly invalidate
    the cache it needs an invalidation message from ZODB server that
    includes created object.

  • tools like zodb watch [1,2,3] don't work properly (give incorrect output)
    if not all objects modified/created by a transaction are included into
    invalidation messages.

  • similarly to zodb watch, a monitoring tool, that would want to be
    notified of all created/modified objects, won't see full
    database-change picture, and so won't work properly without knowing
    which objects were created.

  • wendelin.core 2 - which builds data from ZODB BTrees and data objects
    into virtual filesystem - needs to get invalidation messages with both
    modified and created objects to properly implement its own lazy
    invalidation and isolation protocol for file blocks in OS cache: when
    a block of file is accessed, all clients, that have this block mmaped,
    need to be notified and asked to remmap that block into particular
    revision of the file depending on a client's view of the filesystem and
    database [4,5].

    To compute to where a client needs to remmap the block, WCFS server
    (that in turn acts as ZODB client wrt ZEO/NEO server), needs to be able
    to see whether client's view of the filesystem is before object creation
    (and then ask that client to pin that block to hole), or after creation
    (and then ask the client to pin that block to corresponding revision).

    This computation needs ZODB server to send invalidation messages in
    full: with both modified and just created objects.

Also:

Current state of storages with respect to new requirements:

While editing invalidate documentation, use the occasion to document
recently added property that invalidate(tid) is always called before
storage starts to report its lastTransaction() ≥ tid - see 4a6b028
(mvccadapter: check if the last TID changed without invalidation).

/cc @jimfulton, @jamadden, @jmuchemb, @vpelletier, @arnaud-fontaine, @gidzit, @klawlf82, @jwolf083

[1] https://lab.nexedi.com/kirr/neo/blob/049cb9a0/go/zodb/zodbtools/watch.go
[2] https://lab.nexedi.com/kirr/neo/commit/e0d59f5d
[3] https://lab.nexedi.com/kirr/neo/commit/c41c2907

[4] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/wcfs.go#L94-182
[5] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/client/wcfs.h#L20-71

navytux added a commit to navytux/ZODB that referenced this pull request Jul 27, 2020
loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

1) returns data=None and serial of the removal, when loaded object was found to
   be deleted. loadBefore is returning only data=None in such case. This loadAt
   property allows to fix DemoStorage data corruption when whiteouts in overlay
   part were not previously correctly taken into account.

   zopefoundation#318

2) for regular data records, does not require storages to return next_serial,
   in addition to (data, serial). loadBefore requirement to return both
   serial and next_serial is constraining storages unnecessarily, and,
   while for FileStorage it is free to implement, for other storages it is
   not - for example for NEO and RelStorage, finding out next_serial, after
   looking up oid@at data record, costs one more SQL query:

   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

   next_serial is not only about execution overhead - it is semantically
   redundant to be there and can be removed from load return. The reason
   I say that next_serial can be removed is that in ZODB/py the only place,
   that I could find, where next_serial is used on client side is in client
   cache (e.g. in NEO client cache), and that cache can be remade to
   work without using that next_serial at all. In simple words whenever
   after

     loadAt(oid, at)  ->  (data, serial)

   query, the cache can remember data for oid in [serial, at] range.

   Next, when invalidation message from server is received, cache entries,
   that had at == client_head, are extended (at -> new_head) for oids that
   are not present in invalidation message, while for oids that are present
   in invalidation message no such extension is done. This allows to
   maintain cache in correct state, invalidate it when there is a need to
   invalidate, and not to throw away cache entries that should remain live.
   This of course requires ZODB server to include both modified and
   just-created objects into invalidation messages

     ( zopefoundation/ZEO#160 ,
       zopefoundation#319 ).

   Switching to loadAt should thus allow storages like NEO and, maybe,
   RelStorage, to do 2x less SQL queries on every object access.

   zopefoundation#318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError				object does not exist at all
   None					object was removed
   (data, serial, next_serial)		regular data record

used by loadBefore.

This patch:

- introduces new interface.
- introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
  or, if the storage does not implement loadAt interface, tries to mimic
  loadAt semantic via storage.loadBefore to possible extent + emits
  corresponding warning.
- converts MVCCAdapter to use loadAt instead of loadBefore.
- changes DemoStorage to use loadAt, and this way fixes above-mentioned
  data corruption issue; adds corresponding test; converts
  DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
- adds loadAt implementation to FileStorage and MappingStorage.
- adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
@navytux navytux mentioned this pull request Jul 27, 2020
@navytux
Copy link
Contributor Author

navytux commented Jul 27, 2020

Also:

Corresponding patch is here: #323.

navytux added a commit to navytux/ZODB that referenced this pull request Jul 27, 2020
loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

1) returns data=None and serial of the removal, when loaded object was found to
   be deleted. loadBefore is returning only data=None in such case. This loadAt
   property allows to fix DemoStorage data corruption when whiteouts in overlay
   part were not previously correctly taken into account.

   zopefoundation#318

2) for regular data records, does not require storages to return next_serial,
   in addition to (data, serial). loadBefore requirement to return both
   serial and next_serial is constraining storages unnecessarily, and,
   while for FileStorage it is free to implement, for other storages it is
   not - for example for NEO and RelStorage, finding out next_serial, after
   looking up oid@at data record, costs one more SQL query:

   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

   next_serial is not only about execution overhead - it is semantically
   redundant to be there and can be removed from load return. The reason
   I say that next_serial can be removed is that in ZODB/py the only place,
   that I could find, where next_serial is used on client side is in client
   cache (e.g. in NEO client cache), and that cache can be remade to
   work without using that next_serial at all. In simple words whenever
   after

     loadAt(oid, at)  ->  (data, serial)

   query, the cache can remember data for oid in [serial, at] range.

   Next, when invalidation message from server is received, cache entries,
   that had at == client_head, are extended (at -> new_head) for oids that
   are not present in invalidation message, while for oids that are present
   in invalidation message no such extension is done. This allows to
   maintain cache in correct state, invalidate it when there is a need to
   invalidate, and not to throw away cache entries that should remain live.
   This of course requires ZODB server to include both modified and
   just-created objects into invalidation messages

     ( zopefoundation/ZEO#160 ,
       zopefoundation#319 ).

   Switching to loadAt should thus allow storages like NEO and, maybe,
   RelStorage, to do 2x less SQL queries on every object access.

   zopefoundation#318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError				object does not exist at all
   None					object was removed
   (data, serial, next_serial)		regular data record

used by loadBefore.

This patch:

- introduces new interface.
- introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
  or, if the storage does not implement loadAt interface, tries to mimic
  loadAt semantic via storage.loadBefore to possible extent + emits
  corresponding warning.
- converts MVCCAdapter to use loadAt instead of loadBefore.
- changes DemoStorage to use loadAt, and this way fixes above-mentioned
  data corruption issue; adds corresponding test; converts
  DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
- adds loadAt implementation to FileStorage and MappingStorage.
- adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
navytux added a commit to navytux/ZODB that referenced this pull request Jul 27, 2020
loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

1) returns data=None and serial of the removal, when loaded object was found to
   be deleted. loadBefore is returning only data=None in such case. This loadAt
   property allows to fix DemoStorage data corruption when whiteouts in overlay
   part were not previously correctly taken into account.

   zopefoundation#318

2) for regular data records, does not require storages to return next_serial,
   in addition to (data, serial). loadBefore requirement to return both
   serial and next_serial is constraining storages unnecessarily, and,
   while for FileStorage it is free to implement, for other storages it is
   not - for example for NEO and RelStorage, finding out next_serial, after
   looking up oid@at data record, costs one more SQL query:

   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

   next_serial is not only about execution overhead - it is semantically
   redundant to be there and can be removed from load return. The reason
   I say that next_serial can be removed is that in ZODB/py the only place,
   that I could find, where next_serial is used on client side is in client
   cache (e.g. in NEO client cache), and that cache can be remade to
   work without using that next_serial at all. In simple words whenever
   after

     loadAt(oid, at)  ->  (data, serial)

   query, the cache can remember data for oid in [serial, at] range.

   Next, when invalidation message from server is received, cache entries,
   that had at == client_head, are extended (at -> new_head) for oids that
   are not present in invalidation message, while for oids that are present
   in invalidation message no such extension is done. This allows to
   maintain cache in correct state, invalidate it when there is a need to
   invalidate, and not to throw away cache entries that should remain live.
   This of course requires ZODB server to include both modified and
   just-created objects into invalidation messages

     ( zopefoundation/ZEO#160 ,
       zopefoundation#319 ).

   Switching to loadAt should thus allow storages like NEO and, maybe,
   RelStorage, to do 2x less SQL queries on every object access.

   zopefoundation#318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError				object does not exist at all
   None					object was removed
   (data, serial, next_serial)		regular data record

used by loadBefore.

This patch:

- introduces new interface.
- introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
  or, if the storage does not implement loadAt interface, tries to mimic
  loadAt semantic via storage.loadBefore to possible extent + emits
  corresponding warning.
- converts MVCCAdapter to use loadAt instead of loadBefore.
- changes DemoStorage to use loadAt, and this way fixes above-mentioned
  data corruption issue; adds corresponding test; converts
  DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
- adds loadAt implementation to FileStorage and MappingStorage.
- adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
@navytux
Copy link
Contributor Author

navytux commented Jul 29, 2020

Current state of storages with respect to new requirements:

https://github.com/zodb/relstorage/blob/3.1.2-1-gaf57d6c/src/relstorage/adapters/poller.py#L28-L145

ZEO has just been fixed by zopefoundation/ZEO@ab86bd72 (zopefoundation/ZEO#160). Now it should thus be safe to merge this patch to interface.

Is it ok?

/cc @jamadden, @jimfulton

Copy link
Member

@dataflake dataflake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just language improvements. I lack the knowledge to judge the actual technical content.

src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
@navytux
Copy link
Contributor Author

navytux commented Jul 29, 2020

@dataflake, thanks for feedback. I'm not a native speaker and am out of spoken language practice for a long time. Thanks for pointing out the places to fix. I took your input into account and amended the patch correspondingly. Is it better now?

interdiff
diff --git a/src/ZODB/interfaces.py b/src/ZODB/interfaces.py
index c2568c3c6..2e0dfea1a 100644
--- a/src/ZODB/interfaces.py
+++ b/src/ZODB/interfaces.py
@@ -281,9 +281,9 @@ class IStorageWrapper(Interface):
     - Out-of-band invalidation support
 
       A storage can notify it's wrapper of object invalidations that
-      don't occur due to direct operations on the storage. This is
-      used by non-IMVCCStorage storages like ZEO and NEO to pass
-      invalidation messages sent from storage server.
+      don't occur due to direct operations on the storage. This is used
+      by client part of non-IMVCCStorage storages like ZEO and NEO to
+      pass invalidation messages sent from a storage server.
 
     - Record-reference extraction
 
@@ -319,13 +319,14 @@ def invalidate(transaction_id, oids):
 
         The oids argument is an iterable of object identifiers.
 
-        Unless cache needs to be completely cleared via invalidateCache
-        event, wrapped storage calls invalidate for all transactions in
-        the order as they are committed. For every transaction full set
-        of its objects - both modified and just created - is reported.
+        Unless the cache needs to be completely cleared via
+        invalidateCache event, the wrapped storage calls invalidate for
+        all transactions in the order as they are committed. For every
+        transaction the full set of its objects - both modified and just
+        created - is reported.
 
-        invalidate(tid) is always called before storage starts to report
-        its lastTransaction() ≥ tid.
+        invalidate(tid) is always called before the storage starts to
+        report its lastTransaction() ≥ tid.
 
         The version argument is provided for backward
         compatibility. If passed, it must be an empty string.
@@ -1218,13 +1219,18 @@ def release():
     def poll_invalidations():
         """Poll the storage for external changes.
 
-        Returns either None, or a sequence of OIDs that have created and changed.
-        When a sequence is returned, the corresponding objects should be removed
-        from the ZODB in-memory cache.  When None is returned, the storage is
-        indicating that so much time has elapsed since the last poll that it
-        is no longer possible to enumerate all of the changed OIDs, since the
-        previous transaction seen by the connection has already been packed.
-        In that case, the ZODB in-memory cache should be cleared.
+        Returns either None, or a sequence of OIDs with the full set of
+        objects that have been created or changed since previous call to
+        poll_invalidations.
+
+        When a sequence is returned, the corresponding objects should be
+        removed from the ZODB in-memory cache.
+
+        When None is returned, the storage is indicating that so much time has
+        elapsed since the last poll that it is no longer possible to enumerate
+        all of the changed OIDs, since the previous transaction seen by the
+        connection has already been packed. In that case, the ZODB in-memory
+        cache should be cleared.
         """
 
     def sync(force=True):

@navytux
Copy link
Contributor Author

navytux commented Jul 29, 2020

( CI failures are unrelated: it is only "AppVeyor: Could not resolve host: github.com" (example) )

Copy link
Member

@dataflake dataflake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some more cleanup.

I am marking this as "Approved" just based on my language changes, I can't judge the technical details.

src/ZODB/interfaces.py Outdated Show resolved Hide resolved
src/ZODB/interfaces.py Outdated Show resolved Hide resolved
…s and not to skip transactions

Currently invalidate documentation is not clear whether it should be
called for every transaction and whether it should include full set of
objects created/modified by that transaction. Until now this was working
relatively well for the sole purpose of invalidating client ZEO cache,
because for that particular task it is relatively OK not to include just
created objects into invalidation messages, and even to completely skip
sending invalidation if transaction only create - not modify - objects.
Due to this fact the workings of the client cache was indifferent to the
ambiguity of the interface.

In 2016 skipping transactions with only created objects was reconsidered
as bug and fixed in ZEO5 because ZODB5 relies more heavily on MVCC
semantic and needs to be notified about every transaction committed to
storage to be able to properly update ZODB.Connection view:

zopefoundation/ZEO@02943acd#diff-52fb76aaf08a1643cdb8fdaf69e37802L889-R834
zopefoundation/ZEO@9613f09b

However just-created objects were not included into invalidation
messages until, hopefully, recently:

zopefoundation/ZEO#160

As ZODB is started to be used more widely in areas where it was not
traditionally used before, the ambiguity in invalidate interface and the
lack of guarantees - for any storage - to be notified with full set of
information, creates at least the following problems:

- a ZODB client (not necessarily native ZODB/py client) can maintain
  raw cache for the storage. If such client tries to load an oid at
  database view when that object did not existed yet, gets "no object"
  reply and stores that information into raw cache, to properly invalidate
  the cache it needs an invalidation message from ZODB server that
  *includes* created object.

- tools like `zodb watch` [1,2,3] don't work properly (give incorrect output)
  if not all objects modified/created by a transaction are included into
  invalidation messages.

- similarly to `zodb watch`, a monitoring tool, that would want to be
  notified of all created/modified objects, won't see full
  database-change picture, and so won't work properly without knowing
  which objects were created.

- wendelin.core 2 - which builds data from ZODB BTrees and data objects
  into virtual filesystem - needs to get invalidation messages with both
  modified and created objects to properly implement its own lazy
  invalidation and isolation protocol for file blocks in OS cache: when
  a block of file is accessed, all clients, that have this block mmaped,
  need to be notified and asked to remmap that block into particular
  revision of the file depending on a client's view of the filesystem and
  database [4,5].

  To compute to where a client needs to remmap the block, WCFS server
  (that in turn acts as ZODB client wrt ZEO/NEO server), needs to be able
  to see whether client's view of the filesystem is before object creation
  (and then ask that client to pin that block to hole), or after creation
  (and then ask the client to pin that block to corresponding revision).

  This computation needs ZODB server to send invalidation messages in
  full: with both modified and just created objects.

Also:

- the property that all objects - both modified and just created -
  are included into invalidation messages is required and can help to
  remove `next_serial` from `loadBefore` return in the future.
  This, in turn, can help to do 2x less SQL queries in loadBefore for
  NEO and RelStorage (and maybe other storages too):
  zopefoundation#318 (comment)

Current state of storages with respect to new requirements:

- ZEO: does not skip transactions, but includes only modified - not
  created - objects. This is fixed by zopefoundation/ZEO#160

- NEO: already implements the requirements in full

- RelStorage: already implements the requirements in full, if I
  understand correctly:

  https://github.com/zodb/relstorage/blob/3.1.2-1-gaf57d6c/src/relstorage/adapters/poller.py#L28-L145

While editing invalidate documentation, use the occasion to document
recently added property that invalidate(tid) is always called before
storage starts to report its lastTransaction() ≥ tid - see 4a6b028
(mvccadapter: check if the last TID changed without invalidation).

/cc @jimfulton, @jamadden, @jmuchemb, @vpelletier, @arnaud-fontaine, @gidzit, @klawlf82, @jwolf083
/reviewed-on zopefoundation#319
/reviewed-by @dataflake

[1] https://lab.nexedi.com/kirr/neo/blob/049cb9a0/go/zodb/zodbtools/watch.go
[2] https://lab.nexedi.com/kirr/neo/commit/e0d59f5d
[3] https://lab.nexedi.com/kirr/neo/commit/c41c2907

[4] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/wcfs.go#L94-182
[5] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/client/wcfs.h#L20-71
@navytux
Copy link
Contributor Author

navytux commented Jul 30, 2020

@dataflake, thanks again. Oh, those articles... I've amended the patch according to your suggestions. Thanks for the approval.

interdiff
--- a/COMMIT_MSG
+++ b/COMMIT_MSG
@@ -92,6 +92,8 @@ Date:   Thu Jul 16 11:23:45 2020 +0300
     (mvccadapter: check if the last TID changed without invalidation).
     
     /cc @jimfulton, @jamadden, @jmuchemb, @vpelletier, @arnaud-fontaine, @gidzit, @klawlf82, @jwolf083
+    /reviewed-on https://github.com/zopefoundation/ZODB/pull/319
+    /reviewed-by @dataflake
     
     [1] https://lab.nexedi.com/kirr/neo/blob/049cb9a0/go/zodb/zodbtools/watch.go
     [2] https://lab.nexedi.com/kirr/neo/commit/e0d59f5d

--- a/src/ZODB/interfaces.py
+++ b/src/ZODB/interfaces.py
@@ -282,8 +282,8 @@ class IStorageWrapper(Interface):
 
       A storage can notify it's wrapper of object invalidations that
       don't occur due to direct operations on the storage. This is used
-      by client part of non-IMVCCStorage storages like ZEO and NEO to
-      pass invalidation messages sent from a storage server.
+      by the client part of non-IMVCCStorage storages like ZEO and NEO
+      to pass invalidation messages sent from a storage server.
 
     - Record-reference extraction
 
@@ -1220,8 +1220,8 @@ def poll_invalidations():
         """Poll the storage for external changes.
 
         Returns either None, or a sequence of OIDs with the full set of
-        objects that have been created or changed since previous call to
-        poll_invalidations.
+        objects that have been created or changed since the previous
+        call to poll_invalidations.
 
         When a sequence is returned, the corresponding objects should be
         removed from the ZODB in-memory cache.

@navytux
Copy link
Contributor Author

navytux commented Jul 30, 2020

( CI is ok again, besides "AppVeyor: Failed to connect to github.com port 443:" once )

@dataflake
Copy link
Member

There's no need to explain every single CI failure, especially if it's obvious the tool is at fault. The reviewer's job is to look at those failures and recognize tool failures - or spurious test failures because the tests themselves are not reliable.

@navytux
Copy link
Contributor Author

navytux commented Jul 30, 2020

@dataflake, thanks. I'm used to situation where reviewers do not even take a look unless tests pass, that's why I point out that CI failures are unrelated all the time. In my understanding it is noone's "job" to look into my changes, that's why I try to present it in clean and easy-to-review state. It would be, of course, good to generally improve continuous integration and remove flakiness.

Using the occassion: I would indeed be grateful if someone can take a look at technical bits. Or is it all ok, given that ZEO is now fixed and so all the storages, that are in use today, confirm to the amended interface definition?

Thanks beforehand for feedback,
Kirill

@dataflake
Copy link
Member

dataflake commented Jul 30, 2020

Using the occassion: I would indeed be grateful if someone can take a look at technical bits. Or is it all ok, given that ZEO is now fixed and so all the storages, that are in use today, confirm to the amended interface definition?

My experience has been that unless you specifically assign reviewers no one will feel responsible and no review happens. It's not enough to put CC @ style mentions into the text. Assign people as reviewers and, if necessary, prod them by adding comments.

@navytux
Copy link
Contributor Author

navytux commented Jul 30, 2020

ok, thanks, assigning @jimfulton and @jamadden.

navytux added a commit to navytux/ZODB that referenced this pull request Jul 31, 2020
loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

1) returns data=None and serial of the removal, when loaded object was found to
   be deleted. loadBefore is returning only data=None in such case. This loadAt
   property allows to fix DemoStorage data corruption when whiteouts in overlay
   part were not previously correctly taken into account.

   zopefoundation#318

2) for regular data records, does not require storages to return next_serial,
   in addition to (data, serial). loadBefore requirement to return both
   serial and next_serial is constraining storages unnecessarily, and,
   while for FileStorage it is free to implement, for other storages it is
   not - for example for NEO and RelStorage, finding out next_serial, after
   looking up oid@at data record, costs one more SQL query:

   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

   next_serial is not only about execution overhead - it is semantically
   redundant to be there and can be removed from load return. The reason
   I say that next_serial can be removed is that in ZODB/py the only place,
   that I could find, where next_serial is used on client side is in client
   cache (e.g. in NEO client cache), and that cache can be remade to
   work without using that next_serial at all. In simple words whenever
   after

     loadAt(oid, at)  ->  (data, serial)

   query, the cache can remember data for oid in [serial, at] range.

   Next, when invalidation message from server is received, cache entries,
   that had at == client_head, are extended (at -> new_head) for oids that
   are not present in invalidation message, while for oids that are present
   in invalidation message no such extension is done. This allows to
   maintain cache in correct state, invalidate it when there is a need to
   invalidate, and not to throw away cache entries that should remain live.
   This of course requires ZODB server to include both modified and
   just-created objects into invalidation messages

     ( zopefoundation/ZEO#160 ,
       zopefoundation#319 ).

   Switching to loadAt should thus allow storages like NEO and, maybe,
   RelStorage, to do 2x less SQL queries on every object access.

   zopefoundation#318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError				object does not exist at all
   None					object was removed
   (data, serial, next_serial)		regular data record

used by loadBefore.

This patch:

- introduces new interface.
- introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
  or, if the storage does not implement loadAt interface, tries to mimic
  loadAt semantic via storage.loadBefore to possible extent + emits
  corresponding warning.
- converts MVCCAdapter to use loadAt instead of loadBefore.
- changes DemoStorage to use loadAt, and this way fixes above-mentioned
  data corruption issue; adds corresponding test; converts
  DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
- adds loadAt implementation to FileStorage and MappingStorage.
- adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
@@ -280,9 +281,9 @@ class IStorageWrapper(Interface):
- Out-of-band invalidation support

A storage can notify it's wrapper of object invalidations that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/it's/its/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I'm not changing this line at all and the "it's" come from old text. There are other usage of "it's" in ZODB/interfaces.py where I would put "its" by myself. I thought it is kind of US style or something similar and not delved into changing what is semantically not covered by the patch subject. If we are to change "it's" -> "its" I think it is better to do in a separate patch and for the whole file or whole codebase.

Thanks for the approval.

@navytux
Copy link
Contributor Author

navytux commented Aug 26, 2020

If there is no objection I will merge on Monday Aug 31.

navytux added a commit that referenced this pull request Aug 31, 2020
…s and not to skip transactions

Currently invalidate documentation is not clear whether it should be
called for every transaction and whether it should include full set of
objects created/modified by that transaction. Until now this was working
relatively well for the sole purpose of invalidating client ZEO cache,
because for that particular task it is relatively OK not to include just
created objects into invalidation messages, and even to completely skip
sending invalidation if transaction only create - not modify - objects.
Due to this fact the workings of the client cache was indifferent to the
ambiguity of the interface.

In 2016 skipping transactions with only created objects was reconsidered
as bug and fixed in ZEO5 because ZODB5 relies more heavily on MVCC
semantic and needs to be notified about every transaction committed to
storage to be able to properly update ZODB.Connection view:

zopefoundation/ZEO@02943acd#diff-52fb76aaf08a1643cdb8fdaf69e37802L889-R834
zopefoundation/ZEO@9613f09b

However just-created objects were not included into invalidation
messages until, hopefully, recently:

zopefoundation/ZEO#160

As ZODB is started to be used more widely in areas where it was not
traditionally used before, the ambiguity in invalidate interface and the
lack of guarantees - for any storage - to be notified with full set of
information, creates at least the following problems:

- a ZODB client (not necessarily native ZODB/py client) can maintain
  raw cache for the storage. If such client tries to load an oid at
  database view when that object did not existed yet, gets "no object"
  reply and stores that information into raw cache, to properly invalidate
  the cache it needs an invalidation message from ZODB server that
  *includes* created object.

- tools like `zodb watch` [1,2,3] don't work properly (give incorrect output)
  if not all objects modified/created by a transaction are included into
  invalidation messages.

- similarly to `zodb watch`, a monitoring tool, that would want to be
  notified of all created/modified objects, won't see full
  database-change picture, and so won't work properly without knowing
  which objects were created.

- wendelin.core 2 - which builds data from ZODB BTrees and data objects
  into virtual filesystem - needs to get invalidation messages with both
  modified and created objects to properly implement its own lazy
  invalidation and isolation protocol for file blocks in OS cache: when
  a block of file is accessed, all clients, that have this block mmaped,
  need to be notified and asked to remmap that block into particular
  revision of the file depending on a client's view of the filesystem and
  database [4,5].

  To compute to where a client needs to remmap the block, WCFS server
  (that in turn acts as ZODB client wrt ZEO/NEO server), needs to be able
  to see whether client's view of the filesystem is before object creation
  (and then ask that client to pin that block to hole), or after creation
  (and then ask the client to pin that block to corresponding revision).

  This computation needs ZODB server to send invalidation messages in
  full: with both modified and just created objects.

Also:

- the property that all objects - both modified and just created -
  are included into invalidation messages is required and can help to
  remove `next_serial` from `loadBefore` return in the future.
  This, in turn, can help to do 2x less SQL queries in loadBefore for
  NEO and RelStorage (and maybe other storages too):
  #318 (comment)

Current state of storages with respect to new requirements:

- ZEO: does not skip transactions, but includes only modified - not
  created - objects. This is fixed by zopefoundation/ZEO#160

- NEO: already implements the requirements in full

- RelStorage: already implements the requirements in full, if I
  understand correctly:

  https://github.com/zodb/relstorage/blob/3.1.2-1-gaf57d6c/src/relstorage/adapters/poller.py#L28-L145

While editing invalidate documentation, use the occasion to document
recently added property that invalidate(tid) is always called before
storage starts to report its lastTransaction() ≥ tid - see 4a6b028
(mvccadapter: check if the last TID changed without invalidation).

/cc @jimfulton, @jamadden, @jmuchemb, @vpelletier, @arnaud-fontaine, @gidzit, @klawlf82, @jwolf083
/reviewed-on #319
/reviewed-by @dataflake
/reviewed-by @jmuchemb

[1] https://lab.nexedi.com/kirr/neo/blob/049cb9a0/go/zodb/zodbtools/watch.go
[2] https://lab.nexedi.com/kirr/neo/commit/e0d59f5d
[3] https://lab.nexedi.com/kirr/neo/commit/c41c2907

[4] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/wcfs.go#L94-182
[5] https://lab.nexedi.com/kirr/wendelin.core/blob/1efb5876/wcfs/client/wcfs.h#L20-71
@navytux
Copy link
Contributor Author

navytux commented Aug 31, 2020

Patch applied to master as c1e0805.

@navytux navytux closed this Aug 31, 2020
@navytux navytux deleted the y/invalidate-full branch August 31, 2020 10:43
navytux added a commit to navytux/ZODB that referenced this pull request Mar 16, 2021
loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

1) returns data=None and serial of the removal, when loaded object was found to
   be deleted. loadBefore is returning only data=None in such case. This loadAt
   property allows to fix DemoStorage data corruption when whiteouts in overlay
   part were not previously correctly taken into account.

   zopefoundation#318

2) for regular data records, does not require storages to return next_serial,
   in addition to (data, serial). loadBefore requirement to return both
   serial and next_serial is constraining storages unnecessarily, and,
   while for FileStorage it is free to implement, for other storages it is
   not - for example for NEO and RelStorage, finding out next_serial, after
   looking up oid@at data record, costs one more SQL query:

   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

   next_serial is not only about execution overhead - it is semantically
   redundant to be there and can be removed from load return. The reason
   I say that next_serial can be removed is that in ZODB/py the only place,
   that I could find, where next_serial is used on client side is in client
   cache (e.g. in NEO client cache), and that cache can be remade to
   work without using that next_serial at all. In simple words whenever
   after

     loadAt(oid, at)  ->  (data, serial)

   query, the cache can remember data for oid in [serial, at] range.

   Next, when invalidation message from server is received, cache entries,
   that had at == client_head, are extended (at -> new_head) for oids that
   are not present in invalidation message, while for oids that are present
   in invalidation message no such extension is done. This allows to
   maintain cache in correct state, invalidate it when there is a need to
   invalidate, and not to throw away cache entries that should remain live.
   This of course requires ZODB server to include both modified and
   just-created objects into invalidation messages

     ( zopefoundation/ZEO#160 ,
       zopefoundation#319 ).

   Switching to loadAt should thus allow storages like NEO and, maybe,
   RelStorage, to do 2x less SQL queries on every object access.

   zopefoundation#318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError				object does not exist at all
   None					object was removed
   (data, serial, next_serial)		regular data record

used by loadBefore.

This patch:

- introduces new interface.
- introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
  or, if the storage does not implement loadAt interface, tries to mimic
  loadAt semantic via storage.loadBefore to possible extent + emits
  corresponding warning.
- converts MVCCAdapter to use loadAt instead of loadBefore.
- changes DemoStorage to use loadAt, and this way fixes above-mentioned
  data corruption issue; adds corresponding test; converts
  DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
- adds loadAt implementation to FileStorage and MappingStorage.
- adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants