-
Notifications
You must be signed in to change notification settings - Fork 95
Make TransactionMetaData in charge of (de)serializing extension data #207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I recently got more interested by #183 while doing 2 things in NEO: check that a FS DB was correctly imported (using zodbtools), and fight with bugs and code duplication about serialization of extension data. But like Jim, I prefer to serialize between Since |
I think I like this (though I'm not fond of the magic that the decorator uses to get the attribute names from the argspec)! |
I found important to minimize code duplication here, so that both properties remain always symmetric. I wish I could find something better. Despite code that is a little harder to read, there's still little some duplication at the places I don't want to wait too long before merging this one, as I have other changes I'd like to submit. I plan to merge mid-September, unless anyone wants more time to review. |
34f026c
to
f01451b
Compare
What is the status of this PR? What needs to be done to get it merged? |
Currently we exercise zodbdump and zodbcommit+zodbdump with non-empty extensions, which works if ZODB is patched for txn.extension_bytes support, but fails on pristine ZODB. Support for txn.extension_bytes cannot get into upstream ZODB for more than a year: zopefoundation/ZODB#183 zopefoundation/ZODB#207 and even if it somehow will make it, it will likely be only in ZODB5, while we still care to support ZODB4 and ZODB3. Skipping zodbdump / zodbcommit tests, if a ZODB does not have txn.extension_bytes support, would result in significant reduction of zodbtools test coverage, because practically that is the current situation with all upstream ZODB{3,4,5}. Dropping test coverage for non-empty extensions is neither a good option. For those reason, let's rework the tests and test both zodbdump and zodbcommit with two scenarios: 1. on a test database where transactions extensions are always empty. This should work on all ZODB irregardless of whether txn.extension_bytes patch is there or not. 2. on a test database where transactions extensions are present. This should work if ZODB has txn.extension_bytes support, but if not, we can mark this case as xfail, since the failure is expected. This way we make the testsuite pass irregardless of whether txn.extension_bytes support is there, and we don't abandon dump/commit testing coverage. /helped-by Jérome Perrin <jerome@nexedi.com>
a89b480
to
c8ce8a4
Compare
First, I rebased the original commit. But even with the small diff I suggested above, it would remain possible to mutate extensions after having kept a reference to them. Magic does not look a good idea if the result is not perfect, especially when there's no need. So I changed for something simpler, following tseaver comment partially. See |
I'll merge next monday. |
c8ce8a4
to
f094f77
Compare
IStorage implementations used to do this task themselves which leads to code duplication and sometimes bugs (one was fixed recently in NEO). Like for object serialization, this should be done by the upper layer (Connection). This commit also provides a way to get raw extensions data while iterating over transactions (this is actually the original purpose[2]). So far, extension data could only be retrieved unpickled, which caused several problems: - tools like `zodb dump` [1] cannot dump data exactly as stored on a storage. This makes database potentially not bit-to-bit identical to its original after restoring from such dump. - `zodb dump` output could be changing from run to run on the same database. This comes from the fact that e.g. python dictionaries are unordered and so when pickling a dict back to bytes the result could be not the same as original. ( this problem can be worked-around partly to work reliably for e.g. dict with str keys - by always emitting items in key sorted order, but it is hard to make it work reliably for arbitrary types ) Both issues make it hard to verify integrity of database at the lowest possible level after restoration, and make it hard to verify bit-to-bit compatibility with non-python ZODB implementations. For this, TransactionMetaData gets a new 'extension_bytes' attribute and and common usage becomes: * Application committing a transaction: - 'extension' is set with a dictionary - the storage gets the bytes via 'extension_bytes' * Iteration: - the storage passes bytes as 'extension' parameter of TransactionMetaData - the application can get extension data either as bytes ('extension_bytes') or deserialized ('extension'): in the former case, no deserialization happens and the returned value is exactly what was passed by the storage [1] https://lab.nexedi.com/nexedi/zodbtools [2] #183 Co-Authored-By: Kirill Smelkov <kirr@nexedi.com>
f094f77
to
2f8cc67
Compare
If test class does not indicate that underlying storage supports extension_bytes (see zopefoundation/ZODB#207), then ZODB.tests.IteratorStorage will verify that extension_bytes is computed from extension instead of taken from on-storage raw value: https://github.com/zopefoundation/ZODB/blob/5.6.0-48-gbc13ca74b/src/ZODB/tests/IteratorStorage.py#L93-L115 However when underlying storage supports extension_bytes, so is zlibstorage that wraps it. Many classes in zc.zlibstorage tests inherit from classes test classes for FileStorage https://github.com/zopefoundation/zc.zlibstorage/blob/1.2.0-4-g51cca6b/src/zc/zlibstorage/tests.py#L314-L338 Those classes will have .use_extension_bytes properly set because ZODB sets so for FileStorageTests: https://github.com/zopefoundation/ZODB/blob/5.6.0-48-gbc13ca74b/src/ZODB/tests/testFileStorage.py#L44-L62 However for tests that inherit from ZEO tests, .use_extension_bytes is not uniformly set for ZEO over FileStorage, and so with ZODB >= 5.6.0 zc.zlibstorage fails with Failure in test checkTransactionExtensionFromIterator (zc.zlibstorage.tests.FileStorageClientZlibZEOServerZlibTests) Traceback (most recent call last): File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/IteratorStorage.py", line 115, in checkTransactionExtensionFromIterator self.assertNotEqual(extension_bytes, txn.extension_bytes) File "/usr/lib/python2.7/unittest/case.py", line 522, in assertNotEqual raise self.failureException(msg) AssertionError: '\x80\x03}q\x01U\x03fooK\x01s.' == '\x80\x03}q\x01U\x03fooK\x01s.' and Failure in test checkTransactionExtensionFromIterator (zc.zlibstorage.tests.FileStorageClientZlibZEOZlibTests) Traceback (most recent call last): File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/IteratorStorage.py", line 115, in checkTransactionExtensionFromIterator self.assertNotEqual(extension_bytes, txn.extension_bytes) File "/usr/lib/python2.7/unittest/case.py", line 522, in assertNotEqual raise self.failureException(msg) AssertionError: '\x80\x03}q\x01U\x03fooK\x01s.' == '\x80\x03}q\x01U\x03fooK\x01s.' Fix it similarly to zopefoundation/ZEO@db24ed2f
If test class does not indicate that underlying storage supports extension_bytes (see zopefoundation/ZODB#207), then ZODB.tests.IteratorStorage will verify that extension_bytes is computed from extension instead of taken from on-storage raw value: https://github.com/zopefoundation/ZODB/blob/5.6.0-48-gbc13ca74b/src/ZODB/tests/IteratorStorage.py#L93-L115 However when underlying storage supports extension_bytes, so is zlibstorage that wraps it. Many classes in zc.zlibstorage tests inherit from classes test classes for FileStorage https://github.com/zopefoundation/zc.zlibstorage/blob/1.2.0-4-g51cca6b/src/zc/zlibstorage/tests.py#L314-L338 Those classes will have .use_extension_bytes properly set because ZODB sets so for FileStorageTests: https://github.com/zopefoundation/ZODB/blob/5.6.0-48-gbc13ca74b/src/ZODB/tests/testFileStorage.py#L44-L62 However for tests that inherit from ZEO tests, .use_extension_bytes is not uniformly set for ZEO over FileStorage, and so with ZODB >= 5.6.0 zc.zlibstorage fails with Failure in test checkTransactionExtensionFromIterator (zc.zlibstorage.tests.FileStorageClientZlibZEOServerZlibTests) Traceback (most recent call last): File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/IteratorStorage.py", line 115, in checkTransactionExtensionFromIterator self.assertNotEqual(extension_bytes, txn.extension_bytes) File "/usr/lib/python2.7/unittest/case.py", line 522, in assertNotEqual raise self.failureException(msg) AssertionError: '\x80\x03}q\x01U\x03fooK\x01s.' == '\x80\x03}q\x01U\x03fooK\x01s.' and Failure in test checkTransactionExtensionFromIterator (zc.zlibstorage.tests.FileStorageClientZlibZEOZlibTests) Traceback (most recent call last): File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/IteratorStorage.py", line 115, in checkTransactionExtensionFromIterator self.assertNotEqual(extension_bytes, txn.extension_bytes) File "/usr/lib/python2.7/unittest/case.py", line 522, in assertNotEqual raise self.failureException(msg) AssertionError: '\x80\x03}q\x01U\x03fooK\x01s.' == '\x80\x03}q\x01U\x03fooK\x01s.' Fix it similarly to zopefoundation/ZEO@db24ed2f
IStorage implementations used to do this task themselves which leads to code
duplication and sometimes bugs (one was fixed recently in NEO). Like for object
serialization, this should be done by the upper layer (Connection).
This commit also provides a way to get raw extensions data while iterating
over transactions (this is actually the original purpose[2]). So far, extension
data could only be retrieved unpickled, which caused several problems:
tools like
zodb dump
[1] cannot dump data exactly as stored on astorage. This makes database potentially not bit-to-bit identical to
its original after restoring from such dump.
zodb dump
output could be changing from run to run on the samedatabase. This comes from the fact that e.g. python dictionaries are
unordered and so when pickling a dict back to bytes the result could
be not the same as original.
( this problem can be worked-around partly to work reliably for e.g.
dict with str keys - by always emitting items in key sorted order,
but it is hard to make it work reliably for arbitrary types )
Both issues make it hard to verify integrity of database at the lowest
possible level after restoration, and make it hard to verify bit-to-bit
compatibility with non-python ZODB implementations.
For this, TransactionMetaData gets a new 'extension_bytes' attribute and
and common usage becomes:
Application committing a transaction:
Iteration:
or deserialized ('extension'): in the former case, no deserialization
happens and the returned value is exactly what was passed by the storage
[1] https://lab.nexedi.com/nexedi/zodbtools
[2] #183
Co-Authored-By: Kirill Smelkov kirr@nexedi.com