-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DemoStorage: Add support for deleteObject (IExternalGC) #341
base: master
Are you sure you want to change the base?
Conversation
loadAt is new optional storage interface that is intended to replace loadBefore with more clean and uniform semantic. Compared to loadBefore, loadAt: 1) returns data=None and serial of the removal, when loaded object was found to be deleted. loadBefore is returning only data=None in such case. This loadAt property allows to fix DemoStorage data corruption when whiteouts in overlay part were not previously correctly taken into account. zopefoundation#318 2) for regular data records, does not require storages to return next_serial, in addition to (data, serial). loadBefore requirement to return both serial and next_serial is constraining storages unnecessarily, and, while for FileStorage it is free to implement, for other storages it is not - for example for NEO and RelStorage, finding out next_serial, after looking up oid@at data record, costs one more SQL query: https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508 https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482 https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264 https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199 next_serial is not only about execution overhead - it is semantically redundant to be there and can be removed from load return. The reason I say that next_serial can be removed is that in ZODB/py the only place, that I could find, where next_serial is used on client side is in client cache (e.g. in NEO client cache), and that cache can be remade to work without using that next_serial at all. In simple words whenever after loadAt(oid, at) -> (data, serial) query, the cache can remember data for oid in [serial, at] range. Next, when invalidation message from server is received, cache entries, that had at == client_head, are extended (at -> new_head) for oids that are not present in invalidation message, while for oids that are present in invalidation message no such extension is done. This allows to maintain cache in correct state, invalidate it when there is a need to invalidate, and not to throw away cache entries that should remain live. This of course requires ZODB server to include both modified and just-created objects into invalidation messages ( zopefoundation/ZEO#160 , zopefoundation#319 ). Switching to loadAt should thus allow storages like NEO and, maybe, RelStorage, to do 2x less SQL queries on every object access. zopefoundation#318 (comment) In other words loadAt unifies return signature to always be (data, serial) instead of POSKeyError object does not exist at all None object was removed (data, serial, next_serial) regular data record used by loadBefore. This patch: - introduces new interface. - introduces ZODB.utils.loadAt helper, that uses either storage.loadAt, or, if the storage does not implement loadAt interface, tries to mimic loadAt semantic via storage.loadBefore to possible extent + emits corresponding warning. - converts MVCCAdapter to use loadAt instead of loadBefore. - changes DemoStorage to use loadAt, and this way fixes above-mentioned data corruption issue; adds corresponding test; converts DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt. - adds loadAt implementation to FileStorage and MappingStorage. - adapts other tests/code correspondingly. /cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
So that it is generally possible to zodbrestore[1,2] zodbdumped[3] database/transactions into DemoStorage. Because without this patch, when dump contains deletion records, e.g. txn 0285cbacc06d3a4c " " user "" description "" extension "" obj 0000000000000007 delete it fails like this: Traceback (most recent call last): ... File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main zodbrestore(stor, asbinstream(sys.stdin), _) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore zodbcommit(stor, at, txn) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 131, in zodbcommit _() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 118, in _ stor.deleteObject(obj.oid, current_serial(obj.oid), txn) AttributeError: 'DemoStorage' object has no attribute 'deleteObject' demo_test.go:105: demo:(file:///tmp/demo470143633/δ0285cbac258bf266/base.fs)/(file:///tmp/demo470143633/δ0285cbac258bf266/δ.fs): zpyrestore: exit status 1 To be able to implement DemoStorage.deleteObject, we have to fix FileStorage.deleteObject first: currently FileStorage.deleteObject raises POSKeyError if an object is not present in the database, but for situation where - demo=base+δ, - object is present in base, and - object is not present in δ it creates a problem because there is no way to add whiteout record for the object into δ. This change should be generally good; let me quote https://lab.nexedi.com/kirr/neo/commit/543041a3 for why: ---- 8< ---- Even though currently it is not possible to create such data record via FileStorage(py).deleteObject (because it raises POSKeyError if there is no previous object revision), being able to use such data records is semantically useful in overlayed DemoStorage settings, where δ part marks an object that exists only in base with delete record whiteout. It is also generally meaningful to be able to create "delete" record even if object was not previously existing: "deleteObject" is actually similar to "store" (and so should be better named as "storeDelete"). If one wants to store deletion, there should not be a reason to reject it, because deleteObject already has seatbelt in the form of oldserial, and if the user calls deleteObject(oid, oldserial=z64), he/she is already telling that "I know that this object does not exist in this storage (oldserial=z64), but still please create a deletion record for it". Once again this is useful in overlayed DemoStorage settings described above. For the reference, such whiteout deletion records pass ZODB/scripts/fstest just fine. ---- 8< ---- DemoStorage.deleteObject implementation is straightforward, but builds on loadAt[4] as precondition. /cc @jimfulton, @jamadden, @perrinjerome [1] https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbrestore.py [2] https://lab.nexedi.com/nexedi/zodbtools/merge_requests/19 [3] https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbdump.py [4] zopefoundation#323
See also https://lab.nexedi.com/kirr/neo/commit/4fb6bd0a?expanded=1 for the reference. |
/cc @d-maurer |
I was running ZODB/go tests with forgetting to switch my ZODB/py checkout from master to my custom branch, and once again hit the problem of So is there anyone with feedback maybe? Once again, hereby patch builds on top of #323 which, by the way, fixes data corruption problem of DemoStorage (#318). |
@d-maurer suggests[1]: The ZODB logic relating to historical data (including MVCC) was largely centered around before. You have changed this to at - requiring wide spread modifications. I would much prefer to keep the before centered approach... (zopefoundation#323 (review)) So let's change "at"-based logic to "before"-based logic and rename the new method from loadAt to loadBeforeEx.
…rface not provided" Suggested by @d-maurer: zopefoundation#323 (comment)
@d-maurer suggests to keep loadBefore without deprecation (zopefoundation#323 (review)). -> Don't emit warnings about deprecating loadBefore. -> Keep the deprecation text in loadBefore interface, since loadBeforeEx should practically provide wider functionality without putting unnecessary constraint on storage implementations. In other words loadBefore deprecation is still there, but less aggressively advertised with the idea to make transition for outside-of-ZODB code to loadBeforeEx more smooth and with a bit more steps (we might want to reinstate the deprecation warnings at a later time).
…s "interface not provided" loadBeforeEx uses the same docstring as loadBefore as @d-maurer suggests: zopefoundation#323 (comment)
The changelog entry uses loadBeforeEx, but we are likely to change this name during zopefoundation#323 review.
@d-maurer says (zopefoundation#323 (comment)): The changes around `loadBeforeEx` have much more impact than the `DemoStorage` fix. -------- kirr: adjusted the text a bit: "Introduces" -> "Introduce"; add "interface". @d-maurer, I hope it is ok.
Dieter Maurer notes that loadBefore cannot be deprecated yet because ZEO essentially depends on the `end_tid` information returned by loadBefore to update its cache: zopefoundation#323 (comment) And to remove this dependency it would require to rework ZODB caching layer: zopefoundation#323 (comment) So we cannot deprecate loadBefore until this rework is implemented first. -> Remove general loadBefore deprecation, and emit loadBefore vs loadBeforeEx warning only when actually hitting a "deletion" record, because only that case is known to lead to data corruption.
Is there anyone able to review this PR? |
to resolve trivial conflict on CHANGES.rst * origin/master: (22 commits) Fix TypeError for fsoids (zopefoundation#351) Fix deprecation warnings occurring on Python 3.10. fix more PY3 incompatibilities in `fsstats` fix Python 3 incompatibility for `fsstats` add `fsdump/fsstats` test fsdump/fsstats improvements - add coverage combine step - first cut moving tests from Travis CI to GitHub Actions - ignore virtualenv artifacts [ci skip] tests: Run race-related tests with high frequency of switches between threads tests: Add test for load vs external invalidation race tests: Add test for open vs invalidation race fixup! doc/requirements: Require pygments < 2.6 on py2 doc/requirements: Require pygments < 2.6 on py2 fixup! buildout: Fix Sphinx install on Python2 buildout: Fix Sphinx install on Python2 Update README.rst Security fix documentation dependencies (zopefoundation#342) changes: Correct link to UnboundLocalError fsoids.py fix fsrefs: Optimize IO (take 2) (zopefoundation#340) ...
Resolve many conflicts after zopefoundation#357. * master: Let the year float. Configuring for pure-python Specify a PyPy2 version. Lint the code. Configuring for pure-python
Sorry, but some of flake8 complaints are really stupid...
* y/loadAt.8: (35 commits) Make lint happy Let the year float. Configuring for pure-python Specify a PyPy2 version. Lint the code. Configuring for pure-python Fix TypeError for fsoids (zopefoundation#351) Fix deprecation warnings occurring on Python 3.10. fix more PY3 incompatibilities in `fsstats` fix Python 3 incompatibility for `fsstats` add `fsdump/fsstats` test fsdump/fsstats improvements Undeprecate loadBefore fixup! changes: Add draft entry for loadAt/loadBeforeEx/DemoStorage fix changes: Add draft entry for loadAt/loadBeforeEx/DemoStorage fix fixup! Handle NotImplementedError raised by loadBefore/loadBeforeEx as "interface not provided" *: Don't emit warnings on loadBefore Handle NotImplementedError raised by loadBefore/loadBeforeEx as "interface not provided" loadAt -> loadBeforeEx - add coverage combine step ...
loadAt -> loadBeforeEx since it was renamed so on zopefoundation@805bf36f .
Make flake8 happy src/ZODB/DemoStorage.py:388:80: E501 line too long (89 > 79 characters) src/ZODB/DemoStorage.py:391:80: E501 line too long (87 > 79 characters) src/ZODB/DemoStorage.py:393:80: E501 line too long (83 > 79 characters) src/ZODB/DemoStorage.py:395:80: E501 line too long (87 > 79 characters) src/ZODB/DemoStorage.py:397:80: E501 line too long (100 > 79 characters) src/ZODB/DemoStorage.py:398:80: E501 line too long (86 > 79 characters) src/ZODB/DemoStorage.py:408:5: E303 too many blank lines (2) src/ZODB/tests/testDemoStorage.py:397:9: F401 'ZODB.FileStorage.FileStorage' imported but unused src/ZODB/tests/testDemoStorage.py:399:80: E501 line too long (101 > 79 characters)
Refreshed the code by syncing with #323 to resolve conflicts. As explained in the main message, @d-maurer, please note how loadAt -> loadBeforeEx renaming made the code somewhat less clear: c3820ff. There, we are obtaining To me it is a common situation - when something determines revision as of which database state it wants to observe data snapshot, and then accesses/loads data at that particular revision. Frequently determining revision is done via I don't insist on this strongly, but maybe it would make sense to reconsider loadAt -> loadBeforeEx rename and rename This will unify loading semantic with Thanks beforehand for feedback, |
And I would appreciate if someone would look at the main patch in the PR: 49f5959. |
@d-maurer Do you think you are able to review this PR? Or is there anyone else with deep enough ZODB knowledge? |
Michael Howitz wrote at 2022-3-17 00:15 -0700:
***@***.*** Do you think you are able to review this PR? Or is there anyone else with deep enough ZODB knowledge?
This PR depends on the `loadAt` PR (--> #323).
I have approved this PR (after long discussions).
But, @jamadden might have had some concerns
(--> "#323 (comment)"),
in particular he was unsure about the impact on `RelStorage`.
PR #323 introduces a new storage method (`loadAt` or `loadBeforeEx`);
@navitux has implemented it for the storages directly provided
by `ZODB`; however other storages lack the new method and the
provided fallback might not be efficient.
If #323 is accepted, this PR is trivial and poses no further risk.
|
* master: - vb [ci skip] - force recognizing ellipsis - try to fix changed traceback representation - select more specific pypy versions - prepare release 5.7.0 - revert 5b8e2dc
* y/loadAt.9: - vb [ci skip] - force recognizing ellipsis - try to fix changed traceback representation - select more specific pypy versions - prepare release 5.7.0 - revert 5b8e2dc
@icemac, @d-maurer, thanks for heads up and for feedback. This patch is indeed small, hopefully safe, change, but it requires #323 to work. Regarding potential impact of #323 on RelStorage - I believe I've addressed that with #323 (comment) and zodb/relstorage#485. However taking into account that @jamadden might be having a hard time, it is understandable that there is no feedback on this. Another point is that personally, even though #323 is approved (@d-maurer thanks for the approval and for the discussion), I'm not so comfortable to merge it in loadBeforeEx form compared to merging it as loadAt. The reason here is that until we establish a public interface the freedom to change naming is still there, but once the interface is established it has to be kept in stone. In my view loadAt is more descriptive and "speaking" name compared to loadBeforeEx. It is also more handy to use loadAt compared to loadBefore(Ex) because one does not need +1 transactionID obtained via e.g. lastTransaction. loadAt can also make loadSerial obsolete today, without any blockers. Dieter, would you please try to review your decision to prefer loadBefore-style for new function compared to loadAt? I understand that you still might be keeping on your preferences, but maybe taking another look afresh might be helpful. Because you originally asked to change loadAt->loadBefore* in the hope to simplify the code and keep diff of changes minimal, but in the end it did not work like that. And also, like I said above, with loadAt it is straightforward to e.g. directly use tid, that lastTransaction returns, while with loadBefore that tid has to be converted to Let me qoute #341 (comment) to explain the latter point: ---- 8< ---- There, we are obtaining To me it is a common situation - when something determines revision as of which database state it wants to observe data snapshot, and then accesses/loads data at that particular revision. Frequently determining revision is done via I don't insist on this strongly, but maybe it would make sense to reconsider loadAt -> loadBeforeEx rename and rename This will unify loading semantic with So @d-maurer, would you please provide some feedback on this? I understand that you might reject the idea to return to loadAt naming, and probably I will accept and live with that, but it would be a pity not to use a more fitting version. That's why I'm asking for you to please review this once again from scratch. Thanks beforehand for feedback, |
( synced to updated #323 to resolve conflict on CHANGES.rst ) |
Kirill Smelkov wrote at 2022-3-17 09:48 -0700:
...
Dieter, would you please try to review your decision to prefer loadBefore-style for new function compared to loadAt?
I have not advocated for `loadBeforeEx` but to change (incompatibly)
the behavior of `loadBefore` for delete records.
The `loadBeforeEx` has been a compromise, you proposed; I dislike the name
(likely as much as you do).
`loadAt` is a good name -- but ZODB internals use the "before" semantics
and forward and backwards adaptations are necessary to switch
to `loadAt` -- or a rewrite of those ZODB internals.
I am also author and maintainer of `dm.historical` which uses
the "before" semantics (because the ZODB does).
It would not be difficult to switch
to "at" semantics but it is nasty to have code depending
on the ZODB version.
I understand that you still might be keeping on your preferences, but maybe taking another look afresh might be helpful.
Personally, I do not see the need to change anything.
Your change allows to fix a bug in `DemoStorage` (and other
stacked storages) -- but this bug affects only "delete" records
(which I never have used). I have also never used
the stacking feature of `DemoStorage`.
Thus, form my point of view, this is a bug which does not require fixing --
at least not at the cost of a new load method.
Unlike you, I would not mind that `loadBefore` would change
its behavior for "delete" records in an incompatible way.
|
@d-maurer, thanks for feedback.
To fix the data corruption bug is not the only motivation for me. As said in the description of #323 fixing data corruption for DemoStorage is only the first point. The second point is to save work for storages beside FileStorage to save time on every load by not unneedingly retrieving next_serial from disk/database. This change does affect regular - not only deletion - records. And that's why a new method is intoduced to avoid backward compatibility breakage to loadBefore. Regarding ZODB internals - all that is needed to switch them from loadBeforeEx back to loadAt is to semantically revert the following patch: 805bf36f4442. I can do that in practically no time, and we already know that after that amendment, the whole amount of changes to ZODB internals of whole #323 will stay almost the same. So in my view ZODB internals are not keeping us to use "before" semantic. After reading your message about dm.historical I was initially going to agree that if dm.historical uses "before" semantic via loadBefore, then it makes sense not to break that and keep #323 in current form as is. But I looked at dm.historical sources and ... as I see it, nothing there uses loadBefore at all. If I understood it correctly, dm.historical is published only in the form of tarballs on PyPI, So I will refer to relevant parts of dm.historical source code by quoting them here from what I obtained via dm.historical-2.0.4.tar.gz: The first place that depends on "before" semantic is Connection class: class Connection(Connection):
def __init__(self, baseConnection, timeOrSerial, before=False):
'''creates a (historical) connection based on *baseConnection* that delivers state at or before *timeOrSerial*.
*baseConnection* must be an (open) ZODB connection. Its 'db' is used
as our 'db' as well.
*timeOrSerial* is either a time ('DateTime' or float/int) or a serial/transactionid ('string').
''' but it inherits from ZODB.Connection.Connection and passes timeOrSerial to that ZODB.Connection constructor as ts = None
if isinstance(timeOrSerial, bytes): ts = TimeStamp(timeOrSerial)
if isinstance(timeOrSerial, DateTime): timeOrSerial = timeOrSerial.timeTime()
if isinstance(timeOrSerial, (int, float)):
ts = TimeStamp(*(gmtime(timeOrSerial)[:5] + (timeOrSerial%60,)))
db = baseConnection.db()
c = super(Connection, self).__init__
if "before" in getargspec(c).args:
# this is a modern ZODB which directly supports historical connections
if not before:
# we must slightly increment "ts" as we need the state
# `at or before` while the connection gives us `before`.
ts = ts.laterThan(ts)
c(db, before=to_bytes(ts))
self.open() this will continue to work irregardless of whether ZODB internally uses loadBefore, loadBeforeEx or loadAt, because after any form of #323 ZODB.Connection stays fully backward compatible! The other place in dm.historical is This means that whatever we do - merge #323 with loadBeforeEx, or merge #323 with loadAt - dm.historical will continue to work without any change to its sources. Does all this maybe change anything in your view, please? Kirill |
Kirill Smelkov wrote at 2022-3-17 11:35 -0700:
***@***.***, thanks for feedback.
> Personally, I do not see the need to change anything.
> ...
To fix the data corruption bug is not the only motivation for me. As said in the description of #323 fixing data corruption for DemoStorage is only the first point. The second point is to save work for storages beside FileStorage to save time on every load by not _unneedingly_ retrieving next_serial from disk/database. This change does affect regular - not only deletion - records. And that's why a new method is intoduced to avoid backward compatibility breakage to loadBefore.
I have not yet seen a statement about your proposal
from the authors of those storages -- apart from some concerns
regarding `relstorage` (maybe meanwhile resolved).
We have discussed that ZEO uses the additional information
(you want to drop) for its cache consistency. You might have
a solution concept - but it is not yet implemented.
Regarding ZODB internals - all that is needed to switch them from loadBeforeEx back to loadAt is to semantically revert the following patch: 805bf36f4442. I can do that in practically no time, and we already know that after that amendment, the whole amount of changes to ZODB internals of whole #323 will stay almost the same. So in my view ZODB _internals_ are not keeping us to use "before" semantic.
I remember that you had (of course) implemented the
"before -> at" semantic adaptations. They have been the reason
for my initial objection -- the ZODB should either
consistently use "at" or "before" semantics, not switch between
the two. I advocated to stay with "before" semantics.
...
But I looked at dm.historical sources and ... as I see it, nothing there uses loadBefore at all.
...
This means that whatever we do - merge #323 with loadBeforeEx, or merge #323 with loadAt - dm.historical will continue to work without any change to its sources.
You are right: like most of my packages, `dm.historical` is very stable
and it was a long time ago I actively worked on it. I was wrong
to fear that your proposal could break something.
Does all this maybe change anything in your view, please?
No, it does not.
|
@d-maurer, thanks for feedback.
Well, when I'm speaking about NEO, I'm meaning to be the author of NEO/go version myself https://navytux.spb.ru/~kirr/neo I also contributed to, and have more-or-less thorough understanding of, NEO/py implementation.
This is valid point. Without rework of ZEO cache loadBefore cannot be deprecated and dropped. Will it be accepted if I implement what I had described in #323 (comment) and prove that it works well for ZEO and in general?
I agree that consistency is a good goal. But ZODB is already inconsistent: loadBefore works on tid as "before", while e.g. loadSerial accepts tid as "at". And e.g. ZODB.DB - its public interface - accepts both "at" and "before" in its We cannot change the public interface due to backward compatibility. But we can make the internals consistent. Currently they are already inconsistent in between loadBefore and loadSerial. With loadAt and loadSerial everything will be consistently using "at" semantic. Of course, if loadBefore gets deprecated, which, once again, depends on ZEO cache rework.
Thanks for confirming.
I'm sorry to hear that, but I understand your reasoning. It is fair to say that loadAt is not generally very useful until loadBefore cannot be deprecated for real because ZEO cache depends on it. Once again, I'm willing to fix ZEO cache not to depend on Kirill |
Kirill Smelkov wrote at 2022-3-18 02:06 -0700:
...
> We have discussed that ZEO uses the additional information
> (you want to drop) for its cache consistency. You might have
> a solution concept - but it is not yet implemented.
This is valid point. Without rework of ZEO cache loadBefore cannot be deprecated and dropped.
Will it be accepted if I implement what I had described in #323 (comment) and prove that it works well for ZEO and in general?
I cannot speak about whether it is accepted or not -- only about
whether I might approve it or not.
The general concept may work and I might approve an implementation.
However, I need to see the concrete implementation (and what it entails)
before my approval.
...
I agree that consistency is a good goal. But ZODB is already inconsistent: loadBefore works on tid as "before", while e.g. loadSerial accepts tid as "at". And e.g. ZODB.DB - its public interface - accepts both "at" and "before" in its `.open`.
We cannot change the public interface due to backward compatibility. But we can make the internals consistent. Currently they are already inconsistent in between loadBefore and loadSerial.
You are aware that your "at" (which acutally means "<= at") is
different from the "at" of `loadSerical` (which means "== at").
Otherwise, we would not need a `loadAt` but could use `loadSerial`).
I recognize that the ZODB uses "before" semantics for historical reasons:
formerly, the first invalidation serial after a connection open
defined the snapshot boundary amd the connection read the state
*before* that boundary. Now, the MVCC adapter maintains
information about the last transaction and the connection
loads "at" this transaction boundary.
For historical reasons this "at" is transformed into a "before".
With your `loadAt`,
the "before" must later be retransformed into "at" semantics.
I much dislike this double transformation -- especially, because
the `TimeStamp` does not provide an API for the back transformation.
If we should switch to `Storage.loadAt`, we should rework the
ZODB code to consistently use "at" semantics.
...
Once again, I'm willing to fix ZEO cache not to depend on `next_serial` sooner or later.
`ZEO` and `FileStorage` are the only storages I am currently really interested
in. Thus, my personal decisions depend on good support for them.
|
@d-maurer, thanks for feedback. It is all clear and reasonable. I understand and agree that FileStorage and ZEO should remain with good support, as those storages are historically the most widely used. And that it requires to have the concrete implementation to be able to review it and its consequences. So let's wait for me to do this next step. I do not promise it will be soon, but, hopefully, I should be able to do it. P.S. and you are right - loadSerial has |
So that it is generally possible to zodbrestore[1,2] zodbdumped[3]
database/transactions into DemoStorage. Because without this patch, when dump
contains deletion records, e.g.
it fails like this:
To be able to implement DemoStorage.deleteObject, we have to fix
FileStorage.deleteObject first: currently FileStorage.deleteObject raises
POSKeyError if an object is not present in the database, but for situation
where
it creates a problem because there is no way to add whiteout record for the
object into δ.
This change should be generally good; let me quote
https://lab.nexedi.com/kirr/neo/commit/543041a3 for why:
---- 8< ----
Even though currently it is not possible to create such data record via
FileStorage(py).deleteObject (because it raises POSKeyError if there is
no previous object revision), being able to use such data records is
semantically useful in overlayed DemoStorage settings, where δ part
marks an object that exists only in base with delete record whiteout.
It is also generally meaningful to be able to create "delete" record
even if object was not previously existing: "deleteObject" is actually
similar to "store" (and so should be better named as "storeDelete"). If
one wants to store deletion, there should not be a reason to reject it,
because deleteObject already has seatbelt in the form of oldserial, and
if the user calls deleteObject(oid, oldserial=z64), he/she is already
telling that "I know that this object does not exist in this storage
(oldserial=z64), but still please create a deletion record for it". Once
again this is useful in overlayed DemoStorage settings described above.
For the reference, such whiteout deletion records pass ZODB/scripts/fstest just fine.
---- 8< ----
DemoStorage.deleteObject implementation is straightforward, but builds on
loadAt[4] as precondition.
/cc @jimfulton, @jamadden, @perrinjerome
[1] https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbrestore.py
[2] https://lab.nexedi.com/nexedi/zodbtools/merge_requests/19
[3] https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbdump.py
[4] #323