New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-43671: Implement atomic collection prepend #989
Conversation
This transaction check is done a few places, and I need it for a sanity check outside Database in an upcoming commit.
Previously, setCollectionChain would sometimes throw unique index violation exceptions when there were concurrent calls to setCollectionChain. It now uses the same locking as prepend_collection_chain, so last write wins instead of throwing an exception. This also prevents potentially surprising interactions between prepend_collection_chain and setCollectionChain.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #989 +/- ##
==========================================
+ Coverage 88.94% 88.97% +0.03%
==========================================
Files 329 329
Lines 42476 42631 +155
Branches 8717 8743 +26
==========================================
+ Hits 37781 37933 +152
- Misses 3443 3445 +2
- Partials 1252 1253 +1 ☔ View full report in Codecov by Sentry. |
This gives better diagnostics if the test fails, because exceptions within the threads will be raised from the main thread when result() is called.
One issue this doesn't yet address is duplicates in the collection chain. Not sure what would make the most sense:
I'd say probably #1 unless you have other opinions? It's idempotent and makes sense in the context of find-first. |
I think that common use case in AP is when multiple clients do prepend concurrently. So 1 is probably the only useful option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Two minor comments.
starting_position: int, | ||
child_keys: Iterable[K], | ||
) -> None: | ||
position = itertools.count(starting_position) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using enumerate
inside the list comprehension plus an offset seems a little less error-prone. I don't know why that wasn't the case before.
|
||
def prepend_collection_chain( | ||
self, parent_collection_name: str, child_collection_names: list[str] | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to insert some kind of cache-invalidation or cache check on self._caching_context
here. I think
if self._caching_context.is_enabled:
raise <something>
would be fine, as I'm not sure we actually need to support modifying a collection while a caching context is active. Same for update_chain
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does handle updating the cache at the end already. But I like forbidding it entirely, that will let me remove a decent amount of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that setCollectionChain
is called inside a caching context from Butler.import_
. Whether that caching context is useful or not I can't really say.
So I'll do this in prepend_collection_chain but leave update_chain alone for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think that making caching context a purely read-only thing would let us remove a decent amount of cache-maintenance code throughout, so might be worth exploring in a later ticket.
I'm updating this to handle duplicates according to strategy 1 and will make the other changes you suggested. |
Remove children from collection chains before prepending them, to ensure that there is only one copy of a child in the chain.
The expected use cases caching context do not require modifying collection chains, so avoid carrying around cache maintenance code that might never be used.
597ea62
to
ce81f8e
Compare
Added an atomic chained collection prepend method to Butler. It works by taking a row lock on the parent collection's row in the collections tables, which acts as a mutex for the modification of the collection_chain table. The existing setCollectionChain method now also uses this lock, so that it can interact safely with the new operation.
Checklist
doc/changes
configs/old_dimensions