Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't delay cache additions until after the transaction ends #3172

Merged
merged 5 commits into from Feb 16, 2024

Conversation

mwiencek
Copy link
Member

Problem

After searching for issues similar to MBS-13481 (where we load additional data onto an entity which is then entered into the cache, but where changes to that additional data do not invalidate the entity's cache entry), I believe that there are simply far too many uncovered cases which we'd have to handle.

While storing more data in the cache and having intelligent invalidation logic for all the linked data sounds desirable, it's too many changes to manage for now, so we need to restore the previous behavior.

Solution

This partially reverts 82c79f5 and 42be4bc.

I realized (later than I should have) that we don't need to delay cache additions until after the database transaction ends. We may have had to in a previous implementation I had, but no longer.

The post_txn_callbacks ran regardless of whether or not the commit succeeded, so an entity would always be cached in the state it was initially loaded in. The only difference now is when that occurs. There's no type of race condition that isn't already handled:

  • If the entity is deleted from the cache right after it's added, then there is no issue, just like before.

  • If the entity is added to the cache right after it's deleted, then there is still no issue, because our existing code to delete the added entry if there exists a recently_invalidated key for the same still applies.

Testing

The existing *EntityCache tests pass, and I've added a couple additional tests to ensure that neither release events nor release labels are entered into the cache with a release.

@mwiencek
Copy link
Member Author

I have a fix for the Data::Artist failures, but the Data::Series ones require further investigation.

The methods in question were copied from the `GIDRedirect` role in 8a841d6,
but I believe we can just use that role directly.
@mwiencek mwiencek force-pushed the mbs-13449-no-cache-add-delay branch 2 times, most recently from d43657e to 25bb8b7 Compare February 13, 2024 18:12
This will fix test failures in a subsequent commit.

It appears that we don't currently do this, but issues did not arise in the
tests due to cache additions being delayed (i.e., there being nothing to
invalidate).
`automatically_reorder` (invoked by `reorder_for_entities`) sorts series items
by "part of" relationship, which are loaded via our Perl data models, which in
turn will load entities on those relationships. Those entities may have been
updated in the current transaction and thus should be invalidated from the
cache before we attempt to sort the series items based on stale entity data.
(This will fix test failures in a subsequent commit.)
This partially reverts 82c79f5 and
42be4bc.

After searching for issues similar to MBS-13481 (where we load additional data
onto an entity which is then entered into the cache, but where changes to that
additional data do not invalidate the entity's cache entry), I believe that
there are simply far too many uncovered cases which we'd have to handle.

While storing more data in the cache and having intelligent invalidation logic
for all the linked data sounds desirable, it's too many changes to manage for
now, so we need to restore the previous behavior.

I realized (later than I should have) that we don't need to delay cache
additions until after the database transaction ends. We may have had to in a
previous implementation I had, but no longer.

The `post_txn_callbacks` ran regardless of whether or not the commit succeeded,
so an entity would always be cached in the state it was initially loaded in.
The only difference now is when that occurs. There's no type of race condition
that isn't already handled:

 * If the entity is deleted from the cache right after it's added, then there
   is no issue, just like before.

 * If the entity is added to the cache right after it's deleted, then there is
   still no issue, because our existing code to delete the added entry if there
   exists a `recently_invalidated` key for the same still applies.

The existing `*EntityCache` tests pass, and I've added a couple additional
tests to ensure that neither release events nor release labels are entered into
the cache with a release.
Copy link
Contributor

@yvanzo yvanzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the in-depth investigation of this cache issue even though it doesn’t come to the desired improvements. I agree that we cannot manage such changes at the moment.

Changes look good, commit messages are very explanatory, and I also quickly tested the website locally just in case.

Merging and deploying to both beta and production as requested on IRC. 🚢 🅱️ 🏭

@yvanzo yvanzo merged commit eb3fb78 into metabrainz:production Feb 16, 2024
3 checks passed
@mwiencek mwiencek deleted the mbs-13449-no-cache-add-delay branch February 16, 2024 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants