Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark IDC fetched records as read_only #274

Closed
dylanahsmith opened this issue Jun 8, 2016 · 10 comments
Closed

Mark IDC fetched records as read_only #274

dylanahsmith opened this issue Jun 8, 2016 · 10 comments

Comments

@dylanahsmith
Copy link
Contributor

cc @airhorns, @camilo

This was attempted before in #85, which gives a good explanation for the problem:

IdentityCache is a forever inconsistent datastore. It is a duplication of state across two different storage backends, and unless we somehow confirmed commit in both places before acknowledging, we can't guarantee that the two are in sync. The ruby process could die before it has a chance to issue the memcache DEL, the request could get lost if the connection is interrupted or any number of other things could prevent IDC from reflecting whats truly in the database.

This being the case, it is sketchy to write something you find in IDC back to the database, because you might be writing out of date data to the database. This is already the case with a MySQL insert in Rails cause you had to have read that data a finite amount of time ago, but that window is usually much smaller, and we have locking primitives and transactions to manage this. With IDC, we don't have that, and we don't have any way of knowing if the data is out of date without asking the database, which entirely defeats the purpose of IDC.

So, I suggest we return only readonly records from IDC. If you are gonna write a record, first fetch its values from the DB.

One thing I think we should do differently from the pull request, is that records returned from IDC methods when IdentityCache.should_cache? returns false should not be marked as read-only. The problem with marking those records as read-only is that it prevents us from sharing code to use for reading and writing, e.g. calculating some aggregation for a confirmation page which should be consistent with the calculation for the operation itself.

On the other hand, records fetched from the database on a cache miss when IdentityCache.should_cache? returns true should be marked as read-only. This way code that does the wrong thing will fail deterministically, rather than passing on a cache miss and failing on a cache hit.

We might also want to consider making this behaviour configurable, so that we don't get blocked on Shopify needed to be updated to work with this new behaviour, which was probably the main reason that the previous PR didn't get finished and merged. That way we can at least have identity cache do the right thing by default on the next release with breaking changes.

@airhorns
Copy link
Contributor

airhorns commented Jun 8, 2016

All excellent points, I agree! It's tempting to let records from the DB miss path be writable since they are technically up to date but I think determinism and sanity is sooo much more important. Good call.

@camilo
Copy link
Contributor

camilo commented Jun 9, 2016

I think @daniellaniyo can do this after she finishes with the unchain-ability of resutls

@eugeneius
Copy link
Contributor

One thing I think we should do differently from the pull request, is that records returned from IDC methods when IdentityCache.should_cache? returns false should not be marked as read-only.

In a test suite that uses transactions for isolation, IdentityCache.should_use_cache? is always false, as there's always at least one transaction open when your code is running. If we only mark records as readonly when IdentityCache.should_use_cache? is true, then code that updates a record fetched from IdentityCache could work fine in tests but blow up in production. 💥

@dylanahsmith
Copy link
Contributor Author

Thanks for pointing that out, since we have a IdentityCache.should_use_cache? monkey patch in Shopify which not only checks multiple active database connections for open transactions, but also ignores the transactional fixture transaction. Perhaps we should bring over that check into this repo so tests behave like production by default.

@eugeneius
Copy link
Contributor

That would be great! Ping me if you want me to test it first - we also use multiple databases in production.

@camilo
Copy link
Contributor

camilo commented Aug 31, 2016

@daniellaniyo I think we have a branch where @eugeneius can test now right?

@daniellaniyo
Copy link
Contributor

@camilo Yes! Actually the tests could be carried out on the current master branch and setting IdentityCache.fetch_read_only_records to true, or using the IdentityCache.with_fetch_read_only_records block around specific sections.

Also note that we haven't included the IdentityCache.should_use_cache? monkey patch from Shopify yet.

@eugeneius
Copy link
Contributor

Thanks @camilo / @daniellaniyo - but the IdentityCache.should_use_cache? monkey patch is actually the part I was most interested in testing!

I was working on a similar problem recently (preventing background jobs from being enqueued inside a transaction) - I've shared the approach I took there in #293.

@eugeneius
Copy link
Contributor

I just ran our application's test suite against master with my patch from #293 applied.

There were a bunch of failures, but it looks like they're all legitimate cases of fetching a record from the cache and updating it. I have some work to do before I can think about testing in production 😬

@camilo
Copy link
Contributor

camilo commented Mar 5, 2017

I will close this issue, since this is the default behaviour since a while ago.

@camilo camilo closed this as completed Mar 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants