Add expire methods by key values #495
> Using the `cache_attribute` method can be complicated on the case that you expire it when you utilize an `upsert_all` or `update_all` method on ActiveRecord as they do not return an instance of ActiveRecord.
It isn't complicated because you don't have an ActiveRecord instance. The API is coupled to Active Record instances and Active Record callbacks because it provides the required data to both invalidate the old and new key if the cache index key columns change.
Even if everyone you tagged on the PR understands the problem, it is still important to make clear the scope of the problem that the PR actually addresses. That context can also help anyone who comes across the PR later (e.g. through git archeology or issue/PR searching).
`upsert_all` and `update_all` are really just doing underlying database statements that don't load the rows. That doesn't mean we couldn't load the rows before those statements to get the Active Record objects. Was this considered as a way to get the old values?
Does your specific problem have a more limited scope, such as knowing that the cache index keys won't be modified?
> We decided to add methods for this use case using `expire_by_key_value` and adding dynamic methods for `expire_<attribute>_by_<key>` to be able to expire the cache.
This doesn't really explain the solution. What are the thoughts behind that decision? Why did you add more than one way to expire these caches? How do you intend for these methods to be used to actually solve the problem?
The meta-programming defined methods to invalidate individual cache indexes seem like they could end up with code implicitly coupled to the set of cache indexes, such that a new one being added could result in it not being invalidated by existing code doing these manual cache invalidations. I like how the approach taken with `expire_by_key_value` seems like it helps avoid this problem, although I still don't like how it hides the old key problem.
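The coupling concern can be sketched in plain Ruby. The `IndexRegistry` class, its `cache_index` and `expire_by_key_value` methods, and the key format below are hypothetical stand-ins, not Identity Cache's implementation. The sketch only illustrates that a single entry point iterating over every registered index picks up newly added indexes automatically, whereas per-index methods would each have to be called explicitly.

```ruby
# Hypothetical stand-in for a model's cache index registry; not Identity Cache's API.
class IndexRegistry
  attr_reader :expired

  def initialize
    @indexes = []   # each entry is an array of key field names
    @expired = []   # cache keys that have been expired, kept for inspection
  end

  # Register an index over the given key fields, e.g. cache_index(:item_id).
  def cache_index(*fields)
    @indexes << fields
  end

  # Expire every registered index using the supplied key values.
  # Hash#fetch raises KeyError if a field needed by an index is missing.
  def expire_by_key_value(key_values)
    @indexes.each do |fields|
      values = fields.map { |field| key_values.fetch(field) }
      @expired << "#{fields.join('/')}:#{values.join('/')}"
    end
  end
end

registry = IndexRegistry.new
registry.cache_index(:item_id)
registry.expire_by_key_value(item_id: 1, sku: "abc")

# An index added later is covered by the same, unchanged call site.
registry.cache_index(:item_id, :sku)
registry.expire_by_key_value(item_id: 1, sku: "abc")
```

Note that this sketch does nothing about the old key problem: it only expires the key values it is given.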
Feel free to answer questions in the form of documentation. I'm also very concerned about how this could be misused, so documentation that helps guide the library user would be helpful.
lib/identity_cache/query_api.rb
Outdated
@@ -19,6 +20,13 @@ def all_cached_associations # :nodoc:
  cached_has_manys.merge(cached_has_ones).merge(cached_belongs_tos)
end

# Expire the cache by key values
def expire_by_key_values(key_values) # :nodoc:
`# :nodoc` generally implies the code is internal (although, I think we should use `# @api private` to make that clearer). However, in this case you actually intend for this to be public.
It doesn't seem like the callsite for this method will be clear enough for readers of these method calls. `key_values` is a term used in the method name, yet nowhere else in the usage of Identity Cache, so it will be hard for the reader of the method call to know what this is referring to. The object it is called on is the Active Record model, so it isn't even in the context of Identity Cache, let alone in the context of a cache attribute or index.
lib/identity_cache/query_api.rb
Outdated
@@ -19,6 +20,13 @@ def all_cached_associations # :nodoc:
  cached_has_manys.merge(cached_has_ones).merge(cached_belongs_tos)
end

# Expire the cache by key values
This comment seems quite useless, considering that it mostly just repeats the method name, with the addition of `the cache`. It is quite clear that this library is all about caches.
Documentation could provide some very helpful context to these methods.
For instance, it isn't clear outside of the PR description that these methods are for expiring the cache when doing an `upsert_all` or `update_all`. The code itself should answer questions about why the method exists without having to do git archeology to get this information.
Even the PR description doesn't explain when to use this method compared to the meta-programming defined ones, which would be helpful context on why this specific method actually exists. Also, how the method can be used to actually solve your problem is not explained anywhere, and it would be important to document so that developers who encounter the same problem can discover it or be referred to it.
def expire_by_key_value(key_values)
  missing_keys = missing_keys(key_values)
  unless missing_keys.empty?
    raise MissingKeyName,
We don't just need the name of the key, so `MissingKey` seems like a more appropriate name, but then why not just use `KeyError`?
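As a rough illustration of the `KeyError` suggestion: Ruby's `Hash#fetch` already raises `KeyError` for an absent key, so a custom exception class may not be needed. The `validate_key_values!` helper and its `required_fields` parameter are hypothetical names for this sketch and are not part of the PR.

```ruby
# Hypothetical helper: check that every required index field has a value
# before expiring anything, relying on the KeyError raised by Hash#fetch.
def validate_key_values!(required_fields, key_values)
  required_fields.each { |field| key_values.fetch(field) }
  key_values
end

# All required fields present: returns the hash unchanged.
validate_key_values!([:id, :item_id], { id: 1, item_id: 2 })

begin
  # :item_id is missing, so Hash#fetch raises KeyError.
  validate_key_values!([:id, :item_id], { id: 1 })
rescue KeyError => e
  missing_key = e.key          # the key that was not found
  error_message = e.message    # message names the missing key
end
```

The exception carries the offending key and the receiver hash, which covers the "we don't just need the name of the key" point without a new error class.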
  assert_equal("foo", AssociatedRecord.fetch_name_by_id_and_item_id(1, 1))
end

key_values = {
Could it be this hash that crashes the cop?
undefined method `loc' for nil:NilClass
/opt/hostedtoolcache/Ruby/3.0.1/x64/lib/ruby/gems/3.0.0/gems/rubocop-1.16.1/lib/rubocop/cop/layout/hash_alignment.rb:221:in `autocorrect_incompatible_with_other_cops?'
/opt/hostedtoolcache/Ruby/3.0.1/x64/lib/ruby/gems/3.0.0/gems/rubocop-1.16.1/lib/rubocop/cop/layout/hash_alignment.rb:206:in `on_hash'
Or maybe because `({ item_id: 1 })` does not need the `{}`?
The rubocop failure was a bug and the fix hasn't been released yet. I opened #496 to fix that.
# frozen_string_literal: true
require "test_helper"

class ExpireByKeyValuesTest < IdentityCache::TestCase
Note that I believe that these feature-based unit test files are an anti-pattern (#498). I just hadn't gotten around to actually refactoring the tests around the code.
There are situations where performance is crucial. In those cases it can be desirable to use methods that do not instantiate Active Record objects, for example `pluck` for reading data and `upsert_all` for writing data. These give the developer the flexibility to read or update the needed information without taking the time to instantiate an Active Record object for each row, as this instantiation can really add up in processing time.
Following the `pluck` example, the IDC feature `cache_attribute` is very useful when we need just one value of one record. When we need to update these records, it might be preferable to use `upsert_all`, which does not instantiate any Active Record objects, and in that case we don't have a way to expire the cache. A possible way to accomplish this is to fetch the data before doing the update, or to update the records through the Active Record objects, but the performance would be very poor in the cases where `upsert_all` is preferred.
In this PR we are proposing a method to expire the cache when you don't have direct access to an Active Record instance. The method is attached to the `Attribute` class and we can access it like this:
We decided to pass a hash for the case where an attribute is cached by multiple keys:
This also lets us have this case:
We made this as flexible as possible to handle these different possibilities. To avoid expiring the cache while missing one key, the attribute class will raise an exception letting you know that there is a missing field.
If the update changes the value of the key, then pluck should be used to get the current values, and the expiration call should include both the new and old values of the keys.
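A minimal sketch of that sequence, using plain Ruby structures in place of the database and the cache. The `ROWS` and `CACHE` constants and the `fetch_name_by_item_id`, `expire_by_item_id`, and `bulk_move` helpers are hypothetical stand-ins (the real calls would be `pluck`, `update_all` or `upsert_all`, and the proposed expiry method). Only the ordering is the point: capture the old key values, write, then expire both old and new.

```ruby
# Hypothetical in-memory stand-ins for a table and an index cache keyed by item_id.
ROWS  = [{ id: 1, item_id: 10, name: "foo" }]
CACHE = {}

# Read-through fetch of name by item_id, similar in spirit to a cached attribute.
# Misses are cached too (as nil), which is why the new key must also be expired.
def fetch_name_by_item_id(item_id)
  CACHE.fetch(item_id) do
    row = ROWS.find { |r| r[:item_id] == item_id }
    CACHE[item_id] = row && row[:name]
  end
end

def expire_by_item_id(*item_ids)
  item_ids.each { |item_id| CACHE.delete(item_id) }
end

# Bulk-change the key column without instantiating model objects.
def bulk_move(from_item_id, to_item_id)
  # 1. Capture the old key values first (the role pluck plays in the text).
  old_ids = ROWS.select { |r| r[:item_id] == from_item_id }.map { |r| r[:item_id] }
  # 2. Perform the write (the role of update_all / upsert_all).
  ROWS.each { |r| r[:item_id] = to_item_id if r[:item_id] == from_item_id }
  # 3. Expire both the old and the new key values.
  expire_by_item_id(*old_ids, to_item_id)
end

fetch_name_by_item_id(10)   # warms the cache under the old key
fetch_name_by_item_id(20)   # caches a miss under the future key
bulk_move(10, 20)
```

Skipping the old key in step 3 would leave `"foo"` readable under `item_id` 10, and skipping the new key would leave the cached miss under `item_id` 20.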
There is also some discussion about these changes. The main questions that come out of them are:
`expire_<attribute>_by_<key>`? In my opinion it could introduce a problem where you only expire one key of the object and not the other one, but we believe it is a good discussion to have.
`NameOfScope::Item.cache_indexes.each`
@Shopify/delivery-platform