Recyclable cache keys #29092
This will form the foundation of recyclable cache keys.
I don't like the resulting API I would be expected to use here:
    Rails.cache.fetch(["post_preview", post], version: post.updated_at)
    # or slightly shorter version:
    Rails.cache.fetch(["post_preview", post], version: post)
The need to explicitly pass version is sad. Rails should still maintain a nicer API:
    Rails.cache.fetch(["post_preview", post])
    # with multiple objects too:
    Rails.cache.fetch(["post_preview", post, post.author])
Supposing that any object passed as cache key has a cache_version method besides cache_key:
    class Post
      def cache_key
        "post/#{id}"
      end
      def cache_version
        updated_at
      end
    end
|
This is a low-level API that is rarely if ever intended to be used
directly. The sugar will be supplied by CacheHelper, which will do the work
of dissecting an active record object to separate stable cache key from
version information.
(In reply to Bogdan Gusiev's comment of Mon, May 15, 2017, 4:31 PM.)
|
This API is documented and well maintained. There is no reason for anyone to avoid using this API directly. What is an intended way to cache some heavy calculations in model if not through this API? I use this API directly a lot because we have a project with heavy calculations that are outside of the HTTP layer. Also, this functionality has nothing to do with the HTTP stack, so there is no benefit from it being only available at the ActionView level. |
You're free to do that, and if you do, you can use it as it's been
documented. But this caching API should work for all sorts of different
purposes, so I prefer to keep it low-level and not overly tied to any
larger set of APIs on the other side.
Anyway, once this is completed, you're always free to propose a different
kind of API. But it's not a direction I'm currently interested in pursuing
for this.
(In reply to Bogdan Gusiev's comment of Mon, May 15, 2017, 4:53 PM.)
|
Furthermore, any change to CacheStore has to be backwards compatible. If
you supply the same values and get a different key, that's no bueno. Using
:version explicitly is backwards compatible.
(In reply to Bogdan Gusiev's comment of Mon, May 15, 2017, 4:53 PM.)
|
Agreed, this is not 100% backward compatible. But if this is how you would design the system when starting from scratch, that is a good sign it is a direction worth reviewing. We can search for solutions if backward compatibility is the only concern. |
That's just one of the considerations. Anyway, feel free to fork this PR
and work on an alternative implementation. I've noted my skepticism to
having a lot of convention at this level of the stack, but always happy to
consider some real code, even if it's a long shot.
(In reply to Bogdan Gusiev's comment of Mon, May 15, 2017, 5:36 PM.)
|
Just ask ol' daddy-o for some bigger smackeroos then, sonny! 😄 |
Are you tackling this as part of this PR or intending that for a later one? |
Mostly just found the docs a bit hard to read, 👍
Add a changelog entry too 😉
@@ -232,6 +232,11 @@ def mute
# new value. After that all the processes will start getting the new value.
# The key is to keep <tt>:race_condition_ttl</tt> small.
#
# Setting <tt>:version</tt> will verify that the cache stored in the <tt>name</tt>
This took me some tries to understand, so I took a stab at slimming it:
# Passing a <tt>:version</tt> verifies the cache stored under <tt>name</tt>
# is of the same version. nil is returned on mismatches despite contents.
# This feature is used to support recyclable cache keys.
Like that.
@@ -307,17 +313,30 @@ def fetch(name, options = nil)
# the cache with the given key, then that data is returned. Otherwise,
# +nil+ is returned.
#
# As with fetch, the data is only returned if it has not expired per the
Why mention `fetch`? Also confusing when put together with "Fetches data from the cache", which is how the `read` doc starts.
@@ -307,17 +313,30 @@ def fetch(name, options = nil)
# the cache with the given key, then that data is returned. Otherwise,
# +nil+ is returned.
#
# As with fetch, the data is only returned if it has not expired per the
# <tt>:expires_in<tt> option, and, if a <tt>:version</tt> parameter is passed
# to <tt>read</tt>, if it matches the <tt>:version</tt> it was written with.
Also took me some tries. How about:
# Any found entry is only returned if it has not expired per the
# <tt>:expires_in</tt> option and when passed a
# <tt>:version</tt> parameter if the entry's version matches that.
Make it so.
elsif entry.mismatched?(options[:version])
  if payload
    payload[:hit] = false
    payload[:mismatch] = "#{entry.version} != #{options[:version]}"
I assume we'll be logging this in the Action View log subscribing, right?
Yes, I have a subscriber there to output this. It's a bit annoying that this is disconnected from the general instrumentation of read/write, because it means it has to be logged as a separate line. Would be much nicer if the read log line could say whether there was a version hit or not. Thoughts on how?
@dhh why does it have to be logged as a separate line?
I guess we could make this type of logging have tags as well, except they're trailing, unlike the standard tagged logging.
Something like:
    payload.tags[:hit] = false
    payload.tags[:mismatch] = "#{entry.version} != #{options[:version]}"
    # Later: …cache/key… [hit: false][mismatch: 123 != 456]
Would require payload being more than a Hash though.
Yeah, right now we're not using this information. Don't think we need it at the moment given the fact that all fragment keys are now using versions. I'll nix it.
Could you have a look at the cache hit/miss issue?
Yeah, I'd rather hit that issue out of the park than miss it 🤓
I don't intend to change that method. What the CacheHelper#vcache method I will propose shortly does is check arity on that method and call it with #cache_key(:without_version). That relies on the fact that the first parameter to AR::Base#cache_key specifies the timestamp field to be used and if there's no match, it won't include a timestamp. |
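The trick described above, where an unmatched timestamp name yields a key without a timestamp, can be sketched in plain Ruby. This is only an illustration under stated assumptions: `Person`, its `attributes` hash, and the key format are hypothetical stand-ins, not Active Record's implementation.

```ruby
# Hypothetical model stand-in; not ActiveRecord::Base.
class Person
  attr_reader :id, :attributes

  def initialize(id, attributes)
    @id = id
    @attributes = attributes
  end

  # With no arguments, use :updated_at. With arguments, use the newest of the
  # named columns. If none of the names match a column, no timestamp is added.
  def cache_key(*timestamp_names)
    names = timestamp_names.empty? ? [:updated_at] : timestamp_names
    timestamp = names.map { |name| attributes[name] }.compact.max
    timestamp ? "people/#{id}-#{timestamp}" : "people/#{id}"
  end
end

person = Person.new(5, { updated_at: "20170516140500", bio_updated_at: "20170101000000" })

puts person.cache_key
puts person.cache_key(:bio_updated_at)
puts person.cache_key(:without_version) # no such column, so the key stays stable
```

Because `:without_version` names no column, the last call returns only the stable part of the key, which is the property a version-aware helper would rely on.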
@kaspth Any thoughts on how we could make this work with multiget? I'm thinking that this low-level API could just be |
That sounds pretty sad because […] In the meanwhile the behavior of […] I would aim for a more idealistic solution, at least long term, and change the […] Apps relying on […] Here is the way one can maintain the old behavior basically forever:
    module DeprecatedCacheKey
      def cache_key
        # old cache key implementation
      end
      def cache_version
        nil
      end
    end
    # To maintain the old behavior temporarily
    ActiveRecord::Base.send(:include, DeprecatedCacheKey)
    # To migrate models one by one relying on old behavior:
    Post.send(:include, DeprecatedCacheKey)
|
I think it's quite reasonable to leave CacheHelper#cache in place and use a second helper, like CacheHelper#vcache, that uses the new version-based strategy. This would mean that people can adopt as they please.
|
Although, we could also consider that if cache_version is present on the model, then cache_key does not include the version. That would be backwards compatible. |
When I see vcache besides cache, I imagine how I would deliver the knowledge to a team of 20 people, where not all of them are Sr developers. Ideally I want vcache to be the only thing used. I don't imagine vcache and cache working together in apps that were generated on 5.2+. cache versioning: true looks like a better idea anyway.
That is a good step forward. We should definitely do that in case we go for the cache_version method support.
Having to define a canonical cache_version method in each app would be a problem.
It can be simplified to the following:
    class ApplicationRecord < AR::Base
      def cache_version
        ActiveRecord::Base.cache_version(self)
      end
    end
    # or maybe even
    ApplicationRecord.cache_version = true
|
I would bake #cache_version into ActiveModel and then perhaps provide a
top-level setting for making it return nil for backwards compatibility.
Let's see how far we can get with all this. If we can arrive at a place
where we have full backwards compatibility, full support for multiget etc,
then maybe we won't need a new vcache method.
(In reply to Bogdan Gusiev's comment of Tue, May 16, 2017, 2:05 PM.)
|
@@ -547,6 +579,10 @@ def normalize_key(key, options)
  key
end

def normalize_version(key, options)
  options[:version] || key.try(:cache_version)
You need to support more advanced array keys like:
fetch(["preview", post])
fetch(["preview", post, post.author])
There is an implementation in expand_cache_version here: https://github.com/rails/rails/pull/29107/files#diff-438394335b9c1ce6ec4f67a407f50a42R89
I think the #cache_version as the toggle has legs. Just extended it to Active Record in a backwards compatible form 👍. |
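The array-key case raised in the review can be sketched as follows. This is a minimal illustration, not the code from either PR: `expand_key`, `expand_version`, and the `Post` and `Author` structs are hypothetical names, and joining versions with "/" is an assumption made for the example.

```ruby
# Hypothetical record types that expose a stable key and a separate version.
Post = Struct.new(:id, :updated_at) do
  def cache_key
    "post/#{id}"
  end

  def cache_version
    updated_at
  end
end

Author = Struct.new(:id, :updated_at) do
  def cache_key
    "author/#{id}"
  end

  def cache_version
    updated_at
  end
end

# Builds the stable key, descending into arrays.
def expand_key(key)
  case
  when key.respond_to?(:cache_key) then key.cache_key
  when key.is_a?(Array) then key.map { |part| expand_key(part) }.join("/")
  else key.to_s
  end
end

# Collects versions from every element that has one; returns nil if none do.
def expand_version(key)
  case
  when key.respond_to?(:cache_version) then key.cache_version.to_s
  when key.is_a?(Array)
    versions = key.map { |part| expand_version(part) }.compact
    versions.empty? ? nil : versions.join("/")
  end
end

post = Post.new(1, "20170515163100")
author = Author.new(7, "20170401000000")
key = ["post_preview", post, author]

puts expand_key(key)
puts expand_version(key)
```

Plain strings contribute to the key but not to the version, so a key made only of strings has a nil version and behaves as it did before versioning.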
##
# :singleton-method:
# Indicates whether to use a stable #cache_key method that is accompaigned
s/accompaigned/accompanied/
@@ -85,6 +85,14 @@ def expand_cache_key(key, namespace = nil)
  expanded_cache_key
end

def expand_cache_version(key)
  case
  when key.respond_to?(:cache_version) then key.cache_version
We need to have consistent behavior for the cache version. cache_key can only be a string in the end (meaning that a String is always put into the cache store). Maybe we need the same for version. In that case we always need to call to_param here.
def cache_key(*timestamp_names)
  if new_record?
    "#{model_name.cache_key}/new"
  else
    timestamp = if timestamp_names.any?
      max_updated_column_timestamp(timestamp_names)
    if cache_version
When cache_key is called with an argument, the cache_version setting should be ignored.
I am not sure why cache_key has an argument: I have checked the code and there are no calls to cache_key with arguments, either from AS::Cache or from other places. I would deprecate the argument as part of this PR.
The argument is for calling like <% cache [ person.cache_key(:bio_updated_at) ] %>, but yes, not a common usage.
Got it. From this use case: when cache_key is called with an argument, the cache_version setting should be ignored.
@@ -85,6 +85,14 @@ def expand_cache_key(key, namespace = nil)
  expanded_cache_key
end

def expand_cache_version(key)
This method is currently public. I am not sure why anyone would call it explicitly. I would make it private. Otherwise it needs to be documented.
Current problem is that to use versioning via views, it goes CacheHelper#cache -> Fragments#write_fragment -> ActiveSupport::Cache.expand_cache_key -> ActiveSupport::Cache::Store#write. Well, in the expand_cache_key step we currently convert the key to a string, which means that ActiveSupport::Cache::Store#write can't introspect it for #cache_version. So need to retain the array-based key all the way from the top to the store, which I'm having some trouble trying to do in a backwards compatible manner. |
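The loss of version information at the string-conversion step can be shown with a small sketch. It is not the Rails call chain itself: `flatten_key` is a hypothetical stand-in for the key expansion step, and `Comment` is an invented record type.

```ruby
# Hypothetical record type; only cache_key and cache_version matter here.
Comment = Struct.new(:id, :updated_at) do
  def cache_key
    "comments/#{id}"
  end

  def cache_version
    updated_at
  end
end

# Stand-in for the flattening step: the result is a plain String.
def flatten_key(parts)
  parts.map { |part| part.respond_to?(:cache_key) ? part.cache_key : part.to_s }.join("/")
end

comment = Comment.new(3, "20170516140500")
array_key = ["views", comment]
flat_key = flatten_key(array_key)

puts flat_key
puts array_key.any? { |part| part.respond_to?(:cache_version) } # still introspectable
puts flat_key.respond_to?(:cache_version)                       # version is gone
```

Once the key is a String, the store has nothing left to ask for a version, which is why the array form would need to survive all the way down to the store.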
ActiveSupport 5.2.0 introduces the concept of [recyclable cache keys](rails/rails#29092), which prevents a bunch of unnecessary keys being created and then having to be evicted. This should reduce the memory overhead of keeping a Redis or Memcache server up in order to support a cache. This caused issues for us because the `options` are only being written into the cache key, not the actual entry, which is now required by Rails' cache store semantics. This should fix issues that people are having with rails 5.2.0.rc1 and redis-activesupport.
I noticed some unexpected behaviour related to this change, when used with Active Record relations (e.g. The cache_key implementation works great, but when I can work around it by using something like This is, by the way, without versioning switched on. It's probably an anti-pattern to use AR relations as cache keys in views(?), but this still caught me by surprise. It works and produces the right cache key, but with a pretty hefty and unexpected side-effect. Is there a plan to implement |
* After introducing cache versioning, even with cache versioning off, there's a performance regression when passing an Active Record relation to cache
* This happens in ActiveSupport::Cache inside `normalize_version`
* This method would check if the relation responds to cache_version and, if not, would recursively normalize it with `to_a`
* This would lead to the relation being retrieved from the database and enumerated, causing the performance regression
* This fix simply adds `cache_version` returning `nil` to Active Record relations
* This is a temporary stopgap, until relation cache versioning is implemented. See rails#34378
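The regression and the stopgap described in this commit message can be reproduced in miniature. The sketch below is an assumption-laden model of the behavior: `FakeRelation`, `PatchedRelation`, and this `normalize_version` are hypothetical and only mimic the fallback to `to_a` that the message describes.

```ruby
# Hypothetical stand-in for a lazily loaded relation; not Active Record.
class FakeRelation
  attr_reader :loads

  def initialize(rows)
    @rows = rows
    @loads = 0
  end

  # Stands in for the database query a real relation would run.
  def to_a
    @loads += 1
    @rows.dup
  end
end

class PatchedRelation < FakeRelation
  # The stopgap: answer cache_version directly so nothing falls back to to_a.
  def cache_version
    nil
  end
end

# Simplified version lookup with the to_a fallback described above.
def normalize_version(key)
  if key.respond_to?(:cache_version)
    key.cache_version
  elsif key.is_a?(Array)
    key.map { |part| normalize_version(part) }.compact.first
  elsif key.respond_to?(:to_a)
    normalize_version(key.to_a)
  end
end

plain = FakeRelation.new(["row 1", "row 2"])
patched = PatchedRelation.new(["row 1", "row 2"])

normalize_version(plain)
normalize_version(patched)

puts plain.loads   # the unpatched relation was enumerated
puts patched.loads # the patched relation was never loaded
```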
This keeps the version separate from Rails' standard cache_key, which allows for better recycling of cached entries as things get updated frequently. This shouldn't impact our app right now, as we are only using our own manual cache keys for analytics. For more details on how recyclable cache keys work, see:
* rails/rails#29092
* https://dzone.com/articles/cache-invalidation-complexity-rails-52-and-dalli-c
Key-based cache expiration is an incredibly powerful, simple way to do away with the error-ridden ways of manual cache expiration, but it can also be highly wasteful and generate lots of cache trash.
This happens when you have keys which churn at high velocity, leaving the abandoned keys to be garbage collected with no hint to the fact that they will never be used again. If your cache write volume is high, and you turn over your entire cache allowance frequently enough, this results in cache trash crowding out less-frequently-accessed-but-still-valid keys, which in turn leads to high cache miss rates.
We can solve this problem by making the keys stable by separating out the explicit version. So you can keep a stable key, like "products/1", and an associated version, like "20170202145500", instead of the combined "products/1-20170202145500" key we've been using so far. This means that no matter how frequently Product/1 is touched, it'll still only write to the same cache key. That's the recycling part here.
This approach is similar to how HTTP caching works. There's a cache key in the form of the URL and then there's a version component in the form of the ETag.
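The stable key plus separate version idea can be sketched with a tiny in-memory store. This is an illustration only, assuming a hypothetical `VersionedStore` class: the version is kept inside the entry, and a read with a mismatched version behaves like a miss, so the next write reuses the same slot.

```ruby
# Hypothetical in-memory store illustrating recyclable keys; this is not
# the ActiveSupport::Cache implementation.
class VersionedStore
  Entry = Struct.new(:value, :version)

  def initialize
    @data = {}
  end

  # Overwrites whatever is stored under the stable key.
  def write(key, value, version: nil)
    @data[key] = Entry.new(value, version)
    value
  end

  # Returns nil when nothing is stored or when the stored version differs.
  def read(key, version: nil)
    entry = @data[key]
    return nil if entry.nil?
    return nil if version && entry.version != version
    entry.value
  end

  # Reads the entry, or computes and stores it on a miss or version mismatch.
  def fetch(key, version: nil)
    cached = read(key, version: version)
    return cached unless cached.nil?
    write(key, yield, version: version)
  end

  def size
    @data.size
  end
end

store = VersionedStore.new
store.fetch("products/1", version: "20170202145500") { "old preview" }
store.fetch("products/1", version: "20170203090000") { "new preview" }

puts store.size # the same slot was reused after the product was touched
```

With combined keys, the two writes above would have left two entries behind, one of them permanently stale.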