New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recyclable cache keys #29092

Merged
merged 42 commits into from May 18, 2017

Conversation

Projects
None yet
9 participants
@dhh
Member

dhh commented May 15, 2017

Key-based cache expiration is an incredibly powerful, simple way to do away with the error-ridden ways of manual cache expiration, but it can also be highly wasteful and generate lots of cache trash.

This happens when you have keys which churn at high velocity, leaving the abandoned keys to be garbage collected with no hint to the fact that they will never be used again. If you cache write volume is high, and you turn over your entire cache allowance frequently enough, this results in cache trash crowding out less-frequently-accessed-but-still-valid keys. Which in turn leads to high cache miss rates.

We can solve this problem by making the keys stable by separating the explicit version. So you can keep a stable key, like "products/1" and an associated version, like "20170202145500", instead of the combined "projects/1-20170202145500" key we've been using so far. This means that no matter how frequently Product/1 is touched, it'll still only write to the same cache key. That's the recycling part here.

This approach is similar to how HTTP caching works. There's a cache key in the form of the URL and then there's a version component in form of the ETAG.

Support versioned cache entries
This will form the foundation of recyclable cache keys.

@dhh dhh added the activesupport label May 15, 2017

@dhh dhh added this to the 5.2.0 milestone May 15, 2017

@bogdan

This comment has been minimized.

Show comment
Hide comment
@bogdan

bogdan May 15, 2017

Contributor

I don't like the resulting API I would suppose to be using here:

Rails.cache.fetch(["post_preview", post], version: post.updated_at)
# or slightly shorter version:
Rails.cache.fetch(["post_preview", post], version: post)

The need to explicitly pass version is sad.

Rails should still need to maintain a nicer API:

Rails.cache.fetch(["post_preview", post])
# with multiple objects too:
Rails.cache.fetch(["post_preview", post, post.author]) 

Supposing that any object passed as cache key to have a cache_version method besides cache_key:

class Post
  def cache_key
     "post/#{id}"
  end
  def cache_version
    updated_at
  end
end
Contributor

bogdan commented May 15, 2017

I don't like the resulting API I would suppose to be using here:

Rails.cache.fetch(["post_preview", post], version: post.updated_at)
# or slightly shorter version:
Rails.cache.fetch(["post_preview", post], version: post)

The need to explicitly pass version is sad.

Rails should still need to maintain a nicer API:

Rails.cache.fetch(["post_preview", post])
# with multiple objects too:
Rails.cache.fetch(["post_preview", post, post.author]) 

Supposing that any object passed as cache key to have a cache_version method besides cache_key:

class Post
  def cache_key
     "post/#{id}"
  end
  def cache_version
    updated_at
  end
end
@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 15, 2017

Member
Member

dhh commented May 15, 2017

@bogdan

This comment has been minimized.

Show comment
Hide comment
@bogdan

bogdan May 15, 2017

Contributor

This API is documented and well maintained. There is no reason for anyone to avoid using this API directly.
What is an intended way to cache some heavy calculations in model if not through this API?

I use this API directly a lot because we have a project with heavy calculations that are outside of HTTP layer. Also this functionality has nothing to do with HTTP stack so there is no benefit from it being only available in the ActionView level.

Contributor

bogdan commented May 15, 2017

This API is documented and well maintained. There is no reason for anyone to avoid using this API directly.
What is an intended way to cache some heavy calculations in model if not through this API?

I use this API directly a lot because we have a project with heavy calculations that are outside of HTTP layer. Also this functionality has nothing to do with HTTP stack so there is no benefit from it being only available in the ActionView level.

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 15, 2017

Member
Member

dhh commented May 15, 2017

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 15, 2017

Member
Member

dhh commented May 15, 2017

@bogdan

This comment has been minimized.

Show comment
Hide comment
@bogdan

bogdan May 15, 2017

Contributor

Agree this is not 100% backward compatible. But if you would design the system this way, if you would start it from scratch, this is a good sign to review this as a direction to go. We can search for solutions if backward compatibility is the only concern.

Contributor

bogdan commented May 15, 2017

Agree this is not 100% backward compatible. But if you would design the system this way, if you would start it from scratch, this is a good sign to review this as a direction to go. We can search for solutions if backward compatibility is the only concern.

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 15, 2017

Member
Member

dhh commented May 15, 2017

@kaspth

This comment has been minimized.

Show comment
Hide comment
@kaspth

kaspth May 15, 2017

Member

you turn over your entire cache allowance frequently enough

Just ask ol' daddy-o for some bigger smackeroos then, sonny! 😄

Member

kaspth commented May 15, 2017

you turn over your entire cache allowance frequently enough

Just ask ol' daddy-o for some bigger smackeroos then, sonny! 😄

@kaspth

This comment has been minimized.

Show comment
Hide comment
@kaspth

kaspth May 15, 2017

Member

Unfinished: Dealing with multi_get/fetch

Are you tackling this as part of this PR or intending that for a later one?

Member

kaspth commented May 15, 2017

Unfinished: Dealing with multi_get/fetch

Are you tackling this as part of this PR or intending that for a later one?

@kaspth

kaspth approved these changes May 15, 2017

Mostly just found the docs a bit hard to read, 👍

Add a changelog entry too 😉

Show outdated Hide outdated activesupport/lib/active_support/cache.rb Outdated
Show outdated Hide outdated activesupport/lib/active_support/cache.rb Outdated
Show outdated Hide outdated activesupport/lib/active_support/cache.rb Outdated
Show outdated Hide outdated activesupport/lib/active_support/cache.rb Outdated
@bogdan

This comment has been minimized.

Show comment
Hide comment
@bogdan

bogdan May 16, 2017

Contributor

@dhh do you plan to change ActiveRecord::Base#cache_key method? I think you can not move forward without it and it will be backward incompatible anyway (or at least generate the same type of problem as in my suggestion). Anyway I made the cache_version method in the way imagine it: #29107

Contributor

bogdan commented May 16, 2017

@dhh do you plan to change ActiveRecord::Base#cache_key method? I think you can not move forward without it and it will be backward incompatible anyway (or at least generate the same type of problem as in my suggestion). Anyway I made the cache_version method in the way imagine it: #29107

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 16, 2017

Member

I don't intend to change that method. What the CacheHelper#vcache method I will propose shortly does is check arity on that method and call it with #cache_key(:without_version). That relies on the fact that the first parameter to AR::Base#cache_key specifies the timestamp field to be used and if there's no match, it won't include a timestamp.

Member

dhh commented May 16, 2017

I don't intend to change that method. What the CacheHelper#vcache method I will propose shortly does is check arity on that method and call it with #cache_key(:without_version). That relies on the fact that the first parameter to AR::Base#cache_key specifies the timestamp field to be used and if there's no match, it won't include a timestamp.

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 16, 2017

Member

@kaspth Any thoughts on how we could make this work with multiget? I'm thinking that this low-level API could just be Rails.cache.fetch("a", "b", versions: [ 1, 2 ]). Then we'd need to do the right thing for cache: true in the render collection call.

Member

dhh commented May 16, 2017

@kaspth Any thoughts on how we could make this work with multiget? I'm thinking that this low-level API could just be Rails.cache.fetch("a", "b", versions: [ 1, 2 ]). Then we'd need to do the right thing for cache: true in the render collection call.

@bogdan

This comment has been minimized.

Show comment
Hide comment
@bogdan

bogdan May 16, 2017

Contributor

That sounds pretty sad because cache_key is only used for ActiveSupport::Cache implementation (correct me if I am wrong). It means that cache_key argument and timestamp will be an artifact we would need to support.

In the meanwhile the behavior of CacheHelper#cache will be changed anyway. It may affect Apps in the same way the change in CacheStore#fetch would.

I would goal for a more idealistic solution at least long term and change the cache_key method to remove the timestamp.

Apps relaying on cache_key to include timestamp would not be necessary as cache_key is designed to be defined per app per model.

Here is the way one can maintain the old behavior basically forever:

module DeprecatedCacheKey
  def cache_key
     # old cache key implementation
  end
  def cache_version
    nil
  end
end
# To maintain the old behavior temporary
ActiveRecord::Base.send(:include, DeprecatedCacheKey)
# To migrate models one by one relaying on old behavior: 
Post.send(:include, DeprecatedCacheKey)
Contributor

bogdan commented May 16, 2017

That sounds pretty sad because cache_key is only used for ActiveSupport::Cache implementation (correct me if I am wrong). It means that cache_key argument and timestamp will be an artifact we would need to support.

In the meanwhile the behavior of CacheHelper#cache will be changed anyway. It may affect Apps in the same way the change in CacheStore#fetch would.

I would goal for a more idealistic solution at least long term and change the cache_key method to remove the timestamp.

Apps relaying on cache_key to include timestamp would not be necessary as cache_key is designed to be defined per app per model.

Here is the way one can maintain the old behavior basically forever:

module DeprecatedCacheKey
  def cache_key
     # old cache key implementation
  end
  def cache_version
    nil
  end
end
# To maintain the old behavior temporary
ActiveRecord::Base.send(:include, DeprecatedCacheKey)
# To migrate models one by one relaying on old behavior: 
Post.send(:include, DeprecatedCacheKey)
@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 16, 2017

Member

I think it's quite reasonable to leave CacheHelper#cache in place and use a second helper, like CacheHelper#vcache that uses the new version-based strategy. This would mean that people can adopt as they please.

cache_key is used in a variety of ways, not just by ActiveSupport::Cache. I don't think blowing backwards compatibility is necessary or would buy us very much.

Member

dhh commented May 16, 2017

I think it's quite reasonable to leave CacheHelper#cache in place and use a second helper, like CacheHelper#vcache that uses the new version-based strategy. This would mean that people can adopt as they please.

cache_key is used in a variety of ways, not just by ActiveSupport::Cache. I don't think blowing backwards compatibility is necessary or would buy us very much.

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 16, 2017

Member

Although, we could also consider that if cache_version is present on the model, then cache_key does not include the version. That would be backwards compatible.

Member

dhh commented May 16, 2017

Although, we could also consider that if cache_version is present on the model, then cache_key does not include the version. That would be backwards compatible.

@bogdan

This comment has been minimized.

Show comment
Hide comment
@bogdan

bogdan May 16, 2017

Contributor

I think it's quite reasonable to leave CacheHelper#cache in place and use a second helper, like CacheHelper#vcache that uses the new version-based strategy. This would mean that people can adopt as they please.

When I see vcache besides cache, I imagine how I would deliver the knowledge to the team of 20 people, where not all of them are Sr developers. Ideally I want vcache to be the only thing used. I don't imagine vcache and cache working together in apps that were generated on 5.2+. cache versioning: true looks like a better idea anyway.

Although, we could also consider that if cache_version is present on the model, then cache_key does not include the version. That would be backwards compatible.

That is a good step forward. We should definitely do that in case we go for the cache_version method support.
The problem of canonic cache_version method in each app would be a problem.
It can be simplified to the following:

class ApplicationRecord < AR::Base
  def cache_version
    ActiveRecord::Base.cache_version(self)
  end
end
# or maybe even
ApplicationRecord.cache_version = true
Contributor

bogdan commented May 16, 2017

I think it's quite reasonable to leave CacheHelper#cache in place and use a second helper, like CacheHelper#vcache that uses the new version-based strategy. This would mean that people can adopt as they please.

When I see vcache besides cache, I imagine how I would deliver the knowledge to the team of 20 people, where not all of them are Sr developers. Ideally I want vcache to be the only thing used. I don't imagine vcache and cache working together in apps that were generated on 5.2+. cache versioning: true looks like a better idea anyway.

Although, we could also consider that if cache_version is present on the model, then cache_key does not include the version. That would be backwards compatible.

That is a good step forward. We should definitely do that in case we go for the cache_version method support.
The problem of canonic cache_version method in each app would be a problem.
It can be simplified to the following:

class ApplicationRecord < AR::Base
  def cache_version
    ActiveRecord::Base.cache_version(self)
  end
end
# or maybe even
ApplicationRecord.cache_version = true
@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 16, 2017

Member
Member

dhh commented May 16, 2017

dhh added some commits May 16, 2017

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 16, 2017

Member

I think the #cache_version as the toggle has legs. Just extended it to Active Record in a backwards compatible form 👍.

Member

dhh commented May 16, 2017

I think the #cache_version as the toggle has legs. Just extended it to Active Record in a backwards compatible form 👍.

Show outdated Hide outdated activesupport/lib/active_support/cache.rb Outdated
Show outdated Hide outdated activerecord/lib/active_record/integration.rb Outdated
Show outdated Hide outdated activesupport/lib/active_support/cache.rb Outdated
@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 17, 2017

Member

Current problem is that to use versioning via views, it goes CacheHelper#cache -> Fragments#write_fragment -> ActiveSupport::Cache.expand_cache_key -> ActiveSupport::Cache::Store#write. Well, in the expand_cache_key step we currently convert the key to a string, which means that ActiveSupport::Cache::Store#write can't introspect it for #cache_version. So need to retain the array-based key all the way from the top to the store, which I'm having some trouble trying to do in a backwards compatible manner.

Member

dhh commented May 17, 2017

Current problem is that to use versioning via views, it goes CacheHelper#cache -> Fragments#write_fragment -> ActiveSupport::Cache.expand_cache_key -> ActiveSupport::Cache::Store#write. Well, in the expand_cache_key step we currently convert the key to a string, which means that ActiveSupport::Cache::Store#write can't introspect it for #cache_version. So need to retain the array-based key all the way from the top to the store, which I'm having some trouble trying to do in a backwards compatible manner.

# Note, this method will return nil if ActiveRecord::Base.cache_versioning is set to
# +false+ (which it is by default until Rails 6.0).
def cache_version
try(:updated_at).try(:to_i) if cache_versioning

This comment has been minimized.

@bogdan

bogdan May 18, 2017

Contributor

As we convert all versions to string at the low level, it makes sense to call to_s here too so that direct call to cache_version would return the actual version value that will be put to the cache store.

@bogdan

bogdan May 18, 2017

Contributor

As we convert all versions to string at the low level, it makes sense to call to_s here too so that direct call to cache_version would return the actual version value that will be put to the cache store.

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 18, 2017

Member
Member

dhh commented May 18, 2017

@dhh

This comment has been minimized.

Show comment
Hide comment
@dhh

dhh May 18, 2017

Member

CI is complaining of this failure:

screen shot 2017-05-18 at 5 15 27 pm

I can't replicate that:

screen shot 2017-05-18 at 5 17 27 pm

So not going to let that hold up the merge.

Member

dhh commented May 18, 2017

CI is complaining of this failure:

screen shot 2017-05-18 at 5 15 27 pm

I can't replicate that:

screen shot 2017-05-18 at 5 17 27 pm

So not going to let that hold up the merge.

dhh added some commits May 18, 2017

@dhh dhh merged commit 75fa8dd into master May 18, 2017

0 of 2 checks passed

codeclimate Code Climate is analyzing this code.
Details
continuous-integration/travis-ci/push The Travis CI build is in progress
Details
@@ -121,7 +121,7 @@ def config_when_updating
action_cable_config_exist = File.exist?("config/cable.yml")
rack_cors_config_exist = File.exist?("config/initializers/cors.rb")
assets_config_exist = File.exist?("config/initializers/assets.rb")
new_framework_defaults_5_1_exist = File.exist?("config/initializers/new_framework_defaults_5_1.rb")
new_framework_defaults_5_2_exist = File.exist?("config/initializers/new_framework_defaults_5_2.rb")

This comment has been minimized.

@mikeycgto

mikeycgto May 19, 2017

Contributor

new_framework_defaults_5_2_exist is now an unused variable. It is getting reported as a warning from the test:

rails/railties/lib/rails/generators/rails/app/app_generator.rb:124: warning: assigned but unused variable - new_framewo
rk_defaults_5_2_exist
@mikeycgto

mikeycgto May 19, 2017

Contributor

new_framework_defaults_5_2_exist is now an unused variable. It is getting reported as a warning from the test:

rails/railties/lib/rails/generators/rails/app/app_generator.rb:124: warning: assigned but unused variable - new_framewo
rk_defaults_5_2_exist

This comment has been minimized.

@dhh

dhh May 20, 2017

Member

Fixed in e4ce81c 👍

@dhh

dhh May 20, 2017

Member

Fixed in e4ce81c 👍

bogdanvlviv added a commit to bogdanvlviv/rails that referenced this pull request May 20, 2017

@Dorian

This comment has been minimized.

Show comment
Hide comment
@Dorian

Dorian May 26, 2017

Contributor

For people hoping on the discussion like, here is how the cache keys used to be generated: https://stackoverflow.com/a/23347985/407213

Contributor

Dorian commented May 26, 2017

For people hoping on the discussion like, here is how the cache keys used to be generated: https://stackoverflow.com/a/23347985/407213

@dhh dhh deleted the versioned-cache-entries branch Aug 4, 2017

@tomrossi7

This comment has been minimized.

Show comment
Hide comment
@tomrossi7

tomrossi7 Jan 3, 2018

I'm having trouble migrating to edge Rails and noticed my cache keys no longer include timestamps. Will the discussion on this thread require updates to the documentation on caching here?

tomrossi7 commented Jan 3, 2018

I'm having trouble migrating to edge Rails and noticed my cache keys no longer include timestamps. Will the discussion on this thread require updates to the documentation on caching here?

tubbo added a commit to redis-store/redis-activesupport that referenced this pull request Mar 7, 2018

Support recyclable cache keys
ActiveSupport 5.2.0 introduces the concept of
[recyclable cache keys](rails/rails#29092),
which prevents a bunch of unnecessary keys being created and then having
to be evicted. This should reduce the memory overhead of keeping a Redis
or Memcache server up in order to support a cache. However, this caused
issues for us because of an unneccessary override of `#write`. I can't
really see much of a difference between our implementation and the one
[baked into AS::Cache::Store](https://github.com/rails/rails/blob/master/activesupport/lib/active_support/cache.rb#L438-L448),
other than this `:race_condition_ttl` code which seems unnecessary since
`Cache::Store` supports it under the hood.

This should fix issues that people are having with rails 5.2.0.rc1 and
redis-activesupport.

@tubbo tubbo referenced this pull request Mar 7, 2018

Merged

Support recyclable cache keys #103

tubbo added a commit to redis-store/redis-activesupport that referenced this pull request Mar 13, 2018

Support recyclable cache keys
ActiveSupport 5.2.0 introduces the concept of
[recyclable cache keys](rails/rails#29092),
which prevents a bunch of unnecessary keys being created and then having
to be evicted. This should reduce the memory overhead of keeping a Redis
or Memcache server up in order to support a cache. This caused issues
for us because the `options` are only being written into the cache key,
not the actual entry, which is now required by Rails' cache store
semantics. This should fix issues that people are having with rails 5.2.0.rc1
and redis-activesupport.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment