Add #cache_key to ActiveRecord::Relation. #20884

Merged
merged 1 commit into from Aug 2, 2015
10 changes: 10 additions & 0 deletions activerecord/CHANGELOG.md
@@ -1,3 +1,13 @@
*   Add `cache_key` to ActiveRecord::Relation.

    Example:

        @users = User.where("name like ?", "%Alberto%")
        @users.cache_key
        => "users/query-5942b155a43b139f2471b872ac54251f-3-20150714212107656125000"

    *Alberto Fernández-Capel*

*   Fix a bug where counter_cache doesn't always work with polymorphic
    relations.

1 change: 1 addition & 0 deletions activerecord/lib/active_record.rb
@@ -53,6 +53,7 @@ module ActiveRecord
autoload :Persistence
autoload :QueryCache
autoload :Querying
autoload :CollectionCacheKey
autoload :ReadonlyAttributes
autoload :RecordInvalid, 'active_record/validations'
autoload :Reflection
1 change: 1 addition & 0 deletions activerecord/lib/active_record/base.rb
@@ -280,6 +280,7 @@ class Base
extend Explain
extend Enum
extend Delegation::DelegateCache
extend CollectionCacheKey

include Core
include Persistence
29 changes: 29 additions & 0 deletions activerecord/lib/active_record/collection_cache_key.rb
@@ -0,0 +1,29 @@
module ActiveRecord
  module CollectionCacheKey

    def collection_cache_key(collection = all, timestamp_column = :updated_at) # :nodoc:
      query_signature = Digest::MD5.hexdigest(collection.to_sql)
      key = "#{collection.model_name.cache_key}/query-#{query_signature}"

      if collection.loaded?
        size = collection.size
        timestamp = collection.max_by(&timestamp_column).public_send(timestamp_column)
      else
        column_type = type_for_attribute(timestamp_column.to_s)
        column = "#{connection.quote_table_name(collection.table_name)}.#{connection.quote_column_name(timestamp_column)}"

        query = collection.select("COUNT(*) AS size", "MAX(#{column}) AS timestamp")
        result = connection.select_one(query)

        size = result["size"]
        timestamp = column_type.deserialize(result["timestamp"])
      end

      if timestamp
        "#{key}-#{size}-#{timestamp.utc.to_s(cache_timestamp_format)}"
      else
        "#{key}-#{size}"
Contributor

Is there a case where we would have no value for timestamp, but size is some value other than 0?

Contributor Author

The timestamp column could have NULL values. In that case, it probably wouldn't be a good idea to create a cache key like this, but at least the method behaves somewhat better in that situation.
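
To make the two key shapes discussed in this thread concrete, here is a minimal plain-Ruby sketch that does not depend on Active Record. `build_collection_key` and its parameters are hypothetical names, and the `strftime` pattern is an assumed stand-in for `cache_timestamp_format`; in the PR the SQL, size and timestamp come from the relation itself.

```ruby
require "digest/md5"

# Hypothetical helper mirroring the shape of the key built in this PR.
# model_key, sql, size and timestamp are passed in by the caller here;
# the real method derives them from the relation.
def build_collection_key(model_key, sql, size, timestamp)
  signature = Digest::MD5.hexdigest(sql)
  key = "#{model_key}/query-#{signature}"

  if timestamp
    # Assumption: a fixed strftime pattern stands in for cache_timestamp_format.
    "#{key}-#{size}-#{timestamp.utc.strftime('%Y%m%d%H%M%S%9N')}"
  else
    # No usable timestamp: an empty result, or every value is NULL.
    "#{key}-#{size}"
  end
end

sql = "SELECT * FROM products"
with_timestamp = build_collection_key("products", sql, 3, Time.utc(2015, 7, 14, 21, 21, 7))
without_timestamp = build_collection_key("products", sql, 3, nil)
```

With a NULL timestamp and a non-zero size, the key still changes when the number of matching rows changes, but not when a row is updated in place.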

      end
    end
  end
end
26 changes: 26 additions & 0 deletions activerecord/lib/active_record/relation.rb
@@ -298,6 +298,32 @@ def many?
      limit_value ? to_a.many? : size > 1
    end

    # Returns a cache key that can be used to identify the records fetched by
    # this query. The cache key is built with a fingerprint of the SQL query,
    # the number of records matched by the query and a timestamp of the last
    # updated record. When a new record comes to match the query, or any of
    # the existing records is updated or deleted, the cache key changes.
    #
    #   Product.where("name like ?", "%Cosmic Encounter%").cache_key
    #   => "products/query-1850ab3d302391b85b8693e941286659-1-20150714212553907087000"
    #
    # If the collection is loaded, the method will iterate through the records
    # to generate the timestamp, otherwise it will trigger one SQL query like:
    #
    #   SELECT COUNT(*), MAX("products"."updated_at") FROM "products" WHERE (name like '%Cosmic Encounter%')
    #
    # You can also pass a custom timestamp column to fetch the timestamp of the
    # last updated record.
    #
    #   Product.where("name like ?", "%Game%").cache_key(:last_reviewed_at)
Contributor

We need to document that the user can override this behavior by implementing self.collection_cache_key on the class.
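
As an illustration of the override mentioned here, the following is a small sketch with stand-in classes rather than real Active Record models. `CollectionCacheKeySketch`, `BaseSketch`, `Product` and `Report` are hypothetical, and the default key below is simplified (it hashes the collection's `inspect` output instead of SQL). It relies only on what the diff shows: the relation's `cache_key` calls `collection_cache_key` on the model class, so a class-level definition takes precedence over the extended module.

```ruby
require "digest/md5"

# Stand-in for the module added by this PR; the real one builds the key
# from the relation's SQL, size and latest timestamp.
module CollectionCacheKeySketch
  def collection_cache_key(collection = [], timestamp_column = :updated_at)
    signature = Digest::MD5.hexdigest(collection.inspect)
    "#{name.downcase}s/query-#{signature}-#{collection.size}"
  end
end

# Stand-in for ActiveRecord::Base, which extends the module in the PR.
class BaseSketch
  extend CollectionCacheKeySketch
end

# Uses the default strategy inherited from the base class.
class Product < BaseSketch
end

# Hypothetical model that replaces the default strategy with its own.
class Report < BaseSketch
  def self.collection_cache_key(collection = [], timestamp_column = :updated_at)
    "reports/v2-#{collection.size}"
  end
end

default_key = Product.collection_cache_key([1, 2, 3])
custom_key = Report.collection_cache_key([1, 2, 3])
```

Because the override is a singleton method on the model class, it only affects that model and its subclasses.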

    #
    # You can customize the strategy to generate the key on a per model basis
    # by overriding ActiveRecord::Base.collection_cache_key.
    def cache_key(timestamp_column = :updated_at)
Contributor

The name is a bit misleading: it describes why you might want to use this data, but not what the data actually is (e.g. it doesn't refer to any actual cache stored by Rails). I'm not really sure what a better name would be, though...

Contributor Author

I'm using the same name as ActiveRecord::Base#cache_key, since the idea behind it is essentially the same.

Contributor

For clarity, perhaps rename timestamp_column to cache_key_column. For example, tables without a timestamp but with an auto-incrementing id could use the id column by defining:

def cache_key
  super(:id)
end

On the other hand, this would introduce a discrepancy with the definition of collection_cache_key. Another alternative would be to extract a class method cache_key_column, which returns :updated_at by default and which the user could override.

Contributor Author

Sorry, I'm not sure cache_key_column is a better name. To me, def cache_key(cache_key_column) sounds like we will fetch the key from a column, which is not exactly right.
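
For readers following the naming discussion, this is a rough sketch of the class-level hook suggested earlier in the thread. `cache_key_column` is the hypothetical method proposed by the reviewer and is not part of this PR or of Active Record; `RecordSketch`, `AuditEntry` and `collection_version` are likewise made-up names, and plain hashes stand in for records.

```ruby
# Stand-in base class. cache_key_column is the hypothetical hook from the
# review comment, not an Active Record API.
class RecordSketch
  def self.cache_key_column
    :updated_at
  end

  # Simplified: returns the largest non-nil value of the configured column.
  def self.collection_version(records)
    records.map { |record| record[cache_key_column] }.compact.max
  end
end

# Hypothetical model for a table with no timestamp column.
class AuditEntry < RecordSketch
  def self.cache_key_column
    :id
  end
end

rows = [{ id: 1, updated_at: 10 }, { id: 7, updated_at: 5 }]
default_version = RecordSketch.collection_version(rows)
id_version = AuditEntry.collection_version(rows)
```

One trade-off of an id-based column is that the value does not change when an existing row is updated, so such a key would only reflect rows being added or removed.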

      @cache_keys ||= {}
      @cache_keys[timestamp_column] ||= @klass.collection_cache_key(self, timestamp_column)
    end

    # Scope all queries to the current scope.
    #
    #   Comment.where(post_id: 1).scoping do
70 changes: 70 additions & 0 deletions activerecord/test/cases/collection_cache_key_test.rb
@@ -0,0 +1,70 @@
require "cases/helper"
require "models/computer"
require "models/developer"
require "models/project"
require "models/topic"
require "models/post"
require "models/comment"

module ActiveRecord
  class CollectionCacheKeyTest < ActiveRecord::TestCase
    fixtures :developers, :projects, :developers_projects, :topics, :comments, :posts

    test "collection_cache_key on model" do
      assert_match(/\Adevelopers\/query-(\h+)-(\d+)-(\d+)\Z/, Developer.collection_cache_key)
    end

    test "cache_key for relation" do
      developers = Developer.where(name: "David")
      last_developer_timestamp = developers.order(updated_at: :desc).first.updated_at

      assert_match /\Adevelopers\/query-(\h+)-(\d+)-(\d+)\Z/, developers.cache_key

      /\Adevelopers\/query-(\h+)-(\d+)-(\d+)\Z/ =~ developers.cache_key
Contributor Author

assert_match doesn't set the global match variables.
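
The behavior described here follows from Ruby scoping the match variables (`$~`, `$1`, ...) to the method frame that performed the match. A minimal sketch, with `helper_match` as a hypothetical stand-in for `assert_match` and a made-up sample key:

```ruby
# Plays the role of assert_match: the match happens inside this method,
# so the match variables are set in this frame only.
def helper_match(pattern, string)
  pattern =~ string
end

# Reads $1 in a different frame from the one that ran the match.
def capture_after_helper(pattern, string)
  helper_match(pattern, string)
  $1
end

# The match and the read of $1 share a frame.
def capture_after_direct_match(pattern, string)
  pattern =~ string
  $1
end

# Alternative that avoids the special variables altogether.
def capture_with_match_data(pattern, string)
  match = pattern.match(string)
  match && match[1]
end

pattern = /\Adevelopers\/query-(\h+)-(\d+)-(\d+)\z/
sample_key = "developers/query-abc123-3-20150714212107656125000"

via_helper = capture_after_helper(pattern, sample_key)
via_direct = capture_after_direct_match(pattern, sample_key)
via_match_data = capture_with_match_data(pattern, sample_key)
```

This is why the test repeats the regular expression with `=~` before reading `$1`, `$2` and `$3`.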


      assert_equal Digest::MD5.hexdigest(developers.to_sql), $1
      assert_equal developers.count.to_s, $2
      assert_equal last_developer_timestamp.to_s(ActiveRecord::Base.cache_timestamp_format), $3
    end

    test "it triggers at most one query" do
      developers = Developer.where(name: "David")

      assert_queries(1) { developers.cache_key }
      assert_queries(0) { developers.cache_key }
    end

    test "it doesn't trigger any query if the relation is already loaded" do
      developers = Developer.where(name: "David").load
      assert_queries(0) { developers.cache_key }
    end

    test "relation cache_key changes when the sql query changes" do
      developers = Developer.where(name: "David")
      other_relation = Developer.where(name: "David").where("1 = 1")

      assert_not_equal developers.cache_key, other_relation.cache_key
    end

    test "cache_key for empty relation" do
      developers = Developer.where(name: "Non Existent Developer")
      assert_match(/\Adevelopers\/query-(\h+)-0\Z/, developers.cache_key)
    end
Contributor

This is great, but can we add a test case that shows that the count portion is a specific expected value for a cache key, both with and without a where clause?

Contributor Author

I've added these, to check each part of the cache key. Is that ok?


    test "cache_key with custom timestamp column" do
      topics = Topic.where("title like ?", "%Topic%")
      last_topic_timestamp = topics(:fifth).written_on.utc.to_s(:nsec)
      assert_match(last_topic_timestamp, topics.cache_key(:written_on))
    end

    test "cache_key with unknown timestamp column" do
      topics = Topic.where("title like ?", "%Topic%")
      assert_raises(ActiveRecord::StatementInvalid) { topics.cache_key(:published_at) }
    end

    test "collection proxy provides a cache_key" do
      developers = projects(:active_record).developers
      assert_match(/\Adevelopers\/query-(\h+)-(\d+)-(\d+)\Z/, developers.cache_key)
    end
  end
end