New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments API: fix caching issues #4744
Changes from all commits
1cb4973
d2a1a1a
a717463
3b510eb
c5cce26
46984b7
f3d0009
05d407b
99fafb8
4cb2a72
7c377b6
bb0c9c1
c0c3964
3700af9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,21 +3,42 @@ module V0 | |
class CommentsController < ApiController | ||
respond_to :json | ||
|
||
caches_action :index, | ||
cache_path: proc { |c| c.params.permit! }, | ||
expires_in: 10.minutes | ||
|
||
caches_action :show, | ||
cache_path: proc { |c| c.params.permit! }, | ||
expires_in: 10.minutes | ||
before_action :set_cache_control_headers, only: %i[index show] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this needs to be called explicitly to configure edge caching, see https://github.com/fastly/fastly-rails#headers |
||
|
||
def index | ||
article = Article.find(params[:a_id]) | ||
@comments = article.comments.roots.includes(:user).select(%i[id processed_html user_id ancestry]) | ||
|
||
@comments = article.comments.includes(:user).select(%i[id processed_html user_id ancestry]).arrange | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. here we tell
It will then be navigated by the |
||
|
||
set_surrogate_key_header "comments", edge_cache_keys(@comments) | ||
end | ||
|
||
def show | ||
@comment = Comment.find(params[:id].to_i(26)) | ||
tree_with_root_comment = Comment.subtree_of(params[:id].to_i(26)). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that users are very large objects and we might be loading a lot of them here I tried to look for ways we could limit the user data we were pulling as well and came up with using a [36] pry(main)> Comment.subtree_of(30).joins(:user).select("users.name, users.twitter_username").select(%i[id processed_html user_id ancestry]).arrange
Comment Load (1.0ms) SELECT "comments".* FROM "comments" WHERE "comments"."id" = $1 LIMIT $2 [["id", 30], ["LIMIT", 1]]
Comment Load (5.8ms) SELECT users.name, users.twitter_username, "comments"."id", "comments"."processed_html", "comments"."user_id", "comments"."ancestry" FROM "comments" INNER JOIN "users" ON "users"."id" = "comments"."user_id" WHERE (("comments"."ancestry" LIKE '30/%' OR "comments"."ancestry" = '30') OR "comments"."id" = 30)
=> {#<Comment:0x00007f895ced4da0
id: 30,
ancestry: nil,
processed_html: "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body>\n<p>Cleanse lomo trust fund lumbersexual swag kombucha poutine.</p>\n\n</body></html>\n",
user_id: 6>=>{}}
[37] pry(main)> Comment.subtree_of(30).includes(:user).select(%i[id processed_html user_id ancestry]).arrange
Comment Load (0.4ms) SELECT "comments".* FROM "comments" WHERE "comments"."id" = $1 LIMIT $2 [["id", 30], ["LIMIT", 1]]
Comment Load (0.3ms) SELECT "comments"."id", "comments"."processed_html", "comments"."user_id", "comments"."ancestry" FROM "comments" WHERE (("comments"."ancestry" LIKE '30/%' OR "comments"."ancestry" = '30') OR "comments"."id" = 30)
User Load (0.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 [["id", 6]]
=> {#<Comment:0x00007f896ab54578
id: 30,
ancestry: nil,
processed_html: "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body>\n<p>Cleanse lomo trust fund lumbersexual swag kombucha poutine.</p>\n\n</body></html>\n",
user_id: 6>=>{}}
[38] pry(main)> There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we user There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can't use Notice how Rails issues a separate SQL query for each user: Processing by Api::V0::CommentsController#index as JSON
Parameters: {"a_id"=>"13"}
Article Load (0.5ms) SELECT "articles".* FROM "articles" WHERE "articles"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]]
Comment Load (0.7ms) SELECT "comments"."id", "comments"."processed_html", "comments"."user_id", "comments"."ancestry" FROM "comments" INNER JOIN "users" ON "users"."id" = "comments"."user_id" WHERE "comments"."commentable_id" = $1 AND "comments"."commentable_type" = $2 [["commentable_id", 13], ["commentable_type", "Article"]]
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 3], ["LIMIT", 1]]
User Load (0.5ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 9], ["LIMIT", 1]]
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 6], ["LIMIT", 1]]
CACHE User Load (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 6], ["LIMIT", 1]]
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]]
CACHE User Load (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]]
CACHE User Load (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]]
CACHE User Load (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]]
CACHE User Load (0.0ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]] it even signals the N+1 in the logs, thanks to the bullet gem:
Unfortunately I can't find a way to use, directly from Rails, Rails AFAIK doesn't support select on preloaded relationships, see this comment by a member of the Rails team rails/rails#15185 (comment) - The reason being how I was somehow able, thanks to a StackOverflow answer to come up with the following: article = Article.find 13
ActiveRecord::Associations::Preloader.new.preload(article.comments.select(%i[id processed_html user_id ancestry]), :user, select: %i[id name username]).first.send(:owners) but: the API is not public, it forces us to use preload in all cases and it returns an array instead of a The only other way I can think of is to cheat, with something like: @users = User.where(id: article.comments.pluck(:user_id)).select(%i[id name username twitter_username github_username website_url])
@comments = article.comments.select(%i[id processed_html user_id ancestry]).arrange and then in the view code we pass along the user = @users.find { |u| u.id == comment.user_id }
json.partial! "comment", comment: comment, user: user I made it work locally. The queries are: Processing by Api::V0::CommentsController#index as JSON
Parameters: {"a_id"=>"13"}
Article Load (0.6ms) SELECT "articles".* FROM "articles" WHERE "articles"."id" = $1 LIMIT $2 [["id", 13], ["LIMIT", 1]]
(0.4ms) SELECT "comments"."user_id" FROM "comments" WHERE "comments"."commentable_id" = $1 AND "comments"."commentable_type" = $2 [["commentable_id", 13], ["commentable_type", "Article"]]
Comment Load (0.3ms) SELECT "comments"."id", "comments"."processed_html", "comments"."user_id", "comments"."ancestry" FROM "comments" WHERE "comments"."commentable_id" = $1 AND "comments"."commentable_type" = $2 [["commentable_id", 13], ["commentable_type", "Article"]]
User Load (0.7ms) SELECT "users"."id", "users"."name", "users"."username", "users"."twitter_username", "users"."github_username", "users"."website_url" FROM "users" WHERE "users"."id" IN ($1, $2, $3, $4, $5, $6, $7, $8, $9) [["id", 6], ["id", 9], ["id", 6], ["id", 3], ["id", 13], ["id", 13], ["id", 13], ["id", 13], ["id", 13]] We have one more query (to get all the user ids to from comments), but we do indeed have smaller user objects in memory. Let me know what you think. It would work in a similar way for the Processing by Api::V0::CommentsController#show as JSON
Parameters: {"id"=>"1a"}
Comment Load (0.2ms) SELECT "comments".* FROM "comments" WHERE "comments"."id" = $1 LIMIT $2 [["id", 36], ["LIMIT", 1]]
(0.4ms) SELECT "comments"."user_id" FROM "comments" WHERE (("comments"."ancestry" LIKE '6/35/36/%' OR "comments"."ancestry" = '6/35/36') OR "comments"."id" = 36)
Comment Load (0.6ms) SELECT "comments"."id", "comments"."processed_html", "comments"."user_id", "comments"."ancestry" FROM "comments" WHERE (("comments"."ancestry" LIKE '6/35/36/%' OR "comments"."ancestry" = '6/35/36') OR "comments"."id" = 36)
User Load (0.8ms) SELECT "users"."id", "users"."name", "users"."username", "users"."twitter_username", "users"."github_username", "users"."website_url" FROM "users" WHERE "users"."id" IN ($1, $2, $3, $4) [["id", 13], ["id", 13], ["id", 13], ["id", 13]] we would have one more query but can build a custom user object There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmmm I think the hack might be worth it just knowing how big our User objects are and that if we load a lot of them we are going to load a lot of data. What do you both think? I'm game for either way I do worry about memory though a bit with the initial approach. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's hard to make a decision about those two without testing them in production On one side we have "3 queries and more memory" and on the other we have "4 queries and less memory than option 1". Instictively I would lean on the second BUT:
Either way, I would very much like to benchmark how this endpoint performs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Lets leave it how it is and we can address it later if performance becomes an issue. No need to optimize prematurely |
||
includes(:user). | ||
select(%i[id processed_html user_id ancestry]). | ||
arrange | ||
|
||
# being only one tree we know that the root comment is the first (and only) key | ||
@comment = tree_with_root_comment.keys.first | ||
@comments = tree_with_root_comment[@comment] | ||
Comment on lines
+23
to
+24
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. here we use
The whole tree will then be navigated by the |
||
|
||
set_surrogate_key_header "comments", edge_cache_keys(tree_with_root_comment) | ||
end | ||
|
||
private | ||
|
||
# ancestry wraps a single or multiple trees of comments into a single hash, | ||
# in the case of an article comments, the hash has multiple keys (the root comments), | ||
# in the case of a comment and its descendants, the hash has only one key. | ||
# Either way, we need to use recursion to extract all the comment cache keys | ||
# collecting both the keys of each level of root comments and their descendants | ||
# NOTE: the objects are already loaded in memory by "ancestry", | ||
# so no additional SQL query is performed during this extraction, avoiding N+1s | ||
def edge_cache_keys(comments_trees) | ||
comments_trees.keys.flat_map do |comment| | ||
Array.wrap(comment.record_key) + edge_cache_keys(comments_trees[comment]) | ||
end | ||
end | ||
end | ||
end | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
json.type_of "comment" | ||
|
||
json.id_code comment.id_code_generated | ||
json.body_html comment.processed_html | ||
|
||
json.partial! "api/v0/shared/user", user: comment.user |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
json.partial! "comment", comment: comment | ||
|
||
# recursively render the comment subtree | ||
json.children do | ||
json.partial! "comments_with_children", comments: children | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
json.array! comments.keys do |root_comment| | ||
# ancestry organizes root comments and their descendants in a hash structure | ||
json.partial! "comment_with_children", comment: root_comment, children: comments[root_comment] | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,43 +1 @@ | ||
json.array! @comments do |comment| | ||
json.type_of "comment" | ||
|
||
json.id_code comment.id_code_generated | ||
json.body_html comment.processed_html | ||
json.user do | ||
json.name comment.user.name | ||
json.username comment.user.username | ||
json.twitter_username comment.user.twitter_username | ||
json.github_username comment.user.github_username | ||
json.website_url comment.user.processed_website_url | ||
json.profile_image ProfileImage.new(comment.user).get(640) | ||
json.profile_image_90 ProfileImage.new(comment.user).get(90) | ||
end | ||
|
||
json.children comment.children.order(score: :desc).each do |children_comment| | ||
json.id_code children_comment.id_code_generated | ||
json.body_html children_comment.processed_html | ||
json.user do | ||
json.name children_comment.user.name | ||
json.username children_comment.user.username | ||
json.twitter_username children_comment.user.twitter_username | ||
json.github_username children_comment.user.github_username | ||
json.website_url children_comment.user.processed_website_url | ||
json.profile_image ProfileImage.new(children_comment.user).get(640) | ||
json.profile_image_90 ProfileImage.new(children_comment.user).get(90) | ||
end | ||
|
||
json.children children_comment.children.order(score: :desc).each do |grandchild_comment| | ||
json.id_code grandchild_comment.id_code_generated | ||
json.body_html grandchild_comment.processed_html | ||
json.user do | ||
json.name grandchild_comment.user.name | ||
json.username grandchild_comment.user.username | ||
json.twitter_username grandchild_comment.user.twitter_username | ||
json.github_username grandchild_comment.user.github_username | ||
json.website_url grandchild_comment.user.processed_website_url | ||
json.profile_image ProfileImage.new(grandchild_comment.user).get(640) | ||
json.profile_image_90 ProfileImage.new(grandchild_comment.user).get(90) | ||
end | ||
end | ||
end | ||
end | ||
json.partial! "comments_with_children", comments: @comments |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,38 +1 @@ | ||
json.type_of "comment" | ||
json.id_code @comment.id_code_generated | ||
json.body_html @comment.processed_html | ||
json.user do | ||
json.name @comment.user.name | ||
json.username @comment.user.username | ||
json.twitter_username @comment.user.twitter_username | ||
json.github_username @comment.user.github_username | ||
json.website_url @comment.user.processed_website_url | ||
json.profile_image ProfileImage.new(@comment.user).get(640) | ||
json.profile_image_90 ProfileImage.new(@comment.user).get(90) | ||
end | ||
json.children @comment.children.order("score DESC").each do |children_comment| | ||
json.id_code children_comment.id_code_generated | ||
json.body_html children_comment.processed_html | ||
json.user do | ||
json.name children_comment.user.name | ||
json.username children_comment.user.username | ||
json.twitter_username children_comment.user.twitter_username | ||
json.github_username children_comment.user.github_username | ||
json.website_url children_comment.user.processed_website_url | ||
json.profile_image ProfileImage.new(children_comment.user).get(640) | ||
json.profile_image_90 ProfileImage.new(children_comment.user).get(90) | ||
end | ||
json.children children_comment.children.order("score DESC").each do |grandchild_comment| | ||
json.id_code grandchild_comment.id_code_generated | ||
json.body_html grandchild_comment.processed_html | ||
json.user do | ||
json.name grandchild_comment.user.name | ||
json.username grandchild_comment.user.username | ||
json.twitter_username grandchild_comment.user.twitter_username | ||
json.github_username grandchild_comment.user.github_username | ||
json.website_url grandchild_comment.user.processed_website_url | ||
json.profile_image ProfileImage.new(grandchild_comment.user).get(640) | ||
json.profile_image_90 ProfileImage.new(grandchild_comment.user).get(90) | ||
end | ||
end | ||
end | ||
json.partial! "comment_with_children", comment: @comment, children: @comments |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,12 +6,12 @@ def json_response | |
end | ||
|
||
let_it_be(:article) { create(:article) } | ||
let_it_be(:root_comment) { create(:comment, commentable: article) } | ||
let_it_be(:child_comment) { create(:comment, commentable: article, parent: root_comment) } | ||
let_it_be(:grandchild_comment) { create(:comment, commentable: article, parent: child_comment) } | ||
let_it_be(:great_grandchild_comment) { create(:comment, commentable: article, parent: grandchild_comment) } | ||
|
||
describe "GET /api/comments" do | ||
before do | ||
create(:comment, commentable: article) | ||
end | ||
|
||
it "returns not found if wrong article id" do | ||
get "/api/comments?a_id=gobbledygook" | ||
expect(response).to have_http_status(:not_found) | ||
|
@@ -23,69 +23,87 @@ def json_response | |
end | ||
|
||
it "does not include children comments in the root list" do | ||
# create child comment | ||
create(:comment, commentable: article, parent: article.comments.first) | ||
|
||
get "/api/comments?a_id=#{article.id}" | ||
expected_ids = article.comments.roots.map(&:id_code_generated) | ||
expect(json_response.map { |cm| cm["id_code"] }).to match_array(expected_ids) | ||
end | ||
|
||
it "includes children comments in the children list" do | ||
# create child comment | ||
parent_comment = article.comments.first | ||
child_comment = create(:comment, commentable: article, parent: parent_comment) | ||
|
||
get "/api/comments?a_id=#{article.id}" | ||
comment_with_children = json_response.detect { |cm| cm["id_code"] == parent_comment.id_code_generated } | ||
comment_with_children = json_response.detect { |cm| cm["id_code"] == root_comment.id_code_generated } | ||
expect(comment_with_children["children"][0]["id_code"]).to eq(child_comment.id_code_generated) | ||
end | ||
|
||
it "includes granchildren comments in the children-children list" do | ||
# create granchild comment | ||
root_comment = article.comments.first | ||
child_comment = create(:comment, commentable: article, parent: root_comment) | ||
granchild_comment = create(:comment, commentable: article, parent: child_comment) | ||
it "includes grandchildren comments in the children-children list" do | ||
get "/api/comments?a_id=#{article.id}" | ||
|
||
comment_with_descendants = json_response.detect { |cm| cm["id_code"] == root_comment.id_code_generated } | ||
expect(comment_with_descendants["children"][0]["children"][0]["id_code"]).to eq(grandchild_comment.id_code_generated) | ||
end | ||
|
||
it "includes great-grandchildren comments in the children-children-children list" do | ||
get "/api/comments?a_id=#{article.id}" | ||
|
||
comment_with_descendants = json_response.detect { |cm| cm["id_code"] == root_comment.id_code_generated } | ||
expect(comment_with_descendants["children"][0]["children"][0]["id_code"]).to eq(granchild_comment.id_code_generated) | ||
json_great_grandchild_id_code = comment_with_descendants["children"][0]["children"][0]["children"][0]["id_code"] | ||
expect(json_great_grandchild_id_code).to eq(great_grandchild_comment.id_code_generated) | ||
end | ||
|
||
it "sets the correct edge caching surrogate key for all the comments" do | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added test for the surrogate keys |
||
sibling_root_comment = create(:comment, commentable: article) | ||
|
||
get "/api/comments?a_id=#{article.id}" | ||
|
||
expected_key = [ | ||
"comments", sibling_root_comment.record_key, | ||
root_comment.record_key, child_comment.record_key, | ||
grandchild_comment.record_key, great_grandchild_comment.record_key | ||
].to_set | ||
expect(response.headers["surrogate-key"].split.to_set).to eq(expected_key) | ||
end | ||
end | ||
|
||
describe "GET /api/comments/:id" do | ||
let_it_be(:comment) { create(:comment, commentable: article) } | ||
|
||
it "returns not found if wrong comment id" do | ||
get "/api/comments/foobar" | ||
expect(response).to have_http_status(:not_found) | ||
end | ||
|
||
it "returns the comment" do | ||
get "/api/comments/#{comment.id_code_generated}" | ||
expect(json_response["id_code"]).to eq(comment.id_code_generated) | ||
get "/api/comments/#{root_comment.id_code_generated}" | ||
expect(json_response["id_code"]).to eq(root_comment.id_code_generated) | ||
end | ||
|
||
it "includes children comments in the children list" do | ||
# create child comment | ||
child_comment = create(:comment, commentable: article, parent: comment) | ||
get "/api/comments/#{root_comment.id_code_generated}" | ||
|
||
get "/api/comments/#{comment.id_code_generated}" | ||
comment_with_children = json_response | ||
expect(comment_with_children["children"][0]["id_code"]).to eq(child_comment.id_code_generated) | ||
end | ||
|
||
it "includes granchildren comments in the children-children list" do | ||
# create granchild comment | ||
root_comment = comment | ||
child_comment = create(:comment, commentable: article, parent: root_comment) | ||
granchild_comment = create(:comment, commentable: article, parent: child_comment) | ||
it "includes grandchildren comments in the children-children list" do | ||
get "/api/comments/#{root_comment.id_code_generated}" | ||
|
||
comment_with_descendants = json_response | ||
expect(comment_with_descendants["children"][0]["children"][0]["id_code"]).to eq(grandchild_comment.id_code_generated) | ||
end | ||
|
||
get "/api/comments/#{comment.id_code_generated}" | ||
it "includes great-grandchildren comments in the children-children-children list" do | ||
get "/api/comments/#{root_comment.id_code_generated}" | ||
|
||
comment_with_descendants = json_response | ||
expect(comment_with_descendants["children"][0]["children"][0]["id_code"]).to eq(granchild_comment.id_code_generated) | ||
json_great_grandchild_id_code = comment_with_descendants["children"][0]["children"][0]["children"][0]["id_code"] | ||
expect(json_great_grandchild_id_code).to eq(great_grandchild_comment.id_code_generated) | ||
end | ||
|
||
it "sets the correct edge caching surrogate key for all the comments" do | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added test for the surrogate keys |
||
get "/api/comments/#{root_comment.id_code_generated}" | ||
|
||
expected_key = [ | ||
"comments", root_comment.record_key, child_comment.record_key, | ||
grandchild_comment.record_key, great_grandchild_comment.record_key | ||
].to_set | ||
expect(response.headers["surrogate-key"].split.to_set).to eq(expected_key) | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After having simplified the queries by effectively making only 1 query for all comments, we might not need these.
Since
optimization is the root of all evil ™
and since we effectively have edge caching as well, I decided to go ahead and remove server side caching.If we see it's a huge deal, we can add it back, perhaps using the regular cache object. See the issue description