-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve API controllers efficiency by selecting only columns needed for serialization #5805
Improve API controllers efficiency by selecting only columns needed for serialization #5805
Conversation
@@ -70,6 +75,25 @@ def me | |||
|
|||
private | |||
|
|||
INDEX_ATTRIBUTES_FOR_SERIALIZATION = %i[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you'll notice this pattern in the entire PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something I really like about https://github.com/Netflix/fast_jsonapi, it forces you to be more explicit about the attributes that get serialized:
https://github.com/Netflix/fast_jsonapi#serializer-definition
We already have the gem installed, but only use it for a handful of serializers (see app/serializers/
, they are all related to the Stackbit webhooks).
Anyway, your effort here seems to also be about reducing memory consumption when instantiating objects pre-serialization, not only about reducing payload size over the wire.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fast_jsonapi
is in the back of my mind for API v1, as are JSON Schema and OpenAPI with things like https://github.com/interagent/committee
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fast JSON has been something I have been dying to use since working with Elasticsearch requires so much serialization I think it could really speed things up when we hit higher volumes.
existing_user_repos = Set.new(existing_user_repos) | ||
|
||
existing_user_repos = current_user.github_repos.where(featured: true). | ||
distinct.pluck(:github_id_code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.distinct.pluck
tells the DB to select only unique values, so we don't have to use Set.new
or .to_set
in memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL! 🔥
unless fetched_repo | ||
render json: "error: Could not find Github repo", status: :not_found | ||
return | ||
end | ||
|
||
@repo = GithubRepo.find_or_create(fetched_repo_params(fetched_repo)) | ||
repo = GithubRepo.find_or_create(fetched_repo_params(fetched_repo)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to store it in an instance variable, as it's only used in this method
client = create_octokit_client | ||
|
||
fetched_repo = fetch_repo(client) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
passed arguments over instance variables is a way of life :D
# this is fully documented in the API docs | ||
# see <https://github.com/thepracticaldev/dev.to/issues/4206> for more details | ||
json.tag_list article.cached_tag_list_array | ||
json.tags article.cached_tag_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added this note because I had to keep these two attributes out of the partial, some of us are familiar with this discrepancy but it's still a good idea to document it
# /api/articles and /api/articles/:id have opposite representations | ||
# of `tag_list` and `tags and we can't align them without breaking the API, | ||
# this is fully documented in the API docs | ||
# see <https://github.com/thepracticaldev/dev.to/issues/4206> for more details | ||
json.tag_list @article.cached_tag_list | ||
json.tags @article.cached_tag_list_array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added this note because I had to keep these two attributes out of the partial, some of us are familiar with this discrepancy but it's still a good idea to document it
@@ -8,4 +8,3 @@ json.tags listing.tag_list | |||
json.category listing.category | |||
json.processed_html listing.processed_html | |||
json.published listing.published | |||
json.slug listing.slug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this as json.slug
is already serialized at the top of the file
@@ -0,0 +1,5 @@ | |||
json.array! follows do |follow| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shared partial
get :users | ||
get :tags | ||
get :organizations | ||
get :podcasts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this have been changed into namespaces because we are using none of the CRUD operations created by resources
. We don't need to generate all those routes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks FANTASTIC! There is a lot here and I only was able to give it a quick pass as its time for me to make my way to the airport for Mohonk but once I give it another thorough look I think this is going to pay off big time.
error_not_found unless @chat_channel.has_member?(current_user) | ||
end | ||
|
||
SHOW_ATTRIBUTES_FOR_SERIALIZATION = %i[id description channel_name].freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOVE these constants!
existing_user_repos = Set.new(existing_user_repos) | ||
|
||
existing_user_repos = current_user.github_repos.where(featured: true). | ||
distinct.pluck(:github_id_code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL! 🔥
@@ -36,7 +42,7 @@ def valid_user | |||
user | |||
end | |||
|
|||
def organization_article?(reaction) | |||
def org_article?(reaction) | |||
reaction.reactable_type == "Article" && reaction.reactable.organization_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we update this to check for the actual organization rather than just the id in the event that the organization is removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know if we want to do it in the same PR but yes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Holding you too that so I dont see it in HB in the future 😉
json.published_at @article.published_at&.utc&.iso8601 | ||
json.last_comment_at @article.last_comment_at&.utc&.iso8601 | ||
json.published_timestamp @article.published_timestamp | ||
json.partial! "article", article: @article |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super awesome refactor with definitely save us from getting out of sync at some point
expect(json_response.first["slug"]).to eq(ClassifiedListing.last.slug) | ||
expect(json_response.first["user"]).to include("username") | ||
expect(json_response.first["user"]["username"]).not_to be_empty | ||
expect(response.parsed_body.size).to eq(7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parsed_body
is seriously one of my new favorite shortcuts!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahah learned it from you :D
@@ -70,6 +75,25 @@ def me | |||
|
|||
private | |||
|
|||
INDEX_ATTRIBUTES_FOR_SERIALIZATION = %i[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something I really like about https://github.com/Netflix/fast_jsonapi, it forces you to be more explicit about the attributes that get serialized:
https://github.com/Netflix/fast_jsonapi#serializer-definition
We already have the gem installed, but only use it for a handful of serializers (see app/serializers/
, they are all related to the Stackbit webhooks).
Anyway, your effort here seems to also be about reducing memory consumption when instantiating objects pre-serialization, not only about reducing payload size over the wire.
positive_reactions_count created_at edited_at last_comment_at published | ||
updated_at | ||
].freeze | ||
private_constant :INDEX_ATTRIBUTES_FOR_SERIALIZATION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the singular name, private_constant
does take a list of constants to make private. Maybe this makes the repetition a bit more bearable.
private_constant :INDEX_ATTRIBUTES_FOR_SERIALIZATION,
:SHOW_ATTRIBUTES_FOR_SERIALIZATION,
:ME_ATTRIBUTES_FOR_SERIALIZATION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is an awesome PR, thanks for this @rhymes 🎉
In general — and we talked about this before — I think we do a bit too much AR directly in the controller, whereas in the past I tended towards query objects as outlined in this post: https://codeclimate.com/blog/7-ways-to-decompose-fat-activerecord-models/
But that's a wider discussion since it introduces a whole new pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
@@ -70,6 +75,25 @@ def me | |||
|
|||
private | |||
|
|||
INDEX_ATTRIBUTES_FOR_SERIALIZATION = %i[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fast JSON has been something I have been dying to use since working with Elasticsearch requires so much serialization I think it could really speed things up when we hit higher volumes.
@@ -36,7 +42,7 @@ def valid_user | |||
user | |||
end | |||
|
|||
def organization_article?(reaction) | |||
def org_article?(reaction) | |||
reaction.reactable_type == "Article" && reaction.reactable.organization_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Holding you too that so I dont see it in HB in the future 😉
One thing we could already do is to use Oj as a serializer (which we're using for "fast json" already) for In that project I have a # see https://precompile.com/2015/07/25/rails-activesupport-json.html
module ActiveSupport
module JSON
module Encoding
# Use Oj as JSON encoder
class OjEncoder < JSONGemEncoder
def encode(value)
::Oj.dump(value.as_json, mode: :compat)
end
end
end
end
end
ActiveSupport.json_encoder = ActiveSupport::JSON::Encoding::OjEncoder
as soon as we merge this I'll send a PR for that |
What type of PR is this? (check all applicable)
Description
The common thread here is avoiding
SELECT *
as much as possible. Especially with tables likeusers
andarticles
that have respectively 150 and 90 columns:thus I went through the code and did the following:
.select(attributes)
for all queries I could add it forThis should have us some memory