Improve API controllers efficiency by selecting only columns needed for serialization#5805
Conversation
|
|
||
| private | ||
|
|
||
| INDEX_ATTRIBUTES_FOR_SERIALIZATION = %i[ |
There was a problem hiding this comment.
you'll notice this pattern in the entire PR
There was a problem hiding this comment.
This is something I really like about https://github.com/Netflix/fast_jsonapi, it forces you to be more explicit about the attributes that get serialized:
https://github.com/Netflix/fast_jsonapi#serializer-definition
We already have the gem installed, but only use it for a handful of serializers (see app/serializers/, they are all related to the Stackbit webhooks).
Anyway, your effort here seems to also be about reducing memory consumption when instantiating objects pre-serialization, not only about reducing payload size over the wire.
There was a problem hiding this comment.
fast_jsonapi is in the back of my mind for API v1, as are JSON Schema and OpenAPI with things like https://github.com/interagent/committee
There was a problem hiding this comment.
fast JSON has been something I have been dying to use since working with Elasticsearch requires so much serialization I think it could really speed things up when we hit higher volumes.
| existing_user_repos = Set.new(existing_user_repos) | ||
|
|
||
| existing_user_repos = current_user.github_repos.where(featured: true). | ||
| distinct.pluck(:github_id_code) |
There was a problem hiding this comment.
.distinct.pluck tells the DB to select only unique values, so we don't have to use Set.new or .to_set in memory
| end | ||
|
|
||
| @repo = GithubRepo.find_or_create(fetched_repo_params(fetched_repo)) | ||
| repo = GithubRepo.find_or_create(fetched_repo_params(fetched_repo)) |
There was a problem hiding this comment.
no need to store it in an instance variable, as it's only used in this method
| client = create_octokit_client | ||
|
|
||
| fetched_repo = fetch_repo(client) |
There was a problem hiding this comment.
passed arguments over instance variables is a way of life :D
| # this is fully documented in the API docs | ||
| # see <https://github.com/thepracticaldev/dev.to/issues/4206> for more details | ||
| json.tag_list article.cached_tag_list_array | ||
| json.tags article.cached_tag_list |
There was a problem hiding this comment.
added this note because I had to keep these two attributes out of the partial, some of us are familiar with this discrepancy but it's still a good idea to document it
| # /api/articles and /api/articles/:id have opposite representations | ||
| # of `tag_list` and `tags and we can't align them without breaking the API, | ||
| # this is fully documented in the API docs | ||
| # see <https://github.com/thepracticaldev/dev.to/issues/4206> for more details | ||
| json.tag_list @article.cached_tag_list | ||
| json.tags @article.cached_tag_list_array |
There was a problem hiding this comment.
added this note because I had to keep these two attributes out of the partial, some of us are familiar with this discrepancy but it's still a good idea to document it
| json.category listing.category | ||
| json.processed_html listing.processed_html | ||
| json.published listing.published | ||
| json.slug listing.slug |
There was a problem hiding this comment.
removed this as json.slug is already serialized at the top of the file
| @@ -0,0 +1,5 @@ | |||
| json.array! follows do |follow| | |||
| get :users | ||
| get :tags | ||
| get :organizations | ||
| get :podcasts |
There was a problem hiding this comment.
this have been changed into namespaces because we are using none of the CRUD operations created by resources. We don't need to generate all those routes
mstruve
left a comment
There was a problem hiding this comment.
This looks FANTASTIC! There is a lot here and I only was able to give it a quick pass as its time for me to make my way to the airport for Mohonk but once I give it another thorough look I think this is going to pay off big time.
| error_not_found unless @chat_channel.has_member?(current_user) | ||
| end | ||
|
|
||
| SHOW_ATTRIBUTES_FOR_SERIALIZATION = %i[id description channel_name].freeze |
| existing_user_repos = Set.new(existing_user_repos) | ||
|
|
||
| existing_user_repos = current_user.github_repos.where(featured: true). | ||
| distinct.pluck(:github_id_code) |
|
|
||
| def organization_article?(reaction) | ||
| def org_article?(reaction) | ||
| reaction.reactable_type == "Article" && reaction.reactable.organization_id |
There was a problem hiding this comment.
Should we update this to check for the actual organization rather than just the id in the event that the organization is removed?
There was a problem hiding this comment.
Don't know if we want to do it in the same PR but yes!
There was a problem hiding this comment.
Holding you too that so I dont see it in HB in the future 😉
| json.published_at @article.published_at&.utc&.iso8601 | ||
| json.last_comment_at @article.last_comment_at&.utc&.iso8601 | ||
| json.published_timestamp @article.published_timestamp | ||
| json.partial! "article", article: @article |
There was a problem hiding this comment.
Super awesome refactor with definitely save us from getting out of sync at some point
| expect(json_response.first["slug"]).to eq(ClassifiedListing.last.slug) | ||
| expect(json_response.first["user"]).to include("username") | ||
| expect(json_response.first["user"]["username"]).not_to be_empty | ||
| expect(response.parsed_body.size).to eq(7) |
There was a problem hiding this comment.
parsed_body is seriously one of my new favorite shortcuts!
There was a problem hiding this comment.
ahah learned it from you :D
|
|
||
| private | ||
|
|
||
| INDEX_ATTRIBUTES_FOR_SERIALIZATION = %i[ |
There was a problem hiding this comment.
This is something I really like about https://github.com/Netflix/fast_jsonapi, it forces you to be more explicit about the attributes that get serialized:
https://github.com/Netflix/fast_jsonapi#serializer-definition
We already have the gem installed, but only use it for a handful of serializers (see app/serializers/, they are all related to the Stackbit webhooks).
Anyway, your effort here seems to also be about reducing memory consumption when instantiating objects pre-serialization, not only about reducing payload size over the wire.
| positive_reactions_count created_at edited_at last_comment_at published | ||
| updated_at | ||
| ].freeze | ||
| private_constant :INDEX_ATTRIBUTES_FOR_SERIALIZATION |
There was a problem hiding this comment.
Despite the singular name, private_constant does take a list of constants to make private. Maybe this makes the repetition a bit more bearable.
private_constant :INDEX_ATTRIBUTES_FOR_SERIALIZATION,
:SHOW_ATTRIBUTES_FOR_SERIALIZATION,
:ME_ATTRIBUTES_FOR_SERIALIZATION
citizen428
left a comment
There was a problem hiding this comment.
I think this is an awesome PR, thanks for this @rhymes 🎉
In general — and we talked about this before — I think we do a bit too much AR directly in the controller, whereas in the past I tended towards query objects as outlined in this post: https://codeclimate.com/blog/7-ways-to-decompose-fat-activerecord-models/
But that's a wider discussion since it introduces a whole new pattern.
|
|
||
| private | ||
|
|
||
| INDEX_ATTRIBUTES_FOR_SERIALIZATION = %i[ |
There was a problem hiding this comment.
fast JSON has been something I have been dying to use since working with Elasticsearch requires so much serialization I think it could really speed things up when we hit higher volumes.
|
|
||
| def organization_article?(reaction) | ||
| def org_article?(reaction) | ||
| reaction.reactable_type == "Article" && reaction.reactable.organization_id |
There was a problem hiding this comment.
Holding you too that so I dont see it in HB in the future 😉
One thing we could already do is to use Oj as a serializer (which we're using for "fast json" already) for In that project I have a # see https://precompile.com/2015/07/25/rails-activesupport-json.html
module ActiveSupport
module JSON
module Encoding
# Use Oj as JSON encoder
class OjEncoder < JSONGemEncoder
def encode(value)
::Oj.dump(value.as_json, mode: :compat)
end
end
end
end
end
ActiveSupport.json_encoder = ActiveSupport::JSON::Encoding::OjEncoder
as soon as we merge this I'll send a PR for that |
What type of PR is this? (check all applicable)
Description
The common thread here is avoiding
SELECT *as much as possible. Especially with tables likeusersandarticlesthat have respectively 150 and 90 columns:thus I went through the code and did the following:
.select(attributes)for all queries I could add it forThis should have us some memory