New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add batched_method for custom batch preloading in ActiveRecord #42010
base: main
Are you sure you want to change the base?
Conversation
7f85235
to
3bd6d00
Compare
3bd6d00
to
03e0a11
Compare
Often with Rails models, we end up iterating over a collection or relation and calling the same method repeatedly. For these cases, `includes` comes in really handy, but it has the limitation that it only works on associations. More complex logic needs separate one-off code to load in a way that will still avoid N+1s. We write a lot of code like this in our applications, and it became clear that a lot of these shared the same type of interface. There would be a class method that took in an `Array` of objects to load, and returned a `Hash` where the keys of the hash are the elements in the `Array` and the values are the results, which would be loaded all at once. What would it look like to build this pattern into Rails? This commit proposes an interface called `batched_method` which is a DSL allowing for the definition of methods in models which play nice with `includes` & `preload`, but can have any definition. The idea is that you'd define a batch method like: ``` ruby class Post < ActiveRecord::Base # An implementation which takes in an Array[Post], and returns # a Hash[Post]=>Array[Comment] batch_method(:featured_comments) do |posts| comments_by_post_id = Comment.featured.where(post: posts).group_by(&:post_id) posts.index_with { |post| comments_by_post_id[post.id] } end end ``` By writing the above implementation, you get the ability to call `featured_comments` on a `Post` object like a normal method, but you can also use `includes(:featured_comments)` and it loads them all the first time that a single one is accessed (!). I've been playing with this idea over in https://github.com/seejohnrun/prelude (see README for more detail on motivations) for a bit and really think that this would be a nice addition to Rails to help application authors think more in terms of batching when iterating over model objects. I've taken inspiration from over there, but rewritten the implementation to be clearer. There are a bunch of ways that this can get more robust in the future, but for this commit I've included a few neat additional features: * The ability to define a `batch_size` on individual batch method definitions so that they load elements in batches only up to a certain size. * The ability to define batched methods that take arguments. This is useful for common patterns like authentication checks where you want to batch things like `post.editable_by?(current_user)`. * Compatibility with batch methods that happen to return `Hash#default_proc`. Thanks for reading and I appreciate any feedback!
03e0a11
to
e991400
Compare
Interesting! There's definitely something about the singular records access and the many-records access/preload. Do we have more examples to look at? As it stands to me I don't quite get what other cases we'd see, or how the API would be used in more advanced cases. I worry it's a little too invention-heavy versus leaning on extraction. That's essentially what played out, when I tried to scope out some other possible expressions for this API. Lots of places to go, but what do we want? I'll share these in case the naming is useful: Comment.associated_with(posts: @posts).preload_into :featured_comments, with: -> { featured }
# `many` extracts a Relation-like from a Relation or Array.
# Auto-composes the alias name of `featured_comments` onto Post.
Post.many(@posts).aliased_preload :comments, :featured
Post.many(@posts).aliased_preload :comments, :featured_comments, -> { featured.reverse_chronologically }
# Allow more user-API reflection on associations for ad-hoc tweaks.
# Mostly extracted from the lone `featured` example, so may not hold for more advanced examples.
Post.comments_association(@posts).preload :featured
# posts.each { |post| post.comments.featured.each { |comment| } }
Post.preload_on @posts, :comments, :featured, only: ->(comment) { Current.user.has_access_to? comment }
Post.preload_on @posts, :featured_comments, -> { Comment.featured.reverse_chronologically } |
@kaspth Thanks! :) I am happy to provide a few more examples of behavior we've seen here, sorry if a few of these are a bit more indirect I'm trying not to show any real code. I'll also update the description above to include more of the same as far as reasoning. Example 1We have a collection of items and want to check for each if the current user is able to edit them. The authentication check happens in a separate system not controlled by the Rails app so we want a way to look up multiple at once and unwrap them per-model. With this API: batch_method :can_edit? do |items, current_user|
authentication_entries = AuthenticationService.can_many(object_type: items.first.class, object_ids: items.map(&:id), actor_type: 'User', actor_id: current_user.id)
authentication_entries_by_object_id = authentication_entries.index_by(&:object_id)
items.index_with { |item| authentication_entries_by_object_id[item.id].allowed? }
end Example 2Sometimes the logic used inside of a method like batch_method(:featured_comments) do |posts|
comments_by_post_id = Comment.featured.where(post: posts).group_by(&:post_id)
posts.index_with { |post| comments_by_post_id[post.id].select(&:not_flagged?) }
end Of course we could include the additional checks in the view, but that's not probably where they belong since in the case of a single-model object we'd certainly want them inside of the model. Example 3Sometimes it's more performant to pass multiple blocks that need rendering to a library at once rather than pass them individually. In this example we want to be able to batch calls to a syntax highlighter: batch_method(:rendered_syntax_html) do |entries|
entry_raw_syntaxes = entries.map(&:syntax)
rendered_syntaxes = SyntaxHighlight.highlight_batch(entry_raw_syntaxes)
entries.index_with.with_index { |*, i| rendered_syntaxes[i] }
end Example 4Sometimes I have a more complex query involving potentially custom SQL or interactions with multiple tables and it's just more convenient to define it myself. batch_method(:some_value) do |entries|
results = release_orders.select('user_id, sum(total) as lifetime_total').where(entry_id: entries).order('lifetime_total desc').limit(5)
results_by_entry_id = results.index_by(&:entry_id)
entries.index_with { |entry| results_by_entry_id[entry.id] }
end |
@kaspth Another thing to note is that while this API isn't something we're actively using internally, the pattern of a method that returns a Hash of record to result is something we do all over our codebase for a variety of reasons like those described above. I'm just trying to work through what it would be like to standardize on a single approach since a lot of times we're relying on memoization inside of service objects to accomplish the above. Something like: def rendered_syntax_html(entry)
rendered_syntax_htmls_by_entry[entry]
end
private
def rendered_syntax_htmls_by_entry
return @rendered_syntax_htmls_by_entry if @rendered_syntax_htmls_by_entry
entry_raw_syntaxes = entries.map(&:syntax)
rendered_syntaxes = SyntaxHighlight.highlight_batch(entry_raw_syntaxes)
@rendered_syntax_htmls_by_entry = entries.index_with.with_index { |*, i| rendered_syntaxes[i] }
end Or worse, building the call to the batch method into our views and destructuring it as we iterate. |
In order to bring this library more in line with rails/rails#42010 we're removing the patch for now so that all behavior on iteration is opt-in for Relation just like for Array. Co-authored-by: John Hawthorn <john@hawthorn.email>
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Summary
Often with Rails models, we end up iterating over a collection or
relation and calling the same method repeatedly. For these cases,
includes
comes in really handy, but it has the limitation that itonly works on associations. More complex logic needs separate one-off
code to load in a way that will still avoid N+1s.
We write a lot of code like this in our applications, and it became
clear that a lot of these shared the same type of interface. There would
be a class method that took in an
Array
of objects to load, andreturned a
Hash
where the keys of the hash are the elements in theArray
and the values are the results, which would be loaded all at once.What would it look like to build this pattern into Rails?
This commit proposes an interface called
batched_method
which is a DSLallowing for the definition of methods in models which play nice
with
includes
&preload
, but can have any definition. The idea isthat you'd define a batch method like:
By writing the above implementation, you get the ability to call
featured_comments
on aPost
object like a normal method, but you canalso use
includes(:featured_comments)
and it loads them all the firsttime that a single one is accessed (!).
I've been playing with this idea over in https://github.com/seejohnrun/prelude
(see README for more detail on motivations) for a bit and really think that
this would be a nice addition to Rails to help application authors think more
in terms of batching when iterating over model objects. I've taken
inspiration from over there, but rewritten the implementation to be clearer.
There are a bunch of ways that this can get more robust in the future,
but for this commit I've included a few neat additional features:
The ability to define a
batch_size
on individual batch methoddefinitions so that they load elements in batches only up to a certain
size.
The ability to define batched methods that take arguments. This is
useful for common patterns like authentication checks where you want
to batch things like
post.editable_by?(current_user)
.Compatibility with batch methods that happen to return a
Hash
with#default_proc
.Thanks for reading and I appreciate any feedback!
cc / @eileencodes & @jhawthorn who I've chatted with about this idea previously