Add batch_method for custom batch preloading in ActiveRecord
Often with Rails models, we end up iterating over a collection or
relation and calling the same method repeatedly. For these cases,
`includes` comes in really handy, but it has the limitation that it
only works on associations. More complex logic needs separate one-off
code to load in a way that will still avoid N+1s.

We write a lot of code like this in our applications, and it became
clear that a lot of these shared the same type of interface. There would
be a class method that took in an `Array` of objects to load, and
returned a `Hash` where the keys of the hash are the elements in the
`Array` and the values are the results, which would be loaded all at once.
What would it look like to build this pattern into Rails?
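
As a rough illustration of that hand-written pattern (not code from this
commit), here is a minimal sketch in plain Ruby. `FeaturedCommentsLoader`,
the `Struct` stand-ins for models, and the in-memory `COMMENTS` list are all
hypothetical names chosen so the example runs without Rails:

```ruby
# Hypothetical stand-ins for models so the sketch runs without Rails.
Post = Struct.new(:id)
Comment = Struct.new(:post_id, :body, :featured)

# Hypothetical in-memory "table" of comments.
COMMENTS = [
  Comment.new(1, "first", true),
  Comment.new(1, "spam", false),
  Comment.new(2, "nice", true),
].freeze

class FeaturedCommentsLoader
  # Takes an Array of posts and returns a Hash of post => featured comments,
  # built in one pass over the data instead of one lookup per post.
  def self.load(posts)
    ids = posts.map(&:id)
    by_post_id = COMMENTS
      .select { |comment| comment.featured && ids.include?(comment.post_id) }
      .group_by(&:post_id)

    posts.to_h { |post| [post, by_post_id.fetch(post.id, [])] }
  end
end

posts = [Post.new(1), Post.new(2), Post.new(3)]
featured = FeaturedCommentsLoader.load(posts)
```

Callers then look up `featured[post]` while iterating, which is the part
that tends to get rewritten for every new one-off loader.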

This commit proposes an interface called `batch_method`, a DSL for
defining model methods that play nicely with `includes` and `preload`
but can have any definition. The idea is that you'd define a batch
method like:

``` ruby
class Post < ActiveRecord::Base
  # An implementation which takes in an Array[Post], and returns
  # a Hash[Post]=>Array[Comment]
  batch_method(:featured_comments) do |posts|
    comments_by_post_id = Comment.featured.where(post: posts).group_by(&:post_id)
    posts.index_with { |post| comments_by_post_id[post.id] }
  end
end
```

By writing the above implementation, you get the ability to call
`featured_comments` on a `Post` object like a normal method, but you can
also use `includes(:featured_comments)` and it loads them all the first
time that a single one is accessed (!).
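
To make the "loads them all on first access" behavior concrete, here is a
simplified sketch of the idea in plain Ruby. `LazyBatch` is a hypothetical
class used only for illustration; it is much smaller than the `Batch` class
in this commit and ignores arguments and batch sizes:

```ruby
# Hypothetical, simplified illustration of lazy batch loading.
class LazyBatch
  def initialize(&loader)
    @loader = loader  # receives an Array of members, returns a Hash of member => value
    @members = []
    @results = nil
  end

  def add(member)
    @members << member
  end

  # The loader runs once, on the first access, for every member of the batch.
  def result_for(member)
    @results ||= @loader.call(@members)
    @results[member]
  end
end

load_calls = 0
batch = LazyBatch.new do |members|
  load_calls += 1
  members.to_h { |member| [member, member.upcase] }
end

%w[a b c].each { |member| batch.add(member) }

first = batch.result_for("a")   # triggers the single load for a, b and c
second = batch.result_for("b")  # served from the memoized Hash
```

After both lookups the loader has run exactly once, which is the property
that avoids N+1s when iterating over a preloaded relation.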

I've been playing with this idea over in https://github.com/seejohnrun/prelude
(see README for more detail on motivations) for a bit and really think that
this would be a nice addition to Rails to help application authors think more
in terms of batching when iterating over model objects. I've taken
inspiration from over there, but rewritten the implementation to be clearer.

There are a bunch of ways that this can get more robust in the future,
but for this commit I've included a few neat additional features:

* The ability to define a `batch_size` on individual batch method
  definitions so that they load elements in batches only up to a certain
  size.

* The ability to define batched methods that take arguments. This is
  useful for common patterns like authentication checks where you want
  to batch things like `post.editable_by?(current_user)`.

* Compatibility with batch methods that return a `Hash` built with a
  default proc (`Hash#default_proc`).
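
A small plain-Ruby sketch of how these three features interact, under the
assumption that entries live in a `Set` and are sliced with `each_slice` as
in this commit's `Batch` class. The `loader` lambda and the `"alice"`
argument are hypothetical:

```ruby
require "set"

entries = Set.new(%w[a b c d e])
batch_size = 2

# With a batch size, only the slice containing the requested entry is loaded.
slice = entries.each_slice(batch_size).detect { |group| group.include?("c") }

# Extra arguments are passed through to the loader after the slice.
load_args = []
loader = lambda do |group, user|
  load_args << [group, user]
  # A Hash with a default proc computes values on lookup, so it has no
  # keys up front that could be merged into another Hash.
  Hash.new { |hash, key| hash[key] = "#{key} editable by #{user}" }
end

results = loader.call(slice, "alice")
keys_before_lookup = results.keys
value = results["c"]
```

Because such a Hash is empty until it is queried, the whole result Hash has
to be stored per entry rather than merged into a shared one.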

Thanks for reading and I appreciate any feedback!
seejohnrun committed Apr 19, 2021
1 parent 9a263e9 commit e991400
Showing 8 changed files with 359 additions and 0 deletions.
20 changes: 20 additions & 0 deletions activerecord/CHANGELOG.md
@@ -1,3 +1,23 @@
*   Allow creating batched methods on ActiveRecord models via `batch_method`.

    Useful for creating definitions for how to batch load values across a
    series of records in order to avoid N+1s.

    ``` ruby
    class Post < ActiveRecord::Base
      batch_method(:featured_comments) do |posts|
        comments_by_post_id = Comment.featured.where(post: posts).group_by(&:post_id)
        posts.index_with { |post| comments_by_post_id[post.id] }
      end
    end

    Post.first.featured_comments # use on a single record

    Post.all.includes(:featured_comments).each { } # load all in a single call
    ```

    *John Crepezzi*

*   Add setting for enumerating column names in SELECT statements.

    Adding a column to a PostgreSQL database, for example, while the application is running can
10 changes: 10 additions & 0 deletions activerecord/lib/active_record/associations/preloader/branch.rb
@@ -74,7 +74,17 @@ def runnable_loaders
def grouped_records
h = {}
polymorphic_parent = !root? && parent.polymorphic?

batched_record_batches_by_class = Hash.new { |h, k| h[k] = ActiveRecord::BatchedMethods::Batch.new(k) }

source_records.each do |record|
batched_method = record.class.batched_methods[association]
if batched_method
batch = batched_record_batches_by_class[record.class]
record.batched_method_batch = batch
next
end

reflection = record.class._reflect_on_association(association)
next if polymorphic_parent && !reflection || !record.association(association).klass
(h[reflection] ||= []) << record
2 changes: 2 additions & 0 deletions activerecord/lib/active_record/base.rb
@@ -11,6 +11,7 @@
require "active_record/attributes"
require "active_record/type_caster"
require "active_record/database_configurations"
require "active_record/batched_methods"

module ActiveRecord #:nodoc:
# = Active Record
@@ -328,6 +329,7 @@ class Base
include SignedId
include Suppressor
include Encryption::EncryptableRecord
include BatchedMethods
end

ActiveSupport.run_load_hooks(:active_record, Base)
37 changes: 37 additions & 0 deletions activerecord/lib/active_record/batched_methods.rb
@@ -0,0 +1,37 @@
# frozen_string_literal: true

require "active_record/batched_methods/batch"
require "active_record/batched_methods/method"

module ActiveRecord::BatchedMethods
  extend ActiveSupport::Concern

  class_methods do
    # Define a batched method. This method will be available on instances
    # of this class and return auto-memoized results.
    def batch_method(name, batch_size: nil, &block)
      batched_methods[name] = Method.new(block, batch_size: batch_size)

      define_method(name) do |*args|
        batched_method_batch.result_for(name, args, self)
      end
    end

    def batched_methods # :nodoc:
      @batched_methods ||= {}
    end
  end

  # Associate this instance with a batch which it will use for batched loading
  def batched_method_batch=(batch) # :nodoc:
    @batched_method_batch = batch
    batch.add(self)
  end

  private
    # Get the current batch, returning a batch of one element if not set
    def batched_method_batch
      return @batched_method_batch if @batched_method_batch
      self.batched_method_batch = Batch.new(self.class)
    end
end
59 changes: 59 additions & 0 deletions activerecord/lib/active_record/batched_methods/batch.rb
@@ -0,0 +1,59 @@
# frozen_string_literal: true

require "active_record/batched_methods/type_mismatch"

module ActiveRecord::BatchedMethods
  # Represents a batch of things that will be preloaded together
  class Batch # :nodoc:
    def initialize(klass)
      @klass = klass
      @entries = Set.new
      @result_sets = {}
    end

    def add(entry)
      unless entry.is_a?(@klass)
        raise TypeMismatch.new("Cannot add object of type #{entry.class} to batch of #{@klass}")
      end

      @entries << entry
    end

    def result_for(name, args, entry)
      results = result_set_for(name, args, entry)
      results = perform_for(name, args, entry) unless results

      results[entry]
    end

    private
      # Get the hash that contains the result for a given entry.
      #
      # Note: It's important to maintain separate hashes here instead of merging
      # since the hash _may_ be defined using Hash#default_proc
      def result_set_for(name, args, entry)
        @result_sets.dig(name, args, entry)
      end

      # Perform the batched method for the given name & entry
      def perform_for(name, args, entry)
        # Determine the slice to run which is either all, or an appropriate
        # slice containing the given entry
        method = @klass.batched_methods.fetch(name)
        batch_size = method.batch_size
        slice = batch_size ?
          @entries.each_slice(batch_size).detect { |b| b.include?(entry) } :
          @entries

        # Call the method with the slice and add the appropriate references to @result_sets
        slice_results = method.call(slice, *args)
        slice.each do |object|
          @result_sets[name] ||= {}
          @result_sets[name][args] ||= {}
          @result_sets[name][args][object] = slice_results
        end

        slice_results
      end
  end
end
17 changes: 17 additions & 0 deletions activerecord/lib/active_record/batched_methods/method.rb
@@ -0,0 +1,17 @@
# frozen_string_literal: true

module ActiveRecord::BatchedMethods
  # Represents the runnable representation of a batched method
  class Method # :nodoc:
    attr_reader :batch_size

    extend Forwardable

    def_delegator :@block, :call

    def initialize(block, batch_size:)
      @block = block
      @batch_size = batch_size
    end
  end
end
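6 changes: 6 additions & 0 deletions activerecord/lib/active_record/batched_methods/type_mismatch.rb
@@ -0,0 +1,6 @@
# frozen_string_literal: true

module ActiveRecord::BatchedMethods
  class TypeMismatch < StandardError
  end
end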
208 changes: 208 additions & 0 deletions activerecord/test/cases/batch_method_test.rb
@@ -0,0 +1,208 @@
# frozen_string_literal: true

require "cases/helper"
require "models/account"
require "models/aircraft"

class BatchMethodTest < ActiveRecord::TestCase
  def test_can_call_batch_method_on_single_object
    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number) do |instances|
        instances.each.with_object({}) { |k, h| h[k] = 42 }
      end
    end

    assert_equal 42, klass.new.interesting_number
  end

  def test_can_call_batch_method_on_single_object_with_default_proc
    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number) do |instances|
        Hash.new { |h, k| h[k] = 42 }
      end
    end

    assert_equal 42, klass.new.interesting_number
  end

  def test_memoizes_batched_method_calls
    call_count = 0

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number) do |instances|
        call_count += 1
        instances.each.with_object({}) { |k, h| h[k] = 42 }
      end
    end

    instance = klass.new
    2.times { instance.interesting_number }

    assert_equal 1, call_count
  end

  def test_combines_batched_method_calls_with_default_proc
    call_instances = []

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number) do |instances|
        call_instances << instances
        Hash.new { |h, k| h[k] = 42 }
      end
    end

    batch = ActiveRecord::BatchedMethods::Batch.new(klass)
    instances = 2.times.map { klass.new }
    instances.each { |k| k.batched_method_batch = batch }

    assert_equal [42, 42], instances.map(&:interesting_number)
    assert_equal [instances.to_set], call_instances # single call
  end

  def test_preload_with_batched_methods
    call_instances = []

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number) do |instances|
        call_instances = [instances]
        Hash.new { |h, k| h[k] = 42 }
      end
    end

    instances = 2.times.map { klass.create }

    scope = klass.where(id: instances.map(&:id)).preload(:interesting_number)

    assert_equal [42, 42], scope.map(&:interesting_number)
    assert_equal [instances.to_set], call_instances # single call
  end

  def test_includes_with_batched_methods
    call_instances = []

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number) do |instances|
        call_instances = [instances]
        Hash.new { |h, k| h[k] = 42 }
      end
    end

    instances = 2.times.map { klass.create }

    scope = klass.where(id: instances.map(&:id)).includes(:interesting_number)
    scope.first # greedy load

    assert_equal [42, 42], scope.map(&:interesting_number)
    assert_equal [instances.to_set], call_instances # single call
  end

  def test_allows_setting_batch_size
    call_instances = []

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number, batch_size: 1) do |instances|
        call_instances << instances
        instances.each.with_object({}) { |k, h| h[k] = 42 }
      end
    end

    batch = ActiveRecord::BatchedMethods::Batch.new(klass)
    instances = 2.times.map { klass.new }
    instances.each { |k| k.batched_method_batch = batch }

    2.times do
      assert_equal [42, 42], instances.map(&:interesting_number)
    end

    assert_equal instances.map { |k| [k] }, call_instances # one call per instance
  end

  def test_allows_setting_batch_size_with_default_proc
    call_instances = []

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:interesting_number, batch_size: 1) do |instances|
        call_instances << instances
        Hash.new { |h, k| h[k] = 42 }
      end
    end

    batch = ActiveRecord::BatchedMethods::Batch.new(klass)
    instances = 2.times.map { klass.new }
    instances.each { |k| k.batched_method_batch = batch }

    2.times do
      assert_equal [42, 42], instances.map(&:interesting_number)
    end

    assert_equal instances.map { |k| [k] }, call_instances # one call per instance
  end

  def test_allows_passing_arguments
    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:incremented_number, batch_size: 1) do |instances, *args|
        instances.each.with_object({}) { |k, h| h[k] = args[0] + 1 }
      end
    end

    assert_equal 1, klass.new.incremented_number(0)
  end

  def test_allows_batching_by_arguments
    call_instances_with_arguments = []

    klass = Class.new(ActiveRecord::Base) do
      self.table_name = "funny_jokes"

      batch_method(:incremented_number) do |instances, *args|
        call_instances_with_arguments << [instances, args]
        instances.each.with_object({}) { |k, h| h[k] = args[0] + 1 }
      end
    end

    batch = ActiveRecord::BatchedMethods::Batch.new(klass)
    instances = 2.times.map { klass.new }
    instances.each { |k| k.batched_method_batch = batch }

    instances.each do |instance|
      2.times do |i|
        assert_equal i + 1, instance.incremented_number(i)
      end
    end

    expected = 2.times.map { |i| [instances.to_set, [i]] }
    assert_equal expected, call_instances_with_arguments
  end

  def test_raises_error_when_mixing_types_in_batch
    klass1 = Account
    klass2 = Aircraft

    batch = ActiveRecord::BatchedMethods::Batch.new(klass1)
    klass1.new.batched_method_batch = batch

    raised_error = assert_raises(ActiveRecord::BatchedMethods::TypeMismatch) do
      klass2.new.batched_method_batch = batch
    end

    assert_equal "Cannot add object of type #{klass2} to batch of #{klass1}", raised_error.message
  end
end
