New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up homogeneous AR lists / reduce allocations #33223

Merged
merged 4 commits into from Jun 26, 2018

Conversation

Projects
None yet
4 participants
@tenderlove
Member

tenderlove commented Jun 25, 2018

This commit speeds up allocating homogeneous lists of AR objects. We
can know if the result set contains an STI column before initializing
every AR object, so this change pulls the "does this result set contain
an STI column?" test up, then uses a specialized instantiation function.
This way we only have to check for an STI column once rather than N
times.

This change also introduces a new initialization function that is meant
for use when allocating AR objects that come from the database. Doing
this allows us to eliminate one hash allocation per AR instance.

Here is a benchmark:

require 'active_record'
require 'benchmark/ips'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string :name
    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end

2000.times do
  User.create!(name: "Gorby")
end

Benchmark.ips do |x|
  x.report("find") do
    User.limit(2000).to_a
  end
end

Results:

Before:

[aaron@TC activerecord (master)]$ be ruby -I lib:~/git/allocation_tracer/lib speed.rb
Warming up --------------------------------------
                find     5.000  i/100ms
Calculating -------------------------------------
                find     56.192  (± 3.6%) i/s -    285.000  in   5.080940s

After:

[aaron@TC activerecord (homogeneous-allocation)]$ be ruby -I lib:~/git/allocation_tracer/lib speed.rb
Warming up --------------------------------------
                find     7.000  i/100ms
Calculating -------------------------------------
                find     72.204  (± 2.8%) i/s -    364.000  in   5.044592s
Speed up homogeneous AR lists / reduce allocations
This commit speeds up allocating homogeneous lists of AR objects.  We
can know if the result set contains an STI column before initializing
every AR object, so this change pulls the "does this result set contain
an STI column?" test up, then uses a specialized instantiation function.
This way we only have to check for an STI column once rather than N
times.

This change also introduces a new initialization function that is meant
for use when allocating AR objects that come from the database.  Doing
this allows us to eliminate one hash allocation per AR instance.

Here is a benchmark:

```ruby
require 'active_record'
require 'benchmark/ips'

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.verbose = false

ActiveRecord::Schema.define do
  create_table :users, force: true do |t|
    t.string :name
    t.timestamps null: false
  end
end

class User < ActiveRecord::Base; end

2000.times do
  User.create!(name: "Gorby")
end

Benchmark.ips do |x|
  x.report("find") do
    User.limit(2000).to_a
  end
end
```

Results:

Before:

```
[aaron@TC activerecord (master)]$ be ruby -I lib:~/git/allocation_tracer/lib speed.rb
Warming up --------------------------------------
                find     5.000  i/100ms
Calculating -------------------------------------
                find     56.192  (± 3.6%) i/s -    285.000  in   5.080940s
```

After:

```
[aaron@TC activerecord (homogeneous-allocation)]$ be ruby -I lib:~/git/allocation_tracer/lib speed.rb
Warming up --------------------------------------
                find     7.000  i/100ms
Calculating -------------------------------------
                find     72.204  (± 2.8%) i/s -    364.000  in   5.044592s
```
@@ -349,6 +349,26 @@ def init_with(coder)
self
end
##
# Initializer used for instantiating objects that have been read from the
# database. +attributes+ should be an attributes object, and unlike the

This comment has been minimized.

@utilum

utilum Jun 26, 2018

Contributor

double space after # database. redundant?

This comment has been minimized.

@tenderlove

tenderlove Jun 26, 2018

Member

I'm not sure. We learned double space after punctuation in school because everything was fixed width font at that time. These days we don't need it because most stuff (except code) isn't fixed width. Since code is fixed width, I go with double space but I'm not sure what our style guide is in this case. @fxn?

tenderlove added some commits Jun 26, 2018

Merge branch 'master' into homogeneous-allocation
* master:
  Call initialize after allocate
  Remove `ActiveSupport::Concern` from `ActiveRecord::Aggregations`
  Add example for no_touching? in active_record/no_touching for api docs [ci skip]
  Generate a new key for each service test
define attribute methods in `init_from_db`
Now that `allocate` is removed, we need to define attribute methods in
all "init" methods.
@rafaelfranca

Some questions about the documentation, but the implementation looks good.

instantiate_instance_of(klass, attributes, column_types, &block)
end
# Given a class, an attributes hash, +instantiate+ returns a new instance

This comment has been minimized.

@rafaelfranca

rafaelfranca Jun 26, 2018

Member

I think it is +instantiate_instance_of+ instead of instantiate here. But I guess we can remove that part of the documentation if we rephrase it.

@@ -43,6 +43,11 @@ def initialize(columns, rows, column_types = {})
@column_types = column_types
end
# Does this result set include the column named +name+ ?

This comment has been minimized.

@rafaelfranca

rafaelfranca Jun 26, 2018

Member

To follow the same convention on this same class we should do:

# Returns true if this result set include the column named +name+

This comment has been minimized.

@tenderlove

tenderlove Jun 26, 2018

Member

@rafaelfranca how about Returns truthy if this result set include the column named +name+. I don't want to guarantee the return value is true or false if we decide to change implementation (I remember there was some controversy because we changed a method to return a truthy value). @fxn thoughts?

This comment has been minimized.

@rafaelfranca

rafaelfranca Jun 26, 2018

Member

true in our guideline means truthy. +true+ means the instance of TrueClass.

http://guides.rubyonrails.org/api_documentation_guidelines.html#booleans

This comment has been minimized.

@fxn

fxn Jun 26, 2018

Member

I am not really happy with our convention (that I defined myself many years ago), which is to use fixed-width font for the singletons, and regular font to mean "true" in the Ruby sense of the word (i.e., "truthy"). My thoughts by then were that Ruby defines what is true and false, not what is truthy and falsey, so I stuck with that jargon for respect.

In retrospect, that choice was too subtle and time has shown it doesn't communicate well in practice. It has also the problem that in sample code like

user.blank? # => true

you cannot distinguish if that is regular or fixed-width, therefore people assume the singleton true sometimes and that has made us change some return contracts, to be stricter.

I suggested to change this convention in Basecamp a while back, but it was agreed to leave it like it is.

# Given a class, an attributes hash, +instantiate+ returns a new instance
# of the class. Accepts only keys as strings.
def instantiate_instance_of(klass, attributes, column_types = {}, &block)

This comment has been minimized.

@rafaelfranca

rafaelfranca Jun 26, 2018

Member

Do we want/expect people to call this method on their applications? If not this should be # :nodoc:

This comment has been minimized.

@tenderlove

tenderlove Jun 26, 2018

Member

I don't think we want people to call this. I'll :nodoc: it.

This comment has been minimized.

@rafaelfranca

@tenderlove tenderlove merged commit 3165131 into master Jun 26, 2018

3 checks passed

codeclimate All good!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@tenderlove tenderlove deleted the homogeneous-allocation branch Jun 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment