
Reduce Memory Allocation when using .pluck #33074

Merged
rafaelfranca merged 1 commit into rails:master on Jun 19, 2018

Conversation

@lsylvester
Contributor

lsylvester commented Jun 7, 2018

In https://samsaffron.com/archive/2018/06/01/an-analysis-of-memory-bloat-in-active-record-5-2, it was shown that using ActiveRecord::Relation#pluck would allocate 5 objects per row retrieved.

This PR optimises ActiveRecord::Result#cast_values to avoid creating temporary arrays, reducing the number of objects allocated to 1 per row retrieved.

Benchmark script is at https://github.com/lsylvester/rails/blob/pluck-optimize-bench/activerecord/bench.rb (pluck! is the new optimised version in that branch).

The new version is up to 1.5 times faster depending on the database.
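(For context, the allocation difference comes from the old map/zip/map shape, which builds several temporary arrays per row, versus a single Array.new per row in the new version. A minimal standalone sketch using the memory_profiler gem — the values here are illustrative, not taken from the benchmark script:)

    require "memory_profiler"

    types  = ["integer", "string"]
    values = [1, "title"]

    # Old shape: zip allocates the outer array plus one pair array per
    # column, and map allocates the result array -- several objects per row.
    old_shape = MemoryProfiler.report do
      types.zip(values).map { |_type, value| value }
    end

    # New shape: a single array allocated per row.
    new_shape = MemoryProfiler.report do
      Array.new(values.size) { |i| values[i] }
    end

    puts old_shape.total_allocated # several arrays for one two-column row
    puts new_shape.total_allocated # 1 (exact counts may vary by Ruby version)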

Result

Fetching gem metadata from https://rubygems.org/..........
Resolving dependencies...
Using concurrent-ruby 1.0.5
Using i18n 1.0.1
Using minitest 5.11.3
Using thread_safe 0.3.6
Using tzinfo 1.2.5
Using activesupport 6.0.0.alpha from source at `../activesupport`
Using activemodel 6.0.0.alpha from source at `../activemodel`
Using activerecord 6.0.0.alpha from source at `.`
Using benchmark-ips 2.7.2
Using bundler 1.16.1
Using memory_profiler 0.9.10
Using mysql2 0.5.1
Using pg 1.0.0
Using sqlite3 1.3.13

==================================== MySQL =====================================

Warming up --------------------------------------
               pluck     8.000  i/100ms
              pluck!     9.000  i/100ms
Calculating -------------------------------------
               pluck     80.451  (± 3.7%) i/s -    408.000  in   5.077212s
              pluck!     83.794  (± 3.6%) i/s -    423.000  in   5.056341s

Comparison:
              pluck!:       83.8 i/s
               pluck:       80.5 i/s - same-ish: difference falls within error

==================================== PLUCK =====================================
Total allocated: 1146280 bytes (25100 objects)
Total retained:  560 bytes (2 objects)


==================================== PLUCK! ====================================
Total allocated: 306240 bytes (5099 objects)
Total retained:  560 bytes (2 objects)



================================== PostgreSQL ==================================

Warming up --------------------------------------
               pluck    19.000  i/100ms
              pluck!    29.000  i/100ms
Calculating -------------------------------------
               pluck    192.043  (± 5.2%) i/s -    969.000  in   5.058218s
              pluck!    291.951  (± 5.8%) i/s -      1.479k in   5.082570s

Comparison:
              pluck!:      292.0 i/s
               pluck:      192.0 i/s - 1.52x  slower

==================================== PLUCK =====================================
Total allocated: 1084976 bytes (25093 objects)
Total retained:  80 bytes (1 objects)


==================================== PLUCK! ====================================
Total allocated: 244936 bytes (5092 objects)
Total retained:  80 bytes (1 objects)



==================================== SQLite ====================================

Warming up --------------------------------------
               pluck    14.000  i/100ms
              pluck!    19.000  i/100ms
Calculating -------------------------------------
               pluck    146.033  (± 4.8%) i/s -    742.000  in   5.093060s
              pluck!    186.104  (± 8.1%) i/s -    931.000  in   5.040595s

Comparison:
              pluck!:      186.1 i/s
               pluck:      146.0 i/s - 1.27x  slower

==================================== PLUCK =====================================
Total allocated: 1104480 bytes (25090 objects)
Total retained:  80 bytes (1 objects)


==================================== PLUCK! ====================================
Total allocated: 264440 bytes (5089 objects)
Total retained:  80 bytes (1 objects)

@sikachu

I have one suggestion so we don't have to track index ourselves. Please let me know what you think.

activerecord/lib/active_record/result.rb Outdated
    -      result = rows.map do |values|
    -        types.zip(values).map { |type, value| type.deserialize(value) }
    +      length = columns.length
    +      rows.each do |values|


@sikachu

sikachu Jun 7, 2018

Member

The problem here is pretty much the map -> zip -> map chain, right? I don't like that either.

However, I don't think we really need to track the index ourselves given we have each_with_index, so I'd like to propose this patch:

    def cast_values!(type_overrides = {}) # :nodoc:
      rows.each do |values|
        columns.each_with_index do |column_name, index|
          values[index] = column_type(column_name, type_overrides).deserialize(values[index])
        end
      end
      columns.one? ? rows.map!(&:first) : rows
    end

I ran this on your branch to compare the number of objects allocated and saw a similar amount, so I think this might be an alternative. What do you think?


@sikachu

sikachu Jun 7, 2018

Member

(Apparently I forgot to save the file, so I was running the benchmark against the wrong code ... anyway, I updated the result gist and the code suggestion above. Still similar object counts.)

@matthewd


Member

matthewd commented Jun 7, 2018

Thanks for PRing this!

Even as a nodoc method, I don't think it's reasonable for AR::Result#cast_values to destroy the callee, so I think I'd rather we stick to map over mutation.

The best version of that I've come up with so far is:

    def cast_values(type_overrides = {}) # :nodoc:
      if columns.one?
        # Separated to avoid allocating an array per row

        type = column_type(columns.first, type_overrides)

        rows.map do |(value)|
          type.deserialize(value)
        end
      else
        types = columns.map { |name| column_type(name, type_overrides) }

        rows.map do |values|
          Array.new(values.size) { |i| types[i].deserialize(values[i]) }
        end
      end
    end

which unsurprisingly benches well in the .one? case: (pluck? is mine)

================================== PostgreSQL ==================================

Warming up --------------------------------------
               pluck    22.000  i/100ms
              pluck!    37.000  i/100ms
              pluck?    46.000  i/100ms
Calculating -------------------------------------
               pluck    270.390  (± 3.0%) i/s -      1.364k in   5.049198s
              pluck!    383.141  (± 3.9%) i/s -      1.924k in   5.030707s
              pluck?    484.953  (± 2.1%) i/s -      2.438k in   5.029679s

Comparison:
              pluck?:      485.0 i/s
              pluck!:      383.1 i/s - 1.27x  slower
               pluck:      270.4 i/s - 1.79x  slower

==================================== PLUCK =====================================
Total allocated: 1084976 bytes (25093 objects)
Total retained:  80 bytes (1 objects)


==================================== PLUCK? ====================================
Total allocated: 284936 bytes (5092 objects)
Total retained:  80 bytes (1 objects)


==================================== PLUCK! ====================================
Total allocated: 244936 bytes (5092 objects)
Total retained:  80 bytes (1 objects)

But it is only about average once a second column is in the mix:

================================== PostgreSQL ==================================

Warming up --------------------------------------
               pluck    21.000  i/100ms
              pluck!    30.000  i/100ms
              pluck?    24.000  i/100ms
Calculating -------------------------------------
               pluck    203.789  (± 3.4%) i/s -      1.029k in   5.056230s
              pluck!    294.373  (± 4.4%) i/s -      1.470k in   5.005102s
              pluck?    231.154  (± 1.7%) i/s -      1.176k in   5.089200s

Comparison:
              pluck!:      294.4 i/s
              pluck?:      231.2 i/s - 1.27x  slower
               pluck:      203.8 i/s - 1.44x  slower

==================================== PLUCK =====================================
Total allocated: 1285504 bytes (30105 objects)
Total retained:  80 bytes (1 objects)


==================================== PLUCK? ====================================
Total allocated: 485504 bytes (10105 objects)
Total retained:  80 bytes (1 objects)


==================================== PLUCK! ====================================
Total allocated: 245464 bytes (5104 objects)
Total retained:  80 bytes (1 objects)
@lsylvester


Contributor

lsylvester commented Jun 7, 2018

@matthewd I really like your idea about separating out the one case, so I will add that in.

Regarding destroying the callee, we could have destructive and non-destructive versions of this method (cast_values and cast_values!) so that we can still get the best performance for multiple columns. Do you think that would be acceptable? As far as I can tell, the only case where a result would be reused is the result of the query_cache, which seems to be calling dup everywhere, so destroying the result from pluck shouldn't be a problem.
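(For illustration only — this is not code from the PR — a non-destructive wrapper around a hypothetical in-place cast_values! might look like this:)

    # Illustration only (hypothetical): a non-destructive wrapper around an
    # in-place cast_values!. The rows.map(&:dup) deep copy protects shared
    # rows, but it also allocates one array per row again.
    def cast_values(type_overrides = {}) # :nodoc:
      copy = ActiveRecord::Result.new(columns, rows.map(&:dup), column_types)
      copy.cast_values!(type_overrides)
    end

(Note that the deep copy reintroduces the per-row allocation the PR is trying to remove, which limits what a dual API could gain for cached results.)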

@sikachu I modelled the code (regarding each_with_index vs while) on the hash_rows method, which I assumed was optimal, but I will run my own benchmarks to see if it makes any difference once the approach is finalised.

@lsylvester


Contributor

lsylvester commented Jun 7, 2018

@matthewd I have tested it further, and you were right that mutating the rows of the Result is dangerous. While @rows is duped in initialize_copy, the arrays inside @rows are the same objects, so this could have some unintended side effects when the query cache is used.
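(To illustrate the shallow-dup issue — a minimal standalone sketch, not Rails code:)

    rows = [["2003-07-16 14:28:11"], ["2004-11-24 01:04:44"]]
    copy = rows.dup                # copies the outer array only

    copy[0][0] = Time.now          # mutates an inner array shared with rows
    rows[0][0].class               # => Time, not String -- the original
                                   #    "cached" rows are corrupted too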

@lsylvester


Contributor

lsylvester commented Jun 7, 2018

@matthewd I have updated the PR as per your suggested code, as I couldn't see a way to improve it. Thanks for your help.

@matthewd


Member

matthewd commented Jun 7, 2018

My concern was originally only for the architectural aspect of having such a self-destruct method... if you found a real failure it caused, I wonder if it's possible/worthwhile to turn that into a test?

@lsylvester


Contributor

lsylvester commented Jun 8, 2018

Hi @matthewd,

Here is a test that would fail if the result rows were mutated.

  def test_pluck_with_type_cast_does_not_corrupt_the_query_cache
    topic = topics(:first)
    relation = Topic.where(id: topic.id)
    assert_queries 1 do
      Topic.cache do
        kind = relation.select(:written_on).load.first.read_attribute_before_type_cast(:written_on).class
        relation.pluck(:written_on)
        assert_kind_of kind, relation.select(:written_on).load.first.read_attribute_before_type_cast(:written_on)
      end
    end
  end

With the error

Failure:
CalculationsTest#test_pluck_with_type_cast_does_not_corrupt_the_query_cache [.../rails/activerecord/test/cases/calculations_test.rb:648]:
Expected 2003-07-16 14:28:11 UTC to be a kind of String, not Time.

The test would stop being effective if the type casting for the particular column was moved to the database, and might be ineffective for some of the adapters already.

Do you think it is worth adding a test like this as part of this PR? I would need to find a way to be sure that the typecasting isn't being done at the database level, since that would render the test ineffective.
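(A hypothetical guard for that, sketched here — not part of the PR — would be to skip the test when the adapter already returns a cast value:)

    # Hypothetical guard: bail out when the adapter type casts written_on
    # at the database level, since mutation would then be undetectable.
    raw = relation.select(:written_on).load.first
                  .read_attribute_before_type_cast(:written_on)
    skip "adapter casts written_on natively" unless raw.is_a?(String)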

@rafaelfranca


Member

rafaelfranca commented Jun 8, 2018

I think that even if the test becomes ineffective for some adapters that start doing the type cast at the database level, it is good to make sure we don't introduce a breakage like this if the code is ever changed to mutate the object. So I'd add that test.

@rafaelfranca rafaelfranca self-assigned this Jun 8, 2018

@lsylvester


Contributor

lsylvester commented Jun 19, 2018

@rafaelfranca I have added the test as requested.

@rafaelfranca rafaelfranca merged commit 897946f into rails:master Jun 19, 2018

2 checks passed

codeclimate: All good!
continuous-integration/travis-ci/pr: The Travis CI build passed
@pschambacher


Contributor

pschambacher commented Jul 2, 2018

Is there a plan to backport this to Rails 5.x?
