Reduce Array allocation by mutating when possible #33806

Closed
wants to merge 47 commits into base: master

Conversation


schneems commented Sep 6, 2018

Summary

This PR uses a RuboCop rule I implemented that looks for cases where arrays are allocated and then thrown away. For example:

array = ["a", "b", "c"]
array.compact.flatten.map { |x| x.downcase }

Can be replaced with:

array = ["a", "b", "c"]
array.compact!
array.flatten!
array.map! { |x| x.downcase }
array

Other Information

In testing with CodeTriage, this change reduced memory use from 999,283 bytes to 977,094 bytes (a 2.22% reduction), which roughly corresponds to a 2.22% decrease in execution time.

Since the optimization was applied across the board rather than only to hotspots, different applications will see fairly different effects from this change depending on which APIs they use.
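The PR description doesn't include the measurement script; as a rough sketch (not the CodeTriage measurement, and assuming the memory_profiler gem is available), the allocation difference for a single chain can be compared like this:

require "memory_profiler"

chained = MemoryProfiler.report do
  array = ["a", nil, ["b", "c"], nil]
  array.compact.flatten.map { |x| x.upcase }   # each call allocates a new array
end

mutating = MemoryProfiler.report do
  array = ["a", nil, ["b", "c"], nil]
  array.compact!                               # mutates in place, no extra copies
  array.flatten!
  array.map! { |x| x.upcase }
  array
end

puts "chained:  #{chained.total_allocated_memsize} bytes allocated"
puts "mutating: #{mutating.total_allocated_memsize} bytes allocated"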

schneems added some commits Aug 28, 2018

Decrease array allocation in lib/active_record/associations/collection_association.rb
@matthewd

These look pretty ugly to me... I'm quite unexcited about making a global switch to this style just for the sake of a handful of maybe-hot paths.

> roughly corresponds to a 2.22% decrease in execution time

Can we please actually measure that? I'm both surprised and confused by the claim that short-term object allocations correspond directly to runtime outside of microbenchmarks that do nothing else.

(And even then I'd like to understand whether there's a small handful of these that account for the bulk of the difference. This feels like a rubocop rule that's more suited to "btw there's an uglier-but-very-slightly-faster alternative you might want to consider here, depending on how the trade-offs balance out" notification, not a bulk change & enforced rule.)

schneems commented Sep 6, 2018

I wouldn't add the rule. This isn't all of the locations; it's just what stood out as things people likely use. About 80 of 100 violations.

For measuring things, again it's all about the app. Most of these are user-facing APIs.

These are called once on every request:

actionpack/lib/action_dispatch/middleware/remote_ip.rb # per action
actionview/lib/action_view/helpers/asset_tag_helper.rb # javascript and css tags
activesupport/lib/active_support/tagged_logging.rb # push_tags called by rack logger

These are called once per invocation and can be executed N times per request:

actionview/lib/action_view/digestor.rb # per every `cache` call
actionview/lib/action_view/helpers/form_options_helper.rb # this is really slow as is 
actionview/lib/action_view/template/resolver.rb # find_template_paths: once per every unique template render in the request (may be more in other methods)
activemodel/lib/active_model/validations/confirmation.rb # every User.validators call
activerecord/lib/active_record/log_subscriber.rb # every SQL call i.e. 100.times.each { User.first }
activerecord/lib/active_record/relation/finder_methods.rb # Call find with more than 1 argument User.find(1, 2)
activerecord/lib/active_record/timestamp.rb # max_updated_column_timestamp, User.first.cache_key(:created_at, :updated_at)
activesupport/lib/active_support/cache.rb # expanded_version cache call in view
activesupport/lib/active_support/core_ext/object/to_query.rb # Hash#to_query calling cache in view
activesupport/lib/active_support/multibyte/chars.rb # "Café".mb_chars.reverse

These can be called N times:

activerecord/lib/active_record/associations/collection_association.rb # ids_writer: user.post_ids = [1,2,3]
activerecord/lib/active_record/associations/collection_association.rb # find_by_scan: calling find on a record collection association that uses inverse_of
activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb # ActiveRecord::ConnectionAdapters::DatabaseStatements#insert_fixtures_set
activerecord/lib/active_record/fixtures.rb # loading fixtures without :all

These are called on boot:

actionview/lib/action_view/dependency_tracker.rb # if you have a cache block
activesupport/lib/active_support/inflector/inflections.rb # called once on boot
railties/lib/rails/paths.rb # optimized section is called repeatedly

I'm not sure how to trigger these:

activerecord/lib/arel/select_manager.rb
activemodel/lib/active_model/attribute_methods.rb
activemodel/lib/active_model/type/helpers/accepts_multiparameter_time.rb
activerecord/lib/active_record/aggregations.rb
activerecord/lib/active_record/connection_adapters/postgresql/schema_statements.rb
activerecord/lib/active_record/core.rb
activerecord/lib/active_record/nested_attributes.rb # assign_nested_attributes_for_collection_association
activerecord/lib/active_record/reflection.rb # source_reflection_names
activerecord/lib/active_record/relation/predicate_builder.rb # class.references
activerecord/lib/active_record/relation.rb # references_eager_loaded_tables?, call joins I think
activerecord/lib/active_record/relation.rb # tables_in_string
activerecord/lib/active_record/relation/query_methods.rb # build_arel: group by and then calling to_arel
activerecord/lib/active_record/relation/query_methods.rb # convert_join_strings_to_ast: join query? build_join_query
activesupport/lib/active_support/duration/iso8601_parser.rb # validate!, parsing a date

ahorek commented Sep 6, 2018

This should be changed only in places that are proven to be bottlenecks.
Partial evaluation may eliminate the overhead of chaining, and I hope MJIT can do something similar in the future.

jonathanhefner commented Sep 6, 2018

With regards to ugly code, what about modifying the rule to only flag invocations where chaining can be preserved because the method never returns nil? e.g. map!, sort!, sort_by!, reverse!
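
(For illustration, not part of the original comment: the distinction is in the return values. The methods listed always return the receiver, while compact!, flatten!, and uniq! return nil when nothing changed, which is what breaks chaining.)

array = ["b", "a"]
array.map! { |x| x.upcase }.sort!        # => ["A", "B"]  map! and sort! always return the array

array.compact!                           # => nil, nothing to remove
# array.compact!.map! { |x| x.downcase } # => NoMethodError, because compact! returned nil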

schneems closed this Sep 21, 2018


schneems commented Sep 21, 2018

@jonathanhefner that's a good idea. I wonder if it's possible to allow rubocop to apply autocorrect, but only for those cases.

Closing this now as it can't be merged wholesale. If future work happens in this space, we can reference this PR.

mockdeep commented Oct 11, 2018

Another maybe slightly less ugly version might be to toss in a .tap:

@connection_identifier = connection_gid identifiers.map { |id| instance_variable_get("@#{id}") }.tap(&:compact!)
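
(For illustration, not from the original comment: tap always returns its receiver regardless of what the block returns, so the nil returns of the bang methods no longer break the chain.)

array = ["a", "b"]                            # nothing to compact
array.compact!                                # => nil
array.tap(&:compact!)                         # => ["a", "b"], safe to keep chaining
array.tap(&:compact!).map! { |x| x.upcase }   # => ["A", "B"]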

jonathanhefner commented Oct 11, 2018

@mockdeep Good thinking, but in microbenchmarks I've found that the overhead of tap dwarfs the performance benefits of bang methods. For example:

require "benchmark/inputs"

INPUTS = [
  [1, 2, 3],
  [1, nil, 2, nil, 3, nil],
  [nil, nil, nil],
]

Benchmark.inputs(INPUTS) do |job|
  job.dup_inputs = true
  job.report("Array#compact"){|array| array.compact }
  job.report("Array#compact!"){|array| array.compact! }
  job.report("Array#tap(compact!)"){|array| array.tap(&:compact!) }
  job.compare!
end

Results in:

       Array#compact!:   2231626.2 i/s
        Array#compact:   1982646.8 i/s - 1.13x slower
  Array#tap(compact!):   1320809.5 i/s - 1.69x slower

mockdeep commented Oct 11, 2018

Oh, interesting. Well nevermind that then.

schneems commented Oct 11, 2018 (comment minimized; content not shown)

jonathanhefner commented Oct 12, 2018

@schneems Did you mean the reverse? It looks like using an actual block is indeed faster, but still not enough to overcome the overhead of tap:

require "benchmark/inputs"

INPUTS = [
  [1, 2, 3],
  [1, nil, 2, nil, 3, nil],
  [nil, nil, nil],
]

Benchmark.inputs(INPUTS) do |job|
  job.dup_inputs = true
  job.report("compact"){|array| array.compact }
  job.report("compact!"){|array| array.compact! }
  job.report("tap(&:compact!)"){|array| array.tap(&:compact!) }
  job.report("tap{|a| a.compact! }"){|array| array.tap{|a| a.compact! } }
  job.compare!
end

Results:

              compact!:   2201732.2 i/s
               compact:   1994791.3 i/s - 1.10x slower
  tap{|a| a.compact! }:   1682900.8 i/s - 1.31x slower
       tap(&:compact!):   1309621.1 i/s - 1.68x slower

schneems commented Oct 12, 2018 (comment minimized; content not shown)

jonathanhefner commented Oct 12, 2018

@schneems I believe the reason you're seeing "same-ish" is because the dup is being included in the benchmark along with the actual method calls, and it's dominating the measurement. To confirm if that's the case, try benchmarking the non-bang variants (i.e. just array.flatten.uniq.compact) and see if you actually get a similar or higher i/s. If so, the dup is probably the culprit.

The gem I'm using to benchmark is one that I wrote to exclude dup from the measurements, so that's why I'm seeing a difference.

Here's flatten + uniq + compact with that gem:

require "benchmark/inputs"

INPUTS = [
  [1, 2, 3],
  [1, nil, 2, nil, 3, nil],
  [nil, nil, nil],
]

Benchmark.inputs(INPUTS) do |job|
  job.dup_inputs = true
  job.report("normal"){|array| array.flatten.uniq.compact }
  job.report("bang!"){|array| array.flatten!; array.uniq!; array.compact! }
  job.report("tap(&:bang!)"){|array| array.tap(&:flatten!).tap(&:uniq!).tap(&:compact!) }
  job.report("tap{|a| a.bang! * 3 }"){|array| array.tap{|a| a.flatten!; a.uniq!; a.compact! } }
  job.report("tap{|a| a.bang! } * 3"){|array| array.tap{|a| a.flatten! }.tap{|a| a.uniq! }.tap{|a| a.compact! } }
  job.compare!
end

Results:

                  bang!:    172013.5 i/s
  tap{|a| a.bang! * 3 }:    165071.0 i/s - 1.04x slower
                 normal:    160889.3 i/s - 1.07x slower
  tap{|a| a.bang! } * 3:    158352.6 i/s - 1.09x slower
           tap(&:bang!):    145242.9 i/s - 1.18x slower