Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fewer object allocations in Encryption::Properties #45531

Merged
merged 1 commit into from
Jul 6, 2022

Conversation

djmb
Copy link
Contributor

@djmb djmb commented Jul 6, 2022

Summary

delegate_missing_to and Enumerable#find both allocate objects. When selecting a large number of encrypted values with can lead to a significant number of allocations.

We can avoid these by:

  1. Not delegating the most commonly called methods
  2. Caching the results of the ALLOWED_VALUE_CLASSES lookup

The benchmark loads 1000 encrypted values - for it I saw a 15% faster with a 30% reduction in allocations.

Other Information


require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  # Activate the gem you are reporting the issue against.
  gem "activerecord", "~> 7.0.0"
  gem "sqlite3"
  gem "benchmark-ips"
end

require "active_record"
require 'benchmark/ips'

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Encryption.configure \
  primary_key: "test master key",
  deterministic_key: "test deterministic key",
  key_derivation_salt: "testing key derivation salt",
  support_unencrypted_data: true

ActiveRecord::Schema.define do
  create_table :comments, force: true do |t|
    t.string :message, length: 1000
  end
end

class Comment < ActiveRecord::Base
  encrypts :message
end

srand(123456)
1000.times { Comment.create!(message: 100.times.map { ("A".."Z").to_a.sample }.join) }

if ENV['OPTIMIZED']
  module ActiveRecord
    module EncryptionPropertiesAvoidAllocations
      extend ActiveSupport::Concern

      ALLOWED_VALUE_CLASSES_CACHE = Set.new

      if %w{delegation all}.include?(ENV['OPTIMIZED'])
        # Delegation involves object creation. Defining the commonly called methods
        # can saves a many allocations when loading large number of encrypted attributes.
        def each(*args, &block)
          data.each(*args, &block)
        end

        def [](*args)
          data.[](*args)
        end

        def key?(*args)
          data.key?(*args)
        end
      end

      if %w{find all}.include?(ENV['OPTIMIZED'])
        # find also involves allocations, so we can cache the results which classes
        # are valid to avoid the call
        def validate_value_type(value)
          unless ALLOWED_VALUE_CLASSES_CACHE.include?(value.class)
            unless ActiveRecord::Encryption::Properties::ALLOWED_VALUE_CLASSES.find { |klass| value.is_a?(klass) }
              raise ActiveRecord::Encryption::Errors::ForbiddenClass, "Can't store a #{value.class}, only properties of type #{ALLOWED_VALUE_CLASSES.inspect} are allowed"
            end
            ALLOWED_VALUE_CLASSES_CACHE << value.class
          end
        end
      end
    end
  end

  ActiveRecord::Encryption::Properties.prepend(ActiveRecord::EncryptionPropertiesAvoidAllocations);
end

def allocation_count
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

allocated_objects = allocation_count { Comment.pluck(:message) }
puts "Optimization: #{ENV['OPTIMIZED'] || "none"}, allocations: #{allocated_objects}"

Benchmark.ips do |x|
  x.config(time: 30)
  x.report("default")                 { Comment.pluck(:message) }
  x.report("delegation_optimization") { Comment.pluck(:message) }
  x.report("find_optimization")       { Comment.pluck(:message) }
  x.report("all_optimization")        { Comment.pluck(:message) }
  x.hold! "temp_results"
  x.compare!
end

Results:

$ ruby encryption_properties_benchmark.rb; OPTIMIZED=delegation ruby encryption_properties_benchmark.rb; OPTIMIZED=find ruby encryption_properties_benchmark.rb; OPTIMIZED=all ruby encryption_properties_benchmark.rb
...snip...
Optimization: none, allocations: 72238
Warming up --------------------------------------
             default     7.000  i/100ms
Calculating -------------------------------------
             default     71.004  (± 4.2%) i/s -      2.128k in  30.033905s

...snip...
Optimization: delegation, allocations: 62313
Warming up --------------------------------------
delegation_optimization
                         7.000  i/100ms
Calculating -------------------------------------
delegation_optimization
                         80.773  (± 6.2%) i/s -      2.415k in  30.038937s

Pausing here -- run Ruby again to measure the next benchmark...

Comparison:
delegation_optimization:       80.8 i/s
             default:       71.0 i/s - 1.14x  (± 0.00) slower

...snip...
Optimization: find, allocations: 60311
Warming up --------------------------------------
   find_optimization     7.000  i/100ms
Calculating -------------------------------------
   find_optimization     77.581  (± 3.9%) i/s -      2.324k in  30.027488s

Pausing here -- run Ruby again to measure the next benchmark...

Comparison:
delegation_optimization:       80.8 i/s
   find_optimization:       77.6 i/s - same-ish: difference falls within error
             default:       71.0 i/s - 1.14x  (± 0.00) slower

...snip...
Optimization: all, allocations: 50320
Warming up --------------------------------------
    all_optimization     8.000  i/100ms
Calculating -------------------------------------
    all_optimization     84.011  (± 8.3%) i/s -      2.496k in  30.008758s

Comparison:
    all_optimization:       84.0 i/s
delegation_optimization:       80.8 i/s - same-ish: difference falls within error
   find_optimization:       77.6 i/s - same-ish: difference falls within error
             default:       71.0 i/s - 1.18x  (± 0.00) slower

data.each(*args, &block)
end

def [](*args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This *args also allocate. I suspect you'd have similar performance by just adding it to delegate :==, to: :data above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @byroot, I've confirmed that also works

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Please re-run your benchmark though, as I need to see if the gain is big enough to justify the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the benchmark results below.

def validate_value_type(value)
unless ALLOWED_VALUE_CLASSES.find { |klass| value.is_a?(klass) }
raise ActiveRecord::Encryption::Errors::ForbiddenClass, "Can't store a #{value.class}, only properties of type #{ALLOWED_VALUE_CLASSES.inspect} are allowed"
unless ALLOWED_VALUE_CLASSES_CACHE.include?(value.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think you need that cache. We can make a few assumptions here:

  • Neither of the accepted classes are frequently subclassed, except maybe Numeric.
  • Most value will be valid.

Based on this:

require 'set'
require 'benchmark/ips'

class Message
end

ALLOWED_VALUE_CLASSES = [String, Message, Numeric, TrueClass, FalseClass, Symbol, NilClass]
ALLOWED_VALUE_CLASSES_SET = ALLOWED_VALUE_CLASSES.to_set

Benchmark.ips do |x|
  x.report("any?") { ALLOWED_VALUE_CLASSES.any? { |k| nil.is_a?(k) } }
  x.report("Array#include") { ALLOWED_VALUE_CLASSES.include?(nil.class) || ALLOWED_VALUE_CLASSES.any? { |k| nil.is_a?(k) } }
  x.report("Set#include") { ALLOWED_VALUE_CLASSES_SET.include?(nil.class) || ALLOWED_VALUE_CLASSES.any? { |k| nil.is_a?(k) } }
  x.compare!(order: :baseline)
end
Warming up --------------------------------------
                any?   304.205k i/100ms
       Array#include   925.406k i/100ms
         Set#include   968.835k i/100ms
Calculating -------------------------------------
                any?      3.059M (± 0.2%) i/s -     15.514M in   5.072052s
       Array#include      9.245M (± 0.3%) i/s -     46.270M in   5.005165s
         Set#include      9.663M (± 0.2%) i/s -     48.442M in   5.013071s

Comparison:
                any?:  3058823.3 i/s
         Set#include:  9663114.7 i/s - 3.16x  (± 0.00) faster
       Array#include:  9244619.7 i/s - 3.02x  (± 0.00) faster

So in short, that array is small enough that even searching for the last element is pretty much in the same ballpark than using a set.

So I think:

ALLOWED_VALUE_CLASSES.include?(value.class) || ALLOWED_VALUE_CLASSES.any? { |klass| value.is_a?(klass) } 

Should do just fine. You may just want to add the common numerics to the list: Integer and Float.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that, I've update that now. I also added BigDecimal to the list.

@djmb djmb force-pushed the reduce-encrypted-properties-allocations branch from 7fc87c0 to 50a00d3 Compare July 6, 2022 13:38
@@ -54,7 +54,7 @@ def []=(key, value)
end

def validate_value_type(value)
unless ALLOWED_VALUE_CLASSES.find { |klass| value.is_a?(klass) }
unless ALLOWED_VALUE_CLASSES.include?(value.class) && ALLOWED_VALUE_CLASSES.any? { |klass| value.is_a?(klass) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unless ALLOWED_VALUE_CLASSES.include?(value.class) && ALLOWED_VALUE_CLASSES.any? { |klass| value.is_a?(klass) }
unless ALLOWED_VALUE_CLASSES.include?(value.class) || ALLOWED_VALUE_CLASSES.any? { |klass| value.is_a?(klass) }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh sorry, I'll add a test case for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fixed now with and I've added a test for it.

@djmb
Copy link
Contributor Author

djmb commented Jul 6, 2022

@byroot, @casperisfine - I've updated this to use your suggestions and confirmed that it works as well as the original:


require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org/"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  # Activate the gem you are reporting the issue against.
  gem "activerecord", "~> 7.0.0"
  gem "sqlite3"
  gem "benchmark-ips"
end

require "active_record"
require 'benchmark/ips'

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Encryption.configure \
  primary_key: "test master key",
  deterministic_key: "test deterministic key",
  key_derivation_salt: "testing key derivation salt",
  support_unencrypted_data: true

ActiveRecord::Schema.define do
  create_table :comments, force: true do |t|
    t.string :message, length: 1000
  end
end

class Comment < ActiveRecord::Base
  encrypts :message
end

srand(123456)
1000.times { Comment.create!(message: 100.times.map { ("A".."Z").to_a.sample }.join) }

if ENV['OPTIMIZED']
  module ActiveRecord
    module EncryptionPropertiesAvoidAllocations
      extend ActiveSupport::Concern

      ALLOWED_VALUE_CLASSES = [String, ActiveRecord::Encryption::Message, Numeric, Float, Integer, BigDecimal, TrueClass, FalseClass, Symbol, NilClass]

      if %w{delegation all}.include?(ENV['OPTIMIZED'])
        delegate :each, :[], :key?, to: :data
      end

      if %w{find all}.include?(ENV['OPTIMIZED'])
        # find also involves allocations, so we can cache the results which classes
        # are valid to avoid the call
        def validate_value_type(value)
          unless ALLOWED_VALUE_CLASSES.include?(value.class) || ALLOWED_VALUE_CLASSES.any? { |klass| value.is_a?(klass) }
            raise ActiveRecord::Encryption::Errors::ForbiddenClass, "Can't store a #{value.class}, only properties of type #{ALLOWED_VALUE_CLASSES.inspect} are allowed"
          end
        end
      end
    end
  end

  ActiveRecord::Encryption::Properties.prepend(ActiveRecord::EncryptionPropertiesAvoidAllocations);
end

def allocation_count
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

allocated_objects = allocation_count { Comment.pluck(:message) }
puts "Optimization: #{ENV['OPTIMIZED'] || "none"}, allocations: #{allocated_objects}"

Benchmark.ips do |x|
  x.config(time: 30)
  x.report("default")                 { Comment.pluck(:message) }
  x.report("delegation_optimization") { Comment.pluck(:message) }
  x.report("find_optimization")       { Comment.pluck(:message) }
  x.report("all_optimization")        { Comment.pluck(:message) }
  x.hold! "temp_results"
  x.compare!
end

Results:

$ ruby encryption_properties_benchmark.rb; OPTIMIZED=delegation ruby encryption_properties_benchmark.rb; OPTIMIZED=find ruby encryption_properties_benchmark.rb; OPTIMIZED=all ruby encryption_properties_benchmark.rb
...snip...
Optimization: none, allocations: 72238
Warming up --------------------------------------
             default     6.000  i/100ms
Calculating -------------------------------------
             default     70.643  (± 4.2%) i/s -      2.118k in  30.052046s

...snip...
Optimization: delegation, allocations: 62313
Warming up --------------------------------------
delegation_optimization
                         7.000  i/100ms
Calculating -------------------------------------
delegation_optimization
                         75.785  (± 4.0%) i/s -      2.275k in  30.086061s

Pausing here -- run Ruby again to measure the next benchmark...

Comparison:
delegation_optimization:       75.8 i/s
             default:       70.6 i/s - same-ish: difference falls within error

...snip...
Optimization: find, allocations: 60306
Warming up --------------------------------------
   find_optimization     7.000  i/100ms
Calculating -------------------------------------
   find_optimization     74.179  (± 4.0%) i/s -      2.226k in  30.082991s

Pausing here -- run Ruby again to measure the next benchmark...

Comparison:
delegation_optimization:       75.8 i/s
   find_optimization:       74.2 i/s - same-ish: difference falls within error
             default:       70.6 i/s - same-ish: difference falls within error

...snip...
Optimization: all, allocations: 50315
Warming up --------------------------------------
    all_optimization     8.000  i/100ms
Calculating -------------------------------------
    all_optimization     84.689  (± 5.9%) i/s -      2.528k in  30.008722s

Comparison:
    all_optimization:       84.7 i/s
delegation_optimization:       75.8 i/s - 1.12x  (± 0.00) slower
   find_optimization:       74.2 i/s - 1.14x  (± 0.00) slower
             default:       70.6 i/s - 1.20x  (± 0.00) slower

`delegate_missing_to` and `Enumerable#find` both allocate objects. When
selecting a large number of encrypted values with can lead to a
significant number of allocations.

```

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  # Activate the gem you are reporting the issue against.
  gem "activerecord", "~> 7.0.0"
  gem "sqlite3"
  gem "benchmark-ips"
end

require "active_record"
require 'benchmark/ips'

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Encryption.configure \
  primary_key: "test master key",
  deterministic_key: "test deterministic key",
  key_derivation_salt: "testing key derivation salt",
  support_unencrypted_data: true

ActiveRecord::Schema.define do
  create_table :comments, force: true do |t|
    t.string :message, length: 1000
  end
end

class Comment < ActiveRecord::Base
  encrypts :message
end

srand(123456)
1000.times { Comment.create!(message: 100.times.map { ("A".."Z").to_a.sample }.join) }

if ENV['OPTIMIZED']
  module ActiveRecord
    module EncryptionPropertiesAvoidAllocations
      extend ActiveSupport::Concern

      ALLOWED_VALUE_CLASSES = [String, ActiveRecord::Encryption::Message, Numeric, TrueClass, FalseClass, Symbol, NilClass, Float, Integer]

      if %w{delegation all}.include?(ENV['OPTIMIZED'])
        delegate :each, :[], :key?, to: :data
      end

      if %w{find all}.include?(ENV['OPTIMIZED'])
        # find also involves allocations, so we can cache the results which classes
        # are valid to avoid the call
        def validate_value_type(value)
          unless ALLOWED_VALUE_CLASSES.include?(value.class) || ALLOWED_VALUE_CLASSES.any? { |klass| value.is_a?(klass) }
            raise ActiveRecord::Encryption::Errors::ForbiddenClass, "Can't store a #{value.class}, only properties of type #{ALLOWED_VALUE_CLASSES.inspect} are allowed"
          end
        end
      end
    end
  end

  ActiveRecord::Encryption::Properties.prepend(ActiveRecord::EncryptionPropertiesAvoidAllocations);
end

def allocation_count
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

allocated_objects = allocation_count { Comment.pluck(:message) }
puts "Optimization: #{ENV['OPTIMIZED'] || "none"}, allocations: #{allocated_objects}"

Benchmark.ips do |x|
  x.config(time: 30)
  x.report("default")                 { Comment.pluck(:message) }
  x.report("delegation_optimization") { Comment.pluck(:message) }
  x.report("find_optimization")       { Comment.pluck(:message) }
  x.report("all_optimization")        { Comment.pluck(:message) }
  x.hold! "temp_results"
  x.compare!
end
```

Results:
```
$ ruby encryption_properties_benchmark.rb; OPTIMIZED=delegation ruby encryption_properties_benchmark.rb; OPTIMIZED=find ruby encryption_properties_benchmark.rb; OPTIMIZED=all ruby encryption_properties_benchmark.rb
...snip...
Optimization: none, allocations: 72238
Warming up --------------------------------------
             default     6.000  i/100ms
Calculating -------------------------------------
             default     70.643  (± 4.2%) i/s -      2.118k in  30.052046s

...snip...
Optimization: delegation, allocations: 62313
Warming up --------------------------------------
delegation_optimization
                         7.000  i/100ms
Calculating -------------------------------------
delegation_optimization
                         75.785  (± 4.0%) i/s -      2.275k in  30.086061s

Pausing here -- run Ruby again to measure the next benchmark...

Comparison:
delegation_optimization:       75.8 i/s
             default:       70.6 i/s - same-ish: difference falls within error

...snip...
Optimization: find, allocations: 60306
Warming up --------------------------------------
   find_optimization     7.000  i/100ms
Calculating -------------------------------------
   find_optimization     74.179  (± 4.0%) i/s -      2.226k in  30.082991s

Pausing here -- run Ruby again to measure the next benchmark...

Comparison:
delegation_optimization:       75.8 i/s
   find_optimization:       74.2 i/s - same-ish: difference falls within error
             default:       70.6 i/s - same-ish: difference falls within error

...snip...
Optimization: all, allocations: 50315
Warming up --------------------------------------
    all_optimization     8.000  i/100ms
Calculating -------------------------------------
    all_optimization     84.689  (± 5.9%) i/s -      2.528k in  30.008722s

Comparison:
    all_optimization:       84.7 i/s
delegation_optimization:       75.8 i/s - 1.12x  (± 0.00) slower
   find_optimization:       74.2 i/s - 1.14x  (± 0.00) slower
             default:       70.6 i/s - 1.20x  (± 0.00) slower

```

Overall it's about 15% faster with a 30% reduction in allocations.
@djmb djmb force-pushed the reduce-encrypted-properties-allocations branch from 50a00d3 to 0279c03 Compare July 6, 2022 13:57
@byroot
Copy link
Member

byroot commented Jul 6, 2022

Ok, so normally we have a "no micro optimization PRs" policy.

The improvement here isn't huge, but probably enough to qualify as not a micro-optimization, and the change is also really simple.

@byroot byroot merged commit 939d4ee into rails:main Jul 6, 2022
@djmb djmb deleted the reduce-encrypted-properties-allocations branch July 6, 2022 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants