Skip to content

Conversation

vinistock
Copy link
Contributor

Summary

This PR switches from stringifying attributes early on and invokes to_s whenever necessary within the processing loops.

I also took the opportunity to change if !condition to unless condition.

Benchmark script

# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  gem "rails", github: "rails/rails", require: "rails/all"
  gem "benchmark-ips"
end

module ActiveModel
  module AttributeAssignment
    def fast_assign_attributes(new_attributes)
      if !new_attributes.respond_to?(:stringify_keys)
        raise ArgumentError, "When assigning attributes, you must pass a hash as an argument, #{new_attributes.class} passed."
      end
      return if new_attributes.empty?

      _fast_assign_attributes(sanitize_for_mass_assignment(new_attributes))
    end

    alias attributes= assign_attributes

    private
    def _fast_assign_attributes(attributes)
      attributes.each do |k, v|
        _fast_assign_attribute(k, v)
      end
    end

    def _fast_assign_attribute(k, v)
      setter = :"#{k}="
      if respond_to?(setter)
        public_send(setter, v)
      else
        raise UnknownAttributeError.new(self, k.to_s)
      end
    end
  end
end

module ActiveRecord
  module AttributeAssignment
    def _fast_assign_attributes(attributes)
      multi_parameter_attributes  = {}
      nested_parameter_attributes = {}

      attributes.each do |k, v|
        key = k.to_s

        if key.include?("(")
          multi_parameter_attributes[key] = attributes.delete(k)
        elsif v.is_a?(Hash)
          nested_parameter_attributes[key] = attributes.delete(k)
        end
      end
      super(attributes)

      assign_nested_parameter_attributes(nested_parameter_attributes) unless nested_parameter_attributes.empty?
      assign_multiparameter_attributes(multi_parameter_attributes) unless multi_parameter_attributes.empty?
    end
  end
end

class Post
  include ActiveModel::AttributeAssignment
  attr_accessor :title, :author, :published, :score, :tags, :content
end

post = Post.new
attributes = {
  title: "My post",
  author: "John",
  published: true,
  score: 500,
  tags: "Software development,Web development",
  content: "random text" * 100
}

action_controller_params = ActionController::Parameters.new(attributes)
action_controller_params.permit!

SCENARIOS = {
  "Simple hash" => attributes,
  "ActionController params" => action_controller_params
}

SCENARIOS.each_pair do |name, value|
  puts
  puts " #{name} ".center(80, "=")
  puts

  Benchmark.ips do |x|
    x.report("assign_attributes")      { post.assign_attributes(value) }
    x.report("fast_assign_attributes") { post.fast_assign_attributes(value) }
    x.compare!
  end
end

Results

================================= Simple hash ==================================

Warming up --------------------------------------
   assign_attributes    26.067k i/100ms
fast_assign_attributes
                        36.034k i/100ms
Calculating -------------------------------------
   assign_attributes    280.621k (± 2.1%) i/s -      1.408M in   5.018190s
fast_assign_attributes
                        384.638k (± 3.8%) i/s -      1.946M in   5.067378s

Comparison:
fast_assign_attributes:   384637.9 i/s
   assign_attributes:   280620.9 i/s - 1.37x  slower


=========================== ActionController params ============================

Warming up --------------------------------------
   assign_attributes     5.406k i/100ms
fast_assign_attributes
                         7.209k i/100ms
Calculating -------------------------------------
   assign_attributes     59.000k (± 7.5%) i/s -    297.330k in   5.069136s
fast_assign_attributes
                         78.789k (± 8.0%) i/s -    396.495k in   5.063842s

Comparison:
fast_assign_attributes:    78788.7 i/s
   assign_attributes:    59000.3 i/s - 1.34x  slower

@vinistock
Copy link
Contributor Author

@rafaelfranca please let me know what you think. I believe this addresses the issues in my previous attempt, while keeping the performance enhancement.

super(new_attributes)

assign_nested_parameter_attributes(nested_parameter_attributes) unless nested_parameter_attributes.empty?
assign_multiparameter_attributes(multi_parameter_attributes) unless multi_parameter_attributes.empty?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some other versions in no particular order.

Take 1:

def _assign_attributes(attributes)
  multi_parameter_attributes  = {}
  nested_parameter_attributes = {}

  super \
    attributes.dup.delete_if do |key, value|
      key = key.to_s

      if key.include?("(")
        multi_parameter_attributes[key] = value
      elsif value.is_a?(Hash)
        nested_parameter_attributes[key] = value
      end
    end
  
  assign_nested_parameter_attributes(nested_parameter_attributes) unless nested_parameter_attributes.empty?
  assign_multiparameter_attributes(multi_parameter_attributes) unless multi_parameter_attributes.empty?
end

Take 2:

def _assign_attributes(attributes)
  multi_parameter_attributes  = {}
  nested_parameter_attributes = {}
  new_attributes = {}

  attributes.each do |key, value|
    key       = key.to_s
    hash      = multi_parameter_attributes  if key.include?("(")
    hash    ||= nested_parameter_attributes if value.is_a?(Hash)
    hash    ||= new_attributes
    hash[key] = value
  end

  super(new_attributes)
  
  assign_nested_parameter_attributes(nested_parameter_attributes) unless nested_parameter_attributes.empty?
  assign_multiparameter_attributes(multi_parameter_attributes) unless multi_parameter_attributes.empty?
end

Take 1 taken a little further with the extra idea that nested or multi parameter attributes are passed rarely:

def _assign_attributes(attributes)
  multi_parameter_attributes = nested_parameter_attributes = nil

  super \
    attributes.dup.delete_if do |key, value|
      key = key.to_s

      if key.include?("(")
        (multi_parameter_attributes ||= {})[key] = value
      elsif value.is_a?(Hash)
        (nested_parameter_attributes ||= {})[key] = value
      end
    end

  assign_nested_parameter_attributes(nested_parameter_attributes) if nested_parameter_attributes&.any?
  assign_multiparameter_attributes(multi_parameter_attributes) if multi_parameter_attributes&.any?
end

Copy link
Contributor Author

@vinistock vinistock Feb 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I like take 1 better.

EDIT: I played around with take 1, but we need to return true or false inside the delete_if block to make sure the correct attributes are removed from the dupped hash. In the end, it looks a little messier than the current implementation.

def _assign_attributes(attributes)
multi_parameter_attributes = {}
nested_parameter_attributes = {}
new_attributes = attributes.dup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why just no use attributes.stringify_keys?

Suggested change
new_attributes = attributes.dup
attributes = attributes.stringify_keys

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that loop through the attributes twice? Once for turning the keys into strings and another one in the each block (which is precisely what I attempted to avoid in the PR).

@kamipo
Copy link
Member

kamipo commented Feb 7, 2020

Personally I'd not prefer this change since this makes _assign_attributes in Active Record a bit slower (add extra stringify_keys), which is directly used to avoid extra stringify_keys in assign_attributes.

def populate_with_current_scope_attributes # :nodoc:
return unless self.class.scope_attributes?
attributes = self.class.scope_attributes
_assign_attributes(attributes) if attributes.any?
end

def initialize_attributes(record, except_from_scope_attributes = nil) #:nodoc:
except_from_scope_attributes ||= {}
skip_assign = [reflection.foreign_key, reflection.type].compact
assigned_keys = record.changed_attribute_names_to_save
assigned_keys += except_from_scope_attributes.keys.map(&:to_s)
attributes = scope_for_create.except!(*(assigned_keys - skip_assign))
record.send(:_assign_attributes, attributes) if attributes.any?
set_inverse_instance(record)
end

@vinistock
Copy link
Contributor Author

vinistock commented Feb 7, 2020

@kamipo I agree that if we call stringify_keys in _assign_attributes it will make it slower. However, do you feel like the penalty would be too big if we continue to just to_s inside the each block?

I can benchmark these two methods and check the differences.

@vinistock
Copy link
Contributor Author

Benchmarking populate_with_current_scope_attributes comparing using the original _assign_attributes vs the new.

Warming up --------------------------------------
populate_with_current_scope_attributes
                        34.216k i/100ms
fast_populate_with_current_scope_attributes
                        34.797k i/100ms
Calculating -------------------------------------
populate_with_current_scope_attributes
                        394.070k (± 3.7%) i/s -      1.985M in   5.044532s
fast_populate_with_current_scope_attributes
                        394.044k (± 2.2%) i/s -      1.983M in   5.036018s

Comparison:
populate_with_current_scope_attributes:   394069.9 i/s
fast_populate_with_current_scope_attributes:   394043.8 i/s - same-ish: difference falls within error

@vinistock
Copy link
Contributor Author

Benchmark for initialize_attributes. Again, please let me know your thoughts and whether you believe we should go with the refactor or drop it.

Warming up --------------------------------------
        initialize_attributes     1.352k i/100ms
   fast_initialize_attributes     1.350k i/100ms
Calculating -------------------------------------
        initialize_attributes     13.725k (± 3.0%) i/s -     68.952k in   5.028806s
   fast_initialize_attributes     13.455k (± 4.2%) i/s -     67.500k in   5.025927s

Comparison:
        initialize_attributes:    13724.8 i/s
   fast_initialize_attributes:    13455.4 i/s - same-ish: difference falls within error

@eugeneius
Copy link
Member

I think @kamipo's main objection is that _assign_attributes now duplicates its argument, which causes unnecessary allocations. This could be fixed by moving the dup call to assign_attributes:

diff --git a/activerecord/lib/active_record/attribute_assignment.rb b/activerecord/lib/active_record/attribute_assignment.rb
index cd14e66c5b..facdac2380 100644
--- a/activerecord/lib/active_record/attribute_assignment.rb
+++ b/activerecord/lib/active_record/attribute_assignment.rb
@@ -6,11 +6,14 @@ module ActiveRecord
   module AttributeAssignment
     include ActiveModel::AttributeAssignment

+    def assign_attributes(new_attributes)
+      super(new_attributes.dup)
+    end
+
     private
-      def _assign_attributes(attributes)
+      def _assign_attributes(new_attributes)
         multi_parameter_attributes  = {}
         nested_parameter_attributes = {}
-        new_attributes = attributes.dup

         new_attributes.each do |k, v|
           key = k.to_s

We're still calling to_s on each key, but given that the internal callers of _assign_attributes always pass hashes with string keys, this seems fine to me.

@eugeneius
Copy link
Member

Can you show the impact of this change on Active Record's assign_attributes method? The benchmark in the description only covers Active Model, and given that the patch moves work from AM to AR, it's not exactly surprising that the new implementation performs better. 😅

@vinistock
Copy link
Contributor Author

I've benchmarked one method that makes use of assign_attributes, which is AR's initialize. Here is the script and its results.

I had to run the script twice separately for the patched and the non-patched version (since having two version of initialize is not quite trivial).

Let me know if you'd like to see other benchmarks and other means of making sure that the change is desired.

Script

# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  gem "rails", github: "rails/rails", require: "rails/all"
  gem "benchmark-ips"
  gem "sqlite3"
end

module ActiveModel
  module AttributeAssignment
    def fast_assign_attributes(new_attributes)
      unless new_attributes.respond_to?(:each_pair)
        raise ArgumentError, "When assigning attributes, you must pass a hash as an argument, #{new_attributes.class} passed."
      end
      return if new_attributes.empty?

      _fast_assign_attributes(sanitize_for_mass_assignment(new_attributes))
    end

    alias attributes= assign_attributes

    private
    def _fast_assign_attributes(attributes)
      attributes.each do |k, v|
        _fast_assign_attribute(k, v)
      end
    end

    def _fast_assign_attribute(k, v)
      setter = :"#{k}="
      if respond_to?(setter)
        public_send(setter, v)
      else
        raise UnknownAttributeError.new(self, k.to_s)
      end
    end
  end
end

module ActiveRecord
  module AttributeAssignment
    def fast_assign_attributes(attributes)
      super(attributes.dup)
    end

    def _fast_assign_attributes(attributes)
      multi_parameter_attributes  = {}
      nested_parameter_attributes = {}

      attributes.each do |k, v|
        key = k.to_s

        if key.include?("(")
          multi_parameter_attributes[key] = attributes.delete(k)
        elsif v.is_a?(Hash)
          nested_parameter_attributes[key] = attributes.delete(k)
        end
      end
      super(attributes)

      assign_nested_parameter_attributes(nested_parameter_attributes) unless nested_parameter_attributes.empty?
      assign_multiparameter_attributes(multi_parameter_attributes) unless multi_parameter_attributes.empty?
    end
  end

  module Core
    def initialize(attributes = nil)
      @new_record = true
      @attributes = self.class._default_attributes.deep_dup

      init_internals
      initialize_internals_callback

      assign_attributes(attributes) if attributes

      yield self if block_given?
      _run_initialize_callbacks
    end
  end
end

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Schema.define do
  create_table :posts, force: true do |t|
    t.string :title
    t.string :author
    t.boolean :published
    t.integer :score
    t.string :tags
    t.text :content
  end

  create_table :comments, force: true do |t|
    t.text :content
    t.references :post
  end
end

class Post < ActiveRecord::Base
  include ActiveModel::AttributeAssignment
  attr_accessor :title, :author, :published, :score, :tags, :content
  has_many :comments
end

class Comment < ActiveRecord::Base
  belongs_to :post
end

attributes = {
  title: "My post",
  author: "John",
  published: true,
  score: 500,
  tags: "Software development,Web development",
  content: "random text" * 100
}

Benchmark.ips do |x|
  x.report("initialize")      { Post.new(attributes) }
  x.compare!
end

Results

Warming up --------------------------------------
          initialize     3.557k i/100ms
Calculating -------------------------------------
          initialize     38.786k (± 3.2%) i/s -    195.635k in   5.049337s

Warming up --------------------------------------
          fast_initialize     3.435k i/100ms
Calculating -------------------------------------
          fast_initialize     39.031k (± 3.1%) i/s -    195.795k in   5.021418s

@rafaelfranca
Copy link
Member

rafaelfranca commented Feb 11, 2020

Can we measure _assign_attributes directly as well? Passing keys as strings and symbols. There is no extra allocation when _assign_attributes is called with string keys, but there is an extra method call, for each item in the hash.

@vinistock
Copy link
Contributor Author

@rafaelfranca good call, the more symbols used for attributes the slower _assign_attributes gets. The script and results are below. I feel like this is not going to have the desired improvement.

Benchmark

# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  gem "rails", "6.0.2.1", require: "rails/all"
  gem "benchmark-ips"
  gem "sqlite3"
end

module ActiveModel
  module AttributeAssignment
    def fast_assign_attributes(new_attributes)
      unless new_attributes.respond_to?(:each_pair)
        raise ArgumentError, "When assigning attributes, you must pass a hash as an argument, #{new_attributes.class} passed."
      end
      return if new_attributes.empty?

      _fast_assign_attributes(sanitize_for_mass_assignment(new_attributes))
    end

    alias attributes= assign_attributes

    private
    def _fast_assign_attributes(attributes)
      attributes.each do |k, v|
        _fast_assign_attribute(k, v)
      end
    end

    def _fast_assign_attribute(k, v)
      setter = :"#{k}="
      if respond_to?(setter)
        public_send(setter, v)
      else
        raise UnknownAttributeError.new(self, k.to_s)
      end
    end
  end
end

module ActiveRecord
  module AttributeAssignment
    def fast_assign_attributes(attributes)
      super(attributes.dup)
    end

    def _fast_assign_attributes(attributes)
      multi_parameter_attributes  = {}
      nested_parameter_attributes = {}

      attributes.each do |k, v|
        key = k.to_s

        if key.include?("(")
          multi_parameter_attributes[key] = attributes.delete(k)
        elsif v.is_a?(Hash)
          nested_parameter_attributes[key] = attributes.delete(k)
        end
      end
      super(attributes)

      assign_nested_parameter_attributes(nested_parameter_attributes) unless nested_parameter_attributes.empty?
      assign_multiparameter_attributes(multi_parameter_attributes) unless multi_parameter_attributes.empty?
    end
  end
end

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Schema.define do
  create_table :posts, force: true do |t|
    t.string :title
    t.string :author
    t.boolean :published
    t.integer :score
    t.string :tags
    t.text :content
    t.integer :another_field
  end

  create_table :comments, force: true do |t|
    t.text :content
    t.references :post
  end
end

class Post < ActiveRecord::Base
  include ActiveModel::AttributeAssignment
  attr_accessor :title, :author, :published, :score, :tags, :content, :another_field
  has_many :comments
end

class Comment < ActiveRecord::Base
  belongs_to :post
end

SCENARIOS = {
  "All symbols" => {
    title: "My post",
    author: "John",
    published: true,
    score: 500,
    tags: "Software development,Web development",
    content: "random text" * 100,
    another_field: 55555
  },
  "Mixed" => {
    "title" => "My post",
    author: "John",
    "published" => true,
    score: 500,
    "tags" => "Software development,Web development",
    content: "random text" * 100,
    "another_field" => 55555
  },
  "All strings" => {
    "title" => "My post",
    "author" => "John",
    "published" => true,
    "score" => 500,
    "tags" => "Software development,Web development",
    "content" => "random text" * 100,
    "another_field" => 55555
  }
}

SCENARIOS.each_pair do |name, value|
  puts
  puts " #{name} ".center(80, "=")
  puts

  string_attributes = value.stringify_keys
  post = Post.new

  Benchmark.ips do |x|
    x.report("assign_attributes")      { post.send(:_assign_attributes, string_attributes) }
    x.report("fast_assign_attributes") { post.send(:_fast_assign_attributes, value) }
    x.compare!
  end
end

Results

================================= All symbols ==================================

Warming up --------------------------------------
   assign_attributes    12.286k i/100ms
fast_assign_attributes
                        10.839k i/100ms
Calculating -------------------------------------
   assign_attributes    127.639k (± 2.7%) i/s -    638.872k in   5.009241s
fast_assign_attributes
                        111.649k (± 1.9%) i/s -    563.628k in   5.049950s

Comparison:
   assign_attributes:   127639.1 i/s
fast_assign_attributes:   111649.4 i/s - 1.14x  slower


==================================== Mixed =====================================

Warming up --------------------------------------
   assign_attributes    12.322k i/100ms
fast_assign_attributes
                        11.444k i/100ms
Calculating -------------------------------------
   assign_attributes    129.648k (± 1.8%) i/s -    653.066k in   5.038969s
fast_assign_attributes
                        118.436k (± 2.0%) i/s -    595.088k in   5.026683s

Comparison:
   assign_attributes:   129647.7 i/s
fast_assign_attributes:   118436.3 i/s - 1.09x  slower


================================= All strings ==================================

Warming up --------------------------------------
   assign_attributes    12.187k i/100ms
fast_assign_attributes
                        12.243k i/100ms
Calculating -------------------------------------
   assign_attributes    127.221k (± 5.1%) i/s -    645.911k in   5.092112s
fast_assign_attributes
                        129.694k (± 2.0%) i/s -    648.879k in   5.005173s

Comparison:
fast_assign_attributes:   129694.2 i/s
   assign_attributes:   127221.4 i/s - same-ish: difference falls within error

@rafaelfranca
Copy link
Member

The "All strings" case is the one we care about, since there is no difference I'm merging this PR.

@rafaelfranca rafaelfranca merged commit df186bd into rails:master Feb 13, 2020
@vinistock vinistock deleted the stop_stringifying_during_attribute_assignment branch February 13, 2020 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants