Make Active Record emit significantly smaller YAML

This reduces the size of a YAML-encoded Active Record object by ~80%,
depending on the number of columns. The previous encoding was wasteful
in several ways, and fixing each of them was a win:

- We were emitting the result of `attributes_before_type_cast` as a hack
  to work around some laziness issues. It is no longer emitted.
- The name of an attribute was emitted multiple times, since the
  attribute objects were in a hash keyed by the name. We now store them
  in an array instead, and reconstruct the hash using the name
- The types were included for every attribute. This would use backrefs
  if multiple objects were encoded, but really we don't need to include
  a type at all unless it differs from the type at the class level. (The
  only time that will occur is if the field is the result of a custom
  select clause.)
- `original_attribute:` was emitted for every attribute, even though
  the ivar is almost always `nil`. We've added a custom implementation
  of `encode_with` on the attribute objects to ensure we don't write the
  key when the field is `nil`.
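
As a rough illustration of the array and `nil`-key points above (this is
not the code in this commit), the sketch below dumps the same three
columns both ways using only Psych's `encode_with` hook. `VerboseAttr`,
`ConciseAttr`, and the column data are hypothetical stand-ins, not
Active Record classes.

```ruby
require "yaml"

# Hypothetical attribute with no custom hook: Psych dumps every ivar,
# including the ones that are nil.
class VerboseAttr
  def initialize(name, value_before_type_cast)
    @name = name
    @value_before_type_cast = value_before_type_cast
    @original_attribute = nil
  end
end

# Hypothetical attribute with a custom hook: a key is written only
# when its value is set.
class ConciseAttr
  attr_reader :name, :value_before_type_cast, :original_attribute

  def initialize(name, value_before_type_cast, original_attribute = nil)
    @name = name
    @value_before_type_cast = value_before_type_cast
    @original_attribute = original_attribute
  end

  # Psych calls this while dumping the object.
  def encode_with(coder)
    coder["name"] = name
    coder["value_before_type_cast"] = value_before_type_cast if value_before_type_cast
    coder["original_attribute"] = original_attribute if original_attribute
  end
end

COLUMNS = { "id" => 1, "name" => "Sean", "bio" => nil }

# Old shape: a hash keyed by name, so each name appears twice.
VERBOSE_YAML = YAML.dump(COLUMNS.map { |k, v| [k, VerboseAttr.new(k, v)] }.to_h)

# New shape: a plain array; the name lives only inside each attribute.
CONCISE_YAML = YAML.dump(COLUMNS.map { |k, v| ConciseAttr.new(k, v) })

puts "verbose: #{VERBOSE_YAML.bytesize} bytes"
puts "concise: #{CONCISE_YAML.bytesize} bytes"
```

The exact savings depend on the data. The ~80% figure above refers to
real Active Record objects, not to this toy.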

This isn't without a cost, though. Since we're no longer including the
types, an object can find itself in an invalid state if the type changes
on the class after serialization. This is the same behavior as in 4.1
and earlier, but I think it's worth noting.
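
A minimal sketch of that trade-off, assuming a toy encoder that follows
the same strip-on-dump, refill-on-load idea as `YAMLEncoder`:
`ToyAttribute`, `ToyEncoder`, and the `:integer`/`:string` type symbols
are hypothetical and stand in for the real attribute and type objects.

```ruby
require "yaml"

# Hypothetical attribute: a name, a raw value, and an optional type.
class ToyAttribute
  attr_reader :name, :raw, :type

  def initialize(name, raw, type)
    @name = name
    @raw = raw
    @type = type
  end

  def with_type(new_type)
    self.class.new(name, raw, new_type)
  end

  # Omit the type key when the type has been stripped.
  def encode_with(coder)
    coder["name"] = name
    coder["raw"] = raw
    coder["type"] = type if type
  end

  # Psych calls this on load; a missing key reads back as nil.
  def init_with(coder)
    @name = coder["name"]
    @raw = coder["raw"]
    @type = coder["type"]
  end
end

# Drops types identical to the defaults on dump and fills them back in
# from the *current* defaults on load.
class ToyEncoder
  def initialize(default_types)
    @default_types = default_types
  end

  def encode(attrs)
    YAML.dump(attrs.map { |a| a.type.equal?(@default_types[a.name]) ? a.with_type(nil) : a })
  end

  def decode(yaml)
    # Newer Psych versions need unsafe_load to revive custom classes.
    loaded = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
    loaded.map { |a| a.type.nil? ? a.with_type(@default_types[a.name]) : a }
  end
end

DUMPED = ToyEncoder.new("age" => :integer).encode([ToyAttribute.new("age", "42", :integer)])

# The class-level type changed between dump and load, so the reloaded
# attribute silently picks up the new type.
RELOADED = ToyEncoder.new("age" => :string).decode(DUMPED)

puts DUMPED
puts RELOADED.first.type.inspect
```

The dump carries no type information, so whatever the defaults say at
load time wins.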

I was worried that I'd introduce some new state bugs as a result of
doing this, so I've added a test asserting that mutations are not lost
when an object is round-tripped through YAML.

Fixes #25145
sgrif committed May 31, 2016
1 parent b351061 commit c4cb6862babd2665a65056e205c2a5fd17a5d99d
@@ -108,6 +108,22 @@ def hash
[self.class, name, value_before_type_cast, type].hash
end
def init_with(coder)
@name = coder["name"]
@value_before_type_cast = coder["value_before_type_cast"]
@type = coder["type"]
@original_attribute = coder["original_attribute"]
@value = coder["value"] if coder.map.key?("value")
end
def encode_with(coder)
coder["name"] = name
coder["value_before_type_cast"] = value_before_type_cast if value_before_type_cast
coder["type"] = type if type
coder["original_attribute"] = original_attribute if original_attribute
coder["value"] = value if defined?(@value)
end
protected
attr_reader :original_attribute
@@ -201,6 +217,10 @@ def value_for_database
def initialized?
false
end
def with_type(type)
self.class.new(name, type)
end
end
private_constant :FromDatabase, :FromUser, :Null, :Uninitialized, :WithCastValue
end
@@ -1,7 +1,10 @@
require 'active_record/attribute_set/builder'
require 'active_record/attribute_set/yaml_encoder'
module ActiveRecord
class AttributeSet # :nodoc:
delegate :each_value, to: :attributes
def initialize(attributes)
@attributes = attributes
end
@@ -22,7 +22,7 @@ def build_from_database(values = {}, additional_types = {})
end
class LazyAttributeHash # :nodoc:
-delegate :transform_values, :each_key, to: :materialize
+delegate :transform_values, :each_key, :each_value, to: :materialize
def initialize(types, values, additional_types)
@types = types
@@ -0,0 +1,39 @@
module ActiveRecord
class AttributeSet
# Attempts to do more intelligent YAML dumping of an
# ActiveRecord::AttributeSet to reduce the size of the resulting string
class YAMLEncoder
def initialize(default_types)
@default_types = default_types
end
def encode(attribute_set, coder)
coder['concise_attributes'] = attribute_set.each_value.map do |attr|
if attr.type.equal?(default_types[attr.name])
attr.with_type(nil)
else
attr
end
end
end
def decode(coder)
if coder['attributes']
coder['attributes']
else
attributes_hash = Hash[coder['concise_attributes'].map do |attr|
if attr.type.nil?
attr = attr.with_type(default_types[attr.name])
end
[attr.name, attr]
end]
AttributeSet.new(attributes_hash)
end
end
protected
attr_reader :default_types
end
end
end
@@ -338,7 +338,7 @@ def initialize(attributes = nil)
# post.title # => 'hello world'
def init_with(coder)
coder = LegacyYamlAdapter.convert(self.class, coder)
-@attributes = coder['attributes']
+@attributes = self.class.yaml_encoder.decode(coder)
init_internals
@@ -404,11 +404,9 @@ def initialize_dup(other) # :nodoc:
# Post.new.encode_with(coder)
# coder # => {"attributes" => {"id" => nil, ... }}
def encode_with(coder)
-# FIXME: Remove this when we better serialize attributes
-coder['raw_attributes'] = attributes_before_type_cast
-coder['attributes'] = @attributes
+self.class.yaml_encoder.encode(@attributes, coder)
coder['new_record'] = new_record?
-coder['active_record_yaml_version'] = 1
+coder['active_record_yaml_version'] = 2
end
# Returns true if +comparison_object+ is the same exact object, or +comparison_object+
@@ -4,7 +4,7 @@ def self.convert(klass, coder)
return coder unless coder.is_a?(Psych::Coder)
case coder["active_record_yaml_version"]
-when 1 then coder
+when 1, 2 then coder
else
if coder["attributes"].is_a?(AttributeSet)
Rails420.convert(klass, coder)
@@ -267,6 +267,10 @@ def attribute_types # :nodoc:
@attribute_types ||= Hash.new(Type::Value.new)
end
def yaml_encoder # :nodoc:
@yaml_encoder ||= AttributeSet::YAMLEncoder.new(attribute_types)
end
# Returns the type of the attribute with the given name, after applying
# all modifiers. This method is the only valid source of information for
# anything related to the types of a model's attributes. This method will
@@ -375,6 +379,7 @@ def reload_schema_from_cache
@columns = nil
@columns_hash = nil
@attribute_names = nil
@yaml_encoder = nil
direct_descendants.each do |descendant|
descendant.send(:reload_schema_from_cache)
end
@@ -109,6 +109,16 @@ def test_deserializing_rails_4_2_0_yaml
assert_equal("Have a nice day", topic.content)
end
def test_yaml_encoding_keeps_mutations
author = Author.first
author.name = "Sean"
dumped = YAML.load(YAML.dump(author))
assert_equal "Sean", dumped.name
assert_equal author.name_was, dumped.name_was
assert_equal author.changes, dumped.changes
end
private
def yaml_fixture(file_name)
