Commit
Merge pull request #47463 from Shopify/configurable-default-column-serializer

Allow to define the default column serializer
byroot committed Feb 22, 2023
2 parents 3ad83f9 + 185f2d7 commit 1797e5b
Showing 18 changed files with 346 additions and 177 deletions.
2 changes: 1 addition & 1 deletion actiontext/app/models/action_text/rich_text.rb
@@ -8,7 +8,7 @@ module ActionText
class RichText < Record
self.table_name = "action_text_rich_texts"

- serialize :body, ActionText::Content
+ serialize :body, coder: ActionText::Content
delegate :to_s, :nil?, to: :body

belongs_to :record, polymorphic: true, touch: true
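
The change above moves the coder from a positional argument to the `coder:` keyword. As a rough illustration of how the `serialize` method later in this commit interprets a legacy positional argument, here is a minimal sketch in plain Ruby. The helper name `resolve_serialize_arguments` and the `default_coder:` parameter are hypothetical stand-ins (the real method reads `default_column_serializer` and also emits deprecation warnings, which are omitted here).

```ruby
require "json"
require "yaml"

# Hypothetical helper, not part of Rails: mirrors the branch in `serialize`
# that decides whether a positional argument is a type or a coder.
def resolve_serialize_arguments(positional = nil, coder: nil, type: Object, default_coder: YAML)
  unless positional.nil?
    if positional.respond_to?(:new)
      # Anything that can be instantiated is treated as the expected type.
      type = positional
    else
      # Everything else (for example the JSON module) is treated as the coder.
      coder = positional
    end
  end

  # Fall back to the configured default, and fail if there is none.
  coder ||= default_coder
  raise ArgumentError, "missing keyword: :coder" unless coder

  { coder: coder, type: type }
end

p resolve_serialize_arguments(Hash)                     # legacy: type given positionally
p resolve_serialize_arguments(JSON)                     # legacy: coder given positionally
p resolve_serialize_arguments(coder: JSON, type: Array) # new keyword form
```

Because the check is `respond_to?(:new)`, a coder implemented as a class would be read as a `type` under this logic when passed positionally, which is one more reason to prefer the explicit keywords.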
1 change: 1 addition & 0 deletions activerecord/lib/active_record.rb
@@ -118,6 +118,7 @@ module ActiveRecord
end

module Coders
autoload :ColumnSerializer, "active_record/coders/column_serializer"
autoload :JSON, "active_record/coders/json"
autoload :YAMLColumn, "active_record/coders/yaml_column"
end
173 changes: 141 additions & 32 deletions activerecord/lib/active_record/attribute_methods/serialization.rb
@@ -15,6 +15,10 @@ def initialize(name, type)
end
end

included do
class_attribute :default_column_serializer, instance_accessor: false, default: Coders::YAMLColumn
end

module ClassMethods
# If you have an attribute that needs to be saved to the database as a
# serialized object, and retrieved by deserializing into the same object,
@@ -36,21 +40,16 @@ module ClassMethods
# ==== Parameters
#
# * +attr_name+ - The name of the attribute to serialize.
- # * +class_name_or_coder+ - Optional. May be one of the following:
- # * <em>default</em> - The attribute value will be serialized as YAML.
- # The attribute value must respond to +to_yaml+.
- # * +Array+ - The attribute value will be serialized as YAML, but an
- # empty +Array+ will be serialized as +NULL+. The attribute value
- # must be an +Array+.
- # * +Hash+ - The attribute value will be serialized as YAML, but an
- # empty +Hash+ will be serialized as +NULL+. The attribute value
- # must be a +Hash+.
- # * +JSON+ - The attribute value will be serialized as JSON. The
- # attribute value must respond to +to_json+.
- # * <em>custom coder</em> - The attribute value will be serialized
+ # * +coder+ - The serializer implementation to use, e.g. +JSON+.
+ # * The attribute value will be serialized
# using the coder's <tt>dump(value)</tt> method, and will be
# deserialized using the coder's <tt>load(string)</tt> method. The
# +dump+ method may return +nil+ to serialize the value as +NULL+.
+ # * +type+ - Optional. What the type of the serialized object should be.
+ # * Attempting to serialize another type will raise an
+ # <tt>ActiveRecord::SerializationTypeMismatch</tt> error.
+ # * If the column is +NULL+ or starting from a new record, the default value
+ # will be set to +type.new+.
# * +yaml+ - Optional. Yaml specific options. The allowed config is:
# * +:permitted_classes+ - +Array+ with the permitted classes.
# * +:unsafe_load+ - Unsafely load YAML blobs, allow YAML to load any class.
@@ -61,30 +60,101 @@ module ClassMethods
# this option is not passed, the previous default value (if any) will
# be used. Otherwise, the default will be +nil+.
#
# ==== Choosing a serializer
#
# While any serialization format can be used, it is recommended to carefully
# evaluate the properties of a serializer before using it, as migrating to
# another format later on can be difficult.
#
# ===== Avoid accepting arbitrary types
#
# When serializing data in a column, it is heavily recommended to make sure
# only expected types will be serialized. For instance, some serializers like
# +Marshal+ or +YAML+ are capable of serializing almost any Ruby object.
#
# This can lead to unexpected types being serialized, and it is important
# that type serialization remains backward and forward compatible as long
# as some database records still contain these serialized types.
#
# class Address
# def initialize(line, city, country)
# @line, @city, @country = line, city, country
# end
# end
#
# In the above example, if any of the +Address+ attributes is renamed,
# instances that were persisted before the change will be loaded with the
# old attributes. This problem is even worse when the serialized type comes
# from a dependency which doesn't expect to be serialized this way and may
# change its internal representation without notice.
#
# As such, it is heavily recommended to instead convert these objects into
# primitives of the serialization format, for example:
#
# class Address
# attr_reader :line, :city, :country
#
# def self.load(payload)
# data = YAML.safe_load(payload)
# new(data["line"], data["city"], data["country"])
# end
#
# def self.dump(address)
# YAML.safe_dump(
# "line" => address.line,
# "city" => address.city,
# "country" => address.country,
# )
# end
#
# def initialize(line, city, country)
# @line, @city, @country = line, city, country
# end
# end
#
# class User < ActiveRecord::Base
# serialize :address, coder: Address
# end
#
# This pattern allows you to be more deliberate about what is serialized, and
# to evolve the format in a backward compatible way.
#
# ===== Ensure serialization stability
#
# Some serialization methods may accept types they don't support by
# silently casting them to other types. This can cause bugs when the
# data is deserialized.
#
# For instance the +JSON+ serializer provided in the standard library will
# silently cast unsupported types to +String+:
#
# >> JSON.parse(JSON.dump(Struct.new(:foo)))
# => "#<Class:0x000000013090b4c0>"
#
# ==== Examples
#
# ===== Serialize the +preferences+ attribute using YAML
#
# class User < ActiveRecord::Base
- # serialize :preferences
+ # serialize :preferences, coder: YAML
# end
#
# ===== Serialize the +preferences+ attribute using JSON
#
# class User < ActiveRecord::Base
- # serialize :preferences, JSON
+ # serialize :preferences, coder: JSON
# end
#
# ===== Serialize the +preferences+ +Hash+ using YAML
#
# class User < ActiveRecord::Base
- # serialize :preferences, Hash
+ # serialize :preferences, type: Hash, coder: YAML
# end
#
# ===== Serializes +preferences+ to YAML, permitting select classes
#
# class User < ActiveRecord::Base
- # serialize :preferences, yaml: { permitted_classes: [Symbol, Time] }
+ # serialize :preferences, coder: YAML, yaml: { permitted_classes: [Symbol, Time] }
# end
#
# ===== Serialize the +preferences+ attribute using a custom coder
@@ -106,35 +176,74 @@ module ClassMethods
# end
#
# class User < ActiveRecord::Base
- # serialize :preferences, Rot13JSON
+ # serialize :preferences, coder: Rot13JSON
# end
#
def serialize(attr_name, class_name_or_coder = Object, yaml: {}, **options)
# When ::JSON is used, force it to go through the Active Support JSON encoder
# to ensure special objects (e.g. Active Record models) are dumped correctly
# using the #as_json hook.
coder = if class_name_or_coder == ::JSON
Coders::JSON
elsif [:load, :dump].all? { |x| class_name_or_coder.respond_to?(x) }
class_name_or_coder
else
Coders::YAMLColumn.new(attr_name, class_name_or_coder, **yaml)
def serialize(attr_name, class_name_or_coder = nil, coder: nil, type: Object, yaml: {}, **options)
unless class_name_or_coder.nil?
if class_name_or_coder.respond_to?(:new)
ActiveRecord.deprecator.warn(<<-MSG)
Passing the class as positional argument is deprecated and will be removed in Rails 7.2.
Please pass the class as a keyword argument:
serialize #{attr_name.inspect}, type: #{class_name_or_coder.name}
MSG
type = class_name_or_coder
else
ActiveRecord.deprecator.warn(<<-MSG)
Passing the coder as positional argument is deprecated and will be removed in Rails 7.2.
Please pass the coder as a keyword argument:
serialize #{attr_name.inspect}, coder: #{class_name_or_coder}
MSG
coder = class_name_or_coder
end
end

coder ||= default_column_serializer
unless coder
raise ArgumentError, <<~MSG.squish
missing keyword: :coder
If no default coder is configured, a coder must be provided to `serialize`.
MSG
end

column_serializer = build_column_serializer(attr_name, coder, type, yaml)

attribute(attr_name, **options) do |cast_type|
- if type_incompatible_with_serialize?(cast_type, class_name_or_coder)
+ if type_incompatible_with_serialize?(cast_type, coder, type)
raise ColumnNotSerializableError.new(attr_name, cast_type)
end

cast_type = cast_type.subtype if Type::Serialized === cast_type
- Type::Serialized.new(cast_type, coder, default: columns_hash[attr_name.to_s]&.default)
+ Type::Serialized.new(cast_type, column_serializer, default: columns_hash[attr_name.to_s]&.default)
end
end

private
def type_incompatible_with_serialize?(type, class_name)
type.is_a?(ActiveRecord::Type::Json) && class_name == ::JSON ||
type.respond_to?(:type_cast_array, true) && class_name == ::Array
def build_column_serializer(attr_name, coder, type, yaml = nil)
# When ::JSON is used, force it to go through the Active Support JSON encoder
# to ensure special objects (e.g. Active Record models) are dumped correctly
# using the #as_json hook.
coder = Coders::JSON if coder == ::JSON

if coder == ::YAML
Coders::YAMLColumn.new(attr_name, type, **(yaml || {}))
elsif coder.respond_to?(:new) && !coder.respond_to?(:load)
coder.new(attr_name, type)
elsif type && type != Object
Coders::ColumnSerializer.new(attr_name, coder, type)
else
coder
end
end

def type_incompatible_with_serialize?(cast_type, coder, type)
cast_type.is_a?(ActiveRecord::Type::Json) && coder == ::JSON ||
cast_type.respond_to?(:type_cast_array, true) && type == ::Array
end
end
end
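
The `Address` coder from the documentation comment above can be exercised outside Active Record to see the round trip it describes. The sketch below is a standalone variant under a few assumptions: it uses `YAML.dump` instead of `YAML.safe_dump` so it does not depend on a recent Psych, the `nil` guards are a defensive addition that is not in the original example, and the sample address values are made up.

```ruby
require "yaml"

class Address
  attr_reader :line, :city, :country

  # Rebuild an Address from the primitive Hash stored in the column.
  def self.load(payload)
    return nil if payload.nil? # assumption: treat a NULL column as "no address"

    data = YAML.safe_load(payload)
    new(data["line"], data["city"], data["country"])
  end

  # Serialize only primitives (a Hash of Strings), never the object itself.
  def self.dump(address)
    return nil if address.nil? # assumption: store a missing address as NULL

    YAML.dump(
      "line" => address.line,
      "city" => address.city,
      "country" => address.country,
    )
  end

  def initialize(line, city, country)
    @line, @city, @country = line, city, country
  end
end

payload = Address.dump(Address.new("1 Main St", "Ottawa", "Canada"))
restored = Address.load(payload)
puts payload
puts restored.city
```

Since the payload only contains string keys and string values, `YAML.safe_load` can read it without any `permitted_classes`, and renaming an instance variable in `Address` later only requires updating `load` and `dump`.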
61 changes: 61 additions & 0 deletions activerecord/lib/active_record/coders/column_serializer.rb
@@ -0,0 +1,61 @@
# frozen_string_literal: true

module ActiveRecord
module Coders # :nodoc:
class ColumnSerializer # :nodoc:
attr_reader :object_class
attr_reader :coder

def initialize(attr_name, coder, object_class = Object)
@attr_name = attr_name
@object_class = object_class
@coder = coder
check_arity_of_constructor
end

def init_with(coder) # :nodoc:
@attr_name = coder["attr_name"]
@object_class = coder["object_class"]
@coder = coder["coder"]
end

def dump(object)
return if object.nil?

assert_valid_value(object, action: "dump")
coder.dump(object)
end

def load(payload)
if payload.nil?
if @object_class != ::Object
return @object_class.new
end
return nil
end

object = coder.load(payload)

assert_valid_value(object, action: "load")
object ||= object_class.new if object_class != Object

object
end

# Public because it's called by Type::Serialized
def assert_valid_value(object, action:)
unless object.nil? || object_class === object
raise SerializationTypeMismatch,
"can't #{action} `#{@attr_name}`: was supposed to be a #{object_class}, but was a #{object.class}. -- #{object.inspect}"
end
end

private
def check_arity_of_constructor
load(nil)
rescue ArgumentError
raise ArgumentError, "Cannot serialize #{object_class}. Classes passed to `serialize` must have a 0 argument constructor."
end
end
end
end
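
To make the behaviour of `ColumnSerializer` easier to follow, here is a simplified standalone sketch of the same idea: a wrapper that delegates to a coder and enforces an expected type. `TypeCheckedCoder` and `TypeMismatch` are hypothetical names standing in for `ActiveRecord::Coders::ColumnSerializer` and `ActiveRecord::SerializationTypeMismatch`; the constructor arity check and `init_with` are left out.

```ruby
require "json"

# Hypothetical stand-in for ActiveRecord::SerializationTypeMismatch.
class TypeMismatch < StandardError; end

# Simplified sketch of a type-checking wrapper around any load/dump coder.
class TypeCheckedCoder
  def initialize(attr_name, coder, object_class = Object)
    @attr_name = attr_name
    @coder = coder
    @object_class = object_class
  end

  def dump(object)
    return if object.nil? # nil is stored as NULL

    assert_valid_value(object, action: "dump")
    @coder.dump(object)
  end

  def load(payload)
    # A NULL column yields an empty instance of the expected type, if any.
    return(@object_class == Object ? nil : @object_class.new) if payload.nil?

    object = @coder.load(payload)
    assert_valid_value(object, action: "load")
    object ||= @object_class.new if @object_class != Object
    object
  end

  private
    def assert_valid_value(object, action:)
      return if object.nil? || @object_class === object

      raise TypeMismatch,
        "can't #{action} `#{@attr_name}`: was supposed to be a #{@object_class}, but was a #{object.class}"
    end
end

serializer = TypeCheckedCoder.new(:preferences, JSON, Hash)
p serializer.load(nil)
p serializer.load(serializer.dump({ "theme" => "dark" }))

begin
  serializer.dump([1, 2])
rescue TypeMismatch => error
  puts error.message
end
```

The wrapper keeps the coder itself unaware of the expected type, which is why `build_column_serializer` only introduces it when a `type` other than `Object` is given.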
2 changes: 1 addition & 1 deletion activerecord/lib/active_record/coders/json.rb
@@ -2,7 +2,7 @@

module ActiveRecord
module Coders # :nodoc:
- class JSON # :nodoc:
+ module JSON # :nodoc:
def self.dump(obj)
ActiveSupport::JSON.encode(obj)
end
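
The `serialize` documentation in this commit warns that the standard library `JSON` serializer silently casts unsupported types to `String`. The short sketch below reproduces that with plain Ruby and no Active Support loaded; `json_round_trip` and `Point` are hypothetical names used only for the demonstration.

```ruby
require "json"

# Round-trips a value through the standard library JSON module.
def json_round_trip(value)
  JSON.parse(JSON.dump(value))
end

# Hypothetical type with no JSON representation of its own.
Point = Struct.new(:x, :y)

restored = json_round_trip("name" => :admin, "origin" => Point.new(1, 2))

# Neither the Symbol nor the Struct survives: both come back as Strings.
puts restored["name"].class
puts restored["origin"].class
```

No error is raised at dump time, so the loss only becomes visible when the column is read back, which is the stability problem the documentation describes.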