Support replacing cache compressor #48451

Merged
merged 1 commit into from Jul 26, 2023
54 changes: 44 additions & 10 deletions activesupport/CHANGELOG.md
@@ -1,3 +1,39 @@
* Active Support cache stores now support replacing the default compressor via
a `:compressor` option. The specified compressor must respond to `deflate`
and `inflate`. For example:

```ruby
module MyCompressor
  def self.deflate(string)
    # compression logic...
  end

  def self.inflate(compressed)
    # decompression logic...
  end
end

config.cache_store = :redis_cache_store, { compressor: MyCompressor }
```

*Jonathan Hefner*
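
As a minimal sketch of the `deflate`/`inflate` contract, the module below compresses with gzip from Ruby's standard `zlib` library. The name `GzipCompressor` and the choice of gzip are illustrative assumptions, not part of this PR. Because gzip output begins with `"\x1f\x8b"` rather than Zlib's `"\x78"`, `inflate` can route entries written by the default compressor to `Zlib.inflate`, as the `:compressor` option documentation in this PR suggests:

```ruby
require "zlib"

# Hypothetical compressor for illustration; the `:compressor` option only
# requires that the object respond to `deflate` and `inflate`.
module GzipCompressor
  ZLIB_SIGNATURE = "\x78".b.freeze

  def self.deflate(string)
    Zlib.gzip(string)
  end

  def self.inflate(compressed)
    if compressed.start_with?(ZLIB_SIGNATURE)
      # Entry written earlier by the default Zlib compressor.
      Zlib.inflate(compressed)
    else
      Zlib.gunzip(compressed)
    end
  end
end

# Assumed Rails configuration, mirroring the entry above:
# config.cache_store = :redis_cache_store, { compressor: GzipCompressor }
```

The signature check is what lets a store switch compressors without invalidating entries that were compressed before the switch.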

* Active Support cache stores now support a `:serializer` option. Similar to
the `:coder` option, serializers must respond to `dump` and `load`. However,
serializers are only responsible for serializing a cached value, whereas
coders are responsible for serializing the entire `ActiveSupport::Cache::Entry`
instance. Additionally, the output from serializers can be automatically
compressed, whereas coders are responsible for their own compression.

Specifying a serializer instead of a coder also enables performance
optimizations, including the bare string optimization introduced by cache
format version 7.1.

The `:serializer` and `:coder` options are mutually exclusive. Specifying
both will raise an `ArgumentError`.

*Jonathan Hefner*
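
To illustrate the `dump`/`load` contract, here is a minimal sketch of a serializer built on Ruby's standard `json` library. `JsonSerializer` is a hypothetical name, and JSON is an assumption chosen for brevity: it only round-trips JSON-compatible values (for example, symbol keys come back as strings). Because it is passed as a `:serializer` rather than a `:coder`, it only handles the cached value, not the whole `ActiveSupport::Cache::Entry`.

```ruby
require "json"

# Hypothetical serializer for illustration; the `:serializer` option only
# requires that the object respond to `dump` and `load`.
module JsonSerializer
  def self.dump(value)
    JSON.generate(value)
  end

  def self.load(dumped)
    JSON.parse(dumped)
  end
end

# Assumed Rails configuration:
# config.cache_store = :redis_cache_store, { serializer: JsonSerializer }
```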

* Fix `ActiveSupport::Inflector.humanize(nil)` raising ``NoMethodError: undefined method `end_with?' for nil:NilClass``.

*James Robinson*
@@ -164,25 +200,23 @@
read caches from upgraded servers, leave the cache format unchanged on the
first deploy, then enable the `7.1` cache format on a subsequent deploy.

The new `:message_pack` cache coder also includes this optimization.

*Jonathan Hefner*

* The `:coder` option for Active Support cache stores now supports a
`:message_pack` value:
* Active Support cache stores can now use a preconfigured serializer based on
`ActiveSupport::MessagePack` via the `:serializer` option:

```ruby
config.cache_store = :redis_cache_store, { coder: :message_pack }
config.cache_store = :redis_cache_store, { serializer: :message_pack }
```

The `:message_pack` coder can reduce cache entry sizes and improve
The `:message_pack` serializer can reduce cache entry sizes and improve
performance, but requires the [`msgpack` gem](https://rubygems.org/gems/msgpack)
(>= 1.7.0).

The `:message_pack` coder can read cache entries written by the default
coder, and the default coder can now read entries written by the
`:message_pack` coder. These behaviors make it easy to migrate between
coders without invalidating the entire cache.
The `:message_pack` serializer can read cache entries written by the default
serializer, and the default serializer can now read entries written by the
`:message_pack` serializer. These behaviors make it easy to migrate between
serializers without invalidating the entire cache.

*Jonathan Hefner*

97 changes: 85 additions & 12 deletions activesupport/lib/active_support/cache.rb
@@ -1,12 +1,14 @@
# frozen_string_literal: true

require "zlib"
require "active_support/core_ext/array/extract_options"
require "active_support/core_ext/enumerable"
require "active_support/core_ext/module/attribute_accessors"
require "active_support/core_ext/numeric/bytes"
require "active_support/core_ext/object/to_param"
require "active_support/core_ext/object/try"
require "active_support/core_ext/string/inflections"
require_relative "cache/coder"
require_relative "cache/entry"
require_relative "cache/serializer_with_fallback"

@@ -25,11 +27,13 @@ module Cache
:coder,
:compress,
:compress_threshold,
:compressor,
:expire_in,
:expired_in,
:expires_in,
:namespace,
:race_condition_ttl,
:serializer,
:skip_nil,
]

@@ -249,27 +253,79 @@ def retrieve_pool_options(options)
# Sets the namespace for the cache. This option is especially useful if
# your application shares a cache with other applications.
#
# [+:serializer+]
# The serializer for cached values. Must respond to +dump+ and +load+.
#
# The default serializer depends on the cache format version (set via
# +config.active_support.cache_format_version+ when using Rails). The
# default serializer for each format version includes a fallback
# mechanism to deserialize values from any format version. This behavior
# makes it easy to migrate between format versions without invalidating
# the entire cache.
#
# You can also specify <tt>serializer: :message_pack</tt> to use a
# preconfigured serializer based on ActiveSupport::MessagePack. The
# +:message_pack+ serializer includes the same deserialization fallback
# mechanism, allowing easy migration from (or to) the default
# serializer. The +:message_pack+ serializer may improve performance,
# but it requires the +msgpack+ gem.
#
# [+:compressor+]
# The compressor for serialized cache values. Must respond to +deflate+
# and +inflate+.
#
# The default compressor is +Zlib+. To define a new custom compressor
# that also decompresses old cache entries, you can check compressed
# values for Zlib's <tt>"\x78"</tt> signature:
#
#   module MyCompressor
#     def self.deflate(dumped)
#       # compression logic... (make sure result does not start with "\x78"!)
#     end
#
#     def self.inflate(compressed)
#       if compressed.start_with?("\x78")
#         Zlib.inflate(compressed)
#       else
#         # decompression logic...
#       end
#     end
#   end
#
#   ActiveSupport::Cache.lookup_store(:redis_cache_store, compressor: MyCompressor)
#
# [+:coder+]
# Replaces the default serializer for cache entries. +coder+ must
# respond to +dump+ and +load+. Using a custom coder disables automatic
# compression.
# The coder for serializing and (optionally) compressing cache entries.
# Must respond to +dump+ and +load+.
#
# The default coder composes the serializer and compressor, and includes
# some performance optimizations. If you only need to override the
# serializer or compressor, you should specify the +:serializer+ or
# +:compressor+ options instead.
#
# Alternatively, you can specify <tt>coder: :message_pack</tt> to use a
# preconfigured coder based on ActiveSupport::MessagePack that supports
# automatic compression and includes a fallback mechanism to load old
# cache entries from the default coder. However, this option requires
# the +msgpack+ gem.
# The +:coder+ option is mutually exclusive with the +:serializer+ and
# +:compressor+ options. Specifying them together will raise an
# +ArgumentError+.
#
# Any other specified options are treated as default options for the
# relevant cache operations, such as #read, #write, and #fetch.
def initialize(options = nil)
@options = options ? normalize_options(options) : {}
@options = options ? validate_options(normalize_options(options)) : {}

@options[:compress] = true unless @options.key?(:compress)
@options[:compress_threshold] ||= DEFAULT_COMPRESS_LIMIT

@coder = @options.delete(:coder) { default_coder } || :passthrough
@coder = Cache::SerializerWithFallback[@coder] if @coder.is_a?(Symbol)
@coder = @options.delete(:coder) do
legacy_serializer = Cache.format_version < 7.1 && !@options[:serializer]
serializer = @options.delete(:serializer) || default_serializer
serializer = Cache::SerializerWithFallback[serializer] if serializer.is_a?(Symbol)
compressor = @options.delete(:compressor) { Zlib }

Cache::Coder.new(serializer, compressor, legacy_serializer: legacy_serializer)
end

@coder ||= Cache::SerializerWithFallback[:passthrough]
Contributor

@jonathanhefner Hi there! 👋🏻
Sorry to ping you on an already merged PR, but hopefully you'll see this! 🤞🏻

This change is breaking some stuff here at Shopify. We have some places that initialize a memory store this way:

ActiveSupport::Cache::MemoryStore.new(coder: :passthrough)

I understand I can fix our code by changing it to this:

ActiveSupport::Cache::MemoryStore.new(coder: ActiveSupport::Cache::SerializerWithFallback[:passthrough])

However, do you think this code could stay backwards compatible and allow setting a coder as a symbol?
One could for example add this line:

@coder = Cache::SerializerWithFallback[@coder] if @coder.is_a?(Symbol)

(Note this line existed as-is at LOC 272 before this PR.)

What do you think?

Contributor

@davidstosik we should fix our code. Passing a coder as a symbol makes no sense.

I can help you if needed.

Contributor

I kinda misread your message. I see that passing a symbol was supported for a bit; I need to check whether it was supported in 7.0. If it wasn't, then we should fix our code. If it was, we should fix Rails.

Contributor

OK, so this wasn't supported in 7.0; it was added in a 7.1 alpha and removed before release, so let's fix the Shopify side.

Member Author

For reference, the way to achieve this behavior in 7.0 was to specify `coder: nil`, and that still works in 7.1. However, I don't think that is documented anywhere. Is that something we officially support? If so, I can add documentation and a regression test.

Member

> Is that something we officially support?

I think it would make sense. It's a fairly niche use case but can be useful with MemoryStore as a perf optimization if you know your values aren't mutable, or with MemcachedStore if you wish to use a serializer at the Dalli level.

Contributor

@casperisfine Thanks for digging that up! I already have a fix for our code so I think ignoring my message above is okay. 👍🏻


@coder_supports_compression = @coder.respond_to?(:dump_compressed)
end

@@ -686,7 +742,7 @@ def clear(options = nil)
end

private
def default_coder
def default_serializer
case Cache.format_version
when 6.1
ActiveSupport.deprecator.warn <<~EOM
@@ -842,6 +898,23 @@ def normalize_options(options)
options
end

def validate_options(options)
  if options.key?(:coder) && options[:serializer]
    raise ArgumentError, "Cannot specify :serializer and :coder options together"
  end

  if options.key?(:coder) && options[:compressor]
    raise ArgumentError, "Cannot specify :compressor and :coder options together"
  end

  if Cache.format_version < 7.1 && !options[:serializer] && options[:compressor]
    raise ArgumentError, "Cannot specify :compressor option when using" \
      " default serializer and cache format version is < 7.1"
  end

  options
end
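
The option checks in `validate_options` can be exercised outside Rails with a standalone sketch. The names `check_cache_options` and `rejected?`, and the `format_version:` keyword, are assumptions standing in for the private method and `Cache.format_version`. Note that the `:coder` checks use `key?`, so even `coder: nil` conflicts with `:serializer` or `:compressor`.

```ruby
require "zlib"

# Hypothetical stand-in for the private validate_options method; the
# format_version keyword replaces the call to Cache.format_version.
def check_cache_options(options, format_version: 7.1)
  if options.key?(:coder) && options[:serializer]
    raise ArgumentError, "Cannot specify :serializer and :coder options together"
  end

  if options.key?(:coder) && options[:compressor]
    raise ArgumentError, "Cannot specify :compressor and :coder options together"
  end

  if format_version < 7.1 && !options[:serializer] && options[:compressor]
    raise ArgumentError, "Cannot specify :compressor option when using" \
      " default serializer and cache format version is < 7.1"
  end

  options
end

# Returns true when the option combination raises ArgumentError.
def rejected?(options, format_version = 7.1)
  check_cache_options(options, format_version: format_version)
  false
rescue ArgumentError
  true
end

rejected?({ coder: nil, compressor: Zlib })          # rejected: the :coder key is present
rejected?({ serializer: Marshal, compressor: Zlib }) # accepted
rejected?({ compressor: Zlib }, 7.0)                 # rejected: default serializer under an older format
```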

# Expands and namespaces the cache key.
# Raises an exception when the key is +nil+ or an empty string.
# May be overridden by cache stores to do additional normalization.
123 changes: 123 additions & 0 deletions activesupport/lib/active_support/cache/coder.rb
@@ -0,0 +1,123 @@
# frozen_string_literal: true

require_relative "entry"

module ActiveSupport
  module Cache
    class Coder # :nodoc:
Member Author

It might be worth making this public to provide a migration path from the :coder option to the :serializer option. But that can be done in a follow-up PR.

      def initialize(serializer, compressor, legacy_serializer: false)
        @serializer = serializer
        @compressor = compressor
        @legacy_serializer = legacy_serializer
      end

      def dump(entry)
        return @serializer.dump(entry) if @legacy_serializer

        dump_compressed(entry, Float::INFINITY)
      end

      def dump_compressed(entry, threshold)
        return @serializer.dump_compressed(entry, threshold) if @legacy_serializer

        # If value is a string with a supported encoding, use it as the payload
        # instead of passing it through the serializer.
        if type = type_for_string(entry.value)
          payload = entry.value.b
        else
          type = OBJECT_DUMP_TYPE
          payload = @serializer.dump(entry.value)
        end

        if compressed = try_compress(payload, threshold)
          payload = compressed
          type = type | COMPRESSED_FLAG
        end

        expires_at = entry.expires_at || -1.0

        version = dump_version(entry.version) if entry.version
        version_length = version&.bytesize || -1

        packed = SIGNATURE.b
        packed << [type, expires_at, version_length].pack(PACKED_TEMPLATE)
        packed << version if version
        packed << payload
      end

      def load(dumped)
        return @serializer.load(dumped) if !signature?(dumped)

        type = dumped.unpack1(PACKED_TYPE_TEMPLATE)
        expires_at = dumped.unpack1(PACKED_EXPIRES_AT_TEMPLATE)
        version_length = dumped.unpack1(PACKED_VERSION_LENGTH_TEMPLATE)

        expires_at = nil if expires_at < 0
        version = load_version(dumped.byteslice(PACKED_VERSION_INDEX, version_length)) if version_length >= 0
        payload = dumped.byteslice((PACKED_VERSION_INDEX + [version_length, 0].max)..)

        payload = @compressor.inflate(payload) if type & COMPRESSED_FLAG > 0

        if string_encoding = STRING_ENCODINGS[type & ~COMPRESSED_FLAG]
          value = payload.force_encoding(string_encoding)
        else
          value = @serializer.load(payload)
        end

        Cache::Entry.new(value, version: version, expires_at: expires_at)
      end

      private
        SIGNATURE = "\x00\x11".b.freeze

        OBJECT_DUMP_TYPE = 0x01

        STRING_ENCODINGS = {
          0x02 => Encoding::UTF_8,
          0x03 => Encoding::BINARY,
          0x04 => Encoding::US_ASCII,
        }

        COMPRESSED_FLAG = 0x80

        PACKED_TEMPLATE = "CEl<"
        PACKED_TYPE_TEMPLATE = "@#{SIGNATURE.bytesize}C"
        PACKED_EXPIRES_AT_TEMPLATE = "@#{[0].pack(PACKED_TYPE_TEMPLATE).bytesize}E"
        PACKED_VERSION_LENGTH_TEMPLATE = "@#{[0].pack(PACKED_EXPIRES_AT_TEMPLATE).bytesize}l<"
        PACKED_VERSION_INDEX = [0].pack(PACKED_VERSION_LENGTH_TEMPLATE).bytesize

        MARSHAL_SIGNATURE = "\x04\x08".b.freeze

        def signature?(dumped)
          dumped.is_a?(String) && dumped.start_with?(SIGNATURE)
        end

        def type_for_string(value)
          STRING_ENCODINGS.key(value.encoding) if value.instance_of?(String)
        end

        def try_compress(string, threshold)
          if @compressor && string.bytesize >= threshold
            compressed = @compressor.deflate(string)
            compressed if compressed.bytesize < string.bytesize
          end
        end

        def dump_version(version)
          if version.encoding != Encoding::UTF_8 || version.start_with?(MARSHAL_SIGNATURE)
            Marshal.dump(version)
          else
            version.b
          end
        end

        def load_version(dumped_version)
          if dumped_version.start_with?(MARSHAL_SIGNATURE)
            Marshal.load(dumped_version)
          else
            dumped_version.force_encoding(Encoding::UTF_8)
          end
        end
    end
  end
end
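
The binary layout used by `dump_compressed` and `load` can be explored in isolation. The sketch below reuses the signature, the `"CEl<"` template, and the offset arithmetic from the class above, but the `pack_entry` and `unpack_entry` helpers and the `*_OFFSET` constant names are hypothetical, and the sketch skips serialization, compression, and version marshalling. Under that template the header is 15 bytes: a 2-byte signature, a 1-byte type, an 8-byte little-endian double for `expires_at`, and a 4-byte little-endian signed integer for the version length.

```ruby
SIGNATURE = "\x00\x11".b.freeze
HEADER_TEMPLATE = "CEl<" # type (uint8), expires_at (float64 LE), version length (int32 LE)

TYPE_OFFSET = SIGNATURE.bytesize              # 2
EXPIRES_AT_OFFSET = TYPE_OFFSET + 1           # 3
VERSION_LENGTH_OFFSET = EXPIRES_AT_OFFSET + 8 # 11
VERSION_OFFSET = VERSION_LENGTH_OFFSET + 4    # 15

# Hypothetical helper: packs an already-serialized payload behind the header.
def pack_entry(type:, payload:, expires_at: nil, version: nil)
  packed = SIGNATURE.b # String#b returns an unfrozen binary copy
  packed << [type, expires_at || -1.0, version ? version.bytesize : -1].pack(HEADER_TEMPLATE)
  packed << version.b if version
  packed << payload.b
end

# Hypothetical helper: reads each field back by absolute offset ("@N").
def unpack_entry(dumped)
  type = dumped.unpack1("@#{TYPE_OFFSET}C")
  expires_at = dumped.unpack1("@#{EXPIRES_AT_OFFSET}E")
  version_length = dumped.unpack1("@#{VERSION_LENGTH_OFFSET}l<")

  version = dumped.byteslice(VERSION_OFFSET, version_length) if version_length >= 0
  payload = dumped.byteslice((VERSION_OFFSET + [version_length, 0].max)..)

  { type: type, expires_at: (expires_at if expires_at >= 0), version: version, payload: payload }
end

packed = pack_entry(type: 0x02, payload: "hello", expires_at: 1_700_000_000.5, version: "v1")
entry = unpack_entry(packed)
```

Here `0x02` is the type the class above assigns to UTF-8 strings. A negative `expires_at` or version length acts as the "absent" marker, which is why both are mapped back to `nil` on the way out.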
2 changes: 1 addition & 1 deletion activesupport/lib/active_support/cache/mem_cache_store.rb
@@ -222,7 +222,7 @@ def stats
end

private
def default_coder
def default_serializer
if Cache.format_version == 6.1
ActiveSupport.deprecator.warn <<~EOM
Support for `config.active_support.cache_format_version = 6.1` has been deprecated and will be removed in Rails 7.2.
5 changes: 1 addition & 4 deletions activesupport/lib/active_support/cache/memory_store.rb
@@ -72,6 +72,7 @@ def load_value(string)

def initialize(options = nil)
options ||= {}
options[:coder] = DupCoder unless options.key?(:coder) || options.key?(:serializer)
# Disable compression by default.
options[:compress] ||= false
super(options)
@@ -189,10 +190,6 @@ def synchronize(&block) # :nodoc:
private
PER_ENTRY_OVERHEAD = 240

def default_coder
DupCoder
end

def cached_size(key, payload)
key.to_s.bytesize + payload.bytesize + PER_ENTRY_OVERHEAD
end