Avoid double serialization of message data
Prior to this commit, messages with metadata were always serialized in
the following way:

  ```ruby
  Base64.strict_encode64(
    ActiveSupport::JSON.encode({
      "_rails" => {
        "message" => Base64.strict_encode64(
          serializer.dump(data)
        ),
        "pur" => "the purpose",
        "exp" => "the expiration"
      },
    })
  )
  ```

in which the message data is serialized and Base64-encoded twice: once on its
own, and again as part of the JSON envelope.

This commit changes message serialization such that, when possible, the
data is serialized and Base64-encoded only once:

  ```ruby
  Base64.strict_encode64(
    serializer.dump({
      "_rails" => {
        "data" => data,
        "pur" => "the purpose",
        "exp" => "the expiration"
      },
    })
  )
  ```
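
The size difference between the two formats can be reproduced without Rails. The following is a minimal sketch that assumes plain `JSON` stands in for both the message serializer and `ActiveSupport::JSON`; the variable names are illustrative and are not part of the commit.

```ruby
require "base64"
require "json"

data = { "content" => "x" * 100 }

# Old format: the data is serialized and Base64-encoded on its own, embedded
# as a string in the metadata envelope, and the envelope is encoded again.
legacy_message = Base64.strict_encode64(
  JSON.generate({
    "_rails" => {
      "message" => Base64.strict_encode64(JSON.generate(data)),
      "pur" => "the purpose",
    },
  })
)

# New format: the data sits directly inside the envelope, so it is
# serialized and Base64-encoded only once.
optimized_message = Base64.strict_encode64(
  JSON.generate({
    "_rails" => {
      "data" => data,
      "pur" => "the purpose",
    },
  })
)

puts "legacy:    #{legacy_message.bytesize} bytes"
puts "optimized: #{optimized_message.bytesize} bytes"
```

Base64 expands its input by about a third, so encoding the data twice compounds that overhead. That compounding accounts for most of the size difference.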

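This improves performance, and the gain grows with the size of the data. In the
benchmark below, `generate` throughput rises roughly 3x at ~100 bytes and
roughly 10x at ~1 MB, and messages shrink by about a quarter: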

**Benchmark**

  ```ruby
  # frozen_string_literal: true
  require "benchmark/ips"
  require "active_support/all"

  verifier = ActiveSupport::MessageVerifier.new("secret", serializer: JSON)

  payloads = [
    { "content" => "x" * 100 },
    { "content" => "x" * 2000 },
    { "content" => "x" * 1_000_000 },
  ]

  if ActiveSupport::Messages::Metadata.respond_to?(:use_message_serializer_for_metadata)
    ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata = true
  end

  Benchmark.ips do |x|
    payloads.each do |payload|
      x.report("generate ~#{payload["content"].size}B") do
        $generated_message = verifier.generate(payload, purpose: "x")
      end

      x.report("verify ~#{payload["content"].size}B") do
        verifier.verify($generated_message, purpose: "x")
      end
    end
  end

  puts

  puts "Message size:"
  payloads.each do |payload|
    puts "  ~#{payload["content"].size} bytes of data => " \
      "#{verifier.generate(payload, purpose: "x").size} byte message"
  end
  ```

**Before**

  ```
  Warming up --------------------------------------
        generate ~100B     1.578k i/100ms
          verify ~100B     2.506k i/100ms
       generate ~2000B   447.000  i/100ms
         verify ~2000B     1.409k i/100ms
    generate ~1000000B     1.000  i/100ms
      verify ~1000000B     6.000  i/100ms
  Calculating -------------------------------------
        generate ~100B     15.807k (± 1.8%) i/s -     80.478k in   5.093161s
          verify ~100B     25.240k (± 2.1%) i/s -    127.806k in   5.066096s
       generate ~2000B      4.530k (± 2.4%) i/s -     22.797k in   5.035398s
         verify ~2000B     14.136k (± 2.3%) i/s -     71.859k in   5.086267s
    generate ~1000000B     11.673  (± 0.0%) i/s -     59.000  in   5.060598s
      verify ~1000000B     64.372  (± 6.2%) i/s -    324.000  in   5.053304s

  Message size:
    ~100 bytes of data => 306 byte message
    ~2000 bytes of data => 3690 byte message
    ~1000000 bytes of data => 1777906 byte message
  ```

**After**

  ```
  Warming up --------------------------------------
        generate ~100B     4.689k i/100ms
          verify ~100B     3.183k i/100ms
       generate ~2000B     2.722k i/100ms
         verify ~2000B     2.066k i/100ms
    generate ~1000000B    12.000  i/100ms
      verify ~1000000B    11.000  i/100ms
  Calculating -------------------------------------
        generate ~100B     46.984k (± 1.2%) i/s -    239.139k in   5.090540s
          verify ~100B     32.043k (± 1.2%) i/s -    162.333k in   5.066903s
       generate ~2000B     27.163k (± 1.2%) i/s -    136.100k in   5.011254s
         verify ~2000B     20.726k (± 1.7%) i/s -    105.366k in   5.085442s
    generate ~1000000B    125.600  (± 1.6%) i/s -    636.000  in   5.064607s
      verify ~1000000B    122.039  (± 4.1%) i/s -    616.000  in   5.058386s

  Message size:
    ~100 bytes of data => 234 byte message
    ~2000 bytes of data => 2770 byte message
    ~1000000 bytes of data => 1333434 byte message
  ```

This optimization is only applied for recognized serializers that are
capable of serializing a `Hash`.
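
As a rough illustration of that rule, the check amounts to a flag plus a membership test against a list of known serializers. This sketch is an assumption-laden simplification: `RECOGNIZED_SERIALIZERS`, `UpcaseSerializer`, and `single_pass?` are hypothetical names, and the list is a subset of the one in the diff, which also names `ActiveSupport::JSON` and `ActiveSupport::JsonWithMarshalFallback`.

```ruby
require "json"

# Assumed subset of the recognized serializers (stdlib only).
RECOGNIZED_SERIALIZERS = [JSON, Marshal].freeze

# Hypothetical custom serializer that only knows how to handle strings, so it
# could not serialize a Hash envelope.
module UpcaseSerializer
  def self.dump(value)
    value.to_s.upcase
  end

  def self.load(value)
    value
  end
end

# The single-pass format is used only when the option is enabled AND the
# serializer is known to be able to serialize a Hash.
def single_pass?(serializer, enabled: true)
  enabled && RECOGNIZED_SERIALIZERS.include?(serializer)
end

p single_pass?(JSON)
p single_pass?(UpcaseSerializer)
p single_pass?(Marshal, enabled: false)
```

An unrecognized serializer falls back to the original double-encoded format, because nothing guarantees it can handle the envelope `Hash`.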

Additionally, because the optimization changes the message format, a
`config.active_support.use_message_serializer_for_metadata` option has
been added to disable it.  The optimization is disabled by default, but
enabled with `config.load_defaults 7.1`.

Regardless of whether the optimization is enabled, messages using either
format can still be read.
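
Reading both formats comes down to telling the two envelopes apart before deserializing. The following sketch loosely mirrors the approach in the diff and operates on the already Base64-decoded message. `read_message` and `LEGACY_PREFIX` are hypothetical names, and plain `JSON` is assumed as the serializer.

```ruby
require "base64"
require "json"
require "time"

# The legacy envelope is always JSON and always starts with this prefix.
LEGACY_PREFIX = '{"_rails":{"message":'

# Returns the payload, or nil if the purpose does not match or the message
# has expired.
def read_message(message, purpose: nil, serializer: JSON)
  if message.start_with?(LEGACY_PREFIX)
    # Legacy format: JSON envelope wrapping a Base64-encoded, serialized payload.
    fields = JSON.parse(message)["_rails"]
    return unless metadata_valid?(fields, purpose)
    serializer.load(Base64.strict_decode64(fields["message"]))
  else
    loaded = serializer.load(message)
    if loaded.is_a?(Hash) && loaded.key?("_rails")
      # New format: the payload sits directly under "data".
      fields = loaded["_rails"]
      return unless metadata_valid?(fields, purpose)
      fields["data"]
    else
      # No envelope at all: only acceptable when no purpose is expected.
      loaded if purpose.nil?
    end
  end
end

def metadata_valid?(fields, purpose)
  return false if fields["exp"] && Time.now.utc >= Time.iso8601(fields["exp"])
  fields["pur"] == purpose&.to_s
end
```

Because the legacy prefix includes the `"message"` key, a new-format envelope serialized as JSON is not mistaken for a legacy one.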

In the case of a rolling deploy of a Rails upgrade, wherein servers that
have not yet been upgraded must be able to read messages from upgraded
servers, the optimization can be disabled on first deploy, then safely
enabled on a subsequent deploy.
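
The reason for the two-step rollout can be sketched with a toy model. Everything here is a hypothetical stand-in rather than Rails code: `legacy_only_read` plays a server that has not been upgraded, `upgraded_write` plays an upgraded server, and its `optimized` flag plays the role of `use_message_serializer_for_metadata`.

```ruby
require "base64"
require "json"

# Stand-in for a not-yet-upgraded server: it only understands the legacy
# envelope, where the payload is a Base64 string under "message".
def legacy_only_read(message)
  fields = JSON.parse(message).fetch("_rails")
  encoded = fields["message"]
  return nil unless encoded # new-format message: this server cannot read it
  JSON.parse(Base64.strict_decode64(encoded))
end

# Stand-in for an upgraded server writing in either format.
def upgraded_write(data, optimized:)
  fields =
    if optimized
      { "data" => data }
    else
      { "message" => Base64.strict_encode64(JSON.generate(data)) }
    end
  JSON.generate({ "_rails" => fields.merge("pur" => "session") })
end

data = { "user_id" => 1 }

# First deploy: optimization off, so old servers can still read the output.
first_deploy = upgraded_write(data, optimized: false)
# Later deploy: optimization on, safe once no old servers remain.
second_deploy = upgraded_write(data, optimized: true)

p legacy_only_read(first_deploy)
p legacy_only_read(second_deploy)
```

In this model the old reader recovers the payload from the first deploy's message but not from the second's, which is why the option should stay off until every server has been upgraded.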
jonathanhefner committed Feb 8, 2023
1 parent ebc3b66 commit 91bb5da
Showing 10 changed files with 184 additions and 70 deletions.
14 changes: 4 additions & 10 deletions activesupport/lib/active_support/message_encryptor.rb
@@ -85,6 +85,7 @@ module ActiveSupport
#
# crypt.rotate old_secret, cipher: "aes-256-cbc"
class MessageEncryptor
include Messages::Metadata
prepend Messages::Rotator::Encryptor

cattr_accessor :use_authenticated_message_encryption, instance_accessor: false, default: false
@@ -221,13 +222,7 @@ def self.key_len(cipher = default_cipher)
end

private
def serialize(value)
@serializer.dump(value)
end

def deserialize(value)
@serializer.load(value)
end
attr_reader :serializer

def encode(data)
@url_safe ? ::Base64.urlsafe_encode64(data, padding: false) : ::Base64.strict_encode64(data)
@@ -246,7 +241,7 @@ def _encrypt(value, **metadata_options)
iv = cipher.random_iv
cipher.auth_data = "" if aead_mode?

encrypted_data = cipher.update(Messages::Metadata.wrap(serialize(value), **metadata_options))
encrypted_data = cipher.update(serialize_with_metadata(value, **metadata_options))
encrypted_data << cipher.final

parts = [encrypted_data, iv]
@@ -275,8 +270,7 @@ def _decrypt(encrypted_message, purpose)
decrypted_data = cipher.update(encrypted_data)
decrypted_data << cipher.final

message = Messages::Metadata.verify(decrypted_data, purpose)
deserialize(message) if message
deserialize_with_metadata(decrypted_data, purpose: purpose)
rescue OpenSSLCipherError, TypeError, ArgumentError, ::JSON::ParserError
raise InvalidMessage
end
11 changes: 7 additions & 4 deletions activesupport/lib/active_support/message_verifier.rb
@@ -119,6 +119,7 @@ module ActiveSupport
# @verifier = ActiveSupport::MessageVerifier.new("secret", url_safe: true)
# @verifier.generate("signed message") #=> URL-safe string
class MessageVerifier
include Messages::Metadata
prepend Messages::Rotator::Verifier

class InvalidSignature < StandardError; end
@@ -198,8 +199,7 @@ def verified(signed_message, purpose: nil, **)
data, digest = get_data_and_digest_from(signed_message)
if digest_matches_data?(digest, data)
begin
message = Messages::Metadata.verify(decode(data), purpose)
@serializer.load(message) if message
deserialize_with_metadata(decode(data), purpose: purpose)
rescue ArgumentError => argument_error
return if argument_error.message.include?("invalid base64")
raise
@@ -274,11 +274,14 @@ def verify(*args, **options)
# specified when verifying the message; otherwise, verification will fail.
# (See #verified and #verify.)
def generate(value, expires_at: nil, expires_in: nil, purpose: nil)
data = encode(Messages::Metadata.wrap(@serializer.dump(value), expires_at: expires_at, expires_in: expires_in, purpose: purpose))
"#{data}#{SEPARATOR}#{generate_digest(data)}"
data = encode(serialize_with_metadata(value, expires_at: expires_at, expires_in: expires_in, purpose: purpose))
digest = generate_digest(data)
data << SEPARATOR << digest
end

private
attr_reader :serializer

def encode(data)
@url_safe ? Base64.urlsafe_encode64(data, padding: false) : Base64.strict_encode64(data)
end
122 changes: 70 additions & 52 deletions activesupport/lib/active_support/messages/metadata.rb
@@ -1,83 +1,101 @@
# frozen_string_literal: true

require "time"
require "active_support/json"

module ActiveSupport
module Messages # :nodoc:
class Metadata # :nodoc:
def initialize(message, expires_at = nil, purpose = nil)
@message, @purpose = message, purpose
@expires_at = expires_at.is_a?(String) ? parse_expires_at(expires_at) : expires_at
end
module Metadata # :nodoc:
singleton_class.attr_accessor :use_message_serializer_for_metadata

def as_json(options = {})
{ _rails: { message: @message, exp: @expires_at, pur: @purpose } }
end
ENVELOPE_SERIALIZERS = [
::JSON,
ActiveSupport::JSON,
ActiveSupport::JsonWithMarshalFallback,
Marshal,
]

class << self
def wrap(message, expires_at: nil, expires_in: nil, purpose: nil)
if expires_at || expires_in || purpose
JSON.encode new(encode(message), pick_expiry(expires_at, expires_in), purpose)
private
def serialize_with_metadata(data, **metadata)
has_metadata = metadata.any? { |k, v| v }

if has_metadata && !use_message_serializer_for_metadata?
data_string = serialize_to_json_safe_string(data)
envelope = wrap_in_metadata_envelope({ "message" => data_string }, **metadata)
ActiveSupport::JSON.encode(envelope)
else
message
data = wrap_in_metadata_envelope({ "data" => data }, **metadata) if has_metadata
serializer.dump(data)
end
end

def verify(message, purpose)
extract_metadata(message).verify(purpose)
end

private
def pick_expiry(expires_at, expires_in)
if expires_at
expires_at.utc.iso8601(3)
elsif expires_in
Time.now.utc.advance(seconds: expires_in).iso8601(3)
end
end

def extract_metadata(message)
begin
data = JSON.decode(message) if message.start_with?('{"_rails":')
rescue ::JSON::JSONError
end

if data
new(decode(data["_rails"]["message"]), data["_rails"]["exp"], data["_rails"]["pur"])
def deserialize_with_metadata(message, **expected_metadata)
if dual_serialized_metadata_envelope_json?(message)
envelope = ActiveSupport::JSON.decode(message)
extracted = extract_from_metadata_envelope(envelope, **expected_metadata)
deserialize_from_json_safe_string(extracted["message"]) if extracted
else
deserialized = serializer.load(message)
if metadata_envelope?(deserialized)
extracted = extract_from_metadata_envelope(deserialized, **expected_metadata)
extracted["data"] if extracted
else
new(message)
deserialized if expected_metadata.none? { |k, v| v }
end
end
end

def encode(message)
::Base64.strict_encode64(message)
end
def use_message_serializer_for_metadata?
Metadata.use_message_serializer_for_metadata && Metadata::ENVELOPE_SERIALIZERS.include?(serializer)
end

def decode(message)
::Base64.strict_decode64(message)
end
end
def wrap_in_metadata_envelope(hash, expires_at: nil, expires_in: nil, purpose: nil)
expiry = pick_expiry(expires_at, expires_in)
hash["exp"] = expiry if expiry
hash["pur"] = purpose.to_s if purpose
{ "_rails" => hash }
end

def verify(purpose)
@message if match?(purpose) && fresh?
end
def extract_from_metadata_envelope(envelope, purpose: nil)
hash = envelope["_rails"]
return if hash["exp"] && Time.now.utc >= parse_expiry(hash["exp"])
return if hash["pur"] != purpose&.to_s
hash
end

private
def match?(purpose)
@purpose.to_s == purpose.to_s
def metadata_envelope?(object)
object.is_a?(Hash) && object.key?("_rails")
end

def dual_serialized_metadata_envelope_json?(string)
string.start_with?('{"_rails":{"message":')
end

def fresh?
@expires_at.nil? || Time.now.utc < @expires_at
def pick_expiry(expires_at, expires_in)
if expires_at
expires_at.utc.iso8601(3)
elsif expires_in
Time.now.utc.advance(seconds: expires_in).iso8601(3)
end
end

def parse_expires_at(expires_at)
if ActiveSupport.use_standard_json_time_format
def parse_expiry(expires_at)
if !expires_at.is_a?(String)
expires_at
elsif ActiveSupport.use_standard_json_time_format
Time.iso8601(expires_at)
else
Time.parse(expires_at)
end
end

def serialize_to_json_safe_string(data)
::Base64.strict_encode64(serializer.dump(data))
end

def deserialize_from_json_safe_string(string)
serializer.load(::Base64.strict_decode64(string))
end
end
end
end
7 changes: 7 additions & 0 deletions activesupport/lib/active_support/railtie.rb
@@ -192,5 +192,12 @@ class Railtie < Rails::Railtie # :nodoc:
end
end
end

initializer "active_support.set_use_message_serializer_for_metadata" do |app|
config.after_initialize do
ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata =
app.config.active_support.use_message_serializer_for_metadata
end
end
end
end
32 changes: 28 additions & 4 deletions activesupport/test/messages/message_metadata_tests.rb
@@ -2,6 +2,7 @@

require "active_support/json"
require "active_support/time"
require "active_support/messages/metadata"

module MessageMetadataTests
extend ActiveSupport::Concern
@@ -89,6 +90,17 @@ module MessageMetadataTests
codec = make_codec(serializer: ActiveSupport::MessageEncryptor::NullSerializer)
assert_roundtrip "a string", codec, { purpose: "x", expires_in: 1.year }, { purpose: "x" }
end

test "messages are readable regardless of use_message_serializer_for_metadata" do
each_scenario do |data, codec|
message = encode(data, codec, purpose: "x")
message_setting = ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata

using_message_serializer_for_metadata(!message_setting) do
assert_equal data, decode(message, codec, purpose: "x")
end
end
end
end

private
@@ -116,11 +128,23 @@ def self.load(value)
["a string", 123, Time.local(2004), { "key" => "value" }],
]

def using_message_serializer_for_metadata(value = true)
original = ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata
ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata = value
yield
ensure
ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata = original
end

def each_scenario
SERIALIZERS.each do |serializer|
codec = make_codec(serializer: serializer)
DATA.each do |data|
yield data, codec
[false, true].each do |use_message_serializer_for_metadata|
using_message_serializer_for_metadata(use_message_serializer_for_metadata) do
SERIALIZERS.each do |serializer|
codec = make_codec(serializer: serializer)
DATA.each do |data|
yield data, codec
end
end
end
end
end
15 changes: 15 additions & 0 deletions activesupport/test/messages/message_verifier_metadata_test.rb
@@ -56,6 +56,21 @@ class MessageVerifierMetadataTest < ActiveSupport::TestCase
end
end

test "messages are readable by legacy versions when use_message_serializer_for_metadata = false" do
# Message generated by Rails 7.0 using:
#
# verifier = ActiveSupport::MessageVerifier.new("secret", serializer: JSON)
# legacy_message = verifier.generate("legacy", purpose: "test", expires_at: Time.utc(3000))
#
legacy_message = "eyJfcmFpbHMiOnsibWVzc2FnZSI6IklteGxaMkZqZVNJPSIsImV4cCI6IjMwMDAtMDEtMDFUMDA6MDA6MDAuMDAwWiIsInB1ciI6InRlc3QifX0=--81b11c317dba91cedd86ab79b7d7e68de8d290b3"

verifier = ActiveSupport::MessageVerifier.new("secret", serializer: JSON)

using_message_serializer_for_metadata(false) do
assert_equal legacy_message, verifier.generate("legacy", purpose: "test", expires_at: Time.utc(3000))
end
end

private
def make_codec(**options)
ActiveSupport::MessageVerifier.new("secret", **options)
15 changes: 15 additions & 0 deletions guides/source/configuring.md
@@ -2195,6 +2195,21 @@ The default value depends on the `config.load_defaults` target version:
| (original) | `false` |
| 5.2 | `true` |

#### `config.active_support.use_message_serializer_for_metadata`

When `true`, enables a performance optimization that serializes message data and
metadata together. This changes the message format, so messages serialized this
way cannot be read by older (< 7.1) versions of Rails. However, messages that
use the old format can still be read, regardless of whether this optimization is
enabled.

The default value depends on the `config.load_defaults` target version:

| Starting with version | The default value is |
| --------------------- | -------------------- |
| (original) | `false` |
| 7.1 | `true` |

#### `config.active_support.cache_format_version`

Specifies which version of the cache serializer to use. Possible values are `6.1` and `7.0`.
1 change: 1 addition & 0 deletions railties/lib/rails/application/configuration.rb
@@ -306,6 +306,7 @@ def load_defaults(target_version)
if respond_to?(:active_support)
active_support.default_message_encryptor_serializer = :json
active_support.default_message_verifier_serializer = :json
active_support.use_message_serializer_for_metadata = true
active_support.raise_on_invalid_cache_expiration_time = true
end

@@ -99,6 +99,17 @@
#
# For detailed migration steps, check out https://guides.rubyonrails.org/v7.1/upgrading_ruby_on_rails.html#new-activesupport-messageverifier-default-serializer

# Enable a performance optimization that serializes message data and metadata
# together. This changes the message format, so messages serialized this way
# cannot be read by older versions of Rails. However, messages that use the old
# format can still be read, regardless of whether this optimization is enabled.
#
# To perform a rolling deploy of a Rails 7.1 upgrade, wherein servers that have
# not yet been upgraded must be able to read messages from upgraded servers,
# leave this optimization off on the first deploy, then enable it on a
# subsequent deploy.
# Rails.application.config.active_support.use_message_serializer_for_metadata = true

# Set the maximum size for Rails log files.
#
# `config.load_defaults 7.1` does not set this value for environments other than