Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return a copy of the cache entry when local_cache exists: #36656

Merged
merged 1 commit into from Jul 15, 2019

Conversation

Edouard-chin
Copy link
Member

@Edouard-chin Edouard-chin commented Jul 11, 2019

Return a copy of the cache entry when local_cache exists:

  • When the local cache exists (during the request lifecycle), the
    entry returned from the LocalStore is passed as a reference which
    means mutable object can accidentaly get modified.

    This behaviour seems unnecessarily unsafe and is prone to
    issues like it happened in our application.

    This patch dup the Entry returned from the cache and dup it's
    internal value.

cc/ @rafaelfranca @casperisfine

@data.fetch(key) { @data[key] = yield }
entry = @data.fetch(key) { @data[key] = yield }
dup_entry = entry.dup
dup_entry&.dup_value!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could dup_entry ever be nil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case you try to access a key that doesn't exist

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I'm stuck in the past, turns out nil.dup actually exist on modern rubies TIL.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would have to be a deep copy. .dup alone will still let you modify nested objects if present

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because dup_value! does a Marshal.load

@value = Marshal.load(Marshal.dump(@value))

- When the local cache exists (during the request lifecycle), the
  entry returned from the LocalStore is passed as a reference which
  means mutable object can accidentaly get modified.

  This behaviour seems unnecessarily unsafe and is prone to
  issues like it happened in our application.

  This patch dup the `Entry` returned from the cache and dup it's
  internal value.
@@ -75,7 +75,10 @@ def delete_entry(key, options)
end

def fetch_entry(key, options = nil) # :nodoc:
@data.fetch(key) { @data[key] = yield }
entry = @data.fetch(key) { @data[key] = yield }
dup_entry = entry.dup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why you need to dup the entry itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the @data hash keeps a reference to the entry, if I don't dup it, user can modify the underlying value which would affect the cache.

@rafaelfranca rafaelfranca merged commit f30f76a into rails:master Jul 15, 2019
@Edouard-chin Edouard-chin deleted the ec-local-cache-reference branch July 15, 2019 21:01
casperisfine pushed a commit to Shopify/rails that referenced this pull request Apr 19, 2021
The local cache need to protect against mutations of both
the initial received value, and the value returned.

See:

  - rails#36656
  - rails#37587

Because of this, the overhead of the local cache end up being
quite significant. On a single read, the value will be deep duped
twice. So unless the value is one of the type benefiting from a
fast path, we'll have to do two `Marshal.load(Marshal.dump(value))`
roundtrips, which for large values can be very expensive.

By using a specialized `LocalEntry` type instead, we can store
the `Marshal` payload rather than the original value. This way
the overhead is reduced to a single `Marshal.dump` on writes
and a single `Marshal.load` on reads.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants