Skip to content

Conversation

@st0012
Copy link
Collaborator

@st0012 st0012 commented Aug 21, 2020

This fixes #924 and #903

Here's an issue demonstration borrowed from #924

h_int = {abc: :abc}
=> {:abc=>:abc}
h = {k1: h_int, k2: h_int}
=> {:k1=>{:abc=>:abc}, :k2=>{:abc=>:abc}}
Raven.capture_message "Test extra", extra: {h1: h, h2: h_int}

h
=> {:k1=>{:abc=>:abc}, :k2=>"(...)"}

There are 2 factors that cause the issue:

  1. User-provided options will be embedded as part of the Event object.
  2. When we encode the Event object, processors like RemoveCircularReferences use mutation methods like Hash#merge! to save memory allocation.

First Attempt - Drop methods that mutate the data

A quick fix is to change the step 2 to just use methods like Hash#merge. But this can have a significant impact on the memory allocation (about 16~17% more) per Raven#capture_message call.

Here's an example:

Code Change

# lib/raven/processor/removecircularreferences.rb
       case value
       when Hash
         - !value.frozen? ? value.merge!(value) { |_, v| process v, visited } : value.merge(value) { |_, v| process v, visited }
         + value.merge(value) { |_, v| process v, visited }
       when Array
         - !value.frozen? ? value.map! { |v| process v, visited } : value.map { |v| process v, visited }
         + value.map { |v| process v, visited }
       else
         value
       end

Allocation before the change

Calculating -------------------------------------
              master   448.434k memsize (    17.474k retained)
                         3.632k objects (    11.000  retained)
                        50.000  strings (     4.000  retained)

Allocation after the change + comparsion

Calculating -------------------------------------
              branch   519.698k memsize (    17.474k retained)
                         3.838k objects (    11.000  retained)
                        50.000  strings (     4.000  retained)

Allocation Comparison

Comparison:
              master:     448434 allocated
              branch:     519698 allocated - 1.16x more

A More Allocation-Efficient Solution - Duplicate user-passed data (options)

By duplicating user-passed objects, we can avoid the later mutations from changing the data user has. And because user-passed data usually won't be too big, we won't allocate too much memory by doing this. Here's an example:

Code Change

# lib/raven/instance.rb
   def capture_type(obj, options = {})
      # .....
      message_or_exc = obj.is_a?(String) ? "message" : "exception"
      + options = options.deep_dup
      options[:configuration] = configuration
      # .....

Allocation before the change

Calculating -------------------------------------
              master   448.434k memsize (    17.474k retained)
                         3.632k objects (    11.000  retained)
                        50.000  strings (     4.000  retained)

Allocation after the change

Calculating -------------------------------------
              branch   448.666k memsize (    17.474k retained)
                         3.633k objects (    11.000  retained)
                        50.000  strings (     4.000  retained)

Allocation Comparison

Comparison:
              master:     448434 allocated
              branch:     448666 allocated - 1.00x more

As you can see, this approach only allocates about 2xx bytes more of memory. So it's the approach I chose in this PR.

Question - Why making our own deep_dup extension instead of using ActiveSupport's?

  1. It's not worth adding a new dependency just for one of its files. This could greatly impact the size of some non-rails apps.
  2. The implementation of deep_dup is very stable and hasn't changed for 3 years. So it's not a great risk that we'll be out-of-sync on it.

@codecov-commenter
Copy link

codecov-commenter commented Aug 21, 2020

Codecov Report

Merging #994 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #994   +/-   ##
=======================================
  Coverage   97.77%   97.77%           
=======================================
  Files          46       46           
  Lines        2153     2159    +6     
=======================================
+ Hits         2105     2111    +6     
  Misses         48       48           
Impacted Files Coverage Δ
spec/raven/event_spec.rb 97.64% <100.00%> (+0.03%) ⬆️
spec/raven/instance_spec.rb 99.47% <100.00%> (+<0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b53b00...5d64785. Read the comment docs.

@st0012 st0012 force-pushed the fix-#924 branch 4 times, most recently from 6964e6e to e419533 Compare August 21, 2020 07:58
st0012 added 3 commits August 21, 2020 18:49
The duplicable extension is copied from Rails 5.2 instead of Rails 6.0.
This is because the 5.2 version supports Ruby 2.3 but 6.0's doesn't.
Because `Raven#capture_type` used to mutate the passed options, we could
use the same options object as mock expectation and they'll always match.

But since this is not the case anymore, we need to provide a different options
object as the mock expectation.
@estepnv
Copy link

estepnv commented Aug 21, 2020

In which universe #merge, that works like shallow copy for the hash, creates more objects than deep dup?
However, deep copy solution is more preferable and safer for clients but it cannot consume less ram than shallow copy solution

@st0012
Copy link
Collaborator Author

st0012 commented Aug 22, 2020

you need to keep in mind that:

  1. The event passed into processors is an Event object converted into a hash, which is not a small one.
hash = @processors.reduce(event_hash) { |a, e| e.process(a) }

You can try something like

      event = Raven::Event.from_message("this is a test")
      event.to_hash

to see what it contains.

  1. The #process method is recursive.
        !value.frozen? ? value.merge!(value) { |_, v| process v, visited } : value.merge(value) { |_, v| process v, visited }
        !value.frozen? ? value.map! { |v| process v, visited } : value.map { |v| process v, visited }

So using #merge will actually deep copy the hash.

In contrast, this PR only deep-copies the options user passes in but not an event hash, so the allocation will be much smaller.

@estepnv
Copy link

estepnv commented Aug 22, 2020

@st0012 thanks for explanation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Raven.capture_message sometimes updates extra value

3 participants