Speed up recursion guard #8

misfo · 2013-09-17T21:52:43Z

A simple benchmarking script shows IceNine.deep_freeze running ~30x faster on nested Hashes/Arrays.

We ran into some performance issues at work with using IceNine.deep_freeze and include Adamantium, so we had to switch to using shallow .freezes and include Adamantium::Flat. I did some profiling on deep_freeze and it looked like the recursion guard code was taking up most of the time. In particular the call to Kernal#caller was taking up a big percentage of the running time. RecursionGuard.guarded_objects was taking a smaller, but significant, percentage as well.

This commit explicitly passes the recursion guarding down the call stack. The code is, admittedly, not as clean as it was before, but it makes for acceptable performance of deep freezes on large data structures (in our case, the large data structures were deserialized responses from a JSON API).

The commit does add an additional recursion_guard argument to a few public methods. If you don't want those optional arguments as part of the public API, I could probably refactor those out somehow...

A simple benchmarking script shows IceNine.deep_freeze running ~30x faster on nested Hashes/Arrays: https://gist.github.com/misfo/6600943

misfo · 2013-09-17T21:54:40Z

By the way, I love the library. We actually found (and fixed) a bug in another library because we were deep freezing our data

dkubb · 2013-09-17T22:09:56Z

@misfo thanks for this, I'll check it out in more detail.

At first glance, my largest concern is that Set is not really thread-safe. I had considered submitting a ThreadSafe::Set implementation to https://github.com/headius/thread_safe, and this may prompt me to do it sooner. Alternatively, a ThreadSafe::Hash could probably be used instead.

misfo · 2013-09-17T22:58:42Z

But IceNine.deep_freeze is single-threaded anyways, right? In other words, instances of RecursionGuard don't have to be thread safe because IceNine never shares them between threads (i.e. they're already thread local).

Couldn't we just make RecursionGuard @api private or document that it's not threadsafe?

dkubb · 2013-09-18T06:35:19Z

@misfo I think you're right about that. I see the change you made and it looks good.

Would you mind adding a benchmarks folder (like in https://github.com/dkubb/axiom/tree/master/benchmarks) where you add the benchmark as speed.rb?

In general I really like this approach. I don't mind that we're passing a variable recursively since the interface change won't break any existing usage. I'm going to do a pass over the code and make some suggestions, but otherwise I think this will be merged in very shortly.

One thing that came to mind is if Set is the most efficient data structure. I mean, it's general sure, but there's got to be data structures that can more efficiently store a set of integers than how the built-in Set does. I wouldn't ask you to make the change for this PR, it's just something to think about.

dkubb · 2013-09-18T06:55:17Z

Also, I should mention it's conventional wisdom in the ruby community that Set#include? is faster than Array#include? because it's O(1) vs O(n).

Obviously, when n is large Set is always a better option, however, when "n" is small the total overhead of building up the data structure and testing it may favour Array. It might be interesting to benchmark where the cross-over happens (if you are so inclined).

If an Array was used, then something that just pops the last entry would work, eg:

def initialize
  @object_ids = []
end

# ...

def guard(caller_object_id)
  return if @object_ids.include?(caller_object_id)
  begin
    @object_ids << caller_object_id
    yield
  rescue
    @object_ids.pop
  end
end

I suspect this will be faster than Set when the number of objects is small; the question is will it be better in the general case, and are we often deep freezing really deep data structures or not-so-deep structures. I suspect, since this is ruby, we aren't often deep freezing deep object structures, at least not deep enough to make it necessary to use a Set, but I'm not sure.

mbj · 2013-09-18T07:20:32Z

@misfo I ran into the same performance problem when deep freezing large parsed array/hash primitive trees I parse from JSON in my esearch library. And I ended up into using Adamantium::Flat for the same reasons. I'm really happy you could improve deep freeze performance. After this PR is merged I'll consider to deep freeze again. PR looks also good for me.

dkubb · 2013-09-18T08:07:07Z

@misfo ok, this PR looks good. My only request before merging it is that you merge (current) master in and run rake ci task against this branch to make sure it doesn't introduce any metrics regressions and that all the mutants are killed.

…-up-recursion-guard

This prevents some possible revisiting of object (IDs) and kills a TooManyStatements warning

misfo · 2013-09-18T08:29:27Z

Alrighty. I addressed a few of the things, but I still have some mutants to kill...

misfo · 2013-09-19T02:35:03Z

I addressed all the rake ci results that I could. There are mutants left in the the following classes/modules that I'm unsure about how to kill:

Freezer::Range: it's unclear to me how to actually create a Range with a circular reference to kill these mutants
Freezer::NoFreeze: the second argument to deep_freeze is intentionally ignored. So the mutations it's showing don't in fact cause any tests to fail, but that doesn't actually indicate a problem in this case. I prefixed the argument name with a _ to indicate it's intentionally ignored (Flog called the argument name _ uncommunicative)
RecursionGuard: mutant is showing that it can delete all but the yield in the #guard method, but the unit tests recurse infinitely when I make any of those changes manually to the code. So these mutations seem to be false positives...

misfo · 2013-09-19T02:52:41Z

About the most efficient data structure for the set of object IDs, the Google showed me that somebody already dug into that question in the java world. So it could be that using an Array of true/false values is faster for adding & looking up object IDs than using a Set. We'd have to make an assumption about how large object IDs grow in users' apps, though. Also, here must be a space trade-off there with using an Array...

Anyways, it might be worth looking into as a separate ticket

dkubb · 2013-09-19T05:49:48Z

Anyways, it might be worth looking into as a separate ticket

@misfo I think you're right. I've done some research too and it led me to a bitset as well, but given that your general approach is an improvement I'm no going to push further in this direction. I may open another issue for further research later, but for now I'm happy with the significant improvement this brings.

I'll look at those remaining mutations and see if I can make some suggestions and/or fix them.

dkubb · 2013-09-19T07:03:59Z

@misfo thanks so much for this, I fixed the remaining mutations as well as refactored a few of the tests (in ways that I probably should've done earlier).

I decided to just use a plain Hash to track the object ids, since it's internal to IceNine::RecursionGuard and slightly less overhead than using a Set (which uses Hash underneath anyway).

dkubb · 2013-09-19T07:12:29Z

@misfo btw, I just released ice_nine 0.10.0 to rubygems with these changes: https://rubygems.org/gems/ice_nine/versions/0.10.0

Thanks again!

misfo · 2013-09-19T10:41:21Z

Cool. Thanks!

Just curious: was there a fix for the RecursionGuard mutations?

Sent from my phone

On Sep 19, 2013, at 2:12 AM, Dan Kubb notifications@github.com wrote:

@misfo btw, I just released ice_nine 0.10.0 to rubygems with these changes: https://rubygems.org/gems/ice_nine/versions/0.10.0

Thanks again!

—
Reply to this email directly or view it on GitHub.

mbj · 2013-09-19T10:43:54Z

AFAIK refactoring the code into a form the mutations do not occur. You dont have to kill mutations from specs.

solnic · 2013-09-19T10:52:50Z

...to be more specific, you don't always have to kill mutations via specs. Sometimes alive mutation leads to simplifying code.

misfo · 2013-09-19T15:27:56Z

@dkubb Thanks for releasing the new version. We updated the gem in our app and here are the elapsed times for one of the affected parts of our app (the time it was taking to IceNine.deep_freeze a Hash of deserialized JSON and instantiating an object of a class with include Adamantium):

Before the update:

4.396748 seconds
18.083557 seconds
13.984336 seconds
21.435817 seconds
14.30649 seconds

After updating to 0.10.0:

0.010037 seconds
0.040838 seconds
0.030763 seconds
0.04739 seconds
0.032553 seconds

dkubb · 2013-09-19T15:48:36Z

@mbj actually I found that a few mutations are still alive under ruby 2.0.0, but not 1.9.3. I'm fairly sure that the remaining mutations are actually the result of a bug in mutant, but I need to isolate it. If you're curious run ice_nine edge with mutant under both rubies and note the difference.

dkubb · 2013-09-19T15:48:56Z

@misfo wow! that's quite a difference. nice work!

snusnu · 2013-09-19T15:49:14Z

@misfo holy crap, nice!

solnic · 2013-09-19T15:51:55Z

@misfo wow nice work there man, thanks for that. a lot of our libs will benefit from this patch ❤️

mbj · 2013-09-19T16:13:09Z

@misfo Thx man!

@dkubb Did you run mutant zombified? --zombie ice nine is a dependency, so maybe the mutations itself caused the bugs.

dkubb · 2013-09-19T16:33:04Z

@mbj I saw the difference with the rake task, which I think adds --zombie for specific gems like ice_nine.

mbj · 2013-09-19T16:36:25Z

@dkubb You saw that zillions "Zombifiying $rb_file" messages?

On Thu, Sep 19, 2013 at 09:33:05AM -0700, Dan Kubb wrote:

@mbj I saw the difference with the rake task, which I think adds --zombie for specific gems like ice_nine.

Reply to this email directly or view it on GitHub:
#8 (comment)

Markus Schirp

Phone: +49 201 / 360 379 14
Fax: +49 201 / 360 379 16
Web: schirp-dso.com
Email: mbj@schirp-dso.com
Twitter: twitter.com/m_b_j
OS-Code: github.com/mbj

Altendorferstrasse 44
D-45127 Essen

Speed up recursion guard

e136966

A simple benchmarking script shows IceNine.deep_freeze running ~30x faster on nested Hashes/Arrays: https://gist.github.com/misfo/6600943

misfo added 3 commits September 18, 2013 03:10

Add basic speed benchmark

d9f6249

Merge branch 'master' of https://github.com/dkubb/ice_nine into speed…

0dc0d9e

…-up-recursion-guard

Don't delete object IDs from recursion guard

80cc2e4

This prevents some possible revisiting of object (IDs) and kills a TooManyStatements warning

misfo added 2 commits September 18, 2013 09:34

Cleanups based on Flog output

7fb6a25

Killing mutants

3ad0064

dkubb merged commit 3ad0064 into dkubb:master Sep 19, 2013

misfo deleted the speed-up-recursion-guard branch September 19, 2013 15:54

misfo mentioned this pull request Sep 24, 2013

Freeze shallowly frozen objects and offer a faster option #11

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up recursion guard #8

Speed up recursion guard #8

misfo commented Sep 17, 2013

misfo commented Sep 17, 2013

dkubb commented Sep 17, 2013

misfo commented Sep 17, 2013

dkubb commented Sep 18, 2013

dkubb commented Sep 18, 2013

mbj commented Sep 18, 2013

dkubb commented Sep 18, 2013

misfo commented Sep 18, 2013

misfo commented Sep 19, 2013

misfo commented Sep 19, 2013

dkubb commented Sep 19, 2013

dkubb commented Sep 19, 2013

dkubb commented Sep 19, 2013

misfo commented Sep 19, 2013

mbj commented Sep 19, 2013

solnic commented Sep 19, 2013

misfo commented Sep 19, 2013

dkubb commented Sep 19, 2013

dkubb commented Sep 19, 2013

snusnu commented Sep 19, 2013

solnic commented Sep 19, 2013

mbj commented Sep 19, 2013

dkubb commented Sep 19, 2013

mbj commented Sep 19, 2013

Speed up recursion guard #8

Speed up recursion guard #8

Conversation

misfo commented Sep 17, 2013

misfo commented Sep 17, 2013

dkubb commented Sep 17, 2013

misfo commented Sep 17, 2013

dkubb commented Sep 18, 2013

dkubb commented Sep 18, 2013

mbj commented Sep 18, 2013

dkubb commented Sep 18, 2013

misfo commented Sep 18, 2013

misfo commented Sep 19, 2013

misfo commented Sep 19, 2013

dkubb commented Sep 19, 2013

dkubb commented Sep 19, 2013

dkubb commented Sep 19, 2013

misfo commented Sep 19, 2013

mbj commented Sep 19, 2013

solnic commented Sep 19, 2013

misfo commented Sep 19, 2013

dkubb commented Sep 19, 2013

dkubb commented Sep 19, 2013

snusnu commented Sep 19, 2013

solnic commented Sep 19, 2013

mbj commented Sep 19, 2013

dkubb commented Sep 19, 2013

mbj commented Sep 19, 2013