ActiveRecord::Base.extend kills JSON performance #9212

Closed
jasonhutchens opened this Issue Feb 7, 2013 · 23 comments

10 participants

@jasonhutchens

We have a large Rails app, and our JSON performance is abysmal. We've verified JSON.parser is using the C implementation and not the Ruby implementation. We've tried YAJL and OJ to no avail. When we test these things outside of Rails everything's really fast. But running the same tests from the Rails console tells another story.

Today I narrowed this down to authlogic (binarylogic/authlogic#344), but a bit of further tinkering reveals the problem may be in active_record.

Here's my Gemfile:

source :rubygems
gem 'json'
gem 'activerecord', '>=3.0.0', :require => 'active_record'

And here's my test:

#!/usr/bin/env ruby

require 'rubygems'
require 'bundler'

Bundler.require(:default)

require 'benchmark'
require 'json'

def profile
  blob = File.read(File.join('big_file.json'))
  time = Benchmark.realtime do
    JSON.parse(JSON.parse(blob).to_json).to_json
  end
  time * 1000
end

NUM = 10

puts NUM.times.map { profile }.reduce(:+)/NUM.to_f

module Foo
  ActiveRecord::Base.extend(Foo)
end

puts NUM.times.map { profile }.reduce(:+)/NUM.to_f

On my machine the output of bundle-exec'ing that test is as follows:

50.8620023727417
790.248107910156

Crazypants, right? That little innocuous Foo module kills JSON performance. Anyone else seeing this kind of problem? What the freakazoid is going on?

@jasonhutchens

The root cause is 'active_support/core_ext/object/to_json', which is somehow loaded when ActiveRecord::Base.extend(Foo) is called. This force-overrides to_json as follows:

# Hack to load json gem first so we can overwrite its to_json.
begin
  require 'json'
rescue LoadError
end

# The JSON gem adds a few modules to Ruby core classes containing :to_json definition, overwriting
# their default behavior. That said, we need to define the basic to_json method in all of them,
# otherwise they will always use to_json gem implementation, which is backwards incompatible in
# several cases (for instance, the JSON implementation for Hash does not work) with inheritance
# and consequently classes as ActiveSupport::OrderedHash cannot be serialized to json.
[Object, Array, FalseClass, Float, Hash, Integer, NilClass, String, TrueClass].each do |klass|
  klass.class_eval do
    # Dumps object in JSON (JavaScript Object Notation). See www.json.org for more info.
    def to_json(options = nil)
      ActiveSupport::JSON.encode(self, options)
    end
  end
end

A slightly rewritten test demonstrates this:

#!/usr/bin/env ruby

require 'benchmark'
require 'json'

def profile
  blob = File.read(File.join('big_file.json'))
  time = Benchmark.realtime do
    JSON.parse(JSON.parse(blob).to_json).to_json
  end
  time * 1000
end

NUM = 10

puts NUM.times.map { profile }.reduce(:+)/NUM.to_f

require 'active_support/json'
require 'active_support/core_ext/object/to_json'

puts NUM.times.map { profile }.reduce(:+)/NUM.to_f

Is there any way of preventing active_support/core_ext/object/to_json from doing this override? What will be the side-effects of doing so, if any? Is anyone else out there using fast JSON libraries without problems in Rails?

@jasonhutchens

Looks like this code was introduced in 7bd85a8 to fix https://rails.lighthouseapp.com/projects/8994/tickets/4890 (man, pity you couldn't maintain issue numbers between Lighthouse and GitHub).

@jasonhutchens jasonhutchens reopened this Feb 7, 2013
@jasonhutchens

@josevalim can you comment on this issue?

@Sutto

Cold this possibly be a consequence of extend blowing away the method cache?

Hence to_json being called recursively a lot of times incurring expensive costs causing the speed delay.

Reloading it possibly recreates the cache (all assumptions) hence the speed up.

@jasonhutchens

I think the slowness we're seeing is merely due to the fact that ActiveSupport::JSON.encode is slow (we're calling it pretty directly in the second test program shown above). I think ActiveRecord::Base.extend is just a red herring; it causes active_support/core_ext/object/to_json to be loaded. We're using authlogic which does something similar when it's required, which is presumably why removing authlogic from our Gemfile stopped us from hitting the bug in the rails console.

The intent of the rails code is obvious though; deliberately prevent the json gem from being used. This smells wrong to me, especially since the bug this code was written to fix was so specific.

@steveklabnik
Ruby on Rails member

Cold this possibly be a consequence of extend blowing away the method cache?

It only extends once, the cache should be fine.

@te-chris

Has there been any progress on this?

@jasonhutchens

No progress, I was waiting for @josevalim to comment. In practice we've minimised our use of .to_json, and we've monkeypatched around the issue.

@te-chris

That makes sense, I'm using @josevalim 's active_model_serializers which uses to_json. It definitely calls this and seems to clobber any chance of good json performance. (and explains a performance issue i've got in an app).

@steveklabnik
Ruby on Rails member

As I mentioned in your email, I recently merged a performance-related patch into AMS; I think your issues probably had more to do with our excessive instrumentation and lack of caching. Or at least, that's some of your issue.

@steveklabnik
Ruby on Rails member

I started looking into this issue this morning, here's the problem:

diff --git a/activesupport/lib/active_support/core_ext/object/to_json.rb b/activesupport/lib/active_support/core_ext/object/to_json.rb
index 83cc806..336f6ff 100644
--- a/activesupport/lib/active_support/core_ext/object/to_json.rb
+++ b/activesupport/lib/active_support/core_ext/object/to_json.rb
@@ -1,20 +1,6 @@
-# Hack to load json gem first so we can overwrite its to_json.
-begin
-  require 'json'
-rescue LoadError
-end
-
-# The JSON gem adds a few modules to Ruby core classes containing :to_json definition, overwriting
-# their default behavior. That said, we need to define the basic to_json method in all of them,
-# otherwise they will always use to_json gem implementation, which is backwards incompatible in
-# several cases (for instance, the JSON implementation for Hash does not work) with inheritance
-# and consequently classes as ActiveSupport::OrderedHash cannot be serialized to json.
-[Object, Array, FalseClass, Float, Hash, Integer, NilClass, String, TrueClass].each do |klass|
-  klass.class_eval do
-    # Dumps object in JSON (JavaScript Object Notation). See www.json.org for more info.
-    def to_json(options = nil)
+class Object
+  def to_json(options = nil)
       ActiveSupport::JSON.encode(self, options)
-    end
   end
 end

diff --git a/activesupport/test/json/encoding_test.rb b/activesupport/test/json/encoding_test.rb
index 12ce250..fdfe794 100644
--- a/activesupport/test/json/encoding_test.rb
+++ b/activesupport/test/json/encoding_test.rb
@@ -3,6 +3,8 @@ require 'abstract_unit'
 require 'active_support/core_ext/string/inflections'
 require 'active_support/json'

+require 'json'
+
 class TestJSONEncoding < ActiveSupport::TestCase
   class Foo
     def initialize(a, b)

Yields this:

$ bundle exec rake test
<snip>

Finished tests in 5.441872s, 532.5373 tests/s, 1654.9452 assertions/s.

  1) Failure:
TestJSONEncoding#test_array_should_pass_encoding_options_to_children_in_to_json [/Users/steve/src/rails/activesupport/test/json/encoding_test.rb:248]:
--- expected
+++ actual
@@ -1 +1 @@
-"[{\"address\":{\"city\":\"London\"}},{\"address\":{\"city\":\"Paris\"}}]"
+"[{\"name\":\"John\",\"address\":{\"city\":\"London\",\"country\":\"UK\"}},{\"name\":\"Jean\",\"address\":{\"city\":\"Paris\",\"country\":\"France\"}}]"


  2) Failure:
TestJSONEncoding#test_enumerable_should_pass_encoding_options_to_children_in_to_json [/Users/steve/src/rails/activesupport/test/json/encoding_test.rb:272]:
--- expected
+++ actual
@@ -1 +1 @@
-"[{\"address\":{\"city\":\"London\"}},{\"address\":{\"city\":\"Paris\"}}]"
+"\"#<Enumerator:0xXXXXXX>\""


  3) Failure:
TestJSONEncoding#test_hash_should_pass_encoding_options_to_children_in_to_json [/Users/steve/src/rails/activesupport/test/json/encoding_test.rb:224]:
--- expected
+++ actual
@@ -1 +1 @@
-"{\"address\":{\"city\":\"London\"}}"
+"{\"name\":\"John\",\"address\":{\"city\":\"London\",\"country\":\"UK\"}}"


  4) Failure:
TestJSONEncoding#test_hash_with_time_to_json [/Users/steve/src/rails/activesupport/test/json/encoding_test.rb:186]:
--- expected
+++ actual
@@ -1 +1 @@
-"{\"time\":\"2009/01/01 00:00:00 +0000\"}"
+"{\"time\":\"2009-01-01 00:00:00 UTC\"}"


  5) Failure:
TestJSONEncoding#test_opt_out_big_decimal_string_serialization [/Users/steve/src/rails/activesupport/test/json/encoding_test.rb:321]:
Expected: "2.5"
  Actual: "\"2.5\""

  6) Error:
TestJSONEncoding#test_struct_encoding:
JSON::ParserError: 757: unexpected token at '"#<struct Struct::Custom name=\"David\", sub=#<struct Struct::UserNameAndDate name=\"David\", date=Fri, 01 Jan 2010>>"'
diff --git a/activesupport/lib/active_support/core_ext/object/to_json.rb b/activesupport/lib/active_support/core_ext/object/to_json.rb
index 83cc806..336f6ff 100644
--- a/activesupport/lib/active_support/core_ext/object/to_json.rb
+++ b/activesupport/lib/active_support/core_ext/object/to_json.rb
@@ -1,20 +1,6 @@
-# Hack to load json gem first so we can overwrite its to_json.
-begin
-  require 'json'
    /opt/rubies/2.0.0-p0/lib/ruby/2.0.0/json/common.rb:155:in `parse'
    /opt/rubies/2.0.0-p0/lib/ruby/2.0.0/json/common.rb:155:in `parse'
    /Users/steve/src/rails/activesupport/test/json/encoding_test.rb:307:in `test_struct_encoding'

  7) Failure:
TestJSONEncoding#test_to_json_should_not_keep_options_around [/Users/steve/src/rails/activesupport/test/json/encoding_test.rb:281]:
--- expected
+++ actual
@@ -1 +1 @@
-{"foo"=>{"foo"=>"hello", "bar"=>"world"}, "other_hash"=>{"foo"=>"other_foo", "test"=>"other_test"}}
+{"foo"=>"#<TestJSONEncoding::CustomWithOptions:0xXXXXXX>", "other_hash"=>{"foo"=>"other_foo", "test"=>"other_test"}}


2898 tests, 9006 assertions, 6 failures, 1 errors, 153 skips
rake aborted!
Command failed with status (1): [ruby -w -I"lib:test" -I"/Users/steve/.gem/ruby/2.0.0/gems/rake-10.0.3/lib" "/Users/steve/.gem/ruby/2.0.0/gems/rake-10.0.3/lib/rake/rake_test_loader.rb" "test/**/*_test.rb" ]

Tasks: TOP => test
(See full trace by running task with --trace)

So that will all need to be dealt with somehow in the process of dealing with this.

@jormon

bump -- @josevalim care to take a look again? this seems like a problem that needs a high-level call

@andrewvc

Wait, so rails doesn't use a C extension for generating JSON at the moment?

@jasonhutchens

@andrewvc yes, that's right.

@te-chris
@steveklabnik
Ruby on Rails member

Rails does not use any C extensions at all. It makes Rails easier to install, and easier for platforms like JRuby.

@andrewvc
@aq1018

The reason to_json is slow, is because it loads json gem first ( which adds to_json to various classes, and they are implemented in C ), rails then override various classes to use its own version of to_json, which is then proxyed to ActionSupport::JSON::Encode This causes Hash#to_json and Array#to_json jumps between ruby land and C land hunting for circular references, and causing it the speed to decrease dramatically.

@jasonhutchens

@aq1018 I don't think that's true. I reckon ActionSupport::JSON::Encode is itself really slow.

@chancancode
Ruby on Rails member
@jeremy jeremy closed this in #12183 Nov 26, 2013
@jasonhutchens

Here's a working link to @chancancode's explanation: intridea/multi_json#138 (comment)

Great stuff!

@saurabhnanda

Are there any before/after benchmarks for @chancancode PR #12183

@tomclegg tomclegg added a commit to curoverse/arvados that referenced this issue Jan 21, 2015
@tomclegg tomclegg 3021: Use Oj to encode API responses, and to decode them in tests.
* We use the Oj and multi_json gems, which makes Oj the default JSON
  parser. However, Rails' ActiveRecord::Base overrides this and uses
  the native JSON parser, which is slow. In our case we have two
  render() calls that represent nearly all cases where we ask
  ActiveRecord to serialize for us. In both cases we already have a
  hash (not a model object), and we always want JSON responses. So we
  can fix the performance problem simply by calling Oj.dump()
  ourselves, and passing the resulting JSON (instead of the hash) to
  render().

More gory details:

* "ActiveRecord::Base.extend kills JSON performance":
  rails/rails#9212

* "when freedom patches fight, nobody wins":
  intridea/multi_json#138 (comment)
1974e0e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment