mem leak in AR (rails v2.3.x) and AM (rails v3.0.x) #2640

Closed
1o1brian opened this Issue Aug 22, 2011 · 15 comments

Projects

None yet

9 participants

@1o1brian

In rails 2.3.x activerecord-2.3.x/lib/active_record/base.rb there's a class variable @@subclasses the appears to keep track of what is inheriting from active record. This variable is a hash that uses the actual class as a key. In a rails app I'm using we're using the active record validations. We build classes dynamically that inherit from active record. Over time the validation requirements change and the class is removed using Module.remove_const and then rebuilt. Because the actual class is used to keep track of the inheritance from active record the old version of the class can't be garbage collected. Over time our app uses more and more memory.

Rails 3.0.x also has this same issue except the problem seems to have moved to activesupport-3.0.x/lib/active_support/descedants_tracker.rb in a class variable named @@direct_descendants for classes the include the module ActiveModel

I am unsure of all of the reason(s) for keeping track of the descant classes of active record and active model so I might not be the right person to propose a fix for the problem. The methods ActiveRecord::Base.reset_subclasses (rails 2.3.x) and ActiveSupport::DescendantsTracker.clear (rails 3.0.x) seem heavy handed. For now I have a method to remove the specific class from the hash @@subclasses. Another solution would be to store the keys to the hashes as symbols?

Here's the method I monkey patch in to fix the memory leak in rails 2.3.x:

module ActiveRecord
  class Base
    def self.clear_subclasses(klass)
      @@subclasses.delete klass
    end
  end
end

It seems like this memory leak is a bug because in ruby/rails one should be able to dynamically build and rebuild classes as needed.

@1o1brian

Here's an example in rails 2.3.x

in app/model/tabless_model.rb:

class TablelessModel < ActiveRecord::Base
  def self.columns() @columns ||= []; end

  def self.column(name, sql_type = nil, default = nil, null = true)
    columns << ActiveRecord::ConnectionAdapters::Column.new(name.to_s, default, sql_type.to_s, null)
  end
end

in lib/tasks/leak.rake:

namespace :app do

  task :leak => :environment do
    STDOUT.puts "ready to leak <return> to start"
    STDIN.gets
    klass_code =<<EOT
class ::Leaker < TablelessModel
  column :from_email, :string
  column :to_email, :string
  column :article_id, :integer
  column :message, :text

  validates_format_of :from_email, :to_email, :with => /^[-a-z0-9_+\.]+\@([-a-z0-9]+\.)+[a-z0-9]{2,4}$/i
  validates_length_of :message, :maximum => 500
end
EOT
    10000.times do |i|
      klass = eval(klass_code)
      if Object.const_defined?('Leaker')
        Object.class_eval { remove_const(:Leaker) }
        STDOUT.puts "#{i}"
      end
    end
    STDOUT.puts "done leaking <return> to quit"
    STDIN.gets
    puts "bye"
  end

end
@jeremyf
Contributor
jeremyf commented Apr 28, 2012

Is this still a problem?

@1o1brian

Yes, The code below as a rake task in the latest rails will show the leak.

desc "leak memory via active model"
task :leak => :environment do

  puts "About to leak memory.  Check your memory usage."
  puts "Press return to start"
  STDIN.gets


  6000.times do
    define_am_object
    remove_am_object
  end

  puts "Check your memory now.  Verify leak."
  puts "Press return to end"
  STDIN.gets

end


def define_am_object
  klass = eval(<<-EOT
class Thing
  include ActiveModel::Validations
  include ActiveModel::Conversion
  extend ActiveModel::Naming

  attr_accessor :name

  validates_presence_of :name

  def initialize(attributes = {})
    attributes.each do |name, value|
      send("\#{name}=", value)
    end
  end

  def persisted?
    false
  end
end
EOT
)
end

def remove_am_object
  Object.instance_eval { remove_const(:Thing) }
end
@isc
isc commented Jun 27, 2012

I'm running into the same issue with Rails 3.2.6. I'm dynamically creating ActiveRecord::Base subclasses which can't seem to be garbage-collect-able (and unfortunately, messing with the @@direct_descendants hash of DescendantsTracker doesn't seem to be enough). I'm using the following snippet as a reduced test case :

require 'rubygems'
require 'active_record'

module MyModule
end
MyModule.const_set 'NewClass', Class.new(ActiveRecord::Base)
p ObjectSpace.each_object(Class).map{|c|c.name || c.inspect}.find_all{|c| c.match 'NewClass'}
MyModule.send :remove_const, 'NewClass'
ActiveSupport::DescendantsTracker.class_variable_get('@@direct_descendants').clear
ObjectSpace.garbage_collect
GC.start()
p ObjectSpace.each_object(Class).map{|c|c.name || c.inspect}.find_all{|c| c.match 'NewClass'}
@isc
isc commented Jun 28, 2012

I thought of using memprof to track down the references to my classes. However memprof is not compatible with ruby 1.9 which I was using and this led me to discover that the problem isn't present with ruby ree 1.8.7.
With ruby 1.9.3-p125 :

time ruby references_ar.rb 
["MyModule::NewClass"]
["MyModule::NewClass"]

real    0m3.426s
user    0m3.332s
sys 0m0.091s

With ruby ree-1.8.7-2011.12 :

time ruby references_ar.rb 
["MyModule::NewClass"]
[]

real    0m0.272s
user    0m0.230s
sys 0m0.039s
@solnic
solnic commented Jun 28, 2012

Out of curiosity: why do you need to dynamically generate classes that inherit from AR? This sounds like a really terrible idea to me.

BTW symbols aren't GCed so if you're generating lots of them that could be the reason why you have memory leaks.

@rkh
Contributor
rkh commented Jun 28, 2012

Out of curiosity: why do you need to dynamically generate classes that inherit from AR? This sounds like a really terrible idea to me.

That's what the Rails reloader is doing.

@plukevdh

Are you using rails reloader in production?

@isc
isc commented Jun 29, 2012

@solnic I dynamically generate AR subclasses because my app connects to a different schema on each request (https://addons.heroku.com/adminium).

@1o1brian

"Out of curiosity: why do you need to dynamically generate classes that inherit from AR? This sounds like a really terrible idea to me."

@solnic We generate classes that leverage active model for validation and the validations change periodically. We have a UI for building the validations. When a validation is changed the class constant is removed from memory and the class is rebuilt. There are hundreds of validations. It would be a terrible waste of resources to rebuild them all when only one is changed.

If the portion of active model that keeps track of subclassing used symbols rather than the actual class objects there would not be a memory leak. Consider the difference in these two ruby programs:

This one will not "leak" memory.

h = {}
10.times do
  class A; end
  h[:A] = true
  Object.class_eval { remove_const(:A) }
end

puts h.inspect

This one will "leak" memory because a reference to the class A is keep around even though we explicitly use remove_const

h = {}
10.times do
  class A; end
  h[A] = true
  Object.class_eval { remove_const(:A) }
end

puts h.inspect

I contend the use of a class constant as a key in a hash violates the ruby best practice of using symbols for keys in hashes.

@thedarkone
Contributor

@1o1brian, @isc: if you try to dynamically create AR subclasses, you are going to have a bad time.

I suggest you roll out a custom patch and manually clean-up the stray references. Something akin to what I using in rails-dev-boost.

There 3 places you need to work around:

  • DescendantsTracker see here (sans the removal of the :clear method)
  • ActiveSupport::Dependencies::ClassCache see here
  • AR's relation caches: see here

@1o1brian: there is nothing wrong with using classes as hash keys. Your suggestion of using symbols isn't going to work for DescendantsTracker. Not all classes have names.

@plukevdh

If I were you, I wouldn't use the Rails validation system for this. Class reloading is a bad idea generally in production. I'd just have some sort of validation check that can look up whatever rules you're creating and checks the data before save. Maybe I'm misunderstanding the problem?

@1o1brian

@thedarkone I did not realize all the classes in this situation would not have names. That is tricky. One would need to use object IDs to work around that? Ick. For my needs cleaning up DescendantsTracker and ActiveSupport::Dependencies::ClassCache is probably sufficient. All the models are are not backed by tables.

I wish there was a way to dynamically use the active model validators without this memory use pitfall. IMO the validators are great. A lot of time and thought has gone in to them. It would be a shame to have to re-implement all that.

@the8472
the8472 commented Jul 2, 2012

using weak references would solve this issue, but MRI/YARV's weak refs have issues of their own, so i could only recommend that for jruby.

@steveklabnik
Member

There is basically no way this is going to get fixed. If you're dynamically building AR classes, you need to clean up after yourself; this is far outside the normal use-case for AR, and would require a bunch of work to work around, adding tons of complexity for this fringe case.

It seems you already have a workaround, and @thedarkone has also provided some insight. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment