Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rails development mode memory leak and slowdown, also on new apps #32892

Closed
mariusandra opened this issue May 15, 2018 · 24 comments
Closed

Rails development mode memory leak and slowdown, also on new apps #32892

mariusandra opened this issue May 15, 2018 · 24 comments

Comments

@mariusandra
Copy link

mariusandra commented May 15, 2018

TL;DR; My decently sized Rails app (173 models, 208 controllers) is suffering from a continuously increasing slowdown in development mode that makes it hard to work on it. Each reload of the code takes longer than the previous one, with delays increasing to minutes after just a few reloads. I have identified a probable cause and it's also present on new apps (rails new leakyapp, no added gems). I haven't found a solution yet.

The issue: path and url helpers are packaged into a module and included in controllers and other classes. When you reload! an app, the helper methods in this module are removed and then re-added. While doing this, ruby needs to remove the methods from each class that includes the module. However each reload of the code leaves the old class instances (e.g. the controller from the previous request, etc) in memory and thus ruby needs to clean route helpers from an ever increasing amount of classes. This gets slow after a while.

Background

Hello,

For the past years I've been coping with an issue that Rails feels extremely slow when developing. I start the server, wait for it to boot up and open the site. It takes a minute or two before it loads.

Then I have to make a change to a controller, model or other .rb file and hit reload in the browser. It takes some seconds and the changed page opens just fine. However after each following change, the reload takes longer and longer. After a few rounds, it's faster to restart the server than to wait for Rails to reload the code. I'm often waiting minutes before anything shows up again.

At first I thought the issue has something to do with activeadmin, a properly slow gem in its own right. Since it always takes forever to reload if I make any change in the admin area, I thought the increasing slowness was just part of the package and I tolerated it for longer than I should have. Lately though it got so unbearable that I needed to find a solution.

So I started profiling. Opening on page in admin from a cold boot, I get this beautiful firegraph:

leak1

(using rack-mini-profiler with ?pp=flamegraph&flamegraph_sample_rate=20).

I guess this is a relatively normal startup. I need to see if I can somehow cache sprockets and check why is Thor still there, but other than that the app loads nicely.

Let's hit reload:

leak2

The graph follows the same shape as the previous one, just the part with sprockets is much smaller.

So far so good. Now let's make a change in a .rb file and then hit reload:

leak3

It takes a while for the page to load. Well over 30 seconds. That's because ActiveAdmin needs to re-load all 100 or so pages just to draw a menu and generate all the paths. Oh well. However that's not the issue here.

The issue is that this part starts to get longer:

leak4

.. and longer:

leak5

.. and even longer:

leak6

.. and after about 6 reloads (I skipped a few images), it's half the size of the ActiveAdmin reload time...

leak7

After 10+ reloads I will be waiting minutes before anything opens. That's enough time to check HN and forget what you were doing.

So who is this culprit? It's this guy: ActionDispatch::Routing::RouteSet::NamedRouteCollection#clear!

The full stacktrace is here.

        def clear!
          @path_helpers.each do |helper|
            @path_helpers_module.send :remove_method, helper
          end

          @url_helpers.each do |helper|
            @url_helpers_module.send  :remove_method, helper
          end

          @routes.clear
          @path_helpers.clear
          @url_helpers.clear
        end

Huh. It takes an increasingly longer time to clear path helpers?

As far as I know, path and url helpers are modules, which are automatically included in all controllers and views. To get *_path methods into other classes (e.g. models), one should add include Rails.application.routes.url_helpers to the class.

So it's taking longer and longer to remove the *_path and *_url methods from @path_helpers_module.

Does this mean that this module is included in more and more classes, which don't get garbage collected? Let's find out.

Throwing caution to the wind (don't do this at home), I opened and edited the route_set.rb file in the gem's folder and added some debug code.

First just to measure what gets slower (it's the @*_helpers.each blocks, not the .clear calls)

!! @path_helpers.remove_method(@url_helpers.each) in 0.1322 sec
!! @url_helpers_module.remove_method(@url_helpers.each) in 0.1383 sec
!! @routes.clear in 0.0 sec
!! @path_helpers.clear in 0.0 sec
!! @url_helpers.clear in 0.0 sec

# next times:
!! @path_helpers.remove_method(@url_helpers.each) in 0.2533 sec
!! @path_helpers.remove_method(@url_helpers.each) in 0.3978 sec
... etc

Then to measure if there are just more @path_helpers and @url_helpers. Nope, their number remains constant:

!! @path_helpers.count == 822
!! @path_helpers.remove_method(@url_helpers.each) in 0.1371 sec

!! @path_helpers.count == 822
!! @path_helpers.remove_method(@url_helpers.each) in 0.2666 sec

So it must be that the module is included in an ever increasing number of classes:

ObjectSpace.each_object(Object).select{|o| o.class.ancestors.include?(@url_helpers_module) }.count

=> 275 # first time
=> 474 # second...
=> 602

Bingo!

It appears that on each reload, Rails reloads new versions all the classes, yet doesn't dispose of the old versions.

Who are all these classes?

 > ObjectSpace.each_object(Object).select{|o| o.class.ancestors.include?(@url_helpers_module) }.map { |o| o.class.to_s }.sort.join(' ')
=> "#<Class:0x00007fda669591b8> ... cut for brevity ... #<Class:0x00007fda85692668> #<Class:0x00007fda85711f08> #<Class:0x00007fda857b5d88> Admin::RequestsController Admin::RequestsController"

Every reload adds a bunch of classes that look like this: #<#<Class:0x00007fda668c8d48>:0x00007fda668c3690 @_routes=#<ActionDispatch::Routing::RouteSet:0x00007fda8015ad08>> and one controller of the page I'm accessing (Admin::RequestsController).

Digging around, it seems that those classes are instances of this proxy class that's created when you call Rails.application.routes.url_helpers.

So it seems something still holds a reference to the controllers from previous requests... and all those proxy url helper modules that are included in so many places.

I installed the ruby-mass gem in the hopes of figuring it out. Unfortunately I didn't get far. I couldn't find if there was any gem that for whichever reason kept references to the old classes.

So I decided to see if I could replicate the issue with a smaller app. I didn't have to look far.

Just a simple rails new bla, rails g model something, rake db:migrate, rails g controller something was enough to replicate this.

Hacking route_set.rb and adding these lines:

 count = ObjectSpace.each_object(Object).select{|o| o.class.ancestors.include?(@url_helpers_module) }.count
 classes = ObjectSpace.each_object(Object).select{|o| o.class.ancestors.include?(@url_helpers_module) }.map { |o| o.class.to_s }.sort.select { |i| !i.include?('#') }
 puts count
 p classes

gives the following ouptut, after making a small change in something_controller.rb and hitting reload:

# no reload
1
[]
!! @url_helpers.count == 12
!! @url_helpers_module.remove_method(@url_helpers.each) in 0.0001 sec

# after the first reload
5
["SomethingController"]
!! @url_helpers.count == 12
!! @url_helpers_module.remove_method(@url_helpers.each) in 0.0 sec

# second reload
9
["SomethingController", "SomethingController"]
!! @url_helpers.count == 12
!! @url_helpers_module.remove_method(@url_helpers.each) in 0.0001 sec

# third reload
13
["SomethingController", "SomethingController", "SomethingController"]
!! @url_helpers.count == 12
!! @url_helpers_module.remove_method(@url_helpers.each) in 0.0001 sec

# fourth reload
17
["SomethingController", "SomethingController", "SomethingController", "SomethingController"]
!! @url_helpers.count == 12
!! @url_helpers_module.remove_method(@url_helpers.each) in 0.0001 sec

The number of classes, from which the path helpers need to be cleared from, keeps increasing by 4 per reload. In an app as small as this, you don't notice it. However if it keeps increasing by an extra 200 classes each reload, it becomes a problem.

Let's select all the SomethingControllers and see who has references to them:

cs = ObjectSpace.each_object(Object).select{ |o| o.class.ancestors.include?(@url_helpers_module) }.select { |o| !o.class.to_s.include?('#') }
cs.map(&:to_s)
=> ["#<SomethingController:0x00007f93d15ad630>", "#<SomethingController:0x00007f93d15d65d0>", "#<SomethingController:0x00007f93d160e520>"]

ids = cs.map { |c| Mass.references(c) }.map{ |k| k.keys }.flatten.select { |k| k !~ /Pry/ }.map{ |k| k.split('#').last.to_i }
=> ["ActionDispatch::Request#70136424506820",
 "ActionDispatch::Request#70136424506980",
 "ActionDispatch::Request#70136424869360",
 "ActionDispatch::Request#70136424572740",
 "ActionDispatch::Request#70136424572900",
 "ActionDispatch::Request#70136430985940",
 "ActionDispatch::Request#70136424704460",
 "ActionDispatch::Request#70136424704620",
 "ActionDispatch::Request#70136424801680"]

The SomethingControllers are only referenced by some ActionDispatch::Requests that are hanging around.

Looking at who is referencing those requests:

ids.map { |id| [id,Mass.references(Mass[id])] }.to_h
=> {70136424506820=>{"SomethingController#70136424655640"=>["@_request"], "ActionDispatch::Response#70136424655900"=>["@request"], "ActionDispatch::Http::Headers#70136424674980"=>["@req"]},
 70136424506980=>{"ActionDispatch::Request::Session#70136424506960"=>["@req"], "ActionDispatch::Cookies::CookieJar#70136424672740"=>["@request"]},
 70136424869360=>{"ActionDispatch::RemoteIp::GetIp#70136424869340"=>["@req"]},
 
 70136424572740=>{"ActionDispatch::Http::Headers#70136424736180"=>["@req"], "SomethingController#70136424739560"=>["@_request"], "ActionDispatch::Response#70136424739820"=>["@request"]},
 70136424572900=>{"ActionDispatch::Request::Session#70136424572880"=>["@req"], "ActionDispatch::Cookies::CookieJar#70136424758900"=>["@request"]},
 70136430985940=>{"ActionDispatch::RemoteIp::GetIp#70136430985920"=>["@req"]},
 
 70136424704460=>{"ActionDispatch::Http::Headers#70136424850800"=>["@req"], "SomethingController#70136424854160"=>["@_request"], "ActionDispatch::Response#70136424854420"=>["@request"]},
 70136424704620=>{"ActionDispatch::Request::Session#70136424704600"=>["@req"], "ActionDispatch::Cookies::CookieJar#70136424873540"=>["@request"]},
 70136424801680=>{"ActionDispatch::RemoteIp::GetIp#70136424801660"=>["@req"]}}

I investigated all of them in depth (via Mass.references) and couldn't find any top node that held a reference to any object from all the 3 different requests. The objects from each request however seem to have slightly cyclical relationships (e.g. a request holds a session, a session holds an env, the env holds the session and the request... or something like that). I'm not sure if this changes anything - they could and should be removed in bulk after a reload.

Starting the GC manually does nothing.

And now I'm stuck.

Does anyone have an idea what is happening, if it's perhaps a feature not a bug, and how can I fix this so that my reload times go back to around 30 seconds? 😄

Thanks for reading! 🍻

Steps to reproduce

  1. Create a new Rails app. Add a few models and controllers.
  2. Add this to any controller:
def index
  bla = 2
  render plain: ObjectSpace.each_object(Object).select{|o| o.class.ancestors.to_s.include?('ApplicationController') }.map(&:to_s).inspect
end
  1. rails server and open the endpoint from step 2
  2. Make any change in any ".rb" code and reload. E.g. increment the bla counter.
  3. See the number of loaded instances of the controller go up and up.
  4. Wonder what can be done about it?

Expected behavior

Rails should just keep the latest instance of the controller class in memory, not all of them.

Actual behavior

Every reload of the code adds a new controller instance in memory and cleaning url helper methods from all these old controllers takes longer after each reload.

System configuration

Rails version: 5.2.0

Ruby version: 2.5.0

Disclaimer

When I say "Rails does this" or "Rails does that" above, I'm using "Rails" as a shorthand for Object. The issue could easily come from any included gem, e.g. puma or I might be doing something wrong myself.

@kaspth
Copy link
Contributor

kaspth commented May 18, 2018

Thanks so much, this is exceptionally well researched and explained! We're seeing this in our 4.2 app as well. I've never had the time to dig into out, but I'm really glad you did 😊

@pixeltrix would you happen to have any ideas about how to untie the kept references after the routes module unload?

@mariusandra
Copy link
Author

@kaspth thanks for confirming you're also seeing this!

It's been a month and there are no other replies. So I would like to ask whoever is still reading this: do you have the same problem?

It's easy to test in two ways:

  1. When you make changes in a .rb file and your app reloads in development mode, does the memory usage keep going up forever?
  2. If you put ...
  puts "!! " + ObjectSpace.each_object(Object).select{|o| o.class.ancestors.to_s.include?('ApplicationController') }.map(&:to_s).inspect

... into any controller, add a "!" and reload, does the list of things printed get longer and longer?

It would be great to know if this a) is a non-issue, b) is a standard issue that affects everyone, is usually not noticeable, but is made much much worse by unique my selection of gems and models... or c) my PC is just haunted 👻 :)

Thanks!

@mariusandra
Copy link
Author

"The list grows 7ish long but eventually garbage collection starts and I'm back at 1."... is a reply I got when asking around Slack about this issue.

Could this mean the problem can be split into two:

  1. Rails (or Ruby or Rack or Puma or ...?) keeps multiple copies of your app in memory after reloading... until it hits some threshold that forces it to GC.

  2. My app is so bloated (or contains an extra memory leak) such that anything past the 4th reload is practically unbearable and I never get to the threshold.

Even if part of the slowdown is specific to my app, what can explain the growing memory footprint in others?

Is there someone who has tried the one-liner above and is not seeing this buildup of classes?

I tried and could go up to 8 instances of my controller in memory before breaking down from the slowness of it all. No GC happened.

@ahorek ahorek mentioned this issue Aug 1, 2018
@iuriialekseev
Copy link

iuriialekseev commented Sep 14, 2018

We run into that issue too:

First reload:
screen shot 2018-09-14 at 16 03 54
After 10 or so code changes and reloads:
screen shot 2018-09-14 at 16 07 58

Is there someone who has tried the one-liner above and is not seeing this buildup of classes?

  • We tried and yes, there are a buildup of classes similar to @mariusandra

I tried and could go up to 8 instances of my controller in memory before breaking down from the slowness of it all. No GC happened.

  • In our case, we have 20+ instances of the same controller in memory and after that, they've got GC'ed.

@fxgallego
Copy link

I'm suffering with this issue too.
If you patch RouteSet#clear! as below everything appears to work ok and fast after each reload

        def clear!
          @routes  = {}
          @path_helpers = Set.new
          @url_helpers = Set.new
          @url_helpers_module  = Module.new
          @path_helpers_module = Module.new
        end

The code is the same as the #initialize method.
Why we need to keep the anonymous module alive and cleaning it between requests? Is it because sometimes only routes are reloaded?
I'm sure that my patch will broke something, but what?

@rafaelfranca
Copy link
Member

Is it because sometimes only routes are reloaded?

Yes, it is possible to only reload the routes without reloading the other constants.

This is a very interesting issue. I wonder what is holding the references to those request objects even after a GC. Have you tried to force a GC after each request to see if that helps?

@fxgallego
Copy link

fxgallego commented Sep 20, 2018

Yes forced GC and nothing changes, I've added an initializer with this code (Rails 4.2.10, Ruby 2.5.1 W10)

module Rails
  class Application
    class RoutesReloader

      def reload!
        GC.start(full_mark: true, immediate_sweep: true)
        result = nil
        measure("Clear Paths") {clear!}
        measure("Load paths") {load_paths}
        measure("Finalize") {result = finalize!}
        result
      ensure
        revert
      end

      def measure(name, &block)
        start = Time.now
        block.call
        Rails.logger.info("#{name}: #{Time.now - start} seconds")
      end

    end
  end
end

@tenderlove
Copy link
Member

@mariusandra I was able to reproduce the controller leak by following your instructions (thanks for the detailed analysis btw!). I'm seeing the controller leak limited to the number of Puma threads. Heap dumps only contain the number of threads + 1 instances of a controller (from what I can tell).

I made a test app you can clone here. I tested by just curling:

[aaron@TC-186 memleak (master)]$ curl http://localhost:3000/users
{"address":"0x00007fcb0839d548", "type":"OBJECT", "class":"0x00007fcb0838f330", "ivars":13, "references":["0x00007fcb083ff2c0", "0x00007fcb0839d818", "0x00007fcb0839d2c8", "0x00007fcb084941b8", "0x00007fcb08394088", "0x00007fcb08384a70", "0x00007fcb0800a2c8"], "file":"/Users/aaron/git/rails/actionpack/lib/action_controller/metal.rb", "line":252, "method":"new", "generation":158, "memsize":144, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true, "pinned":true}}
,{"address":"0x00007fcb0842f218", "type":"OBJECT", "class":"0x00007fcb08426a78", "ivars":17, "references":["0x00007fcb084adca8", "0x00007fcb0842f420", "0x00007fcb0842f038", "0x00007fcb08c26430", "0x00007fcb084a7790", "0x00007fcb08427798", "0x00007fcb0841c1b8", "0x00007fcb0800bee8", "0x00007fcb0849e370", "0x00007fcb0849ca70"], "file":"/Users/aaron/git/rails/actionpack/lib/action_controller/metal.rb", "line":252, "method":"new", "generation":122, "memsize":176, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}

In the above case I booted puma like this:

$ RAILS_MAX_THREADS=1 rails server

I always see RAILS_MAX_THREADS + 1 leak. AFAICT, the stack is holding a reference to the controller instance but I'm doing a bit more analysis (I'll update this ticket when I get more info out of the heap dumps).

@mariusandra do you see unbounded leaks? If the leaks are bounded, we should probably figure out a faster way to do method reloading.

@tenderlove
Copy link
Member

tenderlove commented Sep 21, 2018

I need to stop working on this today, but here is what I've found so far.

  1. You don't need to modify the controller to see the "leaked" objects. Just making a request is enough.
  2. Some of these objects are clearly being held by the vm stack, others look like they are somehow being held by a reference from the worker thread.

The test app I linked to earlier can dump the heap if you pass the right parameter. I've just been curling a few times, then dumping the heap, then looking at the references. The following program will output a dot file that only contains the part of the heap that can reach a particular object:

require 'json'
require 'set'
require 'cgi'

class Slot
  attr_reader :info, :address_to_slot

  def initialize info, address_to_slot
    @info            = info
    @address_to_slot = address_to_slot
    @can_reach       = {}
  end

  def references
    @references ||= (info[__method__.to_s] || []).uniq.map { |a|
      address_to_slot[a]
    }.compact
  end

  def address; info[__method__.to_s]; end
  def root; info[__method__.to_s]; end
  def root?; info["type"] == "ROOT"; end

  def reachable? addr, seen = Set.new
    return if seen.include?(address)

    seen << address

    @can_reach.fetch(addr) do
      reachable = address == addr || references.any? { |ref|
        ref.reachable?(addr, seen)
      }
      @can_reach[addr] = reachable
    end
  end

  def marked?; @can_reach.values.any?; end
  def id;      root? ? root : address; end

  def label
    attributes = %w{ type file line }.map { |key| info[key] }.compact.map { |k|
      CGI.escape_html(k.to_s)
    }.join " | "
    extras = ''
    if @can_reach[address]
      extras = "style=bold fontsize=30 fontcolor=red color=red"
    end
    "#{id.inspect} [label=\"{ #{id} | #{attributes} }\"#{extras}];"
  end
end

address_to_slot = {}
roots           = []
all_nodes       = []

File.open(ARGV.shift, 'r') do |f|
  f.each_line do |line|
    info = JSON.parse line
    slot = Slot.new info, address_to_slot
    all_nodes << slot
    if slot.root?
      roots << slot
    else
      address_to_slot[slot.address] = slot
    end
  end
end

ARGV.each do |needle|
  needle = needle.gsub(/x[0]+/, 'x')

  roots.each do |root|
    root.reachable? needle
  end
end

puts "digraph g {"
puts "node [shape=record];"

all_nodes.each do |node|
  next unless node.marked?
  puts node.label
  node.references.each do |ref|
    next unless ref.marked?
    puts "#{node.id.inspect} -> #{ref.id.inspect}"
  end
end

puts "}"

I've been using this program to find what references the "leaked" objects. I just give it the heap JSON file along with the address of one of the leaked objects returned from the controller. Some of the references are pretty straight forward. The red node is the "leaked" object. This one is clearly just being held by the stack:

Stack Held

Some of these are not. This one is more complicated:

out2

The thing that's confusing me about the second one is that there is a thread that's holding a direct reference to the "env" hash and it has a reference to the object:

out2 dot 2018-09-20 18-12-07

I am having a hard time figuring out how it has this direct reference. It might still be a stack for that thread, but I would have expected that to be part of the machine context roots. I will look more tomorrow.

Also, the heap dump code uses the object's mark function to determine references. The mark function for threads can be found here.

Also, each node in the graph shows the address, type, allocation file, and allocation line for that object. Unfortunately I don't know why Threads are keeping a reference to that hash. 😞

@tenderlove
Copy link
Member

Also I've found that switching to webrick for local dev fixes this problem.

@fxgallego
Copy link

@tenderlove I'm using webrick with rails 4.2.10 ruby 2.5.1 and the problem with the routes reloader is there.Maybe we are talking about different things.

@tenderlove
Copy link
Member

@fxgallego ya, I think so. I'm trying to track down why these objects are still alive (I think it might still be on the thread stack). However, I don't think we should be reloading the routes file so frequently (as opposed to this commit: b9b06da).

If we addressed reloading the routes file so frequently, then these live objects wouldn't matter so much. So we might be talking about different things, but I think they are related 😊

@fxgallego
Copy link

fxgallego commented Sep 22, 2018

Hi, I've found the source of my problem. I'm on Rails 4.2.10 and found that there is no leak with controllers in this version. In order to build a large application menu in a view, I found some code that was constantizing the controller of each menu element only to ask its human_name. When the controller loads, it includes the module returned by Rails.application.routes.url_helpers. And I have a lot of controllers and routes. The problem is that Rails.application.routes.url_helpers returns a new Module on each call. This new module, included in each controller causes the performance problem when routes are reloaded.

So my first step was to search "Memoize Rails.application.routes.url_helpers" and the first result was #24554 . Memoizing #url_helpers has a lot of sense to me, because you can inadvertently do the same I was doing. One of the justifications to not merge the PR was that #url_helpers is expected to be called only one time on each request. That's unlikely to happen, and difficult to try avoid. (try a breakpoint on #url_helpers to see how many times is called)

At the end, removing the offending code bring the routes reload times to normal. Patching #url_helpers to return always the same module instance would help a little more.

tenderlove added a commit that referenced this issue Sep 24, 2018
This commit eagerly builds the route helper module after the routes have
been drawn and finalized.  This allows us to cache the helper module but
not have to worry about people accessing the module while route
definition is "in-flight", and automatically deals with cache
invalidation as the module is regenerated anytime someone redraws the
routes.

The restriction this commit introduces is that the url helper module can
only be accessed *after* the routes are done being drawn.

Refs #24554 and #32892
@dillonwelch
Copy link
Contributor

This might have some relation to #32304 that I made earlier this year.

tenderlove added a commit that referenced this issue Sep 25, 2018
This commit eagerly builds the route helper module after the routes have
been drawn and finalized.  This allows us to cache the helper module but
not have to worry about people accessing the module while route
definition is "in-flight", and automatically deals with cache
invalidation as the module is regenerated anytime someone redraws the
routes.

The restriction this commit introduces is that the url helper module can
only be accessed *after* the routes are done being drawn.

Refs #24554 and #32892
@rails-bot
Copy link

rails-bot bot commented Dec 24, 2018

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 5-2-stable branch or on master, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Dec 24, 2018
@rails-bot rails-bot bot closed this as completed Dec 31, 2018
@budu
Copy link

budu commented Jan 3, 2019

Is there any new development for this issue?

The PR #33970 has been reverted and I can't find anything else.

@rafaelfranca rafaelfranca reopened this Jan 3, 2019
@rails-bot rails-bot bot removed the stale label Jan 3, 2019
@rafaelfranca rafaelfranca added this to the 6.0.0 milestone Jan 3, 2019
@budu
Copy link

budu commented Jan 15, 2019

I've been investigating a similar issue in one large application I am working on. I tried to look into it a couple of times before, but never had enough time. This ticket has been an incredible source of ideas to further my research and I took last weekend to go deeper into it.

The issue was quite severe in my project. It can be easily replicated by calling reload! in the console. A single call would begin at approximately 6 seconds the first time and take more than a minute after 10 reloads. After some benchmarks, most of the slowness was caused by creating/clearing routes during reloading, even though the number of routes/helpers remained stable. The memory footprint would grow linearly with each reload.

I've used a very simple brute force approach inspired by http://www.be9.io/2015/09/21/memory-leak/, that is:

  • setup ObjectSpace with tracing: require 'objspace'; ObjectSpace.trace_object_allocations_start
  • generate (sorted) dumps interleaved with reloads: 3.times {|n| reload!; GC.start; sleep 1; f="dump#{n}"; File.open(f, 'w'){|io| ObjectSpace.dump_all(output: io)}; %x(sort #{f} > #{f}s)}
  • get what has potentially leaked using the command line: comm -13 dump0s dump1s > dump01 && comm -12 dump01 dump2s > leak

From there, I would look into the resulting file for anything out of the ordinary, especially duplicated class names. I wasted a lot of time at first, but I ended up getting better at spotting the culprits. Once I had an interesting memory address, I used @tenderlove's script (slightly modified) from a previous comment to look at the memory tree to see what was happening. Here's some zoomed in screenshots of what I found:

leak1

The one above show a leaked class from translation models created by Globalize.

leak2

That one is an example of a leak coming from models using HABTM association.

There was also another leak originating from ActiveAdmin controller classes. The commonality between all these leaks was various anonymous classes referenced by the ActiveSupport::DescendantsTracker module.

After looking into the module code and searching for known issues, I've found there was an issue about this problem #31395 and a pull request #31442. I then proceeded to extract the fixed DescendantsTracker module and monkey patch it in development and voilà, no more leak!

I'll see if I can help with #31442 in the coming weeks.

@cmthakur
Copy link

Do we have any way around to resolve this problem temporarily until the issue gets fixed from rails itself?
@tenderlove @budu

@budu
Copy link

budu commented Mar 25, 2019

@cmthakur You can use the patched DescendantsTracker by @ebeigarts here: https://github.com/rails/rails/blob/ea8f8ed7ac91bc397a6ac6cf2faad36e4a9a0616/activesupport/lib/active_support/descendants_tracker.rb

There's some actions in the issue #31442 it might get merged soon.

@mariusandra
Copy link
Author

Hi everyone! As the author of the issue I just want to report that ever since we upgraded to Rails 6 and started using the Zeitwerk loader, these reload slowdowns have vanished. So the issue is solved for me! Others in my team also report similar results.

Thus I'm closing the issue! 🎉🎉🎉

(Note: I didn't test Rails 6 without the Zeitwerk loader, so I can't say if that configuration would also fix the issue or not... Probably though because AFAIK the dependents tracker patch that's linked above made it into the core of Rails and not Zeitwerk explicitly...)

@kaspth
Copy link
Contributor

kaspth commented Sep 23, 2019

Woot, that’s great to hear! 😄

@caifara
Copy link

caifara commented Sep 24, 2019

@mariusandra Thanks for all the work and the follow up! Guess we need to upgrade sooner rather than later!

@kwent
Copy link

kwent commented Sep 27, 2019

@mariusandra I can confirm than even with config.load_defaults 5.2 in config/application.rb, issue is fixed.

module MyApp
  class Application < Rails::Application
    # Initialize configuration defaults for originally generated Rails version.
    config.load_defaults 5.2
end

👍

@xrimpy
Copy link

xrimpy commented Jul 12, 2020

I updated to Rails 6, just because of this issue.

Figured out the AA Filter's bug was was causing the issue activeadmin/activeadmin#5752

Had missed config.load_defaults 6.0

Thank You

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests