-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rails development mode memory leak and slowdown, also on new apps #32892
Comments
Thanks so much, this is exceptionally well researched and explained! We're seeing this in our 4.2 app as well. I've never had the time to dig into out, but I'm really glad you did 😊 @pixeltrix would you happen to have any ideas about how to untie the kept references after the routes module unload? |
@kaspth thanks for confirming you're also seeing this! It's been a month and there are no other replies. So I would like to ask whoever is still reading this: do you have the same problem? It's easy to test in two ways:
puts "!! " + ObjectSpace.each_object(Object).select{|o| o.class.ancestors.to_s.include?('ApplicationController') }.map(&:to_s).inspect ... into any controller, add a "!" and reload, does the list of things printed get longer and longer? It would be great to know if this a) is a non-issue, b) is a standard issue that affects everyone, is usually not noticeable, but is made much much worse by unique my selection of gems and models... or c) my PC is just haunted 👻 :) Thanks! |
"The list grows 7ish long but eventually garbage collection starts and I'm back at 1."... is a reply I got when asking around Slack about this issue. Could this mean the problem can be split into two:
Even if part of the slowdown is specific to my app, what can explain the growing memory footprint in others? Is there someone who has tried the one-liner above and is not seeing this buildup of classes? I tried and could go up to 8 instances of my controller in memory before breaking down from the slowness of it all. No GC happened. |
We run into that issue too: First reload:
|
I'm suffering with this issue too. def clear!
@routes = {}
@path_helpers = Set.new
@url_helpers = Set.new
@url_helpers_module = Module.new
@path_helpers_module = Module.new
end The code is the same as the #initialize method. |
Yes, it is possible to only reload the routes without reloading the other constants. This is a very interesting issue. I wonder what is holding the references to those request objects even after a GC. Have you tried to force a GC after each request to see if that helps? |
Yes forced GC and nothing changes, I've added an initializer with this code (Rails 4.2.10, Ruby 2.5.1 W10) module Rails
class Application
class RoutesReloader
def reload!
GC.start(full_mark: true, immediate_sweep: true)
result = nil
measure("Clear Paths") {clear!}
measure("Load paths") {load_paths}
measure("Finalize") {result = finalize!}
result
ensure
revert
end
def measure(name, &block)
start = Time.now
block.call
Rails.logger.info("#{name}: #{Time.now - start} seconds")
end
end
end
end |
@mariusandra I was able to reproduce the controller leak by following your instructions (thanks for the detailed analysis btw!). I'm seeing the controller leak limited to the number of Puma threads. Heap dumps only contain the number of threads + 1 instances of a controller (from what I can tell). I made a test app you can clone here. I tested by just curling:
In the above case I booted puma like this:
I always see @mariusandra do you see unbounded leaks? If the leaks are bounded, we should probably figure out a faster way to do method reloading. |
I need to stop working on this today, but here is what I've found so far.
The test app I linked to earlier can dump the heap if you pass the right parameter. I've just been curling a few times, then dumping the heap, then looking at the references. The following program will output a dot file that only contains the part of the heap that can reach a particular object: require 'json'
require 'set'
require 'cgi'
class Slot
attr_reader :info, :address_to_slot
def initialize info, address_to_slot
@info = info
@address_to_slot = address_to_slot
@can_reach = {}
end
def references
@references ||= (info[__method__.to_s] || []).uniq.map { |a|
address_to_slot[a]
}.compact
end
def address; info[__method__.to_s]; end
def root; info[__method__.to_s]; end
def root?; info["type"] == "ROOT"; end
def reachable? addr, seen = Set.new
return if seen.include?(address)
seen << address
@can_reach.fetch(addr) do
reachable = address == addr || references.any? { |ref|
ref.reachable?(addr, seen)
}
@can_reach[addr] = reachable
end
end
def marked?; @can_reach.values.any?; end
def id; root? ? root : address; end
def label
attributes = %w{ type file line }.map { |key| info[key] }.compact.map { |k|
CGI.escape_html(k.to_s)
}.join " | "
extras = ''
if @can_reach[address]
extras = "style=bold fontsize=30 fontcolor=red color=red"
end
"#{id.inspect} [label=\"{ #{id} | #{attributes} }\"#{extras}];"
end
end
address_to_slot = {}
roots = []
all_nodes = []
File.open(ARGV.shift, 'r') do |f|
f.each_line do |line|
info = JSON.parse line
slot = Slot.new info, address_to_slot
all_nodes << slot
if slot.root?
roots << slot
else
address_to_slot[slot.address] = slot
end
end
end
ARGV.each do |needle|
needle = needle.gsub(/x[0]+/, 'x')
roots.each do |root|
root.reachable? needle
end
end
puts "digraph g {"
puts "node [shape=record];"
all_nodes.each do |node|
next unless node.marked?
puts node.label
node.references.each do |ref|
next unless ref.marked?
puts "#{node.id.inspect} -> #{ref.id.inspect}"
end
end
puts "}" I've been using this program to find what references the "leaked" objects. I just give it the heap JSON file along with the address of one of the leaked objects returned from the controller. Some of the references are pretty straight forward. The red node is the "leaked" object. This one is clearly just being held by the stack: Some of these are not. This one is more complicated: The thing that's confusing me about the second one is that there is a thread that's holding a direct reference to the "env" hash and it has a reference to the object: I am having a hard time figuring out how it has this direct reference. It might still be a stack for that thread, but I would have expected that to be part of the machine context roots. I will look more tomorrow. Also, the heap dump code uses the object's mark function to determine references. The mark function for threads can be found here. Also, each node in the graph shows the address, type, allocation file, and allocation line for that object. Unfortunately I don't know why Threads are keeping a reference to that hash. 😞 |
Also I've found that switching to webrick for local dev fixes this problem. |
@tenderlove I'm using webrick with rails 4.2.10 ruby 2.5.1 and the problem with the routes reloader is there.Maybe we are talking about different things. |
@fxgallego ya, I think so. I'm trying to track down why these objects are still alive (I think it might still be on the thread stack). However, I don't think we should be reloading the routes file so frequently (as opposed to this commit: b9b06da). If we addressed reloading the routes file so frequently, then these live objects wouldn't matter so much. So we might be talking about different things, but I think they are related 😊 |
Hi, I've found the source of my problem. I'm on Rails 4.2.10 and found that there is no leak with controllers in this version. In order to build a large application menu in a view, I found some code that was constantizing the controller of each menu element only to ask its human_name. When the controller loads, it includes the module returned by So my first step was to search "Memoize Rails.application.routes.url_helpers" and the first result was #24554 . Memoizing #url_helpers has a lot of sense to me, because you can inadvertently do the same I was doing. One of the justifications to not merge the PR was that #url_helpers is expected to be called only one time on each request. That's unlikely to happen, and difficult to try avoid. (try a breakpoint on #url_helpers to see how many times is called) At the end, removing the offending code bring the routes reload times to normal. Patching #url_helpers to return always the same module instance would help a little more. |
This commit eagerly builds the route helper module after the routes have been drawn and finalized. This allows us to cache the helper module but not have to worry about people accessing the module while route definition is "in-flight", and automatically deals with cache invalidation as the module is regenerated anytime someone redraws the routes. The restriction this commit introduces is that the url helper module can only be accessed *after* the routes are done being drawn. Refs #24554 and #32892
This might have some relation to #32304 that I made earlier this year. |
This commit eagerly builds the route helper module after the routes have been drawn and finalized. This allows us to cache the helper module but not have to worry about people accessing the module while route definition is "in-flight", and automatically deals with cache invalidation as the module is regenerated anytime someone redraws the routes. The restriction this commit introduces is that the url helper module can only be accessed *after* the routes are done being drawn. Refs #24554 and #32892
This issue has been automatically marked as stale because it has not been commented on for at least three months. |
Is there any new development for this issue? The PR #33970 has been reverted and I can't find anything else. |
I've been investigating a similar issue in one large application I am working on. I tried to look into it a couple of times before, but never had enough time. This ticket has been an incredible source of ideas to further my research and I took last weekend to go deeper into it. The issue was quite severe in my project. It can be easily replicated by calling I've used a very simple brute force approach inspired by http://www.be9.io/2015/09/21/memory-leak/, that is:
From there, I would look into the resulting file for anything out of the ordinary, especially duplicated class names. I wasted a lot of time at first, but I ended up getting better at spotting the culprits. Once I had an interesting memory address, I used @tenderlove's script (slightly modified) from a previous comment to look at the memory tree to see what was happening. Here's some zoomed in screenshots of what I found: The one above show a leaked class from translation models created by Globalize. That one is an example of a leak coming from models using HABTM association. There was also another leak originating from ActiveAdmin controller classes. The commonality between all these leaks was various anonymous classes referenced by the After looking into the module code and searching for known issues, I've found there was an issue about this problem #31395 and a pull request #31442. I then proceeded to extract the fixed I'll see if I can help with #31442 in the coming weeks. |
Do we have any way around to resolve this problem temporarily until the issue gets fixed from rails itself? |
@cmthakur You can use the patched DescendantsTracker by @ebeigarts here: https://github.com/rails/rails/blob/ea8f8ed7ac91bc397a6ac6cf2faad36e4a9a0616/activesupport/lib/active_support/descendants_tracker.rb There's some actions in the issue #31442 it might get merged soon. |
Hi everyone! As the author of the issue I just want to report that ever since we upgraded to Rails 6 and started using the Zeitwerk loader, these reload slowdowns have vanished. So the issue is solved for me! Others in my team also report similar results. Thus I'm closing the issue! 🎉🎉🎉 (Note: I didn't test Rails 6 without the Zeitwerk loader, so I can't say if that configuration would also fix the issue or not... Probably though because AFAIK the dependents tracker patch that's linked above made it into the core of Rails and not Zeitwerk explicitly...) |
Woot, that’s great to hear! 😄 |
@mariusandra Thanks for all the work and the follow up! Guess we need to upgrade sooner rather than later! |
@mariusandra I can confirm than even with module MyApp
class Application < Rails::Application
# Initialize configuration defaults for originally generated Rails version.
config.load_defaults 5.2
end 👍 |
I updated to Rails 6, just because of this issue. Figured out the AA Filter's bug was was causing the issue activeadmin/activeadmin#5752 Had missed config.load_defaults 6.0 Thank You |
TL;DR; My decently sized Rails app (173 models, 208 controllers) is suffering from a continuously increasing slowdown in development mode that makes it hard to work on it. Each reload of the code takes longer than the previous one, with delays increasing to minutes after just a few reloads. I have identified a probable cause and it's also present on new apps (
rails new leakyapp
, no added gems). I haven't found a solution yet.The issue: path and url helpers are packaged into a module and included in controllers and other classes. When you
reload!
an app, the helper methods in this module are removed and then re-added. While doing this, ruby needs to remove the methods from each class that includes the module. However each reload of the code leaves the old class instances (e.g. the controller from the previous request, etc) in memory and thus ruby needs to clean route helpers from an ever increasing amount of classes. This gets slow after a while.Background
Hello,
For the past years I've been coping with an issue that Rails feels extremely slow when developing. I start the server, wait for it to boot up and open the site. It takes a minute or two before it loads.
Then I have to make a change to a controller, model or other .rb file and hit reload in the browser. It takes some seconds and the changed page opens just fine. However after each following change, the reload takes longer and longer. After a few rounds, it's faster to restart the server than to wait for Rails to reload the code. I'm often waiting minutes before anything shows up again.
At first I thought the issue has something to do with activeadmin, a properly slow gem in its own right. Since it always takes forever to reload if I make any change in the admin area, I thought the increasing slowness was just part of the package and I tolerated it for longer than I should have. Lately though it got so unbearable that I needed to find a solution.
So I started profiling. Opening on page in admin from a cold boot, I get this beautiful firegraph:
(using rack-mini-profiler with
?pp=flamegraph&flamegraph_sample_rate=20
).I guess this is a relatively normal startup. I need to see if I can somehow cache sprockets and check why is Thor still there, but other than that the app loads nicely.
Let's hit reload:
The graph follows the same shape as the previous one, just the part with sprockets is much smaller.
So far so good. Now let's make a change in a
.rb
file and then hit reload:It takes a while for the page to load. Well over 30 seconds. That's because ActiveAdmin needs to re-load all 100 or so pages just to draw a menu and generate all the paths. Oh well. However that's not the issue here.
The issue is that this part starts to get longer:
.. and longer:
.. and even longer:
.. and after about 6 reloads (I skipped a few images), it's half the size of the ActiveAdmin reload time...
After 10+ reloads I will be waiting minutes before anything opens. That's enough time to check HN and forget what you were doing.
So who is this culprit? It's this guy: ActionDispatch::Routing::RouteSet::NamedRouteCollection#clear!
The full stacktrace is here.
Huh. It takes an increasingly longer time to clear path helpers?
As far as I know, path and url helpers are modules, which are automatically included in all controllers and views. To get
*_path
methods into other classes (e.g. models), one should addinclude Rails.application.routes.url_helpers
to the class.So it's taking longer and longer to remove the
*_path
and*_url
methods from@path_helpers_module
.Does this mean that this module is included in more and more classes, which don't get garbage collected? Let's find out.
Throwing caution to the wind (don't do this at home), I opened and edited the
route_set.rb
file in the gem's folder and added some debug code.First just to measure what gets slower (it's the
@*_helpers.each
blocks, not the.clear
calls)Then to measure if there are just more
@path_helpers
and@url_helpers
. Nope, their number remains constant:So it must be that the module is included in an ever increasing number of classes:
Bingo!
It appears that on each reload, Rails reloads new versions all the classes, yet doesn't dispose of the old versions.
Who are all these classes?
Every reload adds a bunch of classes that look like this:
#<#<Class:0x00007fda668c8d48>:0x00007fda668c3690 @_routes=#<ActionDispatch::Routing::RouteSet:0x00007fda8015ad08>>
and one controller of the page I'm accessing (Admin::RequestsController
).Digging around, it seems that those classes are instances of this proxy class that's created when you call
Rails.application.routes.url_helpers
.So it seems something still holds a reference to the controllers from previous requests... and all those proxy url helper modules that are included in so many places.
I installed the
ruby-mass
gem in the hopes of figuring it out. Unfortunately I didn't get far. I couldn't find if there was any gem that for whichever reason kept references to the old classes.So I decided to see if I could replicate the issue with a smaller app. I didn't have to look far.
Just a simple
rails new bla
,rails g model something
,rake db:migrate
,rails g controller something
was enough to replicate this.Hacking
route_set.rb
and adding these lines:gives the following ouptut, after making a small change in
something_controller.rb
and hitting reload:The number of classes, from which the path helpers need to be cleared from, keeps increasing by 4 per reload. In an app as small as this, you don't notice it. However if it keeps increasing by an extra 200 classes each reload, it becomes a problem.
Let's select all the
SomethingController
s and see who has references to them:The
SomethingController
s are only referenced by someActionDispatch::Request
s that are hanging around.Looking at who is referencing those requests:
I investigated all of them in depth (via
Mass.references
) and couldn't find any top node that held a reference to any object from all the 3 different requests. The objects from each request however seem to have slightly cyclical relationships (e.g. a request holds a session, a session holds an env, the env holds the session and the request... or something like that). I'm not sure if this changes anything - they could and should be removed in bulk after a reload.Starting the GC manually does nothing.
And now I'm stuck.
Does anyone have an idea what is happening, if it's perhaps a feature not a bug, and how can I fix this so that my reload times go back to around 30 seconds? 😄
Thanks for reading! 🍻
Steps to reproduce
rails server
and open the endpoint from step 2bla
counter.Expected behavior
Rails should just keep the latest instance of the controller class in memory, not all of them.
Actual behavior
Every reload of the code adds a new controller instance in memory and cleaning url helper methods from all these old controllers takes longer after each reload.
System configuration
Rails version: 5.2.0
Ruby version: 2.5.0
Disclaimer
When I say "Rails does this" or "Rails does that" above, I'm using "Rails" as a shorthand for
Object
. The issue could easily come from any included gem, e.g.puma
or I might be doing something wrong myself.The text was updated successfully, but these errors were encountered: