Action Cable memory leak #26119
Steps to reproduce
```sh
$ redis-server &
$ git clone git@github.com:chrismccord/channelsac.git
$ cd channelsac/rails
$ bundle
$ bundle exec puma -e production -w 8
```
Next, visit http://localhost:3000 and refresh the app multiple times. On our hardware, we can watch the memory grow unbounded, as seen here:
This was after an apparent "warm up" period where Rails appeared to be lazy loading resources, but the first several dozen page loads showed oddly high memory growth. I wasn't sure if this was lazy loading or the puma workers warming up, so it may be benign.
Either way, memory appears to grow unbounded after this initial high-growth period and is never reclaimed.
Expected behavior
The memory is reclaimed following connection cleanup.

Actual behavior
Memory grows with each new connection and is not reclaimed after connections die.
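For anyone reproducing this without a browser, a loop like the following can drive the repeated page loads (a rough sketch, not part of the original report; it assumes the app is running on http://localhost:3000 as in the steps above):

```ruby
require 'net/http'

# Repeatedly fetch the root page, mimicking manual browser refreshes,
# so worker memory can be watched (e.g. via top) while it runs.
uri = URI('http://localhost:3000/')
1_000.times do |i|
  Net::HTTP.get(uri)
  puts "#{i + 1} requests sent" if ((i + 1) % 100).zero?
end
```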
Let me know if I can provide more information to help diagnose this further.
I tried to reproduce this, but I'm not getting any traction. My processes max out at around 110MB and will even reclaim memory if I let it sit. I used the same Ruby version and same Gemfile.lock. AFAICT, the only thing I'm doing differently is that I'm running on OS X.
I have tried reloading many times in both Safari and Chrome. I also ran this script to churn websocket connections against the `/cable` endpoint:

```ruby
require 'thread'

# Bounded queue: never more than 100 outstanding connections.
queue = SizedQueue.new 100

# Reaper: kill one connection per second so connections keep cycling.
reaper = Thread.new do
  while thread = queue.pop
    thread.kill
    sleep 1
  end
end

# Open websocket connections as fast as the queue allows.
loop do
  t = Thread.new do
    system 'curl -s -i -N -H "Connection: Upgrade" -H "Upgrade: websocket" -H "Host: localhost" -H "Origin: http://localhost:3000" http://localhost:3000/cable > /dev/null'
  end
  queue << t
end
```
Still no memory leaks.
I applied this patch so that I could get memory statistics:
```diff
diff --git a/rails/app/controllers/pages_controller.rb b/rails/app/controllers/pages_controller.rb
index bb7924d..90f4104 100644
--- a/rails/app/controllers/pages_controller.rb
+++ b/rails/app/controllers/pages_controller.rb
@@ -1,4 +1,11 @@
 class PagesController < ApplicationController
+  def gc
+    GC.start
+    GC.start
+    GC.start
+    render json: JSON.dump(GC.stat)
+  end
+
   def show
   end
 end
diff --git a/rails/config/routes.rb b/rails/config/routes.rb
index e0fbf08..984263b 100644
--- a/rails/config/routes.rb
+++ b/rails/config/routes.rb
@@ -2,4 +2,5 @@ Rails.application.routes.draw do
   # For details on the DSL available within this file, see http://guides.rubyonrails.org/routing.html
   root to: "pages#index"
   mount ActionCable.server => "/cable"
+  get "/pages/gc", :to => 'pages#gc'
 end
```
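With that patch in place, something like this can poll the new endpoint and log a few GC counters over time (a minimal sketch, assuming the patched app is running on localhost:3000; the keys are standard GC.stat fields):

```ruby
require 'net/http'
require 'json'

# Sample the GC statistics exposed by the /pages/gc endpoint above,
# to see whether the Ruby heap itself is growing.
loop do
  stats = JSON.parse(Net::HTTP.get(URI('http://localhost:3000/pages/gc')))
  puts format('gc_count=%d live_slots=%d free_slots=%d',
              stats['count'], stats['heap_live_slots'], stats['heap_free_slots'])
  sleep 10
end
```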
My current hunch is that this is platform-specific. The only difference I can see between our environments is that I'm running on OS X and the leak was observed on Ubuntu. @chrismccord, does the script I pasted above reproduce the memory leak in your environment? I'd like to automate the repro steps as much as possible.
I applied the patches and put together a new branch
My findings on OS X are similar to yours. My four puma workers spin up to 118MB quickly, then creep up slowly to ~130MB, where they settle after 10 minutes, which seems normal. I reprovisioned a new Ubuntu instance and can replicate the seemingly unbounded memory growth there again. The puma workers continue to creep up after 20 minutes of running a couple of browser tabs against the server, consuming 164MB when I quit.

In both OS X and Ubuntu, the GC statistics appeared to remain steady, so I don't have a good answer for where that memory is going within the puma processes.
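One way to line that up is to sample each worker's resident set size alongside the GC numbers (a rough sketch; it assumes a Linux/OS X `ps` and that the worker processes have "puma" in their command line):

```ruby
# Print the RSS of every puma process once a minute, so process-level
# growth can be compared against the (steady) GC statistics.
loop do
  sizes = `ps ax -o rss=,command=`.lines
            .select { |line| line.include?('puma') }
            .map    { |line| line.split.first.to_i }
  puts "#{Time.now.strftime('%H:%M:%S')}  puma RSS (KB): #{sizes.join(', ')}"
  sleep 60
end
```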
If I run the
@tenderlove if you can send me your public key, I can give you access to the server to take a look. If not, let me know so I can tear it down. Thanks!
I'd send you my keys, but I'm not going to be able to poke at it until next week.
It's possible that we have a
I hit the gc endpoint many times after running the tsung benchmark, but I only paid attention to `heap_live_slots`, which did steadily decrease, even though the puma processes maintained their memory usage.
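That combination isn't necessarily contradictory: `heap_live_slots` counts live objects on the Ruby heap, while the process RSS also includes pages the allocator has retained after objects are freed. A small illustration of the effect (illustrative only, not code from this issue; exact numbers will vary by platform and allocator):

```ruby
# Allocate a large batch of strings, drop them, and force a GC:
# live slots fall sharply, but the process RSS often stays high
# because freed pages are retained by the allocator rather than
# returned to the operating system.
big = Array.new(500_000) { 'x' * 128 }
before = GC.stat[:heap_live_slots]
big = nil
GC.start
puts "live slots: #{before} -> #{GC.stat[:heap_live_slots]}"
puts "RSS (KB): #{`ps -o rss= -p #{Process.pid}`.strip}"
```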
No worries! I'll kill off the instance, but I took a snapshot so I can reprovision quickly once you're ready :)
Hi. Sorry it's taken so long to get to this. I finally got the application up and running on an Ubuntu virtual machine. When I use tsung though, I get this error:
I tried hitting the page and refreshing a bunch, but still can't reproduce it. All processes would get up to about 148M and just sit there.
@chrisarcand are you able to reproduce this without