Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActionCable can deplete AR's connection pool #23778

Closed
AndreasWurm opened this issue Feb 19, 2016 · 16 comments
Assignees
Milestone

Comments

@AndreasWurm
Copy link

@AndreasWurm AndreasWurm commented Feb 19, 2016

Normally ActionCable::ServerWorker::ActiveRecordConnectionManagement takes care to ensure connections are checked back into the pool. This is not ensured when using ActiveRecord from within a ActionCable::Channel::Streams stream_form/stream_for callback.

Observed this behaviour in rails 5.0.0.beta2.

Example:

class MyChannel < ApplicationCable::Channel
  def subscribed
    queue = "my_channels_queue"
    stream_from queue, -> (message) do
      # open a db connection, this will not be returned to the pool
      ActiveRecord::Base.connection
      transmit ActiveSupport::JSON.decode(message), via: queue
    end
  end
end
@maclover7

This comment has been minimized.

Copy link
Member

@maclover7 maclover7 commented Feb 19, 2016

If a user is manually calling ActiveRecord::Base.connection, shouldn't they have to return it to Active Record when they are done using the connection? Not sure it is Action Cable's responsibility to cleanup here...

@AndreasWurm

This comment has been minimized.

Copy link
Author

@AndreasWurm AndreasWurm commented Feb 19, 2016

well its just an example.. if I do MyModel.find(1) or any AR call instead the same thing happens, and my expectation was it just works and its taken care of.

The Readme states

You have access to your full domain model written with Active Record or your ORM of choice.

and has examples using active record without managing connections manually, and it works everywere else since ActionCable::ServerWorker::ActiveRecordConnectionManagement takes care of things, just not in the provided example. Personally I consider this a bug since I dont think anyone would expect that for this method its their responsibility to manage the pool, when everywhere else its taken care of.

@matthewd matthewd added this to the 5.0.0 milestone Feb 19, 2016
@matthewd matthewd self-assigned this Feb 19, 2016
@zhu1230

This comment has been minimized.

Copy link

@zhu1230 zhu1230 commented Feb 20, 2016

+1

@benoitongit

This comment has been minimized.

Copy link

@benoitongit benoitongit commented Feb 21, 2016

Same issue, I get the following error after using ActiveRecord from an ApplicationCable::Channel class:
"ActiveRecord::ConnectionTimeoutError - could not obtain a connection from the pool within 5.000."
Tested with last commit: 22c318d
Also good to point out that this issue does not happen when testing with this version: 49f6ce6

@dhh

This comment has been minimized.

Copy link
Member

@dhh dhh commented Feb 24, 2016

@jeremy Same issue as we were seeing, it appears.

@doits

This comment has been minimized.

Copy link
Contributor

@doits doits commented Feb 25, 2016

After updating to 5.0.0.beta3, I get a lot of ActiveRecord::ConnectionTimeoutError when running a large test suite of my rails app (including a lot of feature specs). With 5.0.0.beta2 it does not happen. Only mysterious thing: I don't use ActionCable at all. So either I hit a different bug or it is the same but the source doesn't lie in ActionCable but goes deeper.

@dhh

This comment has been minimized.

Copy link
Member

@dhh dhh commented Feb 25, 2016

@dhh dhh assigned jeremy and unassigned matthewd Feb 25, 2016
@chashmeetsingh

This comment has been minimized.

Copy link
Contributor

@chashmeetsingh chashmeetsingh commented Feb 25, 2016

I tried many configurations for the gem file bouncing between beta2 and 3 and found that the issue is caused by the active job gem beta3 , beta2 works just fine
Hope this helps!
https://github.com/chashmeetsingh/WebsocketSample

@jeremy

This comment has been minimized.

Copy link
Member

@jeremy jeremy commented Feb 25, 2016

The issue we're seeing is that the AJ async adapter creates enormous executor pools for each queue, quickly exhausting the db conn pool.

Quick hack in config/environments/development.rb:

ActiveJob::AsyncJob.const_set :DEFAULT_EXECUTOR_OPTIONS,
  ActiveJob::AsyncJob::DEFAULT_EXECUTOR_OPTIONS.merge(min_threads: 0, max_threads: 2, idletime: 10)

The defaults now are min = cpu count, max = 10 * cpu count, idle time = 1 minute.

If you have 4 CPUs and 10 different AJ queue names, that means up to 400 db connections per process.

Large pools are OK, really, but they need to release connections back to the pool after performing each job. And, ideally, they'd use a separate db connection pool that doesn't compete for the same resources as serving web requests.

@matthewd

This comment has been minimized.

Copy link
Member

@matthewd matthewd commented Feb 25, 2016

I expect to be able to address this via #23807.

@jeremy

This comment has been minimized.

Copy link
Member

@jeremy jeremy commented Feb 25, 2016

@matthewd That is… awesome! 🤘 Tames many beasts.

Another less-hacky workaround. Explicitly release connections after performing jobs in dev/test:

  # Dev and test use the async adapter which sets up large
  # worker pools per queue, easily exhausting available db conns.
  # That's only a problem because we don't release connections
  # back to the pool by default.
  if queue_adapter.is_a? ActiveJob::QueueAdapters::AsyncAdapter
    after_perform do
      ActiveRecord::Base.clear_active_connections!
    end
  end
@doits

This comment has been minimized.

Copy link
Contributor

@doits doits commented Feb 25, 2016

if queue_adapter.is_a? ActiveJob::QueueAdapters::AsyncAdapter

Ah, that is exactly the problem I have. With the latest change of ActiveJob, from the changelog ...

Change the default adapter from inline to async. [...]

... my tests (including a lot of mail specs) began using the as AsyncAdapter and therefore the problem manifested itself. Nice the puzzle fits together for me now ;-)

@jeremy

This comment has been minimized.

Copy link
Member

@jeremy jeremy commented Mar 1, 2016

@AndreasWurm Could you check with master now that #23807 merged?

@AndreasWurm

This comment has been minimized.

Copy link
Author

@AndreasWurm AndreasWurm commented Mar 2, 2016

@jeremy
changed to rails master but either I get this error from javascript "Uncaught Error: Existing connection must be closed before opening" and I cant send through the channel since its undefined or puma just stops responding at all. So cant really test the scenario. Also just from looking at the patch I suspect it doesnt fix my original issue since it basically just does the same as before: callbacks on work, subscribe and unsubscribe.

In my original post i missed to report Im using the redis adapter.

Also something i noticed thats bugging me:

stream_from "some_queue", -> (message) do
  raise "This will silently vanish"
  transmit ActiveSupport::JSON.decode(message), via: "some_queue"
end
@AndreasWurm

This comment has been minimized.

Copy link
Author

@AndreasWurm AndreasWurm commented Mar 7, 2016

with the current master and only one channel it works again. the error still exists and connections are not checked back into the pool when used from inside a stream_from callback.

@jeremy

This comment has been minimized.

Copy link
Member

@jeremy jeremy commented Mar 7, 2016

@AndreasWurm Confirm. Manual connection management is still required in stream blocks, but shouldn't be. Working on it.

@sgrif sgrif closed this in #24540 Apr 14, 2016
sgrif added a commit that referenced this issue Apr 14, 2016
Alternate implementation of #24162 with tests. The code had diverged
too far on master to pull that implemenation directly.

Fixes #23778
Close #24162

[Mattew Draper & Sean Griffin]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.