Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fiber-safe ConnectionPool + Dataloader gobbles all available database connections #4739

Closed
swrobel opened this issue Dec 20, 2023 · 13 comments
Closed

Comments

@swrobel
Copy link
Contributor

swrobel commented Dec 20, 2023

Describe the bug
When using the Rails 7.1 setting config.active_support.isolation_level = :fiber, GraphQL eventually gobbles all database connections

Versions

graphql version: 2.2.1
rails (or other framework): 7.1.2
other applicable versions (graphql-batch, etc)

  • graphql-c_parser (1.0.7)
  • graphql-sources (1.2.0)

GraphQL schema

I can't figure out exactly which part of my schema is triggering this, as I can send the same query twice in series and have it succeed the first time, and the second time it will say it can't obtain a connection from the pool. However, the fields I'm requesting don't involve any dataloader calls.

GraphQL query

This happens with many queries, but here's a simple example GraphQL query and response

mutation InitializeUI($input: CreateSessionInput!) {
  createSession(input: $input) {
    errors
  }
}
{
  "input": {
    "refreshToken": "8f2b217b-ae3b-4b53-aa19-611bb4512666:85d1287c-74ef-495c-9449-001b88fb416f"
  }
}
{
  "data": {
    "createSession": {
      "errors": [],
    }
  }
}

Steps to reproduce

Send multiple queries in series

Expected behavior

Queries should complete successfully

Actual behavior

ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use

Full backtrace:

Click to view exception backtrace
7m     warn: Async::Task [oid=0x20bfc] [ec=0x20c10] [pid=17733] [2023-12-20 09:38:24 -0500]
            | Task may have ended with unhandled exception.
            |   ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use
            |   → vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:128 in `block in wait_poll'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:117 in `loop'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:117 in `wait_poll'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:78 in `internal_poll'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:204 in `internal_poll'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:73 in `block in poll'
            |     /opt/homebrew/Cellar/ruby/3.2.2_1/lib/ruby/3.2.0/monitor.rb:202 in `synchronize'
            |     /opt/homebrew/Cellar/ruby/3.2.2_1/lib/ruby/3.2.0/monitor.rb:202 in `mon_synchronize'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:82 in `synchronize'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:73 in `poll'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:661 in `acquire_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:355 in `checkout'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:184 in `connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_handler.rb:246 in `retrieve_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_handling.rb:287 in `retrieve_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_handling.rb:254 in `connection'
            |     vendor/bundle/ruby/3.2.0/gems/appmap-0.103.0/lib/appmap/handler/rails/sql_handler.rb:135 in `in_transaction?'
            |     vendor/bundle/ruby/3.2.0/gems/appmap-0.103.0/lib/appmap/handler/rails/sql_handler.rb:49 in `examine'
            |     vendor/bundle/ruby/3.2.0/gems/appmap-0.103.0/lib/appmap/handler/rails/sql_handler.rb:158 in `call'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:137 in `block in finish'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:26 in `block in iterate_guarding_exceptions'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:25 in `each'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:25 in `iterate_guarding_exceptions'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:125 in `each'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:136 in `finish'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:262 in `block in finish_with_values'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:26 in `block in iterate_guarding_exceptions'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:25 in `each'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:25 in `iterate_guarding_exceptions'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:261 in `finish_with_values'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/fanout.rb:254 in `finish'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/notifications/instrumenter.rb:64 in `instrument'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract_adapter.rb:1143 in `log'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/postgresql/database_statements.rb:53 in `raw_execute'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/database_statements.rb:521 in `internal_execute'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/postgresql/schema_statements.rb:245 in `client_min_messages='
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/postgresql_adapter.rb:1001 in `configure_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract_adapter.rb:744 in `reset!'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/postgresql_adapter.rb:377 in `block in reset!'
            |     vendor/bundle/ruby/3.2.0/gems/activesupport-7.1.2/lib/active_support/concurrency/null_lock.rb:9 in `synchronize'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/postgresql_adapter.rb:369 in `reset!'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:426 in `block in reap'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:424 in `each'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:424 in `reap'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:660 in `acquire_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:355 in `checkout'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:184 in `connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_handler.rb:246 in `retrieve_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_handling.rb:287 in `retrieve_connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_handling.rb:254 in `connection'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/core.rb:362 in `cached_find_by_statement'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/core.rb:404 in `cached_find_by'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/core.rb:285 in `find_by'
            |     vendor/bundle/ruby/3.2.0/gems/ar_lazy_preload-1.1.2/lib/ar_lazy_preload/active_record/base.rb:35 in `find_by'
            |     vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/secure_password.rb:51 in `authenticate_by'
            |     app/graphql/mutations/create_session.rb:45 in `authenticate_refresh_token'
            |     vendor/bundle/ruby/3.2.0/gems/skylight-6.0.1/lib/skylight/helpers.rb:184 in `authenticate_refresh_token'
            |     app/graphql/mutations/create_session.rb:19 in `resolve'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/resolver.rb:107 in `public_send'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/resolver.rb:107 in `block (3 levels) in resolve_with_support'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:770 in `minimal_after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/query.rb:372 in `after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/resolver.rb:94 in `block (2 levels) in resolve_with_support'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:770 in `minimal_after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/query.rb:372 in `after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/resolver.rb:85 in `block in resolve_with_support'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:770 in `minimal_after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/query.rb:372 in `after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/resolver.rb:73 in `resolve_with_support'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/has_single_input_argument.rb:36 in `resolve_with_support'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/relay_classic_mutation.rb:45 in `resolve_with_support'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/field.rb:698 in `public_send'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/field.rb:698 in `block (2 levels) in resolve'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/field.rb:832 in `with_extensions'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/field.rb:669 in `block in resolve'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:770 in `minimal_after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/query.rb:372 in `after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/field.rb:667 in `resolve'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:358 in `block (2 levels) in evaluate_selection_with_resolved_keyword_args'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/tracing/trace.rb:52 in `execute_field'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:357 in `block in evaluate_selection_with_resolved_keyword_args'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:709 in `call_method_on_directives'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:354 in `evaluate_selection_with_resolved_keyword_args'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:331 in `block in evaluate_selection_with_args'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:830 in `after_lazy'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:279 in `evaluate_selection_with_args'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/runtime.rb:273 in `block in evaluate_selection'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/execution/interpreter/arguments_cache.rb:51 in `block in dataload_for'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/schema/member/has_arguments.rb:300 in `block (3 levels) in coerce_arguments'
            |     vendor/bundle/ruby/3.2.0/gems/graphql-2.2.1/lib/graphql/dataloader/async_dataloader.rb:65 in `block in spawn_job_task'
            |     vendor/bundle/ruby/3.2.0/gems/async-2.6.5/lib/async/task.rb:160 in `block in run'
            |     vendor/bundle/ruby/3.2.0/gems/async-2.6.5/lib/async/task.rb:330 in `block in schedule'
could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use
@rmosolgo
Copy link
Owner

Hey, thanks for sharing all those details. Even for queries that don't use Dataloader::Sources, field resolution is wrapped by dataloader handling. That way, if the field does use a source, it can properly pause and wait for data to be available.

I considered adding a maximum queue size to dataloader before (#3913), but the customer I was emailing with found a different solution to the problem. I didn't immediate add it because I didn't encounter this problem, but I'll investigate adding an Async::Semaphore to limit the number of actively-running Fibers.

Out of curiousity, do you know what your connection pool size is, and whether you're running a threaded server or not? If you're running the Rails default of 5 connections in the pool, and using a threaded server, then it might be possible to genuinely run out of connections across the different fibers.

@rmosolgo
Copy link
Owner

I added a queue size limit of sorts in #4742, you could try out that branch if you want to investigate that possibility. It defaults to 10 fibers. But it only limits fibers for graphql execution -- the Dataloader may have to spin up more fibers to resolve sources, and that can't really be bounded because it may have an arbitrarily deep dependency tree. So you could try setting working_queue_size: 5 or so to try a really small number, but the actual number of Fibers will still be greater than that.

@swrobel
Copy link
Contributor Author

swrobel commented Dec 20, 2023

Out of curiousity, do you know what your connection pool size is, and whether you're running a threaded server or not? If you're running the Rails default of 5 connections in the pool, and using a threaded server, then it might be possible to genuinely run out of connections across the different fibers.

Yes, I'm using the rails default of 5, at least in development mode (as I imagine most others are in dev mode)

@swrobel
Copy link
Contributor Author

swrobel commented Dec 20, 2023

So you could try setting working_queue_size: 5 or so to try a really small number, but the actual number of Fibers will still be greater than that.

I tried setting it to 1 and it still triggers the same exception. I tested with multiple values, and it was always on the 6th request (all in series) after a freshly-launched server, which makes me think for some reason connections aren't properly being returned to the pool after a request completes.

@rmosolgo
Copy link
Owner

Ah, yeah, that's a great clue -- so it seems like one connection is being taken (and not released) per request.

Just to rule it out: if you keep isolation_level = :fiber but switch back to "plain" GraphQL::Dataloader, does it work fine? (I'd like to be sure that it's something in AsyncDataloader, and not some other part of the system. For example, I see appmap in the backtrace above and see a lot of Thread.current in the appmap source. It'd be nice to be sure it's AsyncDataloader!)

In the meantime, I'll do some digging to see if we can track what is holding those connections.

@swrobel
Copy link
Contributor Author

swrobel commented Dec 21, 2023

Just to rule it out: if you keep isolation_level = :fiber but switch back to "plain" GraphQL::Dataloader, does it work fine?

Wow, good call! I didn't think to try that. I'm running into the same issue even when using the plain dataloader 🤔

@rmosolgo
Copy link
Owner

Interesting -- that makes me think there's some other part of the system that's holding on to connections longer than expected.

Looking at the stacktrace in the original report, the gems I see are:

  • Rails (activesupport, activerecord, etc)
  • GraphQL
  • Async
  • Appmap
  • Skylight

I wonder if we could isolate some of the variables, for example:

  • If you test a non-GraphQL controller action, does the application still run out of connections? (If other actions work fine in :fiber mode, then we could dig into GraphQL and the various integrations around it.)
  • Does a query like { __typename } also cause this issue? (I doubt that query would make any ActiveRecord calls on its own, although the surrounding context might.)
  • If you comment out the .execute(...) call in your GraphQL controller action, and make the controller return plain JSON, does it still surface this issue? (That would help us further isolate GraphQL itself vs. the surrounding context.)
  • When this issue does surface, what if you disable some of the tracing libraries, like Skylight and Appmap? Does disabling those make this problem go away? (If the problem still happens while those are disabled, then we can rule out those libraries as the source of the issue.)

Let me know what you find! I appreciate that using Fibers this way is on the leading edge of Rails development but I'd love to do whatever I can to make it work, because I think it's a really bright future for Ruby concurrency.

@swrobel
Copy link
Contributor Author

swrobel commented Dec 21, 2023

Testing your other ideas:

  1. non-GraphQL controller actions. PASS
  2. { __typename }-only query. PASS
  3. I tried replacing my entire graphql_controller.rb with the version generated from a brand new 2.2.1 rails g graphql:install setup. FAIL
  4. I removed all of the other items from the stacktrace, so it's just Rails & GraphQL. FAIL

Worth noting that the last 2 lines of the stracktrace are:

graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:263:in `block in spawn_job_fiber'
graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:231:in `block in spawn_fiber'

So it looks like fibers are still used in the stock dataloader (though I'm sure you knew that 😆)

Appreciate you working with my existing app, as putting together a repro for stuff like this can be quite tricky and time-consuming! Glad to help, as I feel the same way about the future of Ruby concurrency, and appreciate your forward-thinking work (and ability to engage the pillars of the Ruby concurrency community in your efforts)!

@rmosolgo
Copy link
Owner

rmosolgo commented Dec 21, 2023

Well, shoot. I was able to replicate this bug in my test app, just like you described. (I didn't see the bug there before -- but I hadn't actually set isolation_level = :fiber 🤦 ).

I debugged using this method:

class ActiveRecord::ConnectionAdapters::ConnectionPool
  # Based on ConnectionPool#stat
  def my_stat
    synchronize do
      {
        # size: size,
        connections: @connections.size,
        in_use: @connections.select(&:in_use?).map(&:object_id),
        owner_alive: @connections.select { |c| c.owner&.alive? }.map(&:object_id),
        owner: @connections.map { |c| c.owner },
        current_context: ActiveSupport::IsolatedExecutionState.context,
        # busy: @connections.count { |c| c.in_use? && c.owner.alive? },
        # dead: @connections.count { |c| c.in_use? && !c.owner.alive? },
        # idle: @connections.count { |c| !c.in_use? },
        waiting: num_waiting_in_queue,
      }
    end
  end
end

I put it in my GraphQL resolver right before making an ActiveRecord call:

def local_count(set:)
   pp ActiveRecord::Base.connection_pool.my_stat
   Card.where(set: set).count 
end 

My first surprise was that my helper raised an error: undefined method context' for ActiveSupport::IsolatedExecutionState:Module"` But I had just been reading about that method in the Rails docs... (Later I realized I was on an old Rails version, 7.0.8.) Anyways, I commented out that line and ran my app again, and it printed things like this:

{:connections=>5,
 :in_use=>[14820, 30460, 30600, 30720, 30840],
 :owner_alive=>[14820, 30460, 30600, 30720, 30840],
 :owner=>
  [#<Thread:0x00000001058eb358@puma srv tp 004 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/puma-5.6.7/lib/puma/thread_pool.rb:104 sleep_forever>,
   #<Thread:0x00000001058e99e0@puma srv tp 005 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/puma-5.6.7/lib/puma/thread_pool.rb:104 run>,
   #<Thread:0x00000001058e99e0@puma srv tp 005 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/puma-5.6.7/lib/puma/thread_pool.rb:104 run>,
   #<Thread:0x00000001058e99e0@puma srv tp 005 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/puma-5.6.7/lib/puma/thread_pool.rb:104 run>,
   #<Thread:0x00000001058e99e0@puma srv tp 005 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/puma-5.6.7/lib/puma/thread_pool.rb:104 run>],
 :waiting=>1}

The surprise to me was that connection.owner was a Thread -- shouldn't it have been a Fiber? So I updated to Rails 7.1.2 (which I see above, you're already using, so that's not helpful). Then Puma 5 gave a compatibility error on application boot, so I updated it to Puma 6.

I started the app again and it worked. (I also uncommented the current_context: ... part from my_stat.) The debug output looked like this:

{:connections=>5,
 :in_use=>[29700, 29940, 29980, 30120, 30180],
 :owner_alive=>[29980, 30120],
 :owner=>
  [#<Fiber:0x000000010707d520 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (terminated)>,
   #<Fiber:0x000000010709f2b0 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (terminated)>,
   #<Fiber:0x00000001075b7d10 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>,
   #<Fiber:0x00000001075bf218 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>,
   #<Fiber:0x0000000107017b58 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (terminated)>],
 :current_context=>
  #<Fiber:0x00000001075b1280 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (resumed)>,
 :waiting=>0}

As expected, the current_context is a Fiber, just like the connection owners. And after this, I could run queries over and over in GraphiQL without encountering the connection pool error.

I'm not sure where this leaves us, since you're already on Rails 7.1.2, but here are a couple of questions that might help:

  • What web server library (and version) are you using? (If you're on Puma, but not Puma 6, try that?)
  • What if you add my_stat to your project? What kind of output do you get from it?
  • If those don't provide any clues, could you share the full backtrace from the error when using the "normal" Dataloader? Maybe something in there will provide a clue about where to look next.

@swrobel
Copy link
Contributor Author

swrobel commented Dec 21, 2023

I'm using Puma 6.4.0. Here's my config, which is pretty simple:

bind "tcp://0.0.0.0:#{ENV["PORT"] || 3000}"

if Rails.env.development?
  threads 1, 1
  workers 0

  # Allow puma to be restarted by `rails restart` command.
  plugin :tmp_restart
elsif !Rails.env.test?
  preload_app!
  fork_worker if Rails.env.staging?
  workers Integer(ENV["WEB_CONCURRENCY"] || 1)
end

I added my_stat and ran it with the normal dataloader, and here are the results:

Click to view exception backtrace
=> Booting Puma
=> Rails 7.1.2 application starting in development
=> Run `bin/rails server --help` for more startup options
Puma starting in single mode...
* Puma version: 6.4.0 (ruby 3.2.2-p53) ("The Eagle of Durango")
*  Min threads: 1
*  Max threads: 1
*  Environment: development
*          PID: 71336
* Listening on http://0.0.0.0:3000
Use Ctrl-C to stop
{:connections=>1,
 :in_use=>[20480],
 :owner_alive=>[20480],
 :owner=>[#<Fiber:0x00000001095f6cc0 (suspended)>],
 :current_context=>
  #<Fiber:0x000000010ceb0fc0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (resumed)>,
 :waiting=>0}
method=POST path=/graphql format=json controller=GraphqlController action=execute status=200 allocations=176548 duration=273.49 view=0.16 db=24.25 params={variables: {input: {refreshToken: "[FILTERED]"}}, operationName: "InitializeUI"}
{:connections=>2,
 :in_use=>[24940],
 :owner_alive=>[24940],
 :owner=>
  [nil,
   #<Fiber:0x000000010ceb0fc0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>],
 :current_context=>
  #<Fiber:0x000000010fd892c0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (resumed)>,
 :waiting=>0}
method=POST path=/graphql format=json controller=GraphqlController action=execute status=200 allocations=2223 duration=54.73 view=0.09 db=0.00 params={variables: {input: {refreshToken: "[FILTERED]"}}, operationName: "InitializeUI"}
{:connections=>2,
 :in_use=>[20480, 24940],
 :owner_alive=>[20480, 24940],
 :owner=>
  [#<Fiber:0x000000010fd892c0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ceb0fc0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>],
 :current_context=>
  #<Fiber:0x000000010ff2dce8 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (resumed)>,
 :waiting=>0}
method=POST path=/graphql format=json controller=GraphqlController action=execute status=200 allocations=4977 duration=95.70 view=0.08 db=0.00 params={variables: {input: {refreshToken: "[FILTERED]"}}, operationName: "InitializeUI"}
{:connections=>3,
 :in_use=>[20480, 24940, 25420],
 :owner_alive=>[20480, 24940, 25420],
 :owner=>
  [#<Fiber:0x000000010fd892c0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ceb0fc0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ff2dce8 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>],
 :current_context=>
  #<Fiber:0x00000001100c8d78 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (resumed)>,
 :waiting=>0}
method=POST path=/graphql format=json controller=GraphqlController action=execute status=200 allocations=5020 duration=79.27 view=0.07 db=0.00 params={variables: {input: {refreshToken: "[FILTERED]"}}, operationName: "InitializeUI"}
{:connections=>4,
 :in_use=>[20480, 24940, 25420, 25520],
 :owner_alive=>[20480, 24940, 25420, 25520],
 :owner=>
  [#<Fiber:0x000000010fd892c0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ceb0fc0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ff2dce8 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x00000001100c8d78 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>],
 :current_context=>
  #<Fiber:0x0000000110203760 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (resumed)>,
 :waiting=>0}
method=POST path=/graphql format=json controller=GraphqlController action=execute status=200 allocations=5063 duration=71.95 view=0.07 db=0.00 params={variables: {input: {refreshToken: "[FILTERED]"}}, operationName: "InitializeUI"}
{:connections=>5,
 :in_use=>[20480, 24940, 25420, 25520, 25620],
 :owner_alive=>[20480, 24940, 25420, 25520, 25620],
 :owner=>
  [#<Fiber:0x000000010fd892c0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ceb0fc0 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x000000010ff2dce8 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x00000001100c8d78 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>,
   #<Fiber:0x0000000110203760 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (suspended)>],
 :current_context=>
  #<Fiber:0x0000000110464bf8 vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:228 (resumed)>,
 :waiting=>0}
could not obtain a connection from the pool within 5.000 seconds (waited 5.005 seconds); all pooled connections were in use
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:128:in `block in wait_poll'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:117:in `loop'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:117:in `wait_poll'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:78:in `internal_poll'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:204:in `internal_poll'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:73:in `block in poll'
/opt/homebrew/Cellar/ruby/3.2.2_1/lib/ruby/3.2.0/monitor.rb:202:in `synchronize'
/opt/homebrew/Cellar/ruby/3.2.2_1/lib/ruby/3.2.0/monitor.rb:202:in `mon_synchronize'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:82:in `synchronize'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool/queue.rb:73:in `poll'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:661:in `acquire_connection'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:355:in `checkout'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:184:in `connection'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_adapters/abstract/connection_handler.rb:246:in `retrieve_connection'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_handling.rb:287:in `retrieve_connection'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/connection_handling.rb:254:in `connection'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/core.rb:362:in `cached_find_by_statement'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/core.rb:404:in `cached_find_by'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/core.rb:285:in `find_by'
vendor/bundle/ruby/3.2.0/gems/activerecord-7.1.2/lib/active_record/secure_password.rb:51:in `authenticate_by'
app/graphql/mutations/create_session.rb:45:in `authenticate_refresh_token'
app/graphql/mutations/create_session.rb:19:in `resolve'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/resolver.rb:107:in `public_send'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/resolver.rb:107:in `block (3 levels) in resolve_with_support'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:770:in `minimal_after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/query.rb:372:in `after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/resolver.rb:94:in `block (2 levels) in resolve_with_support'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:770:in `minimal_after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/query.rb:372:in `after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/resolver.rb:85:in `block in resolve_with_support'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:770:in `minimal_after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/query.rb:372:in `after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/resolver.rb:73:in `resolve_with_support'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/has_single_input_argument.rb:36:in `resolve_with_support'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/relay_classic_mutation.rb:45:in `resolve_with_support'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/field.rb:698:in `public_send'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/field.rb:698:in `block (2 levels) in resolve'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/field.rb:832:in `with_extensions'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/field.rb:669:in `block in resolve'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:770:in `minimal_after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/query.rb:372:in `after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/field.rb:667:in `resolve'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:358:in `block (2 levels) in evaluate_selection_with_resolved_keyword_args'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/tracing/trace.rb:52:in `execute_field'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:357:in `block in evaluate_selection_with_resolved_keyword_args'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:709:in `call_method_on_directives'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:354:in `evaluate_selection_with_resolved_keyword_args'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:331:in `block in evaluate_selection_with_args'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:830:in `after_lazy'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:279:in `evaluate_selection_with_args'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/runtime.rb:273:in `block in evaluate_selection'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/execution/interpreter/arguments_cache.rb:51:in `block in dataload_for'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/schema/member/has_arguments.rb:300:in `block (3 levels) in coerce_arguments'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:263:in `block in spawn_job_fiber'
vendor/bundle/ruby/3.2.0/bundler/gems/graphql-ruby-b8da42e225b7/lib/graphql/dataloader.rb:231:in `block in spawn_fiber'
method=POST path=/graphql format=json controller=GraphqlController action=execute status=500 allocations=1837 duration=5016.44 view=0.67 db=0.00 params={variables: {input: {refreshToken: "[FILTERED]"}}, operationName: "InitializeUI"}

@rmosolgo
Copy link
Owner

rmosolgo commented Dec 21, 2023

Ok, thanks -- the give-away was those suspended Fibers. It turns out some changes from #4625 caused job fibers to be suspended and not resumed. I've got fixes for 2.2: #4743 and 2.1: #4744

I was able to replicate the bug in my own app, with those suspended fibers, and #4743 fixed it for me. I just merged that PR, could you try your app on master, using the "plain" Dataloader, and see if that works better for you?

@swrobel swrobel changed the title AsyncDataloader gobbles all available database connections Fiber-safe ConnectionPool + Dataloader gobbles all available database connections Dec 21, 2023
@swrobel
Copy link
Contributor Author

swrobel commented Dec 21, 2023

could you try your app on master, using the "plain" Dataloader, and see if that works better for you?

No more issues! I did have to keep appmap disabled, as that seems to have a conflict with the fiber-safe connectionpool, but I'll file an issue with them separately!

@rmosolgo
Copy link
Owner

I don't know that we've totally solved this problem, since this "fix" found above created other problems. I was able to replicate the issue in Async directly and I opened an issue there: socketry/async#295

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants