Fiber safe ActiveRecord connection pool #42271
Comments
I've pushed the full source code here: https://github.com/socketry/db/tree/master/benchmark/compare
👋 I'm not an expert in concurrent programming or Async internals, but I'm passionate about making Rails and ActiveRecord compatible with the Async stack, and I took a peek at this. I think it has a lot of potential impact for serving high-throughput workloads very efficiently with fibers. I was able to make the AR configuration accept a custom connection pool. While I took many shortcuts, my proof of concept makes ActiveRecord query the database through the Async stack. @ioquatix, let me know if this is promising and whether it unlocks you to plug in more things from the async stack. If it is, I can PR the required ActiveRecord changes to this repo. Here's the diff: main...kirs:async-rb
Yes, this sounds like a good first step: making the connection pool a configuration option. Right now the pool class would be specific to the adapter class too, so I wonder if we can make some default interface here, e.g. `adapter.default_pool`.
Oh yeah, `adapter.default_pool` is a nice idea. I can PR something for that.
Hmm, it's the pool implementation that initializes the adapter: https://github.com/rails/rails/blob/main/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L642 At that point, all we have is the adapter name (a string like `"postgresql"`).
Actually it’s fine by me, because the adapter itself is the most important part. So it’s fine to create the connection pool from that if it’s easier.
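To make the `adapter.default_pool` idea from the comments above concrete, here is a minimal sketch. All class and method names are hypothetical, not actual Rails API; the point is only that each adapter class can advertise the pool implementation suited to its concurrency model.

```ruby
# Hypothetical sketch of the adapter.default_pool idea discussed above.
# None of these class or method names exist in Rails; they only
# illustrate letting each adapter advertise a suitable pool class.
class ThreadSafeConnectionPool; end
class FiberSafeConnectionPool; end

class AbstractAdapter
  # Default for adapters that assume one connection per thread.
  def self.default_pool
    ThreadSafeConnectionPool
  end
end

class AsyncPostgresAdapter < AbstractAdapter
  # A fiber-oriented adapter can opt into a fiber-safe pool instead.
  def self.default_pool
    FiberSafeConnectionPool
  end
end

puts AbstractAdapter.default_pool       # ThreadSafeConnectionPool
puts AsyncPostgresAdapter.default_pool  # FiberSafeConnectionPool
```

With something like this, the pool-construction code could ask the resolved adapter class for its pool class rather than hard-coding one.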
This issue has been automatically marked as stale because it has not been commented on for at least three months.
Boop.
@ioquatix I'm still looking at this ticket. I think we could borrow some design from Sequel, like you pointed out. There's something I've been thinking about: ActiveRecord's pool has a Reaper thread which returns lost connections to the pool, in case a programmer forgets to check in a connection at the end of a thread, or a thread dies unexpectedly. Would we want to have a similar thing for a Fiber-local pool? I think so, but I don't see an equivalent in Sequel.
It's also designed/optimized around the assumption that a thread would check out a connection for a long time, which isn't true with fibers/Falcon in the same way it might be with Puma.
Can you elaborate why? This would help me understand the context better.
It's true that Puma's worker thread would be alive for much longer than a per-request Fiber, but Rails checks the connection back in at the end of the request, so each new request would check out a fresh one.
@kirs those are fair points. ActiveRecord would check out a connection for the duration of the request, even if that request only does 1 or 2 queries. But the request itself might take several seconds, or even minutes, to complete.
As much as I like this approach, it is pretty far from Rails' conventions, which let you grab the connection without a block and rely on a method called at the end of the request to release it. What I'd love to do here is make AR Fiber-friendly in a backward-compatible way, without changing how Rails works with connections. That would mean having to support out-of-band checkouts. How does this sound to you? Do you think it's possible?
I would love to see this happen and I would be happy to help out; if any of you (@kirs, @ioquatix) think I could be of any help, let me know! I looked at how Sequel does it (I hope I looked at the right repo) and they have this method: https://github.com/jeremyevans/sequel/blob/67beb74437300f2e08fc8a28f5aa12867df5d492/lib/sequel/connection_pool.rb#L50 It looks at the options passed at initialisation time and then decides which pool to use, which, to my understanding, is not related to which database driver they use.
I'd like to revive some discussion on this thread as to how to proceed with making the Connection Pool Fiber-safe. I was able to get Fiber-safe ActiveRecord interactions with this commit, which makes use of a recently added API.

One thing I'd like to propose, to keep these discussions focused, is that for now we only focus on making the Connection Pool Fiber-safe, and we don't (yet) address the issue of how/whether to improve Rails' "Checkout Policy". That is, let's not discuss whether the pool should check connections back into the pool after each statement, but just content ourselves with a Fiber-safe pool that, just as it does today, lazily checks out connections on demand that are held until the end of the request / Sidekiq job.
And here's a stab at a bare-minimum PR that seems to do the trick in my local tests with Async 2: #44219
Something I've been wondering as we consider alternative "checkout policies": aside from the cost of regularly checking connections out and in, is there anything about the way Postgres (and similar DBs) are architected that would provide better caching or other performance boosts when the client reuses the same connection for a series of queries (i.e. the queries required to serve a web request), rather than performing the queries on different connections checked out from the pool? Is there such a thing as "query locality" that favors connection reuse?
@machty I want to believe query/connection affinity can make some improvement, but honestly I'm not sure, since some servers share cache/buffer pools. PG, on the other hand, has a separate process per connection, so it might incur extra page faults. Regarding the generic minimal scheduler, I'm working on it.
#44219 has been merged.

@ioquatix From what I'm hearing from PG folks, it seems like what you're saying is correct: there isn't really a connection-affinity performance consideration to be concerned about. So maybe it's time to start thinking through some alternative Checkout Policy schemes to implement in Rails?

I was also thinking (and maybe I'm late to the party on this one): if a server is using something like pgbouncer in transaction-pooling mode, couldn't we just set ActiveRecord's max pool size to infinity, let each Rails process create as many connections to pgbouncer as it wants, and just lean on pgbouncer to enforce the max connections to the db? Of course, there's still value in landing something like a flexible pooling scheme in user/application land, but wouldn't this sidestep the whole issue for a lot of people who already use pgbouncer and want to try out Async + Rails with a Fiber-safe pool?
I think there are a few things that will need to shuffle around to get us all the way there without any performance regression for current users, but there are definitely next steps we can take. Specifically, I'm thinking that we could add a block-based connection API.

I'm curious how far we can get with that change, and to identify how many other API layers will themselves need to change to work with a block. We'll then need to make checkout cheaper before we can actually start block-scoping, and not just pretending... but just shifting the API to support it should allow some relevant experimentation.

@machty is that something you'd be interested in looking into?
@matthewd I can start poking around over the next few days; will let you know where I land.
This issue has been automatically marked as stale because it has not been commented on for at least three months. |
Just following up on this, @matthewd: is there still interest in working towards releasing connections like this? It might be something I could look into.
I would also like to help, @joeldrapper.
@mohammed-io I discussed this briefly with @tenderlove and @jhawthorn at RubyConf recently. 🙏 They mentioned there might be a compatibility issue with this approach, because you can currently get the ID of the last inserted record. That’s not an important feature for my needs, so I would be happy to disable it with a configuration option.

For now, I’ve had to patch both Async and ActiveRecord. Additionally, I’ve had to rescue ActiveRecord’s connection pool error in order to implement a queueing mechanism that yields to the scheduler, since it doesn’t do that by default, even when you have a generous timeout.

It's a real mess and makes working with Fibers in Rails quite awkward. You can generally avoid using the database while fanning out to do HTTP requests, etc., but it can be quite difficult to avoid it completely. Even initialising a record can sometimes acquire a connection, which is held (at least) until the end of the task. And even then, it’s only released if you do it manually.
@joeldrapper thanks for your hard work surfacing, investigating and fixing these issues!
Can you please share your patches, including the code for the above?
@joeldrapper Do you mean…?
Thanks @ioquatix. ❤️

Is there anyone who can re-open this issue? It was marked stale by a bot, but I don’t think it’s stale. Otherwise, do we need to open a new issue?

@j-manu I’ll try to see if I can extract the patches from the branch I’m experimenting with. It’s very hacky and not at all something to keep around. And yes, I do mean something like that.

One thing I didn’t really touch on: while it is possible to use this kind of patch to constrain the use of Fibers in background jobs, the greedy database connection checkout mechanism is incredibly limiting when it comes to long-running requests (web sockets, SSE, etc.).

I was recently listening to an episode of the Rubber Duck Dev Show where they were talking about web sockets vs SSE vs polling vs long-polling. Their conclusion was essentially that you should use polling in Rails because of limited connections. I wanted to shout about Fibers and Falcon, but while Falcon is capable of maintaining thousands of concurrent connections, Rails would force you to use one database connection each, rather than sharing a small pool of database connections as needed by the SSE/WebSocket connections.

SSE would be an incredibly powerful, lightweight tool for pushing live events, updating graphs, triggering notifications, etc. But one database connection per active client is just unworkable for most environments. It shouldn't be the case that having thousands of active clients polling every couple of seconds (constantly working the load balancer, router, controllers, views, etc. to complete an entire request cycle) is more scalable than long-running connections managed by the Fiber scheduler.
@joeldrapper Doesn't ActionCable work around this limitation? It used to explicitly release the connections, but that was removed in this commit - 185c93e#diff-13c68eb84831f4ad0c140b918ea1091d1497532b7864b537efe05030d68c0d0e I don't know how it is handled now.
@j-manu I’ve not used ActionCable, so I may have this wrong, but I thought connections were limited to the number of threads available on your web server, unless you use something like AnyCable for the WebSocket connections, so your server threads only handle discrete messages.

With a Fiber-based web server such as Falcon, you wouldn't need AnyCable, because your Ruby web server could handle thousands of active SSE or WebSocket connections concurrently, with the Fiber scheduler pausing and resuming Fibers when they need to process messages. To me, this seems like a much simpler architecture and one we should be aiming to support. But to get there, we’ll need a way to share a limited number of database connections between many Fibers.

I would love to find a way to configure ActiveRecord to acquire a database connection as needed for each query, and then immediately release it back to the pool.
@joeldrapper I was referring to ActionCable. My understanding is that ActionCable does not need a database connection per connected client. AnyCable's homepage shows a comparison with ActionCable handling 20K connections; AnyCable is less resource-intensive and more performant.
yeah. Others too - #37092

Thinking aloud here: why not do something similar to `around_action` for AR's query methods, which would use `with_connection`? I tried this:

```ruby
module ConnectionWrapper
  def first
    ActiveRecord::Base.connection_pool.with_connection do
      super
    end
  end
end

class << ActiveRecord::Base
  prepend(ConnectionWrapper)
end
```

Testing it with:

```ruby
Sync do
  barrier = Async::Barrier.new
  3.times do |index|
    barrier.async do
      sleep index
      puts User.first.email
      puts ActiveRecord::Base.connection_pool.stat
      sleep 5
      puts "Done #{index}"
    end
  end
  barrier.wait
end
```

shows only 1 connection being used. If you remove the wrapper, it will use 3 connections.

IMO, looking at AR's code, it seems quite difficult to change the connection handling behaviour, so working around it is the only viable short-term solution. There are probably better entry points for this wrapper than wrapping every AR method, but I am not familiar enough with AR to know which is the appropriate one.
Yeah, I really think this is optimal behavior: because it maximizes the concurrent workload a connection pool can handle, its size can be smaller than the total number of threads/fibers. I think there would then be no need for the Reaper. This is how Sequel's connection pool works.
I've been playing around with the AR connection pool and investigating its behaviour when running on Falcon or using the fiber scheduler.
I'm comparing with the db gem, using a very predictable query:

```sql
SELECT pg_sleep(1)
```
Here is the fastest implementation; there is no thread-local or fiber-local state.
The performance of this:
The most direct comparison with ActiveRecord I could come up with looks like this:
The performance of this is still pretty good:
This shows that AR is quite capable of decent performance. However, the problem is that the connection pool implementation is essentially incompatible with Fiber-based event loops.
`ConnectionPool#with_connection`

I wrote a similar implementation using `#with_connection`, but found that internally it still uses a per-thread cache of connections. Each request has its own fiber but runs on the same thread, so what essentially happens is that all the requests end up trying to use the same connection.

My intuition is that `#with_connection` should be a direct wrapper around `checkout`/`checkin`. For example, I could imagine someone wanting to use multiple connections during the same request for different queries.

The performance of this approach is therefore not that great, and it seems to get worse over time, which makes me wonder if the fibers are clobbering each other.
The errors from Falcon are things like:
`ActiveRecord::Base.connection.execute`

This is the most "typical" way I imagine AR would be used.
Not only that, but I'm not sure what the semantics should be for tidying up the connections afterwards. It feels like `ActiveRecord::Base.clear_active_connections!` is the wrong approach.

As you can imagine, the performance of this approach is problematic.
In my debug log, I can see that some requests are taking over 60 seconds, because they become essentially serialized - i.e. one after another.
Non-blocking Database Adapters

The `pg` gem recently added basic support for the fiber scheduler; `mysql2` could also take a similar approach. The implementation leaves a lot to be desired, and I've implemented fully non-blocking database adapters in the `db` gem. These gems use specific non-blocking interfaces exposed by `libpq` and `libmariadbclient` respectively, via FFI.

Possible Solutions
It would be really wonderful to see AR take a position compatible with the fiber scheduler.
The way I typically approach the problem is that fiber-local is the "default" execution context, which is true for Ruby. Therefore, it doesn't seem unreasonable to me to change the current implementation to use `Fiber.current` instead of `Thread.current` for keying "per-request" connection state. However, this is a bit more tricky w.r.t. cleaning up state, and one main concern with this approach would be backwards compatibility.

The idea of having "thread-local" pools and "fiber-local" pools seems like moving the problem onto the user, so personally I don't like this approach. If you wish to retain a shared "global" pool, a multi-level design (a pool capable of dealing with threads AND fibers, with a per-thread pool that doesn't require locking) might be the right approach. Some memory allocators work like this and achieve impressive performance while still looking effectively "global".
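As a toy illustration of keying state by `Fiber.current`, here is a minimal sketch. This is hypothetical illustrative code, not ActiveRecord's actual internals: each fiber sees its own slot in the registry, which is the property a fiber-keyed pool would rely on.

```ruby
# Toy illustration of keying per-request state by Fiber.current
# instead of Thread.current (not actual ActiveRecord code).
# Assumes Ruby 3.1+, where Fiber.current is available without requires.
class FiberKeyedState
  def initialize
    @storage = {}
  end

  def [](key)
    @storage.dig(Fiber.current, key)
  end

  def []=(key, value)
    (@storage[Fiber.current] ||= {})[key] = value
  end
end

state = FiberKeyedState.new
state[:connection] = :main_fiber_connection

other = Fiber.new do
  state[:connection] = :other_fiber_connection
  state[:connection]
end

puts state[:connection] # main_fiber_connection
puts other.resume       # other_fiber_connection
```

The cleanup concern mentioned above shows up here too: entries keyed by dead fibers linger in `@storage` unless something removes them, which is roughly the Reaper problem restated for fibers.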
Another option, which is more interesting to me, is to establish a generic interface: in the memory allocator example above, `malloc` and `free`; for a pool, maybe `checkout`/`checkin` or `acquire`/`release` (my preference). Then, allow users to provide their own connection pool. Have a standard interface for AREL feature detection, so that we can essentially have external database drivers. This way, I can easily plug in the work I've done in the `db` gems.

Summary
The current implementation is essentially serialised due to the internal locking, which produces poor performance. I believe that AR would benefit from a fiber-aware connection pool. It might also be nice to have an official abstraction for connection pooling and "drivers" generally, which would allow us to plug in external database drivers more easily.
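As a rough sketch of what the `acquire`/`release` pooling abstraction suggested above could look like (the class and its behavior are my assumptions, not an existing AR or db-gem API):

```ruby
# Rough sketch of a generic acquire/release pool interface.
# Names are illustrative; this is not an existing API.
class GenericPool
  def initialize(limit, &factory)
    @factory = factory
    @limit = limit
    @created = 0
    @mutex = Mutex.new
    # Queue#pop blocks when empty; on Ruby 3.x it is fiber-scheduler
    # aware, so a waiting fiber yields instead of blocking the thread.
    @available = Queue.new
  end

  def acquire
    @mutex.synchronize do
      if @available.empty? && @created < @limit
        @created += 1
        return @factory.call
      end
    end
    @available.pop
  end

  def release(resource)
    @available.push(resource)
  end

  # Block form built on top of acquire/release.
  def with
    resource = acquire
    yield resource
  ensure
    release(resource)
  end
end

pool = GenericPool.new(2) { Object.new }
pool.with do |connection|
  # use the connection; it is returned to the pool afterwards
end
```

Because `acquire` blocks on a scheduler-aware queue when the pool is exhausted, many fibers can share a pool smaller than the fiber count, which is the property the thread above keeps coming back to.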