
Database Pool Exhaustion #862

Closed
rickychilcott opened this issue May 15, 2020 · 19 comments

@rickychilcott
Contributor

I'm intermittently experiencing `ActionView::Template::Error: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use` errors.

I'm fairly certain I've tracked it down to Thread creation/execution in

```ruby
  # ...inside a block that maps over slices of the collection
  # (the start of the outer block is elided in this quote):
  Thread.start do
    # `ActionView::PartialRenderer` mutates the contents of `opts[:locals]`, `opts[:locals][:as]` in particular:
    # https://github.com/rails/rails/blob/v6.0.2.1/actionview/lib/action_view/renderer/partial_renderer.rb#L379
    # https://github.com/rails/rails/blob/v6.0.2.1/actionview/lib/action_view/renderer/partial_renderer.rb#L348-L356
    opts[:locals] = opts[:locals].dup if opts[:locals]
    render_partials_serial(view_context.dup, slice, opts)
  end
end.flat_map(&:value)
```

I think I understand why the partials are rendered this way -- an attempt to parallelize view rendering for very expensive or large-collection partials? -- but it doesn't protect against exceeding the connection pool limit, which triggers errors like the one I'm experiencing.
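To illustrate the failure mode outside of Rails, here's a toy sketch (`FakePool` and its timeout are made-up illustrations, not Thredded or ActiveRecord internals): a fixed-size pool, more render threads than connections, and a checkout timeout, mirroring the error above.

```ruby
require "timeout"

# Toy stand-in for a fixed-size connection pool (illustrative only).
class FakePool
  def initialize(size)
    @connections = SizedQueue.new(size)
    size.times { |i| @connections << "conn-#{i}" }
  end

  # Fail if no connection frees up in time, mirroring the
  # "could not obtain a connection from the pool" error above.
  def checkout(timeout: 0.2)
    Timeout.timeout(timeout) { @connections.pop }
  rescue Timeout::Error
    raise "could not obtain a connection from the pool"
  end

  def checkin(conn)
    @connections << conn
  end
end

pool = FakePool.new(2) # small pool, like a constrained DB_POOL
errors = Queue.new

# Five "render threads" each hold a connection for the whole render,
# like the unwrapped Thread.start above.
threads = 5.times.map do
  Thread.new do
    conn = pool.checkout
    sleep 0.5 # simulate a slow partial render while holding the connection
    pool.checkin(conn)
  rescue RuntimeError => e
    errors << e.message
  end
end
threads.each(&:join)

puts "exhausted checkouts: #{errors.size}" # 2 threads succeed, 3 time out
```

With 2 connections and 5 threads that each hold a connection for the full "render", the 3 late threads time out -- the same shape as 5 render threads against a small Rails pool.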

For now, I've set `Thredded::CollectionToStringsWithCacheRenderer.render_threads = 1`, which appears to have resolved the issue, albeit with slightly lower performance. It took some digging to figure that out, so I'm not sure whether this is well documented yet or whether others have experienced it before.
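For anyone else searching, that workaround can go in an initializer (a sketch; `config/initializers/thredded.rb` is just the conventional path, not something the gem mandates):

```ruby
# config/initializers/thredded.rb
# Render collection partials on a single thread, so rendering never needs
# extra DB connections from the pool (at a modest performance cost).
Thredded::CollectionToStringsWithCacheRenderer.render_threads = 1
```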

I also wonder whether I can increase the number of rendering threads beyond 1 while still preventing DB exhaustion. I spent a few minutes testing the concept of wrapping each serial partial render in `ActiveRecord::Base.connection_pool.with_connection`, so each render grabs a connection and returns it when done. Perhaps I'm just not fully exercising the system, but it seemed to resolve my issues -- though I'd think it could still cause pool exhaustion: with 10 connections in your pool and a large collection that spawns 25 threads, some threads could take too long to get a connection.

```ruby
def render_partials_serial(view_context, collection, opts)
  ActiveRecord::Base.connection_pool.with_connection do
    partial_renderer = ActionView::PartialRenderer.new(@lookup_context)
    collection.map { |object| render_partial(partial_renderer, view_context, opts.merge(object: object)) }
  end
end
```

Any thoughts or ways I can assist? I recognize that if I increased my database pool to allow more connections, I might never hit this issue, but resources are somewhat constrained, and it was surprising to find a `Thread.start` in Thredded without more documentation around it or issues being raised.

Thanks for everyone's work on this fine gem.

@Velora

Velora commented Jun 12, 2020

Hey @rickychilcott I wanted to chime in to say that we are experiencing the same issue as you.

I plan on looking into this more in the near future, but I wanted to share what I have found so far in case it is helpful to anyone.

First, `Thredded::CollectionToStringsWithCacheRenderer.render_threads = 1` does resolve the issue for us as well.

I believe in our situation we could narrow it down to something possibly related to URLs in posts. We've seen this issue on topics with between 10-17 posts where almost every post contains a URL; other topics with many more posts and no URLs in their content have not had this issue.

We haven't tested enough to come up with a case where we can reliably reproduce the issue ourselves, but the topics that ran into it were simply impossible to view unless we set `Thredded::CollectionToStringsWithCacheRenderer.render_threads = 1`, and those topics generally had many URLs spread across 10-17 posts when they broke.

@glebm
Collaborator

glebm commented Jun 13, 2020

Interesting insight about URLs. The only place I can think of that might hit the database there concurrently is `Rails.cache` (Thredded caches the results of Onebox renders by default). What's your `Rails.cache` set to?

@Velora

Velora commented Jun 15, 2020

We're using memcachier. The posts shouldn't be expiring from the cache from what I can see (unless they aren't accessed in long enough that they are cleared from running out of cache space). I will have to test some more, but I don't think we have ever seen this error on a topic that didn't have a number of URLs in its posts.

@glebm
Collaborator

glebm commented Jun 15, 2020

Does the stack trace indicate where it's trying to obtain a connection?

@timdiggins
Collaborator

@rickychilcott @Velora is this now fixed on master, or are you still having to set `render_threads` to 1?

@Velora

Velora commented Feb 18, 2022

@timdiggins thanks for following up. I haven't tried master, but with 0.16.16 it was an issue. Was there a commit to master that should fix this?

@timdiggins
Collaborator

@Velora I don't think there was a commit since 0.16.16 specifically addressing this.

@rickychilcott
Contributor Author

I'm deploying a change now that will make it easy for me to test this. I'll try it out in the next few days.

@rickychilcott
Contributor Author

I just checked this in my production system, setting it to 10 threads. No good -- `could not obtain a connection from the pool within 5.000 seconds`. So I've set it back to 1 thread.

@rickychilcott
Contributor Author

rickychilcott commented Feb 21, 2022

And to clarify, this was tested with master revision 9a7158d, so the very latest.

@timdiggins
Collaborator

It would be nice to have some kind of resolution to this issue before releasing v1.0 -- at minimum have some clarity as to when and why this happens.

I'm not really following what's happening, i.e. what the underlying problem is.

Is it just that each thread requires a separate DB connection from the pool, and that this quickly exhausts the total when there are enough concurrent renders? Or are connections not being released at the end, or some similar issue? If so, is there a fix for this?

Alternatively, should we just change the default, treat a larger thread count as more of an experimental setting, and state in the README that you should set the threads to 1 unless you have a very large pool of DB connections?
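For context on the second question, the key property of `ActiveRecord::Base.connection_pool.with_connection` is that it checks a connection out, yields, and always checks it back in when the block exits, so threads queue for connections rather than holding one for their whole lifetime. A framework-free sketch of that pattern (`TinyPool` is illustrative, not ActiveRecord):

```ruby
# Toy stand-in for a fixed-size pool. The property mirrored here:
# with_connection blocks until a connection is free, yields it, and
# always returns it afterwards -- even if the block raises.
class TinyPool
  def initialize(size)
    @free = SizedQueue.new(size)
    size.times { |i| @free << "conn-#{i}" }
  end

  def with_connection
    conn = @free.pop # block until a connection is available
    begin
      yield conn
    ensure
      @free << conn # returned even on error
    end
  end
end

pool = TinyPool.new(2)
results = Queue.new

# Ten threads contend for two connections; they queue instead of erroring,
# because each connection is released as soon as its block finishes.
10.times.map do |i|
  Thread.new { pool.with_connection { |c| results << "task #{i} ran on #{c}" } }
end.each(&:join)

puts results.size # => 10
```

So wrapping each render in `with_connection` bounds how long any one thread monopolises a connection; it doesn't shrink the number of threads wanting one, which is why a thread can still time out waiting if the pool is far smaller than the thread count.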

@timdiggins
Collaborator

timdiggins commented Feb 22, 2022

@rickychilcott can you post a stacktrace of the database pool exhaustion in case it gives any insight into what's going on?

@timdiggins
Collaborator

FYI slightly related discussion: #770

@rickychilcott
Contributor Author

> @rickychilcott can you post a stacktrace of the database pool exhaustion in case it gives any insight into what's going on?

Sure. The log is at https://gist.github.com/rickychilcott/49c98899a1689697aa3ed637f1160a61

@rickychilcott
Contributor Author

It's been a while since I've looked at this, but the hard part is getting it to be reproducible locally.

I think if you set something like this in your database.yml you might be able to repro it more easily. I normally have it be something like:

```yaml
default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV["DB_POOL"] || ENV['RAILS_MAX_THREADS'] || 5 %>
```

Then set `DB_POOL` to a low number like 2 and view a Thredded topic with a larger number of posts. This is of course an artificially constrained setup, but it more easily forces the app to exhaust the connection limit.

@timdiggins
Collaborator

> Sure. The log is at https://gist.github.com/rickychilcott/49c98899a1689697aa3ed637f1160a61

Thanks @rickychilcott -- very useful! I'll look a bit more, but any chance you can tell me what's going on at app/models/user.rb:83? It's the only line in the stacktrace that comes from the app itself rather than from gems. Probably some kind of DB access?

@rickychilcott
Contributor Author

rickychilcott commented Feb 23, 2022

You're right. Something basic. That line is...

```ruby
default_scope -> { order(:name) }
```

Pretty boring.

@timdiggins
Collaborator

@rickychilcott your responses have been super helpful.

The cause of DB accesses when rendering posts is typically user mentions, and currently our demo content in the database seeder generates no user mentions, so it has been hard to repro this straightforwardly. Cutting `DB_POOL` down to a ridiculously low number (as you suggested) and also adding user mentions to the demo content makes it simple to repro.

It's also then easy to see that wrapping `render_partials_serial` in `ActiveRecord::Base.connection_pool.with_connection` fixes this (though this is only relevant in a multi-threaded environment).

The drawback is that this grabs a DB connection from the pool for each render thread, whether it's needed (e.g. has a user mention) or not; the upside is that DB pool exhaustion is avoided. An alternative is to wrap only the users_provider fetch (lib/thredded/users_provider.rb:28) in `connection_pool.with_connection`, but I'm less convinced about that.

I'll put up a draft PR with a possible solution (and the code I used to generate the sample data) for comment.

@timdiggins
Collaborator

timdiggins commented Feb 23, 2022

potentially fixed by #926
