Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection Pool should release internal DB resources #14037

Closed
dnagir opened this issue Feb 13, 2014 · 24 comments
Closed

Connection Pool should release internal DB resources #14037

dnagir opened this issue Feb 13, 2014 · 24 comments

Comments

@dnagir
Copy link
Contributor

dnagir commented Feb 13, 2014

When a connection is returned to the connection pool, the associated resources do not seem to be released (judging by the search_path in PostgreSQL being preserved).

This could be a major problem leaking state and the resources being held internally by the Database (potentially growing).

I've discovered this trying to figure out a few things in the other question.

As it was pointed out be one of the PostgreSQL gurus:

When you return connections to the pool, you must ensure the pool runs DISCARD ALL; to reset the connection state.

This is especially true if we'll look at dedicated pooling solutions, like pgpool that actually uses ABORT; DISCARD ALL commands before returning the connection to the pool.

@tenderlove
Copy link
Member

Can you provide more references than just this SO response? I can't seem to find anything in the PG docs that says we need to do this.

It seems like switching search paths on each request is something you need to take responsibility for, e.g. if you're going to switch the schema search path, then you should be responsible for running the DISCARD ALL statement. I'm not convinced this is an issue with "normal" apps (but I'm willing to be convinced!)

@dnagir
Copy link
Contributor Author

dnagir commented Feb 16, 2014

Okay. I'm not too good at convincing, but I'll give it a shot.

This is the explanation from one of the DBAs on why the connection should be reset when returning to the pool (as used by PgBouncer):

Since pgbouncer is a connection pooler, it will reuse one actual connection to the database server
for potentially many client connections. It has to make sure that whatever session state the first
client created is reset when the next client gets the connection. Otherwise the first client might do
something like SET statement_timeout = '5min', and that would apply to all subsequent clients
who happen to be assigned that connection. To avoid that, pgbouncer issues the "server reset query"
before handing out the server connection to a new client.

Then if we'll look at the PG Change log, for v 8.3 we'll see:

2007-04-12 02:53 neilc RESET SESSION, plus related new DDL commands. Patch from Marko Kreen, reviewed by Neil Conway. This patch adds the following DDL command variants: RESET SESSION, RESET TEMP, RESET PLANS, CLOSE ALL, and DEALLOCATE ALL. RESET SESSION is intended for use by connection pool software and the like, in order to reset a client session to something close to its initial state.

Note that later the RESET commands were renamed to DISCARD:

2007-04-26 12:13 neilc Rename the newly-added commands for discarding session state. RESET SESSION, RESET PLANS, and RESET TEMP are now DISCARD ALL, DISCARD PLANS, and DISCARD TEMP, respectively. This is to avoid confusion with the pre-existing RESET variants: the DISCARD commands are not actually similar to RESET. Patch from Marko Kreen, with some minor editorialization.

I'm not an expert in PG, but that's what my quick research yields. Would be good to get an opinion from an expert here.

But any connection pooling solutions I can see use the RESET ALL (or a variant) (PgBouncer, pgpool). The Rails has its own pooling implementation and needs to follow those practices too IMO (safe by default).

It's not a big deal for me personally to manage the search path manually, but I feel obligated to bring this up because if the search path is "leaking" then there's a chance of other data leaking too.

@Alamoz
Copy link

Alamoz commented Mar 15, 2014

@dnagir @tenderlove - the normal Postgres use case is that sessions can possibly originate from different users or locations and possibly use different schemas. In such cases, the connection pooler definitely needs to execute DISCARD ALL to clear all the previous session state data, prepared queries, locks, etc. prior to connecting the new session. The "normal" use case in Rails is that all authentication information, tables, timeout values, etc. will remain the same within the Rails app and if any of those entities change then the Rails app needs to reboot (and yes, there are ways to rollout your updated Rails app to avoid downtime, but that is another discussion.) So at first blush, it appears the Postgres connection adapter does not need to execute DISCARD ALL when disconnecting or resetting a connection. However, what happens in the case when an advisory lock has been set and not cleared because of an exception within your Rails code? Further, what about the use case where code in a Rails app may change the character set when establishing a Postgres connection via Active Record? In those cases, not executing DISCARD ALL when checking a connection back into the connection pool will leave the connection in an unwanted state. Looking at the code in Rails 4.0.4 Active Record, these scenarios are possible. Active Record's connection adapter does reconnect to the database when a connection is no longer active, which will reset the Postgres session data with settings from whatever Rails instance performs the reconnect, but when a particular connection is corrupt and not reset, subsequent connections are assigned to a possibly corrupt active connection, which can produce unwanted results. Preempting these cases requires not just executing DISCARD ALL, but insuring that parameters such as character set, timezone, etc. are reset each time Active Record grabs a connection from the pool. Currently, when the connection pool "expires" a connection, it merely sets the connection's status to "false" and then gives it out to the next connection requestor. So the Rails active record connection pool behaves like pgbouncer, the difference being pgbouncer both calls DISCARD ALL before marking an active connection as available, as well as resetting the session parameters each time a client connects or reconnects to the available connection. So I think the active record connection pool does need to add this behaviour to the Postgres adapter, to prevent problems and achieve reliability. Resetting the pg connection session parameters each time an idle active pg connection is allocated is not much of a performance hit and a small price to pay for increased reliability. In high throughput OLTP scenarios, Postgres developers will often not call DISCARD ALL nor reset the session if they are certain the state variables won't change; in such situations the overhead does add up. A typical Rails app is much less active and the connections are not reset very often.

Thanks to @jberkus for explaining pgbouncer and Postgres connection pooling.

#14392 adds DISCARD ALL to PostgreSQLAdapter#clear_cache! Still looking at code to add pg session params reset when connection pool assigns an unused active connection to requestor.

@Alamoz
Copy link

Alamoz commented Mar 15, 2014

DISCARD ALL has the same effect as:

SET SESSION AUTHORIZATION DEFAULT;
RESET ALL;
DEALLOCATE ALL;
CLOSE ALL;
UNLISTEN *;
SELECT pg_advisory_unlock_all();
DISCARD PLANS;
DISCARD TEMP;

See http://www.postgresql.org/docs/8.3/static/sql-discard.html

@jberkus
Copy link

jberkus commented Mar 15, 2014

Dnagir, Alamoz:

FWIW, I personally wouldn't expect Rails to run DISCARD ALL when returning connections to the pool, although it would be a nice option to have in addition to other things. Alamoz is correct when he says that most Rails applications don't SET session variables or use temporary tables anyway, so for the average Rails user this isn't really a concern. And if you know that your application doesn't create database state, then you don't want to be running DISCARD ALL all the time; it's not a "free" command, since it needs to check multiple memory structures even if it doesn't do anything.

Now, Rails does use a lot of prepared queries these days. However, by convention these prepared queries are "anonymous", which means that we only really care about discarding them in order to save memory -- again, not a concern for most users. For advanced users with high performance requirements who are maybe using more advanced prepared query options, though, it would be good to have a discard option. However, for those users you don't need DISCARD ALL; you can just use DISCARD PLANS, which is cheaper.

If you want an example of this, check the pgBouncer configuration file. There you can specify what command the pooler uses on session switch. If we're going to add something to Rails, I'd prefer something like that: a configurable option which would be no command by convention -- most likely in database.yml.

@Alamoz
Copy link

Alamoz commented Mar 15, 2014

@jberkus The Postgres connection adapter in Rails Active Record currently does included code to set Postgres character encoding, verbosity, schema search path, time zone and actually any other parameters passed to it from database.yml. I thought about enabling DISCARD ALL as a configurable option, but I'd prefer to put db specific settings outside of database.yml say in RAILS_ROOT/config/postgres.rb; this is in line with Rails convention and is more logical than placing all that stuff in database.yml (which is better for generic db config parameters.) /config/postgres.rb can contain a variable users can set to either 'DISCARD ALL' or 'DISCARD_PLAN', along with a switch indicating whether a DISCARD should be executed upon connection checkin; likewise for whether session variables should be reset. I'll look into adding this to Active Record.

@Alamoz
Copy link

Alamoz commented Mar 15, 2014

@jberkus One concern is whether any other internal state information is contained in a Postgres session, that may cause problems if not cleared? I've looked at the documentation and didn't notice anything. You are very likely able to answer this question.

@dnagir
Copy link
Contributor Author

dnagir commented Mar 17, 2014

The Rails' ideology of safe by default would contradict to NOT doing DISCARD ALL.
If a typical application doesn't change the session state and developers are 100% sure of it, then there's no problem (with or without DISCARD ALL).
But when a "non-typical" app does - it will also be safe.

There should be an option to opt out from DISCARD ALL if the developers know exactly what they are doing (much like pgbouncer allows or other pooling apps).

The safe by default (rather unsafe) was the reason why mass assignment got "replaced" by strong_parameters. Similar thoughts should be applied to DISCARD ALL as it can be a big security issue.

Also any medium to large Rails app's performance won't be affected by DISCARD ALL in general.
So the performance wouldn't be a huge issue (otherwise please show the numbers).

@jberkus
Copy link

jberkus commented Mar 17, 2014

Alamoz, I'm hampered here by not knowing how built-in connection "pooling" in Rails works; I usually use pgbouncer together with Rails. If you have a continuous PostgreSQL session, users can do things which change the state of that session, including:

  • SET user-settable configuration settings (like search_path and timezone).
  • Create temporary tables
  • Define security context
  • Create prepared statements

As I've pointed out, most Rails applications don't do the first two things because there's no way in the ORM to do them AFAIK. Dnagir points out correctly, though, that NOT doing DISCARD ALL by default is a security issue, although a weak one.

Failure to deallocate prepared statements causes two issues: (1) gradual memory leakage due to growth of the prepared statements table, and (2) the possibility that a later application session would call an earlier session's prepared statement by mistake, and cause an error. Given how Rails uses prepared statements, though, the latter isn't particularly probable.

If an application is relying on SET, though, there are much stronger possibilities for unexpected application errors. For example, the practice of partitioning by schema has become popular thanks to Instagram and Wanelo. This relies on having search_path set correctly for each user session, so sharing of one database connection by two application sessions could cause unexpected errors. Missing SET TIMEZONE is in some ways, worse, because it often won't be obvious that anything is wrong until the data is audited later. However, doing DISCARD ALL won't fix the applications which use SET with pooling; it will just make it obvious why they're broken.

Temporary tables holds a real possibility of mischief, though, that using DISCARD ALL fixes, or at least makes the application error in a sensible way instead of a way that may be hard to detect until your data is already wrong. If each application session creates a temporary table called "my_session" and then two applications swap database backends due to pooling rotation, then they could get a query which successfully returns the wrong data. Also, if database connections live for a very long time, they would accumulate a large memory allocation on the server for temporary tables.

Postgres sessions can also have a security context, mainly through SET ROLE. Again, this isn't something which Rails applications do by default. Again, if sessions are swapped, the security context would be inherited. Since any connection pool has to be the same database user, though, this doesn't actually give the application access to any data it couldn't get otherwise; this is mostly just a source of errors.

My concern about doing DISCARD ALL by default is that I know many Rails apps which spin up a database connection every time a thread starts up, whether or not it needs the database. If you have an application which often has non-database sessions, the total number of DISCARD ALLs can be a siginificant source of load (in the case I looked at, we were talking about 1.6m per hour). However, my main reason to propose not doing DISCARD ALL by default was just backwards compatibility, rather than concerns about performance, which only affect specific usage patterns.

Does that answer your question?

@Alamoz
Copy link

Alamoz commented Mar 18, 2014

Rails Active Record (the ORM) does let you set all those parameters, see the API doc at http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/PostgreSQLAdapter.html

The AR connection pool stores connections in a Ruby queue object, with the number of elements limited by the configurable pool size. A connection is acquired from the pool using the scheme: 1. using an available connection from the queue, or 2. creating a new connection on the queue if the pool is not at capacity, or 3. waiting on the queue for a connection to become available.

Regarding the concern about Rails threads acquiring connections, they would all grab a connection at boot time (if using something like Puma) and only have to attempt reconnection if the connection times out and an Active Record call occurs. If a thread is not doing any db operations, the connection could time out, at which point the pooler would execute DISCARD ALL. So that doesn't appear to be a problem. Rails apps which are scaled vertically on very large machines with many Rails threads could possibly be a concern. I would venture to guess the great majority of Rails apps scale horizontally on lots of small servers (mainly virtual servers.)

The greatest Rails Postgres security concern is when users set up database.yml with credentials for user 'postgres' and then don't change them to some other user. They do this in order to run their rake tasks to perform database operations such as initial schema creation, data population and subsequently schema updates using 'rake db:migrate' after adding or modifying tables. But that is a usage problem, not a Rails problem.

Thanks again for taking the time to answer these questions.

@matthewd
Copy link
Member

My concern about frequent DISCARD ALL is definitely performance.

Firstly, any time we do a DISCARD ALL, we must necessarily follow that with a series of SETs, which means you're talking about ~5 round-trips to the database server. (Though this could be reduced by sufficient code rearrangement, so as to mangle all the SETs into a single multi-statement string.)

Secondly, a well-behaved (or at least, optimally-behaved) application will avoid keeping a database connection checked out of the pool: ideally, it will perform a separate pool checkout for each query, immediately returning the connection thereafter (modulo transactions, of course).

FWIW, we do now do DISCARD ALL under some conditions: https://github.com/rails/rails/blob/9e457a8/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L614-620 -- but that's only called internally when a thread disappears unexpectedly and we want to take back its connection.

If an application is doing something that involves a SET or similar, it is the application's responsibility to return the connection to the pool in the same state it left... which, if it doesn't have a more efficient plan, it can achieve by simply calling #reset!.

@dnagir can you expand on "any medium to large Rails app's performance won't be affected"? I've heard complaints from such apps that the 'SELECT 1' we do on checkout is a measurable performance issue.

@jberkus
Copy link

jberkus commented Mar 18, 2014

Alamoz, it's starting to sound like we need to go over this in person. Where are you located? Going to any conferences soon?

What context does the Rails pooler operate in? Server? Process?

@jberkus
Copy link

jberkus commented Mar 18, 2014

Matthew, why would you be doing a series of SETs? What session variables are you SETing? I haven't, in general, seen Rails apps doing SET at all.

Returning stuff to the pool after one query does raise a potential performance issue with DISCARD all in high-volume environments, though. If you're doing DISCARD ALL after every other query you execute, it is going to be costly for users who are already hitting the DB server hard. Mind you, most Rails apps barely touch the DB server.

@matthewd
Copy link
Member

Sorry Josh, my "we" was ambiguous. I agree that most sane Rails apps shouldn't be doing any SETs. I meant these: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L935 -- some of which can be skipped when they're already set correctly, thanks to PQparameterStatus (I'll shortly submit a patch to do that), but that only helps when the Rails settings match your server setup.

Or should we be doing more to get those into the initial connection string, so they're kept/restored during a DISCARD?

The Rails connection pool is in-process, sharing PostgreSQLAdapter instances between threads. It's really a fairly thin wrapper around a (thread-safe) FIFO queue.

@Alamoz
Copy link

Alamoz commented Mar 18, 2014

DISCARD ALL only needs to occur when a connection is checked back into the pool or reset, definitely not after every query, that isn't necessary. The pool maintains connections to Rails threads until they time out or are disconnected. The SET statements need to be done whenever a new connection is allocated. #14392 does DISCARD ALL only when a connection is cleared, but doesn't re-run the SET statements. My intent with #14392 is to show where the DISCARD ALL should occur; the SET statements are a bit trickier and I'd prefer to see if a committer is actually interested in implementing the change before doing that work.

@Alamoz
Copy link

Alamoz commented Mar 18, 2014

@matthewd Putting the settings into the initial connection string is definitely the best way to do it. The pool will have to save the connection settings for each Rails thread, in the event a connection is lost by the pool.

Adding the DISCARD ALL, along with placing pg settings in the connection string, should have a minimal impact on performance, and making the DISCARD ALL a configurable parameter allows users to disable it if they wish.

@jberkus
Copy link

jberkus commented Mar 18, 2014

Alamoz, Matthew: This is really sounding like something we should hammer out in real time, possibly with 1 or 2 more folks who do more Rails than I do. Hangout? Skype? Conference?

@Alamoz
Copy link

Alamoz commented Mar 18, 2014

I'm open to that, but think it isn't necessary. I will be at the Postgres meeting tonight, perhaps we can have a few words at the pub afterwards.

Think of the Rails connection pool as behaving just like pgbouncer, with not as many bells and whistles (yet.)

@jberkus
Copy link

jberkus commented Mar 18, 2014

Adrien, ok, let's talk that over some more. I didn't know what you were getting at last month. BTW, tonights preso is about a high-volume Rails app.

@Alamoz
Copy link

Alamoz commented Apr 17, 2014

Issue #12867 touches on important aspects of PG connection pooling.

@rails-bot
Copy link

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@rails-bot
Copy link

This issue has been automatically closed because of inactivity.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@dnagir
Copy link
Contributor Author

dnagir commented Mar 19, 2015

Do you guys have anything new RE this issue?

@roniegh
Copy link

roniegh commented Apr 20, 2018

@tenderlove sorry for resurrecting this old thread, however connection pool is still leaking state on master.
As the same database session is used between requests, temporary objects (in pg_temp schema) created during one request are visible to next requests that get that same connection from the pool.
One way this issue can be solved is by adding DISCARD ALL to the ActiveRecord::ConnectionAdapters::ConnectionPool#checkin method which is called to check-in a database connection back into the pool.
As DISCARD ALL resets the session to its initial state, the configure_connection method would have to be called when a database connection is checked out from the pool.

An initializer with the ActiveRecord::Base.connection.class.checkin callback below can be used as a temporary fix:

if defined?(::PhusionPassenger)
  # Phusion Passenger until at least version 5.2.3 calls the ActiveRecord::Base.clear_all_connections! method
  # after instead of before forking the parent process. As PostgreSQL creates connections with the FD_CLOEXEC
  # (close-on-exec) flag, the parent process connections are closed when the child process is executed.
  #
  # TODO: Remove this conditional when the aforementioned Phusion Passenger bug is fixed
  _checkin_block = lambda do |_connection|
    begin
      _connection.raw_connection.exec('DISCARD ALL')
    rescue
      nil
    end
  end

  _checkout_block = lambda do |_connection|
    begin
      _connection.__send__(:configure_connection)
    rescue
      nil
    end
  end
else
  _checkin_block = lambda do |_connection|
    _connection.raw_connection.exec('DISCARD ALL')
  end

  _checkout_block = lambda do |_connection|
    _connection.__send__(:configure_connection)
  end
end

if Rails::VERSION::STRING < "3"
  ::ActiveRecord::Base.connection.class.set_callback(:checkin, :before, &_checkin_block)
  ::ActiveRecord::Base.connection.class.set_callback(:checkout, :before, &_checkout_block)
else
  ::ActiveRecord::Base.connection.class.checkin(&_checkin_block)
  ::ActiveRecord::Base.connection.class.checkout(&_checkout_block)
end

@rails-bot rails-bot bot removed the stale label Apr 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants