New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepared statement eat lots of memories #14645

Closed
ajoulie opened this Issue Apr 8, 2014 · 26 comments

Comments

Projects
None yet
@ajoulie

ajoulie commented Apr 8, 2014

The problem

Summary :
In certain cases, Rails4 generates prepared statements where certain variables are not bound but hardcoded in the query instead.

  1. This seems to defeat the purpose of prepared statements, by generating an infinite number of "single usage" prepared statements
  2. The immediate consequence is that this bloats the database server prepared statements cache very fast, with thousands of unique prepared statements, ending up with an OUT OF MEMORY exception.

Solution :
Either bind all variables (preferred), or do not use prepared statements for queries where certain variables cannot be bound.

Example :

The model :

Class User
  has_many :addresses, -> where(current:true)
end
a_user.addresses

generates the following query :

SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" = $1 AND "addresses"."current" = 't'  [["user_id", 1]]

This statement is executed as a prepared statement, but column "current" is not treated as a bound variable.
This is ok when current is a boolean, a most two prepared statements will be used. But when it is a timestamp, a different prepared statement will be used (and saved into postgresql) for each request.

More generally, only a few columns can be bound variables. This leads to lots of useless prepared statement into database, especially, if you have conditions on timestamp or subquery.

Futhermore, the limit size of StatementPool is 1000. With postgresql this is very high due to the fact that prepared statements are not shared between connections : the prepared statement cache is per connection / backend. This leads quite fast to a very high memory usage on the postgresql side.

How did we find this :
Our application used to work well with Rails 3.1 and 3.2. After upgrading to Rails 4, we experienced a memory leak on the db side : after about 1 hour of running, where we could observe the db memory grow and grow, the database ended up by returning OUT OF MEMORY errors. We finally figured out it was the number of prepared statements that exploded and was bloating the backend prepared statements cache. It seems to come from the fact that Rails 4 makes a wider use of prepared statements. While this is a great idea (thanks @tenderlove to bring that is Rails), it can very be problematic if there is no way to bind variables all variables properly.

If you have timestamp into some queries and if by no chance those queries are big, memory consumption of postgresql server increases drastically as well. In our case, we have 30 postgresql connections, with Rails 4, due to prepared statements, the memory used by the database server moves from 500 KB to more than 3 GB in about 1 hour, leading to out of memory errors.
Here is a stackoverflow post where we explain the symptoms http://bit.ly/1hcG1ba

Thanks for reading, hope it helps.

@rafaelfranca

This comment has been minimized.

Member

rafaelfranca commented Apr 8, 2014

@ajoulie thank you for reporting this.

@tenderlove does your work with adequaterecord improve something related with this or it will only make it worse?

@tenderlove

This comment has been minimized.

Member

tenderlove commented Apr 8, 2014

Ya, adequate binds all values unless you interpolate the value yourself. But in that case we cannot detect it. :-(

The other thing you can do to mitigate this problem is to disable prepared statements.

I think the statement pool size is configurable too? Maybe?

Aaron Patterson
http://tenderlovemaking.com/
I'm on an iPhone so I apologize for top posting.

On Apr 8, 2014, at 9:32 AM, Rafael Mendonça França notifications@github.com wrote:

@ajoulie thank you for reporting this.

@tenderlove does your work with adequaterecord improve something related with this or it will only make it worse?


Reply to this email directly or view it on GitHub.

@ajoulie

This comment has been minimized.

ajoulie commented Apr 8, 2014

Thank for answers.

We already deactivated the prepared statements to fix the problem. But what
a pity not to be able to use this great feature :-(

I will have a closer look to adequat than I already had.

I looked for statements pool size configuration into source code. As far as
I remember it was hard coded. I will have another look at this to confirm.

Edited :
Sorry, indeed there is a config : statement_limit in database.yml.

Le 8 avr. 2014 18:45, "Aaron Patterson" notifications@github.com a écrit :

Ya, adequate binds all values unless you interpolate the value yourself.
But in that case we cannot detect it. :-(

The other thing you can do to mitigate this problem is to disable prepared
statements.

I think the statement pool size is configurable too? Maybe?

Aaron Patterson
http://tenderlovemaking.com/
I'm on an iPhone so I apologize for top posting.

On Apr 8, 2014, at 9:32 AM, Rafael Mendonça França <
notifications@github.com> wrote:

@ajoulie thank you for reporting this.

@tenderlove does your work with adequaterecord improve something related
with this or it will only make it worse?

Reply to this email directly or view it on GitHub.

Reply to this email directly or view it on GitHubhttps://github.com//issues/14645#issuecomment-39872099
.

@schneems

This comment has been minimized.

Member

schneems commented Apr 15, 2014

Yes indeed it looks like statement_limit might be what we're looking for https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L363-L364 maybe manually lowering this may give you back some of the benefits of prepared statements without so much memory bloat, or it may just be instantly filled with relatively useless one off prepared statements.

I noticed we don't have any docs on this config for statement_limit. I'm working on a docrails commit to add it to the guides http://guides.rubyonrails.org/configuring.html. Is there a cannonical location for config docs inside of AR? Maybe we could lower the default limit, i can try to pull in some advice from our postgres team. Even if all of the prepared statements weren't one off-s it seems 1000 legitimate prepared statements would also bring down their postgres.

@rafaelfranca

This comment has been minimized.

Member

rafaelfranca commented Apr 15, 2014

Is there a cannonical location for config docs inside of AR?

Right now we don't have. Only configuration guide.

Maybe we could lower the default limit, i can try to pull in some advice from our postgres team.

Seems good.

@schneems

This comment has been minimized.

Member

schneems commented Apr 15, 2014

New docs are here: rails/docrails@c5e083a

@rails-bot rails-bot added the stale label Aug 19, 2014

@rails-bot

This comment has been minimized.

rails-bot commented Aug 19, 2014

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@rails-bot rails-bot closed this Nov 19, 2014

@rails-bot

This comment has been minimized.

rails-bot commented Nov 19, 2014

This issue has been automatically closed because of inactivity.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@lsimoneau

This comment has been minimized.

lsimoneau commented Sep 23, 2015

Apologies if this is the wrong place to ask this, but I'm curious about this as we were just bitten by this issue. It seems like the use cases of where(user_id: user.id).where("created_at > ?", Time.now) for example would be quite common, and for that to unexpectedly start eating up memory on the DB server to the point of bringing it down seems like something that perhaps shouldn't be "on by default"?

@lsimoneau

This comment has been minimized.

lsimoneau commented Sep 28, 2015

Per @sgrif's request, I've put up a demo app that illustrates this issue here: https://github.com/lsimoneau/prepared_statement_example

Refreshing the /posts page results in an accumulation of essentially single-use prepared statements on the connection.

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 28, 2015

@lsmoneau Thank you for taking the time to do this, I'll look at it in more depth soon.

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 28, 2015

@lsimoneau ----^ (sorry on my phone)

@SamSaffron

This comment has been minimized.

Contributor

SamSaffron commented Oct 19, 2015

see also

#21992

@JoeMcB

This comment has been minimized.

JoeMcB commented Mar 23, 2016

I recently ran into this issue as well. Like @lsimoneau's example, ActiveRecord is not binding date parameters into a general prepared statement and is instead creating a new prepared statement with the date present as sql. An example from our app (Rails 4.2.5.1, Ruby 2.2.2):

design.campaigns.where('end_time > ?', Time.zone.now).where(cancelled: nil) winds up creating the query
SELECT "campaigns".* FROM "campaigns" WHERE "campaigns"."design_id" = $1 AND (end_time > '2016-03-24 03:51:32.48373') AND "campaigns"."cancelled" IS NULL This was pulled straight from pg_stat_activity in Postgres 9.3.

I'm not sure if that is the expected behavior when binding a date in an ActiveRecord query, but in our application this was generating a new prepared statement for every lookup. It generated so many, in fact, that our database was running out of memory as it's prepared statement cache was being filled up with date-specific prepared statements.

We fixed our issue at the application layer, using Time.zone.end_of_day as the cut off so we're not duplicating prepared statements. Still, this seems like something that needs attention.

@schneems

This comment has been minimized.

Member

schneems commented Mar 23, 2016

Any idea if you get the same result in Rails 5? Lots of improvements have gone into Rails5 (i.e. master) that didn't make it into 4.2. If you can reproduce the behavior in Rails 5 I think it's worth opening a new issue for this.

@ghiculescu

This comment has been minimized.

Contributor

ghiculescu commented Jun 30, 2016

Just as another anecdote, we disabled prepared statements based on this issue, and saw this:

image

The cliff at 2:15pm is a deploy (hence all AR connections restarted). The next restart was at 3:15, and memory usage has been totally flat since!

Looking forward to testing out rails 5 to see if we can bring back prepared statements then.

@schneems

This comment has been minimized.

Member

schneems commented Jun 30, 2016

@ghiculescu Any idea if you get the same result in Rails 5 RC2 ? Lots of improvements have gone into Rails5 that didn't make it into 4.2. If you can reproduce the behavior in Rails 5 I think it's worth opening a new issue for this.

@ghiculescu

This comment has been minimized.

Contributor

ghiculescu commented Jun 30, 2016

@schneems now that Rails 5 is live (yay!) I intend to try it soon and find out. I will let you know how I go!

@SamSaffron

This comment has been minimized.

Contributor

SamSaffron commented Jul 1, 2016

For our case (Discourse) this is a non-starter.

All our sites use pgbouncer, without it we eat up way too many DB
connections (and PG gets very very weird if you have too many connections
going - 50 is fine, 10000 is a disaster)

If we were to eliminated pgbouncer, I still would have some concerns around
memory, especially since we have around 10-80 active connection going to pg
going at any point in time for one site. This is cause of threads and
sidekiq (each web has a light weight background job thread).

Clearly the more connections you have the higher risk there is that you
have a "prepared statement" stored that is never going to be used and just
use up memory.

To add to that in some conditions preparing can lead to enormous amounts of
pain especially if the amount of data the prepared statement returns highly
depends on the params sent to it.

On Fri, Jul 1, 2016 at 8:54 AM, Alex Ghiculescu notifications@github.com
wrote:

@schneems https://github.com/schneems now that Rails 5 is live (yay!) I
intend to try it soon and find out. I will let you know how I go!


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#14645 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAAUXd4VT9g9fkYltznIty9FxeEwYhkfks5qREkvgaJpZM4Bwq4i
.

@joevandyk

This comment has been minimized.

Contributor

joevandyk commented Aug 1, 2016

@ghiculescu did you get a chance to try rails 5?

@ghiculescu

This comment has been minimized.

Contributor

ghiculescu commented Aug 1, 2016

Not yet :( need to migrate off protected attributes so that might take a while.

@mbhnyc

This comment has been minimized.

Contributor

mbhnyc commented Nov 29, 2016

We seem to be hitting this as well, wondering if anyone has reproduced this on 5 yet, before I start descending down more complex solutions. :)

@silva96

This comment has been minimized.

silva96 commented Feb 13, 2017

@ghiculescu did you get a chance to try rails 5?

@Blackening999

This comment has been minimized.

Blackening999 commented Feb 23, 2017

@silva96 do they have a solution on Rails5?

@allaire

This comment has been minimized.

allaire commented May 30, 2017

Thanks so much for this thread, we also experienced this bug in production.

Instead of completely disable prepared statement, we decided to try statement_limit: 200 and it worked flawlessly! We're going to keep playing with the value to find the perfect balance, but so far so good!

@rails-bot rails-bot bot removed the stale label May 30, 2017

@silva96

This comment has been minimized.

silva96 commented May 30, 2017

We upgraded our app from rails 4.0.2 to 5.1 (a loooong process by the way) and enabled back the prepared statements.

We are definitely not having any issues with them anymore.

image

Memory increases (and immediately drops) on each deploy, since connections are closed and re opened. But after that, the memory stays pretty much stable.

Before the upgrade the memory would be consumed in like 4 days. (we have 1 db server, 3 app servers with 4 puma workers with 4 threads and 20 sidekiq workers on each, thats 108 connections to the db that can store up to 1000 queries each)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment