New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metabase consuming lots of db connections #1702

Closed
agilliland opened this Issue Jan 5, 2016 · 13 comments

Comments

Projects
None yet
5 participants
@agilliland
Contributor

agilliland commented Jan 5, 2016

I've recently seen a situation where a Metabase instance connected to a Postgres db had a decent pile of stale connections that had built up and weren't being closed (30 or so).

This is a sample row of output from the postgres console when running select * from pg_stat_activity;

24597 | <dbname> | 8079 |    16391 | <dbuser> |                  | <ip address> |                                    |       55363 | 2016-01-05 13:45:02.365314+00 |                               | 2016-01-05 13:45:02.38192+00  | 2016-01-05 13:45:02.381951+00 | f       | idle   |             |              | SHOW TRANSACTION ISOLATION LEVEL

Over time these connections build up and while they seem pretty innocuous they are not ideal and can cause problems with max db connections limits.

@camsaul camsaul changed the title from Metabase db connections get stuck on `SHOW TRANSACTION ISOLATION LEVEL` to Metabase leaking connections Jan 20, 2016

@camsaul camsaul added the Performance label Jan 20, 2016

@agilliland agilliland self-assigned this Jan 26, 2016

@camsaul

This comment has been minimized.

Show comment
Hide comment
@camsaul

camsaul Jan 26, 2016

Member

Not sure this is actually a bug.

Check out https://github.com/metabase/metabase/blob/master/src/metabase/driver/generic_sql.clj#L81

That's the code for creating connection pools. We keep them for a while by default for performance.

Member

camsaul commented Jan 26, 2016

Not sure this is actually a bug.

Check out https://github.com/metabase/metabase/blob/master/src/metabase/driver/generic_sql.clj#L81

That's the code for creating connection pools. We keep them for a while by default for performance.

@agilliland

This comment has been minimized.

Show comment
Hide comment
@agilliland

agilliland Jan 26, 2016

Contributor

We should dig into it further though. It's fine that we try to keep around connection pools to the databases for a little while so that we can reuse them, but in the case reported in this issue i've seen 40-50 open connections on a single database which is too many and i've seen them piled up like that after the MB instance hasn't been used in days.

So I'm inclined to believe that something is holding on to connections when it shouldn't be.

Contributor

agilliland commented Jan 26, 2016

We should dig into it further though. It's fine that we try to keep around connection pools to the databases for a little while so that we can reuse them, but in the case reported in this issue i've seen 40-50 open connections on a single database which is too many and i've seen them piled up like that after the MB instance hasn't been used in days.

So I'm inclined to believe that something is holding on to connections when it shouldn't be.

@agilliland agilliland changed the title from Metabase leaking connections to Metabase consuming lots of db connections Feb 3, 2016

@manurana

This comment has been minimized.

Show comment
Hide comment
@manurana

manurana Feb 9, 2016

+1 on this.
We are seeing MB consume all available connections on a PostgreSQL RDS DB. Both the MB metadata DB and our data DB are on the same instance.

manurana commented Feb 9, 2016

+1 on this.
We are seeing MB consume all available connections on a PostgreSQL RDS DB. Both the MB metadata DB and our data DB are on the same instance.

@agilliland

This comment has been minimized.

Show comment
Hide comment
@agilliland

agilliland Feb 9, 2016

Contributor

@manurana can you provide me with a bit more information about what you are seeing? I was actually going to close this ticket pretty soon because I'm not able to reproduce this issue and despite seeing some occasional increases in connections due to usage it always comes back down to a normal level.

How many connections are you seeing? Do the connections ever drop back down and normalize?

Contributor

agilliland commented Feb 9, 2016

@manurana can you provide me with a bit more information about what you are seeing? I was actually going to close this ticket pretty soon because I'm not able to reproduce this issue and despite seeing some occasional increases in connections due to usage it always comes back down to a normal level.

How many connections are you seeing? Do the connections ever drop back down and normalize?

@manurana

This comment has been minimized.

Show comment
Hide comment
@manurana

manurana Feb 9, 2016

@agilliland : As I mentioned, we have MB running on an EC2, and we are using the same RDS PostgreSQL DB instance as a store for both the metadata (db name is metabase) and the analysed data (db name is unodev). We found we were unable to connect to the DB through any other connection. The error we got was something about only 3 reserved ports being available for root user. At that point we had about 4 users on the system.

On investigation, we found that the DB had a connection limit of 23 (its an RDS setting based on memory size), and MB was consuming all of these.

This was about 2 hours ago. I just looked, and 11 connections are still being consumed by MB. Only I am using the system. Attaching the results of select * from pg_stat_activity;
Here 172.31.23.222 is our MB server.
connections.txt

Will attach another file when I see the connections max out again.

manurana commented Feb 9, 2016

@agilliland : As I mentioned, we have MB running on an EC2, and we are using the same RDS PostgreSQL DB instance as a store for both the metadata (db name is metabase) and the analysed data (db name is unodev). We found we were unable to connect to the DB through any other connection. The error we got was something about only 3 reserved ports being available for root user. At that point we had about 4 users on the system.

On investigation, we found that the DB had a connection limit of 23 (its an RDS setting based on memory size), and MB was consuming all of these.

This was about 2 hours ago. I just looked, and 11 connections are still being consumed by MB. Only I am using the system. Attaching the results of select * from pg_stat_activity;
Here 172.31.23.222 is our MB server.
connections.txt

Will attach another file when I see the connections max out again.

@agilliland agilliland added Limitation and removed Bug labels Feb 11, 2016

@agilliland

This comment has been minimized.

Show comment
Hide comment
@agilliland

agilliland Feb 11, 2016

Contributor

thanks @manurana , I'm going to reclassify this as a limitation for now because from my investigation and your description it sounds like things are functioning okay, but there are cases where the connection pool grows larger than expected or desired.

i'm going to investigate a bit more and see what possible tweaks we can make to avoid this happening. it's possible our connection pooling is being a bit over aggressive. there may also be cases like yours where it would help to offer user controls over the pool size if the database has minimal connections available.

Contributor

agilliland commented Feb 11, 2016

thanks @manurana , I'm going to reclassify this as a limitation for now because from my investigation and your description it sounds like things are functioning okay, but there are cases where the connection pool grows larger than expected or desired.

i'm going to investigate a bit more and see what possible tweaks we can make to avoid this happening. it's possible our connection pooling is being a bit over aggressive. there may also be cases like yours where it would help to offer user controls over the pool size if the database has minimal connections available.

@agilliland

This comment has been minimized.

Show comment
Hide comment
@agilliland

agilliland Feb 14, 2016

Contributor

For SQL databases we are currently creating connection pools that last for 3 hours. I see no reason why we can't safely reduce that to 1 hour which will help with this issue.

I'm going to keep looking at some other settings though. I've seen cases where the connection pool doesn't reclaim connections as reliably as I think it should and so you end up with pools that have more open connections than are really necessary.

Contributor

agilliland commented Feb 14, 2016

For SQL databases we are currently creating connection pools that last for 3 hours. I see no reason why we can't safely reduce that to 1 hour which will help with this issue.

I'm going to keep looking at some other settings though. I've seen cases where the connection pool doesn't reclaim connections as reliably as I think it should and so you end up with pools that have more open connections than are really necessary.

@agilliland agilliland added this to the 0.15.0 milestone Feb 16, 2016

@agilliland

This comment has been minimized.

Show comment
Hide comment
@agilliland

agilliland Feb 23, 2016

Contributor

I added a couple other settings to help with reclaiming extra connections a bit more aggressively now. I'm feeling like this issue can be closed at this point and if anything comes up again we can reopen.

Contributor

agilliland commented Feb 23, 2016

I added a couple other settings to help with reclaiming extra connections a bit more aggressively now. I'm feeling like this issue can be closed at this point and if anything comes up again we can reopen.

@agilliland agilliland closed this Feb 23, 2016

@etlweather

This comment has been minimized.

Show comment
Hide comment
@etlweather

etlweather Mar 13, 2016

Contributor

What settings can the SysAdmin use to control the pool? I would like to set a max size on it.

Contributor

etlweather commented Mar 13, 2016

What settings can the SysAdmin use to control the pool? I would like to set a max size on it.

@etlweather

This comment has been minimized.

Show comment
Hide comment
@etlweather

etlweather Mar 14, 2016

Contributor

@agilliland - you said:

I added a couple other settings to help with reclaiming extra connections a bit more aggressively now.

I am interested in finding out what database settings I can change as SysAdmin to control the connection pool - couldn't find anything in the docs. Thanks!

Contributor

etlweather commented Mar 14, 2016

@agilliland - you said:

I added a couple other settings to help with reclaiming extra connections a bit more aggressively now.

I am interested in finding out what database settings I can change as SysAdmin to control the connection pool - couldn't find anything in the docs. Thanks!

@manurana

This comment has been minimized.

Show comment
Hide comment
@manurana

manurana Mar 16, 2016

Hi @agilliland : Wanted to report back after upgrading to 0.15.1. Still seeing very aggressive use of DB connections by MB. Could be we are using MB a lot, but need a way to limit this. I would rather some users not see MB reports, than other processes not be able to connect to my DB.

This is happening on both PostgreSQL DB instance (which is the meta data DB also) as well as a mySQL DB instance.

So + 1 to what @etlweather said.

BTW : great work on the multi-line graphs ! thanks.

manurana commented Mar 16, 2016

Hi @agilliland : Wanted to report back after upgrading to 0.15.1. Still seeing very aggressive use of DB connections by MB. Could be we are using MB a lot, but need a way to limit this. I would rather some users not see MB reports, than other processes not be able to connect to my DB.

This is happening on both PostgreSQL DB instance (which is the meta data DB also) as well as a mySQL DB instance.

So + 1 to what @etlweather said.

BTW : great work on the multi-line graphs ! thanks.

@agilliland

This comment has been minimized.

Show comment
Hide comment
@agilliland

agilliland Mar 16, 2016

Contributor

There is a separate issue from this one that still needs to be corrected, so I plan to address it as part of 0.16 and include it in our next release. #2038

Feel free to add additional comments there, but after fixing that issue i'm confident the connection usage will be more normal.

Contributor

agilliland commented Mar 16, 2016

There is a separate issue from this one that still needs to be corrected, so I plan to address it as part of 0.16 and include it in our next release. #2038

Feel free to add additional comments there, but after fixing that issue i'm confident the connection usage will be more normal.

@MrMauricioLeite

This comment has been minimized.

Show comment
Hide comment
@MrMauricioLeite

MrMauricioLeite commented Mar 17, 2016

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment