Keep connection pool errors occurring #1628

Closed
DJingSeo opened this Issue Jun 30, 2017 · 8 comments

DJingSeo commented Jun 30, 2017

Steps to reproduce

I currently use Npgsql to insert records into PostgreSQL at a rate of about 120,000 records per hour.

The problem occurs roughly once every two weeks, so it is hard to reproduce, and the detailed Npgsql log is too large to store.

The issue

Once Npgsql throws a connection pool exhausted exception, it keeps throwing it continuously until I reset IIS manually. In the meantime, calling the Npgsql API for resetting connection pools doesn't resolve the symptoms either.

Exception message: The connection pool has been exhausted, either raise MaxPoolSize
Stack trace:

Further technical details

Npgsql version: 3.2.2
.NET framework version: 4.5.2
Operating system: Windows Server 2008
Only async methods are used

I would appreciate any advice related to this issue.

Member

roji commented Jun 30, 2017

@DJingSeo just so I'm sure I understand:

  • Is the first pool exhaustion legitimate? I mean, is it possible that your application does actually exhaust the pool?
  • Are you saying that once you got that first (legitimate) exhaustion exception, the pool somehow enters a bad state where it starts throwing non-legitimate exceptions (i.e. you're sure that you've closed other connections which are now idle in the pool and which should get reused)?
  • Calling NpgsqlConnection.ClearPool() in this state doesn't change anything, after this call non-legitimate exhaustion exceptions continue to be thrown
  • The only way to resolve this is to restart the application entirely (IIS reset)

DJingSeo commented Jun 30, 2017

  • It happens when the applications get a lot of traffic (not always, but sometimes), so I believe the first pool exhaustion is legitimate.

  • Yes, and I'm pretty sure that I open every connection right before executing a query and close it right after the execution. (I execute queries using Dapper and Npgsql.) I can post a snippet of the code I use if you want.

  • Good to know

  • Correct

I have several PostgreSQL databases on Amazon. One of them is used as a global database and the others are regional (Oregon, N. Virginia, and so on). For this reason, an application has four Npgsql connection strings, used as follows:

  1. selecting records from the global database
  2. populating records to the global database
  3. selecting records from a regional database
  4. populating records to a regional database

Most occurrences of the issue have been during 'populating records to a regional database', but selecting has also suffered from it on rare occasions.

One more thing: the issue with selection seems to recover a few minutes after the first exhaustion (it didn't require an IIS reset). The issue with insertion, however, doesn't seem to recover, as I described in my first post. The issue may be related to execution time, but I'm not sure about the cause.

Feel free to let me know if you need anything from me.

Member

roji commented Jun 30, 2017

Calling NpgsqlConnection.ClearPool() in this state doesn't change anything, after this call non-legitimate exhaustion exceptions continue to be thrown

Good to know

That was a question, not a piece of information :) I just wanted to make sure that after calling NpgsqlConnection.ClearPool() the pool remains in the bad state.

Unfortunately there's not much I can do without some sort of hint or code sample which would allow me to reproduce the issue... A look at the pool code doesn't reveal any obvious bug that would explain this, and it's the first time someone has complained about it.

Are you sure there's no chance all these exceptions are simply legitimate, i.e. that you're trying to open more connections than allowed by your Max Pool Size parameter? The fact that the issue disappears after a few minutes seems to point that way - if an actual bug in the pool led it to enter a bad state, it's unlikely it would exit that bad state on its own after some time...

DJingSeo commented Jun 30, 2017

I tried to revive the pools by calling "Npgsql.NpgsqlConnection.ClearAllPools();" a few weeks ago, but it kept throwing the exceptions.

I tried to reproduce this issue for more than 4 months using JMeter and other load-testing tools, but I failed. That's why I didn't raise this issue earlier. :-(

I'm not sure that the exceptions are legitimate. Most of the time, the issue doesn't disappear without resetting IIS.

I can't capture the detailed Npgsql log from the start of the web application, but I can capture it while the issue is happening on a server. Do you think this log could help you investigate? If so, I believe I can provide it in a few weeks.

Member

roji commented Jun 30, 2017

I'm not sure what the log would teach us, but it's worth a try. First, if there's some edge case where you're forgetting to properly close connections (i.e. connection leak), the logs could help diagnose that - although that doesn't sit well with the exhausted state disappearing on its own. If we find some exception in the log obviously that would be great, but other than that....

But it's worth a try, especially if you can get logs with loglevel trace. Other than that there's really nothing much I can do.
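
To illustrate the kind of connection-leak edge case mentioned above, here is a minimal sketch in Python. It is a toy model, not Npgsql or Dapper code: `LeakDemoPool`, `leaky_query`, `safe_query`, and `leaked_after` are hypothetical names invented for this example. It only shows how a query path that skips the close on an exception permanently consumes pool slots, while a try/finally (the analogue of a C# `using` block) always returns them.

```python
from contextlib import contextmanager


class LeakDemoPool:
    """Hypothetical toy pool that only counts checked-out connections."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.max_size:
            raise RuntimeError("pool exhausted")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


def leaky_query(pool, fail):
    """Opens, runs, closes; the close is skipped if the query raises."""
    pool.acquire()
    if fail:
        raise ValueError("query failed")  # release() below never runs
    pool.release()


@contextmanager
def connection(pool):
    """Guarantees the slot is returned, even when the body raises."""
    pool.acquire()
    try:
        yield
    finally:
        pool.release()


def safe_query(pool, fail):
    with connection(pool):
        if fail:
            raise ValueError("query failed")


def leaked_after(query, failures=5, max_size=10):
    """Runs `failures` failing queries and reports slots still marked in use."""
    pool = LeakDemoPool(max_size)
    for _ in range(failures):
        try:
            query(pool, fail=True)
        except ValueError:
            pass
    return pool.in_use
```

In this model, each failing call through `leaky_query` leaves one slot marked as in use, so enough failures would exhaust the pool even though the application believes it closes every connection. Whether anything like this happens in the reporter's code is not known from the thread.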

DJingSeo commented Jul 1, 2017

I will come back with the log when it happens.

Thanks for your consideration on this matter :-)

Member

roji commented Nov 12, 2017

Closing for age, but don't hesitate to post back here if you have more info.

roji closed this Nov 12, 2017

roji added the invalid label and removed the waiting for answer label Nov 12, 2017

patrickkferguson commented Nov 12, 2017

@roji This was probably the same issue that I raised in #1657 (which you have fixed). When the task transition exception is thrown, the busy counter is not decremented. If it happens enough, then busy will be stuck at max and any attempt to open a new connection will time out.
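
The failure mode described in this comment can be sketched with a toy model. This is Python written for illustration, not Npgsql's actual pool code: `CounterPool`, `release_on_error`, `slots_lost`, and `is_stuck` are hypothetical names, the `RuntimeError` merely stands in for the exception mentioned above, and the toy raises immediately where a real pool might wait and time out.

```python
class PoolExhausted(Exception):
    """Raised by the toy pool when no slot is free."""


class CounterPool:
    """Hypothetical pool that tracks checked-out connections in a counter."""

    def __init__(self, max_size, release_on_error):
        self.max_size = max_size
        # False models the suspected bug: an error path that forgets
        # to give the reserved slot back.
        self.release_on_error = release_on_error
        self.busy = 0

    def open(self, fail=False):
        if self.busy >= self.max_size:
            raise PoolExhausted("pool exhausted")
        self.busy += 1  # slot is reserved before the connection is handed out
        if fail:
            if self.release_on_error:
                self.busy -= 1  # fixed behaviour: undo the reservation
            raise RuntimeError("open failed after reserving a slot")

    def close(self):
        self.busy -= 1


def slots_lost(release_on_error, failures=3, max_size=3):
    """Returns the busy count after `failures` failed opens and no real use."""
    pool = CounterPool(max_size, release_on_error)
    for _ in range(failures):
        try:
            pool.open(fail=True)
        except RuntimeError:
            pass
    return pool.busy


def is_stuck(release_on_error):
    """True if a normal open is refused although no connection is in use."""
    pool = CounterPool(3, release_on_error)
    for _ in range(3):
        try:
            pool.open(fail=True)
        except RuntimeError:
            pass
    try:
        pool.open()
    except PoolExhausted:
        return True
    pool.close()
    return False
```

With `release_on_error=False`, the counter reaches the maximum although no connection is actually in use, and every later `open()` is refused. In this model, nothing the caller does by closing its own connections can bring the counter back down.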
