
Investigate way to close or log warnings for unclosed connections #39

Closed
xxgreg opened this issue Oct 28, 2014 · 10 comments

Comments

@xxgreg
Owner

xxgreg commented Oct 28, 2014

Original discussion on Dart mailing list.

https://groups.google.com/a/dartlang.org/forum/#!topic/misc/0MKZAIxvme0

I wonder what the typical solution to this is in other mature database libraries, e.g. JDBC drivers. Probably some sort of inactivity timeout on a connection. Perhaps a first step would be just to log warnings for unclosed connections after a minute or two, so that it's possible to diagnose this before running out of connections in production.

There is huge pressure on me to fix this, so these days I'll work on it heavily, and if I come up with a solution I will let you know. Here is an interesting post on Stack Overflow: http://stackoverflow.com/questions/13236160/is-there-a-timeout-for-idle-postgresql-connections. It seems that JDBC drivers can leak connections too, so I guess it should be managed in the app.

@xxgreg
Owner Author

xxgreg commented Oct 28, 2014

I wonder if Zones could help solve this.

So in a web application, handle each HTTP request in a new zone. On leaving the zone, check that all connections are closed; if not, close them and log an error message.
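A minimal sketch of that idea, assuming a hypothetical `Connection` type to stand in for a pooled connection; the only real API used is `runZoned` and zone values from `dart:async`:

```dart
import 'dart:async';

// Hypothetical stand-in for a pooled database connection.
class Connection {
  bool isClosed = false;
  void close() => isClosed = true;
}

const _connectionsKey = #openConnections;

// Obtains a connection and registers it with the surrounding
// request zone, if there is one.
Connection openConnection() {
  final conn = Connection();
  final tracked = Zone.current[_connectionsKey] as List<Connection>?;
  tracked?.add(conn);
  return conn;
}

// Handles one HTTP request in a fresh zone. When the handler's future
// completes, any connection it failed to close is closed and reported.
Future<void> handleRequest(Future<void> Function() handler) {
  final opened = <Connection>[];
  return runZoned(handler, zoneValues: {_connectionsKey: opened})
      .whenComplete(() {
    for (final conn in opened) {
      if (!conn.isClosed) {
        print('warning: request leaked an unclosed connection; closing it');
        conn.close();
      }
    }
  });
}
```

Because the tracking list travels via zone values rather than function arguments, call sites deep inside the handler don't need any extra plumbing to be tracked.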

@pisabev
Contributor

pisabev commented Oct 28, 2014

Yes, I use zones to watch for this problem, exactly in the way you describe: a zone wraps each HTTP request, and it is a life saver. But as the application grows, database access can also be initiated by server events (not only HTTP requests), so you need to remember to wrap every entry point in a zone. That is perfectly understandable, but during rapid development of new features it can be missed, which is why I started looking for some lower-level "backup solution". The app is built to log every exception, and for now the log is clear of errors, yet I still get random pool locks, and the problem is very hard to track. To give you more insight into the app's complexity: we run three HTTP listeners, one for the admin panel, one for REST API communication, and one for the public e-commerce site.

I've used this recursive method to obtain a connection and ensure the process never locks:

```dart
Future connect() {
  return _pool.connect()
      .timeout(new Duration(milliseconds: 500), onTimeout: () {
        // On timeout, close all connections and call connect again.
        _pool.destroy();
        _createPool();
        log.warning('pool destroyed (probably a connection leak)');
        return connect();
      })
      .catchError((e) => log.severe(e));
}
```

But it's too hacky, and it sometimes kills working connections.

This reminds me of another thing that should be done in the driver - it's marked as TODO, but it seems really easy to do. I'm not sure why you left it using print rather than logging?

@xxgreg
Owner Author

xxgreg commented Nov 6, 2014

Sorry about the slow reply.

Are you using the timeout setting on the connection pool? This is supposed to automatically close a connection if it is not returned to the pool within the defined timeout period. It can be set for the pool as a whole, or for a specific connection. I say "supposed" because I have not tested this feature thoroughly. To use it, pass the timeout parameter to the Pool constructor and to the pool.connect() method.

This should help if there is occasionally a leaked connection. However if connections are leaked too quickly then there will still be a problem with running out of connections.
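For concreteness, a sketch of how that might look. The parameter names follow the description above, but the exact constructor and method signatures are assumptions, so check them against the driver before copying:

```dart
// Pool-wide default: close any connection not returned within two minutes.
// (Connection string and other parameters shown are placeholders.)
var pool = new Pool('postgres://user:password@localhost:5432/mydb',
    min: 2,
    max: 10,
    timeout: new Duration(minutes: 2));

// Per-connection override at a specific call site:
pool.connect(timeout: new Duration(seconds: 30)).then((conn) {
  // ... run queries ...
  conn.close(); // always return the connection to the pool
});
```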

I've also made a new ticket for the print statements #40.

I'm spending a bit of time re-reading the pooling code and trying to understand this problem. Just to make sure I understand it: are you saying that Pool.connect() returns a Future which never completes, that this is what causes the application to hang, and that your suspicion is that this is caused by connections accidentally not being closed?

@pisabev
Contributor

pisabev commented Nov 6, 2014

Hi again,
About your last paragraph - that's exactly what I think. I know for sure that this future "takes forever", and I'm not sure of the exact reason, but the most logical one is a connection not being closed, eventually leading to leakage. I have also revised the pool connection code several times and found nothing suspicious.

I use this method rather than the connection timeout provided by the pool, as it seems better to me to check for a connection-creation timeout than to force a timeout on a connection's execution.
These days I tried some testing: a pool with 2 max connections (to get a quick leak) under heavy siege and ab load, but everything seems OK, and yet on a production server I hit the problem once a day. It seems there is some special request situation which I'm missing.

@pisabev
Contributor

pisabev commented Nov 6, 2014

One other thing... Would you consider adding public getters for available/connecting/connections on the Pool object? I think it would make profiling the pool easier than using toString() and regexing.

@xxgreg
Owner Author

xxgreg commented Nov 6, 2014

I've written some tests. The timeout feature isn't working. Once I get a fix for that, I'd recommend trying that out. Set it to a really high value, such as a few minutes.

Another idea which might help with tracking down unclosed connections: add an optional string id parameter to the connect() method. At each place you call connect() in your code, pass a different id. Then add some diagnostics to the pool to print out a list of the currently used connections.

I'll see if I can whip something up.

@xxgreg
Owner Author

xxgreg commented Nov 6, 2014

I've committed a fix for the connection close timeout feature.

I've also added public getters for the diagnostics, including the ability to set a "debugId" for a connection. This should be able to help you track which connections are leaking.
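Something like the following is the intended usage; note that every member name here (the `debugId` parameter and the `available`/`connecting`/`connections` getters) is an assumption about the new API based on this thread, not a confirmed signature:

```dart
// Tag each call site so a leaked connection can be traced back to it
// (parameter name assumed):
pool.connect(debugId: 'orderReport').then((conn) {
  // ... run queries, then conn.close() when done ...
});

// Periodically dump pool state using the new public getters
// (getter names assumed):
new Timer.periodic(new Duration(seconds: 30), (_) {
  print('available: ${pool.available}, connecting: ${pool.connecting}');
  for (var conn in pool.connections) {
    print('  in use: ${conn.debugId}');
  }
});
```

If a connection's debugId keeps showing up in the periodic dump long after its request should have finished, that call site is the likely leak.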

@xxgreg
Owner Author

xxgreg commented Nov 6, 2014

Do you think implementing some of the features in this Java connection pool would help you track down the leaks?

https://github.com/brettwooldridge/HikariCP#frequently-used

@pisabev
Contributor

pisabev commented Nov 7, 2014

Interesting list.
I think these two would solve, or at least help find, problems like this:

  • maxLifetime
    "This property controls the maximum lifetime of a connection in the pool. When a connection reaches this timeout, even if recently used, it will be retired from the pool. An in-use connection will never be retired, only when it is idle will it be removed. We strongly recommend setting this value, and using something reasonable like 30 minutes or 1 hour. A value of 0 indicates no maximum lifetime (infinite lifetime), subject of course to the idleTimeout setting. Default: 1800000 (30 minutes)"
    I think this will prevent a great deal of applications from leaking. They "strongly" recommend this option.
  • leakDetectionThreshold - a variant of the current pool timeout
    "This property controls the amount of time that a connection can be out of the pool before a message is logged indicating a possible connection leak. A value of 0 means leak detection is disabled. Lowest acceptable value for enabling leak detection is 10000 (10 secs). Default: 0"
    This seems almost the same as the current connection timeout setting, with the difference that it will not kill the connection, just log a possible leak. I'm not sure which is best - logging and killing (as it is now) or just logging - but the logging will be of great help in detecting (now with debugId - a great addition) a possible leaking request.

So I think the first is mandatory for a long-running, stable pooling application. It will not prevent rapid leakage, but that kind of problem should be easy to track down anyway.
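If both settings were adopted, a pool configuration might look something like this. The option names mirror HikariCP's and are assumptions about what the Dart driver would call them:

```dart
var pool = new Pool(uri,
    max: 10,
    // Retire any connection older than this once it goes idle,
    // even if it was recently used (HikariCP's maxLifetime).
    maxLifetime: new Duration(minutes: 30),
    // Log, but do not kill, connections held out of the pool for too
    // long (HikariCP's leakDetectionThreshold).
    leakDetectionThreshold: new Duration(seconds: 10));
```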

@xxgreg
Owner Author

xxgreg commented Dec 2, 2014

These are all implemented now. I will be releasing 0.3.0 soon. I'll close this bug for now, but feel free to reopen it if you have more feedback regarding the problems you've been having with connection leaks.

@xxgreg xxgreg closed this as completed Dec 2, 2014