Possible resource leak or MySQL concurrent connection issue #76
Sorry for the collapsed stacktraces; I copied those from ELK, which seems to have removed the line endings. |
Setting useSSL=false in the connection string worked. Candidates for why useSSL made a difference are currently:
A factor that may be related to useSSL=false but not to the issues above (as it occurred after the log entries above) is:
|
For avoidance of doubt, the pool exhaustion errors are still happening after "useSSL=false" was added, with the following from an hour ago:
|
@ansell raise timeout to 60 seconds? |
I am not sure what steps to take; I am just collecting debugging information to improve the lists server's availability. If it is a resource leak, it is possible that some of the 6 connections in the pool are permanently blocked, but that is difficult to say without taking a heap dump or attaching a debugger. The log file above is from bie-b4-b if you want to log in now and do some forensics. Forensics could help to distinguish between the long query case (where the increased timeout could help) and the connection pool size and resource leak possibilities, where an increased timeout wouldn't matter and could possibly accentuate the issue by blocking some of the connections for longer. |
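To illustrate the distinction drawn above, here is a minimal sketch that models the pool as a `java.util.concurrent.Semaphore` with 6 permits. It is not the lists server's actual pool implementation; the class name, method names, and timings are hypothetical. In the slow-query case a longer wait eventually gets a connection; in the leak case no timeout helps.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolTimeoutSketch {
    // Pool size taken from the comment above; everything else is illustrative.
    static final int POOL_SIZE = 6;

    /** Slow-query case: every connection is busy, but all are returned after queryMillis. */
    public static boolean acquireWhileQueriesRun(long queryMillis, long timeoutMillis) {
        Semaphore pool = new Semaphore(POOL_SIZE);
        try {
            pool.acquire(POOL_SIZE); // all connections checked out
            Thread worker = new Thread(() -> {
                try {
                    Thread.sleep(queryMillis); // stand-in for long-running queries
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.release(POOL_SIZE); // connections go back to the pool
                }
            });
            worker.start();
            boolean acquired = pool.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            worker.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Leak case: every connection was checked out and is never returned. */
    public static boolean acquireAfterLeak(long timeoutMillis) {
        Semaphore pool = new Semaphore(POOL_SIZE);
        try {
            pool.acquire(POOL_SIZE); // leaked: no release ever happens
            return pool.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("slow queries, short timeout: " + acquireWhileQueriesRun(300, 50));
        System.out.println("slow queries, long timeout:  " + acquireWhileQueriesRun(300, 2000));
        System.out.println("leak, long timeout:          " + acquireAfterLeak(500));
    }
}
```

In this model only the second call obtains a connection, which is why forensics on the real server would be needed before deciding whether raising the timeout is worthwhile.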
This is an example of a stacktrace taken while the lists server was having issues: bie-b4-a-connectionpooldebug-stacktraces.txt The process for generating one of those is to run ps to find the process id:
Then use the process id in the following (replace "PID" with the process id you found above):
|
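An external dump of a running JVM by process id is typically taken with a tool such as `jstack`. The sketch below is a different, in-process technique: it uses `Thread.getAllStackTraces()` to summarise how many threads are in each state, which can hint at threads stuck waiting for a pooled connection. The class and method names are hypothetical, and the code would have to run inside the affected JVM.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSummary {
    /** Counts live threads in this JVM by their current Thread.State. */
    public static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A large WAITING or TIMED_WAITING count alongside pool exhaustion
        // errors would be worth comparing against the full stack traces.
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```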
Tracked down the issue to |
Executed one of the dr652 queries on the mysql command line tool and it took 9 seconds, so there is room for performance improvements:
|
The query examining the most records is:
|
The following query has an embedded count and an embedded concat, and is passing a large number of records:
|
Both biocache-b4-a and biocache-b4-b, which are behind a load balancer, are showing symptoms of a possible resource leak or a MySQL concurrent connection issue. The symptoms reappeared after a full system restart.
Some stacktraces when it fails look like:
On restart, the following occurred: