
NonXa connection factory exhausts the pool during oracle db connection reap in a particular scenario #66

Closed
rmaheedharan opened this issue Oct 31, 2018 · 5 comments


@rmaheedharan

jta timeout = 120 seconds
non-xa conn factory reap timeout = 180 seconds.

The application has a transaction that runs 1000+ update statements one after the other and is expected to complete all of them within 60 seconds. In one particular case, one of the update statements (in our case the 119th) ran long due to a row-level lock held by a different application. This caused the transaction to take more than 120 seconds, so JTA marked the transaction as timed out, but the update statement kept running in the backend. When it reached 180 seconds, the connection was reaped and Atomikos marked that connection as erroneous.

The update statements are still being executed, so the 120th update asks for a connection for this transaction. Since the existing connection is marked as erroneous, a new connection is created and given to the application. This connection is not released back to the pool after the update completes (my assumption is that the global transaction is not yet complete). When the 121st update asks for a connection, the expectation is that the pool will hand back the connection used for the 120th update, but it again sees the erroneous connection as the first item in the connection iterator, tries it, fails when attempting to recycle it, and hands out yet another new connection. So as the updates continue to run, the pool keeps creating new connections until it is exhausted at the max pool setting.
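To make the failure mode concrete, here is a minimal, self-contained simulation of the behavior described above. It is *not* Atomikos source code; the class and method names are hypothetical, and the only assumption encoded is the one stated in this report: the recycler inspects only the first pooled connection and gives up when it is erroneous, so every subsequent request creates a fresh connection.

```java
import java.util.ArrayList;
import java.util.List;

public class PoolGrowthSketch {
    // Hypothetical stand-in for a pooled connection.
    static class Conn {
        boolean erroneous;
        Conn(boolean erroneous) { this.erroneous = erroneous; }
    }

    static List<Conn> pool = new ArrayList<>();

    // Mimics the reported recycle behavior: only the first candidate is
    // inspected, and an erroneous one aborts the recycle attempt entirely.
    static Conn borrow() {
        if (!pool.isEmpty() && !pool.get(0).erroneous) {
            return pool.get(0); // recycled an existing connection
        }
        Conn fresh = new Conn(false); // recycle failed -> create a new connection
        pool.add(fresh);
        return fresh;
    }

    public static void main(String[] args) {
        pool.add(new Conn(true)); // the reaped, erroneous connection sits first
        for (int i = 0; i < 10; i++) {
            borrow(); // each subsequent update statement asks for a connection
        }
        // The pool grows on every borrow, even though healthy connections
        // exist behind the erroneous head entry.
        System.out.println("pool size = " + pool.size());
    }
}
```

Ten borrows against a pool whose head is erroneous yield ten new connections, which scaled up to 1000+ updates is exactly the exhaustion seen here.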

Below is the error it puts in the log when it fails to recycle the erroneous connection. My understanding is that this connection must not be considered recyclable in the first place.

2018-10-30 00:46:46 DEBUG ConnectionPool:45 - atomikos connection pool '<conn pool>': error while trying to recycle
com.atomikos.datasource.pool.CreateConnectionException: AtomikosNonXAPooledConnection: connection is erroneous

Please let me know if we can overcome this situation by some means, but to me it appears like a gap in the code.

Thanks
Maheedharan

@GuyPardon
Contributor

Thanks,

So basically your point / problem is: recycleConnectionIfPossible should not give up after an exception, but rather iterate over the remainder of the pool.

Correct?
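If I understand the proposal, the recycle path would scan past an erroneous head entry rather than aborting. A hypothetical sketch of that change (illustrative names only, not the actual `recycleConnectionIfPossible` implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class RecycleRemainderSketch {
    static class Conn {
        final boolean erroneous;
        Conn(boolean erroneous) { this.erroneous = erroneous; }
    }

    // Proposed behavior: keep iterating over the remainder of the pool
    // instead of giving up after the first erroneous connection.
    static Conn recycleIfPossible(List<Conn> pool) {
        for (Conn c : pool) {
            if (!c.erroneous) {
                return c; // found a healthy connection behind the bad one
            }
        }
        return null; // nothing recyclable -> caller creates a new connection
    }

    public static void main(String[] args) {
        List<Conn> pool = new ArrayList<>();
        pool.add(new Conn(true));  // reaped connection at the head
        pool.add(new Conn(false)); // healthy connection behind it
        Conn c = recycleIfPossible(pool);
        System.out.println(c == null ? "new connection" : "recycled");
    }
}
```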

@rmaheedharan
Author

Yes, correct. I would go one step further and consider the connection ineligible for recycling in the first place, since connections marked as erroneous cannot be recycled as per the code.
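In other words, an erroneous connection should never appear among the recycle candidates at all, e.g. by evicting it from the pool at the moment it is marked erroneous. A hypothetical sketch of that alternative (illustrative names, not Atomikos source):

```java
import java.util.ArrayList;
import java.util.List;

public class EvictOnErrorSketch {
    static class Conn {
        boolean erroneous;
    }

    // Suggested behavior: marking a connection erroneous also removes it
    // from the pool, so the recycle path never iterates over it.
    static void markErroneous(List<Conn> pool, Conn c) {
        c.erroneous = true;
        pool.remove(c); // evict immediately instead of leaving it as a candidate
    }

    public static void main(String[] args) {
        List<Conn> pool = new ArrayList<>();
        Conn reaped = new Conn();
        Conn healthy = new Conn();
        pool.add(reaped);
        pool.add(healthy);
        markErroneous(pool, reaped);
        // Only the healthy connection remains as a recycle candidate.
        System.out.println("candidates = " + pool.size());
    }
}
```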

@GuyPardon GuyPardon added this to the TransactionsEssentials 4.0.7 milestone Nov 1, 2018
@GuyPardon GuyPardon added bug It doesn't work as could be reasonably expected. and removed bug It doesn't work as could be reasonably expected. labels Nov 1, 2018
@GuyPardon GuyPardon removed this from the TransactionsEssentials 4.0.7 milestone Nov 1, 2018
@GuyPardon
Contributor

OK, thanks - I've created a feature request for this. I am closing this issue for now - re-open if you disagree?

@lewisdavidcole

lewisdavidcole commented Nov 1, 2018

Hi Guy,

I came across this thread and the description is very close to what we have been experiencing with our application. Similar to what @rmaheedharan has experienced, we see this behavior when we have database blocks. When we have a long block that doesn't clear on its own, we sometimes have a DBA terminate a session to clear it. At first everything seems fine: other blocked transactions flow through and the issue appears resolved. However, in cases where a process continues to run, we start to see the connection pool grow rapidly until we exhaust a pool configured with 400 max connections.

This seems closely related, and when it occurs, our only recourse is to restart the application. Would you not consider this a legitimate and somewhat nasty bug rather than a feature request?

@GuyPardon
Contributor

Hi @lewisdavidcole - thanks for the feedback.

I would consider it more of an efficiency enhancement, but I can add a "bug" label if you prefer.

However, your description sounds like there are other problems (i.e., long DB blocks). Are you using XA or non-XA for that?

Let me know...
Thanks
