JDBC connection leak on JDBC pool when using Oracle AQ in combination with DefaultMessageListenerContainer [DATAJDBC-8] #265

Closed
spring-projects-issues opened this issue May 5, 2011 · 9 comments
Labels: in: core type: bug

@spring-projects-issues spring-projects-issues commented May 5, 2011

Koen Serneels opened DATAJDBC-8 and commented

We are using the Oracle AQ support from spring-data to run both JMS and JDBC over the same DataSource, with local transactions instead of XA.
The big picture of our setup is basically what is described in the reference manual:

  • on the orcl:aq-jms-connection-factory: use-local-data-source-transaction="true" and native-jdbc-extractor="oracleNativeJdbcExtractor"
  • HibernateTransactionManager (we use our single DataSource now threefold: plain SQL, AQ and Hibernate)
  • Then we have numerous DefaultMessageListenerContainers that receive (JMS) messages over AQ and invoke our listeners.

Everything was working, but after a couple of minutes we got indications that the JDBC connection pool was exhausted and everything got stuck. In the JDBC connection pool monitor we could see that all connections were in use, so something was clearly leaking connections. Since we were using the GlassFish internal JDBC pool, we tried replacing it with Commons DBCP, connecting directly from Spring to the database and bypassing any GlassFish pool service. This did not resolve the problem: we got the exact same issue, an exhausted pool after a few minutes.

Further investigation showed that it was not a normal leak, since the number of physical connections to the database remained steady. Even when the container/DBCP reported that, say, 54 connections (out of a maximum of 100) were in use, we only saw about 6 connections to the database at any given time (6 DefaultMessageListenerContainers on empty queues, each with just 1 thread checking for messages). Even more interesting: when the pool got exhausted and everything got stuck, we saw (immediately afterwards) NO more open connections to the database at all.

So this told us that there was no 'physical' leak, as the connections were all managed/closed properly.
It seemed that the pool was never informed that a connection was closed, and so never marked the connection as 'released' (although the connection did physically close). This seemed weird, as the transaction manager was doing its work correctly: we saw connections being 'returned' to the pool at the end of the transaction.

After some more searching, we found out that only some of the DefaultMessageListenerContainers were leaking (the others all returned their connections at TX commit). The difference was that the leaking containers were 'in error': their corresponding queue was not defined within Oracle. It is in these error scenarios that the following connection-leaking sequence is triggered in DefaultMessageListenerContainer:

  1. DefaultMessageListenerContainer (run() method, L939) somehow encounters an error receiving the message or setting up the connection (in our case the queue was not defined)
  2. In the catch block, recoverAfterListenerSetupFailure() is invoked (L969)
  3. recoverAfterListenerSetupFailure() calls refreshConnectionUntilSuccessful()
  4. refreshConnectionUntilSuccessful() calls Connection con = createConnection(); and then JmsUtils.closeConnection(con); (L859 & L860; note: completely outside of any TX! See the sketch after this list.)
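
To make step 4 concrete, the problematic pattern boils down to something like this (a simplified sketch of the behaviour described above, not the actual Spring source):

[code]
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;

// Simplified sketch (not the actual Spring source) of what steps 1-4 amount to:
// a JMS connection is created purely as a connectivity check and then closed,
// with no Spring-managed transaction active on the calling thread.
final class ConnectionRefreshSketch {

    static void refreshConnection(ConnectionFactory connectionFactory) throws JMSException {
        // With the AQ setup above, createConnection() internally pulls a JDBC
        // connection from the (wrapped) DataSource.
        Connection con = connectionFactory.createConnection();
        // close() ends up on the raw Oracle JDBC connection, so the pool never
        // learns that its slot was given back.
        con.close();
    }
}
[/code]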

Now, here it becomes interesting. At L859 the javax.jms.Connection is obtained through the Oracle AQjmsConnectionFactory.
The AQjmsConnectionFactory was created by AqJmsFactoryBeanFactory, passing in our configured DataSource wrapped in an AqJmsFactoryBeanFactory$TransactionAwareDataSource. So in the end AQjmsConnectionFactory internally obtains a new java.sql.Connection from AqJmsFactoryBeanFactory$TransactionAwareDataSource (which delegates to the JDBC connection pool). However, at L146 of AqJmsFactoryBeanFactory$TransactionAwareDataSource you see this:

[code]
if (TransactionSynchronizationManager.isActualTransactionActive()) {
    if (logger.isDebugEnabled()) {
        logger.debug("Using Proxied JDBC Connection [" + conToUse + "]");
    }
    return getCloseSuppressingConnectionProxy(conToUse);
}
[/code]

In this case the "if" branch is NOT executed and the connection obtained is thus NOT proxied, since NO transaction was started in refreshConnectionUntilSuccessful (and none was active either). So when, at L860 of refreshConnectionUntilSuccessful, the close() is propagated, it is executed on the RAW Oracle connection (T4CConnection.logOff()), making the connection physically disconnect.
However, [incorrect - see comment]since it was not proxied,[/incorrect] there was no call to the pool indicating that the connection was released! This brings the pool into an artificial state in which it thinks its connections are still in use while they are not.
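
For reference, a close-suppressing connection proxy (a simplified sketch, not the actual AqJmsFactoryBeanFactory implementation) does roughly this: inside a transaction, close() is swallowed so that the transaction manager alone decides when the underlying connection goes back to the pool.

[code]
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;

// Simplified sketch of a "close-suppressing" JDBC connection proxy
// (not the actual AqJmsFactoryBeanFactory code).
final class CloseSuppressingConnectionProxyFactory {

    static Connection wrap(final Connection target) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                        if ("close".equals(method.getName())) {
                            return null; // suppress: the TX manager releases the connection later
                        }
                        return method.invoke(target, args);
                    }
                });
    }
}
[/code]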

Our temporary fix will be to customize DefaultMessageListenerContainer by overriding refreshConnectionUntilSuccessful() to do nothing.
But I suspect a better/cleaner fix is possible. Can this be looked at?
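
For illustration, such a customized container could look roughly like this (a sketch of the workaround described above; it assumes refreshConnectionUntilSuccessful() is overridable in your Spring version):

[code]
import org.springframework.jms.listener.DefaultMessageListenerContainer;

// Temporary workaround sketch: a container subclass whose connection-refresh
// step is a no-op, so no connection is fetched and closed outside a transaction.
// Note that the retry-on-setup-failure behaviour is lost with this in place.
public class NoRefreshMessageListenerContainer extends DefaultMessageListenerContainer {

    @Override
    protected void refreshConnectionUntilSuccessful() {
        // intentionally empty: skip the connection check that leaks pool slots
    }
}
[/code]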


Affects: Ext 1.0 M1

Referenced from: commits spring-attic/spring-data-jdbc-ext@1fc1c7a

1 vote, 2 watchers

@spring-projects-issues spring-projects-issues commented May 5, 2011

Koen Serneels commented

Can someone change the priority to high please?

@spring-projects-issues spring-projects-issues commented May 5, 2011

Koen Serneels commented

Remark: my explanation about the CloseSuppressingConnectionProxy is of course not correct.
If it were in a transaction, proxying it with the CloseSuppressingConnectionProxy would make the close NOT propagate to the underlying connection.
In that case everything is probably dealt with by the TX manager (committing, closing, ...), which works on the wrapped java.sql.Connection.

So in this case (no transaction) it seems OK that the proxy is not applied, and it also seems correct that the close is propagated to the underlying connection.
The only problem is that this way the pool is bypassed, since the DataSource passed to the AQjmsConnectionFactory (AqJmsFactoryBeanFactory$TransactionAwareDataSource) returns the RAW Oracle connection when getConnection() is called.
So this bypasses the pool...

@spring-projects-issues spring-projects-issues commented Jul 25, 2011

Thomas Risberg commented

Not having seen this run, I'm guessing that the "refreshConnectionUntilSuccessful" method attempts to open another connection, but this throws an exception, so the JmsUtils.closeConnection(con) call is never executed for that connection; after the logging is done, another connection attempt is made. It's possible that the Oracle JDBC connection is still open at this point, which would eventually lead to an exhausted pool.

If you are still able to reproduce this, could you enable debugging for the DMLC and post the debug logs?

Thanks,
Thomas

@spring-projects-issues spring-projects-issues commented Jul 26, 2011

Koen Serneels commented

Yes, this is still an issue.
As I explained above, Thomas, there is no exception thrown in the 'refreshConnectionUntilSuccessful' method.
I'll again give a short summary of what happens:

We had a configuration issue (a Q did not exist). Because of this, reading a message from the Q failed.
When an exception happens reading from the Q (in DefaultMessageListenerContainer), 'recoverAfterListenerSetupFailure' is executed in the catch block.
'recoverAfterListenerSetupFailure' in turn calls 'refreshConnectionUntilSuccessful'.

Now, when 'refreshConnectionUntilSuccessful' is called, it tries to perform a basic check to find out if the connection to the underlying JMS provider is OK.
In our case (Oracle AQ) this means that a DataSource connection is obtained from the pool.

Everything is OK at this point; the connection is obtained without issues.

Next, it immediately closes the obtained connection.
Here it goes wrong: the connection obtained is the bare Oracle connection.
This is because the AQ infrastructure only works with an 'Oracle connection' and not with any wrapped connection (the exception being the 'CloseSuppressingInvocationHandler' in 'AqJmsFactoryBeanFactory', which returns the target connection; this proxy is however not applied when running outside a transaction).

So the 'close' that JmsUtils issues is called on the physical database connection.
As a result, the pool is never informed that the connection was actually released.
This creates a 'virtual connection leak': the connections are all nicely closed, but the pool is not aware of it.
At some point the pool thinks it has handed out all of its connections. That is not really the case, as they are all perfectly closed, but the result is the same: no more connections are handed out and everything stalls.
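
The pool-level effect can be illustrated with plain DBCP (a hypothetical, self-contained sketch; the connection settings are placeholders, and accessToUnderlyingConnectionAllowed is only enabled here to reach the raw connection the way the AQ code path effectively does):

[code]
import java.sql.Connection;
import org.apache.commons.dbcp.BasicDataSource;
import org.apache.commons.dbcp.DelegatingConnection;

// Illustration of the 'virtual connection leak' with DBCP 1.x (hypothetical
// connection settings). Closing the raw, unwrapped connection instead of the
// pool's wrapper leaves the pool convinced the slot is still in use.
public class VirtualLeakDemo {

    public static void main(String[] args) throws Exception {
        BasicDataSource pool = new BasicDataSource();
        pool.setDriverClassName("oracle.jdbc.driver.OracleDriver");
        pool.setUrl("jdbc:oracle:thin:@host:port:sid"); // placeholder
        pool.setUsername("username");
        pool.setPassword("password");
        pool.setMaxActive(8);
        pool.setAccessToUnderlyingConnectionAllowed(true); // lets us reach the raw connection

        Connection wrapper = pool.getConnection();
        Connection raw = ((DelegatingConnection) wrapper).getInnermostDelegate();

        raw.close(); // physical disconnect, but the pool is never informed

        System.out.println("Active according to pool: " + pool.getNumActive()); // still 1
    }
}
[/code]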

One solution would be to create a connection proxy that is applied when running OUTSIDE a transaction.
This proxy could then have a reference to the pool, so that closes can be signalled to the pool.
Another solution might be to run 'refreshConnectionUntilSuccessful' inside a transaction.
This would, however, require changes to DefaultMessageListenerContainer.

@spring-projects-issues spring-projects-issues commented Jul 26, 2011

Thomas Risberg commented

Thanks for the feedback. I'm setting up a test to reproduce this - could you post your connection configuration?

Thanks,
Thomas

P.S. fixed your edit problem with the strikethrough :)

@spring-projects-issues spring-projects-issues commented Jul 26, 2011

Koen Serneels commented

Ok, thanks.
The setup should be very easy: just set up a DMLC and configure it to listen on a non-existing queue.
Make sure that the Oracle connection etc. is working. My setup was like this:

<bean id="oracleNativeJdbcExtractor" class="org.springframework.jdbc.support.nativejdbc.SimpleNativeJdbcExtractor"/>

<orcl:aq-jms-connection-factory id="connectionFactory" data-source="dataSource"
    use-local-data-source-transaction="true" native-jdbc-extractor="oracleNativeJdbcExtractor"/>

<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" lazy-init="true">
    <property name="driverClassName" value="oracle.jdbc.driver.OracleDriver" />
    <property name="url" value="jdbc:oracle:thin:@host:port:sid"/>
    <property name="username" value="username" />
    <property name="password" value="password" />
</bean>

<bean id="transactionManager" class="org.springframework.orm.hibernate3.HibernateTransactionManager" lazy-init="true">
    <property name="sessionFactory" ref="sessionFactory" />
</bean>

<bean id="sessionFactory" .......

  • The annotation transaction config is not required, as the listener (which you configure on the DMLC) will never be executed in this case (since the Q does not exist).
  • You could probably also leave out the Hibernate SessionFactory and use a DataSourceTransactionManager instead; this should make no difference, as in this test you should never get to the point of using Hibernate.
  • You can also explicitly specify the max number of connections on the pool ('maxActive'). The default should be 8.
  • I used the AQ API and OJDBC6 drivers version 11.2.0.1.0 against a 10.x database (they are backward compatible). However, these are normally not relevant.

When running the test you should see the number of connections in use on the pool rising; however, if you check the actual active connections on Oracle (by querying v$session or similar), you'll see that it remains steady.
For test purposes you can extend the DMLC and override 'refreshConnectionUntilSuccessful' to do nothing.
You'll see that everything then works.

@spring-projects-issues spring-projects-issues commented Jul 26, 2011

Thomas Risberg commented

I haven't been able to duplicate this quite yet. I noticed you are using a DBCP pool. Could you try using an OracleDataSource pool instead:

<bean id="dataSource" class="oracle.jdbc.pool.OracleDataSource">
	<property name="URL" value="${jdbc.url}" />
	<property name="user" value="${jdbc.username}" />
	<property name="password" value="${jdbc.password}" />
	<property name="connectionCachingEnabled" value="true" />
</bean>

Also, can you either post or email me your configuration for the messageListenerContainer? My email is trisberg AT vmware.com.

Thanks,
Thomas

@spring-projects-issues spring-projects-issues commented Jul 27, 2011

Thomas Risberg commented

I was able to configure my test to reproduce this. You are absolutely correct, we need to provide a proxy that will propagate the close call to the pool instead of using the unwrapped raw connection.

Using the Oracle pool implementation bypasses this issue since we don't have to unwrap the pooled connection to use it with the AQ JMS connection factory.

-Thomas

@spring-projects-issues spring-projects-issues commented Jul 27, 2011

Thomas Risberg commented

This is now fixed by wrapping any unwrapped native connections used outside of an existing Tx in a proxy that delegates close calls to the original pool connection.
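
Conceptually, the wrapping described here looks something like the following (a sketch under the assumption that both the unwrapped native connection and the original pooled wrapper are available; not the actual commit):

[code]
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;

// Conceptual sketch of the fix (not the actual commit): outside a transaction,
// hand out a proxy over the unwrapped native connection whose close() is
// redirected to the original pooled wrapper, so the pool gets its slot back.
final class PoolDelegatingConnectionProxyFactory {

    static Connection wrap(final Connection nativeCon, final Connection pooledCon) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                        if ("close".equals(method.getName())) {
                            pooledCon.close(); // return the slot to the pool
                            return null;
                        }
                        // AQ-specific calls still hit the raw Oracle connection
                        return method.invoke(nativeCon, args);
                    }
                });
    }
}
[/code]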

Thanks for reporting this issue.

-Thomas
