Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix problem with concurrent access to different ConnectionPool instances #737

Closed
wants to merge 1 commit into from

Conversation

pb00067
Copy link

@pb00067 pb00067 commented Jul 1, 2024

The Tomcat-JDBC-Connection-Pool is designed to handle more pool (access to different databases) instances.
However there's a single static instance for cleaning the pool(s) named poolCleanTimer which apparently has to handle the cleaning for all used pools.

When accessing (opening/closing) different connection pools out from different threads we often run into the exception/stacktrace reported below. We use Hibernate version 6.4 and Tomcat version 9.0.90.
This pull request fixes this by using a common lock instance for both methods registerCleaner and unregisterCleaner.

org.hibernate.service.spi.ServiceException: Unable to create requested service [org.hibernate.engine.jdbc.env.spi.JdbcEnvironment] due to: Could not create a TomcatJDBC pool
at org.hibernate.service.internal.AbstractServiceRegistryImpl.createService(AbstractServiceRegistryImpl.java:276)
at org.hibernate.service.internal.AbstractServiceRegistryImpl.initializeService(AbstractServiceRegistryImpl.java:238)
at org.hibernate.service.internal.AbstractServiceRegistryImpl.getService(AbstractServiceRegistryImpl.java:215)
at org.hibernate.boot.model.relational.Database.<init>(Database.java:45)
at org.hibernate.boot.internal.InFlightMetadataCollectorImpl.getDatabase(InFlightMetadataCollectorImpl.java:223)
at org.hibernate.boot.internal.InFlightMetadataCollectorImpl.<init>(InFlightMetadataCollectorImpl.java:191)
at org.hibernate.boot.model.process.spi.MetadataBuildingProcess.complete(MetadataBuildingProcess.java:170)
at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.metadata(EntityManagerFactoryBuilderImpl.java:1432)
at org.hibernate.jpa.boot.internal.EntityManagerFactoryBuilderImpl.build(EntityManagerFactoryBuilderImpl.java:1503)
at org.hibernate.jpa.HibernatePersistenceProvider.createEntityManagerFactory(HibernatePersistenceProvider.java:55)
at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:80)
at com.wuerth.phoenix.bcserver.base.DatabaseServer.initServer(DatabaseServer.java:325)
at com.wuerth.phoenix.bcserver.base.DatabaseServer.<init>(DatabaseServer.java:92)
at com.wuerth.phoenix.bcserver.base.ComponentServer.init(ComponentServer.java:140)
at com.wuerth.phoenix.bcserver.base.PeerContextFactory.<init>(PeerContextFactory.java:21)
at com.wuerth.phoenix.bcutil.ContextFactory.instantiate(ContextFactory.java:115)
at com.wuerth.phoenix.bcutil.ContextFactory.init(ContextFactory.java:141)
at com.wuerth.phoenix.cis.database.DatabaseController.<init>(DatabaseController.java:11)
at com.wuerth.phoenix.cis.database.DatabaseController.getDatabaseController(DatabaseController.java:17)
at com.wuerth.phoenix.cis.test.CTSTransationManager$CTSContext.<init>(Unknown Source)
at com.wuerth.phoenix.cis.test.CTSTransationManager.createContext(Unknown Source)
at com.wuerth.phoenix.cis.test.core.database.CTCDBAdapter.beforeEach(Unknown Source)
at com.wuerth.phoenix.cis.test.core.CTCJunit5.beforeEach(Unknown Source)
at com.wuerth.phoenix.cis.test.CISTestBase.setUp(Unknown Source)
at com.wuerth.phoenix.cis.esg.input.gri.EsgInputGriModelTest.setUp(Unknown Source)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Caused by: org.hibernate.HibernateException: Could not create a TomcatJDBC pool
at org.hibernate.connection.TomcatJDBCConnectionProvider.configure(TomcatJDBCConnectionProvider.java:191)
at org.hibernate.boot.registry.internal.StandardServiceRegistryImpl.configureService(StandardServiceRegistryImpl.java:136)
at org.hibernate.service.internal.AbstractServiceRegistryImpl.initializeService(AbstractServiceRegistryImpl.java:247)
at org.hibernate.service.internal.AbstractServiceRegistryImpl.getService(AbstractServiceRegistryImpl.java:215)
at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.buildJdbcConnectionAccess(JdbcEnvironmentInitiator.java:395)
at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.getJdbcEnvironmentUsingJdbcMetadata(JdbcEnvironmentInitiator.java:262)
at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:107)
at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:68)
at org.hibernate.boot.registry.internal.StandardServiceRegistryImpl.initiateService(StandardServiceRegistryImpl.java:130)
at org.hibernate.service.internal.AbstractServiceRegistryImpl.createService(AbstractServiceRegistryImpl.java:263)
... 32 more
Caused by: java.lang.IllegalStateException: Timer already cancelled.
at java.base/java.util.Timer.sched(Timer.java:409)
at java.base/java.util.Timer.schedule(Timer.java:260)
at org.apache.tomcat.jdbc.pool.ConnectionPool.registerCleaner(ConnectionPool.java:1537)
at org.apache.tomcat.jdbc.pool.ConnectionPool.access$400(ConnectionPool.java:57)
at org.apache.tomcat.jdbc.pool.ConnectionPool$PoolCleaner.start(ConnectionPool.java:1618)
at org.apache.tomcat.jdbc.pool.ConnectionPool.initializePoolCleaner(ConnectionPool.java:556)
at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:465)
at org.apache.tomcat.jdbc.pool.ConnectionPool.<init>(ConnectionPool.java:155)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:118)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:107)
at org.hibernate.connection.TomcatJDBCConnectionProvider.configure(TomcatJDBCConnectionProvider.java:157)
... 41 more

P.S.: Since a user might use a different pool-configuration for each database, it is questionable if we should keep just one common poolCleanTimer. I think from architecture point of view it would be better to have one poolCleanTimer per each single pool instance.

Caused by: org.hibernate.HibernateException: Could not create a TomcatJDBC pool
	at org.hibernate.connection.TomcatJDBCConnectionProvider.configure(TomcatJDBCConnectionProvider.java:191)
	at org.hibernate.boot.registry.internal.StandardServiceRegistryImpl.configureService(StandardServiceRegistryImpl.java:136)
	at org.hibernate.service.internal.AbstractServiceRegistryImpl.initializeService(AbstractServiceRegistryImpl.java:247)
	at org.hibernate.service.internal.AbstractServiceRegistryImpl.getService(AbstractServiceRegistryImpl.java:215)
	at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.buildJdbcConnectionAccess(JdbcEnvironmentInitiator.java:395)
	at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.getJdbcEnvironmentUsingJdbcMetadata(JdbcEnvironmentInitiator.java:262)
	at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:107)
	at org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService(JdbcEnvironmentInitiator.java:68)
	at org.hibernate.boot.registry.internal.StandardServiceRegistryImpl.initiateService(StandardServiceRegistryImpl.java:130)
	at org.hibernate.service.internal.AbstractServiceRegistryImpl.createService(AbstractServiceRegistryImpl.java:263)
	... 32 more
Caused by: java.lang.IllegalStateException: Timer already cancelled.
	at java.base/java.util.Timer.sched(Timer.java:409)
	at java.base/java.util.Timer.schedule(Timer.java:260)
	at org.apache.tomcat.jdbc.pool.ConnectionPool.registerCleaner(ConnectionPool.java:1537)
	at org.apache.tomcat.jdbc.pool.ConnectionPool.access$400(ConnectionPool.java:57)
	at org.apache.tomcat.jdbc.pool.ConnectionPool$PoolCleaner.start(ConnectionPool.java:1618)
	at org.apache.tomcat.jdbc.pool.ConnectionPool.initializePoolCleaner(ConnectionPool.java:556)
	at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:465)
	at org.apache.tomcat.jdbc.pool.ConnectionPool.&lt;init&gt;(ConnectionPool.java:155)
	at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:118)
	at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:107)
	at org.hibernate.connection.TomcatJDBCConnectionProvider.configure(TomcatJDBCConnectionProvider.java:157)
	... 41 more
@markt-asf
Copy link
Contributor

I have two concerns with this PR.

  1. Given the existing synchronized keyword on both registerCleaner() and unregisterCleaner() I don't see how execution could reach line 1537 with poolCleanTimer both non-null and cancelled. The only place poolCleanTimer si cancelled is from within unregisterCleaner() where it is immediately set to null. Not that it should matter since everything is within a synchronized but poolCleanTimer is also volatile.

  2. The proposed code change has no functional change since all it does is changed the monitor from the class to a specific object. Actually it does change the behaviour of getPoolCleaners() and getPoolTimer() but since those are only used by Tomcat's unit tests that should be OK.

I wonder if some code somewhere is calling getPoolTimer() and cancelling it. If that is the case, making those test methods package private rather than public should help.

@pb00067
Copy link
Author

pb00067 commented Jul 2, 2024

Hi Thomas,

Given the existing synchronized keyword on both registerCleaner() and unregisterCleaner() I don't see how execution could >>reach line 1537 with poolCleanTimer both non-null and cancelled.

please notice that both methods registerCleaner() and unregisterCleaner() having the synchronized keyword does not prevent that one thread executes registerCleaner() while another thread simulteneously executes unregisterCleaner() .

where it is immediately set to null

yes, after the cancel call the variable is set to null immediately, but apparently the cancel call takes some time to finish,
so the time window between terminating the timer and resetting the variable to null is big enough to create trouble.
The exception I posted is not something that happened sporadic, we have it systematically running automatic tests.

Not that it should matter since everything is within a synchronized

Anyway I don't have handwritten the exeption/stacktrace I've posted. It's a real issue we face every single automatic-test-run.

@pb00067
Copy link
Author

pb00067 commented Jul 2, 2024

I wonder if some code somewhere is calling getPoolTimer() and cancelling it. If that is the case, making those test methods >>package private rather than public should help.

We are not calling getPoolTimer() from outside and cancelling it explicitely.
We just close the connectionPool at the end of a unit.
Closing a connectionPool cancels the PoolTimer which is wrong from architechure point of view since there might coexist other running session pools. Currently to me it seems the share a common pooltimer which get canceled once we close a connection pool.

@markt-asf
Copy link
Contributor

please notice that both methods registerCleaner() and unregisterCleaner() having the synchronized keyword does not prevent that one thread executes registerCleaner() while another thread simulteneously executes unregisterCleaner() .

Yes, it does.

The chances of changes being applied without a valid explanation of the root cause are extremely low.

@pb00067
Copy link
Author

pb00067 commented Jul 3, 2024

please notice that both methods registerCleaner() and unregisterCleaner() having the synchronized keyword does not prevent that one thread executes registerCleaner() while another thread simulteneously executes unregisterCleaner() .

I guess it is possible (the exception & stacktrace in fact proves that it is possible) when the 2 threads use 2 different class-loaders. Thats the unique explanation I have for the moment. I will investigate later, for the moment I have other works to do.

@markt-asf
Copy link
Contributor

That the threads have different context class loaders doesn't explain the behaviour you are seeing. As per section 8.4.3.6 of the JLS, there is a single monitor for both methods.

Classes are uniquely defined by name and class loader so it is possible you have two classes named org.apache.tomcat.jdbc.pool.ConnectionPool loaded but that wouldn't explain the behaviour you are seeing since each has its own static fields.

What is required to progress this is a minimal test case that reproduces the issue. Ideally, this would be as an addition to the existing JDBC Pool test cases.

@pizzi80
Copy link

pizzi80 commented Jul 18, 2024

It would be nice to replace all these synchronized methods/blocks
with a real Lock to be more VirtualThread friendly.

https://stackoverflow.com/questions/76618264/synchronize-block-pinning-virtual-thread

In other projects I use a mini util methods to simplify the typical Lock pattern usage

for example take a look at this PR:

https://github.com/omnifaces/omnifaces/pull/815/files

@ChristopherSchultz
Copy link
Contributor

There isn't much here that isn't Virtual Thread friendly.

VTs will indeed be pinned while running a synchronized block (at least in today's JVMs), but these blocks are short and shouldn't cause much delay in any kind of processing. Adding the use of a Lock could be done but IMHO isn't necessary as it merely introduces more lines of code at the expense of readability and potential bugs (it's difficult to screw-up a synchronized block and much easier to do use the Lock incorrectly).

This just looks like a patch that is looking for a reason to apply it. I see no reason whatsoever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants