Description
Driver version
Driver 7.2.2-jre8
SQL Server version
SQL 2016 Enterprise with multi-subnet HA
Client Operating System
Centos 7.6
JAVA/JVM version
Oracle JDK 1.8.192
Table schema
N/A
Problem description
-
Expected behaviour:
Invoking isValid operation on connection with the timeout parameter should not take longer than the timeout configured -
Actual behaviour:
If DB failover happens, sometimes we are getting TCP socket read hanging forever in JDBC Driver (see stacktrace below).
This causes our connection pool to completely hang, as write lock acquired on connection pool object is never released during testing if connections in pool are healthy.
Currently as per implementation isValid timeout is set as query timeout, but if I understand the code correctly, this is server-side timeout, while our problem happens on the client side.
Therefore the current implementation of isValid timeout is not handling correctly client-side problems. -
Error message/stack trace:
N/A as everything hangs -
Any other details that can be helpful:
I believe this is similar problem to that one: Database connection hangs on failover from one node to another #764
Except we are getting it almost every time.
I think the solution might be to temporarily change TCP Socket timeout from the one set by connection parameter, to the one set in isValid parameter to make sure TCP timeout equals to isValid timeout, then after successful validation change TCP timeout to original value.
JDBC trace logs
Thread dump:
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:2023)
at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6418)
- locked <0x00000007266a5fd0> (a com.microsoft.sqlserver.jdbc.TDSReader)
at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:7579)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:866)
at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:768)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7194)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2935)
- locked <0x00000007266a62b0> (a java.lang.Object)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:248)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:223)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQueryInternal(SQLServerStatement.java:700)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.isValid(SQLServerConnection.java:5591)
at com.microsoft.sqlserver.jdbc.SQLServerConnectionPoolProxy.isValid(SQLServerConnectionPoolProxy.java:480)
at org.apache.aries.transaction.jdbc.internal.ValidatingDelegatingManagedConnectionFactory.isValidConnection(ValidatingDelegatingManagedConnectionFactory.java:59)
at org.apache.aries.transaction.jdbc.internal.ValidatingDelegatingManagedConnectionFactory.getInvalidConnections(ValidatingDelegatingManagedConnectionFactory.java:84)
at org.apache.aries.transaction.jdbc.internal.ValidatingGenericConnectionManager$ValidatingTask.run(ValidatingGenericConnectionManager.java:155)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
Reproduction code
- Establish approx 100 connections to DB
- Failover multisubnet HA installation.
- During failover, keep invoking isValid on all connections
Metadata
Metadata
Assignees
Type
Projects
Status