Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql.mysql.JdbcMysqlJ8IntegrationTests: pooledConnectionTest failed #1695

Closed
flaky-bot bot opened this issue Nov 14, 2023 · 4 comments · Fixed by #1696 or #1697
Closed

sql.mysql.JdbcMysqlJ8IntegrationTests: pooledConnectionTest failed #1695

flaky-bot bot opened this issue Nov 14, 2023 · 4 comments · Fixed by #1696 or #1697
Assignees
Labels
flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@flaky-bot
Copy link

flaky-bot bot commented Nov 14, 2023

Note: #1535 was also for this test, but it was closed more than 10 days ago. So, I didn't mark it flaky.


commit: 0440272
buildURL: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/actions/runs/6868024506
status: failed

@flaky-bot flaky-bot bot added flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Nov 14, 2023
@flaky-bot flaky-bot bot added the flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. label Nov 14, 2023
Copy link
Author

flaky-bot bot commented Nov 14, 2023

Looks like this issue is flaky. 😟

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (0440272), this test passed in one build (https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/actions/runs/6868024506) and failed in another build (https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/actions/runs/6868024506).

@enocom
Copy link
Member

enocom commented Nov 14, 2023

com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Could not create connection to database server.
	at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:596)
	at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:582)
	at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
	at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
	at com.google.cloud.sql.mysql.JdbcMysqlJ8IntegrationTests.setUpPool(JdbcMysqlJ8IntegrationTests.java:78)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.sql.SQLNonTransientConnectionException: Could not create connection to database server.
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
	at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:1002)
	at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:819)
	at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:449)
	at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:242)
	at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:198)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:121)
	at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364)
	at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
	at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476)
	at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561)
	... 16 more
Caused by: java.lang.RuntimeException: Unable to get valid instance data within 30000 ms. Last refresh attempt failed:java.lang.RuntimeException: [***] Failed to create ephemeral certificate for the Cloud SQL instance.
	at com.google.cloud.sql.core.Refresher.getConnectionInfo(Refresher.java:114)
	at com.google.cloud.sql.core.DefaultConnectionInfoCache.getConnectionInfo(DefaultConnectionInfoCache.java:92)
	at com.google.cloud.sql.core.DefaultConnectionInfoCache.createSslSocket(DefaultConnectionInfoCache.java:101)
	at com.google.cloud.sql.core.Connector.connect(Connector.java:115)
	at com.google.cloud.sql.core.InternalConnectorRegistry.connect(InternalConnectorRegistry.java:179)
	at com.google.cloud.sql.mysql.SocketFactory.connect(SocketFactory.java:63)
	at com.google.cloud.sql.mysql.SocketFactory.connect(SocketFactory.java:45)
	at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:63)
	at com.mysql.cj.NativeSession.connect(NativeSession.java:120)
	at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:949)
	... 25 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: [***] Failed to create ephemeral certificate for the Cloud SQL instance.
	at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:592)
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:551)
	at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:111)
	at com.google.cloud.sql.core.Refresher.handleRefreshResult(Refresher.java:196)
	at com.google.cloud.sql.core.Refresher.lambda$startRefreshAttempt$1(Refresher.java:188)
	at com.google.common.util.concurrent.CombinedFuture$AsyncCallableInterruptibleTask.runInterruptibly(CombinedFuture.java:165)
	at com.google.common.util.concurrent.CombinedFuture$AsyncCallableInterruptibleTask.runInterruptibly(CombinedFuture.java:153)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	... 1 more
Caused by: java.lang.RuntimeException: [***] Failed to create ephemeral certificate for the Cloud SQL instance.
	at com.google.cloud.sql.core.DefaultConnectionInfoRepository.addExceptionContext(DefaultConnectionInfoRepository.java:384)
	at com.google.cloud.sql.core.DefaultConnectionInfoRepository.fetchEphemeralCertificate(DefaultConnectionInfoRepository.java:281)
	at com.google.cloud.sql.core.DefaultConnectionInfoRepository.lambda$getConnectionInfo$1(DefaultConnectionInfoRepository.java:113)
	at com.google.common.util.concurrent.CombinedFuture$CallableInterruptibleTask.runInterruptibly(CombinedFuture.java:196)
	... 7 more
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 502 Bad Gateway
POST https://sqladmin.googleapis.com/sql/v1beta4/projects/***/instances/***:generateEphemeralCert
<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 502 (Server Error)!!1</title>
  <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}
  </style>
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>502.</b> <ins>That’s an error.</ins>
  <p>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.  <ins>That’s all we know.</ins>

	at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:466)
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:552)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:493)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:603)
	at com.google.cloud.sql.core.DefaultConnectionInfoRepository.fetchEphemeralCertificate(DefaultConnectionInfoRepository.java:279)
	... 9 more

@enocom
Copy link
Member

enocom commented Nov 14, 2023

Looks like a 502 brought the refresh algorithm down.

@hessjcg
Copy link
Collaborator

hessjcg commented Nov 14, 2023

I think we can fix this by increasing the default timeout to connnect from 30 seconds to 45 seconds.

There are two timeouts that interact with each other:

  • Connect timeout: how long the socket factory will wait to establish a socket connection before throwing an error.
  • Refresh retry delay: how long the refresher will wait after a failed refresh attempt before trying again.

At the moment both are set to 30 seconds. Which means that the test will fail if the first refresh attempt fails. This is a timeline of what happens:

  • 00:00 The test starts and starts to connect to the database. Connect blocks until refresh succeeds or the connect request times out at 00:30
  • 00:01 The first attempt to refresh certificates begins.
  • 00:03 The first attempt fails, scheduling retry at 00:33.
  • 00:30 The test's connect times out, throwing the exception.
  • 00:33 Second attempt to refresh certificate begins.

If we change the default connect timeout to 45 seconds, then the tests will be able to try a second refresh attempt without failing. This is the timeline of events if the connect timeout is set to 45 seconds:

  • 00:00 The test starts and starts to connect to the database. Connect blocks until refresh succeeds or the connect request times out at 00:45
  • 00:01 The first attempt to refresh certificates begins.
  • 00:03 The first attempt fails, scheduling retry at 00:33.
  • 00:33 Second refresh attempt to refresh certificate begins.
  • 00:36 The second refresh attempt succeeds.
  • 00:36 The test successfully connects to the database. The test succeeds.

Increasing the connect timeout will not significantly impact the number of refresh requests sent to the server.

hessjcg added a commit that referenced this issue Nov 15, 2023
…#1696)

Set the default connect timeout to 45 seconds. This allows the Refresher to retry at least once after
a failed refresh attempt. (Refresher delay between failed attempts is 30 sec.)

Rename and restructure connect timeout variables to make the code easier to understand.

Fixes #1695.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
2 participants