Closed as not planned
Description
The integration test ITHighAvailabilityTest failed in this build:
[ERROR] Failures:
[ERROR] ITHighAvailabilityTest.testCoordinatorCluster:207 » ISE Max number of retries[...
Details:
2022-06-14T18:33:50,356 INFO [main] org.apache.druid.testing.utils.DruidClusterAdminClient - 307 Temporary Redirect
2022-06-14T18:33:50,356 INFO [main] org.apache.druid.testing.utils.ITRetryUtil - Trying attempt[0/240]...
2022-06-14T18:33:50,358 WARN [HttpClient-Netty-Worker-14] org.apache.druid.java.util.http.client.pool.ResourcePool - Resource at key[http://127.0.0.1:8590] was returned multiple times?
2022-06-14T18:33:50,358 ERROR [main] org.apache.druid.testing.utils.DruidClusterAdminClient - Error while waiting for [http://127.0.0.1:8590] to be ready
java.util.concurrent.ExecutionException: java.io.IOException: Connection reset by peer
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
at org.apache.druid.testing.utils.DruidClusterAdminClient.lambda$waitUntilInstanceReady$1(DruidClusterAdminClient.java:268) ~[druid-integration-tests-0.24.0-SNAPSHOT.jar:0.24.0-SNAPSHOT]
at org.apache.druid.testing.utils.ITRetryUtil.retryUntil(ITRetryUtil.java:61) ~[druid-integration-tests-0.24.0-SNAPSHOT.jar:0.24.0-SNAPSHOT]
at org.apache.druid.testing.utils.ITRetryUtil.retryUntilTrue(ITRetryUtil.java:39) ~[druid-integration-tests-0.24.0-SNAPSHOT.jar:0.24.0-SNAPSHOT]
at org.apache.druid.testing.utils.DruidClusterAdminClient.waitUntilInstanceReady(DruidClusterAdminClient.java:262) ~[druid-integration-tests-0.24.0-SNAPSHOT.jar:0.24.0-SNAPSHOT]
at org.apache.druid.testing.utils.DruidClusterAdminClient.waitUntilOverlordTwoReady(DruidClusterAdminClient.java:140) ~[druid-integration-tests-0.24.0-SNAPSHOT.jar:0.24.0-SNAPSHOT]
at org.apache.druid.tests.leadership.ITHighAvailabilityTest.lambda$swapLeadersAndWait$7(ITHighAvailabilityTest.java:263) ~[test-classes/:?]
at org.apache.druid.tests.leadership.ITHighAvailabilityTest.swapLeadersAndWait(ITHighAvailabilityTest.java:266) ~[test-classes/:?]
at org.apache.druid.tests.leadership.ITHighAvailabilityTest.testLeadershipChanges(ITHighAvailabilityTest.java:125) ~[test-classes/:?]
This PR did change this test class, but only in a different test method. Some things to note:
- This test passed on a previous run for this PR. The change that triggered the re-run was trivial: a change in a documentation file.
- The log shows the failure at attempt 0 of 240, yet the reported error is that the maximum number of retries was exceeded. The retry mechanism (which is generally over-aggressive) apparently did not kick in this time.
- The log shows a 307 Temporary Redirect immediately before the failure. Perhaps the tests don't handle the transient case in which a redirect occurs?
- Perhaps unrelated, but the log also contains the warning "Resource at key[http://127.0.0.1:8590] was returned multiple times?"
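
To make the expectation behind the notes above concrete, here is a minimal sketch of a readiness loop that treats both a 307 response and a connection reset as "not ready yet" rather than as a failure. This is not the actual `ITRetryUtil` or `DruidClusterAdminClient` code; the class `RetrySketch` and the methods `retryUntilReady` and `isReady` are hypothetical names used only for illustration, and the probe is assumed to return an HTTP status code.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch; not the Druid integration-test implementation.
public class RetrySketch {

    // Assumption: only a 200 means the instance is ready. A 307 (for example,
    // a redirect to the current leader) is treated as "not ready yet".
    public static boolean isReady(int statusCode) {
        return statusCode == 200;
    }

    // Calls the probe until it reports ready, and returns the zero-based
    // attempt number on which it succeeded.
    public static int retryUntilReady(Callable<Integer> probe, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                if (isReady(probe.call())) {
                    return attempt;
                }
            } catch (IOException e) {
                // Transient I/O failure (such as "Connection reset by peer"
                // while a node restarts): fall through and try again.
            } catch (Exception e) {
                // Anything else is treated as non-retryable in this sketch.
                throw new IllegalStateException("Non-retryable failure on attempt " + attempt, e);
            }
            Thread.sleep(delayMillis);
        }
        throw new IllegalStateException("Max number of retries[" + maxAttempts + "] exceeded");
    }
}
```

With a loop of this shape, a single connection reset on attempt 0 would consume one attempt and the loop would continue, so a "max retries" failure at attempt 0 suggests the exception escaped the retry loop instead of being counted by it.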
This particular test has been redone in the "new IT" PR, but we're stuck with the old version in the present PR.