Skip to content

Commit

Permalink
Make ClientInvocation_ExceptionTest use of separate client and member…
Browse files Browse the repository at this point in the history
… instances (#19197)

This test failed on a couple of builds. The reason for failures was due to the fact
that we were using one static member for all tests cases and that member
was terminating after a test, causing the following tests to fail. Failing tests
were trying to connect the shutdown member, timeouting along the way while
waiting for the future returned by `executorService.submit` to resolve.

Now, that explains how the tests are failed but didn't explain why the
member was shutting down abruptly in one of the tests cases.

The reasoning is as follows.

The test case that causes member to shutdown is
```
new Object[]{JOIN_INTERNAL, new OutOfMemoryError("message"), OutOfMemoryError.class, IsNull.nullValue(Throwable.class)}
```
So, our executable throws OOME in its `call` method and we were using `.joinInternal()` to
wait on the future returned by `executorService.submit`. The thing about this test case
is that OOME is thrown directly on `.joinInternal()`, without being wrapped by
some other exception.

That triggers `OutOfMemoryErrorDispatcher#onOutOfMemory` and we check whether or not
we should react to it. Now, generally what happens is that
https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/java/com/hazelcast/instance/impl/DefaultOutOfMemoryHandler.java#L95
returns false, and we finish the test happily. In the failing builds, this probably
returns true, due to high memory usage in the build environment. When this
returns true, what we do is, we close the connections and terminate the member.

If we take a look at the logs, that is in fact what we do.

```
// Close the connections https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/java/com/hazelcast/instance/impl/DefaultOutOfMemoryHandler.java#L83
07:11:11,202  WARN |test[2 - java.lang.OutOfMemoryError: message]| - [TestClientRegistry$MockedServerConnection] test - Server connection closed: null
07:11:11,202  INFO |test[2 - java.lang.OutOfMemoryError: message]| - [MockServer] test - [127.0.0.1]:5701 [dev] [5.0-SNAPSHOT] Removed connection to endpoint: [127.0.0.1]:40001, connection: MockedNodeConnection{ remoteAddress = [127.0.0.1]:40001, localAddress = [127.0.0.1]:5701, connectionId = 1}

// Terminate the node forcefully https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/java/com/hazelcast/instance/impl/DefaultOutOfMemoryHandler.java#L84
07:11:11,203  WARN |test[2 - java.lang.OutOfMemoryError: message]| - [Node] test - [127.0.0.1]:5701 [dev] [5.0-SNAPSHOT] Terminating forcefully...
07:11:11,203  INFO |test[2 - java.lang.OutOfMemoryError: message]| - [Node] test - [127.0.0.1]:5701 [dev] [5.0-SNAPSHOT] Shutting down connection manager...
07:11:11,203  INFO |test[2 - java.lang.OutOfMemoryError: message]| - [Node] test - [127.0.0.1]:5701 [dev] [5.0-SNAPSHOT] Shutting down node engine...

// Client side now closes the its local connection to server as the server is shutdown
07:11:11,203  WARN || - [TcpClientConnection] pool-11594-thread-1 - hz.client_3587 [dev] [5.0-SNAPSHOT] MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=false, connectionId=1, channel=null, remoteAddress=[127.0.0.1]:5701, lastReadTime=2021-07-14 07:11:11.002, lastWriteTime=2021-07-14 07:11:11.002, closedTime=2021-07-14 07:11:11.203, connected server version=5.0-SNAPSHOT}} closed. Reason: com.hazelcast.spi.exception.TargetDisconnectedException[Mocked Remote socket closed]
com.hazelcast.spi.exception.TargetDisconnectedException: Mocked Remote socket closed
	at com.hazelcast.client.test.TestClientRegistry$MockedTcpClientConnection$4.run(TestClientRegistry.java:315) [test-classes/:?]
	at com.hazelcast.client.test.TwoWayBlockableExecutor$BlockableRunnable.run(TwoWayBlockableExecutor.java:98) [test-classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_291]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_291]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_291]
07:11:11,203  WARN || - [TwoWayBlockableExecutor] pool-11594-thread-1 - Dropping outgoing runnable since other end closed. Client Closed EOF. MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=false, connectionId=1, channel=null, remoteAddress=[127.0.0.1]:5701, lastReadTime=2021-07-14 07:11:11.002, lastWriteTime=2021-07-14 07:11:11.002, closedTime=2021-07-14 07:11:11.203, connected server version=5.0-SNAPSHOT}}
07:11:11,203  INFO || - [ClientConnectionManager] pool-11594-thread-1 - hz.client_3587 [dev] [5.0-SNAPSHOT] Removed connection to endpoint: [127.0.0.1]:5701:d1625e8c-dfeb-437e-98e8-c4d417dc8b6c, connection: MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=false, connectionId=1, channel=null, remoteAddress=[127.0.0.1]:5701, lastReadTime=2021-07-14 07:11:11.002, lastWriteTime=2021-07-14 07:11:11.002, closedTime=2021-07-14 07:11:11.203, connected server version=5.0-SNAPSHOT}}
07:11:11,203  INFO || - [LifecycleService] pool-11594-thread-1 - hz.client_3587 [dev] [5.0-SNAPSHOT] HazelcastClient 5.0-SNAPSHOT (20210714 - d372406) is CLIENT_DISCONNECTED
07:11:11,204  INFO || - [ClientConnectionManager] hz.client_3587.internal-2 - hz.client_3587 [dev] [5.0-SNAPSHOT] Trying to connect to cluster: dev
07:11:11,204  INFO || - [ClientConnectionManager] hz.client_3587.internal-2 - hz.client_3587 [dev] [5.0-SNAPSHOT] Trying to connect to Member [127.0.0.1]:5701 - d1625e8c-dfeb-437e-98e8-c4d417dc8b6c
07:11:11,204  WARN || - [ClientConnectionManager] hz.client_3587.internal-2 - hz.client_3587 [dev] [5.0-SNAPSHOT] Exception during initial connection to Member [127.0.0.1]:5701 - d1625e8c-dfeb-437e-98e8-c4d417dc8b6c: com.hazelcast.core.HazelcastException: java.io.IOException: Can not connected to [127.0.0.1]:5701: instance does not exist
07:11:11,204  WARN || - [ClientConnectionManager] hz.client_3587.internal-2 - hz.client_3587 [dev] [5.0-SNAPSHOT] Unable to get live cluster connection, retry in 1000 ms, attempt: 1, cluster connect timeout: INFINITE, max backoff: 30000 ms

// Termination process finished
07:11:11,206  INFO |test[2 - java.lang.OutOfMemoryError: message]| - [NodeExtension] test - [127.0.0.1]:5701 [dev] [5.0-SNAPSHOT] Destroying node NodeExtension.
07:11:11,207  INFO |test[2 - java.lang.OutOfMemoryError: message]| - [Node] test - [127.0.0.1]:5701 [dev] [5.0-SNAPSHOT] Hazelcast Shutdown is completed in 4 ms.

// Print the stacktrace https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/java/com/hazelcast/instance/impl/DefaultOutOfMemoryHandler.java#L88
java.lang.OutOfMemoryError: message
	at com.hazelcast.client.impl.spi.impl.ClientInvocation_ExceptionTest.parameters(ClientInvocation_ExceptionTest.java:83)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.runners.Parameterized$RunnersFactory.allParameters(Parameterized.java:424)
	at org.junit.runners.Parameterized$RunnersFactory.<init>(Parameterized.java:375)
	at org.junit.runners.Parameterized$RunnersFactory.<init>(Parameterized.java:360)
	at org.junit.runners.Parameterized.<init>(Parameterized.java:303)
	at sun.reflect.GeneratedConstructorAccessor27.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104)
	at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86)
	at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
	at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:37)
	at org.junit.runner.Computer.getRunner(Computer.java:50)
	at org.junit.runner.Computer$1.runnerForClass(Computer.java:31)
	at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:70)
	at org.junit.runners.model.RunnerBuilder.runners(RunnerBuilder.java:125)
	at org.junit.runners.model.RunnerBuilder.runners(RunnerBuilder.java:111)
	at org.junit.runners.Suite.<init>(Suite.java:81)
	at org.junit.runner.Computer$2.<init>(Computer.java:33)
	at org.junit.runner.Computer.getSuite(Computer.java:28)
	at org.junit.runner.Request.classes(Request.java:77)
	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:126)
	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
	at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
```

To solve this problem, what I did is to make use of separate instances on each test case.
So, if we shutdown the member after the test case I mentioned, the others won't fail.
Note that, doing it doesn't cause a huge increase in the test duration.

In my local, the old static version was taking ~4 seconds, while the new separate
instance version takes ~8 seconds and since its run parallelly, we shouldn't even
feel it.
  • Loading branch information
mdumandag committed Jul 28, 2021
1 parent 2d27fc3 commit f2bee19
Showing 1 changed file with 8 additions and 8 deletions.
Expand Up @@ -30,8 +30,8 @@
import com.hazelcast.test.annotation.QuickTest;
import org.hamcrest.Matcher;
import org.hamcrest.core.IsNull;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.After;
import org.junit.Before;
import org.junit.Rule;
import org.junit.Test;
import org.junit.experimental.categories.Category;
Expand Down Expand Up @@ -145,17 +145,17 @@ public static Object[] parameters() {
@Rule
public ExpectedException expected = ExpectedException.none();

static TestHazelcastFactory hazelcastFactory = new TestHazelcastFactory();
static HazelcastInstance client;
private final TestHazelcastFactory hazelcastFactory = new TestHazelcastFactory();
private HazelcastInstance client;

@BeforeClass
public static void init() {
@Before
public void init() {
hazelcastFactory.newHazelcastInstance();
client = hazelcastFactory.newHazelcastClient();
}

@AfterClass
public static void tearDown() {
@After
public void tearDown() {
hazelcastFactory.terminateAll();
}

Expand Down

0 comments on commit f2bee19

Please sign in to comment.