AssertionError: nodeId out of range: -1 invokeWithArguments(MethodHandle.java:627) #16609

Closed
Danny-Hazelcast opened this issue Feb 5, 2020 · 2 comments · Fixed by #17671
Labels
Source: Internal PR or issue was opened by an employee Team: Core Type: Defect

Comments

@Danny-Hazelcast
Contributor

http://jenkins.hazelcast.com/view/freeze/job/freeze-all/13/console

Hazelcast Enterprise 4.1-SNAPSHOT (20200205 - 37bffaf, 1e23c41)

02:21:26.259 /disk1/jenkins/workspace/freeze-all/4.0/2020_02_04-11_09_34/stable/destroy/create-use-destroy Failed
02:21:26.265 fail HzClient4HZ _create-use-destroy_createUseDestroy_reliableId hzcmd.id.flake.CreateUseDestroy threadId=0 java.lang.AssertionError: nodeId out of range: -1
02:21:26.500 http://54.147.27.51/~jenkins/workspace/freeze-all/4.0/2020_02_04-11_09_34/stable/destroy/create-use-destroy
java.lang.AssertionError: nodeId out of range: -1
	at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.tryWrapInSameClass(AbstractInvocationFuture.java:1954)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.wrapError(AbstractInvocationFuture.java:1945)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.wrapOrPeel(AbstractInvocationFuture.java:1931)
	at com.hazelcast.spi.impl.AbstractInvocationFuture$ExceptionalResult.wrapForJoinInternal(AbstractInvocationFuture.java:1395)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.resolveAndThrowForJoinInternal(AbstractInvocationFuture.java:592)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.joinInternal(AbstractInvocationFuture.java:576)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy.newIdBatch(ClientFlakeIdGeneratorProxy.java:58)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy.access$000(ClientFlakeIdGeneratorProxy.java:33)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy$1.newIdBatch(ClientFlakeIdGeneratorProxy.java:45)
	at com.hazelcast.flakeidgen.impl.AutoBatcher.newId(AutoBatcher.java:58)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy.newId(ClientFlakeIdGeneratorProxy.java:52)
	at hzcmd.id.flake.CreateUseDestroy.timeStep(CreateUseDestroy.java:11)
	at remote.bench.marker.MetricsMarker.flatOut(MetricsMarker.java:53)
	at remote.bench.marker.MetricsMarker.bench(MetricsMarker.java:40)
	at remote.bench.BenchThread.call(BenchThread.java:41)
	at remote.bench.BenchThread.call(BenchThread.java:14)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
@Danny-Hazelcast Danny-Hazelcast modified the milestones: 4.0.1, 4.1 Feb 5, 2020
@mmedenjak mmedenjak added the Source: Internal PR or issue was opened by an employee label Feb 11, 2020
@Danny-Hazelcast
Contributor Author

4.1-SNAPSHOT (20200225 - 8b8c625, 4fa4827)

00:10:20.761 /disk1/jenkins/workspace/freeze-x/4.1-SNAPSHOT/2020_02_27-11_38_20/create-use-destroy Failed
00:10:20.769 fail HzClient1HZ _create-use-destroy_createUseDestroy_reliableId hzcmd.id.flake.CreateUseDestroy threadId=0 java.lang.AssertionError: nodeId out of range: -1
00:10:20.961 http://54.147.27.51/~jenkins/workspace/freeze-x/4.1-SNAPSHOT/2020_02_27-11_38_20/create-use-destroy
java.lang.AssertionError: nodeId out of range: -1
	at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.tryWrapInSameClass(AbstractInvocationFuture.java:1954)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.wrapError(AbstractInvocationFuture.java:1945)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.wrapOrPeel(AbstractInvocationFuture.java:1931)
	at com.hazelcast.spi.impl.AbstractInvocationFuture$ExceptionalResult.wrapForJoinInternal(AbstractInvocationFuture.java:1395)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.resolveAndThrowForJoinInternal(AbstractInvocationFuture.java:592)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.joinInternal(AbstractInvocationFuture.java:576)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy.newIdBatch(ClientFlakeIdGeneratorProxy.java:58)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy.access$000(ClientFlakeIdGeneratorProxy.java:33)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy$1.newIdBatch(ClientFlakeIdGeneratorProxy.java:45)
	at com.hazelcast.flakeidgen.impl.AutoBatcher.newId(AutoBatcher.java:58)
	at com.hazelcast.client.impl.proxy.ClientFlakeIdGeneratorProxy.newId(ClientFlakeIdGeneratorProxy.java:52)
	at hzcmd.id.flake.CreateUseDestroy.timeStep(CreateUseDestroy.java:11)
	at remote.bench.marker.MetricsMarker.flatOut(MetricsMarker.java:53)
	at remote.bench.marker.MetricsMarker.bench(MetricsMarker.java:40)
	at remote.bench.BenchThread.call(BenchThread.java:41)
	at remote.bench.BenchThread.call(BenchThread.java:14)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

@jerrinot jerrinot self-assigned this Sep 29, 2020
@jerrinot
Contributor

There appears to be a race in the proxy initialization.

Consider this: there is a newly created flake ID proxy, and two user threads accessing the same proxy end up in the getNode() method. At this point the nextNodeIdUpdate field is set to Long.MIN_VALUE and the nodeId field to -1 (= NODE_ID_NOT_YET_SET).

The first thread enters the if branch and sets the nextNodeIdUpdate field before setting the nodeId field.

The second thread will then NOT enter the if branch (because nextNodeIdUpdate <= nanoTime is no longer true) and will return the uninitialized nodeId, which is still -1 and therefore violates the assertion.

Possible solutions:

  1. Fix the race inside getNode() (see the sketch after this list).
  2. Initialize nodeId in the proxy initialization hook so it can never be observed as -1.
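
A minimal Java sketch of the interleaving described above. The class name, supplier, and update interval are hypothetical placeholders, not the actual Hazelcast FlakeIdGeneratorProxy source; only the write ordering matters here.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hypothetical sketch of the racy ordering described in the comment above.
// Field names mirror the description; the interval and supplier are made up.
class NodeIdCacheSketch {
    private static final int NODE_ID_NOT_YET_SET = -1;
    private static final long NODE_ID_UPDATE_INTERVAL_NS = TimeUnit.MINUTES.toNanos(2); // arbitrary

    private volatile long nextNodeIdUpdate = Long.MIN_VALUE;
    private volatile int nodeId = NODE_ID_NOT_YET_SET;

    // Racy variant: the timestamp is published BEFORE nodeId, so a second thread
    // arriving between the two writes skips the if branch and returns -1.
    int getNodeId(long nanoTime, IntSupplier clusterNodeIdLookup) {
        int localNodeId = nodeId;                                      // may still be -1
        if (nextNodeIdUpdate <= nanoTime) {
            nextNodeIdUpdate = nanoTime + NODE_ID_UPDATE_INTERVAL_NS;  // published first (the bug)
            // a concurrent caller reading here sees the new timestamp but the old nodeId
            localNodeId = clusterNodeIdLookup.getAsInt();
            nodeId = localNodeId;                                      // published too late
        }
        return localNodeId;                                            // the second thread returns -1
    }
}
```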

jerrinot added a commit that referenced this issue Oct 2, 2020
Fixes #16609

Reasoning:
The `nextNodeIdUpdate` timestamp was updated BEFORE the `nodeId` field.
This means that when two threads were initializing the proxy concurrently,
the second thread could see the timestamp already updated, skip the
IF branch, and return a stale version of nodeId which had not been updated yet.

This problem is fixed by updating the timestamp after the nodeId field.

Additionally, when the IF branch is not taken, we still have to dereference
the nodeId field; otherwise we would return a stale value that we had
dereferenced before the IF branch.
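
Continuing the hypothetical sketch above, the ordering the commit message describes would look roughly like this (a sketch of the described fix, not the merged code):

```java
// Fixed ordering (sketch): publish nodeId before the timestamp, and always
// re-read the field instead of returning a value cached before the branch.
int getNodeIdFixed(long nanoTime, IntSupplier clusterNodeIdLookup) {
    if (nextNodeIdUpdate <= nanoTime) {
        nodeId = clusterNodeIdLookup.getAsInt();                   // nodeId first
        nextNodeIdUpdate = nanoTime + NODE_ID_UPDATE_INTERVAL_NS;  // timestamp second
    }
    return nodeId;  // dereference the field, never a pre-branch local
}
```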
jerrinot added a commit that referenced this issue Oct 5, 2020
Fixes #16609

mmedenjak pushed a commit that referenced this issue Oct 5, 2020
Fixes #16609
