Client IOException escapes to user #9696

Closed
sancar opened this issue Jan 19, 2017 · 10 comments

Comments

Member

@sancar sancar commented Jan 19, 2017

Background on the issue, for anyone who comes across this exception:
IOException is a retryable exception for HazelcastClient. It should only be propagated to the user once hazelcast.client.invocation.timeout.seconds has elapsed. This value is 2 minutes by default.
From the reports, we don't know whether the timeout had actually elapsed. We may need to add a log message that tells us whether the invocation timeout was exceeded. The issue has not been reproduced locally so far.
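
For readers unfamiliar with the intended behavior, here is a minimal, self-contained sketch of "retry on IOException until a timeout has elapsed". It is not the Hazelcast client code: the class `RetryUntilTimeout`, the method `invokeWithRetry`, and the pause between attempts are all hypothetical names and choices made for illustration. It also puts the elapsed time into the final exception message, which is one possible way to tell from a report whether the timeout was really exceeded.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

public class RetryUntilTimeout {

    /**
     * Hypothetical helper: retries the call while it fails with IOException.
     * The IOException reaches the caller only after timeoutMillis has elapsed.
     */
    public static <T> T invokeWithRetry(Callable<T> call, long timeoutMillis, long retryPauseMillis)
            throws Exception {
        long start = System.nanoTime();
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                return call.call();
            } catch (IOException e) {
                long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                if (elapsedMillis >= timeoutMillis) {
                    // Record how long we actually retried, so a report shows
                    // whether the timeout was exceeded.
                    throw new IOException("Gave up after " + attempts + " attempts and "
                            + elapsedMillis + " ms (timeout " + timeoutMillis + " ms)", e);
                }
                Thread.sleep(retryPauseMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated invocation: fails twice with IOException, then succeeds.
        String result = invokeWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("No available connection");
            }
            return "value";
        }, 2000, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With a check like this, an IOException that reaches the caller well before the timeout would point to a code path that bypasses the retry logic.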

Reports so far indicate that the issue appeared in 3.7.

Sample reported stack traces

	java.io.IOException: No available connection to address [REDACTED]:5702
	com.hazelcast.core.HazelcastException: java.io.IOException: No available connection to address [REDACTED]:5702
	at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:73) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:63) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:52) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.util.ExceptionUtil.rethrow(ExceptionUtil.java:83) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.client.spi.ClientProxy.invokeOnPartition(ClientProxy.java:155) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.client.spi.ClientProxy.invoke(ClientProxy.java:147) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.client.proxy.ClientMapProxy.getInternal(ClientMapProxy.java:245) ~[hazelcast-all-3.7.jar!/:3.7]
	at com.hazelcast.client.proxy.ClientMapProxy.get(ClientMapProxy.java:240) ~[hazelcast-all-3.7.jar!/:3.7]

and

		com.hazelcast.core.HazelcastException: java.io.IOException: Not able to setup owner connection!
	    at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:73)
	    at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:63)
	    at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:52)
	    at com.hazelcast.util.ExceptionUtil.rethrow(ExceptionUtil.java:83)
	    at com.hazelcast.client.spi.ClientProxy.invokeOnPartition(ClientProxy.java:155)
	    at com.hazelcast.client.spi.ClientProxy.invoke(ClientProxy.java:147)
	    at com.hazelcast.client.proxy.ClientMapProxy.setInternal(ClientMapProxy.java:557)
	    at com.hazelcast.client.proxy.ClientMapProxy.set(ClientMapProxy.java:550)
	    at com.hazelcast.client.proxy.ClientMapProxy.set(ClientMapProxy.java:1364)

For details, see the following links.
A complaint from the Google group:
https://groups.google.com/d/msg/hazelcast/ALfDuwXIgD8/e2ZILDe4EgAJ

Two related issues found by @Danny-Hazelcast, which we thought were fixed, but it seems the issue remains:
#8859
#8919

@sancar sancar added this to the 3.7.6 milestone Jan 19, 2017
@sancar sancar self-assigned this Jan 27, 2017
Member Author

@sancar sancar commented Feb 9, 2017

Closing the issue, since the related shutdown builds show that the issue has gone away.
I ran similar tests locally and also ran with hzCmd-bench; the issue was not reproduced.

@sancar sancar closed this Feb 9, 2017
@sancar sancar reopened this Feb 10, 2017
Member

@Danny-Hazelcast Danny-Hazelcast commented Feb 10, 2017

com.hazelcast.core.HazelcastException: java.io.IOException: No available connection to address [10.0.0.22]:5701
	at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:73)
	at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:63)
	at com.hazelcast.util.ExceptionUtil.peel(ExceptionUtil.java:52)
	at com.hazelcast.util.ExceptionUtil.rethrow(ExceptionUtil.java:83)
	at com.hazelcast.client.spi.ClientProxy.invokeOnPartition(ClientProxy.java:156)
	at com.hazelcast.client.proxy.PartitionSpecificClientProxy.invokeOnPartition(PartitionSpecificClientProxy.java:47)
	at com.hazelcast.client.proxy.ClientLockProxy.lock(ClientLockProxy.java:93)
	at hz.lock.LeaseLock.timeStep(LeaseLock.java:15)
	at remote.bench.marker.MetricsMarker.flatOut(MetricsMarker.java:52)
	at remote.bench.marker.MetricsMarker.bench(MetricsMarker.java:39)
	at remote.bench.BenchThread.call(BenchThread.java:38)
	at remote.bench.BenchThread.call(BenchThread.java:12)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: No available connection to address [10.0.0.22]:5701
	at com.hazelcast.client.spi.impl.ClientSmartInvocationServiceImpl.getOrTriggerConnect(ClientSmartInvocationServiceImpl.java:88)
	at com.hazelcast.client.spi.impl.ClientSmartInvocationServiceImpl.invokeOnPartitionOwner(ClientSmartInvocationServiceImpl.java:47)
	at com.hazelcast.client.spi.impl.ClientInvocation.invokeOnSelection(ClientInvocation.java:133)
	at com.hazelcast.client.spi.impl.ClientInvocation.invoke(ClientInvocation.java:114)
	at com.hazelcast.client.spi.impl.ClientInvocation.run(ClientInvocation.java:144)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at com.hazelcast.util.executor.LoggingScheduledExecutor$LoggingDelegatingFuture.run(LoggingScheduledExecutor.java:128)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
	at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
	at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92)
	at ------ submitted from ------.(Unknown Source)
	at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveAndThrowIfException(ClientInvocationFuture.java:74)
	at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveAndThrowIfException(ClientInvocationFuture.java:31)
	at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:155)
	at com.hazelcast.client.spi.ClientProxy.invokeOnPartition(ClientProxy.java:154)
	... 13 more
@sancar sancar removed their assignment Feb 10, 2017
@enesakar enesakar modified the milestones: 3.8.1, 3.7.6 Feb 13, 2017
Member

@Danny-Hazelcast Danny-Hazelcast commented Feb 13, 2017

Same failure again: https://hazelcast-l337.ci.cloudbees.com/view/shutdown/job/shutdown-lease-lock/175/console

It always seems to be the lease lock test that hits the exception. @mmedenjak was saying something about lease lock operations not being retried in case of partition "loss". Could this be related to this IOException escaping to users?

Contributor

@mmedenjak mmedenjak commented Feb 13, 2017

@Danny-Hazelcast @sancar
The issue that we have detected is that the lock can remain locked after the lease expires.

This is because the unlock operation isn't retried if it fails for some reason (e.g. the partition is being migrated because members are joining or leaving).

I am not sure if this is related to this issue.


@jl2008 jl2008 commented Mar 8, 2017

It happened again for us, with no particular trigger, not even a long GC. The client has smart routing enabled; I'll disable it and see how it goes.

Member

@Danny-Hazelcast Danny-Hazelcast commented Mar 21, 2017

Are java.io.IOException instances now wrapped in com.hazelcast.core.HazelcastException (com.hazelcast.core.HazelcastException: java.io.IOException:)?

Member Author

@sancar sancar commented Mar 21, 2017

It has always been wrapped in HazelcastException.
Since IOException is not a RuntimeException, we cannot throw it without wrapping it anyway.
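
To illustrate the wrapping, here is a small, self-contained sketch. `ClientException` is a hypothetical stand-in for an unchecked wrapper such as HazelcastException, and `get`, `fetch`, and `findIOException` are made-up names; none of this is the actual client code. It shows why a method without a `throws` clause has to wrap a checked IOException, and how a caller can still reach the original exception through the cause chain.

```java
import java.io.IOException;

public class WrappedIOExceptionDemo {

    /** Hypothetical unchecked wrapper, standing in for something like HazelcastException. */
    static class ClientException extends RuntimeException {
        ClientException(Throwable cause) {
            // RuntimeException(Throwable) uses cause.toString() as the message.
            super(cause);
        }
    }

    /** Simulated low-level call that fails with a checked exception. */
    static String fetch(String key) throws IOException {
        throw new IOException("No available connection to address [10.0.0.22]:5701");
    }

    /** Has no 'throws' clause, so the checked IOException must be wrapped. */
    public static String get(String key) {
        try {
            return fetch(key);
        } catch (IOException e) {
            throw new ClientException(e);
        }
    }

    /** Walks the cause chain and returns the first IOException, or null if there is none. */
    public static IOException findIOException(Throwable t) {
        while (t != null) {
            if (t instanceof IOException) {
                return (IOException) t;
            }
            t = t.getCause();
        }
        return null;
    }

    public static void main(String[] args) {
        try {
            get("some-key");
        } catch (ClientException e) {
            IOException io = findIOException(e);
            System.out.println("wrapper message: " + e.getMessage());
            System.out.println("root cause: " + io.getMessage());
        }
    }
}
```

Because the wrapper takes its message from the cause, the wrapper's message starts with `java.io.IOException:`, which is the same shape as the first line of the stack traces reported above.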

Member Author

@sancar sancar commented Jun 6, 2017

I made improvements to the invocation system in the places I suspected. We now suppress the IOException and throw OperationTimeoutException when client.invocation.timeout.seconds expires, so we can differentiate a timeout from an unexpected IOException. Closing for now. If we see an IOException again, we will look into it, and this time we will be sure that it is unexpected.
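
As a rough sketch of the distinction described above (not the actual change): once the timeout expires, the retry loop throws a dedicated timeout exception instead of the last IOException, so the caller can tell the two cases apart. `InvocationTimeoutException` is a hypothetical stand-in for OperationTimeoutException, and `invoke`, `classify`, and the 5 ms pause are assumptions made for illustration.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class TimeoutVsIOException {

    /** Hypothetical stand-in for a dedicated timeout exception. */
    static class InvocationTimeoutException extends RuntimeException {
        InvocationTimeoutException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Retries on IOException; reports a timeout rather than the IOException itself. */
    public static <T> T invoke(Callable<T> call, long timeoutMillis) throws Exception {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try {
                return call.call();
            } catch (IOException e) {
                if (System.nanoTime() - deadline >= 0) {
                    // Keep the last IOException as the cause for diagnostics.
                    throw new InvocationTimeoutException(
                            "Invocation did not succeed within " + timeoutMillis + " ms", e);
                }
                Thread.sleep(5);
            }
        }
    }

    /** Caller-side view: a timeout is now distinguishable from any other failure. */
    public static String classify(Callable<String> call, long timeoutMillis) {
        try {
            return "ok: " + invoke(call, timeoutMillis);
        } catch (InvocationTimeoutException e) {
            return "timeout";
        } catch (Exception e) {
            return "unexpected: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(() -> "value", 50));
        System.out.println(classify(() -> { throw new IOException("down"); }, 30));
    }
}
```

In this sketch an IOException can no longer leave `invoke` through the retry path, so any IOException a caller still sees must come from somewhere else.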

5 participants