ehcache3 Client side code Ignoring hash invalidation requests when not fully constructed #2437

Closed
rkavanap opened this issue Jul 17, 2018 · 5 comments

Comments

@rkavanap
Contributor

rkavanap commented Jul 17, 2018

The following messages are still occasionally seen on ehcache3 clients:

[Clients][client0] [WorkerThread(multi_request_ack_stage, 0)] WARN org.ehcache.clustered.client.internal.store.SimpleClusterTierClientEntity - Ignoring the response org.ehcache.clustered.common.internal.messages.EhcacheEntityResponse$ClientInvalidateHash@1a39c335 as no registered response listener could be found.

This looks like a race condition: after this message, the client typically times out awaiting the hash invalidation message.

@rkavanap
Contributor Author

Note that the above was seen in a version that was 1 week old. Testing is in progress on the latest version to see if it still exists.

@rkavanap
Contributor Author

rkavanap commented Jul 17, 2018

Some more analysis data:

[Clients][client1] org.ehcache.spi.resilience.StoreAccessException: java.util.concurrent.TimeoutException
[Clients][client1] 	at org.ehcache.core.exceptions.StorePassThroughException.handleException(StorePassThroughException.java:78)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.ClusteredStore.putIfAbsent(ClusteredStore.java:268)
[Clients][client1] 	at org.ehcache.core.Ehcache.doPutIfAbsent(Ehcache.java:133)
[Clients][client1] 	at org.ehcache.core.EhcacheBase.putIfAbsent(EhcacheBase.java:296)
[Clients][client1] 	at com.terracottatech.ehcache.clustered.frs.BasicCacheOpsMultiClientIT.runTest(BasicCacheOpsMultiClientIT.java:83)
[Clients][client1] 	at org.terracotta.testing.client.TestClientStub.main(TestClientStub.java:101)
[Clients][client1] Caused by: java.util.concurrent.TimeoutException
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.StrongServerStoreProxy.awaitOnLatch(StrongServerStoreProxy.java:175)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.StrongServerStoreProxy.performWaitingForHashInvalidation(StrongServerStoreProxy.java:114)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.StrongServerStoreProxy.getAndAppend(StrongServerStoreProxy.java:214)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.ReconnectingServerStoreProxy.lambda$getAndAppend$2(ReconnectingServerStoreProxy.java:69)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.ReconnectingServerStoreProxy.onStoreProxy(ReconnectingServerStoreProxy.java:99)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.ReconnectingServerStoreProxy.getAndAppend(ReconnectingServerStoreProxy.java:69)
[Clients][client1] 	at org.ehcache.clustered.client.internal.store.ClusteredStore.putIfAbsent(ClusteredStore.java:251)
[Clients][client1] 	... 4 more

@rkavanap
Contributor Author

From the comments above, it looks like the following is happening:

Client 1 completed its lifecycle, created its cache, and did a putIfAbsent.

In the meantime, client 2 started its lifecycle, causing the platform to think that another client is connected to this entity. Client 2 is not yet ready to accept messages (its response listeners are not set up), so it ignores the ClientInvalidateHash message entirely.

Client 1 times out waiting for the ClientInvalidationDone message.

@AbfrmBlr and I chatted a bit about this, and we both feel this is where the window exists.
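
The suspected window can be reproduced in isolation with a small sketch. This is not the Ehcache code: the `Client` class, the listener map, the `invalidate` helper, and the 200 ms timeout are all hypothetical stand-ins, assuming only that a message arriving before its listener is registered gets dropped and that the sender waits on a latch for the acknowledgement.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical model of the race; none of these names come from Ehcache.
class LateListenerRace {

  /** Minimal client: hands a message to the listener registered for its type, if any. */
  static final class Client {
    private final Map<String, Consumer<String>> listeners = new ConcurrentHashMap<>();

    void addListener(String type, Consumer<String> listener) {
      listeners.put(type, listener);
    }

    /** Returns false when the message is dropped because no listener is registered. */
    boolean deliver(String type, String payload) {
      Consumer<String> listener = listeners.get(type);
      if (listener == null) {
        System.out.println("Ignoring the response " + type + " as no registered response listener could be found.");
        return false;
      }
      listener.accept(payload);
      return true;
    }
  }

  /** Returns true if client 2 acknowledged the invalidation before client 1's wait expired. */
  static boolean invalidate(boolean listenerRegisteredFirst) {
    CountDownLatch acked = new CountDownLatch(1);              // client 1 waits on this latch
    Client client2 = new Client();
    Consumer<String> onInvalidate = hash -> acked.countDown(); // stands in for the acknowledgement

    if (listenerRegisteredFirst) {
      client2.addListener("ClientInvalidateHash", onInvalidate);
    }
    // Client 2 already counts as connected, so the invalidation is sent to it.
    client2.deliver("ClientInvalidateHash", "some-hash");
    if (!listenerRegisteredFirst) {
      client2.addListener("ClientInvalidateHash", onInvalidate); // too late: the message is gone
    }
    try {
      return acked.await(200, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("listener registered first: acked = " + invalidate(true));
    System.out.println("listener registered late:  acked = " + invalidate(false));
  }
}
```

When the listener is registered before delivery, the latch is released and the wait succeeds. When it is registered afterwards, the message is dropped and the wait expires, which mirrors the timeout seen on client 1.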

@rkavanap rkavanap changed the title ehcache3 Client side code Ignoring responses ehcache3 Client side code Ignoring hash invalidation requests when not fully constructed Jul 17, 2018
@rkavanap
Contributor Author

Just confirmed that this happens with the latest version as well.

rkavanap pushed 9 commits to rkavanap/ehcache3 that referenced this issue between Jul 18 and Jul 24, 2018
henri-tremblay added a commit that referenced this issue Jul 25, 2018
Fix #2437 : Wait for the client entity to complete init while processing server messages
@rkavanap rkavanap reopened this Aug 13, 2018
@rkavanap
Contributor Author

This issue was not properly addressed. It was later realized that the core may reorder a fetchEntity response relative to the clientInvalidateHash message. It was also realized that responses are delivered on a single-threaded SEDA stage, which means waiting inside a response listener is not correct.
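
To illustrate why a wait inside a listener is a problem on a single-threaded stage, here is a minimal sketch. It models the stage as a plain `ExecutorService`, which is an assumption made for illustration; the class name, the `listenerWaitsForInit` helper, and the 200 ms timeout are hypothetical and do not reflect the actual Ehcache or Terracotta code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model: the "stage" is an executor, and each submitted task is one message.
class SingleThreadedStageWait {

  /** Returns true if the blocking listener saw initialization complete before its wait expired. */
  static boolean listenerWaitsForInit(int stageThreads) {
    ExecutorService stage = Executors.newFixedThreadPool(stageThreads);
    CountDownLatch initDone = new CountDownLatch(1);
    try {
      // First message: its listener blocks until initialization is signalled.
      Callable<Boolean> blockingListener = () -> initDone.await(200, TimeUnit.MILLISECONDS);
      // Second message: the one whose handling signals that initialization is complete.
      Runnable completesInit = initDone::countDown;

      Future<Boolean> invalidation = stage.submit(blockingListener);
      stage.submit(completesInit); // with one thread, this is queued behind the blocked listener
      return invalidation.get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      stage.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("1 stage thread:  listener saw init = " + listenerWaitsForInit(1));
    System.out.println("2 stage threads: listener saw init = " + listenerWaitsForInit(2));
  }
}
```

With a single stage thread, the message that would release the wait sits in the queue behind the blocked listener, so the wait can only end by timing out. With two threads the second message runs concurrently and the wait completes, which is why the approach only looks correct if the stage is assumed to be multi-threaded.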

rkavanap pushed a commit to rkavanap/ehcache3 that referenced this issue Aug 22, 2018
AbfrmBlr referenced this issue Aug 25, 2018
Fix#2437 :- Proper fix for clientinvalidation race
@henri-tremblay henri-tremblay added this to the 3.6.1 milestone Sep 25, 2018
3 participants