Inconsistent system in case of operation retry #7270

Closed
pveentjer opened this issue Jan 8, 2016 · 1 comment

@pveentjer
Member

commented Jan 8, 2016

If the system retries an operation because a member is leaving the cluster, the invocation can be retried because of the response but also because of the member-left event. In most cases this should not lead to a problem, but it can happen that the invocation is executed twice. This can leave the system permanently inconsistent.

The simplest way I have come up with to deal with this properly is to make one thread responsible for retrying requests: the InvocationMonitorThread. It scans all invocations periodically anyway. When an invocation needs retrying, any thread can set a flag on the invocation (e.g. a volatile Object retry field). The InvocationMonitorThread can check for this flag and trigger the retry.

The InvocationMonitorThread will also be in charge of modifying the fields of the Invocation/Operation and this should resolve a whole bunch of potential race problems in case of retrying.

For this to work properly, all invocations need to be registered in the InvocationRegistry so that the InvocationMonitor can detect the retry request. For the time being this is a problem, since read-only local calls skip the InvocationRegistry for performance reasons, but in 3.7 we'll get an improved InvocationRegistry where registration/deregistration is very cheap and doesn't generate any litter.
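
A rough sketch of the idea, not the actual Hazelcast code: every class, field and method name below (`RetryFlagSketch`, `requestRetry`, `scanOnce`, `retryCount`) is hypothetical, and an `AtomicBoolean` stands in for the volatile retry field mentioned above. Any thread may request a retry, but only the single monitor pass consumes the flag, so requests that arrive between two scans collapse into one retry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; these are NOT Hazelcast's real classes.
public class RetryFlagSketch {

    static final class Invocation {
        final long callId;
        // Any thread may set this flag; only the monitor thread clears it.
        final AtomicBoolean retryRequested = new AtomicBoolean();
        // Counts how many times the monitor retried this invocation.
        final AtomicInteger retryCount = new AtomicInteger();

        Invocation(long callId) {
            this.callId = callId;
        }

        // Called by response handlers and member-left listeners alike.
        void requestRetry() {
            retryRequested.set(true);
        }
    }

    static final class InvocationRegistry {
        private final Map<Long, Invocation> invocations = new ConcurrentHashMap<>();

        void register(Invocation invocation) {
            invocations.put(invocation.callId, invocation);
        }

        void deregister(Invocation invocation) {
            invocations.remove(invocation.callId);
        }

        Iterable<Invocation> all() {
            return invocations.values();
        }
    }

    // One scan of the registry. Assumed to run on the single monitor thread.
    // Returns the number of invocations retried during this scan.
    static int scanOnce(InvocationRegistry registry) {
        int retried = 0;
        for (Invocation invocation : registry.all()) {
            // getAndSet consumes the flag atomically, so several requests
            // made before this scan result in exactly one retry.
            if (invocation.retryRequested.getAndSet(false)) {
                invocation.retryCount.incrementAndGet(); // stand-in for re-sending the operation
                retried++;
            }
        }
        return retried;
    }

    public static void main(String[] args) {
        InvocationRegistry registry = new InvocationRegistry();
        Invocation invocation = new Invocation(1L);
        registry.register(invocation);

        invocation.requestRetry(); // e.g. triggered by a retry response
        invocation.requestRetry(); // e.g. triggered by the member-left event

        System.out.println("retried in first scan: " + scanOnce(registry));
        System.out.println("retried in second scan: " + scanOnce(registry));
    }
}
```

Note that the flag only merges requests made before the same scan. A request that arrives after the monitor has already retried would trigger a second retry, so a real implementation would need an additional guard, which this sketch leaves out. The sketch also depends on every invocation being registered, which is why the read-only local calls that skip the registry are a problem.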

@pveentjer pveentjer self-assigned this Jan 8, 2016

@pveentjer pveentjer added this to the 3.7 milestone Jan 8, 2016

@pveentjer pveentjer modified the milestones: 3.8, 3.7 Jun 13, 2016

@mdogan

Member

commented Dec 28, 2016

I think this has already been fixed in 3.8.

  • Duplicate retry problem on member left is fixed by #9296 #9303
  • Skipping local readonly invocation registration is removed by #8610

@mdogan mdogan closed this Dec 28, 2016
