Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resiliency: Perform write consistency check just before writing on the primary shard #7873

Conversation

martijnvg
Copy link
Member

Before this change the write consistency change was performed on the node that receives the write request and the node that holds the primary shard. This change removes the check on the node that receives the request, since it is redundant.

Also this change moves the write consistency check on the node that holds the primary shard to a later moment after forking of the thread to perform the actual write on the primary shard.

@@ -506,6 +484,7 @@ public void handleException(TransportException exp) {
void retry(@Nullable final Throwable failure) {
if (observer.isTimedOut()) {
// we running as a last attempt after a timeout has happened. don't retry
raiseTimeoutFailure(internalRequest.request().timeout(), failure);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note sure, but this seems really crucial (and thus a good catch). Do we have a test that verifies indexing timeout works?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh , I see. This is a protection against recursion during a last attempt we do on timeout, where the timeout flag is set (isTimeOut doesn't actually check for timeouts). This pattern smells (and is probably my fault, if I didn't take over an existing pattern) but I'm worried this change is no good? (and we need a comment explaining why).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, without this change the WriteConsistencyLevelTests failed all the time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one line change is okay and not bad at all. Because the write validation check now happens on in performOnPrimary() the doStart() will return true even if the check fails, so the following check in the onTimeout() method on the observer listener doesn't help in case the write check fails:

if (doStart()) {
   return;
}
raiseTimeoutFailure(timeout, failure);

If the write check fails we always invoke the retry(...) method, so I think it is fine to invoke raiseTimeoutFailure() there as well.

@clintongormley clintongormley changed the title Perform write consistency check on node that holds primary shard instead of the node that receives the request Resiliency: Perform write consistency check on node that holds primary shard instead of the node that receives the request Sep 26, 2014
@martijnvg martijnvg changed the title Resiliency: Perform write consistency check on node that holds primary shard instead of the node that receives the request Resiliency: Perform write consistency check only on the node that holds primary shard Sep 26, 2014
@martijnvg martijnvg changed the title Resiliency: Perform write consistency check only on the node that holds primary shard Resiliency: Perform write consistency just before writing on the primary shard Sep 29, 2014
@martijnvg
Copy link
Member Author

@bleskes I updated the PR.

@s1monw
Copy link
Contributor

s1monw commented Oct 2, 2014

this LGTM @bleskes your call

}
blockException = checkRequestBlock(observer.observedState(), internalRequest);
if (blockException != null) {
if (blockException.retryable()) {
logger.trace("cluster is blocked ({}), scheduling a retry", blockException.getMessage());
retry(blockException);
return false;
return;
} else {
throw blockException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to the change, but I think this is bad - we don't always catch this exceptions - for example where we're on the onTimeout handler of the observer. I think we should change the observer listeners to have an on failure - but that would be another change. In the meantime - can we make sure doStart doesn't throw exceptions by design but rather calls the listener.onFailure?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scratch that - it's caught later on in the code.

@bleskes
Copy link
Contributor

bleskes commented Oct 6, 2014

The new retry logic is much cleaner. I left some minor comments. O.w. LGTM.

@martijnvg
Copy link
Member Author

@bleskes Thanks for the review. I've updated the PR.

@bleskes
Copy link
Contributor

bleskes commented Oct 6, 2014

Thx @martijnvg . Left one little comment about the change. I'm good with this, +1

Before this change the write consistency change was performed on the node that receives the write request and the node that holds the primary shard. This change removes the check on the node that receives the request, since it is redundant.

Also this change moves the write consistency check on the node that holds the primary shard to a later moment after forking of the thread to perform the actual write on the primary shard.

Closes elastic#7873
@martijnvg martijnvg force-pushed the improvements/write-consistency-check-on-node-holding-primary-shard branch from 0fd8232 to e1a8b02 Compare October 6, 2014 17:00
@martijnvg martijnvg merged commit e1a8b02 into elastic:master Oct 6, 2014
martijnvg added a commit that referenced this pull request Oct 6, 2014
Before this change the write consistency change was performed on the node that receives the write request and the node that holds the primary shard. This change removes the check on the node that receives the request, since it is redundant.

Also this change moves the write consistency check on the node that holds the primary shard to a later moment after forking of the thread to perform the actual write on the primary shard.

Closes #7873
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Oct 6, 2014
Before this change the write consistency change was performed on the node that receives the write request and the node that holds the primary shard. This change removes the check on the node that receives the request, since it is redundant.

Also this change moves the write consistency check on the node that holds the primary shard to a later moment after forking of the thread to perform the actual write on the primary shard.

Closes elastic#7873
@martijnvg martijnvg removed the review label Oct 6, 2014
@clintongormley clintongormley changed the title Resiliency: Perform write consistency just before writing on the primary shard Resiliency: Perform write consistency check just before writing on the primary shard Nov 3, 2014
@clintongormley clintongormley added the :Core/Infra/Core Core issues without another label label Mar 19, 2015
@martijnvg martijnvg deleted the improvements/write-consistency-check-on-node-holding-primary-shard branch May 18, 2015 23:29
mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
Before this change the write consistency change was performed on the node that receives the write request and the node that holds the primary shard. This change removes the check on the node that receives the request, since it is redundant.

Also this change moves the write consistency check on the node that holds the primary shard to a later moment after forking of the thread to perform the actual write on the primary shard.

Closes elastic#7873
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants