HDDS-3031. OM HA- Client requests get LeaderNotReadyException after OM's restart. #564
Thanks Bharat for fixing this.
The following requests were missed. They also need a check for replay before adding to doubleBuffer.
- OMVolumeDeleteRequest
- OMVolumeSetOwnerRequest
- OMVolumeSetQuotaRequest
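One way to picture the pattern the listed requests are missing is the minimal sketch below. It assumes a replay is detected by comparing the transaction log index with an update ID already stored on the object, and that replayed transactions still hand a marker response to the double buffer so its index keeps advancing. Every class, field, and method name here is hypothetical and is not Ozone's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are real Ozone classes.
class ReplayCheckSketch {

    /** Response handed to the double buffer; replay == true marks a no-op. */
    static final class Response {
        final long txIndex;
        final boolean replay;

        Response(long txIndex, boolean replay) {
            this.txIndex = txIndex;
            this.replay = replay;
        }
    }

    /** Toy double buffer: records responses and tracks the highest index added. */
    static final class DoubleBuffer {
        final List<Response> pending = new ArrayList<>();
        long lastAddedIndex = -1;

        void add(Response response) {
            pending.add(response);
            lastAddedIndex = Math.max(lastAddedIndex, response.txIndex);
        }
    }

    /** Assumed rule: the object already reflects this transaction or a later one. */
    static boolean isReplay(long objectUpdateId, long txIndex) {
        return txIndex <= objectUpdateId;
    }

    /** Check for replay first, then add the response to the buffer in both cases. */
    static Response handle(long objectUpdateId, long txIndex, DoubleBuffer buffer) {
        Response response = new Response(txIndex, isReplay(objectUpdateId, txIndex));
        buffer.add(response);
        return response;
    }

    public static void main(String[] args) {
        DoubleBuffer buffer = new DoubleBuffer();
        Response replayed = handle(10L, 7L, buffer);   // index 7 is behind update ID 10
        Response fresh = handle(10L, 11L, buffer);     // index 11 is new work
        System.out.println(replayed.replay + " " + fresh.replay + " " + buffer.lastAddedIndex);
    }
}
```

In this sketch the replayed transaction changes no state but is still recorded, which is the property the volume delete, set-owner, and set-quota requests would need to share with the requests already fixed.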
@@ -164,7 +164,7 @@ public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
       // Replay implies the response has already been returned to
       // the client. So take no further action and return a dummy
       // OMClientResponse.
-      LOG.debug("Replayed Transaction {} ignored. Request: {}",
+      LOG.info("Replayed Transaction {} ignored. Request: {}",
Keeping this log message at info level will flood the log file, because it also prints the Request body.
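To illustrate the cost being pointed out, here is a small sketch that keeps the replay message at a debug-like level and defers rendering the request body until the level is actually enabled. It deliberately uses `java.util.logging` (where `Level.FINE` plays the role of debug) rather than the logger used in the patch, so that it runs without extra dependencies. The class name, the `renderRequest` helper, and the counter are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not the logging setup used by Ozone.
class ReplayLogSketch {
    static final Logger LOG = Logger.getLogger("ReplayLogSketch");

    // Counts how often the (potentially large) request body is rendered.
    static final AtomicInteger bodiesRendered = new AtomicInteger();

    static String renderRequest(String request) {
        bodiesRendered.incrementAndGet();
        return request;
    }

    /** The supplier runs only when FINE is enabled, so no body is built at INFO. */
    static void logReplay(long txIndex, String request) {
        LOG.log(Level.FINE, () -> "Replayed Transaction " + txIndex
            + " ignored. Request: " + renderRequest(request));
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);
        logReplay(5L, "request body");
        System.out.println("rendered at INFO: " + bodiesRendered.get());
    }
}
```

With the level left at debug, a restart that replays many transactions neither writes these lines nor pays for formatting the request.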
ozoneManagerDoubleBufferHelper.add(omClientResponse,
transactionLogIndex));
}
Missed adding addResponseToDoubleBuffer() here.
Thank you @hanishakoneru for the review.
1f09442 to b09ecce
LGTM. +1.
What changes were proposed in this pull request?
Fix the LeaderNotReadyException that is thrown for write requests after OM restarts.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-3031
How was this patch tested?
Deployed the patch on a cluster and reproduced the scenario. With the fix, client requests succeed after an OM restart.
I will see if I can add a unit test for this.