-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDDS-1561: Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove #1401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the patch looks good.
Added some minor comments.
} else if (closeCommand.getForce()) { | ||
// SCM told us to force close the container. | ||
controller.closeContainer(containerId); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly related to this patch, but this part of code has become a little bit messy.
We should be able to refactor this.
switch (container.getContainerState()) {
case OPEN:
controller.markContainerForClose(containerId);
case CLOSING:
final HddsProtos.PipelineID pipelineID = closeCommand.getPipelineID();
final XceiverServerSpi writeChannel = ozoneContainer.getWriteChannel();
if (writeChannel.isExist(pipelineID)) {
writeChannel.submitRequest(getContainerCommandRequestProto(
datanodeDetails, containerId), pipelineID);
} else {
controller.markContainerUnhealthy(containerId);
}
break;
case QUASI_CLOSED:
if (closeCommand.getForce()) {
controller.closeContainer(containerId);
break;
}
case CLOSED:
case UNHEALTHY:
case INVALID:
LOG.debug("Cannot close the container #{}, the container is" +
" in {} state.", containerId, container.getContainerState());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 2nd comit
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If markContainerForClose fails, quasiCloseContainer will definitely fail. We can put both of the calls into same try catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 2nd comit
💔 -1 overall
This message was automatically generated. |
@nandakumar131 Thanks for reviewing the PR! 2nd commit addresses checkstyle issues and review comments. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Test failures are not related. |
Right now, if a pipeline is destroyed by SCM, all the container on the pipeline are marked as quasi closed when datanode received close container command. SCM while processing these containers reports, marks these containers as closed once majority of the nodes are available.
This is however not a sufficient condition in cases where the raft log directory is missing or corrupted. As the containers will not have all the applied transaction.
To solve this problem, we should QUASI_CLOSE the containers in datanode as part of ratis groupRemove. If a container is in OPEN state in datanode without any active pipeline, it will be marked as Unhealthy while processing close container command.