HDDS-9735. Datanodes to retry close pipeline commands until pipeline is removed.#5643
Merged
sodonnel merged 15 commits into apache:master on Nov 22, 2023
…anode to enable token verification after restart.
sodonnel approved these changes on Nov 21, 2023
Contributor
This change LGTM. I am a little concerned that we retrieve the pipelineList for each action, as it is a newly formed list of new objects each time getPipelineReport() is called (see XceiverServerRatis), and then we have to scan that list for each pipelineAction. However, I think these lists should be small, so performance should not be a concern.
Contributor
seems related
Contributor
Author
Thanks for the review @sodonnel and @adoroszlai.
sodonnel approved these changes on Nov 22, 2023
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request on Feb 1, 2024
…il pipeline is removed. (apache#5643) (cherry picked from commit cf47339) Change-Id: Ic1ece669a42fd1be22a0ae52df0e66f666bbb6c4
What changes were proposed in this pull request?
The datanode does not retry the close pipeline command often enough. If SCM has just been restarted and leader election has not yet completed, the close pipeline request might be dropped by SCM.
In this case the datanode does not retry sending the close pipeline action, and the pipeline remains open until a manual close command is sent.
If the datanode triggered the close because the pipeline is bad, new writes keep arriving on this pipeline and keep failing, which makes writes slow.
The proposed change ensures that a close pipeline request is not removed from the pending pipeline action queue until the pipeline itself has been removed from the datanode.
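The retry behaviour described above can be illustrated with a minimal sketch. This is not the code from this patch: `PendingCloseActions`, `addCloseAction` and `actionsForHeartbeat` are hypothetical names, pipelines are represented by plain string IDs, and the set of pipelines currently on the datanode is assumed to be supplied by the caller (in Ozone it would presumably be derived from the pipeline report).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the datanode's pending pipeline action queue. */
final class PendingCloseActions {
    // IDs of pipelines with a pending close action, in insertion order.
    private final Set<String> pending = new LinkedHashSet<>();

    /** Queues a close action for the given pipeline (duplicates are ignored). */
    void addCloseAction(String pipelineId) {
        pending.add(pipelineId);
    }

    /**
     * Returns the close actions to attach to the next heartbeat.
     * An action is dropped only once its pipeline no longer exists on the
     * datanode; otherwise it stays queued and is sent again next time.
     */
    List<String> actionsForHeartbeat(Set<String> pipelinesOnDatanode) {
        pending.removeIf(id -> !pipelinesOnDatanode.contains(id));
        return new ArrayList<>(pending);
    }

    /** Number of close actions still waiting for their pipeline to be removed. */
    int size() {
        return pending.size();
    }
}
```

With this shape, a close action that SCM drops during leader election is simply resent on later heartbeats, and it leaves the queue only after the pipeline has disappeared from the datanode.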
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-9735
How was this patch tested?
Unit Test.