HDFS-17231.HA: Safemode should exit when resources are from low to available #6207
💔 -1 overall
This message was automatically generated.
Good catch on the missing logic to exit safe mode when resources become available again. Clearly, the original author described this behavior in the comment but forgot to implement it in code:
"If resources are later found to have returned to acceptable levels, this daemon will cause the NN to exit safe mode."
LGTM. +1
LGTM. +1 from my side. Thanks @gp1314 for the great catch. I have tried to trigger Yetus manually; let's wait and see what it says.
410d3e5 to 9cc24f8
💔 -1 overall
This message was automatically generated.
9cc24f8 to 8fce419
Thank you very much @haiyang1987 for fixing issue #6214. I have rebased onto trunk, but the jenkins/pr-merge step now fails with the error "Could not checkout". What should I do now? Do I need to open a new pull request?
🎊 +1 overall
This message was automatically generated.
…vailable. (apache#6207). Contributed by Gu Peng. Reviewed-by: Xing Lin <xinglin@linkedin.com> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
Description of PR
The NameNodeResourceMonitor automatically puts the NameNode into safe mode when it detects that resources are insufficient. When ZKFC detects insufficient resources, it triggers a failover. Consider the following scenario:
reproduction
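The behavior this PR asks for can be illustrated with a minimal sketch. It is not the Hadoop implementation: the class `ResourceMonitorSketch`, its `checkOnce()` method, and the `BooleanSupplier` resource probe are all hypothetical names chosen for illustration. The sketch only shows the intended state transitions: enter safe mode when resources run low, and leave it again once a later check finds them available.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical, simplified stand-in for a resource monitor.
 * None of these names come from the Hadoop code base.
 */
class ResourceMonitorSketch {
    // Assumed probe: returns true when enough resources (e.g. disk space) are available.
    private final BooleanSupplier resourcesAvailable;

    // Tracks only safe mode that this monitor itself triggered.
    private boolean inResourceSafeMode = false;

    ResourceMonitorSketch(BooleanSupplier resourcesAvailable) {
        this.resourcesAvailable = resourcesAvailable;
    }

    /** One monitoring pass; a real daemon would call this periodically. */
    void checkOnce() {
        boolean available = resourcesAvailable.getAsBoolean();
        if (!available && !inResourceSafeMode) {
            // Resources are low: enter safe mode.
            inResourceSafeMode = true;
        } else if (available && inResourceSafeMode) {
            // Resources have recovered: exit the safe mode entered above.
            // This is the transition the reviewers describe as missing.
            inResourceSafeMode = false;
        }
    }

    boolean isInSafeMode() {
        return inResourceSafeMode;
    }

    public static void main(String[] args) {
        boolean[] available = {true};
        ResourceMonitorSketch monitor = new ResourceMonitorSketch(() -> available[0]);

        available[0] = false;   // simulate resources dropping below the threshold
        monitor.checkOnce();
        System.out.println(monitor.isInSafeMode()); // prints "true"

        available[0] = true;    // simulate resources recovering
        monitor.checkOnce();
        System.out.println(monitor.isInSafeMode()); // prints "false"
    }
}
```

The sketch deliberately tracks only the safe mode that the monitor itself entered, on the assumption that safe mode entered for other reasons should not be cleared by a resource check.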
How was this patch tested?
Unit test.