-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-16735. Reduce the number of HeartbeatManager loops. #4780
Conversation
LGTM. Should we add |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhangshuyan0 Great catch here. It is very useful for large cluster which is time cost to traverse all node many loops. Thanks.
@@ -557,7 +557,8 @@ public class DFSConfigKeys extends CommonConfigurationKeys { | |||
// This value uses the times of heartbeat interval to define the minimum value for stale interval. | |||
public static final String DFS_NAMENODE_STALE_DATANODE_MINIMUM_INTERVAL_KEY = "dfs.namenode.stale.datanode.minimum.interval"; | |||
public static final int DFS_NAMENODE_STALE_DATANODE_MINIMUM_INTERVAL_DEFAULT = 3; // i.e. min_interval is 3 * heartbeat_interval = 9s | |||
|
|||
public static final String DFS_NAMENODE_REMOVE_BAD_BATCH_NUM = "dfs.namenode.remove.bad.batch.num"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DFS_NAMENODE_REMOVE_DEAD_DATANODE_BATCHNUM_KEY = "dfs.namenode.remove.dead.datanode.batchnum";
this will be more readable?
@@ -71,6 +71,7 @@ class HeartbeatManager implements DatanodeStatistics { | |||
/** Heartbeat monitor thread. */ | |||
private final Daemon heartbeatThread = new Daemon(new Monitor()); | |||
private final StopWatch heartbeatStopWatch = new StopWatch(); | |||
private final int removeBatchNum; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just suggest to define one more readable variable name, such as numOfDeadDatanodesRemove
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll change the names.
@@ -96,6 +97,9 @@ class HeartbeatManager implements DatanodeStatistics { | |||
enableLogStaleNodes = conf.getBoolean( | |||
DFSConfigKeys.DFS_NAMENODE_ENABLE_LOG_STALE_DATANODE_KEY, | |||
DFSConfigKeys.DFS_NAMENODE_ENABLE_LOG_STALE_DATANODE_DEFAULT); | |||
this.removeBatchNum = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.removeBatchNum = conf.getInt(
DFSConfigKeys.DFS_NAMENODE_REMOVE_BAD_BATCH_NUM,
DFSConfigKeys.DFS_NAMENODE_REMOVE_BAD_BATCH_NUM_DEFAULT);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've fixed these code style issues.
...adoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
Show resolved
Hide resolved
@@ -492,12 +498,12 @@ void heartbeatCheck() { | |||
// log nodes detected as stale since last heartBeat | |||
dumpStaleNodes(staleNodes); | |||
|
|||
allAlive = dead == null && failedStorage == null; | |||
allAlive = deadDatanodes.size() == 0 && failedStorages.size() == 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isEmpty()
Thanks @slfan1989 , I have added the relevant configuration to the xml file. This patch is mainly to improve the processing performance of HeartbeatManager in large-scale clusters, and has no impact on existing functions, so I did not add a ut. |
💔 -1 overall
This message was automatically generated. |
@zhangshuyan0 fix checkstyle |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. +1.
Committed to trunk. Thanks @zhangshuyan0 for your contributions! Thanks @goiri , @slfan1989 for your reviews! |
…. Contributed by Shuyan Zhang. Signed-off-by: Inigo Goiri <inigoiri@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
Description of PR
How was this patch tested?
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?