HDDS-5712. make it configurable to trigger refresh datanode usage info before start a new balance iteration #2944
@lokeshj1703 @siddhantsangwan Can you please take a look? Thanks.
Force-pushed from f0fe2e3 to df2b9af.
lokeshj1703 left a comment:
@JacksonYao287 Thanks for working on this! The changes look good to me.
Could we also add a UT in SCM to check the functionality of the refresh command?
Force-pushed from df124e7 to 2d56fdd.
Thanks @lokeshj1703 for the review! I will add a UT soon.
siddhantsangwan left a comment:
@JacksonYao287 Looks good overall! I've added some review comments.
    * @param containerManager ContainerManager
    * @param replicationManager ReplicationManager
    * @param ozoneConfiguration OzoneConfiguration
    * @param scm the storage container manager

NIT: extra whitespace
    this.containerManager = scm.getContainerManager();
    this.replicationManager = scm.getReplicationManager();
    this.ozoneConfiguration = scm.getConfiguration();
    this.config = new ContainerBalancerConfiguration();

This should probably remain:
this.config = ozoneConfiguration.getObject(ContainerBalancerConfiguration.class);

Thanks, will fix this.
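
The difference behind this suggestion can be sketched as follows. This is a minimal illustration under stated assumptions, not Ozone code: `BalancerConfig`, `load`, and the key `balancer.trigger.du.enable` are hypothetical stand-ins for `ContainerBalancerConfiguration`, `ozoneConfiguration.getObject(...)`, and a real configuration key. The idea is that constructing the object directly yields only built-in defaults, while loading it from the configuration picks up values an operator has set.

```java
import java.util.Map;

final class ConfigLoadingSketch {
  /** Hypothetical stand-in for a balancer configuration object. */
  static final class BalancerConfig {
    // Hypothetical default value, not the real Ozone default.
    boolean triggerDuBeforeIteration = false;
  }

  /** Hypothetical loader: overlays operator-supplied values on the defaults. */
  static BalancerConfig load(Map<String, String> props) {
    BalancerConfig config = new BalancerConfig();
    String value = props.get("balancer.trigger.du.enable"); // hypothetical key
    if (value != null) {
      config.triggerDuBeforeIteration = Boolean.parseBoolean(value);
    }
    return config;
  }

  public static void main(String[] args) {
    Map<String, String> props = Map.of("balancer.trigger.du.enable", "true");
    // Direct construction ignores the operator's setting.
    System.out.println(new BalancerConfig().triggerDuBeforeIteration);
    // Loading from the properties honors it.
    System.out.println(load(props).triggerDuBeforeIteration);
  }
}
```

In this sketch the first line printed is `false` and the second is `true`, which is the behavior gap the review comment is guarding against.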
    // this is helpful for container balancer to make more appropriate
    // decisions. this will increase the disk io load of data nodes, so
    // please enable it with caution.
    sendRefreshUsageCommandToAllDNs();

Since the balancer only uses healthy, in-service DNs, do we need to trigger DU on all the DNs?

Good point, will fix this. Thanks.
Force-pushed from 2442a5d to cc75e47.
      // reporting back make it like this for now, a more suitable
      // value. can be set in the future if needed
      wait(3 * nodeReportInterval);
    } catch (InterruptedException e) {

Forgot to mention: we'll also need to ensure the interrupted state of the thread isn't lost. We could add Thread.currentThread().interrupt();

Good point, will add this!
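
A minimal sketch of the interrupt-restoring pattern suggested here, assuming a hypothetical helper `waitForReports` and a plain lock object rather than the balancer's actual fields. `Object.wait(long)` clears the thread's interrupt flag when it throws `InterruptedException`, so the catch block sets the flag again for callers further up the stack.

```java
final class InterruptSketch {
  /**
   * Waits up to timeoutMillis on the given lock (hypothetical helper).
   * Returns true if the wait ended without interruption, false otherwise.
   */
  static boolean waitForReports(Object lock, long timeoutMillis) {
    synchronized (lock) {
      try {
        lock.wait(timeoutMillis);
        return true;
      } catch (InterruptedException e) {
        // wait() cleared the flag when it threw; restore it so that
        // callers can still observe the interruption.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // Simulate a stop request that arrives before the wait begins.
    Thread.currentThread().interrupt();
    boolean completed = waitForReports(new Object(), 1000L);
    System.out.println("completed=" + completed);
    System.out.println("interrupted=" + Thread.currentThread().isInterrupted());
  }
}
```

Without the `interrupt()` call in the catch block, the second line printed would report `interrupted=false`, and a surrounding loop that checks the flag would keep running.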
…ding refresh command
lokeshj1703 left a comment:
@JacksonYao287 Thanks for updating the PR! I have a few minor comments.
...erver-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
...hadoop/ozone/container/common/statemachine/commandhandler/TestRefreshVolumeUsageHandler.java
@lokeshj1703 Thanks for the review. I have updated the patch; please take a look.
siddhantsangwan left a comment:
@JacksonYao287 Thanks for updating! I have a few minor comments.
    getAllNodes().stream().filter(dn -> {
      boolean isHealthy = false;
      try {
        isHealthy = getNodeStatus(dn).isHealthy();
      } catch (NodeNotFoundException nnfe) {
        LOG.warn("datanode {} is not found", dn.getIpAddress());
      }
      return isHealthy;

Can we use the getNodes(NodeOperationalState opState, NodeState health) method here? Is there a reason for not filtering DNs by NodeOperationalState as well as NodeState?

Thanks for the comment, will fix it.
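
To illustrate why filtering on both states matters, here is a minimal self-contained sketch. The `OpState` and `Health` enums, the `Node` class, and this `getNodes` helper are hypothetical stand-ins that only mirror the shape of the `getNodes(NodeOperationalState opState, NodeState health)` signature mentioned above; they are not the SCM implementation.

```java
import java.util.List;
import java.util.stream.Collectors;

final class NodeFilterSketch {
  // Hypothetical stand-ins for NodeOperationalState and NodeState.
  enum OpState { IN_SERVICE, DECOMMISSIONING, IN_MAINTENANCE }
  enum Health { HEALTHY, STALE, DEAD }

  /** Hypothetical datanode record carrying both state dimensions. */
  static final class Node {
    final String name;
    final OpState opState;
    final Health health;

    Node(String name, OpState opState, Health health) {
      this.name = name;
      this.opState = opState;
      this.health = health;
    }
  }

  /** Keeps only nodes that match both the operational state and the health. */
  static List<Node> getNodes(List<Node> all, OpState opState, Health health) {
    return all.stream()
        .filter(n -> n.opState == opState && n.health == health)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Node> all = List.of(
        new Node("dn1", OpState.IN_SERVICE, Health.HEALTHY),
        new Node("dn2", OpState.DECOMMISSIONING, Health.HEALTHY),
        new Node("dn3", OpState.IN_SERVICE, Health.STALE));
    // A health-only filter would also select dn2, which is being decommissioned.
    for (Node n : getNodes(all, OpState.IN_SERVICE, Health.HEALTHY)) {
      System.out.println(n.name);
    }
  }
}
```

In this sketch only `dn1` is selected, so a refresh command would not be sent to a node that is healthy but leaving service.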
    description = "whether to send command to all the data nodes to run du " +
        "immediately before starting a balance iteration. note that " +
        "running du is very time consuming , especially when the disk " +
        "usage rate of a data node is very high")

We can update the description to say "all healthy, in-service datanodes" or something similar.

Sure, will fix it!
lokeshj1703 left a comment:
@JacksonYao287 Thanks for updating the PR! The changes look good to me. +1
Will merge once pending comments are addressed.
@siddhantsangwan Thanks for the review. I have updated the patch according to your comments; please take a look!
siddhantsangwan left a comment:
@JacksonYao287 Thanks for updating. Looks good to me!
@JacksonYao287 Thanks for the contribution! @siddhantsangwan Thanks for the reviews! I have committed the PR to the master branch.
What changes were proposed in this pull request?
Make it configurable whether to trigger a refresh of datanode usage info before starting a new balance iteration.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-5712
How was this patch tested?
Unit tests (UT).