-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-17037. Consider nonDfsUsed when running balancer. #5715
Conversation
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave comments inline. JFYI.
I think we try to balance based on usedSpace (only HDFS used) is not the perfect solution, Maybe we should consider some other information involved.
@@ -104,21 +104,21 @@ void accumulateSpaces(DatanodeStorageReport r) { | |||
for(StorageReport s : r.getStorageReports()) { | |||
final StorageType t = s.getStorage().getStorageType(); | |||
totalCapacities.add(t, s.getCapacity()); | |||
totalUsedSpaces.add(t, s.getDfsUsed()); | |||
totalUsedSpaces.add(t, (s.getCapacity() - s.getRemaining())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could not be true when using s.getCapacity() - s.getRemaining()
directly. For instance, it will be negative if one storage with total capacity 10MB,but configured Capacity 1MB for hdfs, and it remain 8MB after run for a while, then s.getCapacity() - s.getRemaining()
will be -710241024.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggestion. I checked that FsVolumeImpl#configuredCapacity
is set only by FsVolumeImpl#setCapacityForTesting
, and FsVolumeImpl#setCapacityForTesting
is used for unit tests. So s.getCapacity() - s.getRemaining()
will never be negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Suggest to leave some annotations about this here.
} | ||
} | ||
return capacity == 0L? null: dfsUsed*100.0/capacity; | ||
return capacity == 0L? null: totalUsed*100.0/capacity; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should fix the codestyle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -104,21 +104,21 @@ void accumulateSpaces(DatanodeStorageReport r) { | |||
for(StorageReport s : r.getStorageReports()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, the BalancingPolicy.Pool also have the same issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of BalancingPolicy.Pool
is to balance each pool in each datanode. If we use (capacity - remaining)/capacity
to compute util
here, it must be greater or equal than blockPoolUsed/capacity
. In this way, each block pool will get the same util, so the initial goal will not be achieved. All in all, s.getRemaining()
is at Node level, so I think it not very properly to modify BalancingPolicy.Pool
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use blockPoolUsed / (remaining + blockPoolUsed)
may be a proper solution. What about your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to me.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
for(StorageReport s : r.getStorageReports()) { | ||
if (s.getStorage().getStorageType() == t) { | ||
capacity += s.getCapacity(); | ||
dfsUsed += s.getDfsUsed(); | ||
totalUsed += (s.getCapacity() - s.getRemaining()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From #5708 report, s.getRemaining()
could be negative in some corner case. So if we should defend it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
💔 -1 overall
This message was automatically generated. |
@@ -104,21 +104,21 @@ void accumulateSpaces(DatanodeStorageReport r) { | |||
for(StorageReport s : r.getStorageReports()) { | |||
final StorageType t = s.getStorage().getStorageType(); | |||
totalCapacities.add(t, s.getCapacity()); | |||
totalUsedSpaces.add(t, s.getDfsUsed()); | |||
totalUsedSpaces.add(t, (s.getCapacity() - s.getRemaining())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not necessary to involve extra outer brackets here.
totalUsedSpaces.add(t, (s.getCapacity() - s.getRemaining()));
-> totalUsedSpaces.add(t, s.getCapacity() - s.getRemaining());
for(StorageReport s : r.getStorageReports()) { | ||
if (s.getStorage().getStorageType() == t) { | ||
capacity += s.getCapacity(); | ||
dfsUsed += s.getDfsUsed(); | ||
totalUsed += (s.getCapacity() - s.getRemaining()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above.
@@ -138,7 +138,7 @@ String getName() { | |||
void accumulateSpaces(DatanodeStorageReport r) { | |||
for(StorageReport s : r.getStorageReports()) { | |||
final StorageType t = s.getStorage().getStorageType(); | |||
totalCapacities.add(t, s.getCapacity()); | |||
totalCapacities.add(t, s.getRemaining() + s.getBlockPoolUsed()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, it's difficult to understand the meaning of this improvement, just suggest add some annotations for Pool policy and describe what and why.
@Hexiaoqiao Thanks for your suggestion. The code style problems are fixed and some extra comments are added. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. +1. Will commit if no more other comments.
Committed to trunk. Thanks @zhangshuyan0 for your contribution! |
Description of PR
When we run balancer with
BalancingPolicy.Node
policy, our goal is to make each datanode storage balanced. But in the current implementation, the balancer doesn't account for storage used by non-dfs on the datanodes, which can make the situation worse for datanodes that are already strained on storage.How was this patch tested?
Add a new UT.