-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ISSUE-468] Put unavailable servers to the end of the list when sending shuffle data #470
Conversation
…nding shuffle data
Codecov Report
@@ Coverage Diff @@
## master #470 +/- ##
============================================
+ Coverage 58.74% 59.92% +1.17%
- Complexity 1664 1785 +121
============================================
Files 199 205 +6
Lines 11236 11557 +321
Branches 999 1043 +44
============================================
+ Hits 6601 6925 +324
+ Misses 4243 4226 -17
- Partials 392 406 +14
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
Could you modify your title and description? |
Done |
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good to me. ping @jerqi to see if he has more input/comments.
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
for (ShuffleServerInfo ssi : serverList) { | ||
if (!includeBlockList && replica > 1 && !shuffleServerBlockList.isEmpty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it allocate the servers which is less than replicaNum
if exclude nodes are too many?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @xianjingfeng for the work, I have left some comments, PTAL.
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
client/src/main/java/org/apache/uniffle/client/impl/ShuffleWriteClientImpl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @xianjingfeng.
@@ -105,6 +106,7 @@ public class ShuffleWriteClientImpl implements ShuffleWriteClient { | |||
private final ExecutorService dataTransferPool; | |||
private final int unregisterThreadPoolSize; | |||
private final int unregisterRequestTimeSec; | |||
private Set<ShuffleServerInfo> shuffleServerBlocklist; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we use the name shuffleServerBlocklist
? What's the meaning of this variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocklist is equal to Blacklist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Block is a concept of our system. This name make me confused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatives: disallowlist
, denylist
, excludelist
.
Or maybe just blacklist
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Block is a concept of our system. This name make me confused.
I think so too, but i have not good idea. How about defectiveServerList? We need to unify our opinions. 😂 @jerqi @kaijchen @advancedxy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so too, but i have not good idea. How about defectiveServerList? We need to unify our opinions. 😂 @jerqi @kaijchen @advancedxy
Maybe just defectiveServers
? Because it's actually a Set
.
And withBlocklist
may be changed to excludeDefectiveServers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so too, but i have not good idea. How about defectiveServerList? We need to unify our opinions. 😂 @jerqi @kaijchen @advancedxy
Maybe just
defectiveServers
? Because it's actually aSet
. AndwithBlocklist
may be changed toexcludeDefectiveServers
.
It's ok for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for defectiveServers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
if (assignedNum < replicaNum && withBlocklist) { | ||
genServerToBlocks(sbi, serverList, replicaNum - assignedNum, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do this cause that one shuffle server will be allocated twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Assigned server will be added to excludeServers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @xianjingfeng @advancedxy @kaijchen
What changes were proposed in this pull request?
Put unavailable shuffle servers to the end of the server list when sending shuffle data if replica=1.
Why are the changes needed?
If we use multiple replicas and the first shuffle server becomes unavailable, sending data will take a lot of time. Because the client will always send to the first shuffle server firstly. #468
Does this PR introduce any user-facing change?
No
How was this patch tested?
UT