Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to specify custom tags to decide the assignment of servers #30

Merged
merged 5 commits into from Jul 8, 2022

Conversation

zuston
Copy link
Member

@zuston zuston commented Jul 5, 2022

What changes were proposed in this pull request?

Allow to specify custom tags to decide data placement.
Changelog

  1. Introduce the conf of rss.server.tags for shuffle server
  2. Introduce the conf of rss.client.assignment.tags for client to choose the data placement

Why are the changes needed?

Sometimes, we hope the specified spark/mr job's shuffle data can be stored the specified group of shuffle servers, maybe due to DC location and so on.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

UTs.

@zuston
Copy link
Member Author

zuston commented Jul 5, 2022

@colinmjj @jerqi Could u help review?

@jerqi
Copy link
Contributor

jerqi commented Jul 5, 2022

You should add some integration test cases.

@zuston
Copy link
Member Author

zuston commented Jul 5, 2022

You should add some integration test cases.

Yes, i’m doing it. Sorry not to add it firstly.

@jerqi
Copy link
Contributor

jerqi commented Jul 5, 2022

Em...If we use HDFS to store data, the shuffle server will not store data, data placement is a little inappropriate, Could we give a better name?

@@ -56,6 +58,9 @@ public class RssClientConfig {
// When the size of read buffer reaches the half of JVM region (i.e., 32m),
// it will incur humongous allocation, so we set it to 14m.
public static String RSS_CLIENT_READ_BUFFER_SIZE_DEFAULT_VALUE = "14m";
// The tags specified by rss client to determine shuffle data placement.
public static String RSS_CLIENT_DATA_PLACEMENT_TAGS = "rss.client.data.placement.tags";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we give it a better name ?
The reason is as below:
#30 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rss.client.shuffle-server.assignment.tags ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rss.client.assignment.tags will be better.

@@ -56,6 +58,9 @@ public class RssClientConfig {
// When the size of read buffer reaches the half of JVM region (i.e., 32m),
// it will incur humongous allocation, so we set it to 14m.
public static String RSS_CLIENT_READ_BUFFER_SIZE_DEFAULT_VALUE = "14m";
// The tags specified by rss client to determine shuffle data placement.
public static String RSS_CLIENT_DATA_PLACEMENT_TAGS = "rss.client.data.placement.tags";
public static String RSS_CLIENT_DATA_PLACEMENT_TAGS_DEFAULT_VALUES = Constants.SHUFFLE_SERVER_VERSION;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A shuffle server can have multiple tags. The version tag shouldn't be overrided the configurable tags.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,I will remove the default value of this conf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, our client tags should contain version tag in any time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got your thought.

@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2022

Codecov Report

Merging #30 (d7bb0fc) into master (bba68a9) will increase coverage by 0.03%.
The diff coverage is 67.74%.

@@             Coverage Diff              @@
##             master      #30      +/-   ##
============================================
+ Coverage     56.83%   56.87%   +0.03%     
- Complexity     1204     1207       +3     
============================================
  Files           152      152              
  Lines          8401     8429      +28     
  Branches        813      816       +3     
============================================
+ Hits           4775     4794      +19     
- Misses         3368     3376       +8     
- Partials        258      259       +1     
Impacted Files Coverage Δ
...apache/hadoop/mapreduce/v2/app/RssMRAppMaster.java 0.00% <0.00%> (ø)
...rg/apache/uniffle/client/util/RssClientConfig.java 0.00% <0.00%> (ø)
.../java/org/apache/uniffle/server/ShuffleServer.java 72.26% <71.42%> (-0.30%) ⬇️
.../java/org/apache/hadoop/mapreduce/RssMRConfig.java 91.66% <100.00%> (+0.75%) ⬆️
.../java/org/apache/spark/shuffle/RssSparkConfig.java 88.88% <100.00%> (+1.38%) ⬆️
...org/apache/spark/shuffle/RssSparkShuffleUtils.java 49.20% <100.00%> (+6.34%) ⬆️
...he/uniffle/client/impl/ShuffleWriteClientImpl.java 25.98% <100.00%> (ø)
...a/org/apache/uniffle/server/ShuffleServerConf.java 99.30% <100.00%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bba68a9...d7bb0fc. Read the comment docs.

@zuston zuston requested a review from jerqi July 8, 2022 05:02
Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your contribution.

@jerqi jerqi changed the title Allow to specify custom tags to decide data placement Allow to specify custom tags to decide the assignment of servers Jul 8, 2022
@jerqi
Copy link
Contributor

jerqi commented Jul 8, 2022

Would you like to add some docs about this feature in another pr? Because this feature brings some user-facing changes, we need some docs.

@zuston
Copy link
Member Author

zuston commented Jul 8, 2022

Would you like to add some docs about this feature in another pr? Because this feature brings some user-facing changes, we need some docs.

Glad to do. I will open another PR to finish.

@jerqi jerqi merged commit 4a51ff3 into apache:master Jul 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants