[Improvement] Reduce task binary by removing 'partitionToServers' from RssShuffleHandle #615

Closed
3 tasks done
jiafuzha opened this issue Feb 16, 2023 · 3 comments · Fixed by #637

Comments

@jiafuzha
Contributor

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Both map and reduce tasks reference an RssShuffleHandle that wraps 'partitionToServers', which is usually far larger than the rest of the task binary. For example, with a shuffle of 10,000 partitions, 'partitionToServers' could easily reach 250,000 bytes, assuming each map entry takes 25 bytes.
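The estimate above is simple multiplication, and a minimal sketch makes it easy to rerun for other partition counts. The function name `estimated_map_bytes` is made up for illustration, and the 25 bytes per entry is the assumption stated in this issue, not a measured value.

```python
# Back-of-the-envelope estimate of the serialized size of 'partitionToServers'.
# The default of 25 bytes per entry is the assumption from the issue text,
# not a measurement of any real serializer.

def estimated_map_bytes(num_partitions: int, bytes_per_entry: int = 25) -> int:
    """Estimated size when every partition contributes one map entry."""
    return num_partitions * bytes_per_entry


if __name__ == "__main__":
    for partitions in (1_000, 10_000, 100_000):
        print(partitions, estimated_map_bytes(partitions))
```

The estimate grows linearly with the partition count, so the overhead is paid again for every task that carries the handle.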

A large task binary increases task launch delay and task serialization time. We can replace the map with something smaller, such as a function that maps partitions to shuffle servers.

How should we improve?

Instead of shipping 'partitionToServers', we can use something like a mapping function that maps a partition ID to its shuffle servers. Each executor fetches the shuffle servers only once, in the first shuffle task, and caches them for later shuffle tasks with the same shuffle ID.
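A rough sketch of the two ideas in this proposal, written in Python for brevity rather than in the project's own language. Everything here is hypothetical: the round-robin rule in `servers_for_partition`, the `ShuffleServerCache` class, and the `fetch` callback are illustrative names, and the issue does not specify what the mapping function should be. A real assignment made by the coordinator may not be reproducible by a pure function like this one.

```python
import threading
from typing import Callable, Dict, List


def servers_for_partition(partition_id: int, servers: List[str], replicas: int = 1) -> List[str]:
    """Hypothetical round-robin rule: derive a partition's servers from its ID."""
    if not servers:
        raise ValueError("no shuffle servers available")
    replicas = min(replicas, len(servers))
    start = partition_id % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]


class ShuffleServerCache:
    """Per-executor cache: fetch the server list once per shuffle ID, then reuse it."""

    def __init__(self, fetch: Callable[[int], List[str]]):
        self._fetch = fetch  # assumed callback that asks the driver for the servers
        self._by_shuffle: Dict[int, List[str]] = {}
        self._lock = threading.Lock()  # tasks in one executor run concurrently

    def get(self, shuffle_id: int) -> List[str]:
        with self._lock:
            if shuffle_id not in self._by_shuffle:
                self._by_shuffle[shuffle_id] = self._fetch(shuffle_id)
            return self._by_shuffle[shuffle_id]
```

With this shape, a task only needs the shuffle ID and its partition ID. The server list, which is small compared with a full per-partition map, is fetched once per shuffle per executor.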

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@jiafuzha
Contributor Author

jiafuzha commented Feb 17, 2023

As tested with 10,000 partitions and 2 shuffle servers, the binary size of a vanilla Spark task is about 4KB, whilst the RSS shuffle task is more than 670KB.

@advancedxy
Contributor

👍, thanks, this is a good optimization. I believe that once the stage recompute work is finished, we will have a communication layer between executor and driver, which would make it much easier to reduce the task binary size.

@jiafuzha
Contributor Author

@advancedxy The PR is ready. Please help review.

xianjingfeng pushed a commit that referenced this issue Mar 14, 2023
…s' from RssShuffleHandle (#637)

### What changes were proposed in this pull request?
Move the partition -> shuffle servers mapping from a direct field of RssShuffleHandle to a broadcast variable to reduce task binary size.

### Why are the changes needed?
To reduce task delay and task serialization/deserialization time by reducing task binary size.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Tested with a 10,000-partition shuffle. Task binary size dropped from more than 670KB to less than 6KB.
Tested with multiple shuffle stages in the same job to verify the ShuffleHandleInfo cache logic.
advancedxy pushed a commit to advancedxy/incubator-uniffle that referenced this issue Mar 21, 2023
…Servers' from RssShuffleHandle (apache#637)

xianjingfeng pushed a commit to xianjingfeng/incubator-uniffle that referenced this issue Apr 5, 2023
…Servers' from RssShuffleHandle (apache#637)
