[Improvement] Reduce task binary by removing 'partitionToServers' from RssShuffleHandle #615
Closed
3 tasks done
Comments
As tested with 10,000 partitions and 2 shuffle servers, the binary size of a vanilla Spark task is about 4KB, whereas an RSS shuffle task is more than 670KB.
👍, thanks, this is a good optimization. I believe that once the stage recompute work is finished, we will have a communication layer between executor and driver, which would make it much easier to reduce the task binary size.
@advancedxy The PR is ready. Please help review.
xianjingfeng
pushed a commit
that referenced
this issue
Mar 14, 2023
…s' from RssShuffleHandle (#637)

### What changes were proposed in this pull request?
Move the partition -> shuffle servers mapping from a direct field of RssShuffleHandle to a broadcast variable to reduce task binary size.

### Why are the changes needed?
To reduce task delay and task serialize/deserialize time by reducing the task binary size.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Tested with a 10000-partition shuffle. Task binary size was reduced from more than 670KB to less than 6KB.
Tested with multiple shuffle stages in the same job to verify the ShuffleHandleInfo cache logic.
advancedxy
pushed a commit
to advancedxy/incubator-uniffle
that referenced
this issue
Mar 21, 2023
…Servers' from RssShuffleHandle (apache#637)
xianjingfeng
pushed a commit
to xianjingfeng/incubator-uniffle
that referenced
this issue
Apr 5, 2023
…Servers' from RssShuffleHandle (apache#637)
Code of Conduct
Search before asking
What would you like to be improved?
Both map and reduce tasks reference an RssShuffleHandle that wraps 'partitionToServers', which is usually far bigger than the original task binary. For example, for a shuffle with 10,000 partitions, 'partitionToServers' could easily reach 250,000 bytes, assuming each map entry takes 25 bytes.
A large task binary causes long task delay and long task serialization time. We can replace the map with something else, such as a mapping function that maps partitions to shuffle servers.
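The estimate above is a simple product of partition count and per-entry size. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and the 25 bytes per entry is the assumption stated in this issue, not a measured value.

```java
// Hypothetical helper for a back-of-the-envelope estimate; not part of Uniffle.
final class TaskBinaryEstimate {
    private TaskBinaryEstimate() {}

    // Approximate bytes added to a task binary by a partition -> servers map,
    // assuming every map entry serializes to roughly bytesPerEntry bytes.
    static long mappingBytes(int partitions, int bytesPerEntry) {
        return (long) partitions * bytesPerEntry;
    }
}
```

With the numbers from this issue, `TaskBinaryEstimate.mappingBytes(10_000, 25)` gives 250,000 bytes. The measured RSS task binary reported in the comments (more than 670KB) is larger than this rough figure, so the per-entry assumption is probably on the low side.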
How should we improve?
Instead, we can replace 'partitionToServers' with something else, such as a mapping function that maps a partition ID to its shuffle servers. Each executor fetches the shuffle servers only once, in the first shuffle task, and caches them for later shuffle tasks with the same shuffle ID.
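One way to picture this proposal is the sketch below. It is only an illustration under stated assumptions: all class and method names are hypothetical, servers are represented as plain strings, and the round-robin assignment policy is assumed rather than taken from Uniffle. Note that the change merged for this issue (#637) moved the mapping into a broadcast variable instead of using a mapping function.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical mapping function: derives the servers for a partition from a
// short server list instead of shipping a full partition -> servers map.
final class PartitionMapping {
    private PartitionMapping() {}

    // Assumed policy: round-robin by partition ID, taking `replicas`
    // consecutive servers. Assumes replicas <= servers.size().
    static List<String> serversFor(int partitionId, List<String> servers, int replicas) {
        List<String> assigned = new ArrayList<>(replicas);
        for (int i = 0; i < replicas; i++) {
            assigned.add(servers.get(Math.floorMod(partitionId + i, servers.size())));
        }
        return assigned;
    }
}

// Hypothetical per-executor cache: the first task for a shuffle ID fetches the
// server list, later tasks in the same JVM reuse the cached copy.
final class ExecutorServerCache {
    private static final Map<Integer, List<String>> CACHE = new ConcurrentHashMap<>();

    private ExecutorServerCache() {}

    // `fetcher` stands in for whatever call would retrieve the server list
    // (for example from the driver); it runs at most once per shuffle ID.
    static List<String> getServers(int shuffleId, IntFunction<List<String>> fetcher) {
        return CACHE.computeIfAbsent(
            shuffleId,
            id -> Collections.unmodifiableList(new ArrayList<>(fetcher.apply(id))));
    }
}
```

A task would call `ExecutorServerCache.getServers(shuffleId, fetcher)` and then `PartitionMapping.serversFor(partitionId, servers, replicas)`, so the task binary only needs to carry the shuffle ID rather than one map entry per partition. This sketch does not cover cache eviction when a shuffle is unregistered, which a real implementation would need.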
Are you willing to submit PR?