[Improvement] Implement the interface ShuffleDataIO #802
Comments
I'm interested in this. Could you assign it to me?
OK, I have assigned it to you.
The patch [SPARK-42689], which allows ShuffleDriverComponent to declare whether shuffle data is reliably stored, has been merged for Spark 3.5, which has not yet been released. Do I need to test it?
Yes, you can build a Spark client from the master branch to test it.
Okay, I will do that.
Volunteering.
@Kwafoor If you're busy at the moment, let @summaryzb finish this issue first.
### What changes were proposed in this pull request?
Implement ShuffleDataIO.
### Why are the changes needed?
#802
### Does this PR introduce _any_ user-facing change?
Yes. To use Spark dynamic allocation, users should set `spark.shuffle.sort.io.plugin.class` to `org.apache.spark.shuffle.RssShuffleDataIo`; otherwise Spark 3.5 will fail.
### How was this patch tested?
Integration test and manual test.
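For reference, the user-facing setting described in the PR could look like the following `spark-defaults.conf` fragment. This is only a sketch: the plugin class name is taken from the PR description, while enabling `spark.dynamicAllocation.enabled` next to it is an assumption about a typical setup, not something the PR states.

```properties
# Plugin class named in the PR description
spark.shuffle.sort.io.plugin.class  org.apache.spark.shuffle.RssShuffleDataIo
# Assumption: dynamic allocation is switched on in the same config
spark.dynamicAllocation.enabled     true
```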
What would you like to be improved?
We can implement ShuffleDataIO
https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/shuffle/api/ShuffleDataIO.java
https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/shuffle/api/ShuffleDriverComponents.java
With this, dynamic allocation no longer needs a patched Spark.
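To illustrate the idea, here is a minimal, self-contained Java sketch. It does not use Spark's real classes: `DriverComponents` is a hypothetical stand-in for the small part of `ShuffleDriverComponents` relevant here, `RemoteDriverComponents` is a hypothetical implementation, and `canReleaseExecutorsWithShuffleData` is a simplified assumption about how a scheduler might consume the flag. Only the method name `supportsReliableStorage` is meant to mirror what SPARK-42689 adds.

```java
import java.util.Collections;
import java.util.Map;

public class ReliableShuffleSketch {

    // Hypothetical stand-in for part of Spark's ShuffleDriverComponents.
    // The real interface lives in org.apache.spark.shuffle.api and has more methods.
    interface DriverComponents {
        Map<String, String> initializeApplication();

        void cleanupApplication();

        // Assumed default: shuffle data lives on executors, so it is not reliable.
        default boolean supportsReliableStorage() {
            return false;
        }
    }

    // Hypothetical driver components for a remote shuffle service,
    // where shuffle data survives the loss of an executor.
    static final class RemoteDriverComponents implements DriverComponents {
        @Override
        public Map<String, String> initializeApplication() {
            // No extra configuration is passed to executors in this sketch.
            return Collections.emptyMap();
        }

        @Override
        public void cleanupApplication() {
            // Nothing to clean up in this sketch.
        }

        @Override
        public boolean supportsReliableStorage() {
            return true;
        }
    }

    // Simplified assumption: executors holding shuffle data may be released
    // only when the shuffle storage is declared reliable.
    static boolean canReleaseExecutorsWithShuffleData(DriverComponents components) {
        return components.supportsReliableStorage();
    }

    public static void main(String[] args) {
        DriverComponents remote = new RemoteDriverComponents();
        System.out.println(canReleaseExecutorsWithShuffleData(remote));
    }
}
```

In an actual implementation, the plugin class would implement Spark's `ShuffleDataIO` and return driver components of this kind from its `driver()` method.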
How should we improve?
No response
Are you willing to submit PR?