[SPARK-19659][CORE][FOLLOW-UP] Fetch big blocks to disk when shuffle-read #18117
cc @jinxing64
Test build #77415 has started for PR 18117 at commit
retest this please
Test build #77416 has finished for PR 18117 at commit
retest this please
// TODO: Encryption and compression should be considered.
// Fetch remote shuffle blocks to disk when the request is too large. Since the shuffle data is
// already encrypted and compressed over the wire(w.r.t. the related configs), we can just fetch
// the data and write it to file directly.
I think this change is really good. Sorry for the ambiguity.
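For readers following the thread, the fetch-to-disk decision described in the code comment above could be sketched roughly as below. This is a standalone illustration, not the code in `ShuffleBlockFetcherIterator`: the `FetchSketch` object, the `FetchedBlock` types, and the `store` helper are all hypothetical names, and the only thing taken from the PR is the idea of comparing the request size against a `maxReqSizeShuffleToMem` threshold and writing the already-encoded bytes to a file unchanged.

```scala
import java.nio.file.{Files, Path}

// Hypothetical types for illustration only; Spark's real classes differ.
sealed trait FetchedBlock
final case class InMemoryBlock(bytes: Array[Byte]) extends FetchedBlock
final case class OnDiskBlock(file: Path) extends FetchedBlock

object FetchSketch {
  // Assumption: `bytes` is exactly what arrived over the wire, so it is
  // written as-is, with no extra encryption or compression step.
  def store(
      bytes: Array[Byte],
      requestSize: Long,
      maxReqSizeShuffleToMem: Long): FetchedBlock = {
    if (requestSize > maxReqSizeShuffleToMem) {
      // Request is too large to hold in memory: spill it to a temp file.
      val file = Files.createTempFile("shuffle-", ".data")
      Files.write(file, bytes)
      OnDiskBlock(file)
    } else {
      // Small request: keep the bytes in memory.
      InMemoryBlock(bytes)
    }
  }
}
```

With a threshold of 200, calling `FetchSketch.store(data, 300L, 200L)` would take the on-disk branch, while a request of size 3 would stay in memory.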
def fetchShuffleBlock(blocksByAddress: Seq[(BlockManagerId, Seq[(BlockId, Long)])]): Unit = {
// Set `maxBytesInFlight` and `maxReqsInFlight` to `Int.MaxValue`, so that during the
// construction of `ShuffleBlockFetcherIterator`, all requests to fetch remote shuffle blocks
// are issued. The `maxReqSizeShuffleToMem` is hard-coded as 200 here.
👍
@cloud-fan
Test build #77425 has finished for PR 18117 at commit
thanks for the review, merging to master/2.2!
…read ## What changes were proposed in this pull request? This PR includes some minor improvement for the comments and tests in #16989 ## How was this patch tested? N/A Author: Wenchen Fan <wenchen@databricks.com> Closes #18117 from cloud-fan/follow. (cherry picked from commit 1d62f8a) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This PR includes some minor improvements to the comments and tests in #16989
How was this patch tested?
N/A