New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-31034][CORE] ShuffleBlockFetcherIterator should always create request for last block group #27786
Conversation
@@ -339,14 +339,14 @@ final class ShuffleBlockFetcherIterator( | |||
+ s"with ${blocks.size} blocks") | |||
} | |||
|
|||
def createFetchRequests(): Unit = { | |||
def createFetchRequests(hasMore: Boolean): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: isLast
?
@@ -367,12 +367,12 @@ final class ShuffleBlockFetcherIterator( | |||
// For batch fetch, the actual block in flight should count for merged block. | |||
val mayExceedsMaxBlocks = !doBatchFetch && curBlocks.size >= maxBlocksInFlightPerAddress | |||
if (curRequestSize >= targetRemoteRequestSize || mayExceedsMaxBlocks) { | |||
createFetchRequests() | |||
createFetchRequests(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's write down the parameter name.
Test build #119285 has finished for PR 27786 at commit
|
|
||
var numResults = 0 | ||
// After initialize(), there will be 6 FetchRequests, and the each of the first 5 | ||
// includes 3 merged blocks and the last one has 1 merged block. So, only the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are 6 merged blocks in total, how can each request includes 3 merged blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or do you mean shuffle blocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's shuffle blocks in this case. But it seems inconsistent with comment in the below test....I need reword it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: 3 merged blocks
-> 3 shuffle block(not batch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok let's update
var numResults = 0 | ||
// After initialize(), there will be 2 FetchRequests that one has 2 merged blocks and another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: 2 merged blocks
-> 2 ShuffleBlockBatch
Test build #119294 has finished for PR 27786 at commit
|
Test build #119287 has finished for PR 27786 at commit
|
// After initialize(), there will be 6 FetchRequests. And each of the first 5 requests | ||
// includes 1 merged block which is merged from 3 shuffle blocks. The last request has 1 merged | ||
// block which merged from 2 shuffle blocks. So, only the first 5 requests(5 * 3 * 100 >= 1500) | ||
// can be sent. The second FetchRequest will hit maxBlocksInFlightPerAddress so it won't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second
-> The 6th
?
Test build #119308 has finished for PR 27786 at commit
|
Test build #119319 has finished for PR 27786 at commit
|
thanks, merging to master/3.0! |
…request for last block group ### What changes were proposed in this pull request? This is a bug fix of #27280. This PR fix the bug where `ShuffleBlockFetcherIterator` may forget to create request for the last block group. ### Why are the changes needed? When (all blocks).sum < `targetRemoteRequestSize` and (all blocks).length > `maxBlocksInFlightPerAddress` and (last block group).size < `maxBlocksInFlightPerAddress`, `ShuffleBlockFetcherIterator` will not create a request for the last group. Thus, it will lost data for the reduce task. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Updated test. Closes #27786 from Ngone51/fix_no_request_bug. Authored-by: yi.wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 2257ce2) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
thanks all! |
…request for last block group ### What changes were proposed in this pull request? This is a bug fix of apache#27280. This PR fix the bug where `ShuffleBlockFetcherIterator` may forget to create request for the last block group. ### Why are the changes needed? When (all blocks).sum < `targetRemoteRequestSize` and (all blocks).length > `maxBlocksInFlightPerAddress` and (last block group).size < `maxBlocksInFlightPerAddress`, `ShuffleBlockFetcherIterator` will not create a request for the last group. Thus, it will lost data for the reduce task. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Updated test. Closes apache#27786 from Ngone51/fix_no_request_bug. Authored-by: yi.wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This is a bug fix of #27280. This PR fix the bug where
ShuffleBlockFetcherIterator
may forget to create request for the last block group.Why are the changes needed?
When (all blocks).sum <
targetRemoteRequestSize
and (all blocks).length >maxBlocksInFlightPerAddress
and (last block group).size <maxBlocksInFlightPerAddress
,ShuffleBlockFetcherIterator
will not create a request for the last group. Thus, it will lost data for the reduce task.Does this PR introduce any user-facing change?
No.
How was this patch tested?
Updated test.