-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-37675][SPARK-37793] Prevent overwriting of push shuffle merged files once the shuffle is finalized #35325
Conversation
@pan3793 Thanks for testing this. So with this patch are you still seeing that the executor is requesting for merged blocks which either had no data or were overwritten by the shuffle service? |
@otterc Oops, sorry, I forgot to restart the node manager so the patch does not take effect, let me retry again. |
Can one of the admins verify this patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just couple of comments.
.../network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
Outdated
Show resolved
Hide resolved
...ork-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
Show resolved
Hide resolved
… files once the corresponding shuffle is finalized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor style comments, looks good to me.
+CC @Ngone51
if (null == mergePartitionsInfo) { | ||
//If the mergePartitions was never created then it means that there weren't any push | ||
//blocks that were ever received for this shuffle. This could be the case when the driver | ||
//doesn't wait for enough time to start the stage which reads this shuffle data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: space between //
and text.
//doesn't wait for enough time to start the stage which reads this shuffle data. | ||
return new AppShuffleMergePartitionsInfo(msg.shuffleMergeId, true); | ||
} else if (msg.shuffleMergeId < mergePartitionsInfo.shuffleMergeId | ||
|| mergePartitionsInfo.isFinalized()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move ||
to prev line
// 2. finalization of indeterminate stage if the shuffleMergeId related to it is the one | ||
// for which the message is received. | ||
shuffleMergePartitionsRef.set(mergePartitionsInfo.shuffleMergePartitions); | ||
return new AppShuffleMergePartitionsInfo(msg.shuffleMergeId, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: All paths in this lambda, which are not throwing exceptions, are doing the same thing - new AppShuffleMergePartitionsInfo(msg.shuffleMergeId, true);
Make that common ?
reduceIds.add(partition.reduceId); | ||
sizes.add(partition.getLastChunkOffset()); | ||
logger.debug("{} attempt {} shuffle {} shuffleMerge {}: finalization results added" | ||
+ " for partition {} data size {} index size {} meta size {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move the +
to prev line (here and elsewhere in string concat for log msgs)
@otterc Can you remove the WIP tag, given the PR should be complete now ? Thx |
.../network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
Outdated
Show resolved
Hide resolved
...ork-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
Outdated
Show resolved
Hide resolved
...ork-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Will wait for @pan3793 to confirm it works before merging (and tests success ofcourse).
Thanks @otterc and @mridulm, I verified this patch with 2 rounds of 1TB TPC-DS query, the issue is gone. One minor question, can we suppress this warning message? I think it's a little bit noisy for users.
|
Thanks @pan3793 for verifying this change. I saw those logs on the shuffle sever as well and the commit |
Thanks @otterc, would you please add some description of how the corner case caused the file to get overwritten to help others understand how the patch works? |
Thanks for verifying @pan3793 ! |
|
It looks unrelated, based on logs. |
… files once the shuffle is finalized ### What changes were proposed in this pull request? This fixes the bugs that were reported in SPARK-37675 and SPARK-37793. - Empty merged partitions were reported by the shuffle server to the driver. - The push merged files were getting overwritten after a shuffle merge is finalized. - Throwing exception in the finalization of a shuffle for which the shuffle server didn't receive any blocks. ### Why are the changes needed? Changes are need to fix the bug. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Have added unit test. Closes #35325 from otterc/SPARK-37675. Authored-by: Chandni Singh <singh.chandni@gmail.com> Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com> (cherry picked from commit 9afb407) Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
… files once the shuffle is finalized ### What changes were proposed in this pull request? This fixes the bugs that were reported in SPARK-37675 and SPARK-37793. - Empty merged partitions were reported by the shuffle server to the driver. - The push merged files were getting overwritten after a shuffle merge is finalized. - Throwing exception in the finalization of a shuffle for which the shuffle server didn't receive any blocks. ### Why are the changes needed? Changes are need to fix the bug. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Have added unit test. Closes apache#35325 from otterc/SPARK-37675. Authored-by: Chandni Singh <singh.chandni@gmail.com> Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com> (cherry picked from commit 9afb407) Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
What changes were proposed in this pull request?
This fixes the bugs that were reported in SPARK-37675 and SPARK-37793.
Why are the changes needed?
Changes are need to fix the bug.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Have added unit test.