[SPARK-33701][SHUFFLE] Adaptive shuffle merge finalization for push-based shuffle #33896

Closed
wants to merge 33 commits into from

venkata91
Contributor

@venkata91 venkata91 commented Sep 2, 2021

What changes were proposed in this pull request?

SPARK-32920 implemented a simple approach to finalization for push-based shuffle. Shuffle merge finalization is the final operation of a stage: once all the tasks have completed, all the external shuffle services are asked to complete the shuffle merge for the stage. Once this request is completed, no more shuffle pushes are accepted. With this approach, DAGScheduler waits for a fixed time of 10s (spark.shuffle.push.finalize.timeout) to give in-flight shuffle pushes time to complete, but this adds overhead to stages that generate very little shuffle data.

In this PR, instead of waiting for a fixed amount of time before shuffle merge finalization, the wait is controlled adaptively: once the shuffle pushes of a minimum fraction of map tasks (spark.shuffle.push.minPushRatio) have completed, shuffle merge finalization is scheduled. Additionally, if the total shuffle data generated is less than a minimum size threshold (spark.shuffle.push.minShuffleSizeToWait), shuffle merge finalization is scheduled immediately.
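
The two conditions described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the object, method, and parameter names are hypothetical, and what happens when neither condition holds is left as a plain `KeepWaiting` result.

```scala
// Illustrative sketch only; all names here are hypothetical, not Spark internals.
object AdaptiveFinalizeSketch {
  sealed trait Decision
  case object FinalizeNow extends Decision
  case object KeepWaiting extends Decision

  /**
   * @param numMaps            total number of map tasks in the stage
   * @param numPushCompleted   map tasks whose shuffle push has completed
   * @param totalShuffleBytes  total shuffle data generated by the stage
   * @param minPushRatio       stands in for spark.shuffle.push.minPushRatio
   * @param minSizeToWaitBytes stands in for spark.shuffle.push.minShuffleSizeToWait
   */
  def decide(
      numMaps: Int,
      numPushCompleted: Int,
      totalShuffleBytes: Long,
      minPushRatio: Double,
      minSizeToWaitBytes: Long): Decision = {
    // Very little shuffle data: waiting for pushes is not worth the delay.
    val smallShuffle = totalShuffleBytes < minSizeToWaitBytes
    // Enough map tasks have finished pushing their shuffle blocks.
    val enoughPushes =
      numMaps > 0 && numPushCompleted.toDouble / numMaps >= minPushRatio
    if (smallShuffle || enoughPushes) FinalizeNow else KeepWaiting
  }
}
```

For example, a stage with 100 map tasks, 50 completed pushes, and 10 MiB of shuffle data would keep waiting under a ratio of 1.0 and a 1 MiB size threshold, while the same stage with only 1 KiB of shuffle data would finalize right away.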

Why are the changes needed?

This is a performance improvement to the existing functionality.

Does this PR introduce any user-facing change?

Yes, it adds two user-facing configs: spark.shuffle.push.minPushRatio and spark.shuffle.push.minShuffleSizeToWait.
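
As a rough illustration of how these two settings might be supplied, the sketch below builds `--conf` arguments for `spark-submit`. The key names are copied from this description and should be checked against the configuration docs of the Spark version in use; the values and the helper names are purely hypothetical, not defaults or recommendations.

```scala
// Illustrative sketch only; values are made up and the helper is hypothetical.
object PushShuffleConfSketch {
  // Key names as written in this PR description.
  val settings: Map[String, String] = Map(
    "spark.shuffle.push.minPushRatio" -> "0.9",
    "spark.shuffle.push.minShuffleSizeToWait" -> "100m"
  )

  // Renders the settings as spark-submit arguments, sorted by key for stable output.
  def toSubmitArgs(conf: Map[String, String]): Seq[String] =
    conf.toSeq.sortBy(_._1).flatMap { case (k, v) => Seq("--conf", s"$k=$v") }
}
```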

How was this patch tested?

Added unit tests in DAGSchedulerSuite and ShuffleBlockPusherSuite.

Lead-authored-by: Min Shen mshen@linkedin.com
Co-authored-by: Venkata krishnan Sowrirajan vsowrirajan@linkedin.com

@venkata91 venkata91 changed the title [WIP] [SPARK-33701][SHUFFLE] Adaptive shuffle merge finalization for push-based shuffle [WIP][SPARK-33701][SHUFFLE] Adaptive shuffle merge finalization for push-based shuffle Sep 2, 2021
@github-actions github-actions bot added the CORE label Sep 2, 2021
@Victsm
Contributor

Victsm commented Sep 2, 2021

Please add corresponding empty commits from authors of the original internal patch and tag that information in the PR description.

@SparkQA

SparkQA commented Sep 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47866/

@SparkQA

SparkQA commented Sep 16, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47866/

@SparkQA

SparkQA commented Sep 16, 2021

Test build #143358 has finished for PR 33896 at commit 76c98ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47930/

@SparkQA

SparkQA commented Sep 17, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47930/

@SparkQA

SparkQA commented Sep 17, 2021

Test build #143423 has finished for PR 33896 at commit 5b611bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@venkata91
Contributor Author

Removed the WIP tag, this is now reviewable. cc @mridulm @Ngone51 @Victsm @otterc @zhouyejoe @rmcyang

@venkata91 venkata91 changed the title [WIP][SPARK-33701][SHUFFLE] Adaptive shuffle merge finalization for push-based shuffle [SPARK-33701][SHUFFLE] Adaptive shuffle merge finalization for push-based shuffle Sep 20, 2021
@venkata91
Contributor Author

venkata91 commented Sep 20, 2021

Also, regarding the comment in the JIRA about not finalizing when no blocks were pushed due to the threshold: I almost implemented that change, but then realized we already have a timeout (spark.shuffle.push.results.timeout, default 10 secs). If the finalize requests are not answered by then, the scheduler moves forward and submits the next stages. Do you think we really gain much by avoiding finalize when there are no blocks pushed? Thoughts? cc @Ngone51 @Victsm
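
The bounded wait described in this comment can be pictured with a small sketch. It is only an illustration of waiting for responses with a timeout and moving on either way; the object and method names are hypothetical, and the real scheduler code is structured differently.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Illustrative sketch only; names are hypothetical, not Spark internals.
object FinalizeTimeoutSketch {
  // Waits until every pending finalize response has arrived or the timeout
  // expires. Returns true if all responses arrived in time, false otherwise.
  def awaitMergeResults(pendingResponses: CountDownLatch, timeoutMs: Long): Boolean =
    pendingResponses.await(timeoutMs, TimeUnit.MILLISECONDS)

  // Either way the caller moves on; only the available merge results differ.
  def proceed(pendingResponses: CountDownLatch, timeoutMs: Long): String =
    if (awaitMergeResults(pendingResponses, timeoutMs)) {
      "all merge results received; submit next stages"
    } else {
      "timed out; submit next stages without the missing merge results"
    }
}
```

In this sketch, the extra cost of sending a finalize request that nobody answers is bounded by `timeoutMs`, which is the point the comment makes about the existing results timeout.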

@venkata91
Contributor Author

Gentle ping @mridulm @Ngone51 @Victsm

Contributor

@mridulm mridulm left a comment

Thanks for working on this @venkata91 !
Took first pass through the PR, yet to go over the test suite.

core/src/main/scala/org/apache/spark/Dependency.scala (outdated)
core/src/main/scala/org/apache/spark/Dependency.scala (outdated)
core/src/main/scala/org/apache/spark/SparkEnv.scala (outdated)
task.cancel(false)
// The current task should be coming from handleShufflePushCompleted, thus the
// delay should be 0 and registerMergeResults should be true.
assert(delay == 0 && registerMergeResults)
Contributor

Move this out of the if condition ?

Contributor

Shouldn't this not be the case outside of the if itself ?

Contributor Author

Yeah, since we schedule the finalize only if there is no scheduled task when the stage completes (one that checks if totalShuffleSize < shuffleMergeWaitMinSizeThreshold), we can move this assertion outside. Will make the change.

@SparkQA

SparkQA commented Nov 8, 2021

Test build #144974 has finished for PR 33896 at commit 04a4b6c.

  • This patch fails Scala style tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@github-actions github-actions bot added the DOCS label Nov 8, 2021
@SparkQA

SparkQA commented Nov 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49445/

@SparkQA

SparkQA commented Nov 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49445/

@SparkQA

SparkQA commented Nov 8, 2021

Test build #144975 has finished for PR 33896 at commit 9eeb7bf.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49446/

@SparkQA

SparkQA commented Nov 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49446/

@SparkQA

SparkQA commented Dec 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50690/

@SparkQA

SparkQA commented Dec 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50690/

@SparkQA

SparkQA commented Dec 15, 2021

Test build #146216 has finished for PR 33896 at commit d7d7546.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50735/

@SparkQA

SparkQA commented Dec 16, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50735/

@SparkQA

SparkQA commented Dec 16, 2021

Test build #146261 has finished for PR 33896 at commit 616739c.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • (defaultdict(<class 'list'>,

@SparkQA

SparkQA commented Dec 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50773/

@SparkQA

SparkQA commented Dec 17, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50773/

@SparkQA

SparkQA commented Dec 17, 2021

Test build #146301 has finished for PR 33896 at commit d5ff70c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@mridulm mridulm left a comment

Just had a few minor comments, thanks for working on this @venkata91 - looks pretty good to me !

@mridulm
Contributor

mridulm commented Dec 19, 2021

@Ngone51 I am fine with the pr (barring the minor testcase related comments).
Please feel free to merge when you are happy with the PR - I will be on vacation, so I don't want to block your review/merge :-)

@venkata91
Contributor Author

@Ngone51 Addressed @mridulm comments. Please take a look.

@SparkQA

SparkQA commented Dec 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50878/

@SparkQA

SparkQA commented Dec 20, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50878/

@SparkQA

SparkQA commented Dec 20, 2021

Test build #146403 has finished for PR 33896 at commit 4930b3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • public class RocksDB implements KVStore
  • public static class TypeAliases
  • class RocksDBIterator<T> implements KVStoreIterator<T>
  • class RocksDBTypeInfo
  • class Index
  • trait SwitchToDiskStoreListener
  • case class RegrCount(left: Expression, right: Expression)
  • public class LogDivertAppender extends AbstractWriterAppender<WriterManager>

@venkata91
Contributor Author

Not sure why this test is failing: org.apache.spark.sql.streaming.StreamingAggregationSuite. Any thoughts? @Ngone51 Let me try updating the master and merging it.

@venkata91
Contributor Author

Not sure why this test is failing: org.apache.spark.sql.streaming.StreamingAggregationSuite. Any thoughts? @Ngone51 Let me try updating the master and merging it.

I tried running it locally, and it succeeded there.

@venkata91
Contributor Author

venkata91 commented Dec 29, 2021

Ok. Updating the PR with the latest master changes fixed the issue. But some of those test failures are transient and random. Every run has different failures compared to the previous one. @Ngone51 @mridulm Gentle reminder.

Contributor

@mridulm mridulm left a comment

Looks good to me.
+CC @Ngone51

@asfgit asfgit closed this in f6128a6 Jan 5, 2022
@mridulm
Contributor

mridulm commented Jan 5, 2022

Thanks for working on this @venkata91 !
Thanks for the review @Ngone51 :-)

domybest11 pushed a commit to domybest11/spark that referenced this pull request Jun 15, 2022
wangyum pushed a commit that referenced this pull request May 26, 2023
…ased shuffle

SPARK-32920 implemented a simple approach to finalization for push-based shuffle. Shuffle merge finalization is the final operation of a stage: once all the tasks have completed, all the external shuffle services are asked to complete the shuffle merge for the stage. Once this request is completed, no more shuffle pushes are accepted. With this approach, `DAGScheduler` waits for a fixed time of 10s (`spark.shuffle.push.finalize.timeout`) to give in-flight shuffle pushes time to complete, but this adds overhead to stages that generate very little shuffle data.

In this PR, instead of waiting for a fixed amount of time before shuffle merge finalization, the wait is controlled adaptively: once the shuffle pushes of a minimum fraction of map tasks (`spark.shuffle.push.minPushRatio`) have completed, shuffle merge finalization is scheduled. Additionally, if the total shuffle data generated is less than a minimum size threshold (`spark.shuffle.push.minShuffleSizeToWait`), shuffle merge finalization is scheduled immediately.

This is a performance improvement to the existing functionality.

Yes, it adds two user-facing configs: `spark.shuffle.push.minPushRatio` and `spark.shuffle.push.minShuffleSizeToWait`.

Added unit tests in `DAGSchedulerSuite` and `ShuffleBlockPusherSuite`.

Lead-authored-by: Min Shen <mshen@linkedin.com>
Co-authored-by: Venkata krishnan Sowrirajan <vsowrirajan@linkedin.com>

Closes #33896 from venkata91/SPARK-33701.

Lead-authored-by: Venkata krishnan Sowrirajan <vsowrirajan@linkedin.com>
Co-authored-by: Min Shen <mshen@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>