
[SPARK-13902][SCHEDULER] Make DAGScheduler.getAncestorShuffleDependencies() return in topological order to ensure building ancestor stages first. #11720

Closed
wants to merge 10 commits

Conversation

@ueshin (Member) commented Mar 15, 2016

What changes were proposed in this pull request?

DAGScheduler sometimes generates an incorrect stage graph.
Some stages are generated two or more times for the same shuffleId and are then referenced by child stages, because the stage graph is not built in the correct order.

This patch fixes that by making getAncestorShuffleDependencies() return dependencies in topological order, so that ancestor stages are built first.

How was this patch tested?

I added a sample RDD graph that demonstrates the illegal stage graph to DAGSchedulerSuite.
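
For illustration, here is a hypothetical RDD graph of the kind that can trigger the problem, assuming an existing SparkContext named sc (e.g. in spark-shell). This is only a sketch with made-up names, not the exact graph added to DAGSchedulerSuite: one shuffle output is consumed both directly and through a second shuffle, so the child stage has two parent shuffle dependencies, one of which is an ancestor of the other, and they must be registered parents-first.

val rddA = sc.parallelize(1 to 100).map(i => (i % 10, i))
val rddB = rddA.groupByKey()              // shuffle 1
val rddC = rddB
  .map { case (k, vs) => (k, vs.sum) }    // map drops the partitioner ...
  .groupByKey()                           // ... so this is shuffle 2, whose map stage depends on shuffle 1
// The stage computing rddD is intended to depend on shuffle 1 (directly, via rddB)
// and on shuffle 2 (via rddC), so the scheduler has to register shuffle 1's stage
// before shuffle 2's.
val rddD = rddB.join(rddC)
rddD.count()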

@SparkQA commented Mar 15, 2016

Test build #53181 has finished for PR 11720 at commit f8b7910.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 16, 2016

Test build #53265 has finished for PR 11720 at commit 0ea3fc8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 16, 2016

Test build #53318 has finished for PR 11720 at commit 697b322.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 17, 2016

Test build #53425 has finished for PR 11720 at commit d6d3c34.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

/**
 * Find ancestor shuffle dependencies that are not registered in shuffleToMapStage yet,
 * in topological order to ensure building ancestor stages first.
 */
private def getAncestorShuffleDependencies(rdd: RDD[_]): Seq[ShuffleDependency[_, _, _]] = {
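
For context, here is a minimal sketch of one way a method like this could collect unregistered ancestor shuffle dependencies in parents-first (topological) order. It is an illustration only, not the patch's actual implementation: the method name is made up, it uses plain recursion rather than an explicit stack, and it omits the filtering against shuffleToMapStage described in the doc comment.

import scala.collection.mutable
import org.apache.spark.ShuffleDependency
import org.apache.spark.rdd.RDD

// Illustrative only: a DFS that appends each shuffle dependency after first
// visiting the RDD on the map side of that shuffle, so ancestor shuffle
// dependencies always precede their descendants in the returned sequence.
def ancestorShuffleDepsInTopologicalOrder(rdd: RDD[_]): Seq[ShuffleDependency[_, _, _]] = {
  val ordered = mutable.ArrayBuffer.empty[ShuffleDependency[_, _, _]]
  val visited = mutable.HashSet.empty[RDD[_]]

  def visit(r: RDD[_]): Unit = {
    if (visited.add(r)) {             // skip RDDs we have already traversed
      r.dependencies.foreach {
        case shuffleDep: ShuffleDependency[_, _, _] =>
          visit(shuffleDep.rdd)       // collect this shuffle's own ancestors first
          ordered += shuffleDep
        case narrowDep =>
          visit(narrowDep.rdd)
      }
    }
  }

  visit(rdd)
  ordered.toSeq
}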
Member:
ISTM it'd be better to check for illegal overwrites of shuffleToMapStage, e.g. assert(!shuffleToMapStage.get(dep.shuffleId).isDefined), at the caller site: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L289.

@maropu (Member) commented Mar 28, 2016

@ueshin Great work. I'm not 100% sure, but one question: do we get wrong answers from this kind of incorrect stage graph? If so, it'd be better to add tests in RDDSuite.

@ueshin (Member, Author) commented Mar 28, 2016

@maropu Thank you for your review.
I modified it as you suggested.

do we get wrong answers from this kind of incorrect stage graph?

I don't think this kind of incorrect stage graph causes wrong answers, at least for now.

@maropu (Member) commented Mar 28, 2016

cc: @rxin

@SparkQA commented Mar 28, 2016

Test build #54298 has finished for PR 11720 at commit 8fb9a14.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ueshin (Member, Author) commented Mar 28, 2016

Jenkins, retest this please.

@@ -286,6 +286,7 @@ class DAGScheduler(
      case None =>
        // We are going to register ancestor shuffle dependencies
        getAncestorShuffleDependencies(shuffleDep.rdd).foreach { dep =>
          assert(!shuffleToMapStage.get(dep.shuffleId).isDefined)
Contributor:
Why are you asserting this when you are already within a match case where this must be true? If it is actually possible for the contents of shuffleToMapStage to change between the pattern match and the foreach, then we need to fix the unsynchronized update, not just shut down Spark on a failed assertion.

Contributor:
Wait... sorry, let me rethink this a bit.

Member:
The original code could overwrite entries in shuffleToMapStage.

Contributor:
Yes, but it is not immediately obvious that that is inappropriate. I need to spend some time re-familiarizing myself with newOrUsedShuffleStage. In any event, just failing an assertion in the middle of the DAGScheduler is not likely something we want to do. At a bare minimum, we'd want to be logging a more useful error message.

Contributor:
Overwriting entries in shuffleToMapStage definitely isn't an outright error; we should still get correct results after the overwrite, so we shouldn't add a new assertion that turns an evaluation path that was producing correct results into an error condition.

There are other open efforts to clean up this creation of additional Stages that end up skipped, instead of more cleanly re-using the previously executed Stage. That issue is orthogonal to generating a topological ordering of the dependencies, so I'd prefer to handle it outside of this PR.

cc @squito

Member (Author):
@markhamstra Thank you for your review.
I'll remove the assertion for now, but note that there are cases where some stages refer to the overwritten stage as their parent after the overwrite, which is undesirable for the DAGScheduler.

Contributor:
Yes, I tend to agree that the way something that should really be a single Stage gets recreated as a new Stage is undesirable in the DAGScheduler, but the more important point is that, even though it is not optimal, it does produce correct results, and we can't break that for the sake of code that merely seems more desirable.

That's not to say that the efforts already begun to fix this kind of behavior while still generating correct results shouldn't be driven to completion.

Member (Author):
Let me put together another PR, based on this one, to improve DAGScheduler performance.

@markhamstra (Contributor):
The basic idea is a good one, but in addition to needing to spend some time sorting out the logic around that assert, I'm a little concerned about the performance implications of the multiple scans of the data structures in the filter, forall, and for. It looks possible to compose these so that performance is better, at the expense of slightly less legible code.
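
As a generic illustration of that kind of composition (not the PR's actual code; the collection and predicates below are placeholders), a filter followed by a forall can be folded into a single forall pass, trading the intermediate collection and second scan for a slightly denser predicate:

val deps: Seq[Int] = Seq(3, 6, 9, 12)
def interesting(d: Int): Boolean = d % 3 == 0
def wellFormed(d: Int): Boolean = d > 0

// Two scans plus an intermediate Seq:
val twoPass = deps.filter(interesting).forall(wellFormed)

// One scan, no intermediate collection; logically equivalent:
val onePass = deps.forall(d => !interesting(d) || wellFormed(d))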

@SparkQA commented Mar 28, 2016

Test build #54319 has finished for PR 11720 at commit 8fb9a14.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 29, 2016

Test build #54407 has finished for PR 11720 at commit e2cfeaf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ueshin (Member, Author) commented Mar 30, 2016

@markhamstra
Could you take a look at #12060 please?

@ueshin (Member, Author) commented Mar 31, 2016

I'm going to close this in favor of #12060.

@ueshin closed this on Mar 31, 2016