
[SPARK-10192] [core] simple test w/ failure involving a shared dependency #8402

Closed
squito wants to merge 2 commits into master from squito/test_retry_in_shared_shuffle_dep

Conversation

@squito (Contributor) commented Aug 24, 2015

Just trying to increase test coverage in the scheduler; this already works. It includes a regression test for SPARK-9809.

Copied some test utils from #5636; we can wait till that is merged first.
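For readers skimming the thread, here is a rough, self-contained sketch of the scenario shape the new test covers -- two jobs sharing one shuffle dependency -- written against the public RDD API. This is hypothetical illustration only; the PR's actual test drives the scheduler directly in DAGSchedulerSuite.

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch only: two actions that share a single shuffle dependency.
// The scheduler must be able to re-run the shared map stage if a downstream
// stage later hits a fetch failure.
object SharedShuffleDepSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("shared-shuffle-dep-sketch"))
    // reduceByKey introduces the ShuffleDependency that both jobs below share.
    val shuffled = sc.parallelize(1 to 100).map(x => (x % 10, x)).reduceByKey(_ + _)
    val evenKeys = shuffled.filter { case (k, _) => k % 2 == 0 }.count()
    val oddKeys = shuffled.filter { case (k, _) => k % 2 != 0 }.count()
    println(s"evenKeys=$evenKeys, oddKeys=$oddKeys")
    sc.stop()
  }
}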

@SparkQA commented Aug 24, 2015

Test build #41475 has finished for PR 8402 at commit ccddc47.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Aug 25, 2015

Test build #41487 has finished for PR 8402 at commit f768bef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class EndListener extends SparkListener
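Side note: the EndListener reported above is presumably a small test-only listener used to wait for job completion. A minimal sketch of such a listener (assumed, not the patch's actual code):

import java.util.concurrent.CountDownLatch
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

// Hypothetical sketch: a listener that lets a test block until the job under test finishes.
class EndListener extends SparkListener {
  private val ended = new CountDownLatch(1)
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = ended.countDown()
  def awaitJobEnd(): Unit = ended.await()
}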

(Success, makeMapStatus("host" + ('A' + idx).toChar, reduceParts))
}.toSeq
}

Contributor

Try this on for size. There's more that could be done along these lines, but as an experiment I just did the refactoring for these three methods, and I think I like the results.

private implicit class ExtendedTaskSet(taskSet: TaskSet) {
  /** Validate state when creating tests for task failures. */
  def checkStageId(stageId: Int, attempt: Int): Unit = {
    assert(taskSet.stageId === stageId,
      s": expected stage $stageId, got ${taskSet.stageId}")
    assert(taskSet.stageAttemptId === attempt,
      s": expected stage attempt $attempt, got ${taskSet.stageAttemptId}")
  }

  /** Send the given CompletionEvent messages for the tasks in the TaskSet. */
  def complete(results: Seq[(TaskEndReason, Any)]): Unit = {
    assert(taskSet.tasks.length >= results.size)
    for ((result, i) <- results.zipWithIndex) {
      if (i < taskSet.tasks.length) {
        runEvent(CompletionEvent(
          taskSet.tasks(i), result._1, result._2, null, createFakeTaskInfo(), null))
      }
    }
  }

  /** Create a successful MapStatus completion for every task in the TaskSet. */
  def makeCompletions(reduceParts: Int): Seq[(Success.type, MapStatus)] = {
    taskSet.tasks.zipWithIndex.map { case (task, idx) =>
      (Success, makeMapStatus("host" + ('A' + idx).toChar, reduceParts))
    }.toSeq
  }
}
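For context, a call site using these extensions might look roughly like this (hypothetical; it assumes the suite's existing taskSets buffer and a two-partition reduce stage):

// Hypothetical call site in DAGSchedulerSuite with the implicit class in scope:
val shuffleMapTaskSet = taskSets(0)
shuffleMapTaskSet.checkStageId(stageId = 0, attempt = 0)
shuffleMapTaskSet.complete(shuffleMapTaskSet.makeCompletions(reduceParts = 2))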

Contributor

I like this too:

def complete(results: (TaskEndReason, Any)*)

Gets rid of a lot of Seq( ... ) noise at the cost of a few : _*.
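For illustration, call sites would change roughly like this (hypothetical examples, assuming the implicit-class helpers sketched above plus the suite's existing taskSets buffer and makeMapStatus helper):

// Before: every call wraps its results in Seq(...)
taskSets(0).complete(Seq(
  (Success, makeMapStatus("hostA", 2)),
  (Success, makeMapStatus("hostB", 2))))

// With varargs, the Seq(...) wrapper disappears...
taskSets(0).complete(
  (Success, makeMapStatus("hostA", 2)),
  (Success, makeMapStatus("hostB", 2)))

// ...at the cost of ": _*" when a Seq is already in hand.
taskSets(0).complete(taskSets(0).makeCompletions(reduceParts = 2): _*)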

Contributor Author

thanks Mark -- these are the utils I'm stealing from #5636 (where they reduce a lot of noise); might it be best to move this discussion there? Then again, this PR is probably the easiest to merge, so maybe we can do this here first?

I don't particularly care either way; I just want to avoid a fractured discussion (which would be my fault for doing this ...)

Contributor Author

In any case, I'll take a closer look at the alternatives you suggested here.

Contributor Author

So after taking a closer look, these extra helper methods are only used in the completeNextStage... helpers (at least, here and in #5636). I feel like using an implicit is a little too heavy for that -- it's a pretty minor change to the code, and it will make the code harder to understand for a lot of people. I'd like to at least keep the tests as accessible as possible.

I agree that there are a bazillion calls to complete that would benefit from varargs.
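For concreteness, a minimal sketch of the non-implicit shape this points at (plain suite-level helpers plus varargs; names and existing helpers such as runEvent, createFakeTaskInfo, and makeMapStatus are assumed from the suite, not taken from the patch):

// Hypothetical sketch: plain helpers on the suite instead of an implicit class,
// with complete taking varargs as suggested above.
def checkStageId(taskSet: TaskSet, stageId: Int, attempt: Int): Unit = {
  assert(taskSet.stageId === stageId)
  assert(taskSet.stageAttemptId === attempt)
}

def complete(taskSet: TaskSet, results: (TaskEndReason, Any)*): Unit = {
  assert(taskSet.tasks.length >= results.size)
  results.zipWithIndex.foreach { case ((reason, result), i) =>
    runEvent(CompletionEvent(
      taskSet.tasks(i), reason, result, null, createFakeTaskInfo(), null))
  }
}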

@SparkQA commented Aug 31, 2015

Test build #41833 has finished for PR 8402 at commit 8818b3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class EndListener extends SparkListener

@SparkQA commented Sep 4, 2015

Test build #41999 has finished for PR 8402 at commit cfcf4e6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mateiz (Contributor) commented Sep 4, 2015

LGTM

@mateiz (Contributor) commented Sep 4, 2015

Although this seems to have failed another test?

@mateiz (Contributor) commented Sep 5, 2015

retest this please

@SparkQA commented Sep 5, 2015

Test build #42031 has finished for PR 8402 at commit cfcf4e6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14 (Contributor)

@squito can you bring this up to date?

Conflicts:
	core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
@andrewor14 (Contributor)

retest this please

@SparkQA commented Oct 20, 2015

Test build #43990 has finished for PR 8402 at commit e93e35e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 20, 2015

Test build #44005 has finished for PR 8402 at commit e93e35e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14 (Contributor)

Merging into master and 1.6

asfgit pushed a commit that referenced this pull request Nov 11, 2015
just trying to increase test coverage in the scheduler, this already works.  It includes a regression test for SPARK-9809

copied some test utils from #5636, we can wait till that is merged first

Author: Imran Rashid <irashid@cloudera.com>

Closes #8402 from squito/test_retry_in_shared_shuffle_dep.

(cherry picked from commit 33112f9)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit closed this in 33112f9 on Nov 11, 2015
@JoshRosen (Contributor)

This appears to have broken the build. I'm going to spend 5 minutes trying to hotfix and will revert if I can't come up with a quick fix.

@JoshRosen (Contributor)

Verified locally that this fixes the tests, so I'm merging to master and 1.6.

asfgit pushed a commit that referenced this pull request Nov 11, 2015
This fixes an NPE introduced in SPARK-10192 / #8402.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9620 from JoshRosen/SPARK-10192-hotfix.

(cherry picked from commit fac53d8)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
asfgit pushed a commit that referenced this pull request Nov 11, 2015
This fixes an NPE introduced in SPARK-10192 / #8402.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9620 from JoshRosen/SPARK-10192-hotfix.