New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-10192] [core] simple test w/ failure involving a shared dependency #8402
Conversation
Test build #41475 has finished for PR 8402 at commit
|
Test build #41487 has finished for PR 8402 at commit
|
(Success, makeMapStatus("host" + ('A' + idx).toChar, reduceParts)) | ||
}.toSeq | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try this on for size. There's some more that could be done similar to this, but as an experiment I just did the refactoring for these three methods, and I think I like the results.
private implicit class ExtendedTaskSet(taskSet: TaskSet) {
/** Validate state when creating tests for task failures. */
def checkStageId(stageId: Int, attempt: Int): Unit = {
assert(taskSet.stageId === stageId,
s": expected stage $stageId, got ${taskSet.stageId}")
assert(taskSet.stageAttemptId == attempt,
s": expected stage attempt $attempt, got ${taskSet.stageAttemptId}")
}
/** Send the given CompletionEvent messages for the tasks in the TaskSet. */
def complete(results: Seq[(TaskEndReason, Any)]) {
assert(taskSet.tasks.length >= results.size)
for ((result, i) <- results.zipWithIndex) {
if (i < taskSet.tasks.length) {
runEvent(CompletionEvent(
taskSet.tasks(i), result._1, result._2, null, createFakeTaskInfo(), null))
}
}
}
def makeCompletions(reduceParts: Int): Seq[(Success.type, MapStatus)] = {
taskSet.tasks.zipWithIndex.map { case (task, idx) =>
(Success, makeMapStatus("host" + ('A' + idx).toChar, reduceParts))
}.toSeq
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this too:
def complete(results: (TaskEndReason, Any)*)
Gets rid of a lot of Seq( ... )
noise at the cost of a few : _*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks Mark -- these are the utils I'm stealing from #5636 (where these reduce a lot of noise), might be best to move this discussion there? Then again this pr is probably the easiest to merge, so maybe we can do this here first?
I don't particularly care either way, just want to avoid a fractured discussion (which would be my fault for doing this ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in any case I'll take a closer look at the alternatives that you suggested here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so after taking a closer look, these extra helper methods are only used in the completeNextStage...
helpers (at least, here and in #5636). I I feel like using an implicit seems a little too heavy for that -- its a pretty minor change to the code, and it will the code harder to understand for a lot of people. I'd like to try to at least keep the tests as accessible as possible.
I agree that there are bazillion calls to complete
that will benefit from varargs.
f768bef
to
8818b3f
Compare
Test build #41833 has finished for PR 8402 at commit
|
8818b3f
to
cfcf4e6
Compare
Test build #41999 has finished for PR 8402 at commit
|
LGTM |
Although this seems to have failed another test? |
retest this please |
Test build #42031 has finished for PR 8402 at commit
|
@squito can you bring this up to date? |
Conflicts: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
retest this please |
Test build #43990 has finished for PR 8402 at commit
|
Test build #44005 has finished for PR 8402 at commit
|
Merging into master and 1.6 |
just trying to increase test coverage in the scheduler, this already works. It includes a regression test for SPARK-9809 copied some test utils from #5636, we can wait till that is merged first Author: Imran Rashid <irashid@cloudera.com> Closes #8402 from squito/test_retry_in_shared_shuffle_dep. (cherry picked from commit 33112f9) Signed-off-by: Andrew Or <andrew@databricks.com>
This appears to have broken the build. I'm going to spend 5 minutes trying to hotfix and will revert if I can't come up with a quick fix. |
Verified locally that this fixes the tests, so I'm merging to master and 1.6. |
This fixes an NPE introduced in SPARK-10192 / #8402. Author: Josh Rosen <joshrosen@databricks.com> Closes #9620 from JoshRosen/SPARK-10192-hotfix. (cherry picked from commit fac53d8) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
This fixes an NPE introduced in SPARK-10192 / #8402. Author: Josh Rosen <joshrosen@databricks.com> Closes #9620 from JoshRosen/SPARK-10192-hotfix.
just trying to increase test coverage in the scheduler, this already works. It includes a regression test for SPARK-9809
copied some test utils from #5636, we can wait till that is merged first