
[SPARK-10192] [core] simple test w/ failure involving a shared dependency #8402

Closed
squito wants to merge 2 commits into master from squito/test_retry_in_shared_shuffle_dep

Conversation

@squito (Contributor) commented Aug 24, 2015

Just trying to increase test coverage in the scheduler; this already works. It includes a regression test for SPARK-9809.

Copied some test utils from #5636; we can wait till that is merged first.
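For readers skimming the thread, here is a rough, self-contained sketch of the scenario shape the new test covers -- two jobs sharing one shuffle dependency -- written against the public RDD API. This is hypothetical illustration only; the PR's actual test drives the scheduler directly in DAGSchedulerSuite.

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch only: two actions that share a single shuffle dependency.
// The scheduler must be able to re-run the shared map stage if a downstream
// stage later hits a fetch failure.
object SharedShuffleDepSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("shared-shuffle-dep-sketch"))
    // reduceByKey introduces the ShuffleDependency that both jobs below share.
    val shuffled = sc.parallelize(1 to 100).map(x => (x % 10, x)).reduceByKey(_ + _)
    val evenKeys = shuffled.filter { case (k, _) => k % 2 == 0 }.count()
    val oddKeys = shuffled.filter { case (k, _) => k % 2 != 0 }.count()
    println(s"evenKeys=$evenKeys, oddKeys=$oddKeys")
    sc.stop()
  }
}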

@SparkQA commented Aug 24, 2015

Test build #41475 has finished for PR 8402 at commit ccddc47.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Aug 25, 2015

Test build #41487 has finished for PR 8402 at commit f768bef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class EndListener extends SparkListener
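Side note: the EndListener reported above is presumably a small test-only listener used to wait for job completion. A minimal sketch of such a listener (assumed, not the patch's actual code):

import java.util.concurrent.CountDownLatch
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

// Hypothetical sketch: a listener that lets a test block until the job under test finishes.
class EndListener extends SparkListener {
  private val ended = new CountDownLatch(1)
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = ended.countDown()
  def awaitJobEnd(): Unit = ended.await()
}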

(Success, makeMapStatus("host" + ('A' + idx).toChar, reduceParts))
}.toSeq
}

Contributor

Try this on for size. There's more that could be done along these lines, but as an experiment I just did the refactoring for these three methods, and I think I like the results.

private implicit class ExtendedTaskSet(taskSet: TaskSet) {
  /** Validate state when creating tests for task failures. */
  def checkStageId(stageId: Int, attempt: Int): Unit = {
    assert(taskSet.stageId === stageId,
      s": expected stage $stageId, got ${taskSet.stageId}")
    assert(taskSet.stageAttemptId === attempt,
      s": expected stage attempt $attempt, got ${taskSet.stageAttemptId}")
  }

  /** Send the given CompletionEvent messages for the tasks in the TaskSet. */
  def complete(results: Seq[(TaskEndReason, Any)]): Unit = {
    assert(taskSet.tasks.length >= results.size)
    for ((result, i) <- results.zipWithIndex) {
      if (i < taskSet.tasks.length) {
        runEvent(CompletionEvent(
          taskSet.tasks(i), result._1, result._2, null, createFakeTaskInfo(), null))
      }
    }
  }

  /** Create a successful MapStatus completion for every task in the TaskSet. */
  def makeCompletions(reduceParts: Int): Seq[(Success.type, MapStatus)] = {
    taskSet.tasks.zipWithIndex.map { case (task, idx) =>
      (Success, makeMapStatus("host" + ('A' + idx).toChar, reduceParts))
    }.toSeq
  }
}
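For context, a call site using these extensions might look roughly like this (hypothetical; it assumes the suite's existing taskSets buffer and a two-partition reduce stage):

// Hypothetical call site in DAGSchedulerSuite with the implicit class in scope:
val shuffleMapTaskSet = taskSets(0)
shuffleMapTaskSet.checkStageId(stageId = 0, attempt = 0)
shuffleMapTaskSet.complete(shuffleMapTaskSet.makeCompletions(reduceParts = 2))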

Contributor

I like this too:

def complete(results: (TaskEndReason, Any)*)

Gets rid of a lot of Seq( ... ) noise at the cost of a few : _*.
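For illustration, call sites would change roughly like this (hypothetical examples, assuming the implicit-class helpers sketched above plus the suite's existing taskSets buffer and makeMapStatus helper):

// Before: every call wraps its results in Seq(...)
taskSets(0).complete(Seq(
  (Success, makeMapStatus("hostA", 2)),
  (Success, makeMapStatus("hostB", 2))))

// With varargs, the Seq(...) wrapper disappears...
taskSets(0).complete(
  (Success, makeMapStatus("hostA", 2)),
  (Success, makeMapStatus("hostB", 2)))

// ...at the cost of ": _*" when a Seq is already in hand.
taskSets(0).complete(taskSets(0).makeCompletions(reduceParts = 2): _*)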

Contributor Author

thanks Mark -- these are the utils I'm stealing from #5636 (where they reduce a lot of noise); might it be best to move this discussion there? Then again, this PR is probably the easiest to merge, so maybe we can do this here first?

I don't particularly care either way; I just want to avoid a fractured discussion (which would be my fault for doing this ...)

Contributor Author

In any case, I'll take a closer look at the alternatives you suggested here.

Contributor Author

So after taking a closer look, these extra helper methods are only used in the completeNextStage... helpers (at least, here and in #5636). I feel like using an implicit is a little too heavy for that -- it's a pretty minor change to the code, and it will make the code harder to understand for a lot of people. I'd like to at least keep the tests as accessible as possible.

I agree that there are a bazillion calls to complete that would benefit from varargs.
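For concreteness, a minimal sketch of the non-implicit shape this points at (plain suite-level helpers plus varargs; names and existing helpers such as runEvent, createFakeTaskInfo, and makeMapStatus are assumed from the suite, not taken from the patch):

// Hypothetical sketch: plain helpers on the suite instead of an implicit class,
// with complete taking varargs as suggested above.
def checkStageId(taskSet: TaskSet, stageId: Int, attempt: Int): Unit = {
  assert(taskSet.stageId === stageId)
  assert(taskSet.stageAttemptId === attempt)
}

def complete(taskSet: TaskSet, results: (TaskEndReason, Any)*): Unit = {
  assert(taskSet.tasks.length >= results.size)
  results.zipWithIndex.foreach { case ((reason, result), i) =>
    runEvent(CompletionEvent(
      taskSet.tasks(i), reason, result, null, createFakeTaskInfo(), null))
  }
}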

@SparkQA commented Aug 31, 2015

Test build #41833 has finished for PR 8402 at commit 8818b3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class EndListener extends SparkListener

@SparkQA commented Sep 4, 2015

Test build #41999 has finished for PR 8402 at commit cfcf4e6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mateiz (Contributor) commented Sep 4, 2015

LGTM

@mateiz (Contributor) commented Sep 4, 2015

Although this seems to have failed another test?

@mateiz (Contributor) commented Sep 5, 2015

retest this please

@SparkQA commented Sep 5, 2015

Test build #42031 has finished for PR 8402 at commit cfcf4e6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14 (Contributor)

@squito can you bring this up to date?

Conflicts:
	core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
@andrewor14 (Contributor)

retest this please

@SparkQA commented Oct 20, 2015

Test build #43990 has finished for PR 8402 at commit e93e35e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 20, 2015

Test build #44005 has finished for PR 8402 at commit e93e35e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14 (Contributor)

Merging into master and 1.6

asfgit pushed a commit that referenced this pull request Nov 11, 2015
just trying to increase test coverage in the scheduler, this already works.  It includes a regression test for SPARK-9809

copied some test utils from #5636, we can wait till that is merged first

Author: Imran Rashid <irashid@cloudera.com>

Closes #8402 from squito/test_retry_in_shared_shuffle_dep.

(cherry picked from commit 33112f9)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit closed this in 33112f9 on Nov 11, 2015
@JoshRosen (Contributor)

This appears to have broken the build. I'm going to spend 5 minutes trying to hotfix and will revert if I can't come up with a quick fix.

@JoshRosen (Contributor)

Verified locally that this fixes the tests, so I'm merging to master and 1.6.

asfgit pushed a commit that referenced this pull request Nov 11, 2015
This fixes an NPE introduced in SPARK-10192 / #8402.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9620 from JoshRosen/SPARK-10192-hotfix.

(cherry picked from commit fac53d8)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
asfgit pushed a commit that referenced this pull request Nov 11, 2015
This fixes an NPE introduced in SPARK-10192 / #8402.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9620 from JoshRosen/SPARK-10192-hotfix.