[SPARK-10081][Core] Skip re-computing getMissingParentStages #8269

Closed
wants to merge 1 commit into from


viirya
Member

@viirya viirya commented Aug 18, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-10081

This small patch skips one unnecessary call to getMissingParentStages in handleJobSubmitted.
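
A minimal sketch of the idea, using a toy stage model rather than Spark's actual `DAGScheduler`. The `Stage` case class, its `available` flag, and the `traversals` counter are assumptions made for illustration only; the toy lookup inspects direct parents and does not reproduce Spark's real traversal. It shows that computing the result once and reusing the value for logging and for later decisions needs a single traversal.

```scala
// Toy model, not Spark's real classes: a stage with parents and an availability flag.
final case class Stage(id: Int, parents: List[Stage], available: Boolean)

object MissingParentsSketch {
  // Counts how often the lookup runs, to make the saving visible.
  var traversals = 0

  // Simplified stand-in: returns the direct parents that are not yet available.
  def getMissingParentStages(stage: Stage): List[Stage] = {
    traversals += 1
    stage.parents.filter(p => !p.available)
  }

  def main(args: Array[String]): Unit = {
    val parent = Stage(1, Nil, available = false)
    val finalStage = Stage(2, List(parent), available = false)

    // Compute once, then reuse the value instead of calling the lookup again.
    val missingStages = getMissingParentStages(finalStage)
    println("Missing parents: " + missingStages)
    if (missingStages.isEmpty) println("No missing parents, final stage can run")

    println("Lookups performed: " + traversals) // prints "Lookups performed: 1"
  }
}
```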

@SparkQA

SparkQA commented Aug 18, 2015

Test build #41105 has finished for PR 8269 at commit 8416b7f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ElementwiseProduct(JavaTransformer, HasInputCol, HasOutputCol):

@squito
Contributor

squito commented Aug 18, 2015

lgtm

@SparkQA

SparkQA commented Aug 25, 2015

Test build #41509 has finished for PR 8269 at commit 8416b7f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -732,7 +732,8 @@ class DAGScheduler(
 job.jobId, callSite.shortForm, partitions.length))
 logInfo("Final stage: " + finalStage + "(" + finalStage.name + ")")
 logInfo("Parents of final stage: " + finalStage.parents)
-logInfo("Missing parents: " + getMissingParentStages(finalStage))
+val missingStages = getMissingParentStages(finalStage)
+logInfo("Missing parents: " + missingStages)
Member

Super minor, but should this use string interpolation? As a general principle it is preferable for speed and conciseness, though it won't matter here.
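
For reference, a small sketch of the two styles the comment above contrasts. The object and method names here are hypothetical and only the message format is taken from the diff. Both forms build the same string; the interpolated form is simply more compact.

```scala
object InterpolationSketch {
  // Concatenation, as in the patch under review.
  def concat(missingStages: List[String]): String =
    "Missing parents: " + missingStages

  // The same message written with Scala's s-interpolator.
  def interp(missingStages: List[String]): String =
    s"Missing parents: $missingStages"

  def main(args: Array[String]): Unit = {
    val missingStages = List("Stage 1", "Stage 2")
    println(concat(missingStages)) // prints "Missing parents: List(Stage 1, Stage 2)"
    println(interp(missingStages)) // prints "Missing parents: List(Stage 1, Stage 2)"
  }
}
```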

@andrewor14
Contributor

How much time does this actually save? This seems fine, but I wonder if it is actually worth doing, since you only traverse the stage hierarchy one more time, and it most likely isn't that deep.

@andrewor14
Contributor

On second thought, I don't think the optimization here is worth the extra complexity. How much time do you actually save? Even in something like ALS you only have, say, 20 stages. The new signature is actually somewhat confusing: what does it mean to pass in both a stage and its missing stages? What if we accidentally pass in the wrong missing stages?

I would recommend we close this PR since the scheduling delay it saves is negligible.
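
The signature concern can be illustrated with a toy sketch. Everything here (`ToyStage`, `missingParents`, and both `canRunNow` overloads) is hypothetical and is not the signature from the patch or from Spark. It only shows that once the missing stages are passed in separately, nothing ties them to the stage they are supposed to describe.

```scala
// Toy model for illustration only, not Spark's real classes.
final case class ToyStage(id: Int, parents: List[ToyStage], available: Boolean)

object SignatureSketch {
  def missingParents(stage: ToyStage): List[ToyStage] =
    stage.parents.filter(p => !p.available)

  // One-argument form: the missing parents are always derived from `stage`.
  def canRunNow(stage: ToyStage): Boolean =
    missingParents(stage).isEmpty

  // Hypothetical two-argument form: the caller supplies the list,
  // and the method has no way to check that it belongs to `stage`.
  def canRunNow(stage: ToyStage, missing: List[ToyStage]): Boolean =
    missing.isEmpty

  def main(args: Array[String]): Unit = {
    val parent = ToyStage(1, Nil, available = false)
    val child = ToyStage(2, List(parent), available = false)
    val unrelated = ToyStage(3, Nil, available = false)

    println(canRunNow(child))                            // prints "false"
    // Passing the list computed for a different stage silently gives the wrong answer.
    println(canRunNow(child, missingParents(unrelated))) // prints "true"
  }
}
```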

@asfgit asfgit closed this in 804a012 Sep 4, 2015
@viirya viirya deleted the skip-recomp-missingstages branch December 27, 2023 18:18
5 participants