Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8560][UI] The Executors page will have negative if having resubmitted tasks #6950

Closed
wants to merge 1 commit into from

Conversation

XuTingjun
Copy link
Contributor

when the taskEnd.reason is Resubmitted, it shouldn't do statistics. Because this tasks has a SUCCESS taskEnd before.

@SparkQA
Copy link

SparkQA commented Jun 23, 2015

Test build #35538 has finished for PR 6950 at commit af35dc3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Copy link
Contributor

This looks fine to me. I'll leave it a little in case others have more to add.

@XuTingjun
Copy link
Contributor Author

@tdas @pwendell Can you have a look on this patch, thanks!

@tdas
Copy link
Contributor

tdas commented Jun 27, 2015

@JoshRosen is a better person to check this out.

taskEnd.reason match {
case Resubmitted =>
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we still want to update the shuffle metrics down there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, because these Resubmitted SparkListenerTaskEnd tasks have post a Success SparkListenerTaskEnd before, all the metrics have been updated. If update here, the metrics will calculate twice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, technically this is still incorrect because it uses the metrics of the first attempt of this task, which might be incomplete. However, a correct fix would be much more complicated (you have to essentially revert the metrics added in the failed task).

We should at least add a comment:

// Note: For resubmitted tasks, we continue to use the metrics that belong to the
// first attempt of this task. This may not be 100% accurate because the first attempt
// could have failed half-way through. The correct fix would be to keep track of the
// metrics added by each attempt, but this is much more complicated.

@andrewor14
Copy link
Contributor

Alright, this is good to go after you add that comment. Thanks @XuTingjun

@andrewor14
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Jun 30, 2015

Test build #36166 has finished for PR 6950 at commit af35dc3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Copy link
Contributor

Alright, I fixed the comment myself and merged this into master. Thanks @XuTingjun.

@asfgit asfgit closed this in 79f0b37 Jun 30, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants