Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-2299] Consolidate various stageIdTo* hash maps in JobProgressListener #1262

Closed
wants to merge 6 commits into from

Conversation

rxin
Copy link
Contributor

@rxin rxin commented Jun 29, 2014

This should reduce memory usage for the web ui as well as slightly increase its speed in draining the UI event queue.

@andrewor14

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@@ -67,28 +61,28 @@ private[ui] class StagePage(parent: JobProgressTab) extends WebUIPage("stage") {
<ul class="unstyled">
<li>
<strong>Total task time across all tasks: </strong>
{UIUtils.formatDuration(listener.stageIdToTime.getOrElse(stageId, 0L) + activeTime)}
{UIUtils.formatDuration(stageData.executorRunTime)}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I dropped activeTime here (time taken for currently active tasks) because I'm not sure if the extra data structure required to track this is worth the benefit (I don't know if anybody really looks at this ...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tasks that have already finished, doesn't activeTime already include executorRunTime? If so, aren't we double counting for those tasks? It seems to me that what it was before was plain wrong, though maybe I'm misunderstanding something.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16240/

@@ -17,6 +17,8 @@

package org.apache.spark.ui.jobs

import org.apache.spark.ui.jobs.UIData.StageUIData

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: mind grouping this with other org.apache.spark imports?

@tsudukim
Copy link
Contributor

Should we have only "stageId" as key of these HashMap?
Related to SPARK-2298 at JIRA, I think both stage and attemptId should be included into key in order to discriminate the original stage from re-submitted (attemptId is incremented) stage.

@pwendell
Copy link
Contributor

Yeah I think @rxin is going to change this so that we index on both the stage and attempt. Also, we'll need to extend the listener interface to give both the attempt and stage for a task.

rxin added 2 commits July 16, 2014 22:43
Conflicts:
	core/src/main/scala/org/apache/spark/ui/jobs/ExecutorSummary.scala
	core/src/main/scala/org/apache/spark/ui/jobs/ExecutorTable.scala
	core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
	core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
	core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala
@rxin
Copy link
Contributor Author

rxin commented Jul 17, 2014

I pushed a new version. I'd first merge this and then have a separate PR to index the hash table by stageId + attempt.

Now it includes @kayousterhout's change. Please take another look.

@SparkQA
Copy link

SparkQA commented Jul 17, 2014

QA tests have started for PR 1262. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16768/consoleFull

@SparkQA
Copy link

SparkQA commented Jul 17, 2014

QA results for PR 1262:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16768/consoleFull

@SparkQA
Copy link

SparkQA commented Jul 18, 2014

QA tests have started for PR 1262. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16795/consoleFull

@SparkQA
Copy link

SparkQA commented Jul 18, 2014

QA results for PR 1262:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16795/consoleFull

@rxin
Copy link
Contributor Author

rxin commented Jul 18, 2014

Merging in master. Thanks for reviewing.

@asfgit asfgit closed this in 72e9021 Jul 18, 2014
@rxin rxin deleted the ui-consolidate-hashtables branch July 18, 2014 02:45
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…istener

This should reduce memory usage for the web ui as well as slightly increase its speed in draining the UI event queue.

@andrewor14

Author: Reynold Xin <rxin@apache.org>

Closes apache#1262 from rxin/ui-consolidate-hashtables and squashes the following commits:

1ac3f97 [Reynold Xin] Oops. Properly handle description.
f5736ad [Reynold Xin] Code review comments.
b8828dc [Reynold Xin] Merge branch 'master' into ui-consolidate-hashtables
7a7b6c4 [Reynold Xin] Revert css change.
f959bb8 [Reynold Xin] [SPARK-2299] Consolidate various stageIdTo* hash maps in JobProgressListener to speed it up.
63256f5 [Reynold Xin] [SPARK-2320] Reduce <pre> block font size.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants