[SPARK-4266] [Web-UI] Reduce stage page load time. #3328

Closed
wants to merge 1 commit into from

kayousterhout
Contributor

This commit changes the JavaScript used to show/hide additional
metrics in order to reduce page load time. SPARK-4016 significantly
increased page load time for the stage page when stages had a lot
(thousands or tens of thousands) of tasks, due to the additional
JavaScript that hides some metrics by default and stripes the tables.
This commit reduces page load time in two ways:

(1) Now, all of the metrics that are hidden by default are
hidden by setting "display: none;" using CSS for the page,
rather than hiding them using JavaScript after the page loads.
Without this change, for stages with thousands of tasks, there
was a delay of a few seconds after page load, during which the
additional metrics were first shown and then hidden once the
relevant JS finished running.

(2) CSS is used to stripe all of the tables except for the summary
table. The summary table needs JavaScript to do the striping because
some rows are hidden, but the JavaScript striping is slower, which
again resulted in a delay when it was used for the task table (for
a few seconds after page load, all of the rows in the task table
would be white while the browser finished running the JS to stripe
the table).
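
The two changes above can be sketched roughly as follows. This is an illustrative model rather than the code in this patch: the class name `additional-metric`, the helper names, and the `{ hidden: boolean }` row model are all assumptions made for the example. It shows (1) a stylesheet rule that hides cells at parse time and (2) why positional CSS striping only works for tables with no hidden rows.

```javascript
// Hypothetical class name for metrics that start out hidden.
const HIDDEN_METRIC_CLASS = "additional-metric";

// (1) A rule shipped with the page's stylesheet: matching cells are hidden as
// soon as they are parsed, so nothing flashes on screen while scripts run.
function hiddenByDefaultRule(className) {
  return "." + className + " { display: none; }";
}

// Render one table cell. `text` is assumed to be HTML-escaped already.
function renderMetricCell(text, hiddenByDefault) {
  const classAttr = hiddenByDefault ? ' class="' + HIDDEN_METRIC_CLASS + '"' : "";
  return "<td" + classAttr + ">" + text + "</td>";
}

// (2) Rows are modeled as { hidden: boolean }. Each function returns, per row,
// whether that row would get the stripe color.

// Positional striping, as a CSS rule like tr:nth-child(odd) does it:
// hidden rows still count, so visible neighbors can end up the same color.
function stripeByPosition(rows) {
  return rows.map((row, index) => index % 2 === 0);
}

// Script-style striping: only visible rows advance the counter.
function stripeVisibleOnly(rows) {
  let visibleIndex = 0;
  return rows.map((row) => {
    if (row.hidden) return false;
    const striped = visibleIndex % 2 === 0;
    visibleIndex += 1;
    return striped;
  });
}
```

For rows that are visible, hidden, visible, visible, positional striping colors the first and third rows, so the two visible neighbors look identical. Counting only visible rows alternates them correctly, which is why the summary table still needs a script while the task table can rely on CSS.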

cc @pwendell

This change is intended to be backported to 1.2 to avoid a regression in
UI performance when users run large jobs.

@SparkQA

SparkQA commented Nov 17, 2014

Test build #23512 has started for PR 3328 at commit 5d98669.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 18, 2014

Test build #23512 has finished for PR 3328 at commit 5d98669.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23512/
Test PASSed.

listingTableClass += " table-fixed"
var listingTableClass = {
  if (stripeRowsWithCss) TABLE_CLASS_STRIPED
  else TABLE_CLASS_NOT_STRIPED
}
Contributor

can you do

var listingTableClass =
  if (stripeRowsWithCss) {
    TABLE_CLASS_STRIPED
  } else {
    TABLE_CLASS_NOT_STRIPED
  }

@kayousterhout
Contributor Author

Thanks Andrew! On further inspection I realized this fits in one line which is, I think, the cleanest way to do this.

@SparkQA

SparkQA commented Nov 19, 2014

Test build #23618 has started for PR 3328 at commit e2d2bb1.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 19, 2014

Test build #23618 has finished for PR 3328 at commit e2d2bb1.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23618/
Test FAILed.

@SparkQA

SparkQA commented Nov 19, 2014

Test build #23629 has started for PR 3328 at commit 9470b4f.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 19, 2014

Test build #23629 has finished for PR 3328 at commit 9470b4f.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23629/
Test FAILed.

@andrewor14
Contributor

retest this please

@SparkQA

SparkQA commented Nov 20, 2014

Test build #23647 has started for PR 3328 at commit 9470b4f.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 20, 2014

Test build #23647 has finished for PR 3328 at commit 9470b4f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23647/
Test PASSed.

@kayousterhout
Contributor Author

@andrewor14 @pwendell any more comments here?

@pwendell
Contributor

LGTM

@andrewor14
Contributor

Yeah, LGTMT. Feel free to merge it

* An ID selector (rather than a class selector) is used to ensure this runs quickly even on pages
* with thousands of task rows (ID selectors are much faster than class selectors). */
function stripeSummaryTable() {
  $("#task-summary-table").find("tr:not(:hidden)").each(function (index) {
Contributor

just wondering - is using this selector more expensive than just traversing elements in children() and checking manually for each one and then maintaining your own counter? It seems like interpreting this string could be expensive inside of javascript.

Contributor Author

This is a table that has at most 6 or 7 rows (it's the summary table for each metric), so I don't think that level of optimization matters. The most important thing here is the ID selector, which means you don't need to traverse the task table, which can have tens of thousands of rows.

Contributor

Didn't follow that. Thanks for explaining.
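
The point made in this thread, that the lookup is scoped to the small summary table by ID and only its visible rows are counted, can be sketched without jQuery roughly as below. This is a simplified illustration, not the function from the patch: the `doc` and `isHidden` parameters, the use of the `hidden` property as the default visibility check (the hunk above uses jQuery's `:hidden` filter), and the two colors are assumptions.

```javascript
// Stripe only the visible rows of the summary table and return how many
// rows were striped.
// `doc` is the browser `document`, or any object with getElementById.
// `isHidden` is a hypothetical hook; its default checks the `hidden` property,
// which is a simplifying assumption compared with jQuery's :hidden filter.
function stripeSummaryTable(doc, isHidden = (row) => row.hidden === true) {
  // The ID lookup limits all work to one table with a handful of rows,
  // so the (possibly huge) task table is never traversed.
  const table = doc.getElementById("task-summary-table");
  if (table === null) return 0;

  let visibleIndex = 0;
  for (const row of Array.from(table.querySelectorAll("tr"))) {
    if (isHidden(row)) continue; // hidden rows do not advance the counter
    // Placeholder colors, chosen only for the example.
    row.style.backgroundColor = visibleIndex % 2 === 0 ? "#f9f9f9" : "#ffffff";
    visibleIndex += 1;
  }
  return visibleIndex;
}
```

In a browser this would be called as `stripeSummaryTable(document)` after any metric is shown or hidden. Because the summary table has only a few rows, the cost of each call stays small regardless of how many tasks the stage has.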

@kayousterhout
Contributor Author

Ok I rebased this and will let Jenkins test it one more time (since it was last tested 5 days ago and I know lots has been merged since then) -- then I'll merge it assuming all of the tests pass.

@SparkQA

SparkQA commented Nov 24, 2014

Test build #23799 has started for PR 3328 at commit f964091.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 25, 2014

Test build #23799 has finished for PR 3328 at commit f964091.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23799/
Test PASSed.

@asfgit asfgit closed this in d24d5bf Nov 25, 2014
asfgit pushed a commit that referenced this pull request Nov 25, 2014
Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #3328 from kayousterhout/SPARK-4266 and squashes the following commits:

f964091 [Kay Ousterhout] [SPARK-4266] [Web-UI] Reduce stage page load time.

(cherry picked from commit d24d5bf)
Signed-off-by: Kay Ousterhout <kayousterhout@gmail.com>