Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8702][WebUI]Avoid massive concating strings in Javascript #7082

Closed
wants to merge 1 commit into from
Closed

[SPARK-8702][WebUI]Avoid massive concating strings in Javascript #7082

wants to merge 1 commit into from

Conversation

zsxwing
Copy link
Member

@zsxwing zsxwing commented Jun 29, 2015

When there are massive tasks, such as sc.parallelize(1 to 100000, 10000).count(), the generated JS codes have a lot of string concatenations in the stage page, nearly 40 string concatenations for one task.

We can generate the whole string for a task instead of execution string concatenations in the browser.

Before this patch, the load time of the page is about 21 seconds.
screen shot 2015-06-29 at 6 44 04 pm

After this patch, it reduces to about 17 seconds.

screen shot 2015-06-29 at 6 47 34 pm

One disadvantage is that the generated JS codes become hard to read.

@zsxwing
Copy link
Member Author

zsxwing commented Jun 29, 2015

/cc @sarutak

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #35980 has started for PR 7082 at commit b29231d.

@zsxwing
Copy link
Member Author

zsxwing commented Jun 29, 2015

@sarutak I also find another issue in the console.

Because System.currentTimeMillis() is not accurate for tasks that only need several milliseconds, sometimes totalExecutionTime in makeTimeline will be 0. If totalExecutionTime is 0, there will the following error in the console.

screen shot 2015-06-29 at 7 08 55 pm

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #979 has started for PR 7082 at commit b29231d.

@sarutak
Copy link
Member

sarutak commented Jun 29, 2015

Oh, I see. could you file another one as a separate issue?
Thanks!

|'start': new Date($launchTime),
|'end': new Date($finishTime)
|}
|""".stripMargin.replaceAll("\n", " ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need stripMargin right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's necessary. stripMargin removes the prefix "        |".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood the effect of "|" and stripMargin. Yes it's needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The \n isn't working on Windows if Git decides to check out the file using \r\n line-endings. May want to change it to [\r\n] instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Did you see it does not work? I think, it's better to replace it with a system property line.separator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. It complained ILLEGAL character and that string is still shown as multiline in developer console. I think the line.separator wouldn't work at least for my case where I'm compiling on Windows but deploying on Linux. Don't ask me why :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I opened a new PR #7133 for the issue you mentioned.

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #979 has finished for PR 7082 at commit b29231d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member Author

zsxwing commented Jun 29, 2015

retest this please

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #35985 has started for PR 7082 at commit b29231d.

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #35980 has finished for PR 7082 at commit b29231d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@sarutak
Copy link
Member

sarutak commented Jun 29, 2015

LGTM. After test "#35985" pass, I'll merge this into master.
Thanks @zsxwing for your contribution!

@zsxwing
Copy link
Member Author

zsxwing commented Jun 29, 2015

Oh, I see. could you file another one as a separate issue?

Opened a JIRA for this issue: https://issues.apache.org/jira/browse/SPARK-8705

@SparkQA
Copy link

SparkQA commented Jun 29, 2015

Test build #35985 has finished for PR 7082 at commit b29231d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants