[SPARK-18842][TESTS] De-duplicate paths in classpaths in processes for local-cluster mode in ReplSuite to work around the length limitation on Windows by HyukjinKwon · Pull Request #16398 · apache/spark

HyukjinKwon · 2016-12-25T13:21:08Z

What changes were proposed in this pull request?

ReplSuite tests hang on Windows because the executor command line exceeds the length limit, failing with the exception below:

Spark context available as 'sc' (master = local-cluster[1,1,1024], app id = app-20161223114000-0000).
Spark session available as 'spark'.
Exception in thread "ExecutorRunner for app-20161223114000-0000/26995" java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.util.Arrays.copyOf(Arrays.java:3332)
	at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
	at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
	at java.lang.StringBuilder.append(StringBuilder.java:202)
	at java.lang.ProcessImpl.createCommandLine(ProcessImpl.java:194)
	at java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
	at java.lang.ProcessImpl.start(ProcessImpl.java:137)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
	at org.apache.spark.deploy.worker.ExecutorRunner.org$apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:167)
	at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)

The hang occurs because the executor launch keeps failing and goes into an infinite loop. The launch fails because the tests take some classpath entries from URLs (via getFile), whereas other entries added afterward are normal local paths.
(url.getFile gives /C:/a/b/c, while the paths added later have the form C:\a\b\c.)
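To make the mismatch concrete, here is a minimal sketch (not code from this PR; the class and method names are hypothetical) showing that the string returned by `URL.getFile` and the equivalent local Windows path compare as different entries, so a collection of classpath strings keeps both:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class for illustration only; not part of Spark.
class ClasspathFormats {
    // Returns the path exactly as URL.getFile() reports it:
    // forward slashes, with a leading slash before the drive letter.
    static String fromUrl(String spec) {
        try {
            return URI.create(spec).toURL().getFile();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Counts how many distinct strings remain when the URL-derived path
    // and the plain local path are put in the same set.
    static int distinctEntries(String urlSpec, String localPath) {
        Set<String> entries = new LinkedHashSet<>();
        entries.add(fromUrl(urlSpec));
        entries.add(localPath);
        return entries.size();
    }

    public static void main(String[] args) {
        System.out.println(fromUrl("file:/C:/a/b/c"));                        // /C:/a/b/c
        System.out.println(distinctEntries("file:/C:/a/b/c", "C:\\a\\b\\c")); // 2
    }
}
```

Both strings refer to the same directory on Windows, but string comparison alone cannot tell.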

As a result, many classpath entries are duplicated because normal local paths and paths from URLs are mixed. The command line grows up to 40K characters, which exceeds the length limit (32K) on Windows.
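The idea of de-duplicating before building the command line can be sketched as follows. This is an illustration under stated assumptions, not the change made in this PR: `normalize`, `dedupe`, and `fitsOnWindows` are hypothetical helpers, the normalization only handles the two formats mentioned above, and the limit is taken here as 32,767 characters as an approximation of the 32K figure.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class for illustration only; not part of Spark.
class ClasspathDedup {
    // Assumed limit on the Windows command-line length, in characters.
    static final int WINDOWS_COMMAND_LINE_LIMIT = 32767;

    // Maps both "/C:/a/b/c" and "C:\a\b\c" to "C:\a\b\c".
    static String normalize(String path) {
        String p = path.replace('/', '\\');
        // Drop the leading separator that URL.getFile leaves before a drive letter.
        if (p.length() >= 3 && p.charAt(0) == '\\'
                && Character.isLetter(p.charAt(1)) && p.charAt(2) == ':') {
            p = p.substring(1);
        }
        return p;
    }

    // Removes duplicates after normalization, keeping first-seen order.
    static List<String> dedupe(List<String> entries) {
        Set<String> seen = new LinkedHashSet<>();
        for (String entry : entries) {
            seen.add(normalize(entry));
        }
        return new ArrayList<>(seen);
    }

    // Checks whether the entries, joined with the Windows path separator,
    // stay within the assumed limit.
    static boolean fitsOnWindows(List<String> entries) {
        return String.join(";", entries).length() <= WINDOWS_COMMAND_LINE_LIMIT;
    }
}
```

Keeping insertion order via `LinkedHashSet` matters for classpaths, since the first matching entry wins at class-loading time.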

The full command line that was built: https://gist.github.com/HyukjinKwon/46af7946c9a5fd4c6fc70a8a0aba1beb

How was this patch tested?

Manually via AppVeyor.

Before
https://ci.appveyor.com/project/spark-test/spark/build/395-find-path-issues

After
https://ci.appveyor.com/project/spark-test/spark/build/398-find-path-issues


HyukjinKwon · 2016-12-25T13:23:46Z

Build started: [TESTS] org.apache.spark.repl.ReplSuite
Diff: master...spark-test:288FF0F4-F91E-4FE6-89F6-C8C8D9F47A5B

HyukjinKwon · 2016-12-25T13:35:45Z

cc @srowen, this is a problem similar to the one in the last PR for this JIRA, and it makes the test hang. I think this is the last one. Could I please ask you to take a look?

SparkQA · 2016-12-25T15:47:07Z

Test build #70573 has finished for PR 16398 at commit ac87226.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2016-12-27T18:50:58Z

Merged to master

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#16398 from HyukjinKwon/SPARK-18842-more.

De-duplicate paths in classpaths in processes for local-cluster mode in ReplSuite to work around the length limitation on Windows

ac87226

asfgit closed this in d8e14db Dec 27, 2016

HyukjinKwon deleted the SPARK-18842-more branch January 2, 2018 03:43

HyukjinKwon wants to merge 1 commit into apache:master from HyukjinKwon:SPARK-18842-more
