Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading remote resources in yarn client mode #19074

Closed

Conversation

jerryshao
Copy link
Contributor

What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is #18962.

How was this patch tested?

Tested in local UT.

…arn client mode

With SPARK-10643, Spark supports download resources from remote in client deploy mode. But the implementation overrides variables which representing added resources (like `args.jars`, `args.pyFiles`) to local path, And yarn client leverage this local path to re-upload resources to distributed cache. This is unnecessary to break the semantics of putting resources in a shared FS. So here proposed to fix it.

This is manually verified with jars, pyFiles in local and remote storage, both in client and cluster mode.

Author: jerryshao <sshao@hortonworks.com>

Closes apache#18962 from jerryshao/SPARK-21714.

(cherry picked from commit 1813c4a)
Signed-off-by: jerryshao <sshao@hortonworks.com>

Change-Id: Ib2e8cb056707b362bc1c496002bac1472dc78ea7
@SparkQA
Copy link

SparkQA commented Aug 29, 2017

Test build #81201 has started for PR 19074 at commit 9c5b562.

@jerryshao
Copy link
Contributor Author

CC @vanzin @tgravescs please review. Since 2.2 and master are quite different in this part of code, so backporting changes a lot.

@jerryshao
Copy link
Contributor Author

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Aug 29, 2017

Test build #81205 has finished for PR 19074 at commit 9c5b562.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • * (4) the main class for the child

@vanzin
Copy link
Contributor

vanzin commented Aug 29, 2017

LGTM. Merging to 2.2.

@vanzin
Copy link
Contributor

vanzin commented Aug 29, 2017

Merged. @jerryshao please close manually.

asfgit pushed a commit that referenced this pull request Aug 29, 2017
…ces in yarn client mode

## What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is #18962.

## How was this patch tested?

Tested in local UT.

Author: jerryshao <sshao@hortonworks.com>

Closes #19074 from jerryshao/SPARK-21714-2.2-backport.
@vanzin
Copy link
Contributor

vanzin commented Aug 29, 2017

@jerryshao looks like this breaks the 2.10 build:
59529b2#commitcomment-23954704

I reverted the patch waiting for the fix.

@jerryshao
Copy link
Contributor Author

Ohh, sorry about it, I forgot to fix it in 2.10 repl code. I will push a fix soon.

BTW how do we trigger scala 2.10 build on Jenkins?

@vanzin
Copy link
Contributor

vanzin commented Aug 30, 2017

I don't think there's a way to do that (the build Sean mentions happens after PRs are committed).

@jerryshao
Copy link
Contributor Author

OK, so I will do the test locally.

Change-Id: Ic7ae6152187ce7ee2e527340ecc7e178da970d67
@jerryshao
Copy link
Contributor Author

@vanzin @srowen pushed another commit to change 2.10 repl code, I tested locally with 2.10 code, please review.

@SparkQA
Copy link

SparkQA commented Aug 30, 2017

Test build #81242 has finished for PR 19074 at commit 871a005.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 30, 2017

Test build #81243 has finished for PR 19074 at commit 6ba3367.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Copy link
Contributor

vanzin commented Aug 30, 2017

Alright, let's try again. Merging to 2.2; make sure to close this PR manually.

asfgit pushed a commit that referenced this pull request Aug 30, 2017
…ces in yarn client mode

## What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is #18962.

## How was this patch tested?

Tested in local UT.

Author: jerryshao <sshao@hortonworks.com>

Closes #19074 from jerryshao/SPARK-21714-2.2-backport.
@jerryshao
Copy link
Contributor Author

Thanks @vanzin , it should be passed now 😄 .

@jerryshao jerryshao closed this Aug 31, 2017
Copy link
Contributor

@loneknightpy loneknightpy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jerryshao @gatorsmile This PR broke spark submit file downloading in standalone cluster mode. Basically, it doesn't use local file consistently after download. Can you take a look?

@@ -366,7 +376,7 @@ object SparkSubmit extends CommandLineUtils {
// If a python file is provided, add it to the child arguments and list of files to deploy.
// Usage: PythonAppRunner <main python file> <extra python files> [app arguments]
args.mainClass = "org.apache.spark.deploy.PythonRunner"
args.childArgs = ArrayBuffer(args.primaryResource, args.pyFiles) ++ args.childArgs
args.childArgs = ArrayBuffer(localPrimaryResource, localPyFiles) ++ args.childArgs
if (clusterManager != YARN) {
// The YARN backend distributes the primary file differently, so don't merge it.
args.files = mergeFileLists(args.files, args.primaryResource)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a behavior change here. I think we should use localFiles and localPrimaryResource instead of args.files and args.primaryResource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think it should use local files/primary resource here? args.files will be assigned to "spark.files" for non-yarn deploy, and Spark's fileserver will download them to local for all the executors, so it should be fine to keep as remote resources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here the changes restricted to client deploy mode, I'm not sure why it is related to standalone cluster mode?

@@ -376,8 +386,8 @@ object SparkSubmit extends CommandLineUtils {
// The YARN backend handles python files differently, so don't merge the lists.
args.files = mergeFileLists(args.files, args.pyFiles)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same reason as above.

var localPrimaryResource: String = null
var localJars: String = null
var localPyFiles: String = null
var localFiles: String = null
if (deployMode == CLIENT) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to avoid download for yarn, can we just check the cluster mode it here?

@vanzin
Copy link
Contributor

vanzin commented Sep 15, 2017

@loneknightpy can you file a new bug instead of comment on a closed PR?

@jiangxb1987
Copy link
Contributor

ping @jerryshao

@jerryshao
Copy link
Contributor Author

@loneknightpy did you open a new JIRA about this issue?

AFAIK, downloading resources to local disk is not supported for cluster mode even from beginning, would you please elaborate the issue in JIRA, thanks!

@jerryshao
Copy link
Contributor Author

@loneknightpy can you please elaborate more about the issue?

I believe you brought this remote resources support in #18078. It doesn't support cluster mode from beginning. Also your original works brings in regression in running on yarn, that's why I made a change here.

Maybe there's some changes for standalone cluster mode which assumes local resources, but I don't think it's a expected behavior.

MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
…ces in yarn client mode

## What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is apache#18962.

## How was this patch tested?

Tested in local UT.

Author: jerryshao <sshao@hortonworks.com>

Closes apache#19074 from jerryshao/SPARK-21714-2.2-backport.
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
…ces in yarn client mode

## What changes were proposed in this pull request?

This is a backport PR to fix issue of re-uploading remote resource in yarn client mode. The original PR is apache#18962.

## How was this patch tested?

Tested in local UT.

Author: jerryshao <sshao@hortonworks.com>

Closes apache#19074 from jerryshao/SPARK-21714-2.2-backport.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants