[SPARK-12345][MESOS] Filter SPARK_HOME when submitting Spark jobs with Mesos cluster mode. #10332

Closed
wants to merge 1 commit into from

tnachen
Contributor

@tnachen tnachen commented Dec 16, 2015

SPARK_HOME is now causing problems with Mesos cluster mode, since the spark-submit script was recently changed to give SPARK_HOME precedence, if it's defined, when locating the spark-class scripts to run.

We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead.

@tnachen
Contributor Author

tnachen commented Dec 16, 2015

@dragos PTAL

@dragos
Contributor

dragos commented Dec 16, 2015

There seems to be a race condition :) @skyluc opened #10329, but the change there is in SparkSubmit. I wonder which one we should take. We tested #10329 locally and it passed. This will take a while to re-test.

@marmbrus
Contributor

I would lean towards this patch since it only affects Mesos and not standalone mode.

@dragos
Contributor

dragos commented Dec 16, 2015

Agreed.

@skyluc

skyluc commented Dec 16, 2015

Code LGTM. Unfortunately, I won't be able to try it for a couple of hours.

@tnachen
Contributor Author

tnachen commented Dec 16, 2015

Yes, I would also prefer to make changes only on the Mesos side and not risk any regression in standalone mode.

// with Mesos cluster mode since it's populated by default on the client and it will
// cause spark-submit script to look for files in SPARK_HOME instead.
// We only need the ability to specify where to find spark-submit script
// which the user can set via the spark.executor.home or spark.home configurations.
Contributor

I would add (SPARK-12345) here, but I'll fix this myself on merge.

@andrewor14
Contributor

LGTM merging into master and 1.6. Just FYI I might revert this patch in master because I believe #10329 is a better fix in the long run, but for now let's just unblock the release.

asfgit pushed a commit that referenced this pull request Dec 16, 2015
…h Mesos cluster mode.

SPARK_HOME is now causing problems with Mesos cluster mode, since the spark-submit script was recently changed to give SPARK_HOME precedence, if it's defined, when locating the spark-class scripts to run.

We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead.

Author: Timothy Chen <tnachen@gmail.com>

Closes #10332 from tnachen/scheduler_ui.

(cherry picked from commit ad8c1f0)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit asfgit closed this in ad8c1f0 Dec 16, 2015
@SparkQA

SparkQA commented Dec 16, 2015

Test build #47830 has finished for PR 10332 at commit baea28f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// cause spark-submit script to look for files in SPARK_HOME instead.
// We only need the ability to specify where to find spark-submit script
// which the user can set via the spark.executor.home or spark.home configurations.
val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME"))
Contributor

Unfortunately there is a subtle error here, and this is a no-op. And nobody ran this code, it seems.

Here's what happens: environmentVariables is a map, not a sequence. So filter works on Pairs, and a pair will never be equal to a string. The correct call would have been filterKeys.
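
To make the difference concrete, here is a minimal sketch, assuming a plain `Map[String, String]` as a stand-in for `request.environmentVariables`. The object name `FilterSparkHome` and the sample entries are hypothetical and not taken from the patch. It filters on the key with a pattern match, which has the same effect as `filterKeys` here and avoids depending on how `filterKeys` behaves in a particular Scala version.

```scala
// Hypothetical stand-in for request.environmentVariables (a Map, not a Seq).
object FilterSparkHome {
  val env: Map[String, String] = Map(
    "SPARK_HOME" -> "/opt/spark",   // illustrative value only
    "SPARK_ENV_LOADED" -> "1"
  )

  // Same shape as the call in the patch: the predicate receives (key, value)
  // tuples, and a tuple never equals a String, so no entry is dropped.
  val noOp: Map[String, String] = env.filter(!_.equals("SPARK_HOME"))

  // Matching on the key removes the SPARK_HOME entry as intended.
  val filtered: Map[String, String] =
    env.filter { case (key, _) => key != "SPARK_HOME" }

  def main(args: Array[String]): Unit = {
    println(noOp.contains("SPARK_HOME"))      // true: the entry is still there
    println(filtered.contains("SPARK_HOME"))  // false: the entry was removed
  }
}
```

Because `equals` accepts `Any`, the first call compiles without complaint, which is probably why the mistake was easy to miss.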

Unfortunately, this went into RC3 without the bug being fixed. It is harmless otherwise, but it highlights the fact that there are no easy fixes or safe changes. :-/

Contributor

That's really the problem, I think we should fix this.

Contributor Author

Hmm, interesting. I wonder if what I ran was different from the code in this patch, since I was able to see that SPARK_HOME was not passed through.
Thanks for retesting this. I think having automated tests is going to be crucial to prevent mistakes like the one I made here :(

Contributor

Yea, odd. I ran this against DCOS and didn't see the error.

ghost pushed a commit to dbtsai/spark that referenced this pull request Dec 17, 2015
… server

Fix problem with apache#10332, this one should fix Cluster mode on Mesos

Author: Iulian Dragos <jaguarul@gmail.com>

Closes apache#10359 from dragos/issue/fix-spark-12345-one-more-time.
asfgit pushed a commit that referenced this pull request Dec 17, 2015
… server

Fix problem with #10332, this one should fix Cluster mode on Mesos

Author: Iulian Dragos <jaguarul@gmail.com>

Closes #10359 from dragos/issue/fix-spark-12345-one-more-time.

(cherry picked from commit 8184568)
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
@andrewor14
Contributor

OK I'm going to go ahead and revert this patch since it doesn't work...

@andrewor14
Contributor

Oh wait, it looks like #10359, which fixes this, is already merged.

@dragos
Contributor

dragos commented Dec 17, 2015

Yeah, just about to comment on that @andrewor14

@andrewor14
Contributor

Note: I'm reverting this patch in master only since #10329, the better alternative, is merged there.
This patch continues to exist in branch-1.6.
