[SPARK-12345][MESOS] Filter SPARK_HOME when submitting Spark jobs with Mesos cluster mode. #10332
@dragos PTAL
I would lean towards this patch since it only affects mesos and not standalone mode.
Agreed.
Code LGTM. Unfortunately, I cannot try it before a couple of hours.
Yes I would also want to just make changes on the Mesos side and not cause any possible regression on standalone.
// with Mesos cluster mode since it's populated by default on the client and it will
// cause spark-submit script to look for files in SPARK_HOME instead.
// We only need the ability to specify where to find spark-submit script
// which user can user spark.executor.home or spark.home configurations.
I would add (SPARK-12345) here, but I'll fix this myself on merge.
LGTM, merging into master and 1.6. Just FYI I might revert this patch in master because I believe #10329 is a better fix in the long run, but for now let's just unblock the release.
…h Mesos cluster mode. SPARK_HOME is now causing problem with Mesos cluster mode since spark-submit script has been changed recently to take precedence when running spark-class scripts to look in SPARK_HOME if it's defined. We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead. Author: Timothy Chen <tnachen@gmail.com> Closes #10332 from tnachen/scheduler_ui. (cherry picked from commit ad8c1f0) Signed-off-by: Andrew Or <andrew@databricks.com>
Test build #47830 has finished for PR 10332 at commit
// cause spark-submit script to look for files in SPARK_HOME instead.
// We only need the ability to specify where to find spark-submit script
// which user can user spark.executor.home or spark.home configurations.
val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME"))
Unfortunately there is a subtle error here, and this is a no-op. And nobody ran this code, it seems.
Here's what happens: `environmentVariables` is a map, not a sequence. So `filter` works on Pairs, and a pair will never be equal to a string. The correct call would have been `filterKeys`.
Unfortunately this went in RC3 without fixing the bug. It is harmless otherwise, but highlights the fact that there are no easy fixes or safe changes. :-/
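To make the no-op concrete, here is a minimal standalone sketch. The `env` map is a hypothetical stand-in for `request.environmentVariables`, and the values are made up. It is not necessarily how #10359 fixed the problem. It uses `filter` with a pattern match on the key and the `-` operator rather than `filterKeys`, because `filterKeys` on `Map` is deprecated in newer Scala versions (2.13 and later) in favor of `.view.filterKeys`.

```scala
object FilterSparkHome {
  // Hypothetical client environment, standing in for request.environmentVariables.
  val env: Map[String, String] = Map(
    "SPARK_HOME" -> "/opt/spark",
    "SPARK_ENV_LOADED" -> "1"
  )

  // As merged: each element handed to filter is a (key, value) tuple,
  // and a tuple never equals a String, so nothing is removed.
  val asMerged: Map[String, String] = env.filter(!_.equals("SPARK_HOME"))

  // Filter on the key of each pair instead.
  val byKey: Map[String, String] = env.filter { case (k, _) => k != "SPARK_HOME" }

  // Equivalent and shorter: drop the key directly.
  val removed: Map[String, String] = env - "SPARK_HOME"

  def main(args: Array[String]): Unit = {
    println(asMerged.contains("SPARK_HOME")) // true: the filter removed nothing
    println(byKey.contains("SPARK_HOME"))    // false
    println(removed == byKey)                // true
  }
}
```

The compiler accepts the buggy line because `equals` takes `Any`, which is why this slipped through review without a test exercising it.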
That's really the problem, I think we should fix this.
Hmm, interesting. I wonder if I ran something different from what my code has, since I was able to see it not passed through.
Thanks for retesting this, I think having the automated tests is going to be crucial to prevent mistakes like this that I'm making :(
Yea, odd. I ran this against DCOS and didn't see the error.
… server Fix problem with #10332, this one should fix Cluster mode on Mesos Author: Iulian Dragos <jaguarul@gmail.com> Closes #10359 from dragos/issue/fix-spark-12345-one-more-time. (cherry picked from commit 8184568) Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
OK I'm going to go ahead and revert this patch since it doesn't work...
Oh wait, looks like #10359 which fixes this is already merged.
Yeah, just about to comment on that @andrewor14
Note: I'm reverting this patch in master only since #10329, the better alternative, is merged there.
SPARK_HOME is now causing problems with Mesos cluster mode, since the spark-submit script was recently changed to look in SPARK_HOME first, if it is defined, when running spark-class scripts.
We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead.
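
As a rough illustration of the intent, the sketch below resolves the Spark home from configuration only and never consults the client's SPARK_HOME. `sparkHomeFromConf` is a hypothetical helper, not Spark's API, and the lookup order (spark.executor.home first, then spark.home) is an assumption; the description names both keys but not their precedence.

```scala
object SparkSubmitLocation {
  // Hypothetical helper: choose the directory that holds the spark-submit
  // script from configuration alone, ignoring the SPARK_HOME environment
  // variable sent by the client.
  // Assumption: spark.executor.home is preferred over spark.home.
  def sparkHomeFromConf(conf: Map[String, String]): Option[String] =
    conf.get("spark.executor.home").orElse(conf.get("spark.home"))

  def main(args: Array[String]): Unit = {
    // Illustrative path only.
    val conf = Map("spark.home" -> "/opt/spark-on-agent")
    println(sparkHomeFromConf(conf))
  }
}
```

Returning an `Option` leaves it to the caller to decide what to do when neither key is set.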