[SPARK-1680] [Docs] Explain environment variables for running on YARN in cluster mode #10869
…ark on YARN in cluster mode, which is a special case.
Can one of the admins verify this patch? |
CC @sryza @vanzin @tgravescs for a check |
you should file a separate jira for this. The original went in and is closed. I'm fine with the text, it might be nice to add something about preferring the use of spark.executorEnv.[EnvironmentVariableName] over the spark-env.sh file. |
It looks like the original JIRA status is "RESOLVED" but it was never "CLOSED" and so it is currently showing this pull request, which is probably a good thing. Since these documentation changes relate to the earlier code changes, I think it makes sense to put them on the same JIRA if possible. Otherwise, I'd be tempted to call this doc change "trivial" and skip creating a new JIRA. Can you assist with the text relating to spark.executorEnv.[EnvironmentVariableName]? I don't have a deep understanding of when to use the property vs. when to use the .sh file. |
@weineran agree with leaving it as is; my full logic: since that JIRA is so old, and this change is still logically separable (i.e. one makes sense without the other), it's reasonable to make a new JIRA. However, it's also trivial (i.e. diff ~= description of change), so in that sense it doesn't really matter. |
@@ -1700,6 +1700,8 @@ to use on each machine and maximum memory.
Since `spark-env.sh` is a shell script, some of these can be set programmatically -- for example, you might
compute `SPARK_LOCAL_IP` by looking up the IP of a specific network interface.

Note: When running Spark on YARN in cluster mode, environment variables need to be set using the <code>spark.yarn.appMasterEnv.[EnvironmentVariableName]</code> property in your `conf/spark-defaults.conf` file. Environment variables that are set in `spark-env.sh` will not be reflected in the YARN Application Master process in cluster mode. See the [YARN-related Spark Properties](running-on-yarn.html#spark-properties) for more information.
Nit -- this needs back-ticks rather than `<code>`, right? I think it's OK.
Ah well that could be. Does the `<code>` tag only get used in tables? Let me know if I should switch to back-ticks.
Just realized I should throw some back-ticks around `cluster` too.
Yeah, as I recall `<code>` is required in tables since it occurs within the `<table>` tag and ticks aren't parsed there. But otherwise use back ticks.
Merged to master |
JIRA 1680 added a property called spark.yarn.appMasterEnv. This PR draws users' attention to this special case by adding an explanation in configuration.html#environment-variables
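
For readers who want to see what the note means in practice, here is a minimal sketch in Python that renders environment variables into the property forms discussed in this PR: `spark.yarn.appMasterEnv.[EnvironmentVariableName]` for the YARN Application Master and `spark.executorEnv.[EnvironmentVariableName]` for executors. The helper names and the `JAVA_HOME` value are hypothetical and exist only for illustration; the sketch assumes simple values without whitespace and is not part of Spark itself.

```python
# Hypothetical helpers (not part of Spark) that build configuration entries
# for passing environment variables through Spark properties.

AM_PREFIX = "spark.yarn.appMasterEnv"   # YARN Application Master (cluster mode)
EXECUTOR_PREFIX = "spark.executorEnv"   # executor processes


def render_env_properties(env, prefix):
    """Return lines suitable for conf/spark-defaults.conf, one per variable."""
    return [f"{prefix}.{name} {value}" for name, value in sorted(env.items())]


def render_submit_flags(env, prefix):
    """Return the same settings as --conf arguments for spark-submit."""
    flags = []
    for name, value in sorted(env.items()):
        flags += ["--conf", f"{prefix}.{name}={value}"]
    return flags


if __name__ == "__main__":
    env = {"JAVA_HOME": "/opt/jdk"}  # assumed example value
    for line in render_env_properties(env, AM_PREFIX):
        print(line)
    for line in render_env_properties(env, EXECUTOR_PREFIX):
        print(line)
    print(" ".join(render_submit_flags(env, AM_PREFIX)))
```

The first two printed lines are the kind of entries that would go in `conf/spark-defaults.conf`; the last shows the equivalent `--conf` form for a single `spark-submit` invocation.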