Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7524][SPARK-7846]add configs for keytab and principal, pass these two configs with different way in different modes #6051

Closed
wants to merge 6 commits into from

Conversation

WangTaoTheTonic
Copy link
Contributor

  • As spark now supports long running service by updating tokens for namenode, but only accept parameters passed with "--k=v" format which is not very convinient. This patch add spark.* configs in properties file and system property.
  • --principal and --keytabl options are passed to client but when we started thrift server or spark-shell these two are also passed into the Main class (org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and org.apache.spark.repl.Main).
    In these two main class, arguments passed in will be processed with some 3rd libraries, which will lead to some error: "Invalid option: --principal" or "Unrecgnised option: --principal".
    We should pass these command args in different forms, say system properties.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented May 11, 2015

Test build #32382 has started for PR 6051 at commit 73afa64.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented May 11, 2015

Test build #32383 has started for PR 6051 at commit 08bb4e8.

@SparkQA
Copy link

SparkQA commented May 11, 2015

Test build #32383 has finished for PR 6051 at commit 08bb4e8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32383/
Test FAILed.

@SparkQA
Copy link

SparkQA commented May 11, 2015

Test build #32382 has finished for PR 6051 at commit 73afa64.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32382/
Test PASSed.

@@ -169,6 +169,8 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
deployMode = Option(deployMode).orElse(env.get("DEPLOY_MODE")).orNull
numExecutors = Option(numExecutors)
.getOrElse(sparkProperties.get("spark.executor.instances").orNull)
keytab = Option(keytab).orElse(sparkProperties.get("spark.yarn.keytab")).orNull
principal = Option(principal).orElse(sparkProperties.get("spark.yarn.principal")).orNull
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please document the configs since you are making them public in the running-on-yarn.md

@harishreedharan
Copy link
Contributor

Have you actually run this one a cluster with the various options:

  • Use --principal + spark.yarn.keytab (and vice versa)
  • Both set and making sure command line gets picked up
  • Only command line and only spark.yarn.*

Also, I have not seen spark.internal. options elsewhere in Spark? Are there are other config options like this? I am wondering if it makes sense to just re-use the current configs? If the command line params are set, just overwrite the ones we read from the configuration?

@tgravescs
Copy link
Contributor

No spark doesn't use the .internal. elsewhere. Please don't rename any of the ones we haven't documented as internal.

I agree that we should just expand the use of the current ones since they are going to be propagated anyway.

@tgravescs
Copy link
Contributor

@WangTaoTheTonic can you update based on comments?

@WangTaoTheTonic
Copy link
Contributor Author

@tgravescs Okay, I will do it right now.

@AmplabJenkins
Copy link

Build triggered.

@AmplabJenkins
Copy link

Build started.

@WangTaoTheTonic WangTaoTheTonic changed the title [SPARK-7524][YARN]add configs for keytab and principal, move originals to internal [SPARK-7524][SPARK-7846][YARN]add configs for keytab and principal, move originals to internal May 24, 2015
@SparkQA
Copy link

SparkQA commented May 24, 2015

Test build #33438 has started for PR 6051 at commit ecfe43a.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented May 24, 2015

Test build #33439 has started for PR 6051 at commit ebd9ea0.

@AmplabJenkins
Copy link

Merged build finished. Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33439/
Test FAILed.

@SparkQA
Copy link

SparkQA commented May 24, 2015

Test build #33438 has finished for PR 6051 at commit ecfe43a.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Build finished. Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33438/
Test FAILed.

@SparkQA
Copy link

SparkQA commented May 24, 2015

Test build #33441 has finished for PR 6051 at commit e65699a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33441/
Test PASSed.

@WangTaoTheTonic
Copy link
Contributor Author

I updated per comments and add fix for SPARK-7846. Haven't tested as there is some problem with my cluster and will do it ASAP.

@WangTaoTheTonic
Copy link
Contributor Author

@harishreedharan
I have tested per your comments, using spark-shell:

  • Use --principal + spark.yarn.keytab (and vice versa) (double check)
  • Both set and making sure command line gets picked up (check)
  • Only command line (check) and only spark.yarn.* (check)

It all worked fine.

@WangTaoTheTonic
Copy link
Contributor Author

ping? @harishreedharan @tgravescs

@tgravescs
Copy link
Contributor

lgtm. going to kick jenkins once more.

@tgravescs
Copy link
Contributor

Jenkins, test this please

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented May 29, 2015

Test build #33736 has started for PR 6051 at commit e65699a.

@SparkQA
Copy link

SparkQA commented May 29, 2015

Test build #33736 has finished for PR 6051 at commit e65699a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@tgravescs
Copy link
Contributor

@WangTaoTheTonic can you please fix the description on this pr. Remove the [YARN]1 from heading.

@WangTaoTheTonic WangTaoTheTonic changed the title [SPARK-7524][SPARK-7846][YARN]1. add configs for keytab and principal 2.pass these two configs with different way in different modes [SPARK-7524][SPARK-7846]1. add configs for keytab and principal 2.pass these two configs with different way in different modes May 29, 2015
@WangTaoTheTonic WangTaoTheTonic changed the title [SPARK-7524][SPARK-7846]1. add configs for keytab and principal 2.pass these two configs with different way in different modes [SPARK-7524][SPARK-7846]add configs for keytab and principal, pass these two configs with different way in different modes May 29, 2015
@WangTaoTheTonic
Copy link
Contributor Author

@tgravescs Done.

@asfgit asfgit closed this in a51b133 May 29, 2015
@harishreedharan
Copy link
Contributor

Sorry, I am late to this to this one. Could you create a followup PR to update the security.md file with info about running in client vs cluster modes and that this is supported only in cluster mode via command line params.

@tgravescs - If 1.4 gets released without this patch, there will be a regression in client mode (--principal and --keytab won't work in client mode). Should this + docs for this one also go into 1.4?

@tgravescs
Copy link
Contributor

@harishreedharan I'm pretty sure the --keytab and --principal options still work in client mode to SparkSubmit. Its simply the way it passes it on to the client. in yarn client mode it uses the config and in yarn cluster mode it uses the command line options.

OptionAssigner(args.principal, YARN, CLIENT, sysProp = "spark.yarn.principal"),
OptionAssigner(args.keytab, YARN, CLIENT, sysProp = "spark.yarn.keytab"),

I'm building now and will test it to double check. Did you try it yourself?

That would not be a blocker for 1.4 though. We would revert and fix.

@harishreedharan
Copy link
Contributor

Ah, right - my bad! I wasn't reading the code properly. I will try it out later today to confirm.

@WangTaoTheTonic
Copy link
Contributor Author

@harishreedharan How about the confirm? Should we do some changes in branh 1.4?

jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
…these two configs with different way in different modes

* As spark now supports long running service by updating tokens for namenode, but only accept parameters passed with "--k=v" format which is not very convinient. This patch add spark.* configs in properties file and system property.

*  --principal and --keytabl options are passed to client but when we started thrift server or spark-shell these two are also passed into the Main class (org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and org.apache.spark.repl.Main).
In these two main class, arguments passed in will be processed with some 3rd libraries, which will lead to some error: "Invalid option: --principal" or "Unrecgnised option: --principal".
We should pass these command args in different forms, say system properties.

Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes apache#6051 from WangTaoTheTonic/SPARK-7524 and squashes the following commits:

e65699a [WangTaoTheTonic] change logic to loadEnvironments
ebd9ea0 [WangTaoTheTonic] merge master
ecfe43a [WangTaoTheTonic] pass keytab and principal seperately in different mode
33a7f40 [WangTaoTheTonic] expand the use of the current configs
08bb4e8 [WangTaoTheTonic] fix wrong cite
73afa64 [WangTaoTheTonic] add configs for keytab and principal, move originals to internal
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
…these two configs with different way in different modes

* As spark now supports long running service by updating tokens for namenode, but only accept parameters passed with "--k=v" format which is not very convinient. This patch add spark.* configs in properties file and system property.

*  --principal and --keytabl options are passed to client but when we started thrift server or spark-shell these two are also passed into the Main class (org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and org.apache.spark.repl.Main).
In these two main class, arguments passed in will be processed with some 3rd libraries, which will lead to some error: "Invalid option: --principal" or "Unrecgnised option: --principal".
We should pass these command args in different forms, say system properties.

Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes apache#6051 from WangTaoTheTonic/SPARK-7524 and squashes the following commits:

e65699a [WangTaoTheTonic] change logic to loadEnvironments
ebd9ea0 [WangTaoTheTonic] merge master
ecfe43a [WangTaoTheTonic] pass keytab and principal seperately in different mode
33a7f40 [WangTaoTheTonic] expand the use of the current configs
08bb4e8 [WangTaoTheTonic] fix wrong cite
73afa64 [WangTaoTheTonic] add configs for keytab and principal, move originals to internal
mccheah pushed a commit to palantir/spark that referenced this pull request Feb 9, 2016
…these two configs with different way in different modes

* As spark now supports long running service by updating tokens for namenode, but only accept parameters passed with "--k=v" format which is not very convinient. This patch add spark.* configs in properties file and system property.

*  --principal and --keytabl options are passed to client but when we started thrift server or spark-shell these two are also passed into the Main class (org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and org.apache.spark.repl.Main).
In these two main class, arguments passed in will be processed with some 3rd libraries, which will lead to some error: "Invalid option: --principal" or "Unrecgnised option: --principal".
We should pass these command args in different forms, say system properties.

Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes apache#6051 from WangTaoTheTonic/SPARK-7524 and squashes the following commits:

e65699a [WangTaoTheTonic] change logic to loadEnvironments
ebd9ea0 [WangTaoTheTonic] merge master
ecfe43a [WangTaoTheTonic] pass keytab and principal seperately in different mode
33a7f40 [WangTaoTheTonic] expand the use of the current configs
08bb4e8 [WangTaoTheTonic] fix wrong cite
73afa64 [WangTaoTheTonic] add configs for keytab and principal, move originals to internal

Conflicts:
	docs/running-on-yarn.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants