[SPARK-33720][K8S] Support submit to k8s only with token #30684

Closed
wants to merge 2 commits into from

hddong
Contributor

@hddong hddong commented Dec 9, 2020

What changes were proposed in this pull request?

Support submitting to k8s with only a token.

Why are the changes needed?

Currently, submitting to k8s always needs OAuth files.

Does this PR introduce any user-facing change?

How was this patch tested?

Before this change, submitting a job from outside the k8s cluster without the correct ca.crt may fail with this exception:

Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:439)
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:306)
        at sun.security.validator.Validator.validate(Validator.java:271)
        at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:312)

When spark.kubernetes.trust.certificates is set to true, we can submit with only a valid token; there is no need to configure ca.crt in the local environment.
Submit as:

 bin/spark-submit \
     --master $master \
     --name pi \
     --deploy-mode cluster \
     --conf spark.kubernetes.container.image=$image \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.authenticate.submission.oauthToken=$clusterToken \
     --conf spark.kubernetes.trust.certificates=true \
     local:///opt/spark/examples/src/main/python/pi.py 200

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-33720][Kubernetes]Support submit to k8s only with token [SPARK-33720][K8S] Support submit to k8s only with token Dec 9, 2020
@dongjoon-hyun
Member

ok to test

val KUBERNETES_TRUST_CERTIFICATES =
  ConfigBuilder("spark.kubernetes.trust.certificates")
    .doc("If set to true then client can submit to kubernetes cluster only with token")
    .version("3.0.2")
Member

Hi, @hddong .
This should be 3.2.0 because this is a new feature and master branch's version is 3.2.0-SNAPSHOT.

Member

@dongjoon-hyun dongjoon-hyun left a comment

In addition to https://github.com/apache/spark/pull/30684/files#r539500957, could you describe the environment where this PR is helpful? How can we verify this PR? Without testing or verification, this PR is not mergeable. Please write the reproducible procedure in the PR description instead of saying no need, please.

How was this patch tested?

no need

@SparkQA

SparkQA commented Dec 9, 2020

Test build #132501 has finished for PR 30684 at commit 8ddb290.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 9, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37103/

@SparkQA

SparkQA commented Dec 9, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37103/

@hddong
Contributor Author

hddong commented Dec 14, 2020

@dongjoon-hyun : thanks for your review, I have addressed the comments.

@SparkQA

SparkQA commented Dec 14, 2020

Test build #132767 has finished for PR 30684 at commit 14e1cac.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 14, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37369/

@SparkQA

SparkQA commented Dec 14, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37369/

Member

@dongjoon-hyun dongjoon-hyun left a comment

@hddong Why don't we use insecure-skip-tls-verify in .kubeconfig?

@AmplabJenkins

Can one of the admins verify this patch?

@hddong
Contributor Author

hddong commented Jan 19, 2021

@dongjoon-hyun : sorry for the late reply. insecure-skip-tls-verify is an option when a .kubeconfig is available. With oauthToken, we can submit without a .kubeconfig, using just the master URL and the oauthToken.
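
For comparison, the kubeconfig route discussed here could be set up roughly as follows. This is only a sketch: the cluster, user, and context names (`demo-cluster`, `demo-user`, `demo-context`) and the function name `configure_insecure_context` are made up for illustration, and it assumes `kubectl` is installed on the submitting machine, which is the dependency the token-only flow avoids.

```shell
#!/usr/bin/env bash
# Sketch: register a cluster in ~/.kube/config with TLS verification disabled.
# All names below are hypothetical placeholders, not part of this PR.
configure_insecure_context() {
  local server="$1"   # API server URL, e.g. https://host:6443
  local token="$2"    # bearer token for the submitting user

  # Store the cluster entry and skip server certificate verification.
  kubectl config set-cluster demo-cluster \
    --server="$server" \
    --insecure-skip-tls-verify=true &&
  # Store the token as the user's credential.
  kubectl config set-credentials demo-user --token="$token" &&
  # Tie cluster and user together, then make the context current.
  kubectl config set-context demo-context \
    --cluster=demo-cluster --user=demo-user &&
  kubectl config use-context demo-context
}
```

These steps would have to be repeated, or the file edited, for every new cluster, whereas the token-only submission carries everything on the spark-submit command line.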

@dongjoon-hyun
Member

I'm wondering when .kube/config would be missing. In general, Apache Spark assumes an environment like "A running Kubernetes cluster at version >= 1.6 with access configured to it using kubectl."

sorry for the late reply. insecure-skip-tls-verify is an option when a .kubeconfig is available. With oauthToken, we can submit without a .kubeconfig, using just the master URL and the oauthToken.

@hddong
Contributor Author

hddong commented Jan 20, 2021

@dongjoon-hyun : We can submit directly with a token when the client has no k8s client tooling (kubectl and .kube/config) in its environment. Otherwise, each time we submit to a new k8s cluster, we need to modify .kube/config. spark.kubernetes.trust.certificates makes it simpler to submit our jars.

@jho

jho commented Feb 22, 2021

I have a project that would greatly benefit from this patch as well. The ability to use .kube/config is nice, but being able to use the spark-submit params to dynamically adjust which cluster we are submitting to is a better approach for us. AWS EKS k8s clusters use self-signed, load-balanced certs per cluster, and this leaves us with lots of PKIX errors.
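
The per-cluster switching described here could be wrapped in a small script. The following is a minimal sketch that reuses the flags from the PR description; the function name `submit_pi` and the `DRY_RUN` switch are hypothetical additions for illustration, and `SPARK_HOME` is assumed to point at a Spark distribution that includes this patch (3.2.0 or later).

```shell
#!/usr/bin/env bash
# Sketch: submit the Pi example to whichever cluster is passed in,
# using only the master URL and an OAuth token (no .kube/config).
submit_pi() {
  local master="$1"   # e.g. k8s://https://host:6443
  local image="$2"    # Spark container image
  local token="$3"    # OAuth token valid for that cluster

  # Rebuild the positional parameters as the spark-submit argument list.
  set -- \
    --master "$master" \
    --name pi \
    --deploy-mode cluster \
    --conf "spark.kubernetes.container.image=$image" \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf "spark.kubernetes.authenticate.submission.oauthToken=$token" \
    --conf spark.kubernetes.trust.certificates=true \
    local:///opt/spark/examples/src/main/python/pi.py 200

  if [ -n "${DRY_RUN:-}" ]; then
    # Print one argument per line instead of running anything.
    # Note that this output includes the token.
    printf '%s\n' "${SPARK_HOME:-.}/bin/spark-submit" "$@"
  else
    "${SPARK_HOME:-.}/bin/spark-submit" "$@"
  fi
}
```

Calling `submit_pi "$masterA" "$image" "$tokenA"` and then `submit_pi "$masterB" "$image" "$tokenB"` would target two clusters in a row without touching any local Kubernetes configuration.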

@hddong
Contributor Author

hddong commented Mar 24, 2021

@dongjoon-hyun: what do you think about this PR?

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 24, 2021

Sorry for the delay, @hddong . I'll review right now again.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. I verified this manually.
Merged to master for Apache Spark 3.2.0.

@dongjoon-hyun
Member

cc @attilapiros , too.

flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
Closes apache#30684 from hddong/trust-certs.

Authored-by: hongdongdong <hongdongdong@cmss.chinamobile.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 985c653)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>