
[SPARK-45175][K8S] download krb5.conf from remote storage in spark-submit on k8s#42943

Closed
dcoliversun wants to merge 1 commit into apache:master from dcoliversun:SPARK-45175

@dcoliversun dcoliversun commented Sep 15, 2023

What changes were proposed in this pull request?

Currently, krb5.conf can only be supplied as a local file. Tenants would like to store this file on their own servers and have it downloaded during the spark-submit phase, which makes multi-tenant scenarios easier to implement.

Why are the changes needed?

Currently, spark.kubernetes.kerberos.krb5.path supports only local files.

With this PR, we can do the following.

bin/spark-submit \
...
--conf spark.kubernetes.kerberos.krb5.path=s3a://tenant1/krb5.conf \
...
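
For comparison, a similar effect can be approximated today by staging the file locally before calling spark-submit. The sketch below is not part of this PR or of Spark: `resolve_krb5_path` is a hypothetical helper, and it assumes the AWS CLI is installed and already has credentials for the bucket (the CLI takes `s3://` URIs rather than `s3a://`).

```shell
#!/bin/sh
# Hypothetical helper: turn a krb5.conf location into a local path, which is
# the only form spark.kubernetes.kerberos.krb5.path accepts without this PR.
resolve_krb5_path() {
  src=$1
  case "$src" in
    s3://*)
      # Assumption: the AWS CLI is available and configured for this bucket.
      dest=$(mktemp) || return 1
      aws s3 cp "$src" "$dest" >/dev/null || return 1
      printf '%s\n' "$dest"
      ;;
    *)
      # Already local: pass the path through if the file exists.
      [ -f "$src" ] || return 1
      printf '%s\n' "$src"
      ;;
  esac
}
```

The printed path could then be passed as `--conf spark.kubernetes.kerberos.krb5.path="$(resolve_krb5_path s3://tenant1/krb5.conf)"`. As with the PR itself, this only works for storage that does not require a Kerberos login.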

Does this PR introduce any user-facing change?

Yes, this is an improvement.

How was this patch tested?

Add k8s integration test.

Was this patch authored or co-authored using generative AI tooling?

No

@dcoliversun dcoliversun marked this pull request as draft September 15, 2023 06:50
@dcoliversun dcoliversun changed the title from "[WIP][SPARK-45175][K8S] download krb5.conf from remote storage in spark-sumbit on k8s" to "[SPARK-45175][K8S] download krb5.conf from remote storage in spark-submit on k8s" Sep 15, 2023
@dcoliversun dcoliversun marked this pull request as ready for review September 15, 2023 09:52
@dcoliversun
Contributor Author

@dongjoon-hyun It would be good if you have time to review this PR

@yaooqinn
Member

What if the remote storage requires login via Kerberos before accessing it?

@dongjoon-hyun
Member

I have the same question as @yaooqinn. Since this is in the security domain, I'm wondering whether this is a safe or recommended approach for Kerberos.

@dcoliversun
Contributor Author

@dongjoon-hyun @yaooqinn Thanks for your review, and this is a good question. The specific scenario for this PR is letting users keep krb5.conf on cloud storage that authenticates with an AccessKey. This PR does not support remote services authenticated as Kerberos. I haven't thought of an implementation plan for that yet. Do you have any suggestions?

@dongjoon-hyun
Member

I'm just wondering whether this is a recommended approach in the Kerberos community. Either way, you are proposing to bypass the Kerberos security environment in order to download krb5.conf and eventually use Kerberos-enabled remote storage.

> This PR does not support remote services authenticated as Kerberos.

@dcoliversun
Contributor Author

dcoliversun commented Sep 22, 2023

@dongjoon-hyun @yaooqinn
I haven't found any best practice or recommended way from the Kerberos community for downloading krb5.conf.

Back to the multi-tenant scenario: I found that Apache Spark provides spark.kubernetes.kerberos.krb5.configMapName to mount a ConfigMap containing the krb5.conf file, so we could manage these files by creating a separate ConfigMap for each tenant.
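
A minimal sketch of that ConfigMap-per-tenant setup follows. The `krb5-<tenant>` naming scheme and the helper functions are hypothetical, and the ConfigMap key is assumed to be `krb5.conf`. The functions only print the commands, so they can be reviewed before being run against a cluster.

```shell
#!/bin/sh
# Hypothetical naming scheme: one ConfigMap per tenant, e.g. "krb5-tenant1".
krb5_configmap_name() {
  printf 'krb5-%s' "$1"
}

# Print (do not run) the kubectl command that would create a tenant's ConfigMap.
# Assumption: the file is stored under the key "krb5.conf".
print_create_configmap() {
  tenant=$1
  conf_path=$2
  printf 'kubectl create configmap %s --from-file=krb5.conf=%s\n' \
    "$(krb5_configmap_name "$tenant")" "$conf_path"
}

# Print the spark-submit flag that points Spark at that tenant's ConfigMap.
print_submit_conf() {
  printf '%s\n' \
    "--conf spark.kubernetes.kerberos.krb5.configMapName=$(krb5_configmap_name "$1")"
}

# Example: show both commands for a tenant called "tenant1".
print_create_configmap tenant1 /etc/tenant1/krb5.conf
print_submit_conf tenant1
```

The ConfigMap is expected to live in the same namespace as the driver and executor pods, so a real invocation would likely also need the appropriate `-n <namespace>` on the kubectl command.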

Since this PR cannot support Kerberos-based services, I will close it.

And thanks for your time :)

@dongjoon-hyun
Member

Thank you for your decision, @dcoliversun.

@dcoliversun dcoliversun deleted the SPARK-45175 branch September 25, 2023 01:15