Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-34316][K8S] Support spark.kubernetes.executor.disableConfigMap #31428

Closed
wants to merge 2 commits into from
Closed

[SPARK-34316][K8S] Support spark.kubernetes.executor.disableConfigMap #31428

wants to merge 2 commits into from

Conversation

dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Feb 2, 2021

What changes were proposed in this pull request?

This PR aims to add a new configuration spark.kubernetes.executor.disableConfigMap.

Why are the changes needed?

This can be use to disable config map creating for executor pods due to #27735 .

Does this PR introduce any user-facing change?

No. By default, this doesn't change AS-IS behavior.
This is a new feature to add an ability to disable SPARK-30985.

How was this patch tested?

Pass the newly added UT.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented Feb 2, 2021

Could you review this, @ScrapCodes ? There was a user request to disable the new feature of SPARK-30985 for some security reasons.

@SparkQA
Copy link

SparkQA commented Feb 2, 2021

Test build #134750 has finished for PR 31428 at commit ee9311b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39337/

@SparkQA
Copy link

SparkQA commented Feb 2, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39337/

@dongjoon-hyun
Copy link
Member Author

Could you review this, @holdenk ?

@ScrapCodes
Copy link
Member

Hi @dongjoon-hyun , Thank you for the PR. You mentioned due to a security issue, you would like add this config. Can you please describe the issue either here or if you think it makes sense then @Private list.

I have taken a look a cursory look, no issues found. :)

@ScrapCodes
Copy link
Member

Ok, +1 LGTM !

@dongjoon-hyun, Feel free to merge this.

@dongjoon-hyun
Copy link
Member Author

Thank you, @ScrapCodes . Yes, for security, some namespaces do not allow the driver pod to create new ConfigMaps. As a result, Spark jobs are unable to be executed in those namespaces. This unblocks Spark jobs in those namespace. As I wrote in the PR description, this doesn't aim to hurt the functionality of the original contribution. Instead, this PR aims to provide a workaround for those security enforced namespace.

@dongjoon-hyun
Copy link
Member Author

Merged to master.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-34316 branch February 2, 2021 06:26
@ScrapCodes
Copy link
Member

Ahh, makes sense then. Wish, we could autodetect and fallback too.

skestle pushed a commit to skestle/spark that referenced this pull request Feb 3, 2021
### What changes were proposed in this pull request?

This PR aims to add a new configuration `spark.kubernetes.executor.disableConfigMap`.

### Why are the changes needed?

This can be use to disable config map creating for executor pods due to apache#27735 .

### Does this PR introduce _any_ user-facing change?

No. By default, this doesn't change AS-IS behavior.
This is a new feature to add an ability to disable SPARK-30985.

### How was this patch tested?

Pass the newly added UT.

Closes apache#31428 from dongjoon-hyun/SPARK-34316.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
### What changes were proposed in this pull request?

This PR aims to add a new configuration `spark.kubernetes.executor.disableConfigMap`.

### Why are the changes needed?

This can be use to disable config map creating for executor pods due to apache#27735 .

### Does this PR introduce _any_ user-facing change?

No. By default, this doesn't change AS-IS behavior.
This is a new feature to add an ability to disable SPARK-30985.

### How was this patch tested?

Pass the newly added UT.

Closes apache#31428 from dongjoon-hyun/SPARK-34316.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit f66e38c)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@advancedxy
Copy link
Contributor

Hi, @dongjoon-hyun and @ScrapCodes, When I was developing #41196, I initially thought it may be desirable to reuse the SPARK_CONF_DIR config map, but it seems a bit of complex and therefore create a new ConfigMap instead.

However after some digging and found some limitation exposed by K8S:

  1. config map is an object, it may has quota limitation and add burdens to K8S's api server and etcd
  2. for access control/security reasons, it may not be possible to create new objects from driver pod(such as the cases in this PR)

Therefore I think there should be some rules when developing feature steps for spark on K8S regarding config maps:

  1. the config map should be created as less as possible, unless it's unavoidable, such as a new HADOOP_CONF map
  2. the config map should only be created on the client(spark-submit) side, not from the driver pod.

I'd like to make a proposal here:

  1. create SPARK_CONF config map on the client side, records the config map's name, the driver pod should reuse this config map to mount SPARK_CONF on the executor side, rather creating a new ConfigMap. Therefore the driver and executor shares the same SPARK_CONF mount. Also, this makes spark.kubernetes.executor.disableConfigMap unnecessary
  2. extends KubernetesFeatureConfigStep to add a method getAdditionalDataFileForSparkConfConfigMap (an example one), to allow other feature steps to dynamic add data files when building driver pod. By leveraging this method, PodTemplateConfigMapStep could eliminate the need to create a new config map. New feature steps could also benefit from this method.

WDYT?

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented May 18, 2023

Yes, actually, that bit us as you see in this PR's goal.

However after some digging and found some limitation exposed by K8S:
config map is an object, it may has quota limitation and add burdens to K8S's api server and etcd

Please send an email to dev@spark mailing list for further Q&A or discussion. We don't want to open a long discussion on the existing PR like this because that frequently causes unnecessary confusion for other audiences, @advancedxy .

@advancedxy
Copy link
Contributor

Yes, actually, that bit us as you see in this PR's goal.

However after some digging and found some limitation exposed by K8S:
config map is an object, it may has quota limitation and add burdens to K8S's api server and etcd

Please send an email to dev@spark mailing list for further Q&A or discussion. We don't want to open a long discussion on the existing PR like this because that frequently causes unnecessary confusion for other audiences, @advancedxy .

Thanks. Let me open a discuss mail on the dev mail list.

@dongjoon-hyun
Copy link
Member Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants