[SPARK-32067][K8S] Use unique ConfigMap name for executor pod template #29934

Closed

@stijndehaes stijndehaes commented Oct 2, 2020

What changes were proposed in this pull request?

The pod template configmap always had the same name. This PR makes it unique.

Why are the changes needed?

If you schedule two Spark jobs in the same namespace, they both use the same ConfigMap name, which results in conflicts. This PR fixes that.

BEFORE

$ kubectl get cm --all-namespaces -w | grep podspec
podspec-configmap                              1      65s

AFTER

$ kubectl get cm --all-namespaces -w | grep podspec
aaece65ef82e4a30b7b7800aad600d4f   spark-test-app-aac9f37502b2ca55-driver-podspec-conf-map   1      0s

This can be seen when running the integration tests
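
As a rough illustration of the conflict described above, the sketch below models a namespace as a set of ConfigMap names and compares the fixed name with a name derived from the per-application resource name prefix. The `Namespace` class, the `uniqueName` helper and the second prefix are hypothetical stand-ins for illustration; this is not the Spark or Kubernetes client API.

```scala
import scala.collection.mutable

object ConfigMapNameCollision {
  // Hypothetical stand-in for a Kubernetes namespace: it only models the rule
  // that two ConfigMaps in one namespace cannot share a name.
  final class Namespace {
    private val names = mutable.Set.empty[String]

    // Returns true if the name was free, false if it was already taken.
    def create(name: String): Boolean = names.add(name)
  }

  // The fixed name shown in the BEFORE output.
  val FixedName: String = "podspec-configmap"

  // Assumed naming scheme, mirroring the AFTER output: prefix plus a fixed suffix.
  def uniqueName(resourceNamePrefix: String): String =
    s"$resourceNamePrefix-driver-podspec-conf-map"

  def main(args: Array[String]): Unit = {
    val before = new Namespace
    println(before.create(FixedName)) // true: the first job creates the ConfigMap
    println(before.create(FixedName)) // false: the second job hits the same name

    val after = new Namespace
    println(after.create(uniqueName("spark-test-app-aac9f37502b2ca55"))) // true
    // The second prefix is made up for the example.
    println(after.create(uniqueName("spark-test-app-0000000000000000"))) // true
  }
}
```

The names only stay distinct as long as the prefixes do, so the sketch assumes each submission gets its own resource name prefix.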

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests and the integration tests verify that this works.

@HyukjinKwon
Member

Can you file a JIRA and link it in the PR title? See also https://spark.apache.org/contributing.html

@stijndehaes stijndehaes changed the title Make sure the pod template configmap has a unique name [SPARK-32067][K8S]Make sure the pod template configmap has a unique name Oct 3, 2020
@stijndehaes
Contributor Author

@onursatici @vanzin I see you have changed parts of this code. Would you kindly review?

@stijndehaes
Contributor Author

@HyukjinKwon Sorry, I forgot to add the existing JIRA ticket to the title. I have also added the K8S component to the title now.

@stijndehaes stijndehaes changed the title [SPARK-32067][K8S]Make sure the pod template configmap has a unique name [SPARK-32067][k8s] Give pod template configmap a unique name Oct 3, 2020
Contributor

@yuj yuj left a comment

Please consider using the already-defined podspec-configmap constant.

@dongjoon-hyun
Member

ok to test

SparkQA commented Oct 3, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33993/

SparkQA commented Oct 3, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33993/

SparkQA commented Oct 4, 2020

Test build #129385 has finished for PR 29934 at commit 9993d13.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 4, 2020

Test build #129389 has finished for PR 29934 at commit 66609a2.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 4, 2020

Test build #129390 has finished for PR 29934 at commit a4e423c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33996/

SparkQA commented Oct 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33997/

SparkQA commented Oct 4, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33996/

SparkQA commented Oct 4, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33997/

@dongjoon-hyun
Member

Thank you for the updates, @stijndehaes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32067][k8s] Give pod template configmap a unique name [SPARK-32067][K8S] Give pod template configmap a unique name Oct 4, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32067][K8S] Give pod template configmap a unique name [SPARK-32067][K8S] Make pod template configmap with a unique name Oct 4, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32067][K8S] Make pod template configmap with a unique name [SPARK-32067][K8S] Use unique ConfigMap name for pod template Oct 4, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-32067][K8S] Use unique ConfigMap name for pod template [SPARK-32067][K8S] Use unique ConfigMap name for executor pod template Oct 4, 2020

SparkQA commented Oct 5, 2020

Test build #129399 has finished for PR 29934 at commit 09689f2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34006/

SparkQA commented Oct 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34006/

SparkQA commented Oct 5, 2020

Test build #129419 has finished for PR 29934 at commit d48c8e2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34026/

SparkQA commented Oct 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34026/

@dongjoon-hyun
Member

Wow, it's much simpler, @stijndehaes . :)

@yuj
Contributor

yuj commented Oct 5, 2020

@dongjoon-hyun Could you please consider including this bug fix in an upcoming release? This is a much-needed fix, probably for many people.

@dongjoon-hyun
Member

dongjoon-hyun commented Oct 5, 2020

Of course, @yuj. I'm considering this for Apache Spark 3.1/3.0.2/2.4.8. I've been reviewing this PR for the past two days.

@@ -31,14 +31,16 @@ private[spark] class PodTemplateConfigMapStep(conf: KubernetesConf)

private val hasTemplate = conf.contains(KUBERNETES_EXECUTOR_PODTEMPLATE_FILE)

private val configmapName = s"${conf.resourceNamePrefix}-$POD_TEMPLATE_CONFIGMAP"

Member

@dongjoon-hyun dongjoon-hyun Oct 5, 2020

This results in the following ConfigMaps.

tpcds-py-eebc4474fa60b6f7-driver-conf-map      1      5s
tpcds-py-eebc4474fa60b6f7-podspec-configmap    1      5s

This is inconsistent because the driver uses the driver-conf-map postfix while the executor uses podspec-configmap.

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala:    val configMapName = s"${conf.resourceNamePrefix}-driver-conf-map"

Let's use -exec-conf-map as a postfix.
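
To make the consistency concern concrete, here is a minimal sketch of how the names in this comment are composed from the shared resource name prefix. Treating "ends with -conf-map" as the convention is an assumption made only for illustration, and the object and helper names are hypothetical, not Spark code.

```scala
object ConfigMapSuffixes {
  // Suffix of the driver ConfigMap, taken from the KubernetesClientApplication line quoted above.
  val DriverSuffix = "driver-conf-map"
  // Suffix of the pod template ConfigMap as it appears in the output above.
  val PodTemplateSuffix = "podspec-configmap"
  // Suffix suggested in this review comment.
  val SuggestedSuffix = "exec-conf-map"

  // Joins the per-application prefix and a suffix, as the quoted code does.
  def resourceName(prefix: String, suffix: String): String = s"$prefix-$suffix"

  // Assumed convention for this sketch: names end in "-conf-map", like the driver ConfigMap.
  def followsDriverConvention(suffix: String): Boolean = suffix.endsWith("-conf-map")

  def main(args: Array[String]): Unit = {
    val prefix = "tpcds-py-eebc4474fa60b6f7"
    Seq(DriverSuffix, PodTemplateSuffix, SuggestedSuffix).foreach { suffix =>
      println(s"${resourceName(prefix, suffix)} consistent=${followsDriverConvention(suffix)}")
    }
  }
}
```

Under that assumed convention, only podspec-configmap is flagged as inconsistent with the driver ConfigMap name.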

Contributor Author

@stijndehaes stijndehaes Oct 6, 2020

Well, actually this ConfigMap is not used only by the executors but also by the driver. The driver needs this ConfigMap to spawn executors with the right template. So I'll change the name to driver-podspec-conf-map. Let me know if that is fine.

Member

Could you share your result from k get cm? Does it look reasonably consistent?

Member

Also, please update your PR description. I added an AFTER paragraph to the PR description. You can fill in the final output there.

Contributor Author

When watching the ConfigMaps while the integration tests run, the output looks something like this:

aaece65ef82e4a30b7b7800aad600d4f   spark-test-app-aac9f37502b2ca55-driver-conf-map   1      0s
aaece65ef82e4a30b7b7800aad600d4f   spark-test-app-aac9f37502b2ca55-driver-podspec-conf-map   1      0

To me this looks nice because it's clear that the ConfigMap is used by the driver, and when sorted alphabetically the two sit neatly together.
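
The sorting remark can be checked with a small sketch. The two driver names are taken from the output above; the two unrelated names and the `adjacentWhenSorted` helper are hypothetical and exist only for the example.

```scala
object SortedConfigMapNames {
  val Prefix = "spark-test-app-aac9f37502b2ca55"
  val DriverConf = s"$Prefix-driver-conf-map"
  val DriverPodSpec = s"$Prefix-driver-podspec-conf-map"

  // True when both names are present and sit next to each other after sorting.
  def adjacentWhenSorted(names: Seq[String], a: String, b: String): Boolean = {
    val sorted = names.sorted
    val i = sorted.indexOf(a)
    val j = sorted.indexOf(b)
    i >= 0 && j >= 0 && math.abs(i - j) == 1
  }

  def main(args: Array[String]): Unit = {
    // The "unrelated" names are made up to show that other entries do not split the pair.
    val names = Seq(DriverPodSpec, "zz-unrelated-config", DriverConf, "aa-unrelated-config")
    names.sorted.foreach(println)
    println(adjacentWhenSorted(names, DriverConf, DriverPodSpec)) // true
  }
}
```

Because both names share the prefix up to "-driver-", no name outside that prefix can sort between them.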

Member

Got it. Thank you, @stijndehaes .

Contributor

Nice!

Member

@dongjoon-hyun dongjoon-hyun left a comment

@stijndehaes, this PR looks good except for the consistency between the driver and executor names. Please update it.

The pod template configmap always had the same name
for each job submitted.
This means if you schedule 2 spark jobs in the same
namespace there will be conflicts.

SparkQA commented Oct 6, 2020

Test build #129433 has finished for PR 29934 at commit 948e15a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 6, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34040/

SparkQA commented Oct 6, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34040/

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Merged to master/3.0 for Apache Spark 3.1.0/3.0.2.

dongjoon-hyun pushed a commit that referenced this pull request Oct 7, 2020
### What changes were proposed in this pull request?

The pod template configmap always had the same name. This PR makes it unique.

### Why are the changes needed?

If you schedule two Spark jobs in the same namespace, they both use the same ConfigMap name, which results in conflicts. This PR fixes that.

**BEFORE**
```
$ kubectl get cm --all-namespaces -w | grep podspec
podspec-configmap                              1      65s
```

**AFTER**
```
$ kubectl get cm --all-namespaces -w | grep podspec
aaece65ef82e4a30b7b7800aad600d4f   spark-test-app-aac9f37502b2ca55-driver-podspec-conf-map   1      0s
```

This can be seen when running the integration tests

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests and the integration tests test if this works

Closes #29934 from stijndehaes/bugfix/SPARK-32067-unique-name-for-template-configmap.

Authored-by: Stijn De Haes <stijndehaes@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 3099fd9)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@dongjoon-hyun
Member

Thank you again, @stijndehaes . I added you to the Apache Spark contributor group and assigned SPARK-32067 to you.

Welcome to the Apache Spark community again!

@yuj
Contributor

yuj commented Oct 7, 2020

Awesome. Thanks guys @stijndehaes @dongjoon-hyun

@stijndehaes stijndehaes deleted the bugfix/SPARK-32067-unique-name-for-template-configmap branch October 8, 2020 05:43
@stijndehaes
Contributor Author

@dongjoon-hyun @yuj Thanks for all the help guys :)

holdenk pushed a commit to holdenk/spark that referenced this pull request Oct 27, 2020