[SPARK-47730][K8S] Support APP_ID
and EXECUTOR_ID
placeholders in labels
#46149
Conversation
cc @dongjoon-hyun
```diff
-  private val CUSTOM_DRIVER_LABELS = Map("labelkey" -> "labelvalue")
+  private val CUSTOM_DRIVER_LABELS = Map(
+    "labelkey" -> "labelvalue",
+    "yunikorn.apache.org/app-id" -> "{{APPID}}")
```
Although I understand the intention, let's isolate the yunikorn-related feature inside `YuniKornSuite.scala`.
Previously, we mixed that in, and it turned out not to be good from a long-term perspective.
Understood. Actually the label key used here can be any string. I've updated it to use a general label key as well as a general annotation key in this test. Also fixed the typo `APPID` -> `APP_ID`.
```diff
@@ -102,13 +102,15 @@ private[spark] trait BasicTestsSuite { k8sSuite: KubernetesSuite =>
   sparkAppConf
     .set("spark.kubernetes.driver.label.label1", "label1-value")
     .set("spark.kubernetes.driver.label.label2", "label2-value")
+    .set("spark.kubernetes.driver.label.yunikorn.apache.org/app-id", "{{APP_ID}}")
```
Could you try to add this test coverage in `YuniKornSuite.scala`?
```diff
@@ -589,7 +589,8 @@ class KubernetesSuite extends SparkFunSuite
   assert(pod.getMetadata.getLabels.get("label2") === "label2-value")
   assert(pod.getMetadata.getAnnotations.get("annotation1") === "annotation1-value")
   assert(pod.getMetadata.getAnnotations.get("annotation2") === "annotation2-value")
-  val appId = pod.getMetadata.getAnnotations.get("yunikorn.apache.org/app-id")
+  val appIdLabel = pod.getMetadata.getLabels.get("yunikorn.apache.org/app-id")
+  val appIdAnnotation = pod.getMetadata.getAnnotations.get("yunikorn.apache.org/app-id")
```
It would be better if we can move all yunikorn stuff into `YuniKornSuite`.
Thank you for making a PR, @jshmchenxi.
I'm interested in your proposal and use case. Could you address my review comments?
Gentle ping, @jshmchenxi.
It's been a busy week, sorry for the delay. I'll address your comments today, thanks! @dongjoon-hyun
@dongjoon-hyun Thanks for the suggestions! I've updated the label key in the unit test to use a general one instead of the YuniKorn app-id key.
And in our use case, we could use any custom label key for the app id. We are reusing `yunikorn.apache.org/app-id` for simplicity; it is part of the new standard YuniKorn labels (YUNIKORN-1351), but that is not yet released, so there is no need to put the labels in the `YuniKornSuite` yet.
Force-pushed from 391609f to 0231b0d.
Currently, only pod annotations support the `APP_ID` and `EXECUTOR_ID` placeholders. This commit adds the same capability to pod labels.

The use case is to support customized labels for availability-zone-based topology pod affinity. We want to use the Spark application ID as the customized label value, so that Spark executor pods run in the same availability zone as the Spark driver pod. Although we could use the Spark internal label `spark-app-selector` directly, this is not good practice when used along with YuniKorn Gang Scheduling: when Gang Scheduling is enabled, the YuniKorn placeholder pods should use the same affinity as the real Spark pods, which would force us to add the internal `spark-app-selector` label to the placeholder pods. That is undesirable because the placeholder pods could then be mistaken for Spark pods in the monitoring system. Thus we propose supporting the `APP_ID` and `EXECUTOR_ID` placeholders in Spark pod labels as well, for flexibility.

Also fix the typo value `APPID` -> `APP_ID` in `BasicDriverFeatureStepSuite`.
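To make the placeholder behavior concrete, here is a minimal Python sketch of the substitution this commit describes. The helper names, conf keys under a custom `myorg/` label prefix, and app id are illustrative, not Spark's actual internals; in Spark, labels are configured via `spark.kubernetes.driver.label.*` and `spark.kubernetes.executor.label.*`.

```python
# Sketch of the placeholder resolution described above: label values may
# contain {{APP_ID}} and {{EXECUTOR_ID}}, resolved when pod specs are built.
# Helper names and the "myorg/" label keys are illustrative.
from typing import Optional

def resolve_placeholders(value: str, app_id: str,
                         executor_id: Optional[str] = None) -> str:
    """Replace the supported placeholders in a single label value."""
    value = value.replace("{{APP_ID}}", app_id)
    if executor_id is not None:
        value = value.replace("{{EXECUTOR_ID}}", executor_id)
    return value

def build_labels(conf: dict, prefix: str, app_id: str,
                 executor_id: Optional[str] = None) -> dict:
    """Collect label confs under `prefix` and resolve their placeholders."""
    return {
        key[len(prefix):]: resolve_placeholders(val, app_id, executor_id)
        for key, val in conf.items()
        if key.startswith(prefix)
    }

conf = {
    "spark.kubernetes.driver.label.myorg/app-id": "{{APP_ID}}",
    "spark.kubernetes.executor.label.myorg/app-id": "{{APP_ID}}",
    "spark.kubernetes.executor.label.myorg/executor-id": "{{EXECUTOR_ID}}",
}

driver_labels = build_labels(conf, "spark.kubernetes.driver.label.",
                             "spark-app-123")
executor_labels = build_labels(conf, "spark.kubernetes.executor.label.",
                               "spark-app-123", "7")
print(driver_labels)    # {'myorg/app-id': 'spark-app-123'}
print(executor_labels)  # {'myorg/app-id': 'spark-app-123', 'myorg/executor-id': '7'}
```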
Force-pushed from 0231b0d to 528539c.
+1, LGTM. Thank you for updating, @jshmchenxi.
Merged to master for Apache Spark 4.0.0.
I added you (Xi Chen) to the Apache Spark contributor group and assigned SPARK-47730 to you.
Closes apache#46149 from jshmchenxi/SPARK-47730/support-app-placeholder-in-labels.
Authored-by: Xi Chen <jshmchenxi@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?

Currently, only pod annotations support the `APP_ID` and `EXECUTOR_ID` placeholders. This commit adds the same capability to pod labels.

### Why are the changes needed?

The use case is to support customized labels for availability-zone-based topology pod affinity. We want to use the Spark application ID as the customized label value, so that Spark executor pods run in the same availability zone as the Spark driver pod.

Although we could use the Spark internal label `spark-app-selector` directly, this is not good practice when used along with YuniKorn Gang Scheduling. When Gang Scheduling is enabled, the YuniKorn placeholder pods should use the same affinity as the real Spark pods, which means we would have to add the internal `spark-app-selector` label to the placeholder pods. That is undesirable because the placeholder pods could then be mistaken for Spark pods in the monitoring system. Thus we propose supporting the `APP_ID` and `EXECUTOR_ID` placeholders in Spark pod labels as well, for flexibility.

### Does this PR introduce _any_ user-facing change?

No, because the pattern strings are very specific.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No
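The availability-zone affinity use case from the description can be sketched as follows. This is a minimal illustration, assuming a custom label key `myorg/spark-app-id` (hypothetical) whose `{{APP_ID}}` value Spark has already resolved; the affinity structure mirrors the Kubernetes `podAffinity` API and the well-known `topology.kubernetes.io/zone` topology key.

```python
# Sketch of the pod-affinity use case: executor (and YuniKorn placeholder)
# pods declare affinity to pods carrying a custom label whose value is the
# Spark application ID, keeping them in the same zone as the driver.
# The label key "myorg/spark-app-id" and app id are illustrative.
app_id = "spark-a1b2c3"

# Driver pod labels, with {{APP_ID}} already resolved by Spark.
driver_labels = {"myorg/spark-app-id": app_id}

# Affinity block for executor and placeholder pods: co-locate by zone with
# any pod carrying the custom label, without touching spark-app-selector.
executor_affinity = {
    "podAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [{
            "labelSelector": {
                "matchLabels": {"myorg/spark-app-id": app_id},
            },
            "topologyKey": "topology.kubernetes.io/zone",
        }]
    }
}

term = executor_affinity["podAffinity"][
    "requiredDuringSchedulingIgnoredDuringExecution"][0]
print(term["labelSelector"]["matchLabels"])  # {'myorg/spark-app-id': 'spark-a1b2c3'}
```

Because the selector matches the custom label rather than `spark-app-selector`, the placeholder pods never need Spark's internal label, which is the monitoring concern the description raises.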