-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-54184][K8S] Support spark.kubernetes.executor.deletedExecutorsCacheTimeout
#52884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…sCacheTTLSeconds`
| .createWithDefault(true) | ||
|
|
||
| val KUBERNETES_DELETED_EXECUTORS_CACHE_TTL_SECONDS = | ||
| ConfigBuilder("spark.kubernetes.executor.deletedExecutorsCacheTTLSeconds") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it also accepts human-readable strings, like 10s, 3min, can we eliminate Seconds in the config name then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. You're right. I blindly searched TTL keyword and followed a bad practice. Let me revise this.
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala
|
Thank you, @pan3793 . Also, cc @peter-toth . |
spark.kubernetes.executor.deletedExecutorsCacheTTLSecondsspark.kubernetes.executor.deletedExecutorsCacheTimeout
…sCacheTimeout` ### What changes were proposed in this pull request? This PR aims to support `spark.kubernetes.executor.deletedExecutorsCacheTimeout`. ### Why are the changes needed? To allow users to control the TTL for the deleted executors cache. Previously, it has a hard-coded value. In some very slow clusters, we need to remember longer than 3 minutes. https://github.com/apache/spark/blob/a8e35c407bc5340f83b35e5a2f0b0767c6baadb0/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala#L54-L57 ### Does this PR introduce _any_ user-facing change? No behavior change because the default value is the same. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #52884 from dongjoon-hyun/SPARK-54184. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit ada1908) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
peter-toth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late LGTM.
|
Thank you, @peter-toth . |
…sCacheTimeout` ### What changes were proposed in this pull request? This PR aims to support `spark.kubernetes.executor.deletedExecutorsCacheTimeout`. ### Why are the changes needed? To allow users to control the TTL for the deleted executors cache. Previously, it has a hard-coded value. In some very slow clusters, we need to remember longer than 3 minutes. https://github.com/apache/spark/blob/a8e35c407bc5340f83b35e5a2f0b0767c6baadb0/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala#L54-L57 ### Does this PR introduce _any_ user-facing change? No behavior change because the default value is the same. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#52884 from dongjoon-hyun/SPARK-54184. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This PR aims to support
spark.kubernetes.executor.deletedExecutorsCacheTimeout.Why are the changes needed?
To allow users to control the TTL for the deleted executors cache.
Previously, it has a hard-coded value. In some very slow clusters, we need to remember longer than 3 minutes.
spark/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
Lines 54 to 57 in a8e35c4
Does this PR introduce any user-facing change?
No behavior change because the default value is the same.
How was this patch tested?
Pass the CIs.
Was this patch authored or co-authored using generative AI tooling?
No.