Conversation

@ForVic ForVic commented Nov 4, 2025

What changes were proposed in this pull request?

Adds support for K8s Deployment API to allocate pods.

Why are the changes needed?

Allocating individual pods is not ideal; pods can instead be managed through higher-level Kubernetes APIs. #33508 helps with this by adding an interface for pluggable allocators along with a StatefulSet allocator. However, with that allocator, dynamic allocation only works if you have implemented a PodDisruptionBudget associated with the decommission label. Since a Deployment manages its pods through a ReplicaSet, which supports the `controller.kubernetes.io/pod-deletion-cost` annotation, we can avoid creating a separate PDB resource and still allow dynamic allocation (with shuffle tracking) by assigning a low deletion cost to the executors we are scaling down. When the Deployment is scaled down, the ReplicaSet preferentially removes the pods with the lowest deletion cost.
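As a rough illustration of the mechanism described above (not the PR's actual Scala implementation), the following sketch mimics how a ReplicaSet-style controller chooses victims when scaling down: pods carrying a lower `controller.kubernetes.io/pod-deletion-cost` are deleted first, and unannotated pods default to cost 0.

```python
# Sketch of ReplicaSet-style victim selection during scale-down.
# Pods with a lower pod-deletion-cost are deleted first; pods without
# the annotation default to cost 0, as in Kubernetes.
DELETION_COST_ANNOTATION = "controller.kubernetes.io/pod-deletion-cost"

def pick_scale_down_victims(pods, count):
    """Return `count` pod names to delete, lowest deletion cost first."""
    def cost(pod):
        return int(pod.get("annotations", {}).get(DELETION_COST_ANNOTATION, 0))
    return [p["name"] for p in sorted(pods, key=cost)[:count]]

pods = [
    {"name": "exec-1", "annotations": {DELETION_COST_ANNOTATION: "0"}},
    {"name": "exec-2"},  # no annotation -> cost 0
    # Executor Spark is decommissioning, marked with a low deletion cost:
    {"name": "exec-3", "annotations": {DELETION_COST_ANNOTATION: "-1"}},
]

print(pick_scale_down_victims(pods, 1))  # → ['exec-3']
```

This is why the PR can drive scale-down ordering purely with an annotation: Spark marks the executors it wants gone with a low cost, then shrinks the Deployment's replica count.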

Does this PR introduce any user-facing change?

Yes, this adds a user-facing config:

spark.kubernetes.executor.podDeletionCost
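A hedged sketch of how this might be used, for illustration only: the allocator value `deployment`, its selection via the existing `spark.kubernetes.allocation.pods.allocator` config, and the `-100` cost value are assumptions not confirmed by this thread.

```shell
# Hypothetical invocation; allocator name "deployment" and the cost value
# are illustrative assumptions, not confirmed by the PR text.
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.kubernetes.allocation.pods.allocator=deployment \
  --conf spark.kubernetes.executor.podDeletionCost=-100 \
  local:///opt/spark/examples/jars/spark-examples.jar
```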

How was this patch tested?

New unit tests + passing existing unit tests + tested in a cluster with shuffle tracking and dynamic allocation enabled

Was this patch authored or co-authored using generative AI tooling?

No

@ForVic ForVic changed the title [todo] Add support for Deployment API on K8s [todo][K8S] Add support for Deployment API on K8s Nov 4, 2025
@ForVic ForVic changed the title [todo][K8S] Add support for Deployment API on K8s [SPARK-54173][K8S] Add support for Deployment API on K8s Nov 4, 2025
@ForVic ForVic marked this pull request as ready for review November 5, 2025 01:03
ForVic commented Nov 5, 2025

cc @sunchao

also @holdenk @dongjoon-hyun

sunchao commented Nov 5, 2025

Thanks @dongjoon-hyun for the review!

@sunchao sunchao left a comment

Thanks @ForVic ! Looks mostly good to me (already reviewed internally). Just one question around the new conf.

@sunchao sunchao left a comment


LGTM

@sunchao sunchao closed this in d4bc277 Nov 11, 2025
sunchao commented Nov 11, 2025

Merged to master, thanks @ForVic for the contribution and @dongjoon-hyun for the review!

zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025

Closes apache#52867 from ForVic/dev/victors/deployment_allocator.

Lead-authored-by: Victor Sunderland <victors@openai.com>
Co-authored-by: victors-oai <victors@openai.com>
Co-authored-by: Victor Sunderland <64456855+ForVic@users.noreply.github.com>
Signed-off-by: Chao Sun <chao@openai.com>