[SPARK-56793][K8S] Avoid cluster-wide LIST in executor pods polling #55754

Open
TongWei1105 wants to merge 1 commit into apache:master from TongWei1105:spark-k8s-scope-poller-namespace

@TongWei1105
Contributor

### What changes were proposed in this pull request?

Scope the executor pod LIST issued by `ExecutorPodsPollingSnapshotSource` to the configured Kubernetes namespace by inserting `.inNamespace(namespace)` between `.pods()` and the label filters.
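To make the position of the new link concrete, here is a minimal, self-contained sketch of the chain before and after the change. It does not use the fabric8 client: `PodQuery`, `before`, and `after` are hypothetical stand-ins, and the label keys and values are illustrative assumptions, not taken from the patch. The two paths are the standard Kubernetes pod collection endpoints; mapping the un-namespaced chain to the cluster-wide one follows the behaviour this description attributes to fabric8.

```scala
// Hypothetical model of a fluent pod query. NOT the fabric8 API.
object PodQuerySketch {
  final case class PodQuery(
      namespace: Option[String] = None,
      selectors: Vector[String] = Vector.empty) {

    // Restrict the query to a single namespace.
    def inNamespace(ns: String): PodQuery = copy(namespace = Some(ns))

    // Equality and inequality label filters, in Kubernetes selector syntax.
    def withLabel(key: String, value: String): PodQuery =
      copy(selectors = selectors :+ s"$key=$value")
    def withoutLabel(key: String, value: String): PodQuery =
      copy(selectors = selectors :+ s"$key!=$value")

    // Pod collection path the LIST would be sent to.
    def path: String = namespace match {
      case Some(ns) => s"/api/v1/namespaces/$ns/pods"
      case None     => "/api/v1/pods" // all namespaces
    }

    def labelSelector: String = selectors.mkString(",")
  }

  def pods(): PodQuery = PodQuery()

  // Before: label filters only (label names are illustrative assumptions).
  def before(appId: String): PodQuery =
    pods()
      .withLabel("spark-app-selector", appId)
      .withLabel("spark-role", "executor")
      .withoutLabel("spark-exec-inactive", "true")

  // After: the namespace link sits between pods() and the label filters.
  def after(namespace: String, appId: String): PodQuery =
    pods()
      .inNamespace(namespace)
      .withLabel("spark-app-selector", appId)
      .withLabel("spark-role", "executor")
      .withoutLabel("spark-exec-inactive", "true")
}
```

In this model the label selector is identical in both chains; only the target path changes, which matches the intent of scoping the LIST without altering which executors are selected inside the namespace.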

### Why are the changes needed?

Without `.inNamespace(...)` the fabric8 client issues a cluster-wide LIST against the K8s API server. Other paths in the K8s scheduler module (e.g. `KubernetesClusterSchedulerBackend.doKillExecutors`, `ExecutorPodsLifecycleManager`) already scope their pod operations to the configured namespace; the polling source was inconsistent.

A cluster-wide LIST:

- fails under the typical least-privilege deployment where the driver ServiceAccount is bound to a namespaced Role rather than a ClusterRole;
- causes unnecessary load and broadens the visibility surface even when ClusterRole permissions are granted.
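The first failure mode can be illustrated with a deliberately simplified model of Kubernetes RBAC. This is only a sketch: `Grant`, `ListPods`, and `isAllowed` are hypothetical names, and real authorization also considers verbs, resources, API groups, and multiple bindings. The one rule modelled here is that a Role bound inside a namespace authorizes pod LIST requests only within that namespace, while a cluster-wide LIST needs a cluster-scoped grant.

```scala
// Simplified RBAC model for pod LIST requests. Hypothetical names throughout.
object RbacSketch {
  sealed trait Grant
  // Role + RoleBinding: applies only inside `namespace`.
  final case class NamespacedRole(namespace: String) extends Grant
  // ClusterRole + ClusterRoleBinding: applies in every namespace and cluster-wide.
  case object ClusterWideRole extends Grant

  // namespace = None models a cluster-wide LIST across all namespaces.
  final case class ListPods(namespace: Option[String])

  def isAllowed(grant: Grant, request: ListPods): Boolean =
    (grant, request.namespace) match {
      case (ClusterWideRole, _)                       => true
      case (NamespacedRole(granted), Some(requested)) => granted == requested
      case (NamespacedRole(_), None)                  => false // cluster-wide LIST is rejected
    }
}
```

Under these assumptions, a driver holding `NamespacedRole("spark-jobs")` is rejected for `ListPods(None)` but allowed for `ListPods(Some("spark-jobs"))`, which is the situation the namespaced poll is meant to reach.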

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Updated `ExecutorPodsPollingSnapshotSourceSuite` to mock the new `.inNamespace(...)` link in the chain. Ran the full `resource-managers/kubernetes/core` test module: 344 tests across 42 suites, all passing.
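As a rough illustration of why a test double has to learn the new link, the sketch below uses a hand-written recording stub in place of the suite's actual mocking setup. `PodOps`, `RecordingPodOps`, and `pollExecutors` are hypothetical and much narrower than the real client and snapshot source; the label key is likewise an assumption.

```scala
// Hand-rolled stub standing in for a mocked fluent client. Hypothetical names.
object PollingStubSketch {
  trait PodOps {
    def inNamespace(ns: String): PodOps
    def withLabel(key: String, value: String): PodOps
    def list(): Seq[String]
  }

  // Records every link of the chain so a test can assert on order and arguments.
  final class RecordingPodOps(result: Seq[String]) extends PodOps {
    val calls = scala.collection.mutable.ArrayBuffer.empty[String]

    def inNamespace(ns: String): PodOps = {
      calls += s"inNamespace($ns)"
      this
    }
    def withLabel(key: String, value: String): PodOps = {
      calls += s"withLabel($key,$value)"
      this
    }
    def list(): Seq[String] = {
      calls += "list()"
      result
    }
  }

  // Hypothetical code under test: a namespaced poll for one application.
  def pollExecutors(pods: PodOps, namespace: String, appId: String): Seq[String] =
    pods
      .inNamespace(namespace)
      .withLabel("spark-app-selector", appId)
      .list()
}
```

A test built on this stub would call `pollExecutors` and then check that the first recorded call is the `inNamespace` link with the configured namespace, so a regression to an un-namespaced chain fails the assertion.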

### Was this patch authored or co-authored using generative AI tooling?

Yes. Generated-by: Claude Code 4.7

@TongWei1105
Contributor Author

Could you please take a look? @dongjoon-hyun
