[SPARK-56238][K8S] Fix app ID propagation in KubernetesClusterSchedulerBackend for client mode submission #55355

Closed
xiaoxuandev wants to merge 2 commits into apache:master from xiaoxuandev:fix-56238

@xiaoxuandev
Contributor

What changes were proposed in this pull request?

Cache the application ID at construction time in KubernetesClusterSchedulerBackend so that applicationId() returns a stable value across calls.

Previously, when spark.app.id was not yet set, applicationId() fell back to KubernetesConf.getKubernetesAppId(), which generates a new random UUID on every call. In client mode, SparkContext sets spark.app.id only after start() returns, so the calls to applicationId() made during start() (for podAllocator.start(), watchEvents.start(), pollEvents.start(), and setUpExecutorConfigMap()) each received a different ID. As a result, these subsystems labeled and filtered pods with inconsistent app IDs.
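
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual backend code: `UncachedBackend` and `generateAppId` are hypothetical names, and a plain `Map` stands in for `SparkConf`. It assumes only that the fallback produces a fresh random ID on each call, as described above.

```scala
import java.util.UUID

// Hypothetical stand-in for the fallback generator: a fresh random ID per call.
def generateAppId(): String = "spark-" + UUID.randomUUID().toString.replace("-", "")

// Simplified model of the pre-fix behavior; `conf` is a plain Map, not a real SparkConf.
class UncachedBackend(conf: Map[String, String]) {
  // The default argument is by-name, so it is re-evaluated on every call
  // for as long as "spark.app.id" is absent.
  def applicationId(): String = conf.getOrElse("spark.app.id", generateAppId())
}

val uncached = new UncachedBackend(Map.empty)

// Four calls, mirroring the four call sites inside start().
val idsSeenDuringStart = Seq.fill(4)(uncached.applicationId())
```

With an empty conf, every element of `idsSeenDuringStart` is a different string, which is the inconsistency the subsystems observed.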

This only affects client mode. In cluster mode, the submission client generates the app ID upfront and writes it into spark.app.id via BasicDriverFeatureStep before the driver pod starts, so conf.getOption("spark.app.id") always returns a value and the getOrElse branch is never reached.
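
To illustrate why cluster mode is unaffected, here is a small sketch under the same simplifying assumptions (a plain `Map` in place of `SparkConf`; `resolveAppId`, `fallbackAppId`, and the preset ID value are hypothetical). When the key is already present, the fallback is never evaluated.

```scala
// Counts how often the fallback generator runs (illustrative only).
var fallbackCalls = 0
def fallbackAppId(): String = {
  fallbackCalls += 1
  s"spark-generated-$fallbackCalls"
}

// Hypothetical conf as a cluster-mode driver would see it: the ID is already set.
val clusterModeConf = Map("spark.app.id" -> "spark-preset-id")

// Option.getOrElse takes its argument by name, so the fallback runs only when the key is missing.
def resolveAppId(conf: Map[String, String]): String =
  conf.get("spark.app.id").getOrElse(fallbackAppId())

val resolved = Seq.fill(3)(resolveAppId(clusterModeConf))
```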

The fix adds a private val appId that resolves the ID once at construction time and returns it consistently, matching the pattern used by SchedulerBackend, LocalSchedulerBackend, and other backends.
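
A minimal sketch of the caching pattern, assuming a plain `Map` in place of `SparkConf`; `CachedBackend` and `newRandomAppId` are hypothetical names and this is not the patched class itself:

```scala
import java.util.UUID

// Hypothetical stand-in for the random ID generator.
def newRandomAppId(): String = "spark-" + UUID.randomUUID().toString.replace("-", "")

class CachedBackend(conf: Map[String, String]) {
  // Resolved once when the backend is constructed, then reused by every caller.
  private val appId: String = conf.getOrElse("spark.app.id", newRandomAppId())

  def applicationId(): String = appId
}

val cached = new CachedBackend(Map.empty)
val first = cached.applicationId()
val second = cached.applicationId()

// A preset ID still takes precedence over the generated one.
val preset = new CachedBackend(Map("spark.app.id" -> "spark-preset-id")).applicationId()
```

Because the value is fixed at construction, it no longer matters when `spark.app.id` is written into the conf relative to the calls made during `start()`.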

Why are the changes needed?

Without this fix, the Kubernetes scheduler backend could propagate different app IDs to different subsystems during start() in client mode, leading to:

  • Pod allocator, watch events, and poll events using different app IDs
  • stop() unable to clean up resources created by start() (services, PVCs, config maps, executor pods) because the label selector uses a different ID
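
The cleanup failure follows from how label selection works. The sketch below models it with plain collections; `Resource`, `selectForCleanup`, the resource names, and the ID values are illustrative, the label key is an assumption made for the example, and none of this is the Kubernetes client API.

```scala
// Minimal model of a labeled cluster resource.
final case class Resource(name: String, labels: Map[String, String])

// Label key chosen for the example; treat it as an assumption.
val AppIdLabel = "spark-app-selector"

// Returns the resources a cleanup pass would select for the given app ID.
def selectForCleanup(all: Seq[Resource], appId: String): Seq[Resource] =
  all.filter(_.labels.get(AppIdLabel).contains(appId))

// Resources labeled with the ID that was seen while they were created.
val created = Seq(
  Resource("exec-1", Map(AppIdLabel -> "spark-aaa")),
  Resource("exec-config-map", Map(AppIdLabel -> "spark-aaa")))

val leakedCleanup = selectForCleanup(created, "spark-bbb")  // different ID: nothing matches
val correctCleanup = selectForCleanup(created, "spark-aaa") // same ID: everything matches
```

A selector built from a different ID matches nothing, so the resources are left behind.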

Does this PR introduce any user-facing change?

No direct user-facing change. This fixes an internal consistency issue that could cause resource leaks in Kubernetes client-mode deployments.

How was this patch tested?

  • Unit tests in KubernetesClusterSchedulerBackendSuite verifying applicationId() stability both when spark.app.id is set and when it is not set.
  • Existing tests for DeploymentAllocatorSuite, ExecutorPodsLifecycleManagerSuite, and StatefulSetAllocatorSuite continue to pass.
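
The stability property under test can be expressed as a small check. This is a sketch with plain values rather than the ScalaTest and mock setup used in the actual suite; `isStable` and the ID suppliers are hypothetical.

```scala
// Calls the supplier several times and requires exactly one distinct value.
def isStable(calls: Int)(applicationId: () => String): Boolean =
  Seq.fill(calls)(applicationId()).distinct.size == 1

// Supplier that behaves like the regression: a new value on every call.
var counter = 0
val unstableId = () => { counter += 1; s"spark-$counter" }

// Supplier that behaves like the fix: one value captured up front.
val fixedId = "spark-fixed"
val stableId = () => fixedId

val unstableResult = isStable(4)(unstableId)
val stableResult = isStable(4)(stableId)
```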

Was this patch authored or co-authored using generative AI tooling?

Yes, co-authored with Kiro.

Member

@dongjoon-hyun left a comment

test("SPARK-56238: applicationId() returns consistent value when spark.app.id is set") {
val id1 = schedulerBackendUnderTest.applicationId()
val id2 = schedulerBackendUnderTest.applicationId()
Member

@dongjoon-hyun Apr 15, 2026

This test case doesn't make sense to me in this PR's context. Please remove this test case because this passes without your PR, @xiaoxuandev .

Contributor Author

Removed, thanks!

assert(id1 === TEST_SPARK_APP_ID)
}

test("SPARK-56238: applicationId() is stable across calls when spark.app.id is not set") {
Member

This test case seems to reproduce the reported scenario.

when(localRpcEnv.setupEndpoint(any(), any())).thenReturn(driverEndpointRef)
val localTaskScheduler = mock(classOf[TaskSchedulerImpl])
when(localTaskScheduler.sc).thenReturn(localSc)
val backendWithoutAppId = new KubernetesClusterSchedulerBackend(
Member

@dongjoon-hyun Apr 15, 2026

Do you happen to know when this situation happens in the production environment, @xiaoxuandev ? I'm wondering if this is a valid case in the Apache Spark usage.

> This only affects client mode.

One more question. Do you know if this is a regression or not? (as Enrico claims)

Contributor Author

Yes, this is a regression introduced by #54269. The original code cached the generated ID in private val appId, which was correct.

Affected versions: v4.2.0-preview3+. All 4.0.x and 4.1.x releases are clean.

Regarding production usage: this affects Kubernetes client mode (--deploy-mode client), where the driver runs outside the K8s cluster. In that path, spark.app.id is not pre-set before backend.start(), so each call to applicationId() during start() would generate a different UUID.

Member

Got it. Thank you for confirming, @xiaoxuandev .

Member

@dongjoon-hyun left a comment

+1, LGTM. Thank you, @xiaoxuandev .

@dongjoon-hyun changed the title from "[SPARK-56238][K8S] Fix app ID propagation in KubernetesClusterSchedulerBackend" to "[SPARK-56238][K8S] Fix app ID propagation in KubernetesClusterSchedulerBackend for client mode submission" on Apr 15, 2026
@EnricoMi
Contributor

Thanks for fixing this!
