Skip to content

[SPARK-56205][SS] Validate base state store checkpoint ID before committing microbatch#55008

Closed
ericm-db wants to merge 1 commit intoapache:masterfrom
ericm-db:validate-base-checkpoint-id
Closed

[SPARK-56205][SS] Validate base state store checkpoint ID before committing microbatch#55008
ericm-db wants to merge 1 commit intoapache:masterfrom
ericm-db:validate-base-checkpoint-id

Conversation

@ericm-db
Copy link
Copy Markdown
Contributor

@ericm-db ericm-db commented Mar 25, 2026

What changes were proposed in this pull request?

  • Validate that the executor-reported baseStateStoreCkptId matches the checkpoint ID the driver originally sent (currentStateStoreCkptId) before accepting the microbatch commit
  • Add STATE_STORE_BASE_CHECKPOINT_ID_MISMATCH error class with StateStoreBaseCheckpointIdMismatch exception
  • Replace the // TODO validate baseStateStoreCkptId in MicroBatchExecution.updateStateStoreCkptIdForOperator with validation logic

Why are the changes needed?

To ensure that we have the right lineage of checkpoint files

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Add integration test verifying baseStateStoreCkptId matches commit log across batches within a single stream run
  • Add integration test verifying baseStateStoreCkptId matches commit log across stream stop/restart (exercises the populateStartOffsets commit log recovery path)
  • Add unit test verifying StateStoreBaseCheckpointIdMismatch error class produces correct error condition and message

Was this patch authored or co-authored using generative AI tooling?

Generated by: Claude Opus 4.6

…atch

## Summary
- Validate that the executor-reported `baseStateStoreCkptId` matches the checkpoint ID the driver originally sent (`currentStateStoreCkptId`) before accepting the microbatch commit
- Add `STATE_STORE_BASE_CHECKPOINT_ID_MISMATCH` error class with `StateStoreBaseCheckpointIdMismatch` exception
- Replace the `// TODO validate baseStateStoreCkptId` in `MicroBatchExecution.updateStateStoreCkptIdForOperator` with validation logic

## Test plan
- Add integration test verifying `baseStateStoreCkptId` matches commit log across batches within a single stream run
- Add integration test verifying `baseStateStoreCkptId` matches commit log across stream stop/restart (exercises the `populateStartOffsets` commit log recovery path)
- Add unit test verifying `StateStoreBaseCheckpointIdMismatch` error class produces correct error condition and message

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ericm-db ericm-db changed the title [SS] Validate base state store checkpoint ID before committing microbatch [SPARK-56205][SS] Validate base state store checkpoint ID before committing microbatch Mar 25, 2026
@anishshri-db
Copy link
Copy Markdown
Contributor

LGTM pending CI

@HeartSaVioR
Copy link
Copy Markdown
Contributor

@ericm-db
Just a drive-by comment: please do not skip the sections in the PR template. Please leave a short text even it's not applicable, to acknowledge you notice the PR template and try to fill it out.

@anishshri-db
Copy link
Copy Markdown
Contributor

Docker failures only - should not be related

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants