@zifeif2
Contributor

@zifeif2 zifeif2 commented Jan 8, 2026

What changes were proposed in this pull request?

Support checkpoint V2 for the repartition writer and StateRewriter by returning the checkpoint ID to the caller after the write is done.
Changes include:

  • RocksDB loadWithCheckpointId supports loadEmpty
  • StatePartitionAllColumnFamiliesWriter returns StateStoreCheckpointInfo
  • StateRewriter propagates StateStoreCheckpointInfo back to the RepartitionRunner
  • RepartitionRunner stores the checkpoint IDs in commitLog
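
The propagation described in the list above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `CheckpointInfo`, `writePartition`, `rewriteAll`, and the map-backed `commitLog` are hypothetical stand-ins for StateStoreCheckpointInfo, the partition writer, StateRewriter, and the commit log.

```scala
import java.util.UUID
import scala.collection.mutable

object CheckpointPropagationSketch {
  // Hypothetical stand-in for StateStoreCheckpointInfo: only the fields this sketch needs.
  final case class CheckpointInfo(
      partitionId: Int,
      batchVersion: Long,
      checkpointId: Option[String])

  // Hypothetical writer: consumes one partition and returns checkpoint info instead of Unit.
  def writePartition(partitionId: Int, version: Long, rows: Iterator[String]): CheckpointInfo = {
    rows.foreach(_ => ()) // pretend to write every row
    CheckpointInfo(partitionId, version, Some(UUID.randomUUID().toString))
  }

  // Hypothetical rewriter: runs the writer per partition and hands the results to the caller.
  def rewriteAll(version: Long, partitions: Map[Int, Seq[String]]): Seq[CheckpointInfo] =
    partitions.toSeq.sortBy(_._1).map { case (id, rows) =>
      writePartition(id, version, rows.iterator)
    }

  // Hypothetical commit log: batch version -> checkpoint IDs ordered by partition.
  val commitLog: mutable.Map[Long, Seq[String]] = mutable.Map.empty

  def commit(version: Long, infos: Seq[CheckpointInfo]): Unit =
    commitLog(version) = infos.sortBy(_.partitionId).flatMap(_.checkpointId)
}
```

The point of the shape is that each layer returns a value rather than Unit, so the runner at the top has everything it needs to record the checkpoint IDs.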

Why are the changes needed?

This is required in PrPr for the repartition project.

Does this PR introduce any user-facing change?

No

How was this patch tested?

See the added unit tests on both operators, with a single state store and with multiple state stores.

Was this patch authored or co-authored using generative AI tooling?

Yes. Sonnet 4.5

@github-actions

github-actions bot commented Jan 8, 2026

JIRA Issue Information

=== New Feature SPARK-54590 ===
Summary: State Writer supports checkpoint V2
Assignee: None
Status: Open
Affected: ["4.1.0"]


This comment was automatically generated by GitHub Actions

@zifeif2 zifeif2 marked this pull request as ready for review January 8, 2026 01:17
@zifeif2 zifeif2 force-pushed the repartition-cp-v2 branch from 1199413 to 4879f3a Compare January 9, 2026 23:43
@zifeif2 zifeif2 force-pushed the repartition-cp-v2 branch from 4879f3a to 672daff Compare January 9, 2026 23:46
Contributor

@micheal-o micheal-o left a comment

I am stamping this PR so we can move forward, but please let's address the review comments correctly. Thanks.

@micheal-o
Contributor

micheal-o commented Jan 15, 2026

@zifeif2 Also fix the PR title to: Support Checkpoint V2 for State Rewriter and Repartitioning

@zifeif2 zifeif2 changed the title from "[SPARK-54590][SS] Support CheckpointV2 for StatePartitionAllColumnFamiliesWriter" to "[SPARK-54590][SS] Support Checkpoint V2 for State Rewriter and Repartitioning" Jan 15, 2026
Fix logging format in StateRewriter.scala
  try {
-   if (loadedVersion != version || (loadedStateStoreCkptId.isEmpty ||
-       stateStoreCkptId.get != loadedStateStoreCkptId.get)) {
+   if (loadEmpty || loadedVersion != version || loadedStateStoreCkptId.isEmpty ||
Contributor

Why do we need this?

Contributor Author

loadWithCheckpointId in RocksDB needs to support loadEmpty because repartitioning does not read any previous data, so the store has to start out empty.

I put loadEmpty in this if statement along with loadedVersion != version || loadedStateStoreCkptId.isEmpty ||... to reduce duplicate code, but it looks like that makes the condition harder to understand. I can refactor the code to put loadEmpty in its own block.
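
One way such a refactor could look, as a hedged sketch only: the decision is pulled into a hypothetical `decideLoad` helper so that loadEmpty gets its own branch. The parameter names mirror the snippet under review; `LoadAction` and the helper are assumptions, not code from this PR. The reload condition is reconstructed from the pre-change lines, except that the IDs are compared as Options instead of calling `.get`.

```scala
object LoadDecisionSketch {
  sealed trait LoadAction
  case object LoadEmpty extends LoadAction            // start from an empty store, read nothing
  case object ReloadFromCheckpoint extends LoadAction // (re)load the requested version and ID
  case object ReuseLoaded extends LoadAction          // the requested state is already loaded

  def decideLoad(
      loadEmpty: Boolean,
      version: Long,
      stateStoreCkptId: Option[String],
      loadedVersion: Long,
      loadedStateStoreCkptId: Option[String]): LoadAction = {
    if (loadEmpty) {
      // Same precondition as the require check in loadWithCheckpointId.
      require(stateStoreCkptId.isEmpty, "stateStoreCkptId should be empty when loadEmpty is true")
      LoadEmpty
    } else if (loadedVersion != version ||
        loadedStateStoreCkptId.isEmpty ||
        stateStoreCkptId != loadedStateStoreCkptId) {
      ReloadFromCheckpoint
    } else {
      ReuseLoaded
    }
  }
}
```

Separating the branches costs a few duplicated lines at the call site, but each condition then answers one question: "start empty?" first, "is the loaded state stale?" second.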

log"with uniqueId ${MDC(LogKeys.UUID, stateStoreCkptId)}")
if (loadEmpty) {
logInfo(log"Loaded empty store at version ${MDC(LogKeys.VERSION_NUM, version)} " +
log"with uniqueId")
Contributor

Is the unique ID not available here?

Contributor Author

No, we don't expect the caller to pass in a uniqueId when calling loadWithCheckpointId with loadEmpty = true, because no previous version of the data is loaded in that case. We also have a require check above:

require(stateStoreCkptId.isEmpty, "stateStoreCkptId should be empty when loadEmpty is true")

I can change it to a less confusing message.

-   partitionWriter.write(partitionIter)
- }
+   Iterator(partitionWriter.write(partitionIter))
+ }.collect()
Contributor

Why are we calling collect here?

Contributor Author

I thought we needed collect() to make the rewrite actually run and to get back the list of StateStoreCheckpointInfo.
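
The reasoning here rests on lazy evaluation: a Spark transformation such as mapPartitions does no work until an action such as collect() runs. A small sketch of the same idea, assuming no Spark on the classpath, with a plain Scala Iterator standing in for the RDD and a hypothetical `write` function standing in for the partition writer:

```scala
object LazyWriteSketch {
  var writes = 0 // counts how many partitions were actually written

  // Hypothetical writer: returns a made-up ID for the partition it wrote.
  def write(partition: Seq[Int]): String = {
    writes += 1
    s"ckpt-${partition.sum}"
  }

  // Lazy pipeline: mapping over an Iterator does no work until it is consumed,
  // similar in spirit to an RDD transformation such as mapPartitions.
  def plan(partitions: Seq[Seq[Int]]): Iterator[String] =
    partitions.iterator.map(write)
}
```

Consuming the iterator (for example with `toList`) plays the role of collect(): only then do the writes run and the returned values become available to the caller.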

// Since we cleared the local dir, we should also clear the local file mapping
rocksDBFileMapping.clear()
// Set empty metrics since we're not loading any files from DFS
loadCheckpointMetrics = RocksDBFileManagerMetrics.EMPTY_METRICS
Contributor Author

@zifeif2 zifeif2 Jan 15, 2026

Added this line to make sure that loadCheckpointMetrics is set correctly when we load an empty store. cc @anishshri-db
RocksDB runs fileManagerMetrics = fileManager.latestLoadCheckpointMetrics, and latestLoadCheckpointMetrics returns loadCheckpointMetrics.
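
To illustrate why the reset matters, here is a minimal sketch under stated assumptions: `Metrics`, `FileManager`, `loadFromDfs`, and `loadEmptyStore` are hypothetical simplifications, not the real RocksDBFileManager API. Only the getter-returns-field relationship is taken from the comment above.

```scala
object EmptyLoadMetricsSketch {
  // Hypothetical stand-in for RocksDBFileManagerMetrics.
  final case class Metrics(filesCopied: Long, bytesCopied: Long)
  val EmptyMetrics: Metrics = Metrics(0L, 0L)

  class FileManager {
    private var loadCheckpointMetrics: Metrics = EmptyMetrics

    // Getter that the store reads after every load.
    def latestLoadCheckpointMetrics: Metrics = loadCheckpointMetrics

    // Loading from DFS records what was copied (the numbers are supplied by the caller).
    def loadFromDfs(files: Long, bytes: Long): Unit =
      loadCheckpointMetrics = Metrics(files, bytes)

    // Loading an empty store copies nothing, so the metrics are reset explicitly;
    // otherwise the getter would still report the previous load.
    def loadEmptyStore(): Unit =
      loadCheckpointMetrics = EmptyMetrics
  }
}
```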

Contributor

@anishshri-db anishshri-db left a comment

lgtm pending green CI
