Skip to content

RATIS-1723. CounterStateMachine should update the latest snapshot.#765

Merged
codings-dan merged 3 commits intoapache:masterfrom
szetszwo:RATIS-1723
Oct 24, 2022
Merged

RATIS-1723. CounterStateMachine should update the latest snapshot.#765
codings-dan merged 3 commits intoapache:masterfrom
szetszwo:RATIS-1723

Conversation

@szetszwo
Copy link
Copy Markdown
Contributor

private volatile File stateMachineDir = null;

private volatile SingleFileSnapshotInfo currentSnapshot = null;
private final AtomicReference<SnapshotInfo> latestSnapshot = new AtomicReference<>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use AtomicReference<SingleFileSnapshotInfo> here, we don't have to cast in the caller site if we're sure that the snapshot storage is SimpleStateMachineStorage.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea. Thanks!

}

// update storage
final MD5Hash md5 = MD5FileUtil.computeAndSaveMd5ForFile(snapshotFile);
Copy link
Copy Markdown
Member

@tisonkun tisonkun Oct 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reader of the saved md5 file? Does the file have an owner that remove it on exiting/snapshot cleanup?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should verify the md5 when loading the snapshot.

The md5 file should be removed when the snapshot file is removed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be removed

Can you show me the code path to implement this clean-up logic? I don't find it :(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we don't have clean-up logic implemented in the examples.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tisonkun , actually, the old snapshots are deleted in StateMachineUpdater.takeSnapshot() according to the SnapshotRetentionPolicy. It should also delete the md5 files.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I saw aabe571 now :)

Copy link
Copy Markdown
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I ask some questions about cleaning up MD5 files because I ever saw Alibaba's celeborn project write its own clean up logics to do so and I wonder if it can be some code in the upstream :)

Reference: https://github.com/alibaba/celeborn/blob/bff2a7065b5b707856b9de93f6307b2b63098b52/master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/ha/StateMachine.java#L73-L156

@szetszwo
Copy link
Copy Markdown
Contributor Author

... Alibaba's celeborn project write its own clean up logics ...

Thanks for the info! Good to know.

@szetszwo szetszwo requested a review from codings-dan October 19, 2022 06:11
Copy link
Copy Markdown
Contributor

@codings-dan codings-dan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change looks good, +1

@codings-dan codings-dan merged commit b8117f4 into apache:master Oct 24, 2022
@szetszwo
Copy link
Copy Markdown
Contributor Author

@codings-dan , thanks a lot for reviewing this!

szetszwo added a commit that referenced this pull request Mar 5, 2023
symious pushed a commit to symious/ratis that referenced this pull request Mar 12, 2024
…pache#765)

* RATIS-1723. CounterStateMachine should update the latest snapshot.

* Address review comments and fix bugs.

* delete md5 file
@szetszwo szetszwo deleted the RATIS-1723 branch May 26, 2025 00:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants