feat: [MR-581] Add functionalities of creating and removing unverified checkpoint markers #657

Merged: 24 commits, Aug 14, 2024

@ShuoWangNSL ShuoWangNSL commented Jul 29, 2024

This PR adds a marker file whenever a tip or a state sync scratchpad is promoted to an official checkpoint. This marker file is only removed once the checkpoint has been successfully loaded.

If the loading fails, the replica will crash, leaving the marker file within the checkpoint directory.  Upon restarting, any checkpoints containing a marker file will be archived and ignored, preventing the system from entering a continuous crash loop due to repeatedly attempting to load a corrupt checkpoint.
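The marker lifecycle described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: the file name `unverified_checkpoint_marker`, the free functions, and the archive directory layout are all assumptions made for the example.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical marker file name; the name used in the PR may differ.
const UNVERIFIED_MARKER: &str = "unverified_checkpoint_marker";

/// Written when a tip or state sync scratchpad is promoted to a checkpoint.
pub fn create_unverified_marker(checkpoint_dir: &Path) -> io::Result<()> {
    let marker = File::create(checkpoint_dir.join(UNVERIFIED_MARKER))?;
    // Flush the empty file so the marker is more likely to survive a crash.
    // (A fully durable version would also sync the parent directory.)
    marker.sync_all()
}

/// Called only after the checkpoint has been loaded successfully.
pub fn remove_unverified_marker(checkpoint_dir: &Path) -> io::Result<()> {
    match fs::remove_file(checkpoint_dir.join(UNVERIFIED_MARKER)) {
        // Removing an already-removed marker is treated as success.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        other => other,
    }
}

/// A checkpoint is unverified while its marker file still exists.
pub fn is_unverified(checkpoint_dir: &Path) -> bool {
    checkpoint_dir.join(UNVERIFIED_MARKER).exists()
}

/// On startup: move every checkpoint that still carries a marker into
/// `archive_root` so it is never loaded again. Returns the archived paths.
pub fn archive_unverified_checkpoints(
    checkpoints_root: &Path,
    archive_root: &Path,
) -> io::Result<Vec<PathBuf>> {
    fs::create_dir_all(archive_root)?;

    // Collect first, then rename, so the directory is not modified
    // while it is being iterated.
    let mut flagged = Vec::new();
    for entry in fs::read_dir(checkpoints_root)? {
        let path = entry?.path();
        if path.is_dir() && is_unverified(&path) {
            flagged.push(path);
        }
    }
    flagged.sort();

    let mut archived = Vec::new();
    for path in flagged {
        if let Some(name) = path.file_name() {
            let target = archive_root.join(name);
            fs::rename(&path, &target)?;
            archived.push(target);
        }
    }
    Ok(archived)
}
```

In this sketch a crash between `create_unverified_marker` and `remove_unverified_marker` leaves the marker behind, and the next call to `archive_unverified_checkpoints` moves that checkpoint out of the way instead of trying to load it again.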

With the marker file, we no longer need to load the synced checkpoint twice at the end of state sync.

Writing the marker file is limited to CheckpointLayout with WritePolicy.
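One way to express this restriction at the type level is to put the marker-writing method in an `impl` block bounded by the write policy. The sketch below uses simplified stand-ins: the `ReadOnly` and `ReadWrite` types, the trait definitions, the `new` constructor, and the marker file name are assumptions for illustration and are not the actual definitions in `rs/state_layout`.

```rust
use std::fs::File;
use std::io;
use std::marker::PhantomData;
use std::path::PathBuf;

/// Hypothetical marker file name; the name used in the PR may differ.
const UNVERIFIED_MARKER: &str = "unverified_checkpoint_marker";

// Simplified stand-in policy traits and marker types.
pub trait ReadPolicy {}
pub trait WritePolicy {}

pub struct ReadOnly;
pub struct ReadWrite;

impl ReadPolicy for ReadOnly {}
impl ReadPolicy for ReadWrite {}
impl WritePolicy for ReadWrite {}

/// A checkpoint directory tagged with an access policy `P`.
pub struct CheckpointLayout<P> {
    root: PathBuf,
    _policy: PhantomData<P>,
}

// Available for every layout that can at least be read.
impl<P: ReadPolicy> CheckpointLayout<P> {
    pub fn new(root: PathBuf) -> Self {
        Self { root, _policy: PhantomData }
    }

    pub fn is_unverified(&self) -> bool {
        self.root.join(UNVERIFIED_MARKER).exists()
    }
}

// Only available when the policy permits writes.
impl<P: WritePolicy> CheckpointLayout<P> {
    pub fn create_unverified_marker(&self) -> io::Result<()> {
        File::create(self.root.join(UNVERIFIED_MARKER))?.sync_all()
    }
}
```

With this shape, calling `create_unverified_marker` on a `CheckpointLayout<ReadOnly>` is rejected at compile time, while `is_unverified` stays available to both policies.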

@ShuoWangNSL ShuoWangNSL changed the title Add functionaries for creating and removing unverified checkpoint markers feat: [MR-581] Add functionalities of creating and removing unverified checkpoint markers Jul 29, 2024
@github-actions github-actions bot added the feat label Jul 29, 2024
@ShuoWangNSL ShuoWangNSL requested a review from a team as a code owner August 6, 2024 03:23
@ShuoWangNSL ShuoWangNSL added this pull request to the merge queue Aug 14, 2024
Merged via the queue into master with commit 6968299 Aug 14, 2024
23 checks passed
@ShuoWangNSL ShuoWangNSL deleted the shuo/unverified_marker branch August 14, 2024 22:24
levifeldman pushed a commit to levifeldman/ic that referenced this pull request Oct 1, 2024
…d checkpoint markers (dfinity#657)

Co-authored-by: Stefan Schneider <31004026+schneiderstefan@users.noreply.github.com>