Make checkpointing pluggable #161

Merged: 10 commits, Jul 26, 2019

badrishc (Contributor)

The goal of this PR is to make FASTER C# checkpointing pluggable: a user-specified interface provides the devices (such as those for the index and the snapshot) and performs the atomic metadata commit. The existing implementation, which is hard-coded to use LocalStorageDevice and C# Streams, will be refactored into a reference implementation of that interface.

This will allow users to provide plugins that might, for example, (1) write checkpoints to remote devices such as Azure page blobs; (2) use a DBMS to commit and store the checkpoint metadata; or (3) act as custom checkpoint adapters that, for example, encrypt the checkpoint data/metadata or add checksums to it.
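As an illustration of case (3), the sketch below shows one way an adapter could add a checksum to checkpoint metadata before it is committed and verify it on read. It is not part of this PR: the class name `ChecksummedMetadata` and its `Wrap`/`Unwrap` methods are hypothetical, and the framing (a SHA-256 hash prepended to the payload) is just one assumed layout.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper (not a FASTER API): frames a metadata payload as
// [32-byte SHA-256 hash][payload] so corruption can be detected on read.
public static class ChecksummedMetadata
{
    private const int HashSize = 32; // SHA-256 produces 32 bytes

    // Returns a new array holding the hash followed by the payload.
    public static byte[] Wrap(byte[] payload)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(payload);
            var framed = new byte[HashSize + payload.Length];
            Buffer.BlockCopy(hash, 0, framed, 0, HashSize);
            Buffer.BlockCopy(payload, 0, framed, HashSize, payload.Length);
            return framed;
        }
    }

    // Returns the payload if the stored hash matches, and null otherwise.
    public static byte[] Unwrap(byte[] framed)
    {
        if (framed == null || framed.Length < HashSize) return null;

        var payload = new byte[framed.Length - HashSize];
        Buffer.BlockCopy(framed, HashSize, payload, 0, payload.Length);

        using (var sha = SHA256.Create())
        {
            byte[] expected = sha.ComputeHash(payload);
            return framed.Take(HashSize).SequenceEqual(expected) ? payload : null;
        }
    }
}
```

A wrapping checkpoint manager could call `Wrap` before delegating a commit to an inner manager and `Unwrap` after reading metadata back; returning null on a mismatch lines up with the "null otherwise" convention used for invalid checkpoints in this PR.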

badrishc added the enhancement and work in progress labels on Jul 23, 2019

badrishc (Contributor Author)

Fixes #148

/// </summary>
/// <param name="logToken"></param>
/// <returns>Commit info, if valid checkpoint found, and null otherwise</returns>
byte[] GetLogCommitMetadata(Guid logToken);
Contributor

It's unclear from the interface whether this needs to be implemented for fold-over checkpoints.

Contributor Author

Yes, it needs to be implemented for all log checkpoints (fold-over and snapshot). In the fold-over case, the metadata records the status of the log as of the checkpoint, so that recovery can resume from the correct log location (e.g., when more uncheckpointed activity occurred on the log before the crash).
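A minimal sketch of that idea follows, assuming an in-memory store. Only the `GetLogCommitMetadata(Guid)` signature is taken from the diff above; the `CommitLogCheckpoint(Guid, byte[])` signature, the class names, and the choice to encode a single 8-byte "checkpointed tail address" as the metadata are all assumptions made for illustration, not FASTER's actual metadata format.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory metadata store; a real plugin would persist this
// durably (file, blob, DBMS) and commit it atomically.
public class InMemoryLogCommitStore
{
    private readonly Dictionary<Guid, byte[]> commits = new Dictionary<Guid, byte[]>();

    // Assumed signature: stores the metadata for a completed log checkpoint.
    public void CommitLogCheckpoint(Guid logToken, byte[] commitMetadata)
    {
        commits[logToken] = (byte[])commitMetadata.Clone();
    }

    // Signature from the PR: commit info if a valid checkpoint exists, else null.
    public byte[] GetLogCommitMetadata(Guid logToken)
    {
        return commits.TryGetValue(logToken, out var metadata) ? metadata : null;
    }
}

public static class FoldOverRecoveryDemo
{
    // Decodes the (assumed) checkpointed tail address from the metadata.
    // Anything written to the log beyond this address was not covered by the
    // checkpoint, so recovery resumes here rather than at the on-disk tail.
    public static long ReadResumeAddress(InMemoryLogCommitStore store, Guid logToken)
    {
        byte[] metadata = store.GetLogCommitMetadata(logToken);
        if (metadata == null)
            throw new InvalidOperationException("No valid log checkpoint for this token.");
        return BitConverter.ToInt64(metadata, 0);
    }
}
```

For example, if a fold-over checkpoint committed `BitConverter.GetBytes(4096L)` and the process then appended more records before crashing, `ReadResumeAddress` would still yield 4096, which is why the metadata is needed even when no snapshot file is written.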

/// <param name="indexToken"></param>
/// <param name="logToken"></param>
/// <returns>true if latest valid checkpoint found, false otherwise</returns>
bool GetLatestCheckpoint(out Guid indexToken, out Guid logToken);
Contributor

It's unclear from the interface how this works for fold-over checkpoints.

Contributor Author

Every full checkpoint, whether fold-over or snapshot, consists of a Guid for the index and a Guid for the log. The implementer of GetLatestCheckpoint simply returns these two Guids. For a "full checkpoint" (index+log), the two Guids will usually be the same.
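One way an implementer might satisfy this, sketched under assumptions: the class below simply remembers the most recently committed index and log tokens. `LatestCheckpointTracker` is a hypothetical name, and the `CommitIndexCheckpoint`/`CommitLogCheckpoint` signatures are assumed; only the `GetLatestCheckpoint` signature comes from the diff above. A persistent implementation would derive the tokens from durable state instead of fields.

```csharp
using System;

// Hypothetical tracker: remembers the last committed tokens in memory.
public class LatestCheckpointTracker
{
    private Guid? latestIndexToken;
    private Guid? latestLogToken;

    // Assumed signature; the metadata is ignored in this sketch.
    public void CommitIndexCheckpoint(Guid indexToken, byte[] commitMetadata)
    {
        latestIndexToken = indexToken;
    }

    // Assumed signature; the metadata is ignored in this sketch.
    public void CommitLogCheckpoint(Guid logToken, byte[] commitMetadata)
    {
        latestLogToken = logToken;
    }

    // Signature from the PR: true if a latest valid checkpoint was found.
    // Here "valid" is taken to mean that both an index and a log commit exist.
    public bool GetLatestCheckpoint(out Guid indexToken, out Guid logToken)
    {
        indexToken = latestIndexToken ?? Guid.Empty;
        logToken = latestLogToken ?? Guid.Empty;
        return latestIndexToken.HasValue && latestLogToken.HasValue;
    }
}
```

After a full checkpoint committed under a single token, both out parameters carry that same Guid; if an index-only or log-only checkpoint is taken later, the two values can differ, which is why the method returns them separately.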

peterfreiling (Contributor) left a comment

:shipit:

badrishc (Contributor Author)

FASTER calls the checkpoint manager interface during checkpoint/recovery in this sequence:

Checkpoint:

  • InitializeIndexCheckpoint (for index checkpoints) ->
  • GetIndexDevice (for index checkpoints) ->
  • InitializeLogCheckpoint (for log checkpoints) ->
  • GetSnapshotLogDevice (for log checkpoints in snapshot mode) ->
  • GetSnapshotObjectLogDevice (for log checkpoints in snapshot mode with objects) ->
  • CommitLogCheckpoint (for log checkpoints) ->
  • CommitIndexCheckpoint (for index checkpoints)

Recovery:

  • GetLatestCheckpoint (if recovery to the latest checkpoint is requested) ->
  • GetIndexCommitMetadata ->
  • GetLogCommitMetadata ->
  • GetIndexDevice ->
  • GetSnapshotLogDevice (for recovery in snapshot mode) ->
  • GetSnapshotObjectLogDevice (for recovery in snapshot mode with objects)

Provided devices will be closed directly by FASTER when done.
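The two sequences above can be written down as a small helper, which may be handy when testing a custom manager against the expected call order. This is only a restatement of the lists in code: `CheckpointCallOrder` and its parameters are hypothetical, and it does not call into FASTER.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: lists the checkpoint manager methods in the order
// described above, depending on which kind of checkpoint/recovery is run.
public static class CheckpointCallOrder
{
    public static List<string> ForCheckpoint(bool index, bool log, bool snapshot, bool hasObjects)
    {
        var calls = new List<string>();
        if (index)
        {
            calls.Add("InitializeIndexCheckpoint");
            calls.Add("GetIndexDevice");
        }
        if (log)
        {
            calls.Add("InitializeLogCheckpoint");
            if (snapshot)
            {
                // Snapshot devices are only requested in snapshot mode.
                calls.Add("GetSnapshotLogDevice");
                if (hasObjects) calls.Add("GetSnapshotObjectLogDevice");
            }
            calls.Add("CommitLogCheckpoint");
        }
        if (index) calls.Add("CommitIndexCheckpoint");
        return calls;
    }

    public static List<string> ForRecovery(bool toLatest, bool snapshot, bool hasObjects)
    {
        var calls = new List<string>();
        if (toLatest) calls.Add("GetLatestCheckpoint");
        calls.Add("GetIndexCommitMetadata");
        calls.Add("GetLogCommitMetadata");
        calls.Add("GetIndexDevice");
        if (snapshot)
        {
            calls.Add("GetSnapshotLogDevice");
            if (hasObjects) calls.Add("GetSnapshotObjectLogDevice");
        }
        return calls;
    }
}
```

A test double that appends its own method name to a list on every call could then be compared against `ForCheckpoint(...)` or `ForRecovery(...)` for the scenario under test.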

badrishc merged commit c617beb into master on Jul 26, 2019
badrishc deleted the chkpt-device branch on September 7, 2020 02:43