-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Snapshot store #3643
Snapshot store #3643
Conversation
lib/storage/src/types.rs
Outdated
@@ -53,7 +53,7 @@ pub struct StorageConfig { | |||
pub storage_path: String, | |||
#[serde(default = "default_snapshots_path")] | |||
#[validate(length(min = 1))] | |||
pub snapshots_path: String, | |||
pub snapshots_path: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is needed because in the future there could be either snapshot_path for local storage either some details for the storage back end. It would have been cleaner with an enum in here but I believe it would break backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:check:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@generall an enum would mean to also have a custom deserialization implementation and everything, overall I felt that it was cleaner how I implemented it here, but for reference this is how it would look like AlexandruCihodaru@887d426
lib/collection/src/collection/mod.rs
Outdated
snapshots_path: snapshots_path.to_owned(), | ||
snapshot_storage_manager: Self::create_snapshot_storage_manager(snapshots_path), | ||
channel_service, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that in the future snapshots_path could be remove so we will avoid info duplication, the path can be retrieved from the snapshot_storage_manager.
lib/collection/src/collection/mod.rs
Outdated
@@ -50,6 +51,7 @@ pub struct Collection { | |||
this_peer_id: PeerId, | |||
path: PathBuf, | |||
snapshots_path: PathBuf, | |||
snapshot_storage_manager: Box<dyn SnapshotStorage>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be static dispatch by making Collection generic over SnapshotStorage. We could switch to it if snapshotting happens often because it is cheaper.
b662a4f
to
965a818
Compare
Hey @generall, I have addressed your comments. In a subsequent PR we should also move the get_snapshot_description and get_snapshot/checksum_path into the trait because we will have different implementations for the LocalFS and S3. |
* Define a trait with methods for managing snapshots after their creation. The goal of this trait is to allow implementation of snapshot management logic for different storage backends without major changes in the code. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com>
* Move the snapshot deletion logic into the implementation of SnapshotStorage for LocalFileSystemConfig. * Replace snapshot deletion logic with calls to the SnapshotStorage implementation. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com>
* Move the snapshot listing logic into the implementation of SnapshotStorage for LocalFileSystemConfig. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com>
* Move the snapshot creation logic into the implementation of SnapshotStorage for LocalFileSystemConfig. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com>
965a818
to
48fc10c
Compare
Hey @generall I was wondering if the desired outcome is to use S3 like a "persistent" storage or to use it for any snapshot related actions. If we targeting the first option than I think it is possible to bring locally from S3 the snapshot dir before initializing the TableOfContent in here
|
/tip $200 |
@AlexandruCihodaru: You just got a $200 tip! 👉 Complete your Algora onboarding to collect your payment. |
🎉🎈 @AlexandruCihodaru has been awarded $200! 🎈🎊 |
* [snapshots]: Define trait for snapshots managemet * Define a trait with methods for managing snapshots after their creation. The goal of this trait is to allow implementation of snapshot management logic for different storage backends without major changes in the code. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com> * [snapshots]: Move snapshot deletion logic * Move the snapshot deletion logic into the implementation of SnapshotStorage for LocalFileSystemConfig. * Replace snapshot deletion logic with calls to the SnapshotStorage implementation. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com> * [snapshots]: Move snapshot list logic * Move the snapshot listing logic into the implementation of SnapshotStorage for LocalFileSystemConfig. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com> * [snapshots]: Move snapshot creation logic * Move the snapshot creation logic into the implementation of SnapshotStorage for LocalFileSystemConfig. Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com> * review fixes * fmt * refactor full storage snapshot * create directory on get_stored_file + make sure temp archive file is removed --------- Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com> Co-authored-by: generall <andrey@vasnetsov.com>
This PR is a first step in an attempt to fix #3324.
To be able to add S3 support and other storage back ends later I defined a trait that will contain all needed methods to handle created snapshots. I took a simplifying assumption that the snapshots will first be created in a temp local place and after the snapshotting is completed we will move the snapshot and checksum.
After S3 support is added, it should be straight forward to implement support for additional storage back ends, the only needed action being implementing the trait for the new type.
To keep the PR small and easy to review, currently it only handles create_snapshot,delete,list (collection/full/shard). If the direction is good others could be implemented in subsequent PRs without breaking compatibility.
All Submissions:
dev
branch. Did you create your branch fromdev
?New Feature Submissions:
cargo +nightly fmt --all
command prior to submission?cargo clippy --all --all-features
command?Changes to Core Features: