Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement rdb snapshot write directly into s3. #1205

Merged
merged 1 commit into from May 13, 2023
Merged

feat: Implement rdb snapshot write directly into s3. #1205

merged 1 commit into from May 13, 2023

Conversation

romange
Copy link
Collaborator

@romange romange commented May 11, 2023

  1. Load flow reorganized - most of the logic is now in InferLoadFile function. S3 read is not yet supported.
  2. Write path is implemented. Specifically, you can use undocumented (by design) option to save like: SAVE rdb s3://bucket/path/file.
  3. When using --dir=s3://bucket/path/ it also saves into s3 on shutdown.

@romange romange requested a review from chakaz May 11, 2023 04:57
src/server/server_family.cc Outdated Show resolved Hide resolved
src/server/server_family.h Show resolved Hide resolved
}

// Returns bucket_name, obj_path for an s3 path.
optional<pair<string_view, string_view>> GetBucketPath(string_view path) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is an internal function (side note: maybe wrap it in an anonymous namespace?), I'd define a struct for its return type, i.e. struct AwsData { string_view bucket_name; string_view obj_path; };

BTW consider using ABSL_ATTRIBUTE_LIFETIME_BOUND (see here) to indicate that the returned value cannot surpass the input's lifetime

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. yes, it's already anynymous.
  2. Oppah, welcome to crust++! Gladly, but I did not quite understand where I should put it to mark a return value as bound to the parameter.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be right after the argument, as in:

optional<pair<string_view, string_view>> GetBucketPath(string_view path ABSL_ATTRIBUTE_LIFETIME_BOUND)

However I'm not sure, it's fairly new and I have not yet used it myself. Does it work as such?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it works but I did not like how much visual clutter it adds. Decided to just return strings.

error_code file_ec;
data_folder = fs::canonical(dir, file_ec);
if (file_ec) {
LOG(ERROR) << "Data directory error: " << file_ec.message();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider including dir in the LOG()

}

fs::path fl_path{obj_path};
fl_path /= dbname;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'm not a big fan of using operator /= for appending paths (I'd use .append() instead), but that's very much a matter of preference

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hh, np, changed.

@@ -204,9 +269,16 @@ class RdbSnapshot {
return started_ || (saver_ && saver_->Mode() == SaveMode::SUMMARY);
}

void SetAWS(cloud::AWS* aws) {
aws_ = aws;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a non-owning pointer, perhaps add a documentation saying such?

@@ -279,7 +374,9 @@ void ExtendDfsFilenameWithShard(int shard, fs::path* filename) {
}

GenericError ValidateFilename(const fs::path& filename, bool new_version) {
if (!filename.parent_path().empty()) {
bool has_cloud_prefix = absl::StartsWith(filename.c_str(), "s3://");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use IsCloudPath() here?

data_folder = fs::canonical(dir, file_ec);
string flag_dir = GetFlag(FLAGS_dir);
if (IsCloudPath(flag_dir)) {
aws_.reset(new cloud::AWS("s3"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's safer to use std::make_unique<cloud::AWS>("s3") instead of .reset(). Generally, calling reset() and unique_ptr's c'tor directly is considered bad practice.

src/server/server_family.cc Outdated Show resolved Hide resolved
@chakaz
Copy link
Collaborator

chakaz commented May 12, 2023

I'm still new to GitHub, but I don't see the updated code? Am I doing something wrong?

1. Load flow reorganized - most of the logic is now in InferLoadFile function.
   S3 read is not yet supported.
2. Write path is implemented. Specifically, you can use undocumented (by design) option to save like:
   `SAVE rdb s3://bucket/path/file`.
3. When using `--dir=s3://bucket/path/` it also saves into s3 on shutdown.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
@romange
Copy link
Collaborator Author

romange commented May 12, 2023

You should see the new diff here: https://github.com/dragonflydb/dragonfly/pull/1205/files
for example, I can see I return pair<string, string> in there

@romange romange merged commit 4e96c56 into main May 13, 2023
6 checks passed
@romange romange deleted the S3Write branch May 13, 2023 03:44
romange added a commit that referenced this pull request Jun 1, 2023
1. Load flow reorganized - most of the logic is now in InferLoadFile function.
   S3 read is not yet supported.
2. Write path is implemented. Specifically, you can use undocumented (by design) option to save like:
   `SAVE rdb s3://bucket/path/file`.
3. When using `--dir=s3://bucket/path/` it also saves into s3 on shutdown.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants