New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement rdb snapshot write directly into s3. #1205
Conversation
src/server/server_family.cc
Outdated
} | ||
|
||
// Returns bucket_name, obj_path for an s3 path. | ||
optional<pair<string_view, string_view>> GetBucketPath(string_view path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this is an internal function (side note: maybe wrap it in an anonymous namespace
?), I'd define a struct for its return type, i.e. struct AwsData { string_view bucket_name; string_view obj_path; };
BTW consider using ABSL_ATTRIBUTE_LIFETIME_BOUND
(see here) to indicate that the returned value cannot surpass the input's lifetime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- yes, it's already anynymous.
- Oppah, welcome to crust++! Gladly, but I did not quite understand where I should put it to mark a return value as bound to the parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be right after the argument, as in:
optional<pair<string_view, string_view>> GetBucketPath(string_view path ABSL_ATTRIBUTE_LIFETIME_BOUND)
However I'm not sure, it's fairly new and I have not yet used it myself. Does it work as such?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it works but I did not like how much visual clutter it adds. Decided to just return strings.
src/server/server_family.cc
Outdated
error_code file_ec; | ||
data_folder = fs::canonical(dir, file_ec); | ||
if (file_ec) { | ||
LOG(ERROR) << "Data directory error: " << file_ec.message(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider including dir
in the LOG()
src/server/server_family.cc
Outdated
} | ||
|
||
fs::path fl_path{obj_path}; | ||
fl_path /= dbname; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'm not a big fan of using operator /=
for appending paths (I'd use .append()
instead), but that's very much a matter of preference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hh, np, changed.
@@ -204,9 +269,16 @@ class RdbSnapshot { | |||
return started_ || (saver_ && saver_->Mode() == SaveMode::SUMMARY); | |||
} | |||
|
|||
void SetAWS(cloud::AWS* aws) { | |||
aws_ = aws; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is a non-owning pointer, perhaps add a documentation saying such?
src/server/server_family.cc
Outdated
@@ -279,7 +374,9 @@ void ExtendDfsFilenameWithShard(int shard, fs::path* filename) { | |||
} | |||
|
|||
GenericError ValidateFilename(const fs::path& filename, bool new_version) { | |||
if (!filename.parent_path().empty()) { | |||
bool has_cloud_prefix = absl::StartsWith(filename.c_str(), "s3://"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use IsCloudPath()
here?
src/server/server_family.cc
Outdated
data_folder = fs::canonical(dir, file_ec); | ||
string flag_dir = GetFlag(FLAGS_dir); | ||
if (IsCloudPath(flag_dir)) { | ||
aws_.reset(new cloud::AWS("s3")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's safer to use std::make_unique<cloud::AWS>("s3")
instead of .reset()
. Generally, calling reset()
and unique_ptr
's c'tor directly is considered bad practice.
I'm still new to GitHub, but I don't see the updated code? Am I doing something wrong? |
1. Load flow reorganized - most of the logic is now in InferLoadFile function. S3 read is not yet supported. 2. Write path is implemented. Specifically, you can use undocumented (by design) option to save like: `SAVE rdb s3://bucket/path/file`. 3. When using `--dir=s3://bucket/path/` it also saves into s3 on shutdown. Signed-off-by: Roman Gershman <roman@dragonflydb.io>
You should see the new diff here: https://github.com/dragonflydb/dragonfly/pull/1205/files |
1. Load flow reorganized - most of the logic is now in InferLoadFile function. S3 read is not yet supported. 2. Write path is implemented. Specifically, you can use undocumented (by design) option to save like: `SAVE rdb s3://bucket/path/file`. 3. When using `--dir=s3://bucket/path/` it also saves into s3 on shutdown. Signed-off-by: Roman Gershman <roman@dragonflydb.io>
SAVE rdb s3://bucket/path/file
.--dir=s3://bucket/path/
it also saves into s3 on shutdown.