-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-14916: [C++] GcsFileSystem can delete directories #11890
Conversation
7fea27a
to
df10c1f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, just a couple more suggestions.
cpp/src/arrow/filesystem/gcsfs.cc
Outdated
|
||
auto async_delete = | ||
[&p](gcs::Client& client, | ||
google::cloud::StatusOr<gcs::ObjectMetadata> o) -> google::cloud::Status { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two suggestions:
- just capture the client as a closure parameter
- return an arrow
Status
here, such that you can useAllFinished(const std::vector<Future>&)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I captured this
because capturing client_
is not possible in C++11.
cpp/src/arrow/filesystem/gcsfs.cc
Outdated
// This iterates over all the objects, and schedules parallel deletes. | ||
auto prefix = p.object.empty() ? gcs::Prefix() : gcs::Prefix(p.object); | ||
for (auto& o : client_.ListObjects(p.bucket, prefix)) { | ||
submitted.push_back( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By using DeferNotOk
(which turns a Result<Future<T>>
into a Future<T>
), you could instead append to a std::vector<Future>
, which will simplify the collection code below.
For example (untested):
std::vector<Future> submitted;
// This iterates over all the objects, and schedules parallel deletes.
auto prefix = p.object.empty() ? gcs::Prefix() : gcs::Prefix(p.object);
for (auto& o : client_.ListObjects(p.bucket, prefix)) {
submitted.push_back(DeferNotOk(
io_context.executor()->Submit(async_delete, std::ref(client_), std::move(o))));
}
return AllFinished(submitted).status();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat, thanks.
arrow::fs::AssertFileInfo(fs.get(), filename, FileType::NotFound); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you test that other paths in the bucket still exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Benchmark runs are scheduled for baseline = 3ef4032 and contender = fdc344b. fdc344b is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
No description provided.