Reduce the number of requests in dir_size #3382
This pull request has been linked to Shortcut Story #18944: Size for arrays is often N/A.
@@ -75,9 +75,10 @@ class directory_entry {
    * @param p The path of the entry
    * @param size The size of the filesystem entry
    */
-  directory_entry(const std::string& p, uintmax_t size)
+  directory_entry(const std::string& p, uintmax_t size, bool is_directory)
Add the new parameter to the doc.
Tested this with the arrays mentioned in the SC card. For almost all of them there is a speedup of 3 […]
Spyros is reporting good numbers, but still, it's minutes for a […] I've researched this a bit for S3 (with a bit of luck we might be able to generalize it to other backends) and it seems it's possible to get the […]
Optimize no of requests for dir_size
7df437a to 3ca4a22
* Extend ls_with_sizes to retrieve is_directory metadata
* Optimize no of requests for dir_size
* address comment from @-KiterLuc, fix hdfs failing build
Extend `ls_with_sizes` to get `is_dir`/`is_file` metadata info within the same request and use the extra info to get rid of the recursive `vfs::is_file` and `vfs::file_size` requests.

Note for reviewers: […] `is_dir` sanity check. […] `dir_size(file) = file_size(file)`, which is similar to what `du` does on unix, we can get rid of the extra request which we pay for all `dir_size` calls.

TYPE: IMPROVEMENT
DESC: Reduce the number of requests in dir_size
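
To illustrate why carrying `is_directory` in the listing result saves requests, here is a minimal sketch. It is not the code from this PR: `MockVfs`, `DirectoryEntry`, `dir_size_naive` and the request counter are hypothetical stand-ins, and the backend is simulated with an in-memory map of flat keys.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the entry type in the diff: path, size, and the
// new is_directory flag.
struct DirectoryEntry {
  std::string path;
  std::uintmax_t size;
  bool is_directory;
};

// Hypothetical in-memory backend. Every public call that would hit remote
// storage increments a request counter.
class MockVfs {
 public:
  void add_file(const std::string& path, std::uintmax_t size) {
    files_[path] = size;
  }

  // One request: child paths only, with no metadata.
  std::vector<std::string> ls(const std::string& dir) {
    ++requests_;
    std::vector<std::string> out;
    for (const auto& e : children(dir)) out.push_back(e.path);
    return out;
  }

  // One request: child paths together with size and is_directory.
  std::vector<DirectoryEntry> ls_with_sizes(const std::string& dir) {
    ++requests_;
    return children(dir);
  }

  bool is_file(const std::string& path) {
    ++requests_;
    return files_.count(path) > 0;
  }

  std::uintmax_t file_size(const std::string& path) {
    ++requests_;
    return files_.at(path);
  }

  int requests() const { return requests_; }
  void reset_requests() { requests_ = 0; }

 private:
  // Direct children of `dir`, derived from the flat key space. Keys sharing
  // a prefix are contiguous in std::map, so a repeated subdirectory can be
  // detected by looking at the last emitted entry.
  std::vector<DirectoryEntry> children(const std::string& dir) const {
    const std::string prefix = dir + "/";
    std::vector<DirectoryEntry> out;
    for (const auto& kv : files_) {
      const std::string& path = kv.first;
      if (path.compare(0, prefix.size(), prefix) != 0) continue;
      const auto slash = path.find('/', prefix.size());
      if (slash == std::string::npos) {
        out.push_back({path, kv.second, false});
      } else {
        const std::string sub = path.substr(0, slash);
        if (out.empty() || out.back().path != sub)
          out.push_back({sub, 0, true});
      }
    }
    return out;
  }

  std::map<std::string, std::uintmax_t> files_;
  int requests_ = 0;
};

// Per-child metadata lookups: is_file, then file_size or a recursive listing.
std::uintmax_t dir_size_naive(MockVfs& vfs, const std::string& dir) {
  std::uintmax_t total = 0;
  for (const auto& child : vfs.ls(dir)) {
    total += vfs.is_file(child) ? vfs.file_size(child)
                                : dir_size_naive(vfs, child);
  }
  return total;
}

// One listing per directory: size and is_directory come with each entry.
std::uintmax_t dir_size(MockVfs& vfs, const std::string& dir) {
  std::uintmax_t total = 0;
  for (const auto& e : vfs.ls_with_sizes(dir)) {
    total += e.is_directory ? dir_size(vfs, e.path) : e.size;
  }
  return total;
}
```

In this mock, the naive variant pays one listing per directory plus one or two metadata requests per child, while the second variant pays only one listing per directory. How that translates to real backends depends on what each listing call can actually return.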