storage API: Allow wildcard in object path to allow retrieval of several objects #4154
@halfdanrump
@sagarrakshe You're right, I hadn't noticed that parameter. In my case this is sufficient, so thanks for telling me about it! :) As you also point out, it's a partial solution. Actually, it might be handy to even allow regex matches on the filenames @lukesneeringer. I'm not familiar with the code, so I don't know how difficult this would be to implement. Do you think it would be worth the effort? Cheers,
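The regex idea above can be done entirely on the application side today. A minimal sketch, using a plain list of object names to stand in for the result of listing a bucket (no real GCS calls; the sample names are made up):

```python
import re

def filter_names(names, pattern):
    """Return only the names whose full path matches the regex."""
    rx = re.compile(pattern)
    return [n for n in names if rx.fullmatch(n)]

names = [
    "logs/2020-01-01.csv",
    "logs/2020-01-02.csv",
    "images/cat.png",
]
print(filter_names(names, r"logs/.*\.csv"))
# → ['logs/2020-01-01.csv', 'logs/2020-01-02.csv']
```

In real code, `names` would come from iterating the blobs returned by the list call and taking each blob's `name` attribute.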
The list objects API doesn't allow any wildcard parameter apart from prefix. So we need to add a parameter to
@sagarrakshe As you note, the back-end doesn't provide such access, so what we are discussing is really a convenience wrapper for application code which would otherwise be something like:

```python
import fnmatch

for blob in bucket.list_blobs():
    if fnmatch.fnmatch(blob.name, "*something"):
        do_something_with(blob)
```
Nice. So can we close this issue? @tseaver
I'll close it, as there isn't much we can do to improve on application-level processing. |
The fnmatch lib works, but the filtering process is slow. I don't know how gsutil handles this problem so effectively.
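One likely reason gsutil feels faster is that it narrows the listing server-side using the literal part of the glob as a prefix, so only a fraction of the bucket is ever transferred and matched client-side. A self-contained sketch of that idea, with a plain list of names standing in for the bucket listing (the prefix filter below is what the `prefix` parameter would do server-side in the real API; this is an illustration, not gsutil's actual implementation):

```python
import fnmatch

def match_with_prefix(names, pattern):
    """Split a glob into a literal prefix plus the wildcard tail,
    filter by the prefix first (cheap, and server-side in real GCS),
    then apply fnmatch only to the survivors."""
    # Index of the first wildcard character, or end of string if none.
    idx = min((pattern.index(c) for c in "*?[" if c in pattern),
              default=len(pattern))
    prefix = pattern[:idx]
    candidates = [n for n in names if n.startswith(prefix)]
    return [n for n in candidates if fnmatch.fnmatch(n, pattern)]

names = ["logs/a.csv", "logs/b.txt", "images/cat.png"]
print(match_with_prefix(names, "logs/*.csv"))
# → ['logs/a.csv']
```

With the pattern `logs/*.csv`, only names starting with `logs/` are even considered, which is exactly the saving the `prefix` parameter gives you against a large bucket.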
At any point will this feature request be re-opened to allow wildcards within the list_blobs method? For example:
Because the back-end doesn't provide support for that kind of matching, we decided that it was not worth the effort, given how simple it is to do the matching in the application (as my example above illustrates).
gcsfs has a very handy feature that allows you to fetch multiple files by allowing wildcards in the object path. I think this would be a nice little feature to add to this library.
This example illustrates the idea:
As far as I know, the current way to accomplish the same would be to filter the list of all the files in the bucket and then fetch the files one by one (please correct me if I'm wrong :). The problem with this is that it's slow if you have a bucket with a very large number of files.
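The list-then-fetch workaround described above can be sketched end to end with a plain list of names standing in for the bucket listing (the `fetch` step is a placeholder for downloading each matched blob; no real GCS calls are made):

```python
import fnmatch

def select_paths(all_paths, pattern):
    """Client-side stand-in for wildcard listing: glob-match the
    pattern against every listed path, as the workaround does."""
    return fnmatch.filter(all_paths, pattern)

paths = ["data/2020/a.json", "data/2020/b.json", "data/2021/c.json"]
for path in select_paths(paths, "data/2020/*.json"):
    # In real code this is where each blob would be downloaded,
    # one request per object -- the slow part for large buckets.
    print(path)
```

The matching itself is cheap; the cost the issue complains about is listing every object in a large bucket before any filtering can happen.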
Cheers,
Halfdan