Add s3util.ListObjects(url string, c *Config) (*ListObjectsResult, error) #7
Conversation
… uses "/", while third-party tools like S3fox, 3Hub and NativeS3FileSystem use "_$folder$". Sort entries after trimming "_$folder$" suffixes.
…use as a marker for the next query.
… later list results.
I thought I replied here before, apologies for missing that. The name ListObjects seems redundant; why not just List? The public interface here seems a bit complicated: two new … Also, is there any way to avoid exposing fields like Marker? Something like this, perhaps:

func Open(url string, c *Config) (*File, error)
func (f *File) List(n int) ([]os.FileInfo, error)
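For reference, a minimal sketch of the shape being proposed, assuming the marker is kept in an unexported field so callers never see it. The type and function names below (file, open, list, listBucketResult), the example bucket URL, and the keys-only return value are illustrative; real code would also sign requests using the package's *Config and return []os.FileInfo, both of which this sketch omits.

package main

import (
    "encoding/xml"
    "fmt"
    "io"
    "net/http"
    "net/url"
)

// listBucketResult holds only the parts of the GET Bucket response this
// sketch needs: the keys returned and whether more results remain.
type listBucketResult struct {
    IsTruncated bool     `xml:"IsTruncated"`
    Keys        []string `xml:"Contents>Key"`
}

// file is the hypothetical *File discussed above. The marker for the next
// request is unexported, so it never leaks into the public API.
type file struct {
    bucketURL string
    marker    string // last key seen; sent as the marker parameter next time
    done      bool   // set once a response reports IsTruncated=false
}

func open(bucketURL string) *file { return &file{bucketURL: bucketURL} }

// list issues exactly one GET Bucket request with max-keys=n and the saved
// marker, remembers the last key returned, and reports io.EOF once the
// bucket has been fully listed, much like os.File.Readdir.
func (f *file) list(n int) ([]string, error) {
    if f.done {
        return nil, io.EOF
    }
    q := url.Values{"max-keys": {fmt.Sprint(n)}}
    if f.marker != "" {
        q.Set("marker", f.marker)
    }
    resp, err := http.Get(f.bucketURL + "?" + q.Encode())
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("list: %s", resp.Status)
    }
    var r listBucketResult
    if err := xml.NewDecoder(resp.Body).Decode(&r); err != nil {
        return nil, err
    }
    if len(r.Keys) > 0 {
        f.marker = r.Keys[len(r.Keys)-1]
    }
    if !r.IsTruncated {
        f.done = true
    }
    return r.Keys, nil
}

func main() {
    f := open("https://example-bucket.s3.amazonaws.com/") // hypothetical bucket
    keys, err := f.list(1000)
    fmt.Println(keys, err)
}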
Hi, thanks for your reply. When we add an API for listing buckets in the future, I think Readdir would be confusing. I implemented the ListObjects API as a low-level primitive corresponding to the GET Bucket (List Objects) API. I think your signature for List() is somewhat misleading; I think this is better: … Before we think about function signatures, we have to think about … And we need to specify directory names with these suffixes … If we use os.FileInfo for file or directory entries, we must use … Could you tell me what you think?
Package s3util isn't really meant for low-level functions. S3 doesn't have directories, but it's possible to treat keys as paths in a hierarchy … Since '/' is already the path separator, creating an empty … Given the following objects: … this API could produce the following listings: … etc. Why can't List work for listing both buckets and objects?
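(A purely illustrative example, not from the original comment: with objects a/1, a/2 and b/3 in a bucket, a listing of the bucket root using '/' as the delimiter reports the directories a/ and b/, a listing of a/ reports 1 and 2, and the same List call at the account level could report the buckets themselves.)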
Now I understand that s3util is meant for high-level access; thanks for your explanation. As for directory suffixes, I wish all tools out there used only '/'. In reality, there are already tools that use other markers, such as the "_$folder$" suffix. Your listing is a breadth-first search, but the S3 List API is a depth-first search. I would like to control the number of S3 API calls because they cost money. Yes, maybe List can work for listing both buckets and objects.
From the page you linked above, the key seems to be to supply the path separator as the delimiter parameter. The design I suggest would perform exactly one S3 call per call to List. Just like os.File.Readdir, List can let the user decide how many results they want per call.
I read the samples in that page … By setting delimiter=/, you get only directory entries, so you have to do an extra API call to get the entries inside each directory, and those results have files and subdirectories mixed. By using just the marker parameter and not the delimiter, the needed API call count is int((entries - 1) / 1000) + 1 (1000 being the maximum number of entries per API call), and that is the minimum you can get.
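As a quick sanity check of that formula (the helper name below is made up for illustration): a bucket with 2,500 objects needs int((2500 - 1)/1000) + 1 = 3 calls.

package main

import "fmt"

// minListCalls returns the minimum number of GET Bucket calls needed to
// enumerate `entries` objects when each call returns at most maxKeys keys.
func minListCalls(entries, maxKeys int) int {
    if entries <= 0 {
        return 1 // even an empty bucket takes one call to discover it is empty
    }
    return (entries-1)/maxKeys + 1
}

func main() {
    fmt.Println(minListCalls(2500, 1000)) // 3
    fmt.Println(minListCalls(1000, 1000)) // 1
}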
Files and directories aren't mixed. Files are listed in <Contents> and directories in <CommonPrefixes>. For example:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Name>example-bucket</Name>
<Prefix></Prefix>
<Marker></Marker>
<MaxKeys>1000</MaxKeys>
<Delimiter>/</Delimiter>
<IsTruncated>false</IsTruncated>
<Contents>
<Key>sample.html</Key>
<LastModified>2011-02-26T01:56:20.000Z</LastModified>
<ETag>"bf1d737a4d46a19f3bced6905cc8b902"</ETag>
<Size>142863</Size>
<Owner>
<ID>canonical-user-id</ID>
<DisplayName>display-name</DisplayName>
</Owner>
<StorageClass>STANDARD</StorageClass>
</Contents>
<CommonPrefixes>
<Prefix>photos/</Prefix>
</CommonPrefixes>
</ListBucketResult>

Doing a breadth-first traversal might still take a few more API calls than the marker-only approach.
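To make that concrete, here is a small sketch of Go types that could decode such a response with encoding/xml; the type and field names are illustrative, not the ones this package defines.

package main

import (
    "encoding/xml"
    "fmt"
)

// listing mirrors the relevant parts of a GET Bucket response made with a
// delimiter: files appear as <Contents>, directories as <CommonPrefixes>.
type listing struct {
    Name        string    `xml:"Name"`
    IsTruncated bool      `xml:"IsTruncated"`
    Files       []content `xml:"Contents"`
    Directories []string  `xml:"CommonPrefixes>Prefix"`
}

type content struct {
    Key          string `xml:"Key"`
    LastModified string `xml:"LastModified"`
    ETag         string `xml:"ETag"`
    Size         int64  `xml:"Size"`
}

func main() {
    // A shortened version of the response shown above.
    doc := `<ListBucketResult>
      <Name>example-bucket</Name>
      <IsTruncated>false</IsTruncated>
      <Contents><Key>sample.html</Key><Size>142863</Size></Contents>
      <CommonPrefixes><Prefix>photos/</Prefix></CommonPrefixes>
    </ListBucketResult>`

    var l listing
    if err := xml.Unmarshal([]byte(doc), &l); err != nil {
        panic(err)
    }
    fmt.Println("files:", l.Files, "directories:", l.Directories)
}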
Thank you again for your explanation. I tried to implement the proposed APIs, but I found out we cannot get LastModified for directories.
Yes, that seems reasonable. Also for Size() etc. Since directories …
Oh, I was wrong about directories. I knew the S3 console creates entries for directories, but I thought we could not get them with a delimiter specified. Actually, we can. Here is the listing for an empty directory created with the S3 console:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Name>go-s3</Name>
<Prefix>s3util/foo/</Prefix>
<Marker/>
<MaxKeys>1000</MaxKeys>
<Delimiter>/</Delimiter>
<IsTruncated>false</IsTruncated>
<Contents>
<Key>s3util/foo/</Key>
<LastModified>2013-06-07T07:52:45.000Z</LastModified>
<ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>
<Size>0</Size>
<Owner>
<ID>a42a235b94cfe0f3fd630844e076307918c210d57a6e3499e813f564588716a4</ID>
<DisplayName>hnakamur</DisplayName>
</Owner>
<StorageClass>STANDARD</StorageClass>
</Contents>
</ListBucketResult>

And here is the listing after a file was uploaded to the directory above:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Name>go-s3</Name>
<Prefix>s3util/hoge/</Prefix>
<Marker/>
<MaxKeys>1000</MaxKeys>
<Delimiter>/</Delimiter>
<IsTruncated>false</IsTruncated>
<Contents>
<Key>s3util/hoge/</Key>
<LastModified>2013-07-22T23:31:55.000Z</LastModified>
<ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>
<Size>0</Size>
<Owner>
<ID>a42a235b94cfe0f3fd630844e076307918c210d57a6e3499e813f564588716a4</ID>
<DisplayName>hnakamur</DisplayName>
</Owner>
<StorageClass>STANDARD</StorageClass>
</Contents>
<Contents>
<Key>s3util/hoge/list_local.go.bak</Key>
<LastModified>2013-07-22T23:36:04.000Z</LastModified>
<ETag>"afda40162cce64840ffd7aae3b2d3094"</ETag>
<Size>894</Size>
<Owner>
<ID>a42a235b94cfe0f3fd630844e076307918c210d57a6e3499e813f564588716a4</ID>
<DisplayName>hnakamur</DisplayName>
</Owner>
<StorageClass>STANDARD</StorageClass>
</Contents>
</ListBucketResult>

When I created my directory structures on S3 for my experiments while implementing ListObjects(), I initially uploaded files with 3Hub: Amazon S3 Client (for Mac OS X). If you use only the S3 console to create directories, each directory gets an explicit placeholder entry like the ones above. Of course, if you use only the S3 APIs, you can create file entries without parent directory entries.
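Building on that, a tiny sketch of how such console-created placeholders could be recognized; the helper name and the exact heuristic (zero size, key ending in "/") are assumptions, though the d41d8cd9… ETag seen above is indeed just the MD5 of an empty body.

package main

import (
    "fmt"
    "strings"
)

// isDirPlaceholder reports whether a listed object looks like a directory
// marker of the kind the S3 console creates: a zero-byte object whose key
// ends in the path separator "/".
func isDirPlaceholder(key string, size int64) bool {
    return size == 0 && strings.HasSuffix(key, "/")
}

func main() {
    fmt.Println(isDirPlaceholder("s3util/foo/", 0))                     // true
    fmt.Println(isDirPlaceholder("s3util/hoge/list_local.go.bak", 894)) // false
}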
Yes, in my interpretation, …
Thanks for your comment. I am closing this pull request since I made another pull request, #14, for the new APIs.
This is a function for the GET Bucket (List Objects) API.
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
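For reference, the request side of that API is a plain HTTP GET on the bucket URL with optional prefix, delimiter, marker and max-keys query parameters (parameter names from the documentation linked above). A rough sketch, with an illustrative helper name and bucket URL:

package main

import (
    "fmt"
    "net/url"
)

// listObjectsURL builds a GET Bucket (List Objects) request URL for the
// given bucket endpoint; empty parameters are simply omitted.
func listObjectsURL(bucket, prefix, delimiter, marker string, maxKeys int) string {
    q := url.Values{}
    if prefix != "" {
        q.Set("prefix", prefix)
    }
    if delimiter != "" {
        q.Set("delimiter", delimiter)
    }
    if marker != "" {
        q.Set("marker", marker)
    }
    if maxKeys > 0 {
        q.Set("max-keys", fmt.Sprint(maxKeys))
    }
    if len(q) == 0 {
        return bucket
    }
    return bucket + "?" + q.Encode()
}

func main() {
    fmt.Println(listObjectsURL("https://example-bucket.s3.amazonaws.com/", "photos/", "/", "", 1000))
    // https://example-bucket.s3.amazonaws.com/?delimiter=%2F&max-keys=1000&prefix=photos%2F
}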