listObjects method doesn't work on very large buckets! #9

Closed
epugh opened this Issue Feb 12, 2013 · 2 comments

@epugh
epugh commented Feb 12, 2013

The listObjects method at https://github.com/abashev/vfs-s3/blob/master/src/main/java/com/intridea/io/vfs/provider/s3/S3FileObject.java#L245 doesn't work on VERY LARGE buckets: it eventually times out. It works fine on buckets that don't hold potentially millions of documents.

I checked the jets3t wiki and found some links about a listObjectsChunked method that returns only part of the matches per call instead of all of them. I also found this sample code:

https://bitbucket.org/jmurty/jets3t/src/2573ec7fb362e60ce68585cbd8b349833be9142c/src/org/jets3t/samples/ThreadedObjectListing.java?at=default

It appears to show listObjectsChunked being used in a threaded manner.

I'm still new to this, so any thoughts would be great. It seems like vfs-s3 should use listObjectsChunked instead of listObjects.
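
To make the chunking idea concrete, here is a minimal sketch of a listing loop that asks for one bounded chunk at a time and resumes after the last key it saw. It does not call jets3t: FakeBucket, Chunk, listChunk, resumeAfter and countKeys are made-up names, and the bucket is an in-memory stand-in, so treat this as an illustration of the pattern rather than of the real listObjectsChunked signature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class ChunkedListingSketch {

    /** One chunk of keys plus the key to resume after (null when the listing is complete). */
    static final class Chunk {
        final List<String> keys;
        final String resumeAfter;

        Chunk(List<String> keys, String resumeAfter) {
            this.keys = keys;
            this.resumeAfter = resumeAfter;
        }
    }

    /** Hypothetical in-memory stand-in for a bucket; keys are kept in sorted order. */
    static final class FakeBucket {
        private final SortedSet<String> keys = new TreeSet<String>();

        void put(String key) {
            keys.add(key);
        }

        /** Returns at most maxKeys keys that sort strictly after resumeAfter (null = from the start). */
        Chunk listChunk(int maxKeys, String resumeAfter) {
            if (maxKeys < 1) {
                throw new IllegalArgumentException("maxKeys must be at least 1");
            }
            // key + "\0" is the smallest string greater than key, so tailSet skips the marker itself
            SortedSet<String> rest = resumeAfter == null ? keys : keys.tailSet(resumeAfter + "\0");
            List<String> chunk = new ArrayList<String>();
            for (String key : rest) {
                if (chunk.size() == maxKeys) {
                    // more keys remain: tell the caller where to resume
                    return new Chunk(chunk, chunk.get(chunk.size() - 1));
                }
                chunk.add(key);
            }
            return new Chunk(chunk, null);
        }
    }

    /** Walks the whole bucket one chunk at a time and returns how many keys were seen. */
    static int countKeys(FakeBucket bucket, int chunkSize) {
        int count = 0;
        String marker = null;
        do {
            Chunk chunk = bucket.listChunk(chunkSize, marker);
            count += chunk.keys.size(); // handle the chunk here instead of keeping it in memory
            marker = chunk.resumeAfter;
        } while (marker != null);
        return count;
    }

    public static void main(String[] args) {
        FakeBucket bucket = new FakeBucket();
        for (int i = 0; i < 25; i++) {
            bucket.put(String.format("doc-%03d", i));
        }
        System.out.println(countKeys(bucket, 10));
    }
}
```

Each request in this loop is bounded by the chunk size, so no single call has to enumerate millions of keys before returning, which is the property the timeout above suggests is missing.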

@abashev
Owner
abashev commented Feb 12, 2013

Eric, to be honest, I don't have any plans to support jets3t in future releases. I think the Amazon SDK is mature enough now, and I want to switch the library to it. Moritz did a really good job of porting vfs-s3 to the SDK, but I was too lazy to continue his efforts. Now that I see someone needs this product, I'll try to finish it. I'd really appreciate any help or attention.

@epugh
epugh commented Feb 13, 2013

So I looked at the feature branch; unfortunately, some of the tests fail. But the doListChildren method does take what appears to be a much better approach, along the lines of the chunking:

    final List summaries = new ArrayList(listing.getObjectSummaries());
    while (listing.isTruncated()) {
        final ListObjectsRequest loReq = new ListObjectsRequest();
        loReq.setBucketName(bucket.getName());
        loReq.setMarker(listing.getNextMarker());
        listing = service.listObjects(loReq);
        summaries.addAll(listing.getObjectSummaries());
    }
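
Two things stand out in that loop: the follow-up request carries only the bucket name and the marker, so a prefix set on the first request (if there was one) would not be repeated, and every summary is accumulated in one list. A rough sketch of an alternative, assuming a hypothetical PageFetcher hook in place of the real SDK client, is to pass the same prefix on every page and hand keys out lazily. Page, PageFetcher, listKeys and inMemory are illustrative names, not part of the Amazon SDK or vfs-s3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class LazyListingSketch {

    /** One page of keys; nextMarker is null when there are no more pages. */
    static final class Page {
        final List<String> keys;
        final String nextMarker;

        Page(List<String> keys, String nextMarker) {
            this.keys = keys;
            this.nextMarker = nextMarker;
        }
    }

    /** Hypothetical hook for whatever actually talks to S3. */
    interface PageFetcher {
        Page fetch(String prefix, String marker);
    }

    /** Iterates over all keys under prefix, fetching the next page only when it is needed. */
    static Iterator<String> listKeys(final PageFetcher fetcher, final String prefix) {
        return new Iterator<String>() {
            private Iterator<String> current = Collections.<String>emptyList().iterator();
            private String marker = null;
            private boolean exhausted = false;

            public boolean hasNext() {
                while (!current.hasNext() && !exhausted) {
                    Page page = fetcher.fetch(prefix, marker); // same prefix on every page
                    current = page.keys.iterator();
                    marker = page.nextMarker;
                    exhausted = (marker == null);
                }
                return current.hasNext();
            }

            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    /** In-memory fetcher for trying the iterator out; sortedKeys must already be sorted. */
    static PageFetcher inMemory(final List<String> sortedKeys, final int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        return new PageFetcher() {
            public Page fetch(String prefix, String marker) {
                List<String> page = new ArrayList<String>();
                for (String key : sortedKeys) {
                    boolean afterMarker = marker == null || key.compareTo(marker) > 0;
                    if (!afterMarker || !key.startsWith(prefix)) {
                        continue;
                    }
                    if (page.size() == pageSize) {
                        // page is full and another match exists: report where to resume
                        return new Page(page, page.get(page.size() - 1));
                    }
                    page.add(key);
                }
                return new Page(page, null);
            }
        };
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a/1", "a/2", "a/3", "b/1", "b/2");
        Iterator<String> it = listKeys(inMemory(keys, 2), "a/");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

With the real SDK, the fetcher body would be where the ListObjectsRequest is built; the client's listNextBatchOfObjects method may also be worth a look, since it continues a previous listing without rebuilding the request by hand.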

Eric

@epugh epugh closed this Feb 13, 2013