This issue was moved to a discussion.

How to Optimize AWS S3 File Downloads for Multiple Objects in a Single Request #680

Closed
arenas7307979 opened this issue Jun 27, 2023 · 1 comment

Comments

@arenas7307979

arenas7307979 commented Jun 27, 2023

Description:
I'm always frustrated when downloading multiple images from a specific folder in AWS S3 because the current implementation in my code uses one request per image, resulting in significant request consumption. I am looking for a better way to download multiple images efficiently within a single request.

Solution:
I would like to optimize the AWSdownloadFileFromS3 method to support downloading multiple objects in a more efficient manner. Ideally, I would like to download multiple objects using a single request to reduce the overall request consumption.

Alternatives:
I have considered exploring different approaches or techniques to achieve the efficient downloading of multiple objects from AWS S3 using a single request.

Additional Context:
N/A

Code:

import Foundation
import NIO
import NIOFoundationCompat  // provides ByteBuffer.readData(length:)
import SotoS3

// MyError is not shown in the original post; this minimal definition is an assumption.
enum MyError: Error {
    case invalidURL
    case emptyFile
}

func AWSdownloadFileFromS3(s3: S3, urlString: String) -> EventLoopFuture<Data> {
    // Run everything on one event loop so the buffer is only touched from there.
    let runOnEventLoop = s3.client.eventLoopGroup.next()

    // Expects a virtual-hosted-style URL, where the bucket is the first host label.
    guard let url = URL(string: urlString),
          let host = url.host,
          let bucket = host.components(separatedBy: ".").first else {
        // Fail the future instead of crashing the process on bad input.
        return runOnEventLoop.makeFailedFuture(MyError.invalidURL)
    }

    // The object key is the URL path without its leading "/".
    let key = String(url.path.dropFirst())

    var byteBufferCollate = ByteBufferAllocator().buffer(capacity: 0)

    let getObjectRequest = S3.GetObjectRequest(bucket: bucket, key: key)
    let getObjectFuture = s3.getObjectStreaming(getObjectRequest, on: runOnEventLoop) { byteBuffer, eventLoop in
        // Append each streamed chunk to the collating buffer.
        var byteBuffer = byteBuffer
        byteBufferCollate.writeBuffer(&byteBuffer)
        return eventLoop.makeSucceededFuture(())
    }

    // Once the stream completes, convert the collected bytes to Data.
    return getObjectFuture.flatMapThrowing { _ -> Data in
        guard byteBufferCollate.readableBytes > 0,
              let data = byteBufferCollate.readData(length: byteBufferCollate.readableBytes) else {
            throw MyError.emptyFile
        }
        return data
    }
}
@arenas7307979 arenas7307979 changed the title Optimizing AWS S3 File Downloads for Multiple Objects in a Single Request How to Optimizing AWS S3 File Downloads for Multiple Objects in a Single Request Jun 27, 2023
@adam-fowler
Member

Are you looking to use one S3 request to download multiple files, or a single function (which may make multiple requests to S3)?

I don't know of any way to do the first option. The second is easier. There is a project, https://github.com/soto-project/soto-s3-file-transfer, which can be used to download multiple files concurrently to your filesystem. Also, if you use the new Swift concurrency versions of the S3 APIs, it is fairly easy to download multiple files concurrently using task groups.
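As a rough illustration of the task group approach, here is a minimal sketch. The function name `downloadAll` and its `fetch` parameter are hypothetical and not part of Soto: `fetch` stands in for whatever per-object download you use, for example a closure wrapping the async `s3.getObject` call. How the response body is converted to `Data` depends on the Soto version, so that step is left to the caller. This still issues one request per object; only the waiting overlaps.

```swift
import Foundation

/// Downloads every key concurrently and returns the results keyed by object key.
/// `fetch` is a hypothetical stand-in for the real per-object download
/// (e.g. a closure that calls the async S3 GetObject API and returns the body as Data).
func downloadAll(
    keys: [String],
    fetch: @escaping @Sendable (String) async throws -> Data
) async throws -> [String: Data] {
    try await withThrowingTaskGroup(of: (String, Data).self) { group in
        for key in keys {
            // One child task, and therefore one request, per key.
            group.addTask {
                (key, try await fetch(key))
            }
        }

        // Collect results as the child tasks finish, in completion order.
        // If any download throws, the error propagates and the remaining tasks are cancelled.
        var results: [String: Data] = [:]
        for try await (key, data) in group {
            results[key] = data
        }
        return results
    }
}
```

Called as `try await downloadAll(keys: imageKeys) { key in try await myDownload(key) }`, where `imageKeys` and `myDownload` are placeholders for your own key list and download function. For a large number of keys you would probably want to cap how many tasks run at once, which this sketch does not do.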

@soto-project soto-project locked and limited conversation to collaborators Jun 27, 2023
@adam-fowler adam-fowler converted this issue into discussion #682 Jun 27, 2023
