Add support for uploading big files on S3 #337
Comments
Probably because they don't implement signing on chunked/streamed bodies. For now, we just provide low-level methods and let users deal with such edge cases. But since the beginning we planned to add extra high-level methods/helpers. I think this is a great candidate to start with. Thanks for the proposal.
No, the 5GB limit is part of the PutObject endpoint itself. It is not about the signing part; it is about the fact that bigger files must be uploaded in several requests (I totally understand that they don't want to allow incoming requests with a 5TB body).
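To make the constraint concrete, here is a minimal sketch of the dispatch a high-level upload helper would have to perform around that limit (the constant and function names are assumptions for illustration, not async-aws API):

```php
<?php
// PutObject accepts bodies up to 5 GB; anything larger must go through
// the multipart-upload API. Names below are illustrative assumptions.
const PUT_OBJECT_LIMIT = 5 * 1024 * 1024 * 1024; // 5 GB

function needsMultipartUpload(int $contentLength): bool
{
    return $contentLength > PUT_OBJECT_LIMIT;
}
```

A helper could call this once it knows the content length and pick either a single PutObject request or the multipart flow.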
I was talking about this
Thinking about that, I'm not sure it should be part of this project. It's more something that should be managed by a higher-level application, a "flysystem-adapter" for instance. One may want a…
Uploading a big file to S3 while taking into account the limitations of the S3 PutObject API (needing to switch to multipart upload) seems like a fit for an S3 SDK. That's only dealing with S3 APIs.
I think this is an interesting feature, and we should add it when there are users asking for it. As I understand from @stof, you are just making an observation of a missing feature, right? The question we should ask ourselves now is: can we add this feature without breaking BC in the future?
Well, the issues I opened for big files and presigned requests correspond to what I found missing to be able to migrate the Incenteev project from the official AWS SDK to…
This will fix async-aws#337
I added a PR for this. See #403
[S3] MultipartUpload — this will fix #337
* Added UploadPart and ListParts
* Added unit tests
* Added some tests
* Make sure tests are green
I think the point of @stof is providing a helper method…
Yes. Could we (manually) create a simple abstraction layer for upload/download on top of the S3 client, similar to the official SDK?

$s3 = new S3Client();
$simpleS3 = new SimpleS3Client($s3, 'example-bucket');
$simpleS3->upload($path, $file, $acl, $somethingElse);
$simpleS3->download($path);
Here is an example implementation:
use AsyncAws\Core\Exception\Exception;
use AsyncAws\Core\Exception\Http\HttpException;
use AsyncAws\S3\Result\GetObjectOutput;
use AsyncAws\S3\S3Client;
class SimpleS3Client
{
/**
* @var S3Client
*/
private $s3;
/**
* Default bucket if none is provided.
*
* @var string|null
*/
private $bucket;
public function __construct(S3Client $s3, ?string $bucket = null)
{
$this->s3 = $s3;
$this->bucket = $bucket;
}
/**
* @throws Exception
*/
public function upload(object $data): bool
{
// TODO convert $data to $input
        if (empty($input['Bucket'])) {
            $input['Bucket'] = $this->bucket;
}
// TODO get content length from input or read body
        if ($contentLength < 5 * 1024 * 1024 * 1024) {
$this->s3->putObject($input);
return true;
}
$uploadId = $this->s3->createMultipartUpload($input)->getUploadId();
        foreach ($data as $chunk) {
            // TODO set PartNumber, Body ($chunk) and UploadId on $input
            $this->s3->uploadPart($input);
        }
$this->s3->completeMultipartUpload($input);
return true;
}
/**
* @throws Exception
*/
public function download(object $data): GetObjectOutput
{
// TODO convert $data to $input
        if (empty($input['Bucket'])) {
            $input['Bucket'] = $this->bucket;
}
// TODO convert input to GetObjectRequest
$result = $this->s3->getObject($input);
$result->resolve();
return $result;
}
/**
* @throws Exception
*/
public function remove(object $data): bool
{
// TODO convert $data to $input
        if (empty($input['Bucket'])) {
            $input['Bucket'] = $this->bucket;
}
        // TODO convert input to DeleteObjectRequest
try {
$this->s3->deleteObject($input);
} catch (HttpException $e) {
return false;
}
return true;
}
}
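As a complement to the multipart loop in upload() above, a hypothetical chunk reader (an assumption for illustration, not part of async-aws) could yield the fixed-size parts that loop consumes from a stream:

```php
<?php
// Hypothetical helper (not async-aws API): yields fixed-size chunks from a
// stream resource, as the multipart loop in upload() would consume them.
function readParts($stream, int $partSize): \Generator
{
    while (!feof($stream)) {
        $chunk = stream_get_contents($stream, $partSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        yield $chunk;
    }
}
```

Each yielded chunk would become the Body of one UploadPart request; the last chunk may be smaller than $partSize.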
If I can get some help with a draft of an API (or class skeleton), then I would be happy to implement this.
@Nyholm have a look at https://github.com/aws/aws-sdk-php/blob/master/src/S3/ObjectUploader.php for the entry point (which uses https://github.com/aws/aws-sdk-php/blob/master/src/S3/MultipartUploader.php for handling multipart uploads)
PutObject is limited to 5GB. Bigger objects (up to 5TB) require using a multipart upload. The official SDK has an upload() method, which handles this transparently (by default, it switches to multipart much earlier than the 5GB threshold, probably because they found it more efficient or more reliable to upload in chunks). Does it make sense to provide the same kind of feature in this SDK?
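The 5TB ceiling interacts with two other documented S3 multipart limits: at most 10,000 parts per upload, and a 5 MB minimum part size (except for the last part). A helper choosing a part size for a given object could look like this (a sketch with assumed names, not async-aws API):

```php
<?php
// S3 multipart limits: max 10,000 parts, min 5 MB per part (except the last).
const MIN_PART_SIZE = 5 * 1024 * 1024;
const MAX_PARTS = 10000;

// Smallest part size that keeps the part count within MAX_PARTS.
function choosePartSize(int $objectSize): int
{
    return max(MIN_PART_SIZE, (int) ceil($objectSize / MAX_PARTS));
}
```

For small objects this returns the 5 MB floor; only objects above roughly 50 GB force larger parts, which is one reason an upload() helper can safely switch to multipart well below the 5GB PutObject threshold.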