Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combining @uppy/aws-s3 and @uppy/aws-s3-multipart #4218

Closed
2 tasks done
Murderlon opened this issue Nov 14, 2022 · 4 comments
Closed
2 tasks done

Combining @uppy/aws-s3 and @uppy/aws-s3-multipart #4218

Murderlon opened this issue Nov 14, 2022 · 4 comments
Assignees
Labels
AWS S3 Plugin that handles uploads to Amazon AWS S3 Feature

Comments

@Murderlon
Copy link
Member

Murderlon commented Nov 14, 2022

Initial checklist

  • I understand this is a feature request and questions should be posted in the Community Forum
  • I searched issues and couldn’t find anything (or linked relevant results below)

Problem

We currently have two separate plugins for uploading to S3 (and S3-compatible services): @uppy/aws-s3 and @uppy/aws-s3-multpart. They have different use cases. The advantages of multipart uploads are:

  • Improved throughput – You can upload parts in parallel to improve throughput.
  • Quick recovery from any network issues – Smaller part size minimizes the impact of restarting a failed upload due to a network error.
  • Pause and resume object uploads – You can upload object parts over time. After you initiate a multipart upload, there is no expiry; you must explicitly complete or stop the multipart upload.
  • Begin an upload before you know the final object size – You can upload an object as you are creating it.

However, the downside is request overhead, as it needs to do creation, signing, and completion requests besides the upload requests. For example, if you are uploading files that are only a couple kilobytes with a 100ms roundtrip latency, you are spending 400ms on overhead and only a few milliseconds on uploading. This really adds up if you upload a lot of small files.

AWS, and generally the internet from what I found, tend to agree that you don't want to use multipart uploads for files under 100MB. But this sometimes puts users of our libraries in an awkward position, as their end users may not only upload very large files, or only small files. In this case a portion of their users get a subpar experience.

Solution

Combine the plugins, use multipart based on file size or a custom function that takes a file and returns for instance a boolean.

Scope

  • Aligning the plugin options
    • @uppy/aws-s3 has an option getResponseData() for "use with almost S3-compatible storage solutions". I'd say the fact that we already support S3-compatible services (which we don't test against btw) is already enough and also trying to support "almost S3-compatible storage solutions" is too much.
    • Remove timeout option. Wasn't even used in multipart plugin and doesn't seem to have a purpose anymore.
    • Remove getChunkSize option. This should always be done by us, optimal part size is something we can calculate and letting the users do this is unnecessarily risky. See https://github.com/tus/tusd/blob/074d58300781cd68ed2f7768b8312ce666a6c43e/pkg/s3store/s3store.go#L943-L985.
    • Add option shouldUseMultipart: boolean | (file: UppyFile) => boolean (or some other name).
    • ...think about any other options we may need to add/remove/rename.
  • Fix integration with Golden Retriever: Resumable uploads with S3 Multipart #2121

Considerations

  • @uppy/aws-s3 is also used by quite a few people to upload to S3-compatible providers, such as Google Cloud Storage. This is because the API surface is the same. But AFAIK that's not the case for multipart, or at the very least not the same amount of providers that offer regular S3 compatibility. This must be explained well in the docs.
  • If we were to combine the plugins, it means either creating a third plugin or moving on with @uppy/aws-s3 and deprecating @uppy/aws-s3-multipart.

Alternatives

  • Don't combine the two plugins. In a way, they are different enough from each other. But it's more to maintain and will always be less ideal for mixed circumstances.
@CanRau
Copy link

CanRau commented Jan 22, 2023

Just ran into this, was expecting both plugins to "just" work alongside, then came to find a way to tell each one when to use, at the moment I think I can just dynamically uppy.use the right aws plugin depending on the file size.

This worked fine until I started using the dashboard. I tried to listen to the file-added event and within that add @uppy/aws-s3 or @uppy/aws-s3-multpart but this doesn't seem to work 😔

Another option could be having a way to set a plugin to enabled which could then be dynamically toggled 🤔

In any case this feature would be highly appreciated, I really enjoy uppy so far, thanks a lot for this amazing project 🙏🏼

I think for now I can make my component decide based on maxFileSize passed in as prop which plugin to use, I hope 🤞🏼

@aduh95
Copy link
Member

aduh95 commented Jan 22, 2023

Another option could be having a way to set a plugin to enabled which could then be dynamically toggled 🤔

That's an interesting idea, which could potentially be useful in more cases than just AWS. If this problem ever resurfaces in other part of the codebase we will definitely consider implementing that. When it comes to AWS though, I think it really makes more sense to combine both plugins so users don't have to duplicate their configuration and it would also simplify the maintenance to have one plugin instead of two.

The only robust workaround for now is to have two Uppy instances, each with one plugin, but of course it's quickly become very hacky to make it seemless for end users, so you might be better off sticking to one or the other plugin for now until the combination work is done.

@Lewis-Teoh
Copy link

Hi, may I know any updates on Golden Retrieval + S3 Multiparts implementation?

@aduh95 aduh95 changed the title Combining @uppy/aws-s3 and @uppy/aws-s3-multpart Combining @uppy/aws-s3 and @uppy/aws-s3-multipart Jun 22, 2023
@arturi
Copy link
Contributor

arturi commented Aug 16, 2023

This had been implemented in #4299.

@arturi arturi closed this as completed Aug 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AWS S3 Plugin that handles uploads to Amazon AWS S3 Feature
Projects
None yet
Development

No branches or pull requests

6 participants