Support user upload directly to S3 #5314

Open
kontrollanten opened this issue Sep 30, 2022 · 9 comments

@kontrollanten
Contributor

Describe the problem to be solved

Currently users upload their files to the PeerTube server. This may affect the upload speed, but it also makes PeerTube hard to scale. If a single PeerTube server runs behind a load balancer and that server is replaced by another, any ongoing uploads will fail.

Describe the solution you would like

Uploads directly to S3. Based on this guide https://www.altostra.com/blog/multipart-uploads-with-s3-presigned-url it could work as follows:

  1. Client creates a POST video request to the PT server
  2. Server creates a multipart upload and pre-signed URLs for each part of the upload. The upload is stored in the database, paired with the video id. The pre-signed URLs are returned to the client.
  3. Client saves the response in local storage, to be able to resume in case of closing the tab.
  4. Client uploads to S3.
  5. Client sends ETag and part number to PT server after each finished upload part.
  6. Server stores ETag and part number in database.
  7. Client sends finish request to PT server.
  8. Server ends the multipart upload.
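
The server-side bookkeeping in steps 2, 5, 6 and 8 could be sketched roughly as follows. This is only an illustration under stated assumptions, not PeerTube code: `planParts` and `MultipartUploadRecord` are hypothetical names, the database is replaced by an in-memory map, and the actual S3 calls (creating the upload, pre-signing part URLs, completing the upload) are left out. The two limits are S3's documented ones: every part except the last must be at least 5 MiB, and an upload can have at most 10,000 parts.

```typescript
// Hypothetical sketch: names and storage are assumptions, S3 calls are omitted.
const MIN_PART_SIZE = 5 * 1024 * 1024 // S3 minimum size for every part except the last
const MAX_PARTS = 10000 // S3 maximum number of parts per multipart upload

interface PartRange { partNumber: number, start: number, end: number } // end is exclusive
interface UploadedPart { partNumber: number, etag: string }

// Step 2: decide the byte range of each part; the server would pre-sign one URL per part.
function planParts (fileSize: number, partSize: number = MIN_PART_SIZE): PartRange[] {
  if (fileSize <= 0) throw new Error('fileSize must be positive')
  if (partSize < MIN_PART_SIZE) throw new Error('partSize is below the S3 minimum')

  const count = Math.ceil(fileSize / partSize)
  if (count > MAX_PARTS) throw new Error('too many parts, increase partSize')

  const parts: PartRange[] = []
  for (let i = 0; i < count; i++) {
    parts.push({ partNumber: i + 1, start: i * partSize, end: Math.min((i + 1) * partSize, fileSize) })
  }
  return parts
}

// Steps 5-6: in-memory stand-in for the database row that pairs the upload with the video id.
class MultipartUploadRecord {
  readonly videoId: number
  readonly uploadId: string
  readonly totalParts: number
  private readonly etags = new Map<number, string>()

  constructor (videoId: number, uploadId: string, totalParts: number) {
    this.videoId = videoId
    this.uploadId = uploadId
    this.totalParts = totalParts
  }

  // Re-sending the same part number overwrites the old ETag, so client retries are harmless.
  recordPart (part: UploadedPart): void {
    if (!Number.isInteger(part.partNumber) || part.partNumber < 1 || part.partNumber > this.totalParts) {
      throw new Error(`invalid part number ${part.partNumber}`)
    }
    this.etags.set(part.partNumber, part.etag)
  }

  // Useful for resuming (step 3): the parts the client still has to upload.
  missingParts (): number[] {
    const missing: number[] = []
    for (let n = 1; n <= this.totalParts; n++) {
      if (!this.etags.has(n)) missing.push(n)
    }
    return missing
  }

  // Step 8: S3 expects the part list in ascending part-number order when completing the upload.
  completionList (): UploadedPart[] {
    const missing = this.missingParts()
    if (missing.length > 0) throw new Error(`parts not uploaded yet: ${missing.join(', ')}`)

    return Array.from(this.etags.entries())
      .sort((a, b) => a[0] - b[0])
      .map(([ partNumber, etag ]) => ({ partNumber, etag }))
  }
}
```

Since the client uploads from a browser, it can only read the `ETag` response header in step 5 if the bucket's CORS configuration exposes that header.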

This would be a great step forward to be able to scale PeerTube.

@emansom
Contributor

emansom commented Oct 9, 2022

The node-uploadx component used to handle uploads within PeerTube supports S3. 🎉🚀

Hook that up to object storage config and it should work. 😊

@kukhariev
Contributor

uploadx can upload data directly to Google Cloud Storage (clientDirectUpload option), but not to S3 storage.

Here is a proposed client uploader that supports both local storage and S3.
server draft

@emansom
Contributor

emansom commented Oct 9, 2022

> uploadx can upload data directly to Google Cloud Storage (clientDirectUpload option), but not to S3 storage.
>
> Here is a proposed client uploader that supports both local storage and S3.
> server draft

Misinterpreted the issue on first read, thought it was talking about S3 upload on server side.

Thanks for implementing the client bits! 🚀

@kontrollanten
Contributor Author

@kukhariev I saw that you released new versions with support for this. Huge thanks for your work! 🚀

Should we treat it as experimental or should it be stable?

@kukhariev
Contributor

@kontrollanten I hope it's stable. It's difficult to test.

@kontrollanten
Contributor Author

I've been looking deeper into this now. The current video upload functionality parses the video file to get the resolution and duration once the upload is done. I propose that instead of parsing the video file during the HTTP request, it should be handled by a job. By moving this workload off the web server we minimize the risk of crashing ongoing processes when restarting or upgrading PeerTube. In the long run PeerTube should support remote workers for handling jobs, which can be terminated in a controlled way (i.e. terminate when all ongoing jobs are done).
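
To illustrate the idea, here is a minimal sketch of a worker that takes parsing jobs from a queue and can be shut down in a controlled way. It is an assumption-laden illustration only: `ParseJob` and `MetadataJobWorker` are hypothetical names, the queue is a plain in-memory array rather than a real job queue, and the actual file parsing is left out.

```typescript
// Hypothetical sketch: an in-memory stand-in for a real job queue and worker.
interface ParseJob { videoId: number }

class MetadataJobWorker {
  private readonly pending: ParseJob[] = []
  private inFlight = 0
  private draining = false

  // Called by the HTTP handler when the upload finishes: it only enqueues,
  // so no video parsing happens during the request itself.
  enqueue (job: ParseJob): void {
    this.pending.push(job)
  }

  // Takes the next job, unless a shutdown has been requested.
  start (): ParseJob | undefined {
    if (this.draining) return undefined

    const job = this.pending.shift()
    if (job !== undefined) this.inFlight++
    return job
  }

  // Marks one ongoing job as done.
  finish (): void {
    if (this.inFlight === 0) throw new Error('no job in flight')
    this.inFlight--
  }

  // After this call no new jobs are started; ongoing jobs are allowed to complete.
  requestShutdown (): void {
    this.draining = true
  }

  // The process may exit once shutdown was requested and all ongoing jobs are done.
  canExit (): boolean {
    return this.draining && this.inFlight === 0
  }

  // Jobs that were never started stay queued, so another worker could pick them up.
  pendingCount (): number {
    return this.pending.length
  }
}
```

With a shared queue, a restart or upgrade would then call `requestShutdown()` and wait until `canExit()` returns true before stopping the process.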

@Chocobozzz any comments, or should I start working on a PR where video parsing is done in a job?

@aeharding

This issue feels like the natural next step now that remote transcoding is finished (#947).

PeerTube would never have to touch a single video file!
