
upload streams in browser using upload-defered-length #126

Merged (7 commits, Dec 29, 2018)

3 participants
@paulrosenzweig (Contributor) commented Sep 13, 2018

Resolves #118

This PR adds the ability to upload streams in the browser. In order to upload data as it's created, I used the Upload-Defer-Length header with the creation-defer-length tus extension.

To show a possible application of this, I added a button to the demo page that captures video with your webcam and uploads the data as it is recorded.

What kind of "streams"?

Ideally, we'd use ReadableStreams, but sadly that's not finalized yet. Instead you can upload any object adhering to a simple contract:

You can upload a reader like you would a file:

new tus.Upload(reader)

The reader must have a read method that returns a promise. The promise should resolve to an object {value, done}.

value is the data as an Array, Blob, or typed array.
done is a boolean indicating whether the stream has ended.

The read method described here is exactly what MDN documents (more eloquently) for ReadableStream readers.

ReadableStreams as implemented do have a getReader method that returns a compatible object. If you don't have access to a reader but instead have a callback that runs as new data becomes available, it's straightforward to wrap that callback in a reader.
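For instance, a callback-driven source (say, a MediaRecorder dataavailable handler) could be wrapped in a reader like this. This is a hand-rolled sketch, not code from this PR; makeQueueReader is a made-up helper name.

```javascript
// A hand-rolled reader over an in-memory queue of chunks.
// Producer code pushes data in; the uploader pulls it out via read().
function makeQueueReader() {
  const queue = [];
  let closed = false;
  let pending = null; // resolver for a read() that arrived before data

  return {
    // The contract from this PR: read() returns a Promise resolving
    // to { value, done } — value is a typed array (or Array/Blob),
    // done is a boolean.
    read() {
      if (queue.length > 0) {
        return Promise.resolve({ value: queue.shift(), done: false });
      }
      if (closed) {
        return Promise.resolve({ value: undefined, done: true });
      }
      return new Promise((resolve) => { pending = resolve; });
    },

    // Called by the producer, e.g. from a dataavailable callback.
    push(chunk) {
      if (pending) {
        pending({ value: chunk, done: false });
        pending = null;
      } else {
        queue.push(chunk);
      }
    },

    // Called by the producer when no more data will arrive.
    close() {
      closed = true;
      if (pending) {
        pending({ value: undefined, done: true });
        pending = null;
      }
    },
  };
}
```

Such a reader object could then be passed to new tus.Upload(reader, options) just like a file.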

What happens if chunkSize is not set?

We throw an error. With files, it makes sense to upload the entire file as default behavior. However, with streams, we need to know how much data to wait for before sending each chunk.
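So a stream upload always passes chunkSize explicitly. A hypothetical configuration might look like this (the endpoint URL, reader variable, and callback are placeholders):

```javascript
// Sketch of a stream upload configuration; chunkSize must be set
// explicitly because a stream has no known total size.
const upload = new tus.Upload(reader, {
  endpoint: "https://tus.example.com/files/",
  chunkSize: 1024 * 1024, // wait for 1 MiB of data before each send
  onError: (err) => console.error("upload failed:", err),
});
upload.start();
```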

What happens if read produces data smaller or larger than chunkSize?

If read results in more data than we want to send in one chunk, we cut off chunkSize bytes and save the rest for future sends.

If read results in less data than we want to send in one chunk, we immediately call read again until we have enough data for one chunk or the stream ends.
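That accumulate-and-slice behaviour can be sketched as follows. This is a simplified model of the logic described above, not the PR's actual source.js; nextChunk is a made-up name.

```javascript
// Accumulate reads until at least chunkSize bytes are buffered,
// then emit exactly chunkSize bytes and keep the remainder.
async function nextChunk(reader, leftover, chunkSize) {
  let buffer = leftover;
  let done = false;
  while (buffer.length < chunkSize && !done) {
    const result = await reader.read();
    done = result.done;
    if (result.value) {
      // Append the newly read bytes to the buffer.
      const merged = new Uint8Array(buffer.length + result.value.length);
      merged.set(buffer);
      merged.set(result.value, buffer.length);
      buffer = merged;
    }
  }
  return {
    chunk: buffer.slice(0, chunkSize), // what gets sent now
    leftover: buffer.slice(chunkSize), // saved for future sends
    done,                              // whether the stream has ended
  };
}
```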

Is this "resumable"?

It's not. I'm not quite sure what it would mean to resume uploading a stream. Would we assume the stream's current read position is the same offset as the upload?

I expect the right solution to network failures is for users to tee streams to a file and if the stream upload fails they can retry with the file, but I'm very open to different options here.
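Where native ReadableStreams are available, their tee() method yields two identical branches, so one branch could feed the upload while the other is saved locally for retries. A sketch under that assumption (splitForUpload is a made-up helper; the save-to-storage side is left abstract):

```javascript
// Split a ReadableStream into two branches: one to upload, one to
// keep locally so a failed stream upload can be retried from the copy.
function splitForUpload(stream) {
  const [uploadBranch, backupBranch] = stream.tee();
  return {
    reader: uploadBranch.getReader(), // pass to the uploader
    backup: backupBranch,             // pipe to IndexedDB, a file, etc.
  };
}
```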

@Acconut (Member) commented Sep 28, 2018

Thank you very much for this amazing PR, @paulrosenzweig! Unfortunately, I didn't have the time to look into this yet but I will definitely try to do so in the next few days.

@kvz (Member) commented Sep 28, 2018

Without having looked at the code yet: regarding making stream uploads resumable, would we be able to buffer chunks locally and retry uploading those, up to a point? (Configurable, and likely not hundreds of MBs.) That would at least make the upload resumable in the face of switching cell towers, WiFi access points, server restarts, load balancer changes, etc. Or maybe I'm thinking about this in the wrong way?

@Acconut (Member) commented Sep 28, 2018

would we be able to buffer chunks locally and retry uploading those, up to a point? (Configurable, and likely not hundreds of MBs.) That would at least make the upload resumable in the face of switching cell towers, WiFi access points, server restarts, load balancer changes, etc. Or maybe I'm thinking about this in the wrong way?

Yes, that should be possible. From looking at the code, the stream is already read into a buffer before being uploaded. This is also how tus-java-client handles retries when uploading a chunk fails.

@paulrosenzweig (Contributor, Author) commented Oct 19, 2018

Sorry for the delay in addressing your feedback. I finally got a chance to make this change.

I added the ability to retry the last chunk. I had been clearing data from the buffer as soon as it was read, but now I only clear data in excess of the most recent chunkSize bytes. If an upload fails, the source has enough data to return the most recent chunk from the buffer again.

Mostly this just required tweaking the logic around buffer offsets, but I also had to make another change related to the source. Previously, retrying an upload would recreate the source, but with the buffer state living in the source, I updated this to reuse the existing source instance rather than recreate it from the file. Everything seems to still work, but let me know if you anticipate any issues that this might cause. Here's the change:

-    let source = this._source = getSource(file, this.options.chunkSize);
+    this._source = this._source || getSource(file, this.options.chunkSize);

Note: Removing the local variable source wasn't strictly necessary, but it was only used twice so I just referenced this._source in both places.
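The retention rule described above — clear only data in excess of the most recent chunkSize bytes — can be modelled like this. A hypothetical helper, not the PR's code:

```javascript
// After a chunk is sent, drop everything except the last chunkSize
// bytes so a failed request can be retried from the buffer.
function trimBuffer(buffer, chunkSize) {
  if (buffer.length <= chunkSize) return buffer;
  return buffer.slice(buffer.length - chunkSize);
}
```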

@paulrosenzweig (Contributor, Author) commented Nov 7, 2018

Hey @Acconut just wanted to bump this!

I totally understand that open source is not your full-time job, but I'd love to get this merged after any feedback.

@Acconut (Member) commented Nov 17, 2018

Sorry for the delay, @paulrosenzweig. I should have more time now. First of all, I would like to discuss a few more high level questions:

  1. I would prefer it if the _readUntilEnoughDataOrDone method did not use Promises. I know they may be easier to use, but tus-js-client uses callbacks for its internals and its exposed interface, and I would like to keep the library code consistent.

  2. I am not sure about this point yet, but I think we should make length-deferring opt-in instead of automatically assuming that the user wants length-deferring when no size is specified. This makes our internals simpler and lets us keep the safety check that you removed in your PR (https://github.com/tus/tus-js-client/pull/126/files#diff-26d16a76eb2641c3108d9b4b67f9cb72L96). That check is just there to make spotting errors easier and to avoid misconfiguration for the user.

  3. What would be a good way of dealing with the upload progress callbacks? Currently we provide users with bytesSent and bytesTotal for each progress event, but for streams we do not have bytesTotal. Does your code account for that yet? I was thinking about setting bytesTotal to null to indicate that we do not have it yet.

Let me know what you think.

@paulrosenzweig (Contributor, Author) commented Nov 18, 2018

Thanks for your thoughts!

  1. This is tricky because the API I've defined above relies on a read method that returns a Promise. That said, I totally appreciate the value of having consistent internal style. Do you think it makes sense to restructure things internally to use callbacks wherever possible but to still use Promises to receive data from the user? Or, would you think the API should change to not accept a "reader"?

  2. That makes sense to me. I'll add an option for this.

  3. Yeah, I wasn't sure what to do here either, but I left it passing null as you suggest. I had also considered passing the current bytesSent as bytesTotal, but thought that would be more confusing since you'd always be at 100% progress.
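A user's progress handler would then need to guard against the missing total. A hypothetical formatter illustrating the guard (formatProgress is a made-up name):

```javascript
// With deferred length, bytesTotal is null until the total is known,
// so a percentage can only be shown when it is present.
function formatProgress(bytesSent, bytesTotal) {
  if (bytesTotal == null) {
    return `${bytesSent} bytes sent`;
  }
  const pct = ((bytesSent / bytesTotal) * 100).toFixed(1);
  return `${bytesSent} / ${bytesTotal} (${pct}%)`;
}
```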

@Acconut (Member) commented Nov 19, 2018

Do you think it makes sense to restructure things internally to use callbacks wherever possible but to still use Promises to receive data from the user?

Yes, I think that would be the best option for our current situation.

Yeah I wasn't sure what to do here either, but I left it as passing null like you suggest.

Perfect, we will add this to the documentation and then it should be clear.

@paulrosenzweig (Contributor, Author) commented Nov 22, 2018

@Acconut, I just committed both of those changes.

@Acconut (Member) left a comment

Thank you very much for the changes, @paulrosenzweig! I had another look and have a few more questions. Also, would you mind adding a bit of documentation to README.md covering the new option?

Resolved review threads: five on lib/browser/source.js (four outdated) and one on test/spec/upload.browser.js (outdated).
@Acconut (Member) left a comment

Thanks for the changes. I only have one wish for improvement regarding the error handling.

BTW, I added a few more commits to master which cause these merge conflicts. The main changes are that the dist and lib.es5 directories are removed from the Git index and that the getSource methods are now also asynchronous.

It would be nice if you would try to merge/rebase the changes but if it's too hard I can also take care of that.

Resolved review thread: lib/browser/source.js (outdated).

@paulrosenzweig paulrosenzweig force-pushed the paulrosenzweig:deferred-upload-length branch from 44d09ed to 6e803c1 Dec 13, 2018

Paul Rosenzweig added some commits Dec 13, 2018

Resolved review thread: lib/upload.js.

@Acconut merged commit 980d84b into tus:master on Dec 29, 2018

1 check was pending: continuous-integration/travis-ci/pr (the Travis CI build was in progress).
@Acconut (Member) commented Dec 29, 2018

Thank you very, very much for your continuous support and for dealing with all of my feedback. I just added a few final touches (a bit more documentation, and splitting up the demo to improve code and UI usability). A new release with this feature should come in the next few days.
