Golden Retriever: recover files (and uploads) after a browser crash (or closed tab) #268

Merged
merged 17 commits into from Aug 15, 2017

@arturi
Collaborator

arturi commented Jul 21, 2017

  • unique IDs for multiple Uppy instances: #256 (comment)
  • Service Worker stores blobs; localStorage stores all file state. On boot, restore state from localStorage, then load blobs from the Service Worker
  • start the upload after restore for now
  • store small blobs in IndexedDB, for when ServiceWorker is not available or browser crashed
  • cleanup from IDB on boot
  • store “current step”: preprocessing / uploading / postprocessing and pick up from that
  • preview only if we can restore blob, otherwise generic file type icon
  • uppy server support for that: store token persistently for longer, emit complete event on reconnect (if is complete)

Building on the work of @richardwillars discussed in #237.

DEMO GIF

Initial take on using Local Storage for storing file state and Service Worker to store file blobs, and then restore on boot

@arturi arturi changed the title Initial take on using Local Storage for storing file state and Service Worker to store file blobs, and then restore on boot [WIP ]Initial take on using Local Storage for storing file state and Service Worker to store file blobs, and then restore on boot Jul 21, 2017

@arturi arturi changed the title [WIP ]Initial take on using Local Storage for storing file state and Service Worker to store file blobs, and then restore on boot [WIP] Initial take on using Local Storage for storing file state and Service Worker to store file blobs, and then restore on boot Jul 21, 2017

@goto-bus-stop

Member

goto-bus-stop commented Jul 21, 2017

Super cool 🎉

General thought for a future iteration: we should namespace the messages we send on postMessage so they don't conflict with a developer's own ServiceWorker messages (e.g. ADD_FILE → uppy/ADD_FILE)
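
To illustrate the namespacing idea, here is a minimal sketch. The `uppy/` prefix comes from the example in the comment; the helper names (`namespaced`, `parseMessageType`, `handleMessage`) and the `{ type }` message shape are assumptions made for illustration, not the plugin's actual code.

```javascript
// Prefix suggested in the comment above; the helpers are hypothetical.
const NAMESPACE = 'uppy/'

// Wrap a message type so it cannot collide with the host app's own messages.
function namespaced (type) {
  return NAMESPACE + type
}

// Return the bare type for namespaced messages, or null for anything else.
function parseMessageType (message) {
  if (!message || typeof message.type !== 'string') return null
  if (!message.type.startsWith(NAMESPACE)) return null
  return message.type.slice(NAMESPACE.length)
}

// Dispatch a message to a handler table keyed by bare type.
// Returns false when the message is not ours, so the caller can ignore it.
function handleMessage (message, handlers) {
  const type = parseMessageType(message)
  if (type === null) return false
  if (!Object.prototype.hasOwnProperty.call(handlers, type)) return false
  handlers[type](message)
  return true
}
```

Inside a service worker this could be called from the `message` event listener with `event.data`, leaving any message without the prefix for the developer's own code to handle.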

@nqst

Contributor

nqst commented Jul 21, 2017

Design preview (see #257):

[Three design preview screenshots, dated 2017-07-21]

arturi and others added some commits Jul 21, 2017

@arturi
Collaborator

arturi left a comment

😍 😍 😍

@goto-bus-stop

Member

goto-bus-stop commented Jul 22, 2017

I moved the ServiceWorker file storage stuff into a separate class with methods:

  • list() to get a list of all stored blobs;
  • put({ id, data }) to store a blob;
  • delete(id) to delete a blob.

Then added an IndexedDB class that has the same API. For now I made the plugin switch between the two based on an option, but perhaps we should switch based on what's supported. We still need storage limits for IDB, right now it'll happily try to store 5GB files (not sure what happens then, I guess the put() call would just fail).
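
As a rough sketch of that shared API, the in-memory class below follows the same `list()` / `put({ id, data })` / `delete(id)` shape. `MemoryStore`, `pickStoreName`, the `env` flags, and the choice to return Promises are all assumptions for illustration; the real classes talk to a ServiceWorker or IndexedDB and may differ.

```javascript
// In-memory stand-in with the list/put/delete shape described above.
// Hypothetical: not the plugin's ServiceWorker or IndexedDB implementation.
class MemoryStore {
  constructor () {
    this.blobs = new Map()
  }

  // Resolve with an array of { id, data } entries for all stored blobs.
  list () {
    return Promise.resolve(
      Array.from(this.blobs, ([id, data]) => ({ id, data }))
    )
  }

  // Store (or overwrite) a blob under the given file ID.
  put ({ id, data }) {
    this.blobs.set(id, data)
    return Promise.resolve()
  }

  // Remove a blob; deleting an unknown ID is a no-op.
  delete (id) {
    this.blobs.delete(id)
    return Promise.resolve()
  }
}

// Choose a backend by capability rather than by option.
// `env` is an assumed shape so the decision can be exercised outside a browser.
function pickStoreName (env) {
  if (env.hasServiceWorker) return 'ServiceWorker'
  if (env.hasIndexedDB) return 'IndexedDB'
  return 'Memory'
}
```

In a browser, the `env` flags could be filled in with checks such as `'serviceWorker' in navigator` and `typeof indexedDB !== 'undefined'`.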

@arturi

Collaborator

arturi commented on 61076a0 Jul 22, 2017

🥇🥇🥇
😍 😍 😍

@goto-bus-stop

Member

goto-bus-stop commented Jul 23, 2017

A drawback of IndexedDB storage is that we can't auto-expire it, so if you close the tab and never visit the page again the stored blobs will stay around forever (until clearing browser cache/cookies). At least we could auto-delete blobs that are old (probably use the same expiry time as in #264 if possible), and when restoring an Uppy instance we could delete blobs that aren't referenced by the file state from localStorage.
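
The two cleanup rules proposed here (drop old blobs, drop blobs no file refers to) could be sketched as a pure function. The `{ id, storedAt }` record shape, the `maxAgeMs` parameter, and the function name are assumptions; the thread does not specify how timestamps would be stored.

```javascript
// Decide which stored blob IDs should be deleted when Uppy boots.
// storedBlobs: array of { id, storedAt } with storedAt in milliseconds (assumed shape).
// files: restored file state keyed by file ID, as read from localStorage.
// maxAgeMs: expiry window; the actual value is not settled in this thread.
function findBlobsToDelete (storedBlobs, files, maxAgeMs, now = Date.now()) {
  return storedBlobs
    .filter((blob) => {
      const expired = now - blob.storedAt > maxAgeMs
      const orphaned = !Object.prototype.hasOwnProperty.call(files, blob.id)
      return expired || orphaned
    })
    .map((blob) => blob.id)
}
```

The returned IDs can then be passed one by one to the store's `delete(id)` method.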

I will check if storing a File object in IDB actually stores the entire data blob or if it only stores a reference to the file on disk. If it stores the entire data blob I guess we should be quite conservative about how much we store in IDB.

@arturi

Collaborator

arturi commented Jul 23, 2017

A drawback of IndexedDB storage is that we can't auto-expire it

Yes, that sucks. However, since the browser will eventually clear the cache, and if we only store files no larger than, say, 20 MB each and no more than, say, 500 MB in total, we should be fine. We’ll remove them on successful upload, too. We could also try to expire them after a day or two with Service Worker, if it stays alive long enough?
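
A minimal sketch of such a limit check, using the tentative figures from this comment. The constant and function names are hypothetical, and the numbers are the "say" values floated here, not settled limits.

```javascript
const MB = 1024 * 1024
const MAX_FILE_SIZE = 20 * MB   // tentative per-file cap from the comment above
const MAX_TOTAL_SIZE = 500 * MB // tentative cap for everything stored in IndexedDB

// Return true if a file of `fileSize` bytes may be stored,
// given `currentTotal` bytes already held in the store.
function canStore (fileSize, currentTotal) {
  if (fileSize > MAX_FILE_SIZE) return false
  return currentTotal + fileSize <= MAX_TOTAL_SIZE
}
```

Files that fail the check would simply not be persisted, so after a crash they would come back as metadata only, without a restorable blob.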

Might be relevant: https://github.com/localForage/localForage, https://hacks.mozilla.org/2012/02/storing-images-and-files-in-indexeddb/.

@arturi

Collaborator

arturi commented Jul 23, 2017

Tried the IndexedDB thing, demo: https://www.webpackbin.com/bins/-Kpk7xYanarn9i6zJVct.

Seems like it automagically stores the whole blob or the browser allows it to access the file on disk, unlike with localStorage?

@goto-bus-stop

Member

goto-bus-stop commented Jul 23, 2017

Yeah. What I'm curious about is whether storing a File object for a 20MB file requires just a few KB of metadata, or if it stores the entire 20MB in IndexedDB. If it only stores metadata (which I think keeping Files in ServiceWorker does), we don't have to worry so much about storage space.

@arturi

Collaborator

arturi commented Jul 23, 2017

I was able to locate files I stored in IDB in Firefox’s storage:

[Screenshot, 2017-07-23]

When I add a file to IDB and then move it to a different location on disk, the file preview still loads after a refresh, but the same is true when using Service Worker, so it’s just always storing blobs in memory after URL.createObjectURL(fileData), I guess?

I wish there was a more scientific approach to debugging this :)

@arturi arturi changed the title [WIP] Initial take on using Local Storage for storing file state and Service Worker to store file blobs, and then restore on boot [WIP] GoldenRetriver: recover selected or in progress files after a browser crash or closed tab Jul 24, 2017

@arturi arturi referenced this pull request Jul 24, 2017

Closed

uppy: state recovery from localStorage #256

2 of 4 tasks complete

goto-bus-stop and others added some commits Jul 24, 2017

Keep track of ongoing uploads in state; allow restoring uploads
Each upload() now generates a unique ID and stores the relevant file IDs
in state. It now also keeps track of which `step` in the 'pipeline' the
upload is at.

Since uploads are now stored in state, I added two methods to manage
them:

 - `createUpload(fileIDs)` to create a new upload. It returns the unique
 ID for the upload.
 - `removeUpload(uploadID)` to remove an upload, because it was
 completed or canceled.

I split off the 'pipeline' logic from the `upload()` method, into a new
private method `runUpload(uploadID)`, and added a `restore(uploadID)`
method that can be used to continue a preexisting upload. `runUpload()`
continues at the `step` stored in state.

The Golden Retriever loops through the existing uploads from the
restored state and `restore(uploadID)`s each of them.
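
A simplified sketch of the bookkeeping this commit message describes. The step names come from the PR description; the `UploadTracker` class, the `state.uploads` shape, the numeric `step` index, and the synchronous pipeline are assumptions made to keep the example short (the real pipeline is asynchronous).

```javascript
// Step names are from the PR description; everything else is a hypothetical shape.
const STEPS = ['preprocessing', 'uploading', 'postprocessing']

class UploadTracker {
  constructor (state = { uploads: {} }) {
    this.state = state // would be persisted (e.g. to localStorage) in practice
    this.ran = []      // records executed steps, purely for illustration
    this.counter = 0
  }

  // Register a new upload for the given file IDs and return its unique ID.
  createUpload (fileIDs) {
    this.counter += 1
    const uploadID = `upload-${Date.now()}-${this.counter}`
    this.state.uploads[uploadID] = { fileIDs, step: 0 }
    return uploadID
  }

  // Forget an upload because it completed or was canceled.
  removeUpload (uploadID) {
    delete this.state.uploads[uploadID]
  }

  // Run the pipeline, continuing at whichever step is stored in state.
  runUpload (uploadID) {
    const upload = this.state.uploads[uploadID]
    for (let i = upload.step; i < STEPS.length; i++) {
      upload.step = i // record progress before doing the work
      this.ran.push(STEPS[i])
    }
    this.removeUpload(uploadID)
  }

  // Continue a preexisting upload found in restored state.
  restore (uploadID) {
    if (!this.state.uploads[uploadID]) {
      throw new Error('Nonexistent upload: ' + uploadID)
    }
    this.runUpload(uploadID)
  }
}
```

With this shape, the restore loop amounts to calling `restore(uploadID)` for every key in the restored `state.uploads`.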

@arturi arturi changed the title [WIP] GoldenRetriver: recover selected or in progress files after a browser crash or closed tab [WIP] GoldenRetriver: recover files (and uploads) after a browser crash (or closed tab) Jul 24, 2017

@arturi arturi changed the title [WIP] GoldenRetriver: recover files (and uploads) after a browser crash (or closed tab) [WIP] Golden Retriever: recover files (and uploads) after a browser crash (or closed tab) Jul 25, 2017

@richardwillars

Contributor

richardwillars commented Jul 25, 2017

Wow! When I put the proof of concept on here I was never expecting you guys to pick up and run with it like you have done! Kudos to the project and the team behind it for being so agile and open to ideas. With more user awareness this could easily become the top file uploader and stay ahead of the competition for a long time to come!

goto-bus-stop and others added some commits Jul 27, 2017

@oyeanuj


oyeanuj commented Aug 6, 2017

@arturi @goto-bus-stop @ifedapoolarewaju My usual question - how can this be utilized by someone not using the UI? Are there going to be methods exposed in the core that one could call to get the unfinished files?

@arturi

Collaborator

arturi commented Aug 7, 2017

Hi @oyeanuj!

how can this be utilized by someone not using the UI? Are there going to be methods exposed in the core that one could call to get the unfinished files?

You can access restored files in uppy.state.files just like regular files, except that restored ones get an isRestored: true property. For example, this is how restored files can be accessed without the UI:

const restoredFiles = Object.keys(_Uppy.state.files).filter(fileID => _Uppy.state.files[fileID].isRestored)

restoredFiles here is an array of IDs of restored files.

Files that have begun uploading but haven’t finished have both uploadComplete: false and uploadStarted: true; this is unrelated to Golden Retriever.
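
Both filters can be written as small helpers over a plain files object. This is a sketch under assumptions: the function names are hypothetical, and the comment above does not say where `uploadStarted` and `uploadComplete` live, so the code assumes a per-file `progress` object.

```javascript
// `files` mirrors uppy.state.files: an object keyed by file ID.
function getRestoredFileIDs (files) {
  return Object.keys(files).filter((fileID) => Boolean(files[fileID].isRestored))
}

// Assumption: uploadStarted / uploadComplete sit on a `progress` object per file.
function getUnfinishedFileIDs (files) {
  return Object.keys(files).filter((fileID) => {
    const progress = files[fileID].progress || {}
    return Boolean(progress.uploadStarted) && !progress.uploadComplete
  })
}
```

Called as `getRestoredFileIDs(uppy.state.files)`, the first helper returns the same array of IDs as the one-liner above.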

}

onBlobsLoaded (blobs) {
  window.myblobs = blobs

@goto-bus-stop

goto-bus-stop Aug 15, 2017

Member

Should get rid of this before merge ;)

@arturi arturi changed the title [WIP] Golden Retriever: recover files (and uploads) after a browser crash (or closed tab) Golden Retriever: recover files (and uploads) after a browser crash (or closed tab) Aug 15, 2017

@arturi arturi merged commit f8f8f29 into master Aug 15, 2017

@arturi arturi referenced this pull request Aug 19, 2017

Closed

Background uploading #237

@goto-bus-stop goto-bus-stop deleted the feature/restore-files branch Nov 20, 2017

@oyeanuj


oyeanuj commented Mar 6, 2018

@arturi Super late follow-up, as I got back to the file upload part to upgrade Uppy and hopefully implement Golden Retriever. I wanted to clarify your comment above: am I understanding correctly that when just using the core (no UI), I need to look for unfinished files using isRestored: true and then just pass them back to Uppy? And will Uppy take over from there, or are there any other steps in between?
