Streaming Network/File Support #1012

Open
wallw-teal opened this issue Sep 1, 2020 · 0 comments

OpenSphere currently uses a single GET request to load the entire response into memory. Files are similarly loaded entirely into memory. They are further limited because we currently store each file under a single IndexedDB (IDB) key, which caps the file size at that of a single IDB value (~104MB).

Loading and parsing large files is problematic in that it spikes memory. For configurations such as Electron (which uses file:// URLs rather than IDB storage), it is fairly trivial to crash the application. Instead, we should stream the file from the source.

Network

For network requests, including file://, we should be able to do the following steps:

  1. Make a HEAD request to the URL.
  2. Check the content-length response header. If the resource is small enough, we can just load it and run the legacy parsers.
  3. If the response contains the Accept-Ranges: bytes header, then we can stream the resource via range requests. fetch with ReadableStream on Response.body does not buy us much here, as the full response is still loaded into memory even if we parse each chunk as it is pushed through. The resulting API should still use ReadableStream. Note that this method of streaming is common in video players which support DASH (and maybe also HLS), so it may be possible to use or adapt some of the network logic from something like Google's Shaka Player (which would play nicely with the compiler).
  4. For the initial type detection and sample parse for import, tee the stream.
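
A rough sketch of steps 1-3, assuming a `fetch`-compatible function is available. The names `FetchLike`, `planLoad`, and `rangeStream`, the size threshold, and the chunk size are all hypothetical and are not existing OpenSphere APIs. The fetch function is injectable so the logic can be exercised without a real server.

```typescript
// Hypothetical names throughout; this is not an existing OpenSphere API.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string> }
) => Promise<Response>;

interface LoadPlan {
  mode: "whole" | "ranges" | "unknown";
  size: number | null;
}

// Steps 1-3: issue a HEAD request and decide how the resource should be loaded.
async function planLoad(
  url: string,
  maxWholeBytes: number,
  doFetch: FetchLike = fetch
): Promise<LoadPlan> {
  const head = await doFetch(url, { method: "HEAD" });
  const parsed = Number(head.headers.get("content-length") ?? NaN);
  const size = Number.isFinite(parsed) ? parsed : null;
  if (size !== null && size <= maxWholeBytes) {
    return { mode: "whole", size }; // small enough for the legacy path
  }
  const ranges = (head.headers.get("accept-ranges") || "").toLowerCase();
  if (size !== null && ranges.includes("bytes")) {
    return { mode: "ranges", size };
  }
  return { mode: "unknown", size }; // caller decides (e.g. fall back to Response.body)
}

// Expose a sequence of range requests as a ReadableStream of byte chunks.
function rangeStream(
  url: string,
  size: number,
  chunkBytes: number,
  doFetch: FetchLike = fetch
): ReadableStream<Uint8Array> {
  let offset = 0;
  return new ReadableStream<Uint8Array>({
    // pull() runs only when the consumer has room for more data, so roughly
    // one chunk is held in memory at a time.
    async pull(controller) {
      if (offset >= size) {
        controller.close();
        return;
      }
      const end = Math.min(offset + chunkBytes, size) - 1; // Range ends are inclusive
      const res = await doFetch(url, { headers: { Range: `bytes=${offset}-${end}` } });
      if (res.status !== 206) {
        controller.error(new Error(`Expected 206 Partial Content, got ${res.status}`));
        return;
      }
      controller.enqueue(new Uint8Array(await res.arrayBuffer()));
      offset = end + 1;
    },
  });
}
```

For step 4, calling `tee()` on the returned stream gives two branches, one for type detection and one for the import itself. Keep in mind that `tee()` buffers whatever the slower branch has not read yet, so the detection branch should be cancelled once it has seen enough.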

We will need to move from full-format parsers (e.g. JSON.parse(response) and document parsing) to streaming parsers. It should be possible to do this in a piecemeal, backwards-compatible manner so that we don't break third-party parsers: if a parser supports streaming, hand it the stream; if not, spool up the whole response and pass it in, while being wary of file size so we don't crash. We already have streaming JSON/XML "parsers" used by file type detection (oboe and xml-lexer).
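
One possible shape for that backwards-compatible fallback is sketched below. The `LegacyParser` and `StreamingParser` interfaces and the `runParser` helper are hypothetical; OpenSphere's real parser interfaces may look quite different.

```typescript
// Hypothetical interfaces for illustration only.
interface LegacyParser<T> {
  parse(source: string): T[];
}
interface StreamingParser<T> {
  parseStream(chunks: ReadableStream<Uint8Array>): AsyncIterable<T>;
}
type AnyParser<T> = LegacyParser<T> | StreamingParser<T>;

function isStreaming<T>(parser: AnyParser<T>): parser is StreamingParser<T> {
  return typeof (parser as StreamingParser<T>).parseStream === "function";
}

// Streams into parsers that support it; otherwise spools the stream into a
// string for the legacy parser, refusing to go past maxSpoolBytes.
async function* runParser<T>(
  parser: AnyParser<T>,
  stream: ReadableStream<Uint8Array>,
  maxSpoolBytes: number
): AsyncGenerator<T> {
  if (isStreaming(parser)) {
    yield* parser.parseStream(stream);
    return;
  }
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = "";
  let total = 0;
  for (;;) {
    const r = await reader.read();
    if (r.done) break;
    total += r.value.byteLength;
    if (total > maxSpoolBytes) {
      await reader.cancel();
      throw new Error(`Source exceeds ${maxSpoolBytes} bytes and the parser cannot stream`);
    }
    // stream: true keeps multi-byte characters intact across chunk boundaries
    text += decoder.decode(r.value, { stream: true });
  }
  text += decoder.decode();
  yield* parser.parse(text);
}
```

The size guard is the important part for third-party parsers: they keep working for small inputs and fail with a clear error for large ones rather than taking the whole application down.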

Note: API requests such as WMS/WMTS/WFS may not support byte ranges, and as such may benefit from fetch/ReadableStream over a plain XHR GET.

Note: This demo makes use of fetch/ReadableStream without spooling up all the bytes of the response (so that may be the way to go if that's possible).

Warning: we also need to be careful about browser support for ReadableStream (which should be decent). However, some of the transform streams, such as TextDecoderStream, aren't implemented in current Firefox, so polyfills for those will be needed.
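
As an illustration of how small such a polyfill could be, here is a sketch of a TextDecoderStream stand-in built on TextDecoder in streaming mode. The function names are hypothetical, and the sketch assumes TransformStream itself is available (natively or via a separate polyfill), which is not a given in every browser.

```typescript
// Hypothetical helper names; assumes TransformStream and TextDecoder exist.
function fallbackTextDecoderStream(encoding = "utf-8"): TransformStream<Uint8Array, string> {
  const decoder = new TextDecoder(encoding);
  return new TransformStream<Uint8Array, string>({
    transform(chunk, controller) {
      // stream: true buffers an incomplete multi-byte sequence until the next chunk
      const text = decoder.decode(chunk, { stream: true });
      if (text) controller.enqueue(text);
    },
    flush(controller) {
      // Emit anything still buffered when the source ends
      const tail = decoder.decode();
      if (tail) controller.enqueue(tail);
    },
  });
}

// Prefer the native implementation when the browser provides one.
function textDecoderStream(encoding = "utf-8"): TransformStream<Uint8Array, string> {
  const Native = (globalThis as any).TextDecoderStream;
  return typeof Native === "function" ? new Native(encoding) : fallbackTextDecoderStream(encoding);
}
```

Usage would be along the lines of `byteStream.pipeThrough(textDecoderStream())`, which yields a stream of strings that a text-based streaming parser can consume.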

File

For files loaded from disk (but not in Electron because that just resorts to file:// URLs), the native File should be streamable (if not with a ReadableStream implementation then with Blob.slice()). However, the biggest issue there is that when the application restarts, we no longer have access to that File instance without the user going to pick it from the file browser again. That's why we currently dump files into IDB.
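
For the Blob.slice() route, a minimal sketch might look like the following. The `blobSliceStream` name and the default chunk size are hypothetical, and the sketch assumes `Blob.prototype.arrayBuffer` is available; older browsers would need a FileReader-based read instead.

```typescript
// Hypothetical helper: reads a File/Blob slice by slice on demand.
function blobSliceStream(blob: Blob, chunkBytes = 1024 * 1024): ReadableStream<Uint8Array> {
  let offset = 0;
  return new ReadableStream<Uint8Array>({
    // Each pull reads one slice, so only about chunkBytes of file data is
    // materialized at a time rather than the whole file.
    async pull(controller) {
      if (offset >= blob.size) {
        controller.close();
        return;
      }
      const slice = blob.slice(offset, offset + chunkBytes);
      controller.enqueue(new Uint8Array(await slice.arrayBuffer()));
      offset += slice.size;
    },
  });
}
```

Since File extends Blob, the File picked by the user can be passed in directly. This does nothing for the restart problem, though; it only covers reading within the current session.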

Some strawman options. We are definitely open to suggestions here:

  • Stop using IDB file storage. Files become usable in the current session only. Offer to upload the files and use a URL if you want to keep it between sessions?
  • Expand IDB file storage to multiple keys (which moves the limit from IDB single value size to total available IDB size)
  • A hybrid approach where we continue to store "smaller" files and stream in larger ones but do not save the larger ones to storage
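
To make the multi-key option a bit more concrete, here is a sketch of splitting a file across several keys and reassembling it. The `ChunkStore` interface is a hypothetical stand-in for whatever wraps IDB, and the `name#index` key scheme is made up for illustration.

```typescript
// Hypothetical key-value abstraction over IDB (or any other async store).
interface ChunkStore {
  put(key: string, value: Blob): Promise<void>;
  get(key: string): Promise<Blob | undefined>;
}

// Stores the file as name#0, name#1, ... and returns the number of chunks.
// Each value stays under chunkBytes, so the single-value limit no longer
// caps the total file size.
async function saveChunked(
  store: ChunkStore,
  name: string,
  file: Blob,
  chunkBytes: number
): Promise<number> {
  let count = 0;
  for (let offset = 0; offset < file.size; offset += chunkBytes) {
    await store.put(`${name}#${count}`, file.slice(offset, offset + chunkBytes));
    count++;
  }
  return count;
}

// Reassembles the chunks into one Blob, failing loudly if any are missing.
async function loadChunked(store: ChunkStore, name: string, count: number): Promise<Blob> {
  const parts: Blob[] = [];
  for (let i = 0; i < count; i++) {
    const part = await store.get(`${name}#${i}`);
    if (!part) throw new Error(`Missing chunk ${i} of ${name}`);
    parts.push(part);
  }
  return new Blob(parts);
}
```

The chunk count (and probably the original file name and type) would need to be stored alongside the chunks. Whether the reassembled Blob is held in memory or backed by storage is up to the browser, so for the streaming case it may be better to read the chunks one key at a time instead of rebuilding a single Blob.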