
Allow request/response.blob() to resolve before reading the full stream #556

Closed · jakearchibald opened this issue Jun 20, 2017 · 12 comments

@jakearchibald
Collaborator

From #554 (comment) by @asutherland:

The blob() case is more efficient by far in terms of what the browser can and does optimize. Speaking for Gecko, the blob() call only creates a handle to the underlying file on disk. No reads of the file's contents need to be performed before returning the blob. This handle can be given to the new Response and passed across processes, allowing the target process to perform just the 1024-byte read directly, and without needing to involve the ServiceWorker or its global.

This is against the current spec, which requires reading the full stream, but it seems like a nice optimisation.

This should be limited to bodies with a known size, which means items with a Content-Length header, or items backed by disk (cache API).

The blob should enter an errored state if the body doesn't eventually match the size of the blob.
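The motivating pattern from #554 can be sketched as follows (the helper name is illustrative; per the current spec, blob() still resolves only after the full body has been read):

```js
// Sketch of the motivating pattern: hand back only the first n bytes of a
// response body as a Blob. Runs in Node 18+, where Response and Blob are
// globals; in a service worker the same calls apply to a cached Response.
async function firstBytes(response, n) {
  const blob = await response.blob(); // current spec: full body is read first
  // Under this proposal, blob() could resolve as soon as the size is known
  // (Content-Length or disk-backed), and slice() would let the consumer
  // read just these bytes directly.
  return blob.slice(0, n);
}
```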

@wanderview
Member

FWIW, while we do this optimization for blobs in IDB etc., I don't think we do it for the Cache API in Gecko yet.

@jakearchibald
Collaborator Author

I guess there's less need for this if no one does it already. w3c/ServiceWorker#913 is maybe still the best fix.

@wanderview
Member

The problem with the blob optimization here is that it has different semantics from a ReadableStream. A blob you can read with random access, in any order, etc.

What if we just added some kind of "skip" operation to ReadableStream? This would let you efficiently seek forwards to the part you want without necessarily loading it all into memory.
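A userland approximation of such a skip, assuming the stream yields Uint8Array chunks (the helper name is hypothetical):

```js
// Read and discard bytes until `n` have been consumed; return any leftover
// bytes from the chunk that crossed the skip boundary. A native skip hook
// could instead let the underlying byte source seek without producing the
// skipped data at all.
async function skipBytes(reader, n) {
  let remaining = n;
  while (remaining > 0) {
    const { value, done } = await reader.read();
    if (done) break;
    if (value.byteLength > remaining) {
      return value.subarray(remaining); // bytes just past the skip point
    }
    remaining -= value.byteLength;
  }
  return new Uint8Array(0);
}
```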


@domenic
Member

domenic commented Jun 20, 2017

Although upon seeing this issue I was initially sympathetic, thinking that the spec should not prohibit interesting optimizations, now I am not so sure. It really comes down to the question: what does response.blob() fulfilling mean? Or, from another angle, what does having a Blob object in hand mean?

I see two possibilities:

  • It means you have a handle to some data that could in theory be read, perhaps even successfully, by the browser/the FileReader API/etc.
  • It means that you have some assurance that the data is a known, readable quantity, that (modulo transient filesystem errors) can be read successfully at any time.

If we say that having a Blob/a fulfilled response.blob() promise means the latter, then we really do need to wait to read the response stream to the end first.

What if we just added some kind of "skip" operation to ReadableStream? This would let you efficiently seek forwards to the part you want without necessarily loading it all into memory.

I feel like we discussed this before. I am amenable. IIRC last time we talked the idea was the default implementation would throw away data but the underlying byte source could provide a specialized skip operation to be more efficient.

@jakearchibald
Collaborator Author

Yeah, I think optimising blobs here is pretty complicated; I was only driving it when I thought it was something Firefox already did.

The right solution is fixing range requests, and maybe a skip/advance method on streams.

@wanderview
Member

I feel like we discussed this before. I am amenable. IIRC last time we talked the idea was the default implementation would throw away data but the underlying byte source could provide a specialized skip operation to be more efficient.

It would be nice to expose a hook to JS ReadableStreams as well. They may be able to do something smart depending on what their underlying source really is. If it's computed, they can just compute ahead, etc.

@domenic
Member

domenic commented Jun 20, 2017

It would be nice to expose a hook to JS ReadableStreams as well. They may be able to do something smart depending on what their underlying source really is. If it's computed, they can just compute ahead, etc.

Yeah, that's definitely what I meant; a new method you pass to the constructor.
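As a rough sketch of that shape (the `skip` member below is hypothetical, not part of the Streams standard; today's constructor simply ignores unknown members):

```js
// Hypothetical underlying-source hook: a computed source advances its
// internal counter instead of generating and then discarding bytes.
let position = 0;
const stream = new ReadableStream({
  pull(controller) {
    // Produce the next computed chunk from `position`.
    controller.enqueue(new Uint8Array([position++ & 0xff]));
  },
  skip(n) {
    // Proposed: jump ahead without materializing the skipped bytes.
    position += n;
  }
});
```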

@asutherland

Apologies for the confusion. I oversimplified and oversold the awesome powers of Blobs in that issue. Gecko only ever hands out Blob instances when the entire contents are available. There are currently no plans to speculatively return Blobs before all the data is received. Having said that, there is the edge-case w3c/FileAPI#47 wherein Gecko will throw an error on reads if the backing file selected from input[type=file] has had its file size change since selection.

Clarifying Gecko optimizations: As @wanderview points out, Gecko's Blob optimizations do not currently apply to Response.blob(). Its implementation is naive and treats all responses like they came over the network, consuming them in their entirety. Its returned Blob will be file-backed if it was large enough to spill to disk (and privacy/security settings allow), but it will be a new file on disk that is distinct from the one stored by the Cache API. Having said that, we're not far from being able to implement such an optimization, but it definitely will not return the Blob until the response has been received in its entirety (making them fully-written "incumbent records" instead of in-progress "fetching records").

@yutakahirano
Member

A ReadableStream-based solution sounds good.

@guest271314

guest271314 commented Jun 24, 2017

@jakearchibald I was able to set the Range request header to bytes=0-<number less than or equal to Content-Length> on a Request() object. The media plays back to the specified range when set at the .src of an HTMLMediaElement. However, when the first portion of the range is greater than 0, no media plays back.

When 0 is set as the first portion of the Range header value, the media plays back:

```js
let headers = new Headers();
let range = 1024 * 8 * 128;
headers.append("Range", "bytes=0-" + range);
let request = new Request(url, { headers: headers });
fetch(request)
  .then(response => response.blob())
  .then(data => audio.src = URL.createObjectURL(data));
```

When a number greater than 0 is set as the first portion of the Range header value, the media does not play back:

```js
headers.append("Range", "bytes=1-" + range);
```

http://plnkr.co/edit/vci20DGSjX1fAjrHOcOY?p=preview

Are you aware of the reason for this result?

@beaufortfrancois

audio.error.message says "DEMUXER_ERROR_COULD_NOT_OPEN: FFmpegDemuxer: open context failed", which doesn't help much...
However, I guess that the (missing) first byte is needed to read the metadata from the audio file properly, which is why playback fails.

@jakearchibald
Collaborator Author

Seems ok for this to be an implementation detail.
