Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for blob range requests #1520

Merged
merged 1 commit into from
Feb 20, 2023
Merged

Conversation

dlrobertson
Copy link
Member

@dlrobertson dlrobertson commented Oct 29, 2022

(See WHATWG Working Mode: Changes for more details.)

Implement support for blob range requests.

Related to: #1070


Preview | Diff

@dlrobertson
Copy link
Member Author

Posting this a bit early in the implementation phase... See the many TODOs 😄

Any feedback would be really helpful!

fetch.bs Outdated
Comment on lines 4637 to 4641
<li><p>If <var>response</var>'s <a for=response>range-requested flag</a> is unset, run the
following steps:

<ol>
<li><p>Set <var>response</var>'s <a for=response>status message</a> to `<code>OK</code>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High-level formatting nit (this does not just apply here):

Suggested change
<li><p>If <var>response</var>'s <a for=response>range-requested flag</a> is unset, run the
following steps:
<ol>
<li><p>Set <var>response</var>'s <a for=response>status message</a> to `<code>OK</code>`
<li>
<p>If <var>response</var>'s <a for=response>range-requested flag</a> is unset, then:
<ol>
<li><p>Set <var>response</var>'s <a for=response>status message</a> to `<code>OK</code>`.

fetch.bs Outdated
<li><p>Append <var>bodyLength</var> to <var>contentRange</var>.

<li><p>Set <var>response</var>'s <a for=response>status message</a> to
`<code>Partial Content</code>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to set status to 206 as well here. (It might be okay to not set status message at all and let it be the empty byte sequence as it would be in HTTP/2 and up, but I don't care strongly.)

fetch.bs Outdated
<p>blob.slice(<var>rangeStart</var>[0], <var>rangeValue</var>[1], <var>type</var>) here
on <var>body</var>.

<p class="note">TODO(dlrobertson): This requires a lot more thought.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Primarily we don't want to use the public JS API here but instead call an internal algorithm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the more I've looked at the the more I think your suggestion to upstream a change to FileAPI to make the guts of blob.slice a separate algorithm would be real nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a first attempt at this in w3c/FileAPI#183.

fetch.bs Outdated
<var>contentRange</var>.

<li>
<p>If <var>rangeValue</var>[0] is null, <a lt="serialize an integer">Serialize</a> and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use "Otherwise" here instead to make it clear it's one or the other?

fetch.bs Outdated
<li><p>Let <var>bodyWithType</var> be the result of <a for=BodyInit>safely extracting</a>
<var>blobURLEntry</var>'s <a for="blob URL entry">object</a>.

<li><p>Let <var>body</var> be <var>bodyWithType</var>'s <a for="body with type">body</a>.

<li><p>Let <var>length</var> be <var>body</var>'s <a for=body>length</a>,
<li><p>Let <var>bodyLength</var> be <var>body</var>'s <a for=body>length</a>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe initialLength/originalLength/fullLength is clearer?

@dlrobertson
Copy link
Member Author

Updated to use slice blob.

fetch.bs Show resolved Hide resolved
Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spotted a lot of tiny nits, but I might have missed something bigger therefore. Probably needs another close look once these are addressed. Hope that's understandable.

fetch.bs Outdated
<li><p>Set <var>response</var> to a new <a for=/>response</a>.

<li><p>If <var>request</var>'s <a for=response>header list</a> contains `<code>Range</code>`,
set <var>response</var>'s <a for=response>range-requested flag</a>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
set <var>response</var>'s <a for=response>range-requested flag</a>.
then set <var>response</var>'s <a for=response>range-requested flag</a>.

fetch.bs Outdated
Comment on lines 4668 to 4669
<p>If <var>response</var>'s <a for=response>range-requested flag</a> is unset, run the
following steps:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p>If <var>response</var>'s <a for=response>range-requested flag</a> is unset, run the
following steps:
<p>If <var>response</var>'s <a for=response>range-requested flag</a> is unset:

fetch.bs Outdated
Comment on lines 4682 to 4683
<p>Otherwise, if <var>response</var>'s <a for=response>range-requested flag</a> is set, run the
following steps:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p>Otherwise, if <var>response</var>'s <a for=response>range-requested flag</a> is set, run the
following steps:
<p>Otherwise:

(I was initially thinking we could maybe keep it without the "if" or have it as assert, but since there's only two branches I don't think that's warranted.)

fetch.bs Outdated
following steps:

<ol>
<li><p>Let <var>rangeHeader</var> be the result of <a>extracting header list values</a> given
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to use "get" or "get, decode, and split". We're trying to get rid of "extracting header list values".

fetch.bs Outdated
<li><p>Let <var>sliceEndRange</var> be null.

<li>
<p>If <var>rangeValue</var>[1] is non-null, let <var>sliceEndRange</var> be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then set sliceEndRange to*

fetch.bs Outdated
<li><p>Set <var>response</var>'s <a for=response>status</a> to `<code>206</code>`.

<li><p>Set <var>response</var>'s <a for=response>header list</a> to «
(`<code>Content-Length</code>`, <var>length</var>), (`<code>Content-Type</code>`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is <var>length</var> defined?

fetch.bs Outdated
<li><p>Set <var>response</var>'s <a for=response>status message</a> to
`<code>Partial Content</code>`.

<li><p>Set <var>response</var>'s <a for=response>status</a> to `<code>206</code>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this field is an integer.

Suggested change
<li><p>Set <var>response</var>'s <a for=response>status</a> to `<code>206</code>`.
<li><p>Set <var>response</var>'s <a for=response>status</a> to 206.

fetch.bs Show resolved Hide resolved
fetch.bs Show resolved Hide resolved
fetch.bs Outdated
<var>type</var>), (`<code>Content-Range</code>`, <var>contentRange</var>) ».
</ol>

<li><p> return <var>response</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<li><p> return <var>response</var>.
<li><p>Return <var>response</var>.

Copy link
Member Author

@dlrobertson dlrobertson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spotted a lot of tiny nits, but I might have missed something bigger therefore. Probably needs another close look once these are addressed. Hope that's understandable.

Definitely. Feedback is greatly appreciated!

fetch.bs Show resolved Hide resolved
fetch.bs Show resolved Hide resolved
Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks quite clean. I still have a couple inline comments and a longer thought here for a somewhat more substantial change.

So in principle we can get the type and size from a blob, although unfortunately they are not internal concepts as-of-yet.

That means we can avoid extracting twice.

It also means we can avoid setting up a new response upfront and can probably have two return statements at the end of the two branches where we create a response and set all its fields as we do now. (They're equivalent, but I kinda like the atomic look of the current setup.)

We also need only one conditional, whether request contains a Range header and the two branches can flow from that. (This is doable either way.)

Even if editorial, this would be some work. What do you think?

fetch.bs Outdated

<li>
<p>If <var>rangeValue</var>[1] is non-null, then set <var>sliceEndRange</var> to
<code><var>rangeValue</var>[1] + 1</code>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<code><var>rangeValue</var>[1] + 1</code>.
<var>rangeValue</var>[1] + 1.

Equations aren't code. At least that's not what we've gone with thus far.

fetch.bs Show resolved Hide resolved
fetch.bs Outdated
<ol>
<li><p>Let <var>rangeHeader</var> be the result of <a for="header list">getting</a>
`<code>Range</code>` from <var>request</var>'s <a for=request>header list</a>.
<!-- Range should be added to the headers defined in HTTP extensions. -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess that makes sense. And then move the single range header value parser there?

fetch.bs Outdated
Comment on lines 4740 to 4743
<li><p>Set <var>response</var>'s <a for=response>status message</a> to
`<code>Partial Content</code>`.

<li><p>Set <var>response</var>'s <a for=response>status</a> to 206.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's flip these two steps. Status is more significant.

fetch.bs Show resolved Hide resolved
@dlrobertson
Copy link
Member Author

Thanks, this looks quite clean. I still have a couple inline comments and a longer thought here for a somewhat more substantial change.

So in principle we can get the type and size from a blob, although unfortunately they are not internal concepts as-of-yet.

That means we can avoid extracting twice.

Yeah, this makes sense. I didn't change this in order to attempt to keep things closer to how they were before, but we could easily remove the bodyWithType steps in the section of the algorithm before the branch.

It also means we can avoid setting up a new response upfront and can probably have two return statements at the end of the two branches where we create a response and set all its fields as we do now. (They're equivalent, but I kinda like the atomic look of the current setup.)

We also need only one conditional, whether request contains a Range header and the two branches can flow from that. (This is doable either way.)

Even if editorial, this would be some work. What do you think?

I like the idea of the single return statement using Blob size and type definitions, but I'm happy to change this if this isn't preferred.

@dlrobertson
Copy link
Member Author

Updated to use Blob.size and Blob.type

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifications are generally quite happy with multiple return statements, but I'm okay sticking with one here. Couple final nits and this seems good to go.

fetch.bs Outdated
Comment on lines 4653 to 4654
<li><p>If <var>request</var>'s <a for=response>header list</a> contains `<code>Range</code>`,
then set <var>response</var>'s <a for=response>range-requested flag</a>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should deduplicate this with the conditional below.

fetch.bs Outdated

<li><p>Let <var>length</var> be <var>body</var>'s <a for=body>length</a>,
<li><p>Let <var>fullLength</var> be <var>blobURLEntry</var>'s
<a for="blob URL entry">object</a>'s <a for=Blob data-link-type=attribute>size</a>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use {{Blob/size}} as we do elsewhere in this document. Same for type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem! Just for future reference, what is the benefit of {{Blob/size}} and how does it differ from <a for=Blob data-link-type=attribute>size</a>? Both create links to https://w3c.github.io/FileAPI/#dfn-size for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the latter include also result in <code> being included? Anyway, the former is a shorthand that should do the right thing and it's useful to do it consistently within a document so you can easily find all instances.

Now having said all that, typically you would never want to reference an IDL attribute as it's the public API. The problem here is that Blob still doesn't define its internal model.

Copy link
Member Author

@dlrobertson dlrobertson Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the latter include also result in <code> being included?

Ah! No it does not.

Anyway, the former is a shorthand that should do the right thing and it's useful to do it consistently within a document so you can easily find all instances.

This makes a lot of sense.

fetch.bs Outdated

<li><p>Let <var>slicedLength</var> be <var>slicedBlob</var>'s
<a for=Blob data-link-type=attribute>size</a>, <a lt="serialize an integer">serialized</a>
and <a>isomorphic encoded</a>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we could still refer to the length of the body (that's the result of extracting), but I'm on the fence as to whether that's better. In any event we should use {{Blob/size}} until we can reference some internal field on Blob.

@dlrobertson
Copy link
Member Author

Updated to follow your suggestions. Let me know if there is anything else you'd like changed!

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo a couple tiny nits this looks good.

Are Chromium and WebKit passing all the tests or should there be bugs on them as well?

fetch.bs Outdated
@@ -4648,21 +4648,107 @@ steps:
<p class=note>The `<code>GET</code>` <a for=/>method</a> restriction serves no useful purpose
other than being interoperable.

<li><p>Let <var>bodyWithType</var> be the result of <a for=BodyInit>safely extracting</a>
<var>blobURLEntry</var>'s <a for="blob URL entry">object</a>.
<li><p>Set <var>response</var> to a new <a for=/>response</a>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<li><p>Set <var>response</var> to a new <a for=/>response</a>.
<li><p>Let <var>response</var> be a new <a for=/>response</a>.

fetch.bs Outdated
Comment on lines 4721 to 4724
<li><p>If <var>rangeValue</var>[0] is non-null, then
<a lt="serialize an integer">serialize</a> and
<a>isomorphic encode</a> <var>rangeValue</var>[0], and append the result to
<var>contentRange</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<li><p>If <var>rangeValue</var>[0] is non-null, then
<a lt="serialize an integer">serialize</a> and
<a>isomorphic encode</a> <var>rangeValue</var>[0], and append the result to
<var>contentRange</var>.
<li><p>If <var>rangeValue</var>[0] is non-null, then
<a lt="serialize an integer">serialize</a> and <a>isomorphic encode</a>
<var>rangeValue</var>[0], and append the result to <var>contentRange</var>.

fetch.bs Outdated
Comment on lines 4730 to 4733
<li><p>If <var>sliceEndRange</var> is non-null, then
<a lt="serialize an integer">serialize</a> and
<a>isomorphic encode</a> <var>sliceEndRange</var>, and append the result to
<var>contentRange</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<li><p>If <var>sliceEndRange</var> is non-null, then
<a lt="serialize an integer">serialize</a> and
<a>isomorphic encode</a> <var>sliceEndRange</var>, and append the result to
<var>contentRange</var>.
<li><p>If <var>sliceEndRange</var> is non-null, then
<a lt="serialize an integer">serialize</a> and <a>isomorphic encode</a>
<var>sliceEndRange</var>, and append the result to <var>contentRange</var>.

@dlrobertson
Copy link
Member Author

Are Chromium and WebKit passing all the tests or should there be bugs on them as well?

Good question! Both fail most of the upsupported range cases tested in web-platform-tests/wpt#34384. WebKit returns a response with a 416 status when the range is invalid and Chromium only fails 3 of the unsupported ranges, but does return a network error. We'll need to create bugs for this I guess.

@dlrobertson
Copy link
Member Author

Filed the bugs:

I definitely didn't file the chromium bug right... any feedback on filing non-firefox implementation bugs would be appreciated

@annevk
Copy link
Member

annevk commented Dec 9, 2022

I was about to merge this, but then read #1070 (comment) again. I think we should address the whitespace issue first, perhaps in a separate PR to constrain it to CORS?

@dlrobertson
Copy link
Member Author

No problem! I'll take a stab at it this evening

@annevk annevk added the do not merge yet Pull request must not be merged per rationale in comment label Feb 8, 2023
fetch.bs Outdated
https://github.com/whatwg/fetch/issues/1553 for future work -->

<li><p>Let <var>rangeValue</var> be the result of <a>parsing a single range header value</a>
given <var>rangeHeader</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
given <var>rangeHeader</var>.
given <var>rangeHeader</var> and true.

I think this is the only change needed here. But if you could gloss it over one last time that would be appreciated.

I suspect we'll land this somewhere next week given the Bikeshed issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll re-read and post an update this weekend

Implement support for blob range requests.
annevk pushed a commit that referenced this pull request Feb 20, 2023
No normative changes. This is a pre-requisite for #1520.
@annevk annevk removed the do not merge yet Pull request must not be merged per rationale in comment label Feb 20, 2023
@annevk annevk merged commit dc0f9e7 into whatwg:main Feb 20, 2023
@annevk annevk added the addition/proposal New features or enhancements label Feb 20, 2023
annevk pushed a commit to web-platform-tests/wpt that referenced this pull request Feb 20, 2023
@dlrobertson dlrobertson deleted the blob-range-req branch February 21, 2023 00:10
@dlrobertson
Copy link
Member Author

Thanks @annevk for all the help on this!

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Mar 1, 2023
Automatic update from web-platform-tests
Fetch: blob: URL range requests

For whatwg/fetch#1520.
--

wpt-commits: edc7cac3a37e9478fa301fe4c001bcaa3d79c62e
wpt-pr: 34384
jamienicol pushed a commit to jamienicol/gecko that referenced this pull request Mar 1, 2023
Automatic update from web-platform-tests
Fetch: blob: URL range requests

For whatwg/fetch#1520.
--

wpt-commits: edc7cac3a37e9478fa301fe4c001bcaa3d79c62e
wpt-pr: 34384
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Mar 2, 2023
Automatic update from web-platform-tests
Fetch: blob: URL range requests

For whatwg/fetch#1520.
--

wpt-commits: edc7cac3a37e9478fa301fe4c001bcaa3d79c62e
wpt-pr: 34384
@annevk annevk mentioned this pull request Mar 21, 2023
3 tasks
marcoscaceres pushed a commit to web-platform-tests/wpt that referenced this pull request Mar 28, 2023
aosmond pushed a commit to aosmond/gecko that referenced this pull request May 18, 2023
Automatic update from web-platform-tests
Fetch: blob: URL range requests

For whatwg/fetch#1520.
--

wpt-commits: edc7cac3a37e9478fa301fe4c001bcaa3d79c62e
wpt-pr: 34384
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition/proposal New features or enhancements
Development

Successfully merging this pull request may close these issues.

None yet

2 participants