Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Let pointer be a pointer to first code point in..." #373

Closed
mfalken opened this issue Feb 12, 2018 · 8 comments · Fixed by #488
Closed

"Let pointer be a pointer to first code point in..." #373

mfalken opened this issue Feb 12, 2018 · 8 comments · Fixed by #488
Labels
clarification Standard could be clearer topic: parser

Comments

@mfalken
Copy link

mfalken commented Feb 12, 2018

https://url.spec.whatwg.org/commit-snapshots/5fc4c4c17f3077beda5e3647df73f4b1b0ba7084/#url-parsing

Let pointer be a pointer to first code point in input.

What happens if input is the empty string? See some confusion in whatwg/fetch#669

@TimothyGu
Copy link
Member

@mfalken
Copy link
Author

mfalken commented Feb 13, 2018

Thanks but I'm still not sure what the spec is saying. Let's run the basic URL parser where input="" and base="https://www.example.com". Here's my reading:

After step 1, url is a new URL.

After step 10. "Let pointer be a pointer to first code point in input.", pointer points to EOF.

In step 11. we enter the state machine at "scheme start state". c is undefined, so we set state to "no scheme state" and decrease pointer by one. Now the state machine step is over, so increase pointer by one and continue. pointer is again pointing to EOF, so the state machine is done. So we return url, which is a new URL.

Is that correct as-written, and as-intended?

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Feb 14, 2018
… empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
aarongable pushed a commit to chromium/chromium that referenced this issue Feb 14, 2018
… empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhirano@chromium.org>
Reviewed-by: Tsuyoshi Horo <horo@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#536644}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Feb 14, 2018
… empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhirano@chromium.org>
Reviewed-by: Tsuyoshi Horo <horo@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#536644}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Feb 14, 2018
… empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhirano@chromium.org>
Reviewed-by: Tsuyoshi Horo <horo@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#536644}
@annevk
Copy link
Member

annevk commented Feb 18, 2018

No, that seems like a bug. 😟

This should basically return the base URL except for the fragment.

@GPHemsley
Copy link
Member

I was never satisfied with the resolution to #343, as I didn't think it implemented what I wrote in #308, which is what I used in my Perl implementation.

I haven't taken the time to confirm whether this issue revolves around those differences, but I figured I'd bring it up again for the sake of discussion.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 1, 2018
… with a Location header with empty value., a=testonly

Automatic update from web-platform-testsFetch API: Don't DCHECK on a 302 response with a Location header with empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhirano@chromium.org>
Reviewed-by: Tsuyoshi Horo <horo@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#536644}

wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453
wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453
@rmisev
Copy link
Member

rmisev commented Apr 2, 2018

Step 11 in specification:

  1. Keep running the following state machine by switching on state. (1) If after a run pointer points to the EOF code point, go to the next step. (2) Otherwise, increase pointer by one and continue with the state machine.

From @mattto's #373 (comment):

... Now the state machine step is over, (2) so increase pointer by one and continue. (1) pointer is again pointing to EOF, so the state machine is done.

The order of (1) and (2) operations here is different than in the spec, so result is incorrect.

If follow specification, then after (2) operation we enter the "no scheme state":

... set state to relative state and decrease pointer by one.

Then we enter the "relative state", and c is EOF:

Set url’s scheme to base’s scheme, and then, switching on c:

  • The EOF code point
    Set url’s username to base’s username, url’s password to base’s password, url’s host to base’s host, url’s port to base’s port, url’s path to a copy of base’s path, and url’s query to base’s query.

Now the pointer isn't decreased, so it points to EOF and the state machine is done after (1) operation. The result is as specified in @annevk's #373 (comment).

@annevk
Copy link
Member

annevk commented Apr 3, 2018

Aah right. That always annoyed me about how this state machine is written. If we can rewrite it somehow to be more basic that'd be good.

gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 2, 2019
… with a Location header with empty value., a=testonly

Automatic update from web-platform-testsFetch API: Don't DCHECK on a 302 response with a Location header with empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhiranochromium.org>
Reviewed-by: Tsuyoshi Horo <horochromium.org>
Commit-Queue: Matt Falkenhagen <falkenchromium.org>
Cr-Commit-Position: refs/heads/master{#536644}

wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453
wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453

UltraBlame original commit: 794c66f8f16ec6e3439a4485a75bf80492609514
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 2, 2019
… with a Location header with empty value., a=testonly

Automatic update from web-platform-testsFetch API: Don't DCHECK on a 302 response with a Location header with empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhiranochromium.org>
Reviewed-by: Tsuyoshi Horo <horochromium.org>
Commit-Queue: Matt Falkenhagen <falkenchromium.org>
Cr-Commit-Position: refs/heads/master{#536644}

wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453
wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453

UltraBlame original commit: 794c66f8f16ec6e3439a4485a75bf80492609514
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 2, 2019
… with a Location header with empty value., a=testonly

Automatic update from web-platform-testsFetch API: Don't DCHECK on a 302 response with a Location header with empty value.

Blink was failing a DCHECK when it expected a redirect to be taken because
the request mode was 'follow' and the response had a redirect status code
and Location header. However, Chrome's //net considers a Location header
with an empty value to be the same as no Location header at all. See the
comments on bug 810288 about net::HttpResponseHeaders::IsRedirect().

WPT tests are added. The spec is unclear (whatwg/url#373)
but seems to either say to fail with network error or enter a loop of self-redirects
that will eventually hit the redirect limit and fail with network error.
Chrome currently fails this test because it treats the Location as
as non-existent, i.e., not a redirect response.

Bug: 810288,707185
Change-Id: Iaca93976aaa697380b1542ab45d63c318f5050b0
Reviewed-on: https://chromium-review.googlesource.com/910753
Reviewed-by: Yutaka Hirano <yhiranochromium.org>
Reviewed-by: Tsuyoshi Horo <horochromium.org>
Commit-Queue: Matt Falkenhagen <falkenchromium.org>
Cr-Commit-Position: refs/heads/master{#536644}

wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453
wpt-commits: e75788501733a0801eec374c19f8cdfbccd9c09d
wpt-pr: 9453

UltraBlame original commit: 794c66f8f16ec6e3439a4485a75bf80492609514
@domenic domenic added the clarification Standard could be clearer label Mar 4, 2020
@stephanebastian
Copy link

Step 11 in specification:

  1. Keep running the following state machine by switching on state. (1) If after a run pointer points to the EOF code point, go to the next step. (2) Otherwise, increase pointer by one and continue with the state machine.

From @mattto's #373 (comment):

... Now the state machine step is over, (2) so increase pointer by one and continue. (1) pointer is again pointing to EOF, so the state machine is done.

The order of (1) and (2) operations here is different than in the spec, so result is incorrect.

If follow specification, then after (2) operation we enter the "no scheme state":

... set state to relative state and decrease pointer by one.

Then we enter the "relative state", and c is EOF:

Hi all,
Let me give my 2 cents ;)
Based on the current wording of the spec, I don't think that we ever get into the 'relative state', thus I would expect to return a new URL (as initially reported by @mattto )
If we've got an empty input then the state machine runs 'scheme start state'
After that run, and according to the spec, we simply go to step 12

If after a run pointer points to the EOF code point, go to the next step (-> step 12 return url)

To me we are at EOF after the 1st run since there is no stream per say as there is no content. Quick info from the spec

The EOF code point is a conceptual code point that signifies the end of a string or code point stream

So I am puzzled as to how it is possible to enter the "relative state" ?
Could you please shed some light ?
Thanks in advance

@annevk
Copy link
Member

annevk commented Apr 24, 2020

I think it is indeed somewhat broken, but if we assume that we start out by pointer pointing at EOF (it's the empty string after all), it is then decreased by 1. Unclear what it now points at, but let's say not EOF so we continue and increase it by 1, at which point it again points to EOF but we're doing another run at least.

But we should rewrite this in terms of Infra, possibly using https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points.

annevk added a commit that referenced this issue May 4, 2020
annevk added a commit that referenced this issue May 5, 2020
This makes the following changes:

* Makes it clear pointer can be negative.
* Uses 1 rather than one as the newish style.
* Uses ordered lists for steps inside all states.
* Clones paths rather than using an undefined copy operation.
* Addresses a number of editorial nits.

Closes #373 and closes #480.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification Standard could be clearer topic: parser
Development

Successfully merging a pull request may close this issue.

7 participants