(v7.x backport) url: updates to the WHATWG URL parser #12507

Merged
merged 21 commits into from Apr 25, 2017

Conversation

@TimothyGu
Member

TimothyGu commented Apr 19, 2017

This pull request brings the WHATWG URL implementation in v7.x up to speed with master, including the code itself, the docs, and the tests.

Note: #12042 is not backported since #11956 (which it depends on) has not been backported yet.

Backports of (in the order they were landed to master):

  • #11060 url: extend URLSearchParams constructor
  • #11436 url: enforce valid UTF-8 in WHATWG parser
  • #11737 url: call toString before valueOf when stringifying
  • #11626 url: spec-compliant URLSearchParams serializer
  • #11859 src: remove explicit UTF-8 validity check in url
  • #11858 url: spec-compliant URLSearchParams parser
  • #11930 lib: use Object.create(null) directly (only the part that adds an explicit class for url[context])
  • #12056 url: add ToObject method to native URL class
  • #11917 src, url: WHATWG URL C++ parser cleanup
  • #12058 url: change path parsing for non-special URLs
  • #12134 url: stricter domainTo*() argument checking
  • #11690 url: avoid instanceof for WHATWG URL
  • #12203 url: trim leading slashes of file URL paths
  • #12331 url: remove javascript URL special case
  • #12315 url: disallow invalid IPv4 in IPv6 parser
  • #12252 url: clean up WHATWG URL origin generation
  • #12253 url: improve WHATWG URL inspection
  • #12251 src: clean up WHATWG URL parser, round 2
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

url

Member

jasnell left a comment

rubber stamp LGTM

TimothyGu and others added 18 commits Jan 28, 2017
PR-URL: #12507
Fixes: #10635
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #12507
Ref: whatwg/url#175
Reviewed-By: James M Snell <jasnell@gmail.com>
This commit implements the Web IDL USVString conversion, which mandates
all unpaired Unicode surrogates be turned into U+FFFD REPLACEMENT
CHARACTER. It also disallows Symbols to be used as USVString per spec.

Certain functions call into C++ methods in the binding that use the
Utf8Value class to access string arguments. Utf8Value already does the
normalization using V8's String::Write, so in those cases, instead of
doing the full USVString normalization, only a symbol check is done
(`'' + val`, which uses ES's ToString, versus `String()` which has
special provisions for symbols).

PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
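
The two checks described in the commit message above can be sketched in plain JavaScript. This is an illustration only, not the code from the commit: `toUSVString` and `throwsOnSymbol` are hypothetical helper names, and the loop is one straightforward way to replace unpaired surrogates with U+FFFD.

```javascript
// Hypothetical helper (not Node's internal code): replace every unpaired
// surrogate with U+FFFD, as the Web IDL USVString conversion requires.
function toUSVString(val) {
  const str = `${val}`; // ToString(); throws a TypeError for Symbols
  let out = '';
  for (let i = 0; i < str.length; i++) {
    const c = str.charCodeAt(i);
    if (c >= 0xD800 && c <= 0xDBFF) {
      // Lead surrogate: valid only when a trail surrogate follows.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        out += str[i] + str[i + 1];
        i++; // skip the trail surrogate we just consumed
      } else {
        out += '\uFFFD';
      }
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
      // Trail surrogate without a preceding lead surrogate.
      out += '\uFFFD';
    } else {
      out += str[i];
    }
  }
  return out;
}

// Hypothetical helper: does a conversion reject Symbols with a TypeError?
function throwsOnSymbol(convert) {
  try {
    convert(Symbol('s'));
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}

// Implicit ToString() (here via `'' + val`) rejects Symbols ...
const concatThrows = throwsOnSymbol((v) => '' + v);
// ... while String() has a special case that stringifies them.
const stringThrows = throwsOnSymbol((v) => String(v));

console.log(toUSVString('a\uD800b') === 'a\uFFFDb', concatThrows, stringThrows);
```

The cheap symbol check is enough wherever the C++ side already normalizes the string, which is the case the commit message describes for `Utf8Value`.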
The ES addition operator calls the ToPrimitive() abstract operation
without hint String, leading a subsequent OrdinaryToPrimitive() to call
valueOf() first on an object rather than the desired toString().

Instead, use template literals, which directly call the ToString() abstract
operation, per the Web IDL spec.

PR-URL: #12507
Fixes: b610a4d "url: enforce valid UTF-8 in WHATWG parser"
Refs: b610a4d#commitcomment-21200056
Refs: https://tc39.github.io/ecma262/#sec-addition-operator-plus-runtime-semantics-evaluation
Refs: https://tc39.github.io/ecma262/#sec-template-literals-runtime-semantics-evaluation
Reviewed-By: James M Snell <jasnell@gmail.com>
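
The difference between the two conversions is easy to observe with an object that defines both methods. The object below is made up for illustration; only the language semantics are taken from the commit message above.

```javascript
// An illustrative object that reports which conversion method was used.
const value = {
  valueOf() { return 'from valueOf'; },
  toString() { return 'from toString'; }
};

// Addition runs ToPrimitive() without the String hint: valueOf() wins.
const viaAddition = '' + value;

// A template literal runs ToString(): toString() wins.
const viaTemplate = `${value}`;

console.log(viaAddition); // from valueOf
console.log(viaTemplate); // from toString
```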
PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
This step was never part of the URL Standard's host parser algorithm,
and is rendered unnecessary after IDNA errors are no longer ignored.

PR-URL: #12507
Refs: c2a302c "src: do not ignore IDNA conversion error"
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
The entire `URLSearchParams` class is now fully spec-compliant.

PR-URL: #12507
Fixes: #10821
Reviewed-By: James M Snell <jasnell@gmail.com>
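
As a rough illustration of what the constructor, parser, and serializer work covers, the sketch below builds `URLSearchParams` from a string, from a sequence of pairs, and from a plain object. It assumes a Node.js release where `require('url')` exports `URLSearchParams`; the parameter names and values are made up.

```javascript
const { URLSearchParams } = require('url');

// Parser: a leading '?' is dropped and '+' decodes to a space.
const fromString = new URLSearchParams('?a=1&b=2+3');

// Constructor: a sequence of [name, value] pairs; duplicate names are kept.
const fromPairs = new URLSearchParams([['q', 'x y'], ['q', 'z']]);

// Constructor: a plain object treated as a record of name/value pairs.
const fromRecord = new URLSearchParams({ lang: 'en', 'k&v': 'a=b' });

console.log(fromString.get('b'));   // 2 3
console.log(fromPairs.toString());  // q=x+y&q=z
console.log(fromRecord.toString()); // lang=en&k%26v=a%3Db
```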
The object is used as a structure, not as a map, which `StorageObject`
was designed for.

PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
Provides a factory method to convert a native URL class
into a JS URL object.

```c++
Environment* env = ...

URL url("http://example.org/a/b/c?query#fragment");

MaybeLocal<Value> val = url.ToObject(env);
```

PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
- Clarify port state
- Remove scheme flag
- Clarify URL_FLAG_TERMINATED

PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
This changes the way paths are parsed for non-special URLs.
It allows paths to be empty for non-special URLs and also
takes that into account when serializing.

PR-URL: #12507
Fixes: #11962
Refs: whatwg/url#213
Reviewed-By: James M Snell <jasnell@gmail.com>
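
A small sketch of the behavior this describes, comparing a special scheme (`http`) with a made-up non-special scheme (`foo`). It assumes a Node.js release whose WHATWG URL parser includes this change; the host name is only an example.

```javascript
const { URL } = require('url');

// Special scheme: the path is never empty, so it defaults to '/'.
const special = new URL('http://example.org');

// Non-special scheme ('foo' is made up): the path may stay empty,
// and the serializer does not add a slash.
const nonSpecial = new URL('foo://example.org');

console.log(special.pathname);    // /
console.log(nonSpecial.pathname); // (empty string)
console.log(nonSpecial.href);     // foo://example.org
```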
PR-URL: #12507
Refs: web-platform-tests/wpt#4586
Refs: #11887
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
The parser should reduce the slashes after the colon to three for file URLs.

PR-URL: #12507
Refs: web-platform-tests/wpt#5195
Fixes: #11188
Reviewed-By: James M Snell <jasnell@gmail.com>
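
The rule stated above can be pictured at the string level. The helper below is hypothetical and is not how the parser implements it (the real logic lives in the C++ state machine); it only shows the input/output shape for file URLs with no host. What `new URL()` actually returns for such input depends on the Node.js version and the revision of the URL Standard it follows.

```javascript
// Hypothetical string-level helper: collapse three or more slashes after
// "file:" into exactly three. Other schemes are left untouched.
function collapseFileSlashes(input) {
  return input.replace(/^file:\/{3,}/, 'file:///');
}

console.log(collapseFileSlashes('file://////tmp/a.txt')); // file:///tmp/a.txt
console.log(collapseFileSlashes('file:///tmp/a.txt'));    // file:///tmp/a.txt
console.log(collapseFileSlashes('http:////example.org')); // http:////example.org
```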
PR-URL: #12507
Fixes: #11485
Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #12507
Fixes: #10655
Reviewed-By: James M Snell <jasnell@gmail.com>
evanlucas added a commit that referenced this pull request May 1, 2017
- Use ordinary properties instead of symbols/getter redirection for
  internal object
- Use template string literals
- Remove unneeded custom inspection for internal objects
- Remove unneeded OpaqueOrigin class
- Remove unneeded type checks

PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
evanlucas added a commit that referenced this pull request May 1, 2017
PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
evanlucas added a commit that referenced this pull request May 1, 2017
* reduce indentation
* refactor URL inlined methods
* prefer templates over macros
* do not export ARG_* flags in url binding

PR-URL: #12507
Reviewed-By: James M Snell <jasnell@gmail.com>
evanlucas added a commit that referenced this pull request May 2, 2017
Notable changes:

* **crypto**:
  - add randomFill and randomFillSync (Evan Lucas)
    #10209
* **meta**: Added new collaborators
  - add lucamaraschi to collaborators (Luca Maraschi)
    #12538
  - add DavidCai1993 to collaborators (David Cai)
    #12435
  - add jkrems to collaborators (Jan Krems)
    #12427
  - add AnnaMag to collaborators (AnnaMag)
    #12414
* **process**:
  - fix crash when Promise rejection is a Symbol (Cameron Little)
    #11640
* **url**:
  - make WHATWG URL more spec compliant (Timothy Gu)
    #12507
* **v8**:
  - fix stack overflow in recursive method (Ben Noordhuis)
    #12460
  - fix build errors with g++ 7 (Ben Noordhuis)
    #12392

PR-URL: #12775