url: make WHATWG URL properties spec compliant #10408

Closed
wants to merge 2 commits into
from
@joyeecheung
Contributor
joyeecheung commented Dec 22, 2016 edited
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

url, test

Description of change

Fixes: #10376

Possible conflicts: #10317, #10399, those two don't touch internal/url.js that much so should be easy to resolve with rebase.

A few TODOs that I think should go into other PR(s), and are possibly in confict with this one:

  • TupleOrigin#toString should have unicode = true as default (or just get rid of the argument)
  • URL#toString probably don't need the arguments. Those are not tested ATM anyway(see the test coverage) (conflicts with this PR).
  • URL.originFor, URL.domainToASCII and URL.domainToUnicode should be moved to require('URL') (conflicts with this PR)(#10374, #10375).

cc/ @jasnell

@joyeecheung
Contributor
joyeecheung commented Dec 22, 2016 edited

Looks like github doesn't handle whitespace diffs very well :/ I believe only the URL constructor and the href setter(non-existent before this PR) should have function body changes.

EDIT: now the URLSearchParams constructor is changed too.

@joyeecheung
Contributor
joyeecheung commented Dec 22, 2016 edited

I'm going to split this into multiple commits so it's easier to review. Mark this as WIP.

EDIT: turns out we can add ?w=1 to the commit page and ignore whitespace changes. So f5e0fff?w=1 should be easier to review.

@joyeecheung joyeecheung changed the title from url: make WHATWG URL properties spec compliant to [WIP] url: make WHATWG URL properties spec compliant Dec 22, 2016
@joyeecheung joyeecheung changed the title from [WIP] url: make WHATWG URL properties spec compliant to url: make WHATWG URL properties spec compliant Dec 22, 2016
@addaleax
Member

Also /cc @targos @TimothyGu

@TimothyGu
Member

Can you post some benchmark results to make sure that the performance stays the same?

@joyeecheung
Contributor

@TimothyGu Sure. Added those benchmarks to this PR too.

                                                                                                                                 improvement significant      p.value
 url/whatwg-url-properties.js n=10000 prop="hash" url="http://example.com/"                                                          -0.09 %             9.841584e-01
 url/whatwg-url-properties.js n=10000 prop="hash" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                             3.56 %             1.556157e-01
 url/whatwg-url-properties.js n=10000 prop="hash" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"              1.79 %             5.404247e-01
 url/whatwg-url-properties.js n=10000 prop="hash" url="javascript:alert(\\"node is awesome\\");"                                      6.08 %             2.047208e-01
 url/whatwg-url-properties.js n=10000 prop="host" url="http://example.com/"                                                          -3.54 %             1.022282e-01
 url/whatwg-url-properties.js n=10000 prop="host" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                            -4.71 %             2.502342e-01
 url/whatwg-url-properties.js n=10000 prop="host" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"             -3.93 %             1.162558e-01
 url/whatwg-url-properties.js n=10000 prop="host" url="javascript:alert(\\"node is awesome\\");"                                     -0.06 %             9.771016e-01
 url/whatwg-url-properties.js n=10000 prop="hostname" url="http://example.com/"                                                      -1.01 %             5.929368e-01
 url/whatwg-url-properties.js n=10000 prop="hostname" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                        -0.30 %             9.221817e-01
 url/whatwg-url-properties.js n=10000 prop="hostname" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"          0.75 %             7.293513e-01
 url/whatwg-url-properties.js n=10000 prop="hostname" url="javascript:alert(\\"node is awesome\\");"                                 -2.26 %             6.183556e-01
 url/whatwg-url-properties.js n=10000 prop="href" url="http://example.com/"                                                          -5.10 %             2.268722e-01
 url/whatwg-url-properties.js n=10000 prop="href" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                            -1.74 %             4.968612e-01
 url/whatwg-url-properties.js n=10000 prop="href" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"              2.55 %             4.222762e-01
 url/whatwg-url-properties.js n=10000 prop="href" url="javascript:alert(\\"node is awesome\\");"                                      2.76 %             3.673337e-01
 url/whatwg-url-properties.js n=10000 prop="origin" url="http://example.com/"                                                        -0.09 %             9.740200e-01
 url/whatwg-url-properties.js n=10000 prop="origin" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                          -2.34 %             3.395517e-01
 url/whatwg-url-properties.js n=10000 prop="origin" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"            1.94 %             5.599582e-01
 url/whatwg-url-properties.js n=10000 prop="origin" url="javascript:alert(\\"node is awesome\\");"                                   12.13 %         *** 2.510968e-05
 url/whatwg-url-properties.js n=10000 prop="password" url="http://example.com/"                                                       3.24 %             1.762369e-01
 url/whatwg-url-properties.js n=10000 prop="password" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                         1.65 %             5.167405e-01
 url/whatwg-url-properties.js n=10000 prop="password" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"          5.14 %             2.250016e-01
 url/whatwg-url-properties.js n=10000 prop="password" url="javascript:alert(\\"node is awesome\\");"                                  7.27 %             8.088908e-02
 url/whatwg-url-properties.js n=10000 prop="pathname" url="http://example.com/"                                                      -1.57 %             5.112586e-01
 url/whatwg-url-properties.js n=10000 prop="pathname" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                         0.03 %             9.880163e-01
 url/whatwg-url-properties.js n=10000 prop="pathname" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"         -1.75 %             4.691196e-01
 url/whatwg-url-properties.js n=10000 prop="pathname" url="javascript:alert(\\"node is awesome\\");"                                 -0.35 %             9.299632e-01
 url/whatwg-url-properties.js n=10000 prop="port" url="http://example.com/"                                                          -5.57 %           * 3.324185e-02
 url/whatwg-url-properties.js n=10000 prop="port" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                            -4.47 %             1.632997e-01
 url/whatwg-url-properties.js n=10000 prop="port" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"             -0.89 %             7.564485e-01
 url/whatwg-url-properties.js n=10000 prop="port" url="javascript:alert(\\"node is awesome\\");"                                     -3.76 %             5.333226e-01
 url/whatwg-url-properties.js n=10000 prop="protocol" url="http://example.com/"                                                      -3.29 %             2.048837e-01
 url/whatwg-url-properties.js n=10000 prop="protocol" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                        -0.54 %             8.972573e-01
 url/whatwg-url-properties.js n=10000 prop="protocol" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"         -0.17 %             9.533789e-01
 url/whatwg-url-properties.js n=10000 prop="protocol" url="javascript:alert(\\"node is awesome\\");"                                 -1.90 %             5.691681e-01
 url/whatwg-url-properties.js n=10000 prop="search" url="http://example.com/"                                                        -3.73 %             2.202583e-01
 url/whatwg-url-properties.js n=10000 prop="search" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                           3.71 %             1.145016e-01
 url/whatwg-url-properties.js n=10000 prop="search" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"           -1.40 %             7.044299e-01
 url/whatwg-url-properties.js n=10000 prop="search" url="javascript:alert(\\"node is awesome\\");"                                   -3.21 %             1.908788e-01
 url/whatwg-url-properties.js n=10000 prop="searchParams" url="http://example.com/"                                                  -0.61 %             8.534573e-01
 url/whatwg-url-properties.js n=10000 prop="searchParams" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                    -6.61 %             2.891761e-01
 url/whatwg-url-properties.js n=10000 prop="searchParams" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"     -5.15 %             2.477534e-01
 url/whatwg-url-properties.js n=10000 prop="searchParams" url="javascript:alert(\\"node is awesome\\");"                             -3.71 %             4.798811e-01
 url/whatwg-url-properties.js n=10000 prop="toString" url="http://example.com/"                                                      -1.56 %             5.883913e-01
 url/whatwg-url-properties.js n=10000 prop="toString" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                        -0.39 %             8.858133e-01
 url/whatwg-url-properties.js n=10000 prop="toString" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"          0.10 %             9.596280e-01
 url/whatwg-url-properties.js n=10000 prop="toString" url="javascript:alert(\\"node is awesome\\");"                                 -0.18 %             9.552103e-01
 url/whatwg-url-properties.js n=10000 prop="username" url="http://example.com/"                                                       0.17 %             9.549128e-01
 url/whatwg-url-properties.js n=10000 prop="username" url="http://user:pass@foo.bar.com:21/aaa/zzz?l=24#test"                         7.21 %             6.299967e-02
 url/whatwg-url-properties.js n=10000 prop="username" url="https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en"          3.46 %             1.986782e-01
 url/whatwg-url-properties.js n=10000 prop="username" url="javascript:alert(\\"node is awesome\\");"                                 -0.01 %             9.918962e-01           
@TimothyGu

I wouldn't worry about conflicts too much at this point. We can get it in later.

lib/internal/url.js
+ this[context].query = query;
+ this[context].fragment = fragment;
+ this[context].host = host;
+ this[searchParams] = new URLSearchParams(query);
@TimothyGu
TimothyGu Dec 22, 2016 Member

The searchParams attribute is declared with [SameObject]. When the href setter is invoked, the underlying header list should be replaced, but not the URLSearchParams object itself.

@joyeecheung
joyeecheung Dec 23, 2016 Contributor

Thanks for spotting this. Fixed and added tests. Could be in conflict with https://github.com/nodejs/node/pull/10399/files#diff-7fa9474fccb31c3ccef5d728fbd5248bL557 for reusing part of the URLSearchParams constructor though, should be easy to rebase.

@joyeecheung
Contributor

Ran the new-url-parse benchmark with method=new. No significant performance impact.

                                                         improvement significant    p.value
 url/new-url-parse.js n=250000 method="new" type="five"      -1.02 %             0.20831528
 url/new-url-parse.js n=250000 method="new" type="four"      -2.19 %             0.06488415
 url/new-url-parse.js n=250000 method="new" type="one"       -0.99 %             0.24252718
 url/new-url-parse.js n=250000 method="new" type="three"     -1.89 %             0.07118159
 url/new-url-parse.js n=250000 method="new" type="two"       -0.65 %             0.45040674
lib/internal/url.js
@@ -105,7 +105,11 @@ function parse(input, base) {
this[context].query = query;
this[context].fragment = fragment;
this[context].host = host;
- this[searchParams] = new URLSearchParams(query);
+ if (this[searchParams]) { // invoked from href setter
+ initSearchParams(this[searchParams], query);
@TimothyGu
TimothyGu Dec 23, 2016 Member

So initSearchParams strips a leading question mark, if one exists. But is it possible for query contain a legitimate leading ? at this point?

I'd rather just make a parseSearchParams function, while moving the stringification and question mark stripping to where those are needed.

@joyeecheung
joyeecheung Dec 23, 2016 Contributor

I am not very sure about the possibility of the leading ?. If we do need to make it a precondition, then we probably need to add a comment in node_url.cc?

@TimothyGu
TimothyGu Dec 23, 2016 Member

Actually, after looking at node_url.cc, it's not possible for there to be a question mark here, which means that these following lines are not needed for this case:

init = String(init);
if (init[0] === '?') init = init.slice(1);

As a result of that, please move these lines from initSearchParams back to URLSearchParams's constructor.

@joyeecheung
joyeecheung Dec 23, 2016 Contributor

I've moved them back. The commits are there for review, may need to squash them before rebase.

lib/internal/url.js
- init = String(init);
- if (init[0] === '?') init = init.slice(1);
- this[searchParams] = querystring.parse(init);
+ initSearchParams(this, init);
@TimothyGu
TimothyGu Dec 23, 2016 Member

Yeah, I'd much prefer keeping the String() and .slice here, as they are such in the spec.

@jasnell
Member
jasnell commented Dec 23, 2016

I'd like to take a bit more time to review this before it lands... unfortunately I won't have time until later next week after the holiday.

@TimothyGu

No more comments feom me. Thanks!

@addaleax
Member

Could you squash these commits together in the form you think they could land in? That would make it a bit easier to review this. :)

(I think the added benchmark doesn’t need to be squashed together with the rest.)

@joyeecheung
Contributor

@addaleax OK, I've squashed them into two commits. https://github.com/nodejs/node/pull/10408/files?w=1 should be easier to review with the whitespace changes ignored.

lib/internal/url.js
@@ -85,34 +85,40 @@ class TupleOrigin {
}
}
+// Resued by URL constructor and
@addaleax
addaleax Dec 26, 2016 Member

typo: Reused? (This comment can also probably fit into a single line)

@joyeecheung
joyeecheung Dec 26, 2016 Contributor

Thanks for spotting this :)

lib/internal/url.js
+ writable: true,
+ enumerable: true,
+ configurable: true,
+ value: function(options) {
@addaleax
addaleax Dec 26, 2016 Member

nit: I’d suggest naming this function (i.e. function toString(options)), otherwise V8 would infer its name as value

@joyeecheung
joyeecheung Dec 26, 2016 Contributor

So the linter yelled at me when I do this...

168:5 error Function name toString should match property name value func-name-matching

maybe cc/ @trott ?

@Trott
Trott Dec 27, 2016 Member

You can disable the func-name-matching for that one line with a comment.

configurable: true,
// eslint-disable-next-line func-name-matching
value: function toString(options) {

There's also // eslint-disable-line func-name-matching but since that has to be appended to the offending line, it often results in a line over 80 characters long, which violates a different lint rule. :-D

FWIW, this came up on the ESLint issue tracker two months ago but hasn't moved much since then. Doesn't look like there was consensus on whether something should be done or not. eslint/eslint#7423

/cc @not-an-aardvark

+ 'username', 'password', 'host', 'hostname', 'port',
+ 'pathname', 'search', 'searchParams', 'hash'];
+
+assert.deepStrictEqual(props, expected);
@addaleax
addaleax Dec 26, 2016 Member

I wonder: Is there a chance the ordering of the properties changes? Would we want to catch that or do we want to .sort() these?

@joyeecheung
joyeecheung Dec 26, 2016 Contributor

According to https://heycam.github.io/webidl/#idl-interfaces I think the order doesn't really matter at the moment, although FF and Chrome do keep the enumeration order the same as the spec. Maybe this one can be left as-is until the spec changes?

@jasnell jasnell self-assigned this Dec 26, 2016
joyeecheung added some commits Dec 22, 2016
@joyeecheung joyeecheung benchmark: add benchmark for WHATWG URL properties 62d115f
@joyeecheung joyeecheung url: make WHATWG URL properties spec compliant
* Set exposed attributes of the interface enumerable
  and configurable, as required by the spec.
  See: https://heycam.github.io/webidl/#es-attributes
* Make sure `URL#searchParams` returns `[[SameObject]]`
* Add the missing `URL#href` setter
* Reorder the properties to match
  https://url.spec.whatwg.org/#api
* Add tests for the ECMAScript property attributes

Fixes: nodejs#10376
126d6e7
@joyeecheung
Contributor

Ping, I've updated the commits.

@jasnell

LGTM with a minor nit

+'use strict';
+
+var common = require('../common.js');
+var URL = require('url').URL;
@jasnell
jasnell Dec 29, 2016 Member

These can be const. The main reason for keeping var in other benchmarks is for compatibility with older versions of Node.js, but since the WHATWG stuff wasn't introduced until v7, it's perfectly fine to use the newer conventions here.

@jasnell jasnell referenced this pull request Dec 29, 2016
Closed

url: performance improvement in URL implementation #10469

2 of 2 tasks complete
@jasnell jasnell added a commit that referenced this pull request Dec 30, 2016
@joyeecheung @jasnell joyeecheung + jasnell benchmark: add benchmark for WHATWG URL properties
PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
b7fadf0
@jasnell jasnell added a commit that referenced this pull request Dec 30, 2016
@joyeecheung @jasnell joyeecheung + jasnell url: make WHATWG URL properties spec compliant
* Set exposed attributes of the interface enumerable
  and configurable, as required by the spec.
  See: https://heycam.github.io/webidl/#es-attributes
* Make sure `URL#searchParams` returns `[[SameObject]]`
* Add the missing `URL#href` setter
* Reorder the properties to match
  https://url.spec.whatwg.org/#api
* Add tests for the ECMAScript property attributes

PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
508d976
@jasnell
Member
jasnell commented Dec 30, 2016

Landed in b7fadf0 and 508d976

@jasnell jasnell closed this Dec 30, 2016
@joyeecheung joyeecheung deleted the joyeecheung:whatwg-url-property-descriptors branch Jan 2, 2017
@joyeecheung joyeecheung added a commit to joyeecheung/node that referenced this pull request Jan 2, 2017
@joyeecheung joyeecheung benchmark: add benchmark for WHATWG URL properties
PR-URL: nodejs#10408
Fixes: nodejs#10376
Reviewed-By: James M Snell <jasnell@gmail.com>
33e8b24
@joyeecheung joyeecheung added a commit to joyeecheung/node that referenced this pull request Jan 2, 2017
@joyeecheung joyeecheung url: make WHATWG URL properties spec compliant
* Set exposed attributes of the interface enumerable
  and configurable, as required by the spec.
  See: https://heycam.github.io/webidl/#es-attributes
* Make sure `URL#searchParams` returns `[[SameObject]]`
* Add the missing `URL#href` setter
* Reorder the properties to match
  https://url.spec.whatwg.org/#api
* Add tests for the ECMAScript property attributes

PR-URL: nodejs#10408
Fixes: nodejs#10376
Reviewed-By: James M Snell <jasnell@gmail.com>
204e0be
@evanlucas evanlucas added a commit that referenced this pull request Jan 3, 2017
@joyeecheung @evanlucas joyeecheung + evanlucas benchmark: add benchmark for WHATWG URL properties
PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
789d082
@evanlucas evanlucas added a commit that referenced this pull request Jan 3, 2017
@joyeecheung @evanlucas joyeecheung + evanlucas url: make WHATWG URL properties spec compliant
* Set exposed attributes of the interface enumerable
  and configurable, as required by the spec.
  See: https://heycam.github.io/webidl/#es-attributes
* Make sure `URL#searchParams` returns `[[SameObject]]`
* Add the missing `URL#href` setter
* Reorder the properties to match
  https://url.spec.whatwg.org/#api
* Add tests for the ECMAScript property attributes

PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
780702b
@evanlucas evanlucas referenced this pull request Jan 3, 2017
Merged

v7.4.0 release proposal #10589

@evanlucas evanlucas added a commit that referenced this pull request Jan 3, 2017
@joyeecheung @evanlucas joyeecheung + evanlucas benchmark: add benchmark for WHATWG URL properties
PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
fe813ec
@evanlucas evanlucas added a commit that referenced this pull request Jan 3, 2017
@joyeecheung @evanlucas joyeecheung + evanlucas url: make WHATWG URL properties spec compliant
* Set exposed attributes of the interface enumerable
  and configurable, as required by the spec.
  See: https://heycam.github.io/webidl/#es-attributes
* Make sure `URL#searchParams` returns `[[SameObject]]`
* Add the missing `URL#href` setter
* Reorder the properties to match
  https://url.spec.whatwg.org/#api
* Add tests for the ECMAScript property attributes

PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
eb4f8b7
@evanlucas evanlucas added a commit that referenced this pull request Jan 4, 2017
@joyeecheung @evanlucas joyeecheung + evanlucas benchmark: add benchmark for WHATWG URL properties
PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
0b2bc5e
@evanlucas evanlucas added a commit that referenced this pull request Jan 4, 2017
@joyeecheung @evanlucas joyeecheung + evanlucas url: make WHATWG URL properties spec compliant
* Set exposed attributes of the interface enumerable
  and configurable, as required by the spec.
  See: https://heycam.github.io/webidl/#es-attributes
* Make sure `URL#searchParams` returns `[[SameObject]]`
* Add the missing `URL#href` setter
* Reorder the properties to match
  https://url.spec.whatwg.org/#api
* Add tests for the ECMAScript property attributes

PR-URL: #10408
Fixes: #10376
Reviewed-By: James M Snell <jasnell@gmail.com>
7fbd12f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment