New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default to WHATWG URL parser in http.request (and friends) #19468

Closed
TimothyGu opened this Issue Mar 20, 2018 · 3 comments

Comments

Projects
None yet
3 participants
@TimothyGu
Copy link
Member

TimothyGu commented Mar 20, 2018

  • Version: master
  • Platform: all
  • Subsystem: http

Currently, http.request('http://brave.com%60x.code-fu.org/') requests http://brave.com/%60x.code-fu.org/ rather than http://brave.com`x.code-fu.org/. This behavior deviates from the behavior standardized in WHATWG URL Standard and used in browsers, and have caused dangerous security implications for downstream embedders (see talk by @diracdeltas).

This is due to the http.request function using the legacy url.parse function rather than the new WHATWG-compliant URL parser. We should switch the URL parser used for string-typed argument to the standard-complaint parser.

This switch will surely have compatibility implications, but I doubt it will cause major breakage since most request-style libraries pass in an object instead of a string as the first argument anyway.

/cc @annevk @BrendanEich (See original tweet.)

@jasnell

This comment has been minimized.

Copy link
Member

jasnell commented Mar 20, 2018

+1 on this. This was my intent originally when introducing the WHATWG URL parser but I wanted to give the new implementation time to be stable before introducing the change.

There are some hidden gotchas in here. The net code is currently written to assume url.parse() is used and is currently set up at multiple levels to consume the parsed object it provides as opposed to the actual URL class produced by the new parser.

@TimothyGu

This comment has been minimized.

Copy link
Member

TimothyGu commented Mar 20, 2018

@jasnell I was thinking we could just convert the created URL object to the internally used Url format right after parsing, using the conversion function already available to us. That wouldn’t cause as much churn.

@Hackzzila

This comment has been minimized.

Copy link
Contributor

Hackzzila commented Apr 25, 2018

I have implemented this in #20270. I think it would be good improvement to v11. By that time the WHATWG URL parser will have had about 18 months to stabilize.

@BridgeAR BridgeAR closed this in 564048d Apr 29, 2018

isurusiri added a commit to isurusiri/node that referenced this issue Apr 30, 2018

http,https,tls: switch to WHATWG URL parser
This switches the url parser from `url.parse()` to the WHATWG URL
parser while keeping `url.parse()` as fallback.

Also add tests for invalid url deprecations and correct hostname
checks.

PR-URL: nodejs#20270
Fixes: nodejs#19468
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment