Consider always escaping U+0020 #125

Closed
zcorpan opened this issue May 27, 2016 · 7 comments

Comments

@zcorpan
Member

zcorpan commented May 27, 2016

In ad3660f, U+0020 changed from being escaped to not being escaped for schemes like data, javascript, mailto, etc. The rationale seems to be that this is what some browsers did back then, per http://krijnhoetmer.nl/irc-logs/whatwg/20121026#l-573

Test http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4238

It appears that Gecko still escapes space in all of these cases. Are there known Web-compat problems for Gecko because of this?

Round-tripping the space can be problematic because space is essentially the only character that separates a URL from the next URL or from the following text. Features like ping="" and srcset="" will get bogus values if someone assumes that a parse-serialize round trip eliminates spaces from a URL (and it typically does).
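
To illustrate the concern, here is a minimal sketch of a consumer that treats a value as a space-separated list of URLs, the way ping="" does. The helper names and the sample URLs are made up for illustration; this is not any browser's actual code.

```javascript
// Hypothetical helper: build a space-separated URL list, as used by ping="".
function joinUrlList(urls) {
  return urls.join(" ");
}

// Hypothetical helper: split such a list back into individual URLs
// on ASCII whitespace, dropping empty tokens.
function splitUrlList(value) {
  return value.split(/[ \t\n\f\r]+/).filter((token) => token.length > 0);
}

// Sample serialized URLs; the first one kept its raw U+0020.
const unescaped = ["foo:a b", "foo:c"];
// The same URLs if the serializer had escaped the space.
const escaped = ["foo:a%20b", "foo:c"];

const brokenTokens = splitUrlList(joinUrlList(unescaped));
const intactTokens = splitUrlList(joinUrlList(escaped));

console.log(brokenTokens); // three tokens: "foo:a", "b", "foo:c"
console.log(intactTokens); // two tokens: "foo:a%20b", "foo:c"
```

With the raw space, two URLs come back as three tokens, and the second token is not a URL the author ever wrote.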

@annevk
Member

annevk commented May 27, 2016

Can you or @sleevi fix Chromium? Need to have at least two browsers for such a change I think.

@zcorpan
Member Author

zcorpan commented May 27, 2016

cc @tyoshino

@annevk
Member

annevk commented Oct 19, 2016

@achristensen07 any thoughts on this issue?

@achristensen07
Collaborator

alert(new URL("foo:a\u001F\u0020b")); shows that Chrome and Safari, in line with the spec, do not escape spaces. I hope to keep this behavior to minimize change.
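
The alert() call above only works in a browser. A variant that also runs in other environments exposing the URL constructor (Node.js is assumed here) could look like the sketch below. It only reports what the engine at hand does and makes no claim about what any particular engine prints.

```javascript
// Same input as the alert() test above: U+001F followed by U+0020.
const href = new URL("foo:a\u001F\u0020b").href;

// Report how this engine serialized the space.
const spaceEscaped = href.includes("%20");
const spaceKept = href.includes(" ");

// JSON.stringify makes a raw space or control character easy to spot.
console.log(JSON.stringify(href), { spaceEscaped, spaceKept });
```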

@annevk
Member

annevk commented Feb 9, 2017

Okay, so per the URL standard we use the "simple encode set" (which doesn't include U+0020) for hosts (doesn't matter since hosts forbid spaces anyway), cannot-be-a-base-URL paths, and fragments.
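
As a rough sketch of what the encode set choice means in practice, the code below percent-encodes a string under two predicates: one approximating the "simple encode set" (C0 controls and everything above U+007E) and a hypothetical variant that also includes U+0020. The function names are invented for this example, and it is not the standard's algorithm verbatim.

```javascript
// Approximation of the "simple encode set": C0 controls and all bytes
// above 0x7E. U+0020 is not part of it.
function inSimpleEncodeSet(byte) {
  return byte <= 0x1f || byte > 0x7e;
}

// Hypothetical alternative that also escapes U+0020.
function inSimpleEncodeSetPlusSpace(byte) {
  return byte === 0x20 || inSimpleEncodeSet(byte);
}

// Percent-encode the UTF-8 bytes of `input` that the predicate selects.
function percentEncode(input, shouldEncode) {
  let output = "";
  for (const byte of new TextEncoder().encode(input)) {
    output += shouldEncode(byte)
      ? "%" + byte.toString(16).toUpperCase().padStart(2, "0")
      : String.fromCharCode(byte);
  }
  return output;
}

const sample = "a\u001F\u0020b";
console.log(percentEncode(sample, inSimpleEncodeSet));          // a%1F b
console.log(percentEncode(sample, inSimpleEncodeSetPlusSpace)); // a%1F%20b
```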

For both cannot-be-a-base-URL paths and fragments the results are identical: Chrome, Edge, and Safari do not encode U+0020; Firefox does (although Firefox does not for data URLs). I tested various schemes, including https, test, data, and x.

Given this I wonder if @zcorpan has made any progress convincing a non-Firefox browser that this is a problem. Otherwise I think we should consider this a lost cause.

@zcorpan
Member Author

zcorpan commented Feb 10, 2017

I have not. Unless @achristensen07 has a change of heart, let's close. 🙁

@annevk
Member

annevk commented Feb 10, 2017

I tested Firefox unfairly for data URLs. Firefox implements data URLs in its URL parser, so it rejects input such as data:test test and echoes it back unchanged. If you instead give it data:,test test#test test as input, it encodes the spaces as %20.

annevk added a commit to web-platform-tests/wpt that referenced this issue Feb 13, 2017
Also update the description of two tests.

Closes whatwg/url#125.
domenic pushed a commit to web-platform-tests/wpt that referenced this issue Feb 13, 2017
Also update the description of two tests.

Closes whatwg/url#125.