Skip to content

Commit

Permalink
URL pathname and search setter incorrectly strips trailing spaces
Browse files Browse the repository at this point in the history
https://bugs.webkit.org/show_bug.cgi?id=259080
rdar://112433299

Reviewed by Alex Christensen.

Potentially (maybe) trim opaque paths after a query or fragment has
been removed. This ensures that re-parsing roundtrips, even when URL
component setters are used.

Additionally, preserve trailing C0 for the path and query setters using
a new shared function.

* LayoutTests/imported/w3c/web-platform-tests/url/resources/setters_tests.json:
* LayoutTests/imported/w3c/web-platform-tests/url/url-setters-a-area.window-expected.txt:
* LayoutTests/imported/w3c/web-platform-tests/url/url-setters.any.worker_exclude=(file_javascript_mailto)-expected.txt:
* LayoutTests/imported/w3c/web-platform-tests/url/url-setters.any_exclude=(file_javascript_mailto)-expected.txt:

This is now synchronized up to the latest commit for the url directory:
01b31f8.

* Source/WTF/wtf/URL.cpp:
(WTF::URL::parseAllowingC0AtEnd):
(WTF::URL::setFragmentIdentifier):
(WTF::URL::potentiallyStripTrailingSpacesFromOpaquePath):
(WTF::URL::removeFragmentIdentifier):
(WTF::URL::removeQueryAndFragmentIdentifier):
(WTF::URL::setQuery):
(WTF::URL::setPath):
* Source/WTF/wtf/URL.h:
* Source/WTF/wtf/URLParser.cpp:
(WTF::URLParser::parse):
* Source/WTF/wtf/URLParser.h:

Canonical link: https://commits.webkit.org/266252@main
  • Loading branch information
annevk committed Jul 24, 2023
1 parent 4d88a0a commit cc626f3
Show file tree
Hide file tree
Showing 8 changed files with 114 additions and 13 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2021,6 +2021,24 @@
"href": "sc:/space%20",
"pathname": "/space%20"
}
},
{
"comment": "Trailing space should be encoded",
"href": "http://example.net",
"new_value": " ",
"expected": {
"href": "http://example.net/%20",
"pathname": "/%20"
}
},
{
"comment": "Trailing C0 control should be encoded",
"href": "http://example.net",
"new_value": "\u0000",
"expected": {
"href": "http://example.net/%00",
"pathname": "/%00"
}
}
],
"search": [
Expand Down Expand Up @@ -2141,6 +2159,24 @@
"href": "sc:space #fragment",
"search": ""
}
},
{
"comment": "Trailing space should be encoded",
"href": "http://example.net",
"new_value": " ",
"expected": {
"href": "http://example.net/?%20",
"search": "?%20"
}
},
{
"comment": "Trailing C0 control should be encoded",
"href": "http://example.net",
"new_value": "\u0000",
"expected": {
"href": "http://example.net/?%00",
"search": "?%00"
}
}
],
"hash": [
Expand Down Expand Up @@ -2311,6 +2347,24 @@
"href": "sc:space ?query",
"hash": ""
}
},
{
"comment": "Trailing space should be encoded",
"href": "http://example.net",
"new_value": " ",
"expected": {
"href": "http://example.net/#%20",
"hash": "#%20"
}
},
{
"comment": "Trailing C0 control should be encoded",
"href": "http://example.net",
"new_value": "\u0000",
"expected": {
"href": "http://example.net/#%00",
"hash": "#%00"
}
}
],
"href": [
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -434,10 +434,14 @@ PASS <a>: Setting <non-spec:/>.pathname = '//p'
PASS <area>: Setting <non-spec:/>.pathname = '//p'
PASS <a>: Setting <non-spec:/.//>.pathname = 'p' Drop /. from path
PASS <area>: Setting <non-spec:/.//>.pathname = 'p' Drop /. from path
FAIL <a>: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020 assert_equals: expected "data:/space%20" but got "data:/space"
FAIL <area>: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020 assert_equals: expected "data:/space%20" but got "data:/space"
FAIL <a>: Setting <sc:/nospace>.pathname = 'space ' assert_equals: expected "sc:/space%20" but got "sc:/space"
FAIL <area>: Setting <sc:/nospace>.pathname = 'space ' assert_equals: expected "sc:/space%20" but got "sc:/space"
PASS <a>: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020
PASS <area>: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020
PASS <a>: Setting <sc:/nospace>.pathname = 'space '
PASS <area>: Setting <sc:/nospace>.pathname = 'space '
PASS <a>: Setting <http://example.net>.pathname = ' ' Trailing space should be encoded
PASS <area>: Setting <http://example.net>.pathname = ' ' Trailing space should be encoded
PASS <a>: Setting <http://example.net>.pathname = '\0' Trailing C0 control should be encoded
PASS <area>: Setting <http://example.net>.pathname = '\0' Trailing C0 control should be encoded
PASS <a>: Setting <https://example.net#nav>.search = 'lang=fr'
PASS <area>: Setting <https://example.net#nav>.search = 'lang=fr'
PASS <a>: Setting <https://example.net?lang=en-US#nav>.search = 'lang=fr'
Expand Down Expand Up @@ -468,6 +472,10 @@ PASS <a>: Setting <data:space ?query#fragment>.search = '' Do not drop trailing
PASS <area>: Setting <data:space ?query#fragment>.search = '' Do not drop trailing spaces from non-trailing opaque paths
PASS <a>: Setting <sc:space ?query#fragment>.search = ''
PASS <area>: Setting <sc:space ?query#fragment>.search = ''
PASS <a>: Setting <http://example.net>.search = ' ' Trailing space should be encoded
PASS <area>: Setting <http://example.net>.search = ' ' Trailing space should be encoded
PASS <a>: Setting <http://example.net>.search = '\0' Trailing C0 control should be encoded
PASS <area>: Setting <http://example.net>.search = '\0' Trailing C0 control should be encoded
PASS <a>: Setting <https://example.net>.hash = 'main'
PASS <area>: Setting <https://example.net>.hash = 'main'
PASS <a>: Setting <https://example.net#nav>.hash = 'main'
Expand Down Expand Up @@ -510,6 +518,10 @@ PASS <a>: Setting <data:space ?query#fragment>.hash = '' Do not drop trailing s
PASS <area>: Setting <data:space ?query#fragment>.hash = '' Do not drop trailing spaces from non-trailing opaque paths
PASS <a>: Setting <sc:space ?query#fragment>.hash = ''
PASS <area>: Setting <sc:space ?query#fragment>.hash = ''
PASS <a>: Setting <http://example.net>.hash = ' ' Trailing space should be encoded
PASS <area>: Setting <http://example.net>.hash = ' ' Trailing space should be encoded
PASS <a>: Setting <http://example.net>.hash = '\0' Trailing C0 control should be encoded
PASS <area>: Setting <http://example.net>.hash = '\0' Trailing C0 control should be encoded
PASS <a>: Setting <file:///var/log/system.log>.href = 'http://0300.168.0xF0'
PASS <area>: Setting <file:///var/log/system.log>.href = 'http://0300.168.0xF0'

Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,10 @@ PASS URL: Setting <non-spec:/>.pathname = '/.//p' Serialize /. in path
PASS URL: Setting <non-spec:/>.pathname = '/..//p'
PASS URL: Setting <non-spec:/>.pathname = '//p'
PASS URL: Setting <non-spec:/.//>.pathname = 'p' Drop /. from path
FAIL URL: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020 assert_equals: expected "data:/space%20" but got "data:/space"
FAIL URL: Setting <sc:/nospace>.pathname = 'space ' assert_equals: expected "sc:/space%20" but got "sc:/space"
PASS URL: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020
PASS URL: Setting <sc:/nospace>.pathname = 'space '
PASS URL: Setting <http://example.net>.pathname = ' ' Trailing space should be encoded
PASS URL: Setting <http://example.net>.pathname = '\0' Trailing C0 control should be encoded
PASS URL: Setting <https://example.net#nav>.search = 'lang=fr'
PASS URL: Setting <https://example.net?lang=en-US#nav>.search = 'lang=fr'
PASS URL: Setting <https://example.net?lang=en-US#nav>.search = '?lang=fr'
Expand All @@ -203,6 +205,8 @@ PASS URL: Setting <data:space ?query>.search = '' Drop trailing spaces from trai
PASS URL: Setting <sc:space ?query>.search = ''
PASS URL: Setting <data:space ?query#fragment>.search = '' Do not drop trailing spaces from non-trailing opaque paths
PASS URL: Setting <sc:space ?query#fragment>.search = ''
PASS URL: Setting <http://example.net>.search = ' ' Trailing space should be encoded
PASS URL: Setting <http://example.net>.search = '\0' Trailing C0 control should be encoded
PASS URL: Setting <https://example.net>.hash = 'main'
PASS URL: Setting <https://example.net#nav>.hash = 'main'
PASS URL: Setting <https://example.net?lang=en-US>.hash = '##nav'
Expand All @@ -223,4 +227,6 @@ PASS URL: Setting <data:space
PASS URL: Setting <sc:space #fragment>.hash = ''
PASS URL: Setting <data:space ?query#fragment>.hash = '' Do not drop trailing spaces from non-trailing opaque paths
PASS URL: Setting <sc:space ?query#fragment>.hash = ''
PASS URL: Setting <http://example.net>.hash = ' ' Trailing space should be encoded
PASS URL: Setting <http://example.net>.hash = '\0' Trailing C0 control should be encoded

Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,10 @@ PASS URL: Setting <non-spec:/>.pathname = '/.//p' Serialize /. in path
PASS URL: Setting <non-spec:/>.pathname = '/..//p'
PASS URL: Setting <non-spec:/>.pathname = '//p'
PASS URL: Setting <non-spec:/.//>.pathname = 'p' Drop /. from path
FAIL URL: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020 assert_equals: expected "data:/space%20" but got "data:/space"
FAIL URL: Setting <sc:/nospace>.pathname = 'space ' assert_equals: expected "sc:/space%20" but got "sc:/space"
PASS URL: Setting <data:/nospace>.pathname = 'space ' Non-special URLs with non-opaque paths percent-encode U+0020
PASS URL: Setting <sc:/nospace>.pathname = 'space '
PASS URL: Setting <http://example.net>.pathname = ' ' Trailing space should be encoded
PASS URL: Setting <http://example.net>.pathname = '\0' Trailing C0 control should be encoded
PASS URL: Setting <https://example.net#nav>.search = 'lang=fr'
PASS URL: Setting <https://example.net?lang=en-US#nav>.search = 'lang=fr'
PASS URL: Setting <https://example.net?lang=en-US#nav>.search = '?lang=fr'
Expand All @@ -203,6 +205,8 @@ PASS URL: Setting <data:space ?query>.search = '' Drop trailing spaces from trai
PASS URL: Setting <sc:space ?query>.search = ''
PASS URL: Setting <data:space ?query#fragment>.search = '' Do not drop trailing spaces from non-trailing opaque paths
PASS URL: Setting <sc:space ?query#fragment>.search = ''
PASS URL: Setting <http://example.net>.search = ' ' Trailing space should be encoded
PASS URL: Setting <http://example.net>.search = '\0' Trailing C0 control should be encoded
PASS URL: Setting <https://example.net>.hash = 'main'
PASS URL: Setting <https://example.net#nav>.hash = 'main'
PASS URL: Setting <https://example.net?lang=en-US>.hash = '##nav'
Expand All @@ -223,4 +227,6 @@ PASS URL: Setting <data:space
PASS URL: Setting <sc:space #fragment>.hash = ''
PASS URL: Setting <data:space ?query#fragment>.hash = '' Do not drop trailing spaces from non-trailing opaque paths
PASS URL: Setting <sc:space ?query#fragment>.hash = ''
PASS URL: Setting <http://example.net>.hash = ' ' Trailing space should be encoded
PASS URL: Setting <http://example.net>.hash = '\0' Trailing C0 control should be encoded

26 changes: 23 additions & 3 deletions Source/WTF/wtf/URL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -585,6 +585,11 @@ void URL::parse(String&& string)
*this = URLParser(WTFMove(string)).result();
}

void URL::parseAllowingC0AtEnd(String&& string)
{
*this = URLParser(WTFMove(string), { }, URLTextEncodingSentinelAllowingC0AtEnd).result();
}

void URL::remove(unsigned start, unsigned length)
{
if (!length)
Expand Down Expand Up @@ -653,7 +658,15 @@ void URL::setFragmentIdentifier(StringView identifier)
if (!m_isValid)
return;

*this = URLParser(makeString(StringView(m_string).left(m_queryEnd), '#', identifier), { }, URLTextEncodingSentinelAllowingC0AtEndOfHash).result();
parseAllowingC0AtEnd(makeString(StringView(m_string).left(m_queryEnd), '#', identifier));
}

void URL::maybeTrimTrailingSpacesFromOpaquePath()
{
if (!m_isValid || !hasOpaquePath() || hasFragmentIdentifier() || hasQuery())
return;

parse(makeString(StringView(m_string).left(m_pathEnd)));
}

void URL::removeFragmentIdentifier()
Expand All @@ -662,6 +675,8 @@ void URL::removeFragmentIdentifier()
return;

m_string = m_string.left(m_queryEnd);

maybeTrimTrailingSpacesFromOpaquePath();
}

void URL::removeQueryAndFragmentIdentifier()
Expand All @@ -671,6 +686,8 @@ void URL::removeQueryAndFragmentIdentifier()

m_string = m_string.left(m_pathEnd);
m_queryEnd = m_pathEnd;

maybeTrimTrailingSpacesFromOpaquePath();
}

void URL::setQuery(StringView newQuery)
Expand All @@ -681,12 +698,15 @@ void URL::setQuery(StringView newQuery)
if (!m_isValid)
return;

parse(makeString(
parseAllowingC0AtEnd(makeString(
StringView(m_string).left(m_pathEnd),
(!newQuery.startsWith('?') && !newQuery.isNull()) ? "?"_s : ""_s,
newQuery,
StringView(m_string).substring(m_queryEnd)
));

if (newQuery.isNull())
maybeTrimTrailingSpacesFromOpaquePath();
}

static String escapePathWithoutCopying(StringView path)
Expand All @@ -702,7 +722,7 @@ void URL::setPath(StringView path)
if (!m_isValid)
return;

parse(makeString(
parseAllowingC0AtEnd(makeString(
StringView(m_string).left(pathStart()),
path.startsWith('/') || (path.startsWith('\\') && (hasSpecialScheme() || protocolIsFile())) || (!hasSpecialScheme() && path.isEmpty() && m_schemeEnd + 1U < pathStart()) ? ""_s : "/"_s,
!hasSpecialScheme() && host().isEmpty() && path.startsWith("//"_s) && path.length() > 2 ? "/."_s : ""_s,
Expand Down
3 changes: 3 additions & 0 deletions Source/WTF/wtf/URL.h
Original file line number Diff line number Diff line change
Expand Up @@ -212,6 +212,9 @@ class URL {
unsigned credentialsEnd() const;
void remove(unsigned start, unsigned length);
void parse(String&&);
void parseAllowingC0AtEnd(String&&);

void maybeTrimTrailingSpacesFromOpaquePath();

friend WTF_EXPORT_PRIVATE bool protocolHostAndPortAreEqual(const URL&, const URL&);

Expand Down
2 changes: 1 addition & 1 deletion Source/WTF/wtf/URLParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1126,7 +1126,7 @@ void URLParser::parse(const CharacterType* input, const unsigned length, const U
Vector<UChar> queryBuffer;

unsigned endIndex = length;
if (UNLIKELY(nonUTF8QueryEncoding == URLTextEncodingSentinelAllowingC0AtEndOfHash))
if (UNLIKELY(nonUTF8QueryEncoding == URLTextEncodingSentinelAllowingC0AtEnd))
nonUTF8QueryEncoding = nullptr;
else {
while (UNLIKELY(endIndex && isC0ControlOrSpace(input[endIndex - 1]))) {
Expand Down
2 changes: 1 addition & 1 deletion Source/WTF/wtf/URLParser.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ class URLParser {
// For host names bigger than this, we won't do IDN encoding, which is almost certainly OK.
constexpr static size_t hostnameBufferLength = 2048;

#define URLTextEncodingSentinelAllowingC0AtEndOfHash reinterpret_cast<const URLTextEncoding*>(-1)
#define URLTextEncodingSentinelAllowingC0AtEnd reinterpret_cast<const URLTextEncoding*>(-1)

WTF_EXPORT_PRIVATE static bool allValuesEqual(const URL&, const URL&);
WTF_EXPORT_PRIVATE static bool internalValuesConsistent(const URL&);
Expand Down

0 comments on commit cc626f3

Please sign in to comment.