REGRESSION (STP): Video won't play on bilibili.com
https://bugs.webkit.org/show_bug.cgi?id=263196
rdar://117020123

Reviewed by Chris Dumez.

bilibili injects DOM objects in the tree by calling:
```
var t = (new DOMParser).parseFromString(e, "text/html");
return document.adoptNode(t.body.firstChild)
```

Per spec, when parsing HTML to create a document through DOMParser::parseFromString() [1],
whitespace tokens encountered in the "initial" insertion mode are ignored [2].
Trailing whitespace, however, must be kept [3].
This behaviour regressed in 267202@main.

On this site, the observable behaviour was that the firstChild became a "#text"
node rather than the expected DIV element.

So we trim the leading whitespace before calling the fast parser, which
restores the behaviour prior to 267202@main.
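
As an illustration only, the leading-whitespace trim can be sketched outside WebKit with std::string_view. The names `trimLeadingAsciiWhitespace`, `kAsciiWhitespace` and `runTrimExamples` are hypothetical and are not part of this patch, which uses WTF's StringView and isNotASCIIWhitespace instead. The sketch assumes the HTML definition of ASCII whitespace (TAB, LF, FF, CR, SPACE) and leaves trailing whitespace untouched:
```cpp
#include <cassert>
#include <string_view>

// HTML "ASCII whitespace": TAB, LF, FF, CR and SPACE.
inline constexpr std::string_view kAsciiWhitespace = "\t\n\f\r ";

// Hypothetical helper (not WebKit API): returns a view of `source` with
// leading ASCII whitespace removed. Trailing whitespace is preserved.
inline std::string_view trimLeadingAsciiWhitespace(std::string_view source)
{
    const auto first = source.find_first_not_of(kAsciiWhitespace);
    // An all-whitespace (or empty) input yields an empty view.
    if (first == std::string_view::npos)
        return {};
    return source.substr(first);
}

// Small self-check mirroring the strings used in the layout test.
inline void runTrimExamples()
{
    assert(trimLeadingAsciiWhitespace("\n <div></div> \n") == "<div></div> \n");
    assert(trimLeadingAsciiWhitespace("\n \n").empty());
}
```
Returning a view rather than a copy avoids allocating a new string, which is presumably also why the fast-path entry point now takes a StringView.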

[1] https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#parse-html-from-a-string
[2] https://html.spec.whatwg.org/multipage/parsing.html#the-initial-insertion-mode
[3] https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-afterbody

* LayoutTests/fast/dom/document-contentType-DOMParser-expected.txt:
* LayoutTests/fast/dom/document-contentType-DOMParser.html: Added tests.
* Source/WebCore/html/parser/HTMLDocumentParserFastPath.cpp:
(WebCore::tryFastParsingHTMLFragment):
* Source/WebCore/html/parser/HTMLDocumentParserFastPath.h: Change type to StringView
(WebCore::requires):
* Source/WebCore/xml/DOMParser.cpp:
(WebCore::DOMParser::parseFromString):

Canonical link: https://commits.webkit.org/269457@main
jyavenard committed Oct 18, 2023
1 parent ab95121 commit 55a2864
Showing 5 changed files with 26 additions and 4 deletions.
@@ -6,6 +6,13 @@ PASS new DOMParser().parseFromString(xslContent, "text/xsl").contentType threw e
PASS new DOMParser().parseFromString(xmlContent, "text/dummy+xml").contentType threw exception TypeError: Type error.
PASS new DOMParser().parseFromString(xmlContent, "text/XML").contentType threw exception TypeError: Type error.
PASS new DOMParser().parseFromString(htmlContent, "TEXT/html").contentType threw exception TypeError: Type error.
PASS parsedContent.body.firstChild.nodeName is "DIV"
PASS parsedContent.body.childNodes.length is 2
PASS parsedContent.body.childNodes[1].nodeName is "#text"
PASS div.firstChild.nodeName is "#text"
PASS div.childNodes.length is 3
PASS div.childNodes[2].nodeName is "#text"
PASS new DOMParser().parseFromString(htmlContentWithJustSpaces, "text/html").body.childNodes.length is 0
PASS successfullyParsed is true

TEST COMPLETE
17 changes: 16 additions & 1 deletion LayoutTests/fast/dom/document-contentType-DOMParser.html
@@ -6,7 +6,7 @@
</head>
<body>
<script>

var htmlContent =
"<html>" +
"<head>" +
@@ -80,6 +80,21 @@
shouldThrow('new DOMParser().parseFromString(xmlContent, "text/XML").contentType', "'TypeError: Type error'");
shouldThrow('new DOMParser().parseFromString(htmlContent, "TEXT/html").contentType', "'TypeError: Type error'");

var htmlContentWithSpace = "\n <div class=\"bpx-player-container\"></div> \n";
var parsedContent = new DOMParser().parseFromString(htmlContentWithSpace, "text/html");
shouldBeEqualToString('parsedContent.body.firstChild.nodeName', 'DIV');
shouldBe('parsedContent.body.childNodes.length', '2');
shouldBeEqualToString('parsedContent.body.childNodes[1].nodeName', '#text');

var div = document.createElement("div");
div.innerHTML = htmlContentWithSpace;
shouldBeEqualToString('div.firstChild.nodeName', '#text');
shouldBe('div.childNodes.length', '3');
shouldBeEqualToString('div.childNodes[2].nodeName', '#text');

var htmlContentWithJustSpaces = "\n \n";
shouldBe('new DOMParser().parseFromString(htmlContentWithJustSpaces, "text/html").body.childNodes.length', '0');

</script>
<script src="../../resources/js-test-post.js"></script>
</body>
2 changes: 1 addition & 1 deletion Source/WebCore/html/parser/HTMLDocumentParserFastPath.cpp
@@ -895,7 +895,7 @@ static bool tryFastParsingHTMLFragmentImpl(const std::span<const CharacterType>&
     return parser.parse(contextElement);
 }

-bool tryFastParsingHTMLFragment(const String& source, Document& document, ContainerNode& destinationParent, Element& contextElement, OptionSet<ParserContentPolicy> policy)
+bool tryFastParsingHTMLFragment(StringView source, Document& document, ContainerNode& destinationParent, Element& contextElement, OptionSet<ParserContentPolicy> policy)
 {
     if (!canUseFastPath(contextElement, policy))
         return false;
2 changes: 1 addition & 1 deletion Source/WebCore/html/parser/HTMLDocumentParserFastPath.h
@@ -41,7 +41,7 @@ class Element;

 enum class ParserContentPolicy : uint8_t;

-WEBCORE_EXPORT bool tryFastParsingHTMLFragment(const String& source, Document&, ContainerNode&, Element& contextElement, OptionSet<ParserContentPolicy>);
+WEBCORE_EXPORT bool tryFastParsingHTMLFragment(StringView source, Document&, ContainerNode&, Element& contextElement, OptionSet<ParserContentPolicy>);

} // namespace WebCore

2 changes: 1 addition & 1 deletion Source/WebCore/xml/DOMParser.cpp
@@ -60,7 +60,7 @@ ExceptionOr<Ref<Document>> DOMParser::parseFromString(const String& string, cons
     bool usedFastPath = false;
     if (contentType == "text/html"_s) {
         auto body = HTMLBodyElement::create(document);
-        usedFastPath = tryFastParsingHTMLFragment(string, document, body, body, parsingOptions);
+        usedFastPath = tryFastParsingHTMLFragment(StringView { string }.substring(string.find(isNotASCIIWhitespace<UChar>)), document, body, body, parsingOptions);
         if (LIKELY(usedFastPath)) {
             auto html = HTMLHtmlElement::create(document);
             document->appendChild(html);
