Attribute value serialization does not take whitespace normalization into account #59

Open
bwrrp opened this issue Jan 7, 2020 · 1 comment

bwrrp commented Jan 7, 2020

The steps for "to serialize an attribute value" only escape the characters ", &, < and > in the attribute value. White space characters are passed through to the serialization as-is. However, XML processors will replace each space, tab, carriage return or line feed character with a space according to https://www.w3.org/TR/xml11/#AVNormalize unless the character was present as a character reference. It seems therefore that the attribute value serialization algorithm should include a step mapping tab to &#9;, carriage return to &#xD; and line feed to &#xA;.
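The proposed extra step can be sketched as follows. This is a minimal illustration rather than specification text: `serializeAttributeValue` and `ATTRIBUTE_ESCAPES` are hypothetical names, and the replacements for tab, carriage return and line feed simply follow the mapping suggested above.

```javascript
// Hypothetical helper: sketches the attribute value serialization steps
// with the proposed white space mapping added. Not taken from the spec text.
const ATTRIBUTE_ESCAPES = {
  '&': '&amp;',
  '"': '&quot;',
  '<': '&lt;',
  '>': '&gt;',
  // Proposed additions so white space survives attribute value normalization:
  '\t': '&#9;',
  '\r': '&#xD;',
  '\n': '&#xA;',
};

function serializeAttributeValue(value) {
  // A single pass avoids re-escaping the "&" introduced by other replacements.
  return value.replace(/[&"<>\t\r\n]/g, (ch) => ATTRIBUTE_ESCAPES[ch]);
}

// Space is left alone; tab, CR and LF become character references.
console.log(JSON.stringify(serializeAttributeValue(' \t\r\n'))); // " &#9;&#xD;&#xA;"
```

Whether the references are written in decimal or hexadecimal should not matter to a conforming parser, since both forms decode to the same character.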

Testing this in various browsers shows that most of them already apply a similar substitution:

new XMLSerializer().serializeToString(
    new DOMParser().parseFromString('<root attr="&#x20;&#x9;&#xD;&#xA;"/>', 'text/xml')
)
// <root attr=" &#9;&#xD;&#xA;"/> in Firefox
// <root attr=" &#9;&#13;&#10;"/> in Edge / Chrome

The algorithm as described in this specification would generate <root attr=" \t\r\n"/> (where \t, \r and \n represent tab, carriage return and line feed respectively). Only Safari seems to follow the specification here. Unfortunately, this serialization does not survive a round trip: processors such as the DOMParser normalize the white space to plain spaces (three here, because the carriage return and line feed pair is first collapsed into a single line feed):

new XMLSerializer().serializeToString(
    new DOMParser().parseFromString('<root attr=" \t\r\n"/>', 'text/xml')
)
// <root attr="   "/>
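The loss can also be reproduced outside a browser with a simplified model of what an XML processor does to an attribute value. The sketch below rests on stated assumptions: `normalizeAttributeValue` is a hypothetical function that covers only end-of-line handling, literal white space and numeric character references. It ignores named entity references and DTD-declared attribute types.

```javascript
// Hypothetical, simplified model of XML attribute value processing.
// Handles only line-end normalization, literal white space and numeric
// character references; named entities and non-CDATA types are ignored.
function normalizeAttributeValue(raw) {
  // End-of-line handling: CR LF pairs and lone CRs become a single LF.
  const lineNormalized = raw.replace(/\r\n?/g, '\n');
  // One pass: character references are decoded to the character they name,
  // while literal tabs and line feeds are replaced by a space.
  return lineNormalized.replace(
    /&#x([0-9a-fA-F]+);|&#([0-9]+);|[\t\n]/g,
    (match, hex, dec) => {
      if (hex !== undefined) return String.fromCodePoint(parseInt(hex, 16));
      if (dec !== undefined) return String.fromCodePoint(parseInt(dec, 10));
      return ' ';
    }
  );
}

// Literal white space is flattened to spaces...
console.log(JSON.stringify(normalizeAttributeValue(' \t\r\n'))); // "   "
// ...while the same characters written as references are preserved.
console.log(JSON.stringify(normalizeAttributeValue(' &#9;&#xD;&#xA;'))); // " \t\r\n"
```

Under this model the Firefox form (`&#9;&#xD;&#xA;`) and the Edge / Chrome form (`&#9;&#13;&#10;`) both decode back to the original tab, carriage return and line feed.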

cscott commented Jul 2, 2021

wmfgerrit pushed a commit to wikimedia/mediawiki-libs-Dodo that referenced this issue Jul 3, 2021
webkit-commit-queue pushed a commit to WebKit/WebKit that referenced this issue Jul 11, 2021
https://bugs.webkit.org/show_bug.cgi?id=227844

Reviewed by Darin Adler.

LayoutTests/imported/w3c:

Rebaseline WPT test now that one more subtest is passing.

* web-platform-tests/domparsing/XMLSerializer-serializeToString-expected.txt:

Source/WebCore:

XMLSerializer.serializeToString() doesn't properly escape \r, \n and \t.

This is causing the "check XMLSerializer.serializeToString escapes attribute values for roundtripping" subtest to fail in WebKit on:
http://wpt.live/domparsing/XMLSerializer-serializeToString.html

Chrome and Firefox both escape these and pass this WPT subtest.

The specification does not indicate we should escape those:
- https://w3c.github.io/DOM-Parsing/#dfn-serializing-an-attribute-value
But there is an open bug about this:
- w3c/DOM-Parsing#59

No new tests, rebaselined existing test.

* editing/MarkupAccumulator.cpp:
(WebCore::elementCannotHaveEndTag):
* editing/MarkupAccumulator.h:


Canonical link: https://commits.webkit.org/239576@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@279815 268f45cc-cd09-0410-ab3c-d52691b4dbfc
bertogg pushed a commit to Igalia/webkit that referenced this issue Jul 12, 2021