serialize() should escape '<' and '>' in attributes, according to the HTML Standard #696

@pizzacat83

Description

The current implementation of html5ever::serialize::serialize does not escape the characters < and > within attribute values.

This matched the HTML Standard until recently, when the standard was changed to require escaping < and > in attribute values in order to mitigate a class of XSS attacks.

As a result, the output of html5ever::serialize::serialize does not conform to the latest HTML Standard.

Expected behavior

<span id="<>"></span> should be serialized as <span id="&lt;&gt;"></span>.

The following test (for rcdom/tests/html-serializer.rs) demonstrates this.

test!(
    attribute_ltgt,
    r#"<span id="<>"></span>"#,
    r#"<span id="&lt;&gt;"></span>"#
);

However, the above test fails in the current implementation, and the actual serialized output is <span id="<>"></span>.

Note that Firefox (release note) and Chrome (release note) adhere to this expected behavior.

Proposed fix

Applying the patch below makes the implementation conform to the updated HTML Standard, and the test above passes.

diff --git a/html5ever/src/serialize/mod.rs b/html5ever/src/serialize/mod.rs
index 710066a..eddca78 100644
--- a/html5ever/src/serialize/mod.rs
+++ b/html5ever/src/serialize/mod.rs
@@ -107,8 +107,8 @@ impl<Wr: Write> HtmlSerializer<Wr> {
                 '&' => self.writer.write_all(b"&amp;"),
                 '\u{00A0}' => self.writer.write_all(b"&nbsp;"),
                 '"' if attr_mode => self.writer.write_all(b"&quot;"),
-                '<' if !attr_mode => self.writer.write_all(b"&lt;"),
-                '>' if !attr_mode => self.writer.write_all(b"&gt;"),
+                '<' => self.writer.write_all(b"&lt;"),
+                '>' => self.writer.write_all(b"&gt;"),
                 c => self.writer.write_fmt(format_args!("{c}")),
             }?;
         }
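
For readers who want to try the patched rule outside the crate, here is a minimal standalone sketch of the same match arms. The function name escape_html and its String-returning signature are assumptions made for illustration; they are not part of html5ever's API, which writes to a Write sink instead.

```rust
/// Hypothetical helper (not html5ever's API): escapes `text` using the same
/// rules as the patched match arms above.
/// `attr_mode` is true when the text is an attribute value.
fn escape_html(text: &str, attr_mode: bool) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '\u{00A0}' => out.push_str("&nbsp;"),
            // Double quotes only need escaping inside attribute values.
            '"' if attr_mode => out.push_str("&quot;"),
            // After the patch, '<' and '>' are escaped in both modes.
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            c => out.push(c),
        }
    }
    out
}
```

With this sketch, escape_html("<>", true) yields &lt;&gt;, which is the attribute value expected by the test above. Text outside attributes is escaped as before, so only attribute output changes.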

I am willing to submit a PR with the fix and the test case. Thank you!
