Harden protection against mutation XSS caused by namespace switching #495

securityMB · 2020-12-17T11:18:46Z

This pull request is meant to add a substantial defense mechanism against instances of mutation XSS that happened within the last year as well as against (not yet disclosed) bugs exploiting differences between fragment parsing and document parsing modes.

Background & Context

One year ago I reported a DOMPurify bypass that exploited a parser bug (or, actually, a spec bug) in Chromium and Safari. In a nutshell, the following snippet of HTML:

<svg></p>

was parsed into the following DOM tree:

┗ svg svg
  ┗ html p

(in all DOM trees, I'm using a notation in which every tag name is prefixed with its namespace).

When the same DOM tree is serialized, it has the following form:

<svg><p></p></svg>

When the snippet is parsed again (so we have a roundtrip: DOM tree -> markup -> DOM tree), it is parsed differently:

┣ svg svg
┗ html p

This is defined in the HTML spec: foreign content (opened by either <svg> or <math>) has special parsing rules, and one of those rules is that a specific list of tags closes foreign content and returns the parser to the latest element that is in the HTML namespace, a MathML text integration point, or an HTML integration point (more on that later).
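To make the "specific list of tags" concrete, here is a minimal sketch of that check for start tags. The list is transcribed from the "in foreign content" rules of the HTML Standard and should be verified against the current spec text; the names breaksOutOfForeignContent, FOREIGN_CONTENT_BREAKOUT_TAGS and FONT_BREAKOUT_ATTRIBUTES are hypothetical and are not part of DOMPurify.

```javascript
// Start tags that make the HTML parser leave foreign content (SVG or MathML).
// Assumption: transcribed from the HTML Standard; verify against the spec.
const FOREIGN_CONTENT_BREAKOUT_TAGS = new Set([
  'b', 'big', 'blockquote', 'body', 'br', 'center', 'code', 'dd', 'div', 'dl',
  'dt', 'em', 'embed', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'head', 'hr', 'i',
  'img', 'li', 'listing', 'menu', 'meta', 'nobr', 'ol', 'p', 'pre', 'ruby',
  's', 'small', 'span', 'strong', 'strike', 'sub', 'sup', 'table', 'tt', 'u',
  'ul', 'var',
]);

// <font> only breaks out when it carries one of these attributes.
const FONT_BREAKOUT_ATTRIBUTES = ['color', 'face', 'size'];

// Hypothetical helper: does this start tag close foreign content?
function breaksOutOfForeignContent(tagName, attributeNames = []) {
  const name = tagName.toLowerCase();
  if (name === 'font') {
    return attributeNames.some((attr) =>
      FONT_BREAKOUT_ATTRIBUTES.includes(attr.toLowerCase()));
  }
  return FOREIGN_CONTENT_BREAKOUT_TAGS.has(name);
}

console.log(breaksOutOfForeignContent('p'));     // true
console.log(breaksOutOfForeignContent('style')); // false
```

Note that style is not on the list, so a <style> start tag stays in the SVG namespace as long as nothing before it has already closed the foreign content.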

Because DOMPurify is usually used the following way:

element.innerHTML = DOMPurify.sanitize(unsafeMarkup)

the round-trip of parsing into DOM tree, serializing and parsing again happens in this case.
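The reason the round-trip is lossy can be sketched without a browser: serialization writes element names only, so two trees that differ just in namespaces produce identical markup, and the second parse has to re-derive namespaces from context. The plain-object node shape and the el and serialize helpers below are illustrative assumptions, not DOMPurify or browser code, and the serializer ignores attributes, text nodes and void elements.

```javascript
const HTML_NS = 'http://www.w3.org/1999/xhtml';
const SVG_NS = 'http://www.w3.org/2000/svg';

// Hypothetical plain-object stand-in for a DOM element.
function el(ns, name, children = []) {
  return { ns, name, children };
}

// Simplified serializer: writes local names only, the namespace is never emitted.
function serialize(node) {
  const inner = node.children.map(serialize).join('');
  return `<${node.name}>${inner}</${node.name}>`;
}

// Tree produced by the buggy first parse: an HTML <p> inside <svg>.
const firstParse = el(SVG_NS, 'svg', [el(HTML_NS, 'p')]);
// A tree that differs only in the namespace of <p>.
const allSvg = el(SVG_NS, 'svg', [el(SVG_NS, 'p')]);

console.log(serialize(firstParse));                      // <svg><p></p></svg>
console.log(serialize(firstParse) === serialize(allSvg)); // true
```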

The issue was with the following markup:

<svg></p><style><a title="</style><img src onerror=alert(1)>">

which is parsed into the following DOM tree:

┗ svg svg
  ┣ html p
  ┗ svg style
    ┗ svg a title="</style><img src onerror=alert(1)>"

DOMPurify considers the DOM tree harmless because the XSS payload is fully within title. Then it proceeds to serialize it to:

<svg><p></p><style><a title="</style><img src onerror=alert(1)>"></style></svg>

After reparsing, a completely different DOM tree is created:

┣ svg svg
┣ html p
┣ html style
┃ ┗ #text: <a title="
┣ html img src="" onerror="alert(1)"
┗ #text: ">

Because the parser left foreign content when it encountered <p>, all subsequent tags were also parsed in the HTML namespace, which put a dangerous <img> tag into the DOM tree.

Now let's have a look at this year's DOMPurify bypass via namespace confusion. The payload was:

<form><math><mtext></form><form><mglyph><style></math><img src onerror=alert(1)>

It was parsed into the following DOM tree:

┗ html form
  ┗ math math
    ┗ math mtext
      ┗ html form
        ┗ html mglyph
          ┗ html style
            ┗ #text: </math><img src onerror=alert(1)>

The namespace is switched from math to html on <mtext> because it is one of the MathML text integration points, whose purpose is to allow embedding HTML elements within MathML.

The DOM tree is again harmless according to DOMPurify as all elements are in the allow-list, and then serialized to:

<form><math><mtext><form><mglyph><style></math><img src onerror=alert(1)></style></mglyph></form></mtext></math></form>

The markup contains a nested form element, which is disallowed by the HTML spec. Hence, when this markup is parsed again, the inner <form> start tag is ignored and the markup yields a different DOM tree:

┣ html form
┃ ┗ math math
┃   ┗ math mtext
┃     ┗ math mglyph
┃       ┗ math style
┗ html img src="" onerror="alert(1)"

Note that <mglyph> was initially in the HTML namespace but has now switched to the MathML namespace (an mglyph directly inside a MathML text integration point stays in MathML), causing all subsequent tags to be parsed differently. This made it possible to close <math> and inject the XSS payload.

Even though these two examples of payloads abused different mechanics to bypass DOMPurify, I'd argue that their root cause was the same; that is, in both cases the DOM tree that DOMPurify worked on had elements in unexpected namespaces.

In the first case, there was a <html p> element which was a direct child of <svg svg> or <math math>. This can never happen in a spec-compliant parser. The only way for a child element in foreign content to have a different namespace than its parent is via the well-defined MathML text integration points or HTML integration points. If it happens via any other tag, then this element should be deleted.

In the second case, the initial parsing yielded <html mglyph>. Even though mglyph is in the default allow-list, it should never be in the HTML namespace. Hence, it should also be dropped, rendering the whole bypass impossible.

In this pull request, I'm proposing to add a _checkValidNamespace function whose job is to ensure that:

there are no unexpected namespace switches
elements specific to SVG or MathML can only appear in their respective namespaces.

Here's a short overview of inner workings of the function:

If current element is in SVG namespace and the parent is in HTML namespace, then we ensure that current tag name is svg as it is the only way to make this switch.
If current element is in SVG namespace and the parent is in MathML namespace, then we ensure that the parent is MathML text integration point.
If current element is in MathML namespace and the parent is in HTML namespace, then we ensure that current tag name is math as it is the only way to make this switch.
If current element is in MathML namespace and the parent is in SVG namespace, then we ensure that current tag name is math as it is the only way to make this switch.
If current element is in HTML namespace and the parent is in another namespace, we ensure that the parent is MathML text integration point or HTML integration point.
For SVG and MathML elements we ensure that the element name is defined in the respective spec.
For HTML elements we ensure that the element name is not specific to SVG or MathML. If it is, then the element is dropped.
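The rules above can be sketched on plain objects as follows. This is not the _checkValidNamespace code from the pull request: the function name isValidNamespace, the { namespaceURI, tagName, parent } node shape, the HTML parent assumed for root nodes, and the deliberately shortened name lists are all illustrative assumptions. A real checker needs complete element lists, and annotation-xml is treated here as an integration point regardless of its encoding attribute.

```javascript
const HTML_NS = 'http://www.w3.org/1999/xhtml';
const SVG_NS = 'http://www.w3.org/2000/svg';
const MATHML_NS = 'http://www.w3.org/1998/Math/MathML';

const MATHML_TEXT_INTEGRATION_POINTS = new Set(['mi', 'mo', 'mn', 'ms', 'mtext']);
const HTML_INTEGRATION_POINTS = new Set(['foreignobject', 'desc', 'title', 'annotation-xml']);

// Assumption: tiny illustrative name lists, far from complete.
const SVG_NAMES = new Set(['svg', 'g', 'path', 'line', 'circle', 'a', 'style', 'title', 'desc', 'foreignobject']);
const MATHML_NAMES = new Set(['math', 'mi', 'mo', 'mn', 'ms', 'mtext', 'mglyph', 'mrow', 'annotation-xml']);
const SHARED_WITH_HTML = new Set(['a', 'style', 'title']);

// Assumption: a node without a parent is treated as sitting in HTML content.
const ASSUMED_ROOT_PARENT = { namespaceURI: HTML_NS, tagName: 'template' };

function isForeignOnlyName(tag) {
  return (SVG_NAMES.has(tag) || MATHML_NAMES.has(tag)) && !SHARED_WITH_HTML.has(tag);
}

function isValidNamespace(element) {
  const parent = element.parent || ASSUMED_ROOT_PARENT;
  const tag = element.tagName.toLowerCase();
  const parentTag = parent.tagName.toLowerCase();
  const ns = element.namespaceURI;
  const parentNs = parent.namespaceURI;

  if (ns === SVG_NS) {
    // HTML -> SVG is only possible through <svg>.
    if (parentNs === HTML_NS) return tag === 'svg';
    // MathML -> SVG requires a MathML text integration point as parent.
    if (parentNs === MATHML_NS && !MATHML_TEXT_INTEGRATION_POINTS.has(parentTag)) return false;
    return SVG_NAMES.has(tag);
  }
  if (ns === MATHML_NS) {
    // HTML -> MathML and SVG -> MathML are only possible through <math>.
    if (parentNs === HTML_NS || parentNs === SVG_NS) return tag === 'math';
    return MATHML_NAMES.has(tag);
  }
  if (ns === HTML_NS) {
    // Foreign -> HTML requires an integration point as parent.
    if (parentNs !== HTML_NS &&
        !MATHML_TEXT_INTEGRATION_POINTS.has(parentTag) &&
        !HTML_INTEGRATION_POINTS.has(parentTag)) return false;
    // HTML elements must not use names specific to SVG or MathML.
    return !isForeignOnlyName(tag);
  }
  return false; // unknown namespace: drop
}

// The two offending nodes from the bypasses described earlier:
const svg = { namespaceURI: SVG_NS, tagName: 'svg' };
const htmlP = { namespaceURI: HTML_NS, tagName: 'P', parent: svg };
const form = { namespaceURI: HTML_NS, tagName: 'FORM' };
const htmlMglyph = { namespaceURI: HTML_NS, tagName: 'MGLYPH', parent: form };

console.log(isValidNamespace(htmlP));      // false
console.log(isValidNamespace(htmlMglyph)); // false
```

Rejecting either node at sanitization time removes the element whose namespace would change on the second parse.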

The second issue this pull request protects against is the difference between document parsing mode and fragment parsing mode. DOMPurify uses document parsing mode (via new DOMParser().parseFromString()) but it could reparse the resulting DOM tree in fragment parsing mode via insertAdjacentHTML. This can lead to bypasses if the result is assigned to sinks like srcdoc. In this pull request, I got rid of insertAdjacentHTML completely and now work on DOM nodes directly.
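As a rough illustration of what "working on DOM nodes directly" means, the sketch below moves already-parsed nodes by reference instead of serializing them and feeding the markup to insertAdjacentHTML. The helper name moveChildNodes is hypothetical and this is not the code from the pull request; it relies only on the standard firstChild and appendChild members of DOM nodes.

```javascript
// Moves every child of `source` into `target` without any re-parsing.
// The parse-again alternative would be something like:
//   target.insertAdjacentHTML('beforeend', source.innerHTML);
// which runs the fragment parser over the serialized markup a second time.
function moveChildNodes(source, target) {
  // appendChild() detaches a node from its current parent before attaching
  // it, so source.firstChild advances on every iteration.
  while (source.firstChild) {
    target.appendChild(source.firstChild);
  }
  return target;
}
```

Because the nodes are never turned back into text, their namespaces and structure stay exactly as the sanitizer saw them.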

Tasks

Because this pull request is a pretty significant change to DOMPurify, I'd love a double check from Mario and/or Masato and anyone else who'd like to test it :) All tests are passing so this should be fine in theory, but it is always worth checking.

Comments are welcome :)

DOMPurify 2.2.6 comes with a new hardening to defend against mXSS. Changelog: https://github.com/cure53/DOMPurify/releases/tag/2.2.6 PR introducing the new mechanism: cure53/DOMPurify#495 Change-Id: I9ac0534d5bd200fc6db0a706ab202a029f8ae20c

DOMPurify 2.2.6 comes with a new hardening to defend against mXSS. Changelog: https://github.com/cure53/DOMPurify/releases/tag/2.2.6 PR introducing the new mechanism: cure53/DOMPurify#495 Change-Id: I4a499eb8efa227460040a1630ff7760f96373de8

DOMPurify 2.2.6 comes with a new hardening to defend against mXSS. Changelog: https://github.com/cure53/DOMPurify/releases/tag/2.2.6 PR introducing the new mechanism: cure53/DOMPurify#495 Change-Id: Icab6ee5153969207195a67395ed5b5ec9cb134b0

securitum-mb added 19 commits December 15, 2020 17:47

Get rid of insertAdjacentHTML to avoid reparsing

43b0784

Delete latest mXSS and namespace confusion fixes

b30d7ec

since the new approach should fix them anyway

Add initial version of namespace checker

178b43e

Delete elements that are not truly SVG

1444e7e

Tests assumed that noscript contents should be deleted. Added it to F…

963945e

…ORBID_CONTENTS

Restore the original namespace confusion check as it also killed SAFE…

4a93eac

…_FOR_JQUERY which we don't want to be back

Fix test 35 -<line> is only allowed in SVG

1a7ef88

Fix mXSS test; because of the new namespace checks, textarea should n…

e9fbca6

…ever occur inside svg

Fix a bunch of mXSS tests

d2eba54

Experiment with the element removal behavior

8179daf

Fix another two mXSS tests

bce6bad

Change svgFilters test so that it also requires svg in allowed tags

4ab4ff8

Harden node removal against DOM clobbering

8110d44

Add a bunch of tests to check namespace enforcement

ccc2d31

Experiment with anticlobber approach

0d42de0

Fix a terrible mistake in anti-clobber

6b2b871

Another fix in anti-clobber: getChildNodes -> childNodes

21baa58

Move anti-clobber to purify.js

e8c8e89

Merge branch 'main' of https://github.com/cure53/DOMPurify

808cab3

into main

cure53 merged commit 9ee3d95 into cure53:main Dec 17, 2020

securityMB mentioned this pull request Feb 1, 2021

Consider whether DOM serialization is sufficient WICG/sanitizer-api#56

Closed

This was referenced Apr 25, 2021

svg elements that not contained in <svg> been removed after 2.2.6 #533

Closed

feat: add NAMESPACE config #534

Merged

otherdaniel mentioned this pull request Nov 10, 2021

SVG and MathML WICG/sanitizer-api#137

Merged

This was referenced Jun 19, 2023

Bypass using mXSS LavaMoat/snow#91

Closed

Fix issue 91 LavaMoat/snow#106

Closed

luxaritas mentioned this pull request Nov 17, 2023

[DRAFT] Support happy-dom #878

Closed
