Harden protection against mutation XSS caused by namespace switching #495

Merged
merged 19 commits into from Dec 17, 2020

securityMB

This pull request is meant to add a substantial defense mechanism against instances of mutation XSS that happened within the last year as well as against (not yet disclosed) bugs exploiting differences between fragment parsing and document parsing modes.

Background & Context

One year ago I reported a DOMPurify bypass that exploited a parser bug (or rather a spec bug) in Chromium and Safari. In a nutshell, the following snippet of HTML:

<svg></p>

was parsed into the following DOM tree:

┗ svg svg
  ┗ html p

(in all DOM trees, I'm using a notation in which every tag name is prefixed with its namespace).

When the same DOM tree is serialized, it has the following form:

<svg><p></p></svg>

When the snippet is parsed again (so we have a roundtrip: DOM tree -> markup -> DOM tree), it is parsed differently:

┣ svg svg
┗ html p

This behavior is defined in the HTML spec. Foreign content, which is opened by either <svg> or <math>, has special parsing rules. One of those rules is that a specific list of tags closes foreign content: the parser goes back to the latest element that is in the HTML namespace, is a MathML text integration point, or is an HTML integration point (more on that later).
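
A minimal sketch of that rule, assuming the list of start tags given in the "rules for parsing tokens in foreign content" section of the HTML spec. The function name `breaksOutOfForeignContent` and its signature are hypothetical (they are not part of DOMPurify or of any browser API), and the sketch only looks at start tags, not at the rest of the tree-construction state.

```javascript
// Start tags that make the parser leave foreign content (SVG or MathML),
// as listed in the HTML spec's rules for parsing tokens in foreign content.
const BREAKOUT_START_TAGS = new Set([
  'b', 'big', 'blockquote', 'body', 'br', 'center', 'code', 'dd', 'div',
  'dl', 'dt', 'em', 'embed', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'head',
  'hr', 'i', 'img', 'li', 'listing', 'menu', 'meta', 'nobr', 'ol', 'p',
  'pre', 'ruby', 's', 'small', 'span', 'strong', 'strike', 'sub', 'sup',
  'table', 'tt', 'u', 'ul', 'var',
]);

// <font> only breaks out when it carries one of these attributes.
const FONT_BREAKOUT_ATTRIBUTES = ['color', 'face', 'size'];

// Hypothetical helper: true if a start tag with this name (and these
// attribute names) closes foreign content.
function breaksOutOfForeignContent(tagName, attributeNames = []) {
  const tag = tagName.toLowerCase();
  if (BREAKOUT_START_TAGS.has(tag)) {
    return true;
  }
  if (tag === 'font') {
    return attributeNames.some((name) =>
      FONT_BREAKOUT_ATTRIBUTES.includes(name.toLowerCase())
    );
  }
  return false;
}
```

Under this sketch a `p` start tag leaves foreign content, while `style` and `a` do not, which is why the `<p>` in `<svg><p></p></svg>` ends up next to the svg element rather than inside it when the markup is parsed again.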

Because DOMPurify is usually used the following way:

element.innerHTML = DOMPurify.sanitize(unsafeMarkup)

the round-trip of parsing into DOM tree, serializing and parsing again happens in this case.
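
The round-trip can be checked mechanically. The sketch below is only an illustration, not something DOMPurify does: `survivesRoundTrip` and `browserParseToBody` are hypothetical names, and the parser is passed in as a parameter so that the comparison logic does not depend on a particular DOM implementation. Note that equal serializations do not prove that the two trees are identical (namespaces are not visible in the markup), so a `true` result is a hint rather than a guarantee.

```javascript
// Parses markup, serializes the result, parses that serialization again,
// and reports whether the second serialization matches the first.
// `parseToBody` is an assumed callback returning an object with `innerHTML`.
function survivesRoundTrip(markup, parseToBody) {
  const firstPass = parseToBody(markup).innerHTML;
  const secondPass = parseToBody(firstPass).innerHTML;
  return firstPass === secondPass;
}

// One way to supply the parser in a browser (defined here, never called):
// DOMParser parses in document mode and `body` holds the parsed content.
function browserParseToBody(markup) {
  return new DOMParser().parseFromString(markup, 'text/html').body;
}
```

In a browser console this could be called as `survivesRoundTrip(DOMPurify.sanitize(unsafeMarkup), browserParseToBody)`; a `false` result means the markup mutates when it is parsed again.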

The issue was with the following markup:

<svg></p><style><a title="</style><img src onerror=alert(1)>">

which is parsed into the following DOM tree:

┗ svg svg
  ┣ html p
  ┗ svg style
    ┗ svg a title="</style><img src onerror=alert(1)>"

DOMPurify considers the DOM tree harmless because the XSS payload sits entirely inside the title attribute. Then it proceeds to serialize it to:

<svg><p></p><style><a title="</style><img src onerror=alert(1)>"></style></svg>

After reparsing, a completely different DOM tree is created:

┣ svg svg
┣ html p
┣ html style
┃ ┗ #text: <a title="
┣ html img src="" onerror="alert(1)"
┗ #text: ">

Because the parser left foreign content when <p> was encountered, all subsequent tags were also parsed in the HTML namespace, leading to a dangerous <img> element in the DOM tree.

Now let's have a look at this year's DOMPurify bypass via namespace confusion. The payload was:

<form><math><mtext></form><form><mglyph><style></math><img src onerror=alert(1)>

It was parsed into the following DOM tree:

┗ html form
  ┗ math math
    ┗ math mtext
      ┗ html form
        ┗ html mglyph
          ┗ html style
            ┗ #text: </math><img src onerror=alert(1)>

The namespace is switched from math to html on <mtext> because it is one of the MathML text integration points, whose purpose is to allow HTML elements to be embedded within MathML.

The DOM tree is again harmless according to DOMPurify, as all elements are in the allow-list, and it is then serialized to:

<form><math><mtext><form><mglyph><style></math><img src onerror=alert(1)></style></mglyph></form></mtext></math></form>

The markup contains a nested form element, which is disallowed by the HTML spec. Hence, when this markup is parsed again, it yields a different DOM tree:

┣ html form
┃ ┗ math math
┃   ┗ math mtext
┃     ┗ math mglyph
┃       ┗ math style
┗ html img src="" onerror="alert(1)"

Note that <mglyph> was initially in the HTML namespace but has now switched to the MathML namespace, causing all subsequent tags to be parsed differently. This made it possible to close <math> and to inject the XSS payload.

Even though these two examples of payloads abused different mechanics to bypass DOMPurify, I'd argue that their root cause was the same; that is, in both cases the DOM tree that DOMPurify worked on had elements in unexpected namespaces.

In the first case, there was an <html p> element which was a direct child of <svg svg> or <math math>. This can never happen in a spec-compliant parser. The only way for a child element in foreign content to have a different namespace than its parent is via the well-defined MathML text integration points or HTML integration points. If it happens via any other tag, then this element should be deleted.

In the second case, the initial parsing yielded <html mglyph>. Even though mglyph is in the default allow-list, it should never be in the HTML namespace. Hence, it should also be dropped, rendering the whole bypass impossible.

In this pull request, I'm proposing to add a _checkValidNamespace function whose job is to ensure that:

  • there are no unexpected namespace switches
  • elements specific to SVG or MathML can only appear in their respective namespaces.

Here's a short overview of the inner workings of the function:

  1. If the current element is in the SVG namespace and the parent is in the HTML namespace, then we ensure that the current tag name is svg, as it is the only way to make this switch.
  2. If the current element is in the SVG namespace and the parent is in the MathML namespace, then we ensure that the parent is a MathML text integration point.
  3. If the current element is in the MathML namespace and the parent is in the HTML namespace, then we ensure that the current tag name is math, as it is the only way to make this switch.
  4. If the current element is in the MathML namespace and the parent is in the SVG namespace, then we ensure that the current tag name is math, as it is the only way to make this switch.
  5. If the current element is in the HTML namespace and the parent is in another namespace, we ensure that the parent is a MathML text integration point or an HTML integration point.
  6. For SVG and MathML elements, we ensure that the element name is defined in the respective spec.
  7. For HTML elements, we ensure that the element name is not specific to SVG or MathML. If it is, then the element is dropped.
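
The seven rules above can be sketched as a single predicate. This is an illustration written from the list, not the code in this pull request: the name `checkValidNamespaceSketch`, the `node` helper and the very short tag lists are assumptions made for the example (a real implementation would need the full SVG and MathML element lists). Nodes are plain objects that reuse the DOM property names `namespaceURI`, `localName` and `parentNode`, and a missing parent is treated as an HTML container.

```javascript
const HTML_NS = 'http://www.w3.org/1999/xhtml';
const SVG_NS = 'http://www.w3.org/2000/svg';
const MATHML_NS = 'http://www.w3.org/1998/Math/MathML';

// Integration points named by the HTML spec (lowercased tag names).
const MATHML_TEXT_INTEGRATION_POINTS = new Set(['mi', 'mo', 'mn', 'ms', 'mtext']);
const HTML_INTEGRATION_POINTS = new Set(['foreignobject', 'desc', 'title', 'annotation-xml']);

// ASSUMPTION: tiny sample lists for illustration only.
const SVG_TAGS = new Set(['svg', 'g', 'path', 'circle', 'foreignobject', 'desc', 'title', 'style', 'a']);
const MATHML_TAGS = new Set(['math', 'mi', 'mo', 'mn', 'ms', 'mtext', 'mglyph', 'mrow', 'annotation-xml']);
const TAGS_SHARED_BY_SVG_AND_HTML = new Set(['title', 'style', 'a']);

// Hypothetical predicate: true if the element may keep its namespace.
function checkValidNamespaceSketch(element) {
  const parent = element.parentNode || { namespaceURI: HTML_NS, localName: 'template' };
  const tag = element.localName.toLowerCase();
  const parentTag = parent.localName.toLowerCase();
  const ns = element.namespaceURI;
  const parentNs = parent.namespaceURI;

  if (ns === SVG_NS) {
    if (parentNs === HTML_NS && tag !== 'svg') return false; // rule 1
    if (parentNs === MATHML_NS && !MATHML_TEXT_INTEGRATION_POINTS.has(parentTag)) return false; // rule 2
    return SVG_TAGS.has(tag); // rule 6
  }
  if (ns === MATHML_NS) {
    if (parentNs !== MATHML_NS && tag !== 'math') return false; // rules 3 and 4
    return MATHML_TAGS.has(tag); // rule 6
  }
  if (ns === HTML_NS) {
    const parentIsIntegrationPoint =
      MATHML_TEXT_INTEGRATION_POINTS.has(parentTag) || HTML_INTEGRATION_POINTS.has(parentTag);
    if (parentNs !== HTML_NS && !parentIsIntegrationPoint) return false; // rule 5
    // rule 7: reject names that belong only to SVG or MathML
    return !MATHML_TAGS.has(tag) && (!SVG_TAGS.has(tag) || TAGS_SHARED_BY_SVG_AND_HTML.has(tag));
  }
  return false; // unknown namespace
}

// Hypothetical helper for building plain-object nodes.
const node = (namespaceURI, localName, parentNode = null) => ({ namespaceURI, localName, parentNode });
```

Applied to the two bypasses, `checkValidNamespaceSketch(node(HTML_NS, 'p', node(SVG_NS, 'svg')))` rejects the `<html p>` child of `<svg svg>`, and `checkValidNamespaceSketch(node(HTML_NS, 'mglyph', node(HTML_NS, 'form')))` rejects `<html mglyph>`, while an `<html form>` inside `<math mtext>` is accepted because mtext is a MathML text integration point.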

The second issue this pull request protects against is the difference between document parsing mode and fragment parsing mode. DOMPurify uses document parsing mode (via new DOMParser().parseFromString()), but it could reparse the resulting DOM tree in fragment parsing mode via insertAdjacentHTML. This can lead to bypasses if the result is assigned to sinks like srcdoc. In this pull request, I got rid of insertAdjacentHTML completely, and the code now works on DOM nodes directly.

Tasks

Because this pull request is a pretty significant change to DOMPurify, I'd love a double check from Mario and/or Masato and anyone else who'd like to test it :) All tests are passing, so this should be fine in theory, but it's always worth checking.

Comments are welcome :)

cure53 merged commit 9ee3d95 into cure53:main on Dec 17, 2020
LeSuisse added a commit to Enalean/tuleap that referenced this pull request Dec 21, 2020
DOMPurify 2.2.6 comes with a new hardening to defend against mXSS.

Changelog: https://github.com/cure53/DOMPurify/releases/tag/2.2.6
PR introducing the new mechanism: cure53/DOMPurify#495

Change-Id: I9ac0534d5bd200fc6db0a706ab202a029f8ae20c