Upstream sanitizer api #12395

Open
noamr wants to merge 35 commits into whatwg:main from noamr:zcorpan/upstream-sanitizer-api

@noamr
Collaborator

@noamr noamr commented Apr 21, 2026

Convert the incubated spec in https://wicg.github.io/sanitizer-api/ to the HTML format and make it part of the HTML standard.

(See WHATWG Working Mode: Changes for more details.)


/canvas.html ( diff )
/comms.html ( diff )
/dom.html ( diff )
/dynamic-markup-insertion.html ( diff )
/edits.html ( diff )
/embedded-content-other.html ( diff )
/form-elements.html ( diff )
/forms.html ( diff )
/grouping-content.html ( diff )
/iframe-embed-object.html ( diff )
/image-maps.html ( diff )
/imagebitmap-and-animations.html ( diff )
/index.html ( diff )
/indices.html ( diff )
/infrastructure.html ( diff )
/interaction.html ( diff )
/interactive-elements.html ( diff )
/microdata.html ( diff )
/parsing.html ( diff )
/references.html ( diff )
/rendering.html ( diff )
/sections.html ( diff )
/semantics.html ( diff )
/system-state.html ( diff )
/tables.html ( diff )
/text-level-semantics.html ( diff )
/timers-and-user-prompts.html ( diff )
/web-messaging.html ( diff )
/webstorage.html ( diff )
/workers.html ( diff )

@noamr noamr marked this pull request as draft April 21, 2026 13:16
@noamr noamr changed the base branch from zcorpan/upstream-sanitizer-api to main April 21, 2026 13:17
@noamr noamr changed the title WIP upstream sanitizer api Upstream sanitizer api Apr 21, 2026
@noamr noamr force-pushed the zcorpan/upstream-sanitizer-api branch from 223a4d1 to d2034e5 April 21, 2026 19:42
@noamr
Collaborator Author

noamr commented Apr 21, 2026

@zcorpan @evilpie @mozfreddyb @otherdaniel
initial review? :)
this is quite a big PR...

@evilpie
Member

evilpie commented Apr 22, 2026

Amazing, thanks for working on this.

The built-in safe default configuration is pretty integral to the API, where did it go?

For anyone else looking at this, the gist of the changes is in dynamic-markup-insertion.html.

@noamr
Collaborator Author

noamr commented Apr 22, 2026

> Amazing, thanks for working on this.
>
> The built-in safe default configuration is pretty integral to the API, where did it go?

Oh, you're right. I had it on my to-do list and forgot. Getting to it. Thanks!

@noamr
Collaborator Author

noamr commented Apr 22, 2026

> > Amazing, thanks for working on this.
> > The built-in safe default configuration is pretty integral to the API, where did it go?
>
> Oh, you're right. I had it on my to-do list and forgot. Getting to it. Thanks!

Done.

@noamr noamr marked this pull request as ready for review April 22, 2026 10:56
@annevk
Member

annevk commented Apr 22, 2026

I thought as part of moving this into the HTML standard we'd also address the parser integration issue?

@noamr
Collaborator Author

noamr commented Apr 23, 2026

> I thought as part of moving this into the HTML standard we'd also address the parser integration issue?

This is a huge PR, so I thought doing it in two stages, the first being a purely technical upstream, would be easier to review.

I'm open to, and happy with, incorporating the stream-while-parsing changes in this PR if you and @zcorpan are OK reviewing that in one go.

@noamr
Collaborator Author

noamr commented Apr 23, 2026

@zcorpan @annevk can we align on whether we upstream the sanitizer as is and then change it to stream-while-parsing, or do both in one go? I'm perfectly happy with either option.
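
For readers weighing the two options, here is a toy TypeScript sketch of the difference between sanitizing a finished tree and refusing nodes while the tree is being built. `ToyNode`, `sanitizeAfterParsing`, and `ToyTreeBuilder` are hypothetical names invented for illustration; this is neither the DOM nor the specification's algorithm, only a model of the two shapes under discussion.

```typescript
// Toy tree model (hypothetical): not the DOM, not the spec's data structures.
interface ToyNode {
  name: string; // element name, or "#text" for a text node
  children: ToyNode[];
}

// Post-processing: the whole tree already exists, and disallowed
// subtrees are pruned in a second pass over it.
function sanitizeAfterParsing(root: ToyNode, allowed: ReadonlySet<string>): void {
  root.children = root.children.filter(
    (child) => child.name === "#text" || allowed.has(child.name),
  );
  for (const child of root.children) {
    sanitizeAfterParsing(child, allowed);
  }
}

// While parsing: a hypothetical tree builder consults the allow list
// before inserting, so a disallowed node never enters the tree at all.
class ToyTreeBuilder {
  readonly root: ToyNode = { name: "#root", children: [] };
  private readonly allowed: ReadonlySet<string>;

  constructor(allowed: ReadonlySet<string>) {
    this.allowed = allowed;
  }

  // Returns the inserted node, or null when insertion was refused.
  insert(parent: ToyNode, name: string): ToyNode | null {
    if (name !== "#text" && !this.allowed.has(name)) {
      return null;
    }
    const node: ToyNode = { name, children: [] };
    parent.children.push(node);
    return node;
  }
}
```

In the first shape a disallowed node exists in the tree for a while before it is removed; in the second it is never created, which is the property the "sanitizing while parsing" follow-up is after.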

@noamr noamr added the "agenda+" (To be discussed at a triage meeting) label Apr 23, 2026
@zcorpan
Member

zcorpan commented Apr 23, 2026

I prefer doing the parser integration in a follow-up PR.

@noamr noamr removed the "agenda+" (To be discussed at a triage meeting) label Apr 23, 2026
@evilpie
Member

evilpie commented Apr 23, 2026

@noamr
Collaborator Author

noamr commented Apr 23, 2026

> I think these three PRs would be good to merge before merging into the HTML standard:

Since some security-sensitive changes rely on "sanitizing while parsing", and that in turn relies on the current post-processing sanitizer being upstreamed, I don't think we should delay upstreaming any further.

Can we race it? If any of these go in before the upstream PR is in, I'll incorporate them into the HTML PR.

@noamr noamr closed this Apr 23, 2026
@noamr noamr reopened this Apr 23, 2026
Comment thread source
data-x="dom-SanitizerProcessingInstruction-target">target</code> member.</p>
</div>

<div algorithm>
Collaborator Author

These algorithms look like they belong in Infra... would people be open to adding an optional comparator predicate to those, or to the definition of list/ordered set?

@annevk @zcorpan
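
To make the suggestion concrete, here is a minimal TypeScript sketch of list operations that take an optional comparator predicate. Everything in it is hypothetical: `NamedItem`, `contains`, `appendUnique`, `removeAll`, and `sameName` are illustrative names, and the name-plus-namespace comparison is an assumption about what such a predicate might look like, not text from Infra or from this PR.

```typescript
// Hypothetical item shape, loosely modeled on a name/namespace pair.
interface NamedItem {
  name: string;
  namespace: string | null;
}

// Equality predicate; when omitted, operations fall back to identity.
type Equals<T> = (a: T, b: T) => boolean;

function identity<T>(a: T, b: T): boolean {
  return a === b;
}

// "contains" with an optional comparator predicate.
function contains<T>(list: readonly T[], item: T, equals: Equals<T> = identity): boolean {
  return list.some((entry) => equals(entry, item));
}

// "append if absent": keeps ordered-set semantics under the comparator.
// Returns true when the item was appended.
function appendUnique<T>(list: T[], item: T, equals: Equals<T> = identity): boolean {
  if (contains(list, item, equals)) {
    return false;
  }
  list.push(item);
  return true;
}

// "remove": drops every entry the comparator considers equal to item,
// in place, and returns how many entries were removed.
function removeAll<T>(list: T[], item: T, equals: Equals<T> = identity): number {
  const kept = list.filter((entry) => !equals(entry, item));
  const removed = list.length - kept.length;
  list.length = 0;
  list.push(...kept);
  return removed;
}

// Example comparator: two items match when name and namespace both match.
const sameName: Equals<NamedItem> = (a, b) =>
  a.name === b.name && a.namespace === b.namespace;
```

With the default predicate, two structurally identical objects are distinct entries; passing `sameName` makes them compare equal, which is the behavior the per-algorithm definitions in the PR have to spell out by hand.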

Contributor

@otherdaniel otherdaniel left a comment

Thank you, and I'm super happy to see this happening!


I wonder if we can link to the "Security Considerations" section in the current spec; or have them in a supplementary document somewhere?

@noamr noamr force-pushed the zcorpan/upstream-sanitizer-api branch from ea79a5b to 1e065df April 28, 2026 13:07
@noamr
Collaborator Author

noamr commented Apr 28, 2026

> Thank you, and I'm super happy to see this happening!
>
> I wonder if we can link to the "Security Considerations" section in the current spec; or have them in a supplementary document somewhere?

I've upstreamed them instead, into a security considerations subsection.

Comment thread source Outdated
<li><p>Return <var>document</var>.</p></li>
</ol>
</div>

</div>

<!-- https://github.com/WICG/sanitizer-api/commit/c4e328037ab6cd9c753b12694f5dcfc14988dec5 -->

<h4>Safe HTML parsing methods</h4>
Member

I don't think we should use "Safe". Just "HTML parsing methods" is fine. For the same reason we don't say "safe" in APIs.

Collaborator Author

@noamr noamr Apr 29, 2026

Moved them together with the "unsafe" methods and explained the difference.

Comment thread source Outdated
into an element's <code data-x="dom-Element-innerHTML">innerHTML</code> is fraught with risk, as
it can cause script execution in a number of unexpected ways.</p>

<p>Libraries like <cite>DOMPurify</cite> attempt to manage this problem by carefully parsing and
Member

I think we should trim a bunch of this text. This is a standard, not a justification for the existence of this feature. We also can't assume familiarity with libraries so it's best to just not mention them.

Collaborator Author

Trimmed considerably

Comment thread source Outdated
</li>
</ul>

<h4>Processing model</h4>
Member

This section appears to define API. "Processing model" is generally reserved for something more abstract.

Collaborator Author

Done

Comment thread source Outdated
}</p></li>
</ul>

<h4 id="sanitizer-security-considerations">Security Considerations</h4>
Member

A lot of the headings here don't appear to follow our title case convention.

Collaborator Author

Done

@noamr
Collaborator Author

noamr commented Apr 30, 2026

I've refactored some of the sanitization constants to go into each element's definition instead of being in one huge table. I think that makes it less error-prone when we add new elements in the future. If that's undesirable, I'm happy to revert.
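
As a rough illustration of the idea (not the spec text), the TypeScript sketch below keeps sanitization metadata next to each element's definition and derives the aggregate allow list from it. `ElementDefinition`, `elementDefinitions`, and `buildDefaultAllowList` are hypothetical names, and the three entries and their attribute lists are illustrative values only.

```typescript
// Hypothetical per-element sanitization metadata. Field names and the
// sample values below are illustrative, not copied from the specification.
interface ElementDefinition {
  name: string;
  // Whether a default configuration would allow this element.
  allowedByDefault: boolean;
  // Element-specific attributes that would be allowed by default.
  defaultAttributes: readonly string[];
}

// Each element carries its own sanitization data, so adding a new
// element means adding one entry here rather than editing a distant table.
const elementDefinitions: readonly ElementDefinition[] = [
  { name: "a", allowedByDefault: true, defaultAttributes: ["href", "hreflang", "type"] },
  { name: "p", allowedByDefault: true, defaultAttributes: [] },
  { name: "script", allowedByDefault: false, defaultAttributes: [] },
];

// Derive the aggregate allow list from the per-element definitions.
function buildDefaultAllowList(
  definitions: readonly ElementDefinition[],
): Map<string, readonly string[]> {
  const allowList = new Map<string, readonly string[]>();
  for (const definition of definitions) {
    if (definition.allowedByDefault) {
      allowList.set(definition.name, definition.defaultAttributes);
    }
  }
  return allowList;
}
```

The trade-off is the one described above: the single table is easier to scan in one place, while per-element data is harder to forget when a new element is introduced.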

5 participants