Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitizer vs Trusted Types #20

Closed
otherdaniel opened this issue Jul 2, 2020 · 27 comments
Closed

Sanitizer vs Trusted Types #20

otherdaniel opened this issue Jul 2, 2020 · 27 comments
Labels
sanitizer-api issues with the API

Comments

@otherdaniel
Copy link
Collaborator

The resulting APIs should be integratable with Trusted Types.

@otherdaniel
Copy link
Collaborator Author

DOMPurify adds an option for Trusted Types as a return type.

An alternative might be a change to TT, so that a TT policy can accept a sanitizer config in lieu of a callback function. That would preserve TT's review-ability, as it maintains the property that only TT policies can generate trusted type instances.

@shhnjk
Copy link

shhnjk commented Aug 20, 2020

I think sanitizer should have option to return TrustedHTML instead of string. To me, Trusted Types is a way to enforce proper sanitization on dangerous sinks. Creating Trusted Type policy with selecting sanitizer API in createHTML does not make sense to me, if browser already knows that sanitized output is known to be good.
Instead, output from sanitizer API should return TrustedHTML, if there was no modification made to default (i.e. known to be good) sanitization option.

CC: @koto

@otherdaniel
Copy link
Collaborator Author

Hmm. If Sanitizer(s) can create TrustedHTML, and if the user can create different sanitizers with potentially silly options (e.g. {allow: ["script"]}), then how do we select which sanitizers can create TrustedHTML instances? Default-only would seem safe, but quite restrictive.

Or would this power be automatically conferred to all sanitizer instances? (Presumably with the understanding that sanitizer configs can only tighten the defaults, but not loosen them?)

@shhnjk
Copy link

shhnjk commented Aug 20, 2020

So, let's say safe option we believe is {allow: ["b","i","s"]}. And we will return TrustedHTML as long as user's config is more restrictive than safe option (i.e. {allow: ["b","i"]}).

This would really benefit folks who would like to use Trusted Types. Because currently, readability of JS code is a challenge when converting existing code to Trusted Types. And usually this would mean to creating Trusted Type policy with static HTML.
You can see one of the example here. Converting static HTML using DOM API is ugly, but in order to use static HTML as a string as-is, we need to create TT policy, which in this case means all website that uses this library has to allow new TT policy on their website (which is a pain).

Maybe, how about we create toSafeHTML in sanitizer API which would only work with default safe config or more restrictive config?

@shhnjk
Copy link

shhnjk commented Aug 20, 2020

Reading at the syntax, maybe we would only expose toSafeHTML method if new Sanitizer uses default safe config or more restrictive config.

@shhnjk
Copy link

shhnjk commented Aug 20, 2020

Or would this power be automatically conferred to all sanitizer instances? (Presumably with the understanding that sanitizer configs can only tighten the defaults, but not loosen them?)

This also makes sense, but then I don't like the naming of toString which might return TrustedHTML. Anyways, I think @koto might have better idea on which way to go or no go 😉

@mozfreddyb
Copy link
Collaborator

mozfreddyb commented Aug 21, 2020

Hmm. If Sanitizer(s) can create TrustedHTML, and if the user can create different sanitizers with potentially silly options (e.g. {allow: ["script"]}), then how do we select which sanitizers can create TrustedHTML instances? Default-only would seem safe, but quite restrictive.

I think a main contribution of the spec should be to define a safe subset of HTML.
If we get this right, we can (and should!) disallow adding random elements to the allow list.
I don't want to allow adding things (besides custom elements) at all, only removing.
Allowed elements to be passed to the constructor should only be allowed to opt-in to stricter checking, never weaker.

@otherdaniel
Copy link
Collaborator Author

I think a main contribution of the spec should be to define a safe subset of HTML.
If we get this right, we can (and should!) disallow adding random elements to the allow list.

Interesting. I had assumed that users could configure whatever they want, but actually don't have a good reason for why.

If we do think we can do always-safe sanitizer configs, then we could maybe use a CSP keyword in the TT directive (e.g. trusted-types 'sanitizer') to enable direct TT generation by all sanitizer instances.

@koto
Copy link

koto commented Aug 21, 2020

Small nit: There is no always-safe sanitizer config. Due to script gadgets (which are common, as we've found), any DOM tree can potentially lead to JS execution. If the intention is to enable easy (e.g. configured only declaratively) sanitization for the long tail of websites, said websites remain practically vulnerable to this "second order" XSS. That is not a direct cause for concern for the Sanitizer API, but an argument against overpromising.

@shhnjk
Copy link

shhnjk commented Aug 21, 2020

I agree with @koto here. There is no always-safe sanitizer configs for script gadgets, and no customization to sanitizer is a deal breaker.

When it comes to integration with Trusted Types, we would want to block configuration that does not match with Trusted Types (thus require TT policy creation). That is, Sanitizer API can return TrustedHTML as long as config is known to prevent XSS with vanilla JS.

For example, we would allow assigning arbitrary value to id attribute in default config and return that as TrustedHTML, even though we know that id attribute might be able to cause an XSS through script gadgets.

This makes sense because DOM API would be able to do the same thing without causing a TT violation, and TT would only get angry when specific library tries to evaluate id attribute's value as an HTML, which is the Trusted Types' type enforcement job, and not a job for sanitizer.

@mozfreddyb
Copy link
Collaborator

mozfreddyb commented Aug 26, 2020

The specification cannot account for arbitrary scripts on the page that turn an attribute (id, data-.., etc.) into HTML / script though..

@koto
Copy link

koto commented Aug 26, 2020

I think you're both saying the same thing. @mozfreddyb, are you referring to something else than #20 (comment)? TT set a specific boundary, and the bug is to make sure the sanitizer is not a bypass vector. Which it's not, unless a sanitizer can produce nodes from strings that would normally use the "native" DOM XSS sinks. If the sanitizer output has ids, names and other gadget-inducing attributes, it could still be TrustedHTML or a DocumentFragment. The latter case is actually a perfect example - sanitizer produced a native-benign object, and if the application transforms it with its gadgets, TT complain (for most gadget types, at least).

@otherdaniel
Copy link
Collaborator Author

I've tried to take the earlier work from Jun, and to expand it a bit based on the current state of the spec, resulting in PR #49. I'd appreciate in @shhnjk @koto and @iVanlIsh could have a look.

Overview:

  • I've put all of it into an appendix, since TT is just an editor's draft.
  • I've tried to be generous, and specified two integrations. I'm not sure if whether it makes sense to have both.
    • sanitizeToTrustedHTML, much like Jun had proposed, Unlike Jun's proposal, it's enabled by a 'sanitizer-api' keyword in the trusted-types CSP directive.
    • Using SanitizerConfig as an input directly in TrustedTypePolicyOptions.

@otherdaniel
Copy link
Collaborator Author

New Trusted Types PR in #55.

I don't think we reached consensus on PR #49, so I've tried to take what I've learned and made have now made a new attempt. Please liberally comment - here or there - on whether think it's agreeable.

Main differences between #49 and #55:

  • Substance:
    • By default, Sanitizer is not a TT boundary.
    • It's still controllable via CSP header.
    • New helper method to make Sanitizer easier to use within a TT policy, without said method being TT specific.
  • Editorial:
    • Moved into an appendix, since it's really more explanatory, and arguably shouldn't be in the Sanitizer spec at all.
    • Contains explanations on potential uses, since I figured only giving the defitinions was pretty obtuse.

@otherdaniel
Copy link
Collaborator Author

New New Trusted Types proposal. No PR yet. Doc here: https://github.com/otherdaniel/purification/blob/trustedtypes-3/trustedtypes.md

(I'll gladly turn it into a proper PR, it that helps.)

I think we didn't really reach consensus on PR #55 either, so this time it's more of an Explainer / Position Paper about how to use Santizer and Trusted Types together. It's less about what to specify, and more about how to use these two things to achieve the common goal, XSS-free apps.

I've been careful not to mix explainer stuff with new proposals. It doesn't assume any change to the current spec. It does mention several options at the bottom, though.

My intent is to maybe leave it at that for now. We could then wait for external feedback and base our future path on what we hear from developers.

Does this make any sense?
If so, should this become a document in the sanitizer-api repo, or maybe TT, or should this just remain my personal opinion?

@shhnjk
Copy link

shhnjk commented Mar 3, 2021

Can we agree somewhere in the middle round?

  • Are folks comfortable with not throwing a TT error on sanitze? If so at least we should do that.
    • My point was that I don't see a difference between this and returning TrustedHTML, but if folks only likes sanitize, then let's at least do that.
  • Is opt-out for Sanitizer API to return TrustedHTML not enough for folks? I think that's a good compromise for both parties?

I think we aren't reaching consensus on Sanitizer API's defaults VS built-ins, but I don't think that matter much to the problem of static HTML assignment to dangerous sinks. As long as Sanitizer API returns natively XSS-safe output, than we should provide a way to not cause a TT violation when assigning unsafe input to dangerous sinks through Sanitizer API (either with sanitize or sanitizeToTrustedHTML).

@otherdaniel
Copy link
Collaborator Author

My intent here now is to concentrate on what everyone agrees on - which isn't very much - and to punt the rest to later. I understand that we'll have to do a second round on this. I also understand this makes almost no-one happy - certainly not me - but I hope we can at least make some progress that way.

IMHO, if we can't agree on API structure, then the real solution is to push this towards a public trial, and then figure out what developers out there are actually missing. We should avoid constraining the future path until then.

  • On (potentiall) blocking Sanitizer.sanitize (via Trusted Types): IMHO allowing TT to block .sanitize would make sense, but the argument against this is that if third parties / libraries cannot rely on Sanitizer being present, then they might be tempted to avoid it alltogether, in favour of a home-grown sanitizer. That is the opposite of what we're trying to accomplish with this project. While I'm not 100% convinced, I do think the objection is strong enough that I can't just ignore it.

  • On equivalence of Sanitizer.sanitize and .sanitizeToTrustedHTML: I agree with this, too, but I don't think I'm seeing wide agreement on this. The equivalence argument also goes the other way, in that devs can fairly easily polyfill the missing capability, because they're largely equivalent. If we see devs actually do that, we'd hav a much stronger argument.

  • On the use case of static assignments to dangerous sinks (I guess: bla.innerHTML = "<em>text</em>";): What is the use case that isn't easily covered by a sanitizing default policy, or by changing .innerHTML = "..." to .replaceChildren(s.sanitize("...")) ? Is it only syntactical, or is there some more involved thing I'm missing? Would it help to have an additional setter (.saneInnerHTML or so) on HTMLElement, that implements sanitizing in a TT safe way (if TT is enabled, and without TT or sanitizer if those are absent)?

@shhnjk
Copy link

shhnjk commented Mar 4, 2021

On the use case of static assignments to dangerous sinks (I guess: bla.innerHTML = "text";): What is the use case that isn't easily covered by a sanitizing default policy, or by changing .innerHTML = "..." to .replaceChildren(s.sanitize("...")) ?

IIUC, string argument to sanitize will cause a TT error if we don’t integrate Sanitizer API with TT (@koto let me know if I’m wrong). Because it’s a string to a DocumentFragment conversion.

And default TT policy shouldn’t be used in JS libraries/frameworks, as default TT policy is meant to be used by Web Apps.

@Sora2455
Copy link

Sora2455 commented Mar 4, 2021

As a rando web dev, my two cents is that converting from a string to a document fragment to a string to another document fragment (which is my understanding of what element.innerHTML = sanitizer.sanitizeToTrustedHTML("<em>HTML</em>") does) has not only performance issues, but is also vulnerable to mXSS, neither of which is the case with element.replaceChildren(sanitizer.sanitize("<em>HTML</em>")).

@shhnjk
Copy link

shhnjk commented Mar 4, 2021

If an output from Sanitizer API can cause an mXSS, then that's a problem of Sanitizer API itself. If we promised to provide natively XSS-free sanitizer, then producing an mXSS output is a deal breaker.

mXSS is caused by a mutation of the browser itself. So Sanitizer API is in the best position to solve that issue. If we can't solve that, then I'm opposed of even providing sanitizeToString method :)

koto added a commit to koto/trusted-types that referenced this issue Mar 10, 2021
koto added a commit to w3c/trusted-types that referenced this issue Mar 10, 2021
@shhnjk
Copy link

shhnjk commented Mar 27, 2021

Here is my official feedback with my developer's hat on:
https://microsoftedge.github.io/edgevr/posts/eliminating-xss-with-trusted-types/#the-future-work

@shhnjk
Copy link

shhnjk commented Jun 14, 2021

Coming back to this, I think my problem of allowing developers to assign static HTML can be solved with HTML Modules.

However, I think there is a major problem of integrating Sanitizer API with Trusted Types for sanitize method. Because sanitize method returns DocumentFragment, Trusted Types doesn't have corresponding types for it (as @mikewest mentions).

So if we don't think about integration, usage of Sanitizer API under Trusted Types enforcement will probably result in more usage of sanitizeToString method, which is a performance loss.

@otherdaniel
Copy link
Collaborator Author

Thanks for bringing this up again. In the mean time, #42 happened, and I think the APIs proposed there change the situation a good bit, hopefully for the better:

  • Node.SetInnerHTML(DOMString, Sanitizer): I think this addresses your main point
  • It reduces the string exposure in general: We'd have zero functions left that return a string, and only two that accept one (Sanitizer.sanitizeFor and Node.SetInnerHTML).
  • There's be no more .sanitizeToString.

I think the core issue for Trusted Types integration - whether Sanitizer is considered a TT boundary or not - is still open. As I read it, the majority opinion is that - if Sanitizer is XSS-safe by default - then it shouldn't do TT checks. I'd be okay with that.

Please take a minute and hop over to #42, whether this makes your life easier (or not). I'd like to wrap that one up soon-ish, so feedback would be much appreciated.

@shhnjk
Copy link

shhnjk commented Jun 14, 2021

I think the core issue for Trusted Types integration - whether Sanitizer is considered a TT boundary or not - is still open. As I read it, the majority opinion is that - if Sanitizer is XSS-safe by default - then it shouldn't do TT checks. I'd be okay with that.

For existing TT, it sounds okay. However, if TT implements something like require-trusted-types-for 'style', then Sanitizer API will be a bypass, because Sanitizer API won't block style tag but TT would want to block it.

So there needs to be a way to control Sanitizer globally in the document, either in the Sanitizer API or in the Trusted Types. But I think it should be done in the Trusted Types (I personally like this example by @koto).

The question is, if Trusted Types has new restrictions (e.g. blocking text assignment to style element without TrustedStyle), could Sanitizer API comply? If so, what does it look like for developers (suddenly the output from Sanitizer API might be different based on Trusted Types configuration)?

If we agree on TT controlling Sanitizer API globally in the future, then it's important to tell developers that Sanitizer API should be used only for sanitizing untrusted input. If they use it for static html assignment, the output might change in the future based on Trusted Types config (especially in the third-party frameworks/libraries).

@mozfreddyb
Copy link
Collaborator

Fwiw, I know this will be hard, but I'd strive for the sanitizer to always return an xss-safe value, which should make this all a bit simpler.

@shhnjk
Copy link

shhnjk commented Jun 15, 2021

Fwiw, I know this will be hard, but I'd strive for the sanitizer to always return an xss-safe value, which should make this all a bit simpler.

Sure, no doubt about that 😊

But what I'm saying here is, what happens if Trusted Types will create a new mode, where it will do type enforcement for CSS. Sanitizer API doesn't care about CSS injection, so it'll allow style tags. But Trusted Types would want to block that. In that situation, I think that the Sanitizer API's output should comply with Trusted Types, regardless of what config was passed (because the config of Sanitizer API might come from third-party library, but Trusted Types enforcement always comes from the Website owner).

@mozfreddyb mozfreddyb added the v1 label Mar 23, 2022
@otherdaniel
Copy link
Collaborator Author

Closing. (Please reopen if someone strongly disagrees.)


With the introduction of a non-overridable "baseline" into the Sanitizer, the Sanitizer maintains a guarantee to never contain script-ish output. A script-ish sink in the Sanitizer output should be considered a blocking bug in the Sanitizer itself, rather than "only" a Trusted Types issue. With that, I think we don't have to treat the Sanitizer as a TT sink.

Personally, I kinda like the result: We'll have the Sanitizer as an easy-to-use XSS-safe means to create DOM trees, and TT as a means to prevent other, unsafe means from doing the same. That's a good combo.

I think the concern that future evolutions of either Sanitizer or TT might evolve in different directions still stands. But I don't see a way to handle that preemptively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sanitizer-api issues with the API
Projects
None yet
Development

No branches or pull requests

5 participants