Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create [TrustedTypes].fromLiteral method #347

Closed
shhnjk opened this issue Jul 20, 2021 · 23 comments
Closed

Create [TrustedTypes].fromLiteral method #347

shhnjk opened this issue Jul 20, 2021 · 23 comments

Comments

@shhnjk
Copy link
Member

shhnjk commented Jul 20, 2021

Requirement of Trusted Types policy for assigning static HTML, Script, and ScriptURL is a pain point of TT conversion.
Creating some element using DOM APIs are hard, and creating policy just for assigning a static ScriptURL increases work and decreases readability without (almost) any benefit.

In order to make migration easier for website to increase adaptation of Trusted Types, safe code should require much less work to migrate.

While Array.isTemplateObject aims to provide this primitive, it's unsure if that's landing anywhere, and that won't eliminate the requirement of TT Policy.

Instead, can we create a fromLiteral method under each Trusted Types, so that staticness can be verified in the Web Platform (using same primitive as Array.isTemplateObject)?

Example would be:

const s = TrustedHTML.fromLiteral`<s>Trusted</s>`; // returns TrustedHTML
node.innerHTML = s; // No TT violation as TrustedHTML was assigned
@Sora2455
Copy link

In my uninformed opinion, this could easily be a footgun used by server-side templating languages to bypass the protection offered by Trusted Types. I could easily see something like this happening:

loginDescripter.innerHTML = TrustedHTML.fromLiteral`<i><?php echo $username;?></i>`;

@koto
Copy link
Member

koto commented Jul 20, 2021

Trusted Types are explicitly not covering server-side injections. Without fromLiteral, what prevents the developers from just outputting arbitrary strings in the HTML response, or interpolating into the TT policy body?

@shhnjk
Copy link
Member Author

shhnjk commented Jul 28, 2021

@koto, what's your thought on this?

@koto
Copy link
Member

koto commented Aug 4, 2021

I have some thoughts.

  1. It offers a way to create TT instances, so, like policies, it should be guardable by trusted-types directive (so - keyword). I can imagine how it's undesireable for developers to load scripts from arbitrary URLs, even in an absence of script-src.
  2. I see a clear advantage - it's both secure, and easier to use than policies for these limited cases. I'm not sure how much in practice this buys, given that this can already be done (modulo foolproofness of Array.isTemplateObject) in a library that internally uses a policy (in fact, we already have two such libraries). The API could only do this securely if there's no interpolation whatsoever, whereas the library can be configurable (e.g. allow dynamic hostnames for a script URL, but only for these hostnames, etc.). The libraries are always easier to experiment on the API shape of, and quicker to iterate on than the Web APIs. The libraries also could workaround transpilation to ES version that does not support template literals, whereas an API cannot.

So I'm split. I think we should, however, explore how feasible the spec+impl could be. @domenic, are you aware of any Web API that is a template tag function? Is there a reason not to have one? WebIDL wise, I think just defining as

interface TrustedHTML {
  TrustedHTML fromLiteral(object literal, any... args);
}

would work? The algorithm would have to lookup the argument in the [[TemplateMap]] slot of the Realm. I'm not sure if such peeking into ES slots is OK, any advice?

It's a similar issue to WICG/sanitizer-api#102, but we don't have the side-effects concerns here.

@domenic
Copy link

domenic commented Aug 4, 2021

Strategically, I think this kind of thing is better than waiting on Array.isTemplateObject.

No existing web API defines a template tag function. Normally they would define it using the IDL fromLiteral(sequence<DOMString> strings, any... args), but I guess then you would not be able to perform the verification you want. So indeed you should use object, perform the verification, and then use prose to convert to sequence<DOMString>.

Peeking into ES slots is discouraged in general---Web IDL is supposed to provide a uniform interface for interfacing with JS objects. But doing so makes sense in some cases, and IMO this is one of them. If it comes up more than a few times then we should define a wrapper algorithm in the IDL spec (e.g., the "underlying buffer" wrapper algorithm abstracts away the process of looking at the [[ViewedArrayBuffer]] internal slot.)

One thing I didn't see mentioned is how you plan to do the interpolation? E.g. if someone does

const foo = `<script>console.log(1)</script>`
const s = TrustedHTML.fromLiteral`<s>${foo}</s>`;

what is the result? Since you control the tag you could make the policy be that any interpolations involve HTML escaping, i.e. the result would be a trusted HTML string <s>&lt;script>console.log(1)&lt/script></s>. Is that the intention?

@koto
Copy link
Member

koto commented Aug 4, 2021

Thanks @domenic, this seems workable :)

how you plan to do the interpolation?

At least here, the intention is to throw on any interpolation attempt (i.e. we assert that the template array is of length 1). Template literals are just a way to enforce constant literals that we know are safe in a TT model (i.e. they are not user-injected).

I would love to eventually allow to e.g. contextually escape HTML, but I'm not sure TT (or any web API to be honest) would be the right layer to implement it. lit-html already does it, so we know it's feasible in a lib, and it's probably better suited there.

@domenic
Copy link

domenic commented Aug 4, 2021

Interesting. I would have thought that just escaping all of <, >, ", ', and & would be good enough. (In the sense that it would definitely be "secure", and if it over-interpolates for some use cases, then oh well, people can just not use it.) But I am not an expert.

@koto
Copy link
Member

koto commented Aug 4, 2021

Nope :) TrustedHTML`<${'script'}>'boom!'` or TrustedHTML`<div ${'onclick'}="${'boom!'}">` . Strings are just bad without knowing the exact context.

@shhnjk
Copy link
Member Author

shhnjk commented Aug 4, 2021

  1. It offers a way to create TT instances, so, like policies, it should be guardable by trusted-types directive (so - keyword). I can imagine how it's undesireable for developers to load scripts from arbitrary URLs, even in an absence of script-src.

If this requires policy, then I'm not sure why we don't require policy for setHTML.

2. I see a clear advantage - it's both secure, and easier to use than policies for these limited cases. I'm not sure how much in practice this buys, given that this can already be done (modulo foolproofness of Array.isTemplateObject) in a library that internally uses a policy (in fact, we already have two such libraries). The API could only do this securely if there's no interpolation whatsoever, whereas the library can be configurable (e.g. allow dynamic hostnames for a script URL, but only for these hostnames, etc.). The libraries are always easier to experiment on the API shape of, and quicker to iterate on than the Web APIs. The libraries also could workaround transpilation to ES version that does not support template literals, whereas an API cannot.

IF we can have this primitive without a policy, there are few advantages over library.

  1. We can use this mechanism in migration of third-party library (if keyword is required, this method will likely not used in third-party library).
  2. There will be less policies in TT (compared to every third-party lib creating their own policy for doing same thing).
  3. Doesn't require 'allow-duplicates' as opposed to custom library for fromLiteral where it might be used in 2 different codebases within the same page.

@koto
Copy link
Member

koto commented Aug 4, 2021

  1. It offers a way to create TT instances, so, like policies, it should be guardable by trusted-types directive (so - keyword). I can imagine how it's undesireable for developers to load scripts from arbitrary URLs, even in an absence of script-src.

If this requires policy, then I'm not sure why we don't require policy for setHTML.

To clarify, I meant guards, and not TT policy. For example, a keyword in trusted-types that can enable/disable fromLiteral. I feel like we might be losing a lot control - TT allow to put guards on script loading mechanisms done by large codebases that don't share the problems of script-src.

setHTML, as I understand it, currently sanitizes, and this is a non-sanitizing (but only accepting constants) equivalent?

  1. I see a clear advantage - it's both secure, and easier to use than policies for these limited cases. I'm not sure how much in practice this buys, given that this can already be done (modulo foolproofness of Array.isTemplateObject) in a library that internally uses a policy (in fact, we already have two such libraries). The API could only do this securely if there's no interpolation whatsoever, whereas the library can be configurable (e.g. allow dynamic hostnames for a script URL, but only for these hostnames, etc.). The libraries are always easier to experiment on the API shape of, and quicker to iterate on than the Web APIs. The libraries also could workaround transpilation to ES version that does not support template literals, whereas an API cannot.

IF we can have this primitive without a policy, there are few advantages over library.

  1. We can use this mechanism in migration of third-party library (if keyword is required, this method will likely not used in third-party library).

I see. It could be an opt-out keyword though.

@shhnjk
Copy link
Member Author

shhnjk commented Aug 4, 2021

I see. It could be an opt-out keyword though.

Yup, opt-out sounds great!

@shhnjk
Copy link
Member Author

shhnjk commented Aug 6, 2021

Assuming this will be an opt-out keyword, how should we go about designing what opt-out looks like?

My suggestion would be to create a built-in policy (just like default) where an application developer can create a policy specific to output of fromLiteral.

Example:

<meta http-equiv="Content-Security-Policy" content="require-trusted-types-for 'script'; trusted-types deny-literals;">

<script>
  trustedTypes.createPolicy('deny-literals', {
    // HTML and Script are fine
    createHTML: html => html,
    createScript: script => script,
    // Let's make sure script is same-origin
    createScriptURL: url => {
      const script_url = new URL(url);
      if (globalThis.origin === script_url.origin) {
        return script_url;
      }
      throw("Bad script URL");
    }
  });

  // returns true
  console.log(TrustedHTML.fromLiteral`test` instanceof TrustedHTML);
  // throws
  console.log(TrustedScriptURL.fromLiteral`https://evil.example/exploit.js`);
</script>

@koto
Copy link
Member

koto commented Aug 6, 2021

I'm confused - I thought the whole advantage of fromLiteral is that one doesn't need to create a policy at all :)

A single, global policy - like a default policy - looks attractive, but is problematic - the model breaks if you integrate two codebases that use it in a single document.

Policy in a fromLiteral introduces an additional concern. Tag functions are supposed not to have side effects, and policy functions can return different values every time, or alter global state.

If rules for conversions are needed, I'd rather we keep using the regular policy API , as that one allows devs to have local rules (e.g. per module). We could, for example, change the first argument of TrustedTypePolicy.prototype.createXYZ functions to any, such that it can be called as a template literal:

const foo = trustedTypes.createPolicy('literalsOnly', { 
  createScriptURL(template, ...args) {
     // in the future - Array.isTemplateObject
     if (!Array.isArray(template) || !Array.isArray(template.raw) || template.length !== 1) {
        throw 'bad';
    }
    // your rules here.
    return template[0];
  }
});

foo.createScriptURL`https://foo.bar`

TrustedXYZ.fromLiteral is, I think, only possible as a native API if there are no custom rules (due to side effects).

@shhnjk
Copy link
Member Author

shhnjk commented Aug 7, 2021

I do want fromLiteral without a policy. But I was thinking maybe we should give a context for folks who opt-out of fromLiteral, in such a way that they know something came to default policy (or some other policy) because they opted-out from fromLiteral (otherwise, you can't make logic where you want to allow HTML and Script from fromLiteral but not ScriptURL).
But given that it can't have side effects, it makes sense that we can't do this.

In that case, we can simply let fromLiteral return a string when an app opt-out from fromLiteral?

@koto
Copy link
Member

koto commented Sep 13, 2021

can we simply let fromLiteral return a string when an app opt-out from fromLiteral?

I would rather avoid returning different return types from a function, API like that is hard to define types for, and requires either explicit type checking at the callsite, or leads to functional bugs.

For example, devs may expect getting a string and call String.prototype.replace on it, whereas code returns a TrustedHTML instead on some browsers, and this breaks. This is in fact exactly what happened with DOMPurify.sanitize and its TT support (which resulted in functional breakage).

I'm afraid that, to avoid such errors, at most we should be returning null or undefined (but, actually, we should be throwing instead). The code to support it though is verbose, and requires checking for support separately :/

I changed my mind - if fromLiteral is safe (it is), it should "just" be always available, without any toggling via CSP. Folks who would wish to limit it, can just delete TrustedHTML.fromLiteral.

That way a "best effort" return value would be:

foo.innerHTML = (TrustedHTML?.fromLiteral || (s => s[0]))`<p>this will be a TrustedHTML, or a string</p>`

// You can also store the tag function separately, and e.g. add some more checks for strings.

const bestLiteral = TrustedHTML?.fromLiteral || (s => {
  if (s.length > 1) {
    console.log('Please don't interpolate here'); // You can also throw.
  } 
  return s[0]
});

bestLiteral`<p>ha!</p>`

@koto
Copy link
Member

koto commented Oct 6, 2021

We had a bit of discussions internally. It generally looks great!

That said, an issue came up with TrustedHTML.fromLiteral type. While not problematic on its own, it does enable an error-prone practice of operating on HTML ignoring that HTML has multiple contexts. For example TrustedHTML.fromLiteral`<div foo="` is normalized when added to the DOM, but before that it's error-prone to use it.

In particular, concatenating TrustedHTMLs is not safe anymore, while it's reasonable to assume someone would (and thus, would shoot themselves in the foot). For example, we generally allow concatenating of our safe HTMLs.

There is a simple and relatively unobtrusive change we could add to address that. In TrustedHTML.fromLiteral, invoke a DOM parser on the input (with, say, <div> node as a context), and create an instance wrapping a string serialization of the result, obtaining a well-balanced TrustedHTML:

TrustedHTML.fromLiteral`<div foo>`.toString() // <div foo=""></div>

It's easy to polyfill with a DOMParser. It also opens up a potential internal performance optimization in browsers, that could cache the document fragment it used for parsing, and reuse that when working with a sink.

For b/w compatibility, this would not apply to existing policy-based TrustedHTMLs. There the user has full control over the types. We might need another function for creating a full document HTML if that's needed (if someone does TrustedHTML.fromLiteral`<html><body>....` we would normally just discard the body), but apart from that I see little downsides, given that this is a new function anyway. @shhnjk, wdyt?

@domenic
Copy link

domenic commented Oct 6, 2021

If you go that direction, use template instead of div as the wrapper; it's canonical for these sorts of context-free scenarios.

@shhnjk
Copy link
Member Author

shhnjk commented Oct 6, 2021

Mostly sounds good, however, I don't think concatenating TrustedHTMLs is supported today.

I've verified this in Chrome Canary.

// Devtools attached to chrome://settings
const {getTrustedHTML} = await import('chrome://resources/js/static_types.js');
const a = getTrustedHTML`a`;
console.log(a instanceof TrustedHTML); // true
const b = getTrustedHTML`b`;
console.log(b instanceof TrustedHTML); // true
const c = a + b;
console.log(c instanceof TrustedHTML); // false

Is concatenating TrustedHTMLs something that you want to support with TrustedHTML.fromLiteral or is it just a use case that you want to support safely?

@koto
Copy link
Member

koto commented Oct 6, 2021

Is concatenating TrustedHTMLs something that you want to support with TrustedHTML.fromLiteral or is it just a use case that you want to support safely?

The latter. To some extent, for example you can still shoot yourself in a foot by creating a TrustedHTML of any string via a policy, but at least the policies can be guarded.

@shhnjk
Copy link
Member Author

shhnjk commented Oct 6, 2021

Got it. That sounds good to me :)

koto added a commit to koto/trusted-types that referenced this issue Oct 21, 2021
koto added a commit to koto/trusted-types that referenced this issue Oct 21, 2021
@koto
Copy link
Member

koto commented Oct 21, 2021

@domenic I created a draft for fromLiteral in #350. Could you give it a brief look?

@shhnjk
Copy link
Member Author

shhnjk commented Nov 13, 2021

Any plan to merge the PR and start an implementation?

@koto
Copy link
Member

koto commented Nov 15, 2021

We still have some thoughts on the TrustedHTML variant, but I think the other two types are uncontroversial, the delay is mostly because I want to write the polyfill too.

github-actions bot added a commit that referenced this issue Nov 17, 2021
SHA: 49af155
Reason: push, by @koto

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to koto/trusted-types that referenced this issue Jan 12, 2022
SHA: 49af155
Reason: push, by @koto

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants