proposal: html/template: add a hardened version of it to the standard library #27926
Comments
This looks great! One other feature I miss from Soy is automatic CSP nonce-injection for <script> tags. Any chance safehtml/template could support this? Or at the very least, take the opportunity to ensure that we don't rule it out in the API.
We could certainly consider doing this for
As @stjj89 pointed out, we are also working on the auto-noncing step and on a package that ensures a strict CSP is always set. Since nonces are per-request contextual data and none of the template packages currently has an API that uses context, we are discussing how this would best fit in the Go ecosystem. CC @FiloSottile, who is reviewing the design for that.
I feel like there's a new Best Way to generate HTML every year or so. Why does this need to live in the standard library where we can't change anything? It's probably best if it just lives on GitHub somewhere and is properly versioned so people can upgrade to new best practices over time and you could make breaking changes.
I can't say with certainty how the best practices for preventing HTML code-injection attacks will change a few years from now, but historically, large paradigm changes like this haven't happened that often. @mikesamuel wrote I don't foresee this paradigm (i.e. safe-by-design string wrappers recognized by an autoescaping template system) changing drastically any time soon. We might expand the That all being said, I don't think it will be a terrible idea to place
Even if safehtml/template is intended for the standard library eventually, it should start somewhere else. That could be in the Go project or not, depending on who is going to do the core development. If this was a security team project it could just be github.com/google/safehtml. Or we could put it in a Go repo somewhere (x/net/html/...?). Probably the latter is better.
Marking this declined but really it just needs to happen somewhere else first.
Ok, that makes sense. I'll talk to the Go team internally to decide if this is a good fit for a Go repo.
Sorry for being late to this party. Fwiw, I think safehtml is great and would recommend people use it however it is provided. I don't know what @bradfitz has in mind re "best new ways", but we in security engineering have mostly managed to avoid large-scale breaking changes, and Samuel has put a lot of thought into making safehtml/template a good migration target while updating to the realities of the modern web. Any code that aims to preserve broad security properties has to change as the threat environment changes. Since html/template shipped, there have been changes to the threat environment:
There's a tendency to try to separate code into core library parts and user-configurable policy parts. I did that with the safe strings in html/template which turned out to be a source of footguns. How might we manage change when
Might it be easier to avoid breaking changes if DSLs like templates were go-fixable?
I agree that the model of whitelist-based CSPs has failed. Nonce-based CSPs, on the other hand, can provide a strong mitigation and can be rolled out with very little effort if the templating system supports auto-noncing.
@lweichselbaum I agree that nonce-based CSPs are a useful mitigation. When I call it a failure, I'm comparing it to the expectations of years ago, when many, including me, thought it would end XSS as the most common web security vulnerability.
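The nonce-based CSP idea raised in the comments above can be sketched with the standard library alone. This is a minimal illustration of threading a per-request nonce by hand, which is the step an auto-noncing template system would take over; the names page, newNonce, cspHeader and handler, the /static/app.js path, and the exact policy string are assumptions made for this example and are not part of the proposal.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"html/template"
	"log"
	"net/http"
	"net/http/httptest"
)

// page is a hypothetical template; the nonce has to be passed in by hand.
var page = template.Must(template.New("page").Parse(
	`<script nonce="{{.Nonce}}" src="/static/app.js"></script>`))

// newNonce returns a fresh, unguessable value for a single response.
func newNonce() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// cspHeader builds an illustrative nonce-based policy (assumed, not prescribed).
func cspHeader(nonce string) string {
	return "script-src 'nonce-" + nonce + "'; object-src 'none'; base-uri 'none'"
}

// handler sets the policy and renders the page with the same nonce.
func handler(w http.ResponseWriter, r *http.Request) {
	nonce, err := newNonce()
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Security-Policy", cspHeader(nonce))
	if err := page.Execute(w, struct{ Nonce string }{nonce}); err != nil {
		log.Printf("render: %v", err)
	}
}

func main() {
	// Exercise the handler in-process instead of starting a server.
	rec := httptest.NewRecorder()
	handler(rec, httptest.NewRequest("GET", "/", nil))
	fmt.Println(rec.Header().Get("Content-Security-Policy"))
	fmt.Println(rec.Body.String())
}
```

Because the nonce is per-request data, every handler and every template has to carry it explicitly in this sketch, which is the API-shape question the comments above discuss.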
Overview
Add a hardened version of html/template to the Go standard library. This new template package will incorporate security engineering best practices employed within Google to guarantee, with a high degree of confidence, that the HTML rendered by the template system is safe against code injection.
Background
Package html/template implements data-driven templates for generating HTML output that is safe against code injection. html/template is a "contextually auto-escaping template engine": it treats template data as untrusted plain text and escapes it so that it can be safely embedded in its HTML output. The kind of escaping applied to the data depends on the context that the data appears in (e.g. HTML, JS, CSS, URI).
Issues with html/template
While html/template is significantly better than text/template, string-formatting functions (e.g. fmt.Sprintf), and ad-hoc string concatenation for generating HTML safe against code injection, it has several shortcomings, which I describe in the following sections.
Typed strings
html/template provides developers a set of typed strings (e.g. template.HTML, template.JS) to flag known-safe template data that is intended to be used without escaping or validation. This mechanism is necessary to accommodate use cases in real-world applications with complex dataflows, where developers want HTML markup, trusted data (e.g. programmer-controlled strings), or already-sanitized content generated in one part of their application to be preserved when it is rendered in HTML templates in remote parts of the application.
Unfortunately, these typed strings are easily misused. The most obvious way to misuse these types is to create them from arbitrary, dynamic strings. This essentially disables contextual auto-escaping, thereby negating the benefit of using html/template in the first place. For example:
See here and here for real-world examples of code that explicitly opts out of auto-escaping.
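The misuse described above can be reproduced in a few lines. The following is a minimal sketch, not the code from the original proposal: the template text, the helper names render and renderTrusted, and the sample input are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// page has a single interpolation point in an HTML text context.
var page = template.Must(template.New("page").Parse(`<p>{{.}}</p>`))

// exec runs the template with v and returns the rendered markup.
func exec(v interface{}) string {
	var sb strings.Builder
	if err := page.Execute(&sb, v); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

// render passes the input as a plain string, so html/template escapes it.
func render(input string) string {
	return exec(input)
}

// renderTrusted converts the input to template.HTML first, which tells
// html/template to emit it verbatim. This is the opt-out discussed above.
func renderTrusted(input string) string {
	return exec(template.HTML(input))
}

func main() {
	// Hypothetical attacker-controlled value.
	comment := `<script>alert(1)</script>`

	fmt.Println(render(comment))
	// <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

	fmt.Println(renderTrusted(comment))
	// <p><script>alert(1)</script></p>
}
```

The only difference between the two calls is the type conversion, which is why each such conversion needs the careful review discussed in this section.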
In other cases, developers make more of an effort to sanitize strings before converting them into typed strings. Unfortunately, this is an error-prone process that often incorrectly duplicates the work that html/template already does under the hood. For example:
While the generated button element might appear to be safe since msg is JavaScript-escaped, it is vulnerable to XSS due to the lack of HTML-escaping. When the browser evaluates this markup, it first HTML-unescapes the value of the onclick attribute before evaluating the JavaScript expression. Therefore, a value of msg like &#39;);attackScript();// will be HTML-unescaped and evaluated as alert('');attackScript();//'), which results in the execution of the attacker's script.[1]
See here for a real-world example of code that does not validate or escape untrusted URLs before embedding them into a template.HTML value.
The lack of constraints on producing html/template typed string values seems to encourage developers to move more HTML-generation logic outside of templates into error-prone, hand-written routines. See real-world examples here and here.
Each conversion into an html/template typed string represents a potential vulnerability, and therefore must be carefully reviewed by a reviewer knowledgeable about the subtleties of HTML-injection bugs. The reviewer must ensure that the string being converted is in fact safe to use in the type's corresponding context for all possible values of that string. Asserting this property is difficult when the data flow into the typed string conversion is sufficiently complex. Moreover, these typed string conversions make it possible for future changes in one (upstream) part of the application to cause security bugs in another (downstream, remote) part of the application. Therefore, the more often typed strings are used in a Go program, the more difficult it is to guarantee that it produces HTML that is free of code-injection vulnerabilities.
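The button scenario from this section, where a value is JavaScript-escaped but not HTML-escaped, can be simulated as follows. This is a sketch under stated assumptions: naiveJSEscape is a hypothetical hand-written escaper standing in for whatever routine an application might use, and html.UnescapeString approximates the entity decoding a browser applies to an attribute value before running it as script.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// naiveJSEscape is a hypothetical hand-written escaper. It handles
// backslashes and quotes but knows nothing about HTML entities.
func naiveJSEscape(s string) string {
	return strings.NewReplacer(`\`, `\\`, `'`, `\'`, `"`, `\"`).Replace(s)
}

// onclickValue builds the attribute value by string concatenation.
func onclickValue(msg string) string {
	return "alert('" + naiveJSEscape(msg) + "')"
}

// buttonMarkup is the string an application might then convert to template.HTML.
func buttonMarkup(msg string) string {
	return `<button onclick="` + onclickValue(msg) + `">Show</button>`
}

// scriptSeenByBrowser approximates the browser's behavior: the attribute
// value is HTML-unescaped before it is evaluated as JavaScript.
func scriptSeenByBrowser(msg string) string {
	return html.UnescapeString(onclickValue(msg))
}

func main() {
	// A literal quote is neutralized by the JS escaping.
	fmt.Println(scriptSeenByBrowser(`');attackScript();//`))
	// alert('\');attackScript();//')

	// An HTML-encoded quote passes through the JS escaper untouched
	// and becomes a real quote once the browser decodes the attribute.
	payload := `&#39;);attackScript();//`
	fmt.Println(buttonMarkup(payload))
	fmt.Println(scriptSeenByBrowser(payload))
	// alert('');attackScript();//')
}
```

The second payload ends the string literal early, so attackScript() runs as a separate statement; escaping for one context does not cover the other.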
Single URL context
html/template does not differentiate between URLs that load code and those that do not. This has significant security implications. For example, the template:
will produce the following HTML output when URL is "http://www.untrustedsite.com/script.js":
html/template did not filter out the URL value because it contains the benign http scheme. While http://www.untrustedsite.com/script.js is safe to navigate to as a link (i.e. since the navigation will not cause untrusted, same-origin script execution in the browser), it is not safe to load an executable script from (i.e. since it will be loaded over HTTP and the contents of the script are not trusted).
JavaScript and CSS parsing
html/template allows template data to be interpolated into JavaScript (JS) and Cascading Style Sheet (CSS) contexts. It parses the JS and CSS surrounding the template data in order to contextually escape the data. For example, the following template:
is rendered by html/template as:
This functionality is problematic for two reasons. The first is that JS and CSS parsing is error-prone. JS and CSS are rapidly evolving languages that our parser might not always handle correctly; layering parsers for these two languages on top of our mixed HTML-template language parser introduces more complexity and points of failure to the package.
The second issue is that this feature encourages the security anti-pattern of using inline scripts and stylesheets. This prevents the adoption of strict Content Security Policy (CSP), where all scripts loaded by the browser undergo explicit validation before being executed. See here for more details on why inline scripts are dangerous.
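Both html/template behaviors discussed in the last two sections, the single URL context and interpolation into inline JS, can be observed with a short program. The template texts, field names, and helper names below are assumptions made for this sketch rather than the snippets from the original proposal.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

var (
	// A URL interpolated where the browser will load and execute code.
	scriptSrcTmpl = template.Must(template.New("src").Parse(
		`<script src="{{.URL}}"></script>`))
	// Data interpolated into an inline script.
	inlineJSTmpl = template.Must(template.New("js").Parse(
		`<script>var msg = {{.Msg}};</script>`))
)

type data struct {
	URL string
	Msg string
}

// exec renders t with d and returns the output.
func exec(t *template.Template, d data) string {
	var sb strings.Builder
	if err := t.Execute(&sb, d); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

func renderScriptSrc(url string) string { return exec(scriptSrcTmpl, data{URL: url}) }

func renderInlineJS(msg string) string { return exec(inlineJSTmpl, data{Msg: msg}) }

func main() {
	// The http scheme is considered benign, so the URL is kept even
	// though it is used to load a script.
	fmt.Println(renderScriptSrc("http://www.untrustedsite.com/script.js"))
	// <script src="http://www.untrustedsite.com/script.js"></script>

	// Only schemes such as javascript: are replaced with a placeholder.
	fmt.Println(renderScriptSrc("javascript:alert(1)"))
	// <script src="#ZgotmplZ"></script>

	// In a JS context the value is emitted as a quoted JS string.
	fmt.Println(renderInlineJS("hello"))
	// <script>var msg = "hello";</script>
}
```

The first call shows the gap: a URL that is acceptable as a link target is accepted unchanged as a script source.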
Blacklist-based sanitization
html/template only understands the semantics of a certain subset of HTML elements and attributes. Elements and attributes outside of this set are assumed to have no special semantics.[2]
This sanitization policy is too permissive. Elements or attributes that are not understood by the html/template escaper may have semantics that are security-sensitive, particularly custom elements and those introduced in future revisions of the HTML standard. Properly sanitizing these elements and attributes in the future will require backward-incompatible changes to html/template.
Dynamic template sources
html/template allows templates to be parsed from arbitrary strings and filenames. This makes the templates themselves susceptible to injection attacks. For example:
If an attacker can fully or partially control the values of bodyTmpl or filename, then the attacker can control the templates being loaded and hence the HTML output. Such an attack completely undermines the assumption that template authors are trustworthy.
Proposed solution
Add a new safehtml/template library that addresses the above issues. In particular, this package will:
Replace html/template typed strings with a set of types with a richer, but constrained API. These types will live in a separate package safehtml, which will provide a safe-by-design API for constructing values of these types. The following sketch illustrates a subset of this API:
Package safehtml should provide constructors that satisfy most common use cases for constructing known-safe values outside of a template system. Values of these types carry strong security guarantees about the strings they encapsulate; when passed around a Go program, they enable developers to depend on these properties without having to reason about whole-program dataflows. Not surprisingly, safehtml/template will also produce safehtml.HTML. This allows values from separately evaluated HTML templates to be composed, while maintaining strong security guarantees.
For the minority of use cases that the safehtml API does not accommodate, we will provide a separate safehtml/uncheckedconversions package that converts plain strings into safe types:
These unsafe constructors will live in their own separate package, much like how memory-unsafe operations all live in package unsafe, and crypto APIs prone to misuse live in package crypto/subtle. This makes code easier to security-review (i.e. "if the program doesn't import uncheckedconversions, the HTML produced by safehtml/template is definitely safe"), makes it easier to restrict the use of these functions (e.g. build systems like Bazel allow package-level visibility restrictions), and will hopefully discourage developers from unnecessarily reaching for these conversions (i.e. by requiring them to import the package and call a dangerous-sounding function).
Add different sanitization contexts for URLs that load code and those that do not. These contexts map to the safehtml.TrustedResourceURL and safehtml.URL types described above. The former type of URL will be validated more strictly than the latter.
Disallow template data from being interpolated into JS and CSS contexts. The template parser will no longer attempt to parse JS or CSS. We will allow safehtml.Script and safehtml.StyleSheet values to be used in these contexts, but the constructor API around these types will be deliberately constrained. We might potentially add a switch that causes safehtml/template to disallow inline scripts and stylesheets completely, even if they appear in the programmer-controlled template text. This switch will help developers ensure that all the markup produced by their template is CSP-compatible.
Use whitelist-based sanitization. Elements or attributes not explicitly understood by the sanitizer will be disallowed by default. Developers must explicitly whitelist these attributes using an API along the lines of:
Provide a safe-by-design API for loading template text. This API will only allow templates to be loaded from programmer-controlled strings (i.e. untyped string constants) or resources under application control (e.g. environment variables, command-line flags). The following is a snippet of this new template-loading API:
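One way to restrict template text to programmer-controlled strings is to accept a parameter of an unexported string type, so that callers in other packages can only supply untyped string constants. The sketch below contrasts that with the dynamic-source problem described earlier. It is built on html/template for illustration only; stringConstant, ParseConstant, account, and the two render helpers are hypothetical names, not the proposed API.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// stringConstant is unexported. If ParseConstant lived in its own package,
// outside callers could not name this type, so the only values they could
// pass would be untyped string constants. (Inside this single-file sketch
// everything is in package main, so the restriction is not enforced here.)
type stringConstant string

// ParseConstant is a hypothetical constant-only template loader.
func ParseConstant(name string, text stringConstant) (*template.Template, error) {
	return template.New(name).Parse(string(text))
}

type account struct {
	Name   string
	APIKey string
}

// renderDynamic splices user input into the template text itself.
func renderDynamic(userInput string, acct account) (string, error) {
	bodyTmpl := "<p>Hello, " + userInput + "</p>" // attacker-influenced template
	t, err := template.New("body").Parse(bodyTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	err = t.Execute(&sb, acct)
	return sb.String(), err
}

// renderConstant keeps the template text constant and passes input as data.
func renderConstant(userInput string, acct account) (string, error) {
	t, err := ParseConstant("body", "<p>Hello, {{.Name}}</p>")
	if err != nil {
		return "", err
	}
	acct.Name = userInput
	var sb strings.Builder
	err = t.Execute(&sb, acct)
	return sb.String(), err
}

func main() {
	acct := account{APIKey: "s3cr3t"}
	injected := "{{.APIKey}}" // hypothetical attacker input

	out, _ := renderDynamic(injected, acct)
	fmt.Println(out) // <p>Hello, s3cr3t</p>

	out, _ = renderConstant(injected, acct)
	fmt.Println(out) // <p>Hello, {{.APIKey}}</p>
}
```

In the dynamic version the input is executed as a template action and leaks a field the author never meant to render; in the constant version the same input is treated as inert data.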
The APIs described above are explained in great detail in Christoph Kern's Securing the Tangled Web (see "Strictly Contextually Auto-Escaping Template Engines", "Security Type Contracts", and "Unchecked Conversions"). Within Google, the security team has deployed strict contextual-autoescaping template systems (e.g. Closure templates) and safe HTML type implementations (e.g. Java and JavaScript types) across several different languages and frameworks. These packages have significantly decreased the incidence of XSS bugs without significantly impacting developer workflow.
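The type-contract idea referenced above can be illustrated with a small, self-contained sketch. It is not the safehtml package: the HTML type, the constructors HTMLEscaped and HTMLConcat, and UncheckedHTML (standing in for a function that would live in a separate uncheckedconversions-style package) are hypothetical and only show the shape of the design.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// HTML is a hypothetical safe-by-design type. Its field is unexported, so
// code outside the defining package can obtain values only through the
// constructors below, each of which upholds the "safe to render" contract.
type HTML struct {
	str string
}

// String returns the encapsulated markup.
func (h HTML) String() string { return h.str }

// HTMLEscaped treats text as plain text and escapes it.
func HTMLEscaped(text string) HTML {
	return HTML{html.EscapeString(text)}
}

// HTMLConcat joins values that already satisfy the contract.
func HTMLConcat(parts ...HTML) HTML {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p.str)
	}
	return HTML{sb.String()}
}

// UncheckedHTML is the escape hatch. In the design described above it would
// sit in its own package so that every use is easy to find and review.
func UncheckedHTML(markup string) HTML {
	return HTML{markup}
}

func main() {
	userText := "<i>x</i>" // untrusted input
	out := HTMLConcat(
		UncheckedHTML("<b>"), // reviewed, programmer-controlled markup
		HTMLEscaped(userText),
		UncheckedHTML("</b>"),
	)
	fmt.Println(out) // <b>&lt;i&gt;x&lt;/i&gt;</b>
}
```

A reviewer then only has to audit the calls to the unchecked constructor rather than every place a value of the type flows.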
Internal implementations of package safehtml and safehtml/template were deployed within Google a year ago. Currently, approximately 691 and 1153 Go packages use safehtml/template and safehtml respectively. Only about 15 of these packages use uncheckedconversions, all of which have been manually reviewed by our security team. Consequently, we have developed a high degree of confidence that the current API is usable and meets most developers' use cases.
Since safehtml/template is stricter than html/template by design, the features of the former cannot be easily integrated into the latter without making backward-incompatible changes.
Open Questions
Should safehtml/template even be in the standard library? We already have html/template and text/template. Perhaps adding yet another template library will confuse users and bloat the standard library. safehtml/template could potentially live in golang.org/x/tools or a separate GitHub repo altogether. Yet another option is to add all of this as optional functionality in html/template that can be enabled by a flag.
What should the safehtml types look like? The current internal implementation of safehtml is customized for the ways that Google Go programmers generate HTML and HTML-related values, which might not translate well to external use cases. Since adding new constructors to package safehtml is backwards-compatible, we can start by releasing the library with its current, field-tested API and respond to feature requests as external users adopt the package.
The functions in uncheckedconversions are essentially equivalent to the easily-misused typed strings in html/template, except with spookier-sounding names. Within Google, we use our build system to restrict the use of uncheckedconversions; any developers attempting to import the package must receive code-review approval from a security team member. Without these restrictions, will most external Go developers follow the path of least resistance and misuse uncheckedconversions, or will they adopt the principled approach and carefully review each use of those functions, preferring package safehtml constructors wherever possible?
Is migration from html/template to safehtml/template automatable? safehtml/template is inherently stricter than html/template, so I expect that many migrations will require manual refactoring and reasoning about program dataflows. However, we might be able to write a go fix that performs all automatable migrations, and lists the remaining areas that require manual attention.
[2] html/template actually uses heuristics to infer the semantics of non-blacklisted attributes. If this inference fails, the attribute is assumed to have no special semantics.