This repository has been archived by the owner. It is now read-only.

review site for html injections #722

Closed
chadwhitacre opened this Issue Mar 12, 2013 · 52 comments

Comments

Projects
None yet
6 participants
@chadwhitacre
Contributor

chadwhitacre commented Mar 12, 2013

I just discovered that I wasn't escaping statements. :-(

I'm not sure how that happened, but it did. We need to review the whole site for html injections.


Want to back this issue? Place a bounty on it! We accept bounties via Bountysource.

@wilkie

This comment has been minimized.

Contributor

wilkie commented Mar 12, 2013

In investigating the issue 8506070 caused, I've audited participant statement usage throughout the site.

User Statement

stored non-escaped

statement.json: wrapped!
index.html: meta og:description: SERIOUSLY BAD: not-escaped (I confirmed in a development environment that I can indeed close the tag and inject a <link> or <meta> etc in <head>, which is obvi-bad)
index.html: displaying the statement: escaped twice! see #723 This doesn't seem to be a place where escape is needed, oddly.
about/stats: escaped for every user
about/goals.html: BAD: not escaped
templates/profile-edit.html: intentionally not escaped

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Mar 14, 2013

Fixed in b4bdc32: index.html: meta og:description: SERIOUSLY BAD: not-escaped (I confirmed in a development environment that I can indeed close the tag and inject a <link> or <meta> etc in <head>, which is obvi-bad)

Fixed in 959e72a: index.html: displaying the statement: escaped twice! see #723 This doesn't seem to be a place where escape is needed, oddly.

Fixed in cca632d: about/goals.html: BAD: not escaped

Good to close?

@wilkie

This comment has been minimized.

Contributor

wilkie commented Mar 14, 2013

well, the issue suggests looking for html injections throughout the site. I only focused on statement fields, so technically the issue is not yet fulfilled.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Mar 14, 2013

Good point. :)

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Apr 3, 2013

@buttscicles caught a few more of these (thanks!). I'm beginning to realize now why Tornado moved to HTML-escaping by default in 2.x. This feels like whack-a-mole. Gittip uses Aspen's default templating language, Tornado 1.x. We have a ticket over on Aspen to upgrade to 2.x (AspenWeb/pando.py#158). Picking up over there ...

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 5, 2014

Let's take this up now that we've migrated to Jinja2 (#1783).

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 30, 2014

@clone1018 Are we good to close now that we're on Jinja2?

@clone1018

This comment has been minimized.

Contributor

clone1018 commented Jan 30, 2014

I wouldn't say we're good but we're good for now.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 30, 2014

@clone1018 Would you be willing to conduct a review so we can close this ticket?

@clone1018

This comment has been minimized.

Contributor

clone1018 commented Jan 30, 2014

@whit537 I wouldn't feel completely safe until we change Jinja2 to being autoescape by default

http://jinja.pocoo.org/docs/extensions/#autoescape-extension

@clone1018

This comment has been minimized.

Contributor

clone1018 commented Jan 30, 2014

I'm in favor of closing this specific ticket for now, we can reticket any other issues and if escaping becomes a problem we can ticket the autoescape extension.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 30, 2014

I'm leery of autoescape, because the rules for escaping differ with the syntax you're templating into. If we're inserting into an href attribute like href="{{ foo }}" then we have different escaping needs than if we are inserting into an HTML context. Auto-escaping can give us a false sense of security that everything is escaped properly when we really still need to be aware of every single place in templates where we insert stuff and whether or not we're doing it right.

Even with autoescape we still need this audit.

@seanlinsley

This comment has been minimized.

Contributor

seanlinsley commented Jan 30, 2014

Could you go into detail @whit537 about the sorts of escaping rules that could cause trouble? In general, automatically escaping everything is a boon to security because it gives you a much smaller surface area to check for holes.

@chadwhitacre

This comment has been minimized.

@seanlinsley

This comment has been minimized.

Contributor

seanlinsley commented Jan 30, 2014

So you're saying... it might not use the right type of escaping for the given context?

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 30, 2014

Yup. We got bit by this once already, iirc.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 30, 2014

Basically, we need an audit based on that cheat sheet. Whether or not we use autoescaping is orthogonal.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Jan 30, 2014

Really, it would be great to have some tooling that presented a list of all the places in templates where we insert stuff, so we can review and check them off one by one.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Nov 3, 2014

Now that we have i18n we should consider including that in a review. IRC

@benhc123

This comment has been minimized.

Contributor

benhc123 commented Nov 11, 2014

I'll start taking a look at this. Thanks.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Nov 12, 2014

IRC

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

Retort:

  • We're only using Jinja2 for HTML.
  • But it doesn't hurt to use it when it's not needed.
  • Hmmm ... really?
@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

It just seems increasingly stupid to rely on "a human knows" for the 544 places where we are potentially vulnerable to injection. Performance will have to be really bad to justify not switching to autoescaping. No?

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

Autoescape interacts with i18n somehow:

screen shot 2015-02-04 at 1 15 30 pm

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

Also if [newstyle gettext calls] are used in an autoescaping environment they better support automatic escaping.

http://jinja.pocoo.org/docs/dev/extensions/#newstyle-gettext

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

So it looks like implementing autoescaping will take some work. It's not a trivial quick answer to this ticket.

@Changaco

This comment has been minimized.

Contributor

Changaco commented Feb 4, 2015

  • We're only using Jinja2 for HTML.

Not true, we also use it for plain text in email simplates.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

P.S. The place to implement autoescaping would be in aspen_jinja2, by adding autoescape=True, extensions['jinja2.ext.autoescape'] to the Environment instantiation. Then, another thing to work around is that we don't want to autoescape the CSS and JS assets. The workaround for that is:

[---]
{% autoescape false %}{{ css }}{% endautoescape %}

If the application enables the Autoescape Extension one can activate and deactivate the autoescaping from within the templates.

http://jinja.pocoo.org/docs/dev/templates/#autoescape-extension

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

Not true, we also use it for plain text in email simplates.

Okay, so if we go with autoescaping I guess we'd use the {% autoescape false %} mechanism to turn it off for the text/plain email parts.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

@Changaco Can you speak to whether we're using newstyle gettext calls?

@Changaco

This comment has been minimized.

Contributor

Changaco commented Feb 4, 2015

Sort of, we use custom functions.

chadwhitacre added a commit that referenced this issue Feb 4, 2015

Here's a helper for #722
Figured this would be worth having for now until we switch to autoescape
(#3152).

chadwhitacre added a commit that referenced this issue Feb 4, 2015

Here's a helper for #722
Figured this would be worth having for now until we switch to autoescape
(#3152).
@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 4, 2015

Here's a slightly differently-formatted list:

https://gist.github.com/whit537/03b1e5f88fcf38c3dd87

Looks like we have 527 replacements on master right now.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

I separated out the places where we are just doing i18n ({{ _("Greetings, program!") }}).

https://gist.github.com/whit537/007d96927377a7ed0419

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

Category N
High Risk 336
Simple i18n 139
Assets 23
Escaped 29

https://gist.github.com/whit537/0a60f06b87ecc332f4fe

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

Alright, I still don't want to go through 336 of these. :-)

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

The next step down the autoescaping path is to benchmark performance. If performance isn't "really a lot" worse with autoescaping then we should just use that. It'll take a chunk of work to switch over though due to i18n so benchmarking probably makes sense to make sure it'll be worth it.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

If performance is really as atrocious as promised then we should review the 336 places and make it part of our PR review habit.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

Here's a script I wrote to help with benchmarking:

https://gist.github.com/whit537/352a6b02d5f644720007

Gonna do some benchmarks now.

@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py | grep " seconds "
11.9522500038 seconds without autoescaping
12.4190990925 seconds *with* autoescaping
[gratipay] $
#!/usr/bin/env python
"""This is a helper script for benchmarking page performance re: #722.

Set this in local.env to avoid a canonicalizing redirect:

    CANONICAL_HOST=

Then run this script like so:

    [gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py

"""
import time
import aspen_jinja2_renderer
from gratipay.testing import Harness


def autoescaping_compile_meta(self, configuration):
    loader = None
    if configuration.project_root is not None:
        # Instantiate a loader that will be used to resolve template bases.
        loader = aspen_jinja2_renderer.FileSystemLoader(configuration.project_root)
    return aspen_jinja2_renderer.Environment( loader=loader
                                            , autoescape=True
                                            , extensions=['jinja2.ext.autoescape']
                                             )

start = time.time()
for i in range(100):
    Harness.client.GET('/')
print("{} seconds without autoescaping".format(time.time() - start))

aspen_jinja2_renderer.Factory.compile_meta = autoescaping_compile_meta

start = time.time()
for i in range(100):
    Harness.client.GET('/')
print("{} seconds *with* autoescaping".format(time.time() - start))
@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

10x

[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py | grep " seconds "
1.3902361393 seconds without autoescaping
1.18036794662 seconds *with* autoescaping
[gratipay] $
@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py 10 | grep " seconds "
1.29223489761 seconds without autoescaping
1.17796278 seconds *with* autoescaping
[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py 100 | grep " seconds "
11.9243550301 seconds without autoescaping
12.4741439819 seconds *with* autoescaping
[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py 1000 | grep " seconds "
124.910935163 seconds without autoescaping
132.204340935 seconds *with* autoescaping
[gratipay] $
#!/usr/bin/env python
"""This is a helper script for benchmarking page performance re: #722.

Set this in local.env to avoid a canonicalizing redirect:

    CANONICAL_HOST=

Then run this script like so:

    [gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py

"""
import sys
import time

import aspen_jinja2_renderer
from gratipay.testing import Harness


def autoescaping_compile_meta(self, configuration):
    loader = None
    if configuration.project_root is not None:
        # Instantiate a loader that will be used to resolve template bases.
        loader = aspen_jinja2_renderer.FileSystemLoader(configuration.project_root)
    return aspen_jinja2_renderer.Environment( loader=loader
                                            , autoescape=True
                                            , extensions=['jinja2.ext.autoescape']
                                             )

try:
    n = int(sys.argv[1])
except IndexError:
    n = 10

start = time.time()
for i in range(n):
    Harness.client.GET('/')
print("{} seconds without autoescaping".format(time.time() - start))

aspen_jinja2_renderer.Factory.compile_meta = autoescaping_compile_meta

start = time.time()
for i in range(n):
    Harness.client.GET('/')
print("{} seconds *with* autoescaping".format(time.time() - start))
@chadwhitacre

This comment has been minimized.

Contributor

chadwhitacre commented Feb 5, 2015

Alright, performance concerns look like FUD to me.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.