Parsing and Manipulation Library for Content Security Policy
This is a Python library to parse, manipulate, compare, and generate policies according to the most important features of the version 1.1 draft of Content Security Policy as of 11 February 2014. It is mostly useful to analyse violation reports submitted by browsers using CSP's
Overview of the library
The library contains higher-level objects that represent CSP's main concepts:
URI represents URIs of various types, including regular http(s) URLs, and a few special URIs with schemes such as
chrome-extension, for instance.
SourceExpression and its subclasses indicate sources from which resources may be loaded.
Directive contains a set of whitelisted source expressions for a specific resource type, such as all locations from which images may be loaded.
Policy is a set of directives, at most one for each resource type, and specifies for a protected document the locations from which included resources of certain types may be loaded.
Report corresponds to violation reports that browsers can send back to a web site under CSP's
report-uri feature when a policy is violated. It contains various fields that specify the details of the violation, such as the policy of the site, the violated directive, and the URI of the resource that violated the policy. Additionally,
LogEntry is a data type that we added to represent additional information about a report received at a web site, such as the timestamp and the IP address and user agent string of the browser that sent the report.
All data types are immutable. They implement the standard hash and equality methods so that they can be used intuitively in Python's data structures such as lists and sets. For instance, two lists containing the same policies (even if they are not the same
Policy instance) will compare as equal. The
str(.) methods will convert the objects into the string representation used in the CSP standard. Each data type also has an associated parser to read from strings. Instead of returning
None or throwing an exception, most methods will instead return a special singleton instance
INVALID in case of errors.
Collecting violation reports
src/examples/csp.cgi contains a sample CGI script that can receive CSP violation reports from browsers and store them into files.
We recommend to send the following CSP headers to browsers:
Content-Security-Policy-Report-Only: default-src 'none'; script-src 'unsafe-eval' 'unsafe-inline'; object-src 'none'; style-src 'unsafe-inline'; img-src 'none'; media-src 'none'; frame-src 'none'; font-src 'none'; connect-src 'none'; report-uri http://example.com/csp.cgi?type=regular Content-Security-Policy-Report-Only: default-src *; script-src * 'unsafe-inline'; style-src * 'unsafe-inline'; report-uri http://example.com/csp.cgi?type=eval Content-Security-Policy-Report-Only: default-src *; script-src * 'unsafe-eval'; style-src *; report-uri http://example.com/csp.cgi?type=inline
They operate in report-only mode, that is, the policy is only simulated, not enforced, and will not cause the web site to malfunction. The three headers should all be sent simultaneously. They are necessary to capture different types of violations (violation reports as currently sent by browsers cannot distinguish
Example: Generate a policy
The following sample code generates a policy that whitelists every resource that caused a violation report in
inputfile, assuming that the reports were collected with a methodology similar to the one outlined above:
from csp.tools.fileio import LogEntryDataReader from csp.policy import Policy inputfile = "tests/csp/data/sample-logentries.dat" fin = LogEntryDataReader(True) fullPolicy = None def handleEntry(entry): global fullPolicy newPolicy = entry.generatePolicy() if newPolicy is Policy.INVALID(): print "generated invalid policy from log entry '%s'" % str(entry) return if fullPolicy is None: fullPolicy = newPolicy else: fullPolicy = fullPolicy.combinedPolicy(newPolicy) fin.load(inputfile, handleEntry) print "Generated policy with full paths: %s" % str(fullPolicy) print "" print "Generated policy without paths: %s" % str(fullPolicy.withoutPaths())
In practice, additional steps are necessary to filter the reports before a policy is derived. Furthermore, the resulting policy needs to be processed manually in order to remove any directives that might be due to attacks or undesirable resources injected into the web site. The directive
default-src 'none' should be added to the policy so that resources not explicitly allowed will be forbidden (the default behaviour of CSP is to assume
default-src * when there is no
default-src directive specified in the policy).
This library has no external dependencies. To run the tests, however, you'll need
When generating a policy from violation reports as in the example above, it is possible that the resulting policy cannot be parsed by this library when it uses a scheme that is not listed in
defaults.schemePortMappings. The workaround is to add the unknown scheme to these data structures.
Some script-related reports include the value
blobfor the field
blocked-uri. This library incorrectly parses this value as the host name of an URI.