Spec is inconsistent about which strings are valid CSPs #414

Open
bakkot opened this issue Nov 22, 2019 · 0 comments

The Parse a serialized CSP algorithm says that it must be given a "serialized CSP", which is an ASCII string adhering to

serialized-policy =
    serialized-directive *( optional-ascii-whitespace ";" [ optional-ascii-whitespace serialized-directive ] )
serialized-directive =
    directive-name [ required-ascii-whitespace directive-value ]
directive-name =
    1*( ALPHA / DIGIT / "-" )
directive-value =
    *( required-ascii-whitespace / ( %x21-%x2B / %x2D-%x3A / %x3C-%x7E ) )

This does not match, for example, the string @; script-src 'none'.

However, the algorithm itself does accept that string: step 2.3 is "Let directive name be the result of collecting a sequence of code points from token which are not ASCII whitespace.", which consumes @ as the directive name. And indeed I would expect a parser to have defined behavior for all strings, which may include rejecting them. It is very strange to say that the parser must be given a string which conforms to a particular grammar, especially since consumers like the HTML spec do not first check that the strings with which they are calling it conform to the grammar.

(In fact the algorithm as specified accepts all strings.)
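
To make the mismatch concrete, here is a minimal sketch in Python. `matches_grammar` is a regex transcription of the ABNF above, and `parse_loosely` loosely follows the shape of the parsing algorithm. Only step 2.3 is quoted in this issue, so the remaining steps (splitting on ";", stripping whitespace, skipping empty tokens and duplicate names) are assumptions about the algorithm, and both function names are hypothetical.

```python
import re
import string

# ASCII whitespace: tab, LF, FF, CR, space.
WS = r"[\t\n\f\r ]"
NAME = r"[A-Za-z0-9-]+"
# directive-value: whitespace or any printable ASCII except "," and ";".
VALUE = r"[\t\n\f\r \x21-\x2B\x2D-\x3A\x3C-\x7E]*"
DIRECTIVE = rf"{NAME}(?:{WS}+{VALUE})?"
POLICY = rf"{DIRECTIVE}(?:{WS}*;(?:{WS}*{DIRECTIVE})?)*"

ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def matches_grammar(serialized: str) -> bool:
    """True if the whole string matches the serialized-policy ABNF."""
    return re.fullmatch(POLICY, serialized) is not None


def parse_loosely(serialized: str) -> dict:
    """Sketch of the parsing algorithm (steps other than 2.3 are assumed)."""
    directives = {}
    for token in serialized.split(";"):
        token = token.strip("\t\n\f\r ")
        if not token:
            continue
        parts = re.split(rf"{WS}+", token)
        # Step 2.3: the name is the leading run of non-whitespace code points.
        name = parts[0].translate(ASCII_LOWER)
        if name in directives:
            continue  # assumed: a repeated directive name is ignored
        directives[name] = parts[1:]
    return directives


policy = "@; script-src 'none'"
print(matches_grammar(policy))  # the grammar rejects the string
print(parse_loosely(policy))    # the parser still yields two directives
```

Under these assumptions the grammar rejects the string while the parser returns an `@` directive with an empty value plus `script-src 'none'`, which is the inconsistency described above.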

It is not clear to me what the intended interpretation of the string @; script-src 'none' is. Browsers seem to treat the @ as an unrecognized directive and discard it as they would any other directive, and hence still enforce the script-src 'none' part. But if we read 3.1 The Content-Security-Policy HTTP Response Header Field strictly, an HTTP header named Content-Security-Policy whose value is @; script-src 'none' is not actually a Content-Security-Policy header.


There are other places this comes up: for example, the grammar for serialized-source-list

serialized-source-list =
    ( source-expression *( required-ascii-whitespace source-expression ) ) / "'none'"
source-expression =
    scheme-source / host-source / keyword-source / nonce-source / hash-source

does not match 'none' https://example.com, which by a strict reading of the definition of img-src means that img-src 'none' https://example.com is not an img-src directive. Is that the intent? (cf #411)
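
A rough way to see why the grammar excludes that value, sketched in Python. This checks only the overall shape of serialized-source-list: it treats every token other than 'none' as an opaque source-expression and does not validate scheme, host, keyword, nonce, or hash syntax. The function name is hypothetical.

```python
import re


def has_source_list_shape(value: str) -> bool:
    """Shape-only check against serialized-source-list.

    Assumption: any token that is not 'none' is accepted as a
    source-expression; real source-expression syntax is not checked.
    """
    tokens = re.split(r"[\t\n\f\r ]+", value)
    if "" in tokens:
        # Empty input, or leading/trailing whitespace the grammar lacks.
        return False
    if len(tokens) == 1 and tokens[0].lower() == "'none'":
        return True  # the standalone "'none'" alternative
    # Otherwise every token must be a source-expression, and 'none'
    # is not one of the source-expression alternatives.
    return all(t.lower() != "'none'" for t in tokens)


print(has_source_list_shape("'none'"))
print(has_source_list_shape("https://example.com 'self'"))
print(has_source_list_shape("'none' https://example.com"))
```

The third call returns False because 'none' is only permitted as the entire list, never alongside other expressions.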


The general statement of the problem is that the current spec gives grammars for things which are more restrictive than the algorithms which are said to correspond to those grammars. I think this should be fixed, either by removing the relevant grammars from the normative specification, by loosening them, or by tightening the algorithms which correspond to them (which would probably be a breaking change).
