New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for HTTP meta-equiv headers. #264
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions, but generally looks good.
Without knowledge of the project, I suspect that the test coverage here is inadequate, and that coverage would fall substantially as a result of this PR. Might be worth further looking into.
""" | ||
|
||
# Clean out all the junk | ||
csp_string = csp_string.replace('\r', '').replace('\n', '').strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably fine, but personally I'd treat it like an array and use csp_string.splitlines()
; it mirrors the nature of the data better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually shouldn't be split across lines. Carriage returns should be ignored, as it it's ideally all a single line. I've seen a lot of junk from various webservers, so this is my attempt to work aroundi t.
httpobs/scanner/analyzer/headers.py
Outdated
# So technically the shortest directive is img-src, so lets just assume that | ||
# anything super short is invalid | ||
if len(csp_string) < 6 or csp_string.isspace(): | ||
raise ValueError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the use of a magic number here. I'd suggest defining it as a constant, and comparing against
SHORTEST_DIRECTIVE = 'img-src'
SHORTEST_DIRECTIVE_LENGTH = len(SHORTEST_DIRECTIVE)
Then in addition to having more obvious and legible code, you can import these and use them as comparators for tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thank you.
httpobs/scanner/analyzer/headers.py
Outdated
|
||
# Technically the path part of any source is case-sensitive, but since we don't test | ||
# any paths, we can cheat a little bit here | ||
values = set([_.lower() for _ in entry[-1].split()]) if len(entry) > 1 else {'\'none\''} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit, but maybe give _
a meaningful name? It makes the line awkwardly long, but for production Python code I try to avoid single-char variable names, even in comprehensions, as it has the potential to conflict with pdb.set_trace()
commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to source
, as defined in the CSP specification. Thanks!
<body> | ||
|
||
</body> | ||
</html> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing the trailing line endings on all these files.
for source in script_src) and '\'unsafe-inline\'' in script_src: | ||
script_src.remove('\'unsafe-inline\'') | ||
|
||
# Now to make the piggies squeal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐽
httpobs/scanner/analyzer/headers.py
Outdated
# While technically valid in that you just use the first entry, we are saying that repeated | ||
# directives are invalid so that people notice it | ||
if directive in csp: | ||
raise ValueError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're raising ValueError
s for multiple reasons, I'd suggest doing one of two things:
- Raising instances of
ValueError
with amessage
, e.g.raise ValueError('Repeated directives are invalid')
- More complicated but probably more correct: create a series of custom error classes for each possible exception, so in the future you can explicitly except specific exception types.
Class HTTPObservatoryException(Exception):
Pass
Class CSPParsingException(HTTPObservatoryException):
Pass
Class DirectiveLengthSception(CSPParsingException):
Pass
Class RepeatedDirectiveException(CSPParsingException):
Pass
(I'm on an iPad so first lines are capitalizing, but you get the idea.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, with the first option. I may eventually do the later, but the exceptions are all handled so they don't actually percolate upward in a useful manner, only the failure is.
httpobs/scanner/analyzer/headers.py
Outdated
'equiv': __parse_csp(response.http_equiv.get('Content-Security-Policy')) | ||
if 'Content-Security-Policy' in response.http_equiv else None, | ||
} | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naked except
s are really bad practice. I know it's catching some expected exceptions here, but it could potentially mask other errors that are unexpected. You should whitelist a set of expected exception types, e.g. except (ValueError) as e:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My problem is I'm not entirely sure what it might catch here, since the data coming from the client can be so weird. It should be just ValueError looking at my code, but I'm not sure if there are some weird subsets of Unicode or null characters or other weirdness that might trigger an exception. I really just want to catch any weird/broken header and mark it as invalid.
else: | ||
output['result'] = 'csp-not-implemented' | ||
# Code defensively on the size of the data | ||
output['data'] = csp if len(str(csp)) < 32768 else {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to trim the string rather than remove it if it's long? Either csp[:32768]
or textwrap.shorten
(in 3.4+).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I specifically just drop it because I don't want a truncated string to mess up future database queries, when I'm looking for feature usage.
@@ -152,6 +206,8 @@ def cookies(reqs: dict, expectation='cookies-secure-with-httponly-sessions') -> | |||
'cookies-session-without-httponly-flag', | |||
'cookies-session-without-secure-flag'] | |||
|
|||
# TODO: Support cookies set over http-equiv (ugh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File a bug for this and add the URL to it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, see #265.
No description provided.