Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for HTTP meta-equiv headers. #264

Merged
merged 5 commits into from Sep 6, 2017
Merged

Conversation

april
Copy link
Contributor

@april april commented Sep 5, 2017

No description provided.

Copy link

@chuckharmston chuckharmston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions, but generally looks good.

Without knowledge of the project, I suspect that the test coverage here is inadequate, and that coverage would fall substantially as a result of this PR. Might be worth further looking into.

"""

# Clean out all the junk
csp_string = csp_string.replace('\r', '').replace('\n', '').strip()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine, but personally I'd treat it like an array and use csp_string.splitlines(); it mirrors the nature of the data better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually shouldn't be split across lines. Carriage returns should be ignored, as it it's ideally all a single line. I've seen a lot of junk from various webservers, so this is my attempt to work aroundi t.

# So technically the shortest directive is img-src, so lets just assume that
# anything super short is invalid
if len(csp_string) < 6 or csp_string.isspace():
raise ValueError

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the use of a magic number here. I'd suggest defining it as a constant, and comparing against

SHORTEST_DIRECTIVE = 'img-src'
SHORTEST_DIRECTIVE_LENGTH = len(SHORTEST_DIRECTIVE)

Then in addition to having more obvious and legible code, you can import these and use them as comparators for tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thank you.


# Technically the path part of any source is case-sensitive, but since we don't test
# any paths, we can cheat a little bit here
values = set([_.lower() for _ in entry[-1].split()]) if len(entry) > 1 else {'\'none\''}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit, but maybe give _ a meaningful name? It makes the line awkwardly long, but for production Python code I try to avoid single-char variable names, even in comprehensions, as it has the potential to conflict with pdb.set_trace() commands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to source, as defined in the CSP specification. Thanks!

<body>

</body>
</html>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the trailing line endings on all these files.

for source in script_src) and '\'unsafe-inline\'' in script_src:
script_src.remove('\'unsafe-inline\'')

# Now to make the piggies squeal

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐽

# While technically valid in that you just use the first entry, we are saying that repeated
# directives are invalid so that people notice it
if directive in csp:
raise ValueError

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're raising ValueErrors for multiple reasons, I'd suggest doing one of two things:

  1. Raising instances of ValueError with a message, e.g. raise ValueError('Repeated directives are invalid')
  2. More complicated but probably more correct: create a series of custom error classes for each possible exception, so in the future you can explicitly except specific exception types.
Class HTTPObservatoryException(Exception):
    Pass

Class CSPParsingException(HTTPObservatoryException):
    Pass

Class DirectiveLengthSception(CSPParsingException):
    Pass

Class RepeatedDirectiveException(CSPParsingException):
    Pass

(I'm on an iPad so first lines are capitalizing, but you get the idea.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, with the first option. I may eventually do the later, but the exceptions are all handled so they don't actually percolate upward in a useful manner, only the failure is.

'equiv': __parse_csp(response.http_equiv.get('Content-Security-Policy'))
if 'Content-Security-Policy' in response.http_equiv else None,
}
except:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naked excepts are really bad practice. I know it's catching some expected exceptions here, but it could potentially mask other errors that are unexpected. You should whitelist a set of expected exception types, e.g. except (ValueError) as e:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem is I'm not entirely sure what it might catch here, since the data coming from the client can be so weird. It should be just ValueError looking at my code, but I'm not sure if there are some weird subsets of Unicode or null characters or other weirdness that might trigger an exception. I really just want to catch any weird/broken header and mark it as invalid.

else:
output['result'] = 'csp-not-implemented'
# Code defensively on the size of the data
output['data'] = csp if len(str(csp)) < 32768 else {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to trim the string rather than remove it if it's long? Either csp[:32768] or textwrap.shorten (in 3.4+).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically just drop it because I don't want a truncated string to mess up future database queries, when I'm looking for feature usage.

@@ -152,6 +206,8 @@ def cookies(reqs: dict, expectation='cookies-secure-with-httponly-sessions') ->
'cookies-session-without-httponly-flag',
'cookies-session-without-secure-flag']

# TODO: Support cookies set over http-equiv (ugh)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File a bug for this and add the URL to it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, see #265.

@april april merged commit 5d227eb into mozilla:master Sep 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants