Skip to content

Conversation

@pcorliss
Copy link
Contributor

@pcorliss pcorliss commented Jan 5, 2015

Paired with @kjperry

For GET requests and POST requests with query parameters the query_string remains unsanitized. While GET requests with sensitive data is rare for our application we wanted to be sure that bad requests would be sanitized as expected.

screen shot 2015-01-05 at 9 21 25 am

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you kindly use CGI::parse here? I think the code as written will break with keys that have several values + url encoding.

@pcorliss
Copy link
Contributor Author

pcorliss commented Jan 8, 2015

Modified to use standard library functions. Please note that I had to modify the array sanitizer to pass the key down. Otherwise elements in arrays wouldn't be sanitized as desired.

Looks like there are failures I'll take a look.

@pcorliss
Copy link
Contributor Author

Phew, okay now that the builds are passing...

I added in a backport of URI from the rack gem to support URI::encode_www_form_component which allows us to convert back from CGI::parse. This solution involves a fair amount of extra stuff, but does result in an encoding safe sanitization.

If the above changes seem excessive we could always drop back to my original solution. I believe it will handle odd encodings and several values but I'm not 100% certain. I attempted several odd encodings and special character combinations and it seemed to work fine.

@nateberkopec
Copy link
Contributor

Wow! That's some awesome work! Thank you so much for taking the time to do all of this - I was considering just doing a "if >1.8 do" sort of branch, but you've made it work. I'm sure the people running 1.8 will appreciate it.

@pcorliss Can you write a test for the case of multiple values to a key and for url encoding? Just two more test assertions is all, nothing crazy. I know both of these cases are common in Rails so I want make sure we get that right.

@pcorliss
Copy link
Contributor Author

Tests added. A few notes, because of the way this is implemented the query string can mutate a bit. This is due to the decoding and subsequent re-encoding of the content of the strings. Simple stuff like %20 being converted to +, and [ converting to %5B. This is unavoidable as far as I can tell. And shouldn't overtly affect clients.

@nateberkopec
Copy link
Contributor

@pcorliss thanks so much again. This is some very classy work.

I think the gain here (sanitizing query strings) is better than the loss (query strings which were not url encoded in the original request will appear url encoded in Sentry). Ideally you will always have your actual application logs to check against any discrepancy.

nateberkopec added a commit that referenced this pull request Jan 13, 2015
Adding suppport for sanitizing query parameters.
@nateberkopec nateberkopec merged commit 5aef0a6 into getsentry:master Jan 13, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants