Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-36274: Encode request lines with surrogate escapes #12315

Closed
wants to merge 1 commit into from

Conversation

@tipabu
Copy link

commented Mar 13, 2019

While this is out of spec according to RFC 7230 (which limits
expected octets to some subset of ASCII), it is often useful to
be able to mimic an out-of-spec client when testing a server or
application.

Don't use Latin-1 (though that would be in keeping with how we
handle headers and bodies) to encourage callers to write
RFC-complient clients. Rather, use surrogate escape sequences
('\udc80' - '\udcff') to increase friction while still
allowing out-of-spec requests to be expressable.

This is the second fix proposed in the bug report; the first was submitted as #12314 so reviewers can decide between fixes.

https://bugs.python.org/issue36274

https://bugs.python.org/issue36274

@auvipy
auvipy approved these changes May 31, 2019
@@ -274,6 +274,45 @@ def test_ipv6host_header(self):
conn.request('GET', '/foo')
self.assertTrue(sock.data.startswith(expected))

def test_request_path_handling(self):
conn = client.HTTPConnection('server.fqdn')

This comment has been minimized.

Copy link
@ZackerySpytz

ZackerySpytz Jul 8, 2019

Contributor

I believe there is too much duplication in the unit tests for both of your pull requests. Please use a loop or two (like what is done in test_invalid_headers()).

This comment has been minimized.

Copy link
@tipabu

tipabu Jul 8, 2019

Author

Sure; done. Any preference on which approach to take, though?

bpo-36274: Encode request lines with surrogate escapes
While this is out of spec according to RFC 7230 (which limits
expected octets to some subset of ASCII), it is often useful to
be able to mimic an out-of-spec client when testing a server or
application.

Don't use Latin-1 (though that would be in keeping with how we
handle headers and bodies) to encourage callers to write
RFC-complient clients. Rather, use surrogate escape sequences
('\udc80' - '\udcff') to increase friction while still
allowing out-of-spec requests to be expressable.

https://bugs.python.org/issue36274

@tipabu tipabu force-pushed the tipabu:bpo-36274-surrogate-escape branch from d57b342 to 0573040 Jul 8, 2019

@jaraco
jaraco approved these changes Sep 11, 2019
Copy link
Member

left a comment

This change looks good to me and is my preferred option from those presented. The only thing I'd like to see is some explanation in the tests linking to the justification (already made in the issue tracker).

@jaraco

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

After reviewing this request with @ericsnowcurrently, we've decided that this approach is dangerous in that it has the potential to expose users unexpectedly to non-compliant behavior, where as currently they are assured compliance. In particular, if a user had input from a source where it was surrogate-escaped non-ascii unicode, the request would currently be rejected but now will be accepted. Instead, we would like to see a more explicit opt-in, such as through a separate method or through a setting on the call and/or client object.

As a result, I'll be closing this PR ad the alternate and will follow up in the bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.