New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] [2.0] Header keys should be native strings. #1338

Closed
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
7 participants
@Lukasa
Member

Lukasa commented Apr 30, 2013

Note: This PR follows a discussion on IRC. The logs can be found here.

Summary

This fixes a problem that @https://github.com/va1en0k would have had in #1335 if the current Case-Insensitive Dict didn't behave so weirdly (see #1333 for progress on that).

Right now, Requests unconditionally encodes all header values as bytes on all Python versions. Kenneth and I are (tentatively) in agreement that header values should actually be native strings on all platforms. This pull request introduces code that explicitly encodes or decodes string objects to satisfy that requirement.

Notes

Firstly, this is a breaking API change on Python 3. On Python 2 the behaviour will be the same unless the user has changed sys.defaultencoding, and if they have they deserve what's coming to them. On Python 3, all header keys on PreparedRequests will go from being bytes to unicode objects, and so anyone who has correctly programmed to this interface will have to change their code.

Secondly, this code currently uses ASCII as the codec in both directions. I think this is OK for now. I'm pretty sure that header values should actually be Latin-1, but I can understand if we want to keep the current behaviour in place. I'm open to feedback here.

@cdunklau

This comment has been minimized.

Show comment
Hide comment
@cdunklau

cdunklau Apr 30, 2013

Contributor

Ref #1339, completed CaseInsensitiveDict rewrite.

Contributor

cdunklau commented Apr 30, 2013

Ref #1339, completed CaseInsensitiveDict rewrite.

@cdunklau

This comment has been minimized.

Show comment
Hide comment
@cdunklau

cdunklau May 1, 2013

Contributor

CID merged in e6e9b55.

Contributor

cdunklau commented May 1, 2013

CID merged in e6e9b55.

@sigmavirus24

This comment has been minimized.

Show comment
Hide comment
@sigmavirus24

sigmavirus24 May 1, 2013

Member

@Lukasa rebase ;)

Member

sigmavirus24 commented May 1, 2013

@Lukasa rebase ;)

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa May 1, 2013

Member

Done. =)

Member

Lukasa commented May 1, 2013

Done. =)

@sigmavirus24

This comment has been minimized.

Show comment
Hide comment
@sigmavirus24

sigmavirus24 May 1, 2013

Member

@Lukasa the tests fail on python3 where you have 'accept'.encode('ascii') because of the new CaseInsensitiveDict (#1339)

Member

sigmavirus24 commented May 1, 2013

@Lukasa the tests fail on python3 where you have 'accept'.encode('ascii') because of the new CaseInsensitiveDict (#1339)

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa May 1, 2013

Member

That should resolve it. =D I wish Travis still emailed me when I'd broken the build.

Member

Lukasa commented May 1, 2013

That should resolve it. =D I wish Travis still emailed me when I'd broken the build.

@sigmavirus24

This comment has been minimized.

Show comment
Hide comment
@sigmavirus24

sigmavirus24 May 1, 2013

Member

Yeah that fixes it. Travis just has your 2.6 build queued so @kennethreitz might be wary until it completes (not that he need be).

Member

sigmavirus24 commented May 1, 2013

Yeah that fixes it. Travis just has your 2.6 build queued so @kennethreitz might be wary until it completes (not that he need be).

@cdunklau

View changes

Show outdated Hide outdated requests/models.py
@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa May 1, 2013

Member

@cdunklau That was a great idea, and is now done.

Member

Lukasa commented May 1, 2013

@cdunklau That was a great idea, and is now done.

@kennethreitz

This comment has been minimized.

Show comment
Hide comment
@kennethreitz

kennethreitz May 4, 2013

Member

What's the status here?

Member

kennethreitz commented May 4, 2013

What's the status here?

@sigmavirus24

This comment has been minimized.

Show comment
Hide comment
@sigmavirus24

sigmavirus24 May 4, 2013

Member

I believe the status is 🍰 but I'm not 100% sure.

Member

sigmavirus24 commented May 4, 2013

I believe the status is 🍰 but I'm not 100% sure.

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa May 4, 2013

Member

The status is 🍰, unless there's something more you want here. In which case, I am your humble servant. =)

Member

Lukasa commented May 4, 2013

The status is 🍰, unless there's something more you want here. In which case, I am your humble servant. =)

@oliverjanik

This comment has been minimized.

Show comment
Hide comment
@oliverjanik

oliverjanik May 13, 2013

Merge please.

oliverjanik commented May 13, 2013

Merge please.

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa May 13, 2013

Member

@oliverjanik: The reason this wasn't merged straight away is that it is a significant, backward-incompatible change. It is generally unwise to merge such a change immediately. It's important to add it in a release which gets a minor version bump (e.g. 1.2.x to 1.3.x), and with good documentation (which reminds me, I should document this change somewhere).

This, or something similar, will get merged in good time. =)

Member

Lukasa commented May 13, 2013

@oliverjanik: The reason this wasn't merged straight away is that it is a significant, backward-incompatible change. It is generally unwise to merge such a change immediately. It's important to add it in a release which gets a minor version bump (e.g. 1.2.x to 1.3.x), and with good documentation (which reminds me, I should document this change somewhere).

This, or something similar, will get merged in good time. =)

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Jun 8, 2013

Member

Rebased again, I'll close and reopen to get GitHub to realise it.

Member

Lukasa commented Jun 8, 2013

Rebased again, I'll close and reopen to get GitHub to realise it.

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Jun 8, 2013

Member

Let's pretend I didn't push up a merge conflict. =P

Member

Lukasa commented Jun 8, 2013

Let's pretend I didn't push up a merge conflict. =P

@kennethreitz

This comment has been minimized.

Show comment
Hide comment
@kennethreitz

kennethreitz Jun 8, 2013

Member

What's the status here?

Member

kennethreitz commented Jun 8, 2013

What's the status here?

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Jun 8, 2013

Member

The answer is that this should be fine. It's a breaking API change, so if you want to sit on it for a while that would be totally reasonable.

Member

Lukasa commented Jun 8, 2013

The answer is that this should be fine. It's a breaking API change, so if you want to sit on it for a while that would be totally reasonable.

@oliverjanik

This comment has been minimized.

Show comment
Hide comment
@oliverjanik

oliverjanik Jun 9, 2013

I vote for merging, because this bug makes requests behave differently on Python2 and Python3 and there is no reasonable work-around.

While it's true that it breaks public API it breaks undocumented behaviour, which nobody should rely on.

oliverjanik commented Jun 9, 2013

I vote for merging, because this bug makes requests behave differently on Python2 and Python3 and there is no reasonable work-around.

While it's true that it breaks public API it breaks undocumented behaviour, which nobody should rely on.

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Jun 9, 2013

Member

@oliverjanik If only it were that simple. =) This will affect the behaviour of the headers dictionary. People will have spotted that Requests sets its headers to bytes on Python3 and have written code that takes that into account. That code will break.

We simply can't rush this into a Requests release. With that said, this (or something like it) will definitely end up in Requests eventually. =)

Member

Lukasa commented Jun 9, 2013

@oliverjanik If only it were that simple. =) This will affect the behaviour of the headers dictionary. People will have spotted that Requests sets its headers to bytes on Python3 and have written code that takes that into account. That code will break.

We simply can't rush this into a Requests release. With that said, this (or something like it) will definitely end up in Requests eventually. =)

@ecbftw

This comment has been minimized.

Show comment
Hide comment
@ecbftw

ecbftw Jun 20, 2013

Under Python3 with requests 1.2.3, header overrides are currently broken. That is, if I specify a header that requests would normally specify itself (such as Accept-Encoding or Content-Type), then it simply adds my header instead of replacing it.

Approximate script I used to test:

approximate script:

host = 'localhost'
port = 443
protocol = 'https'
data = 'mystring'

session = requests.Session()

method = 'POST'
path = '/my/web/service'
headers = {'x-custom-header1': 'x',
           'x-custom-header2': data,
           'x-custom-header3': 'x',
           'User-Agent': 'myagent',
           'Content-Type': 'text/xml',
           'x-custom-header4': '12345'}
url = "%s://%s:%d%s" % (protocol,host,port,path)
body = (b'<myxml></myxml>\n')

return session.request(method, url, headers=headers, data=body,
                       cert=('cert.pem', 'key.pem'),
                       verify=False)

HTTP request generated:

POST /my/web/service HTTP/1.1
Host: localhost
Accept-Encoding: identity
Content-Type: application/x-www-form-urlencoded
Content-Length: 64
Accept-Encoding: gzip, deflate, compress
x-custom-header1: x
x-custom-header2: mystring
x-custom-header3: x
Accept: /
User-Agent: myagent
Content-Type: text/xml
x-custom-header4: 12345

Note the two content types. This same script in python 2 behaved correctly.

ecbftw commented Jun 20, 2013

Under Python3 with requests 1.2.3, header overrides are currently broken. That is, if I specify a header that requests would normally specify itself (such as Accept-Encoding or Content-Type), then it simply adds my header instead of replacing it.

Approximate script I used to test:

approximate script:

host = 'localhost'
port = 443
protocol = 'https'
data = 'mystring'

session = requests.Session()

method = 'POST'
path = '/my/web/service'
headers = {'x-custom-header1': 'x',
           'x-custom-header2': data,
           'x-custom-header3': 'x',
           'User-Agent': 'myagent',
           'Content-Type': 'text/xml',
           'x-custom-header4': '12345'}
url = "%s://%s:%d%s" % (protocol,host,port,path)
body = (b'<myxml></myxml>\n')

return session.request(method, url, headers=headers, data=body,
                       cert=('cert.pem', 'key.pem'),
                       verify=False)

HTTP request generated:

POST /my/web/service HTTP/1.1
Host: localhost
Accept-Encoding: identity
Content-Type: application/x-www-form-urlencoded
Content-Length: 64
Accept-Encoding: gzip, deflate, compress
x-custom-header1: x
x-custom-header2: mystring
x-custom-header3: x
Accept: /
User-Agent: myagent
Content-Type: text/xml
x-custom-header4: 12345

Note the two content types. This same script in python 2 behaved correctly.

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Jun 20, 2013

Member

@adian0: Indeed. Try replacing the 'User-Agent' and 'Content-Type' header keys with b'User-Agent' and b'Content-Type'. If that works, then the problem is one that will be fixed by this pull request.

Member

Lukasa commented Jun 20, 2013

@adian0: Indeed. Try replacing the 'User-Agent' and 'Content-Type' header keys with b'User-Agent' and b'Content-Type'. If that works, then the problem is one that will be fixed by this pull request.

@ecbftw

This comment has been minimized.

Show comment
Hide comment
@ecbftw

ecbftw Jun 20, 2013

No, already I tried using bytes for those. It threw this exception:
...
File "/usr/local/lib/python3.2/dist-packages/requests/models.py", line 340, in
headers = dict((name.encode('ascii'), value) for name, value in headers.items())
AttributeError: 'bytes' object has no attribute 'encode'

ecbftw commented Jun 20, 2013

No, already I tried using bytes for those. It threw this exception:
...
File "/usr/local/lib/python3.2/dist-packages/requests/models.py", line 340, in
headers = dict((name.encode('ascii'), value) for name, value in headers.items())
AttributeError: 'bytes' object has no attribute 'encode'

@paparomeo

This comment has been minimized.

Show comment
Hide comment
@paparomeo

paparomeo Jun 21, 2013

The cause of the behavior reported by @adian0 seems to be the same I described in #1250.

paparomeo commented Jun 21, 2013

The cause of the behavior reported by @adian0 seems to be the same I described in #1250.

@Lukasa

This comment has been minimized.

Show comment
Hide comment
@Lukasa

Lukasa Jun 21, 2013

Member

@pmcnr Agreed, and this PR should resolve it. =)

Member

Lukasa commented Jun 21, 2013

@pmcnr Agreed, and this PR should resolve it. =)

@paparomeo

This comment has been minimized.

Show comment
Hide comment
@paparomeo

paparomeo Jun 21, 2013

@Lukasa It's a shame this PR is a breaking API change in Python 3. I've really been looking forward to the next major release of requests when you can finally include it. A big thank you for the excellent work!

paparomeo commented Jun 21, 2013

@Lukasa It's a shame this PR is a breaking API change in Python 3. I've really been looking forward to the next major release of requests when you can finally include it. A big thank you for the excellent work!

@kennethreitz kennethreitz referenced this pull request Jul 15, 2013

Closed

Consider 2.0 #1459

Lukasa added some commits Apr 30, 2013

Header keys should be native strings.
This commit follows a discussion on IRC. For more information, see the
Pull Request associated with it.
Instantiate the CID directly.
No need to do this the slow way now. Thanks to Colin (@cdunklau) for the
idea.
@kennethreitz

This comment has been minimized.

Show comment
Hide comment
@kennethreitz
Member

kennethreitz commented Aug 1, 2013

@kennethreitz

This comment has been minimized.

Show comment
Hide comment
@kennethreitz

kennethreitz Aug 1, 2013

Member

Closing since this is merged into #1509

Member

kennethreitz commented Aug 1, 2013

Closing since this is merged into #1509

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment