Should urllib2.urlopen send an Accept-Encoding header? #52978

Closed
dabrahams mannequin opened this issue May 16, 2010 · 6 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@dabrahams
Mannequin

dabrahams mannequin commented May 16, 2010

BPO 8732
Nosy @orsenthil, @merwok, @karlcow, @dabrahams, @vadmium, @demianbrecht

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/orsenthil'
closed_at = <Date 2016-05-14.12:46:10.787>
created_at = <Date 2010-05-16.14:47:09.704>
labels = ['type-bug', 'library']
title = 'Should urllib2.urlopen send an Accept-Encoding header?'
updated_at = <Date 2016-05-14.12:46:10.785>
user = 'https://github.com/dabrahams'

bugs.python.org fields:

activity = <Date 2016-05-14.12:46:10.785>
actor = 'martin.panter'
assignee = 'orsenthil'
closed = True
closed_date = <Date 2016-05-14.12:46:10.787>
closer = 'martin.panter'
components = ['Library (Lib)']
creation = <Date 2010-05-16.14:47:09.704>
creator = 'dabrahams'
dependencies = []
files = []
hgrepos = []
issue_num = 8732
keywords = []
message_count = 6.0
messages = ['105870', '105937', '105959', '183573', '239926', '265526']
nosy_count = 6.0
nosy_names = ['orsenthil', 'eric.araujo', 'karlcow', 'dabrahams', 'martin.panter', 'demian.brecht']
pr_nums = []
priority = 'normal'
resolution = 'works for me'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue8732'
versions = ['Python 3.1', 'Python 2.7', 'Python 3.2']

@dabrahams
Mannequin Author

dabrahams mannequin commented May 16, 2010

According to the RFC, the server is allowed to send back any encoding it likes when no Accept-Encoding header is supplied, but all the examples I can find of urllib2.urlopen usage assume they're getting plain text back. I think it would be better to inject an Accept-Encoding header when none is explicitly supplied so that nobody else trips over this issue.

See http://support.github.com/discussions/site/1510
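Until the default behaviour is settled, a caller can protect itself by sending the header explicitly and checking what actually comes back. The following is only a sketch under stated assumptions: it uses the Python 3 spelling `urllib.request` (in 2.7 the equivalent class is `urllib2.Request`), and `build_request` and `decode_body` are hypothetical helper names, not part of any library.

```python
import gzip
import urllib.request


def build_request(url):
    # Ask explicitly for untransformed content instead of relying on defaults.
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


def decode_body(raw, content_encoding):
    # Hypothetical helper: undo the Content-Encoding the server reported.
    # A missing header is treated as "identity" (no transformation).
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(raw)
    if encoding == "identity":
        return raw
    raise ValueError("unsupported Content-Encoding: %r" % (content_encoding,))
```

With a real response object this would be used as `decode_body(resp.read(), resp.headers.get("Content-Encoding"))`, so a server that ignores the request header still yields usable bytes.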

@dabrahams dabrahams mannequin added the stdlib Python modules in the Lib dir label May 16, 2010
@pitrou pitrou added the type-bug An unexpected behavior, bug, or error label May 16, 2010
@orsenthil
Member

The HTTP reference says that the server can send any encoding if the client
does not specify an Accept-Encoding header. But if 'identity' is one of the
encodings that the server recognizes (?), then it should send the content as
identity, which indicates untransformed content.

I also see that httplib adds Accept-Encoding: identity to the headers at the
request level. I shall see what is missing here, if it
is not being sent for all requests.
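The request-level behaviour can be inspected without any network access. This is a rough sketch, assuming the Python 3 module name `http.client` (the successor of httplib); `RecordingConnection` and `default_request_bytes` are hypothetical names introduced here, and the subclass replaces `connect()` and `send()` so that the bytes are captured instead of written to a socket.

```python
import http.client


class RecordingConnection(http.client.HTTPConnection):
    """Connection that records outgoing bytes instead of sending them."""

    def __init__(self, host):
        super().__init__(host)
        self.sent = b""

    def connect(self):
        # No socket is opened; this class exists only to look at the request.
        pass

    def send(self, data):
        self.sent += data


def default_request_bytes(skip_accept_encoding=False):
    conn = RecordingConnection("example.com")
    # putrequest() adds "Accept-Encoding: identity" for HTTP/1.1 requests
    # unless skip_accept_encoding is true.
    conn.putrequest("GET", "/", skip_accept_encoding=skip_accept_encoding)
    conn.endheaders()
    return conn.sent
```

Calling `default_request_bytes()` and `default_request_bytes(skip_accept_encoding=True)` and comparing the two results shows whether the header is added at this layer.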

BTW, I could not figure out the problem you are facing from the url
mentioned. I specifically do not see any interleaving gzip and no-gzip
request behaviours at different points.

@dabrahams
Mannequin Author

dabrahams mannequin commented May 18, 2010

How many tests did you run? My two tests were minutes apart. I have the feeling that this has something to do with caching behavior on the server.

@karlcow
Mannequin

karlcow mannequin commented Mar 6, 2013

What was the content of http://support.github.com/discussions/site/1510?
I can't find it. Is the issue still going on?

@demianbrecht
Mannequin

demianbrecht mannequin commented Apr 2, 2015

This doesn't seem to be an issue in 3.4+. The following headers are injected in a call to urlopen():

GET / HTTP/1.1
Accept-Encoding: identity
Host: example.com
User-Agent: Python-urllib/3.4
Connection: close

However, this is not the same behaviour in 2.7:

GET / HTTP/1.0
Host: example.com
User-Agent: Python-urllib/1.17

That said, I wouldn't see this as a bug but a feature request, so it should be invalid for 2.7.

Setting this to pending to close unless anyone has any objections or further details.

@vadmium
Member

vadmium commented May 14, 2016

I suspect for Demian’s 2.7 experiment, he used the older urllib.urlopen(), rather than urllib2.urlopen() as given in the original description. When I use urllib2.urlopen("http://localhost/"), I see

GET / HTTP/1.1
Accept-Encoding: identity
Host: localhost
Connection: close
User-Agent: Python-urllib/2.7

Even in the urllib (no 2) case, since it is using HTTP 1.0, I suspect not having Accept-Encoding is not such a problem.

The underlying HTTP library has always added “Accept-Encoding: identity” for HTTP 1.1 by default (https://hg.python.org/cpython/annotate/4a3e9871b41b/Lib/httplib.py#l444), so I am closing this.
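Anyone who wants to repeat this check on their own interpreter could use something like the sketch below. It assumes Python 3 (`urllib.request` in place of `urllib2`) and a loopback interface that is free to bind; `HeaderRecorder` and `capture_request_headers` are hypothetical names, and proxies are disabled so the request reaches the local server directly.

```python
import http.server
import threading
import urllib.request


class HeaderRecorder(http.server.BaseHTTPRequestHandler):
    seen = {}  # headers of the most recent request, keys lower-cased

    def do_GET(self):
        HeaderRecorder.seen = {k.lower(): v for k, v in self.headers.items()}
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet


def capture_request_headers():
    # Port 0 lets the operating system pick a free port.
    server = http.server.HTTPServer(("127.0.0.1", 0), HeaderRecorder)
    server.timeout = 5  # do not wait forever if the client never connects
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        url = "http://127.0.0.1:%d/" % server.server_port
        with opener.open(url, timeout=5) as response:
            response.read()
    finally:
        worker.join()
        server.server_close()
    return HeaderRecorder.seen
```

Printing the result of `capture_request_headers()` lists the headers urlopen sent without any being supplied by the caller.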

@vadmium vadmium closed this as completed May 14, 2016
@vadmium vadmium changed the title Should urrllib2.urlopen send an Accept-Encoding header? Should urllib2.urlopen send an Accept-Encoding header? May 14, 2016
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022