urllib2 blocked from news.google.com #42560

Closed
asyncster mannequin opened this issue Nov 7, 2005 · 4 comments
Labels
stdlib Python modules in the Lib dir

Comments

@asyncster
Mannequin

asyncster mannequin commented Nov 7, 2005

BPO 1349977
Nosy @mwhudson, @brettcannon

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2005-11-07.21:20:10.000>
created_at = <Date 2005-11-07.06:31:01.000>
labels = ['library']
title = 'urllib2 blocked from news.google.com'
updated_at = <Date 2005-11-07.21:20:10.000>
user = 'https://bugs.python.org/asyncster'

bugs.python.org fields:

activity = <Date 2005-11-07.21:20:10.000>
actor = 'brett.cannon'
assignee = 'none'
closed = True
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2005-11-07.06:31:01.000>
creator = 'asyncster'
dependencies = []
files = []
hgrepos = []
issue_num = 1349977
keywords = []
message_count = 4.0
messages = ['26807', '26808', '26809', '26810']
nosy_count = 3.0
nosy_names = ['mwh', 'brett.cannon', 'asyncster']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1349977'
versions = ['Python 2.4']

@asyncster
Mannequin Author

asyncster mannequin commented Nov 7, 2005

It seems that Google is blocking requests from clients
that send Python-urllib/2.4 as the User-Agent. If you telnet to
news.google.com and type:

GET / HTTP/1.1
Host: news.google.com
User-agent: Python-urllib/2.4

You get an HTTP/1.1 403 Forbidden response.

@asyncster asyncster mannequin closed this as completed Nov 7, 2005
@asyncster asyncster mannequin added the stdlib Python modules in the Lib dir label Nov 7, 2005
@brettcannon
Member

Logged In: YES
user_id=357491

I can verify this using urllib.urlretrieve() from the trunk.

@mwhudson

mwhudson commented Nov 7, 2005

Logged In: YES
user_id=6656

In what crazy universe is this a Python bug? It's up to Google what they
do with HTTP requests, surely. If you are reasonably sure that your use
does not violate the terms of use for Google News:

http://news.google.com/intl/en_us/terms_google_news.html

then you can experiment with getting urllib to send a different User-Agent
header.
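
For readers who want to try the suggestion above, here is a minimal sketch of overriding the User-Agent header. It assumes a current Python 3 interpreter, where `urllib.request` takes the place of the `urllib2` module discussed in this issue; `urllib2.Request` accepted a `headers` argument in the same way. The identifier `example-news-reader/0.1` and the helper names `build_request` and `default_user_agent` are hypothetical, chosen only for illustration. Nothing here guarantees that a given site will accept the request, and the terms-of-use caveat above still applies.

```python
import urllib.request

# Hypothetical identifier for illustration only; pick one that honestly
# describes your client.
CUSTOM_USER_AGENT = "example-news-reader/0.1"


def build_request(url, user_agent=CUSTOM_USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def default_user_agent():
    """Return the User-Agent string a default opener would send."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs set by the opener.
    return dict(opener.addheaders).get("User-agent")


if __name__ == "__main__":
    req = build_request("http://news.google.com/")
    print("default: ", default_user_agent())
    # Request stores header names capitalized as "User-agent".
    print("override:", req.get_header("User-agent"))
    # Actually sending the request needs network access, and the server
    # may still refuse it:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

A header set on the `Request` takes precedence over the opener's default, so the library's `Python-urllib/x.y` string is not sent for that request.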

@brettcannon
Member

Logged In: YES
user_id=357491

It isn't a Python bug, but then again it got my attention
which means I can contact people within Google to see if
they can find out what happened.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022