Should the project handle HTTP redirects? #1

Closed
karlcow opened this issue May 26, 2015 · 3 comments

@karlcow

karlcow commented May 26, 2015

Currently the call to requests does not follow redirects.

>>> import requests
>>> response = requests.head('http://w3.org/')
>>> response.headers
{'connection': 'close', 'content-length': '0', 'location': 'http://www.w3.org/'}
>>> response = requests.head('http://w3.org/', allow_redirects=True)
>>> response.headers
{'content-length': '41403', 'content-location': 'Home.html', 'accept-ranges': 'bytes', 'expires': 'Tue, 26 May 2015 23:22:05 GMT', 'vary': 'negotiate,accept', 'server': 'Apache/2', 'tcn': 'choice', 'last-modified': 'Sun, 24 May 2015 13:30:14 GMT', 'etag': '"a1bb-516d3e4ace580;89-3f26bd17a2f00"', 'cache-control': 'max-age=600', 'date': 'Tue, 26 May 2015 23:12:05 GMT', 'p3p': 'policyref="http://www.w3.org/2014/08/p3p.xml"', 'content-type': 'text/html; charset=utf-8'}
>>> response.history
[<Response [301]>]

Not sure if it's by choice or not.

I can make a pull request to add allow_redirects=True, unless the current behavior is by choice.

@davidbgk
Collaborator

davidbgk commented Jun 5, 2015

I hesitated on that one, but in the end it's more interesting to get the final destination. Maybe I should record somewhere that a redirect was issued, though, so that the URL can be updated. Thoughts?

@karlcow
Author

karlcow commented Jun 5, 2015

It can be interesting to know which type of redirect it is. Is it a temporary redirect, in which case the original URI should stay in the URI db, or a permanent redirect, in which case the original URI should be replaced (assuming that people respect such a thing ;) )?
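A minimal sketch of that distinction, based only on the HTTP status code. The names `classify_redirect` and `url_to_keep` are hypothetical and not part of the project, and the policy shown (follow permanent hops, stop at the first temporary one) is an assumption about what "replace the original URI" could mean in practice.

```python
# Status codes defined by HTTP as permanent or temporary redirects.
PERMANENT = {301, 308}
TEMPORARY = {302, 303, 307}


def classify_redirect(status_code):
    """Label a status code as a permanent or temporary redirect."""
    if status_code in PERMANENT:
        return "permanent"
    if status_code in TEMPORARY:
        return "temporary"
    return "not a redirect"


def url_to_keep(original_url, hops):
    """Pick the URL to store, given the redirect chain.

    `hops` is an ordered list of (status_code, target_url) pairs.
    Assumed policy: permanent redirects update the stored URL; the first
    temporary redirect ends the update, so its target is never stored.
    """
    current = original_url
    for status_code, target_url in hops:
        if classify_redirect(status_code) != "permanent":
            break
        current = target_url
    return current


# The 301 from the session above would replace the stored URL.
kept = url_to_keep("http://w3.org/", [(301, "http://www.w3.org/")])
```

Whether servers use 301 and 302 consistently is a separate question, so a real policy might need to be more conservative than this.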

Anyway, food for thought:

I had some sites doing a series of 5 redirects before reaching the final resource.

Types of redirects I get:

  • based on UA mobile/desktop
  • based on geolocation (for example, try MSN or Yahoo properties)
  • based on browser language

Not all redirects are HTTP redirects:

  • HTTP
  • HTML with a meta refresh
  • JavaScript with window.location
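As a rough illustration of the second case, a meta refresh can be detected with the standard library's `html.parser`. This is only a sketch: `MetaRefreshFinder` and `find_meta_refresh` are hypothetical names, the page body has to be fetched with GET (a HEAD response has no body), and JavaScript redirects are not covered at all.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Remember the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        # HTMLParser lowercases attribute names; values may be None.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "refresh":
            return
        # The content looks like "5; url=http://example.org/".
        # The "url=" part is optional (a bare delay just reloads the page).
        _, _, rest = attributes.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and value:
            self.target = value.strip().strip("'\"")


def find_meta_refresh(html):
    """Return the meta refresh target URL found in `html`, or None."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target


page = '<html><head><meta http-equiv="Refresh" content="0; URL=http://www.w3.org/"></head></html>'
target = find_meta_refresh(page)
```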

Some redirects are done just for login credentials ^_^

@davidbgk
Collaborator

After some time in production, we need to handle both cases:

  1. knowing there is a redirect (and which one)
  2. knowing the status of the target of the redirect
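Both pieces of information are available from a single response once redirects are followed, as the `response.history` output at the top of the issue suggests. The sketch below is one possible shape for reporting them, not the project's implementation; `describe_redirects` is a hypothetical name, and the stand-in objects replace a real `requests.head(url, allow_redirects=True)` call so the example runs without network access. The final status of 200 is assumed for illustration.

```python
from types import SimpleNamespace


def describe_redirects(response):
    """Report the redirect chain and the status of the final target.

    `response` is expected to look like a requests.Response: it needs
    `.history` (the intermediate responses), `.status_code` and `.url`.
    """
    hops = [
        {
            "status": hop.status_code,
            "from": hop.url,
            "to": hop.headers.get("location"),
        }
        for hop in response.history
    ]
    return {
        "redirected": bool(hops),
        "hops": hops,
        "final_url": response.url,
        "final_status": response.status_code,
    }


# Stand-ins mimicking the w3.org session above (one 301 hop).
hop = SimpleNamespace(
    status_code=301,
    url="http://w3.org/",
    headers={"location": "http://www.w3.org/"},
)
final = SimpleNamespace(
    status_code=200,  # assumed; the session above does not show it
    url="http://www.w3.org/",
    headers={},
    history=[hop],
)
report = describe_redirects(final)
```

With requests installed, the same function could be called as `describe_redirects(requests.head(url, allow_redirects=True))`.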

Reopened issues are the best 😉

@davidbgk davidbgk reopened this Jun 20, 2017
@abulte abulte mentioned this issue Jul 1, 2017
@abulte abulte added this to the 2.0.0 milestone Jul 19, 2017
abulte added a commit to abulte/croquemort that referenced this issue Jul 19, 2017
abulte added a commit to abulte/croquemort that referenced this issue Jul 20, 2017
@abulte abulte closed this as completed Jan 3, 2018