I think using something like `scrapy.utils.datatypes.MergeDict` for the headers could help. We want headers added at spider level to be kept, but the rest to be removed. The Redirect and Retry middlewares return a Request from `process_response`, which sends the request back to the beginning of the `DownloaderMiddleware` chain, so any headers added in that chain are going to be added again if they are needed.
But what about redirections to different domains with an Authorization header? Should we have a meta key with the headers to keep?
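To make the "headers to keep" question concrete, here is a minimal sketch of what filtering on a cross-domain redirect could look like. It is not Scrapy code: `headers_for_redirect` and its `keep` argument are hypothetical names, `keep` stands in for whatever a meta key might carry, and headers are modelled as a plain dict rather than Scrapy's `Headers` class.

```python
from urllib.parse import urlsplit

# Headers treated as credentials in this sketch (an assumption, not a Scrapy list).
SENSITIVE = {"authorization"}


def headers_for_redirect(old_url, new_url, headers, keep=()):
    """Return the headers to send to new_url (hypothetical helper).

    Sensitive headers are dropped when the redirect changes host,
    unless the caller explicitly lists them in `keep`.
    """
    same_host = urlsplit(old_url).hostname == urlsplit(new_url).hostname
    if same_host:
        return dict(headers)
    keep_lower = {name.lower() for name in keep}
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE or name.lower() in keep_lower
    }


headers = {"Authorization": "Basic abc", "Accept": "text/html"}
# Different host: Authorization is dropped by default.
print(headers_for_redirect("http://a.example/x", "http://b.example/y", headers))
# Different host, but the caller asked to keep it.
print(headers_for_redirect("http://a.example/x", "http://b.example/y",
                           headers, keep=["Authorization"]))
```

Defaulting to "drop" and making "keep" opt-in means a spider has to state explicitly that it wants credentials forwarded to another domain.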
When `http_proxy` is set for `HttpProxyMiddleware` and an `http://` request is redirected to an `https://` location, scrapy will use the `http_proxy` settings for the `https` scheme. This also happens for `https://` to `http://`. The `Proxy-Authorization` header is also propagated.

To test:

- `http://www.facebook.com` redirects to `https://www.facebook.com`
- `https://instagram.com/` redirects to `http://instagram.com`
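
One possible direction, sketched outside Scrapy with plain dicts: when a redirect changes scheme, drop the proxy chosen for the old scheme together with its `Proxy-Authorization` header, so the proxy can be picked again for the new scheme. `reset_proxy_on_scheme_change` is a hypothetical helper, not an existing Scrapy function; the only Scrapy detail assumed is that the selected proxy lives in `request.meta["proxy"]`.

```python
from urllib.parse import urlsplit


def reset_proxy_on_scheme_change(old_url, new_url, headers, meta):
    """Return copies of headers and meta for the redirected request.

    If the scheme differs between old_url and new_url, the "proxy" meta
    key and any Proxy-Authorization header are removed (hypothetical helper).
    """
    headers = dict(headers)
    meta = dict(meta)
    if urlsplit(old_url).scheme != urlsplit(new_url).scheme:
        meta.pop("proxy", None)
        # Header names are compared case-insensitively.
        for name in list(headers):
            if name.lower() == "proxy-authorization":
                del headers[name]
    return headers, meta


new_headers, new_meta = reset_proxy_on_scheme_change(
    "http://www.facebook.com",
    "https://www.facebook.com",
    {"Proxy-Authorization": "Basic x", "Accept": "*/*"},
    {"proxy": "http://proxy.example:3128"},  # placeholder proxy URL
)
print(new_headers, new_meta)
```

Whether the proxy is then re-selected for the new scheme depends on the redirected request passing through `HttpProxyMiddleware` again, which this sketch does not model.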
Note: interesting discussion on HTTP redirection and headers: https://code.google.com/p/go/issues/detail?id=4800