scrapy may keep wrong proxy setting when following redirects #767

redapple · 2014-06-26T11:52:10Z

When:

http_proxy is set for HttpProxyMiddleware,
and an http:// request is redirected to an https:// location,

scrapy will use the http_proxy settings for the https scheme.

The same happens for https:// to http:// redirects.

The Proxy-Authorization header is also propagated to the redirected request.

To test:

http://www.facebook.com redirects to https://www.facebook.com
https://instagram.com/ redirects to http://instagram.com
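
To make the mismatch concrete, here is a minimal sketch that does not use Scrapy at all. It assumes a per-scheme proxy table shaped like the one `urllib.request.getproxies()` returns, and a plain dict standing in for `Request.meta`. The proxy URL and the helpers `proxy_for` and `follow_redirect_naively` are made up for illustration; they are not Scrapy APIs.

```python
from urllib.parse import urlsplit

# Hypothetical proxy table: only an http proxy is configured,
# as if only the http_proxy environment variable were set.
PROXIES = {"http": "http://proxy.example:3128"}


def proxy_for(url, proxies):
    """Return the proxy configured for the URL's scheme, or None."""
    return proxies.get(urlsplit(url).scheme)


def follow_redirect_naively(meta, location):
    """Mimic the reported behaviour: carry meta over to the new URL unchanged."""
    return dict(meta)


original = "http://www.facebook.com"
redirected = "https://www.facebook.com"

# The proxy is picked once, based on the scheme of the first request.
meta = {"proxy": proxy_for(original, PROXIES)}

# After the redirect the old choice is still there ...
stale = follow_redirect_naively(meta, redirected)["proxy"]
# ... although a fresh lookup for the https URL would give a different answer.
expected = proxy_for(redirected, PROXIES)
```

Here `stale` still holds the http proxy, while `expected` is what a lookup based on the new scheme would return.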

Note: interesting discussion on HTTP redirection and headers: https://code.google.com/p/go/issues/detail?id=4800

dangra · 2014-06-26T17:15:47Z

A possible solution is to clean up all proxy-related meta keys and headers in the process_response() hook of HttpProxyMiddleware.
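
A rough sketch of that cleanup, written against plain dicts rather than real Scrapy objects so it can be run on its own. It assumes the proxy is stored under the `proxy` meta key and the credentials in the `Proxy-Authorization` header; `strip_proxy_state` is a hypothetical helper, not an existing Scrapy function, and where exactly it would be called in the middleware is left open.

```python
def strip_proxy_state(meta, headers):
    """Return copies of meta and headers without proxy-related entries.

    `meta` and `headers` are plain dicts standing in for Request.meta
    and Request.headers (an assumption made for this sketch).
    """
    # Drop the proxy chosen for the previous request.
    clean_meta = {key: value for key, value in meta.items() if key != "proxy"}
    # Header names are case-insensitive, so compare in lower case.
    clean_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() != "proxy-authorization"
    }
    return clean_meta, clean_headers


meta = {"proxy": "http://proxy.example:3128", "download_timeout": 30}
headers = {"Proxy-Authorization": "Basic dXNlcjpwYXNz", "User-Agent": "demo"}

new_meta, new_headers = strip_proxy_state(meta, headers)
```

With the stale entries removed, the redirected request can go through the proxy middleware again and pick up whatever is appropriate for its new scheme.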

nramirezuy · 2014-07-17T20:03:01Z

I think using something like scrapy.utils.datatypes.MergeDict for the headers could help. We want headers added at spider level to be kept, but the rest to be removed.
The Redirect and Retry middlewares return a Request from process_response, which sends the request back to the beginning of the `DownloaderMiddleware` chain, so the headers added by those middlewares are going to be added again if they are needed.

But what about redirections to different domains with an Authorization header? Should we have a meta key with the headers to keep?
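
One way to picture that question, as a sketch only: drop `Authorization` when the redirect target is on a different host, unless the caller explicitly lists it as a header to keep. The function `headers_for_redirect` and its `keep` parameter are hypothetical (the `keep` tuple plays the role of the meta key suggested above), and comparing only hostnames is a simplification.

```python
from urllib.parse import urlsplit


def headers_for_redirect(headers, source_url, target_url, keep=()):
    """Return the headers to send to target_url after a redirect.

    `headers` is a plain dict standing in for Request.headers.
    `keep` lists header names that must survive a cross-host redirect.
    """
    source_host = urlsplit(source_url).hostname
    target_host = urlsplit(target_url).hostname
    if source_host == target_host:
        # Same host: nothing sensitive leaks, keep everything.
        return dict(headers)
    keep_lower = {name.lower() for name in keep}
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "authorization" or name.lower() in keep_lower
    }


headers = {"Authorization": "Bearer secret", "Accept": "text/html"}

same_host = headers_for_redirect(
    headers, "http://instagram.com/", "https://instagram.com/"
)
other_host = headers_for_redirect(
    headers, "http://instagram.com/", "https://www.facebook.com/"
)
kept = headers_for_redirect(
    headers,
    "http://instagram.com/",
    "https://www.facebook.com/",
    keep=("Authorization",),
)
```

In this sketch `other_host` loses the `Authorization` header, while `same_host` and `kept` retain it.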

Gallaecio · 2024-05-14T12:10:05Z

GHSA-jm3v-qxmh-hxwv

redapple added the bug, security, and help wanted labels on Sep 15, 2016

Gallaecio closed this as completed May 14, 2024
