The following code ignores user:password:
    yield Request(link, meta={'proxy': 'http://user:password@ip:port'})
The problem can be worked around by setting the "Proxy-Authorization" header with base64-encoded credentials, but it would be better to handle this inside Scrapy.
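For reference, the workaround mentioned above can be sketched roughly as follows. The helper name basic_proxy_auth is hypothetical (it is not part of Scrapy), and UTF-8 encoding of the credentials is an assumption; the Scrapy usage is shown only as a comment so the snippet runs with the standard library alone.

```python
import base64


def basic_proxy_auth(user: str, password: str) -> bytes:
    """Build a value for the Proxy-Authorization header (HTTP Basic scheme).

    Hypothetical helper, not a Scrapy API. Encoding the credentials as
    UTF-8 is an assumption; some proxies may expect ISO-8859-1.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return b"Basic " + token


# Possible usage inside a spider callback (requires Scrapy):
#
# yield Request(
#     link,
#     meta={"proxy": "http://ip:port"},  # proxy URL without credentials
#     headers={"Proxy-Authorization": basic_proxy_auth("user", "password")},
# )
```

With this approach the credentials travel in the header rather than in the proxy URL, which is what the downloader currently expects.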
Thanks @vezunch1k for reporting.
I can indeed reproduce this.
HttpProxyMiddleware does not touch outgoing requests if they already have the "proxy" key set in the meta dict. In particular, it does not set the Proxy-Authorization header.
And Scrapy's downloader Agent only uses the "host" part of the proxy URL, and ignores credentials that may be there, assuming Proxy-Authorization is already there if it's needed:
    proxy = request.meta.get('proxy')
    if proxy:
        _, _, proxyHost, proxyPort, proxyParams = _parse(proxy)
        scheme = _parse(request.url)[0]
        proxyHost = to_unicode(proxyHost)
        omitConnectTunnel = b'noconnect' in proxyParams
        if scheme == b'https' and not omitConnectTunnel:
            proxyConf = (proxyHost, proxyPort,
                         request.headers.get(b'Proxy-Authorization', None))
            return self._TunnelingAgent(reactor, proxyConf,
                contextFactory=self._contextFactory, connectTimeout=timeout,
                bindAddress=bindaddress, pool=self._pool)
Proxy credentials in the proxy URL are correctly processed by HttpProxyMiddleware when the http(s)_proxy environment variables are used,
so it makes sense to me to handle them as well when the "proxy" key is set directly.
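A minimal sketch of what such handling might look like: split the credentials out of the proxy URL, and return a credential-free URL plus a Proxy-Authorization value. This is not Scrapy's actual implementation; the function name split_proxy_credentials is hypothetical, and the UTF-8 encoding and percent-decoding of the credentials are assumptions.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def split_proxy_credentials(proxy_url):
    """Return (proxy_url_without_credentials, auth_header_value_or_None).

    Hypothetical helper, not part of Scrapy.
    """
    parts = urlsplit(proxy_url)
    if parts.username is None:
        # No credentials embedded: leave the URL untouched.
        return proxy_url, None

    # Credentials in URLs may be percent-encoded (assumption: decode them).
    user = unquote(parts.username)
    password = unquote(parts.password or "")

    # Keep only the host[:port] part that follows the last "@".
    netloc = parts.netloc.rpartition("@")[2]
    clean_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )

    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return clean_url, b"Basic " + token


clean_url, auth = split_proxy_credentials("http://user:password@10.0.0.1:8080")
# clean_url could then go into request.meta['proxy'] and auth into the
# Proxy-Authorization header, if the middleware were extended this way.
```

A middleware doing this for requests that set the "proxy" key themselves would give them the same behavior as proxies configured through environment variables.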