pip incorrectly seeing an infinite loop while fetching a URL #827

Closed
schwehr opened this Issue Mar 6, 2013 · 6 comments

5 participants

@schwehr
schwehr commented Mar 6, 2013

pip reports an infinite redirect loop in its log, even though I am able to download the file myself. First I tried pip and saw messages like the following:

pip -v --log install-bitvector.log install BitVector

Could not fetch URL http://RVL4.ecn.purdue.edu/%7ekak/dist/BitVector-2.2.tar.gz?download (from http://pypi.python.org/simple/BitVector/): HTTP Error 301: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Permanently

Will skip URL http://RVL4.ecn.purdue.edu/%7ekak/dist/BitVector-2.2.tar.gz?download when looking for download links for BitVector

Then I tried wget, which fetches the file quickly after following two redirects:

wget http://RVL4.ecn.purdue.edu/%7ekak/dist/BitVector-2.2.tar.gz?download
--2013-03-06 10:40:27-- http://rvl4.ecn.purdue.edu/%7ekak/dist/BitVector-2.2.tar.gz?download
Resolving rvl4.ecn.purdue.edu (rvl4.ecn.purdue.edu)... 128.46.4.72
Connecting to rvl4.ecn.purdue.edu (rvl4.ecn.purdue.edu)|128.46.4.72|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://engineering.purdue.edu/~kak/dist/BitVector-2.2.tar.gz?download [following]
--2013-03-06 10:40:28-- https://engineering.purdue.edu/~kak/dist/BitVector-2.2.tar.gz?download
Resolving engineering.purdue.edu (engineering.purdue.edu)... 128.46.104.5
Connecting to engineering.purdue.edu (engineering.purdue.edu)|128.46.104.5|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://engineering.purdue.edu/kak/dist/BitVector-2.2.tar.gz?download [following]
--2013-03-06 10:40:29-- https://engineering.purdue.edu/kak/dist/BitVector-2.2.tar.gz?download
Reusing existing connection to engineering.purdue.edu:443.
HTTP request sent, awaiting response... 200 OK
Length: 144541 (141K) [application/x-gzip]
Saving to: `BitVector-2.2.tar.gz?download'

@pnasrat
Python Packaging Authority member
pnasrat commented Mar 6, 2013

Can you try with the latest pip RC and see if the behaviour is still present?

@schwehr
schwehr commented Mar 7, 2013

The log entries definitely look better, and it is faster (but not fast). Thanks pnasrat.

virtualenv ve
source ve/bin/activate
time pip -v --log install-bitvector-pip-normal.log install BitVector
real 0m48.904s

pip install pip==dev
time pip -v --log install-bitvector-pip-1.4.dev1.log install BitVector
real 0m23.140s

From install-bitvector-pip-1.4.dev1.log
Downloading/unpacking BitVector

Getting page https://pypi.python.org/simple/BitVector/
URLs to search for versions for BitVector:

@bukzor
bukzor commented May 8, 2013

This is showing up when installing lxml. IMO this look-before-you-leap method of loop detection is inherently flawed. Not only is it quite hard to implement, it will only catch the special case of a single-node redirect loop. All browsers use an ask-forgiveness strategy with a redirect limit of 50, because that's what actually works in the general case of insane redirect graphs.

Downloading/unpacking lxml     
  Getting page https://pypi.python.org/simple/lxml/
  URLs to search for versions for lxml:
  * https://pypi.python.org/simple/lxml/
  Getting page http://codespeak.net/lxml
  Could not fetch URL http://codespeak.net/lxml (from https://pypi.python.org/simple/lxml/): HTTP Error 404: Not Found
  Will skip URL http://codespeak.net/lxml when looking for download links for lxml
  Could not fetch URL http://cheeseshop.python.org/packages/source/l/lxml/lxml-1.3.tar.gz (from https://pypi.python.org/simple/lxml/): HTTP Error 301: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Permanently

Looking at the headers, it's clear that pip only uses the path portion of the URL for this logic. The quick fix is to get pip to notice the difference in the domain portion before assuming a redirect loop.

$ curl --head http://cheeseshop.python.org/packages/source/l/lxml/lxml-1.3.tar.gz
HTTP/1.1 301 Moved Permanently
Server: nginx/1.1.19
Date: Wed, 08 May 2013 20:24:27 GMT
Content-Type: text/html
Content-Length: 185
Location: http://pypi.python.org/packages/source/l/lxml/lxml-1.3.tar.gz
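
To make the difference concrete, here is a minimal sketch of the two comparisons, assuming the loop check behaves the way this comment suspects. The helper names `same_path` and `same_url` are hypothetical and are not taken from pip's source; the two URLs are the ones from the curl output above.

```python
from urllib.parse import urlsplit


def same_path(a, b):
    """Path-only comparison (the behaviour suspected in this comment)."""
    return urlsplit(a).path == urlsplit(b).path


def same_url(a, b):
    """Compare scheme, host, path and query. Host names are case-insensitive."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (pa.scheme, pa.netloc.lower(), pa.path, pa.query) == (
        pb.scheme, pb.netloc.lower(), pb.path, pb.query
    )


src = "http://cheeseshop.python.org/packages/source/l/lxml/lxml-1.3.tar.gz"
dst = "http://pypi.python.org/packages/source/l/lxml/lxml-1.3.tar.gz"

print(same_path(src, dst))  # True: looks like a redirect to itself
print(same_url(src, dst))   # False: the host differs, so this is not a loop
```

With a path-only check the redirect from cheeseshop.python.org to pypi.python.org looks like a redirect to the same place; comparing the full URL tells them apart.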

I imagine that you'll get a similarly false redirect-loop detection if a server redirects from http to https with the same path (or vice versa). pip should really compare the full URLs when looking for a trivial redirect loop.
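
For comparison, the ask-forgiveness strategy mentioned earlier in this comment could be sketched roughly as follows. This is only an illustration, not pip's code: `resolve`, `TooManyRedirects` and the `get_location` callback are hypothetical names, the limit of 50 is the figure quoted above, and the redirect table stands in for real HTTP requests so the example runs offline.

```python
from urllib.parse import urljoin


class TooManyRedirects(Exception):
    """Raised when the hop limit is exceeded (hypothetical name)."""


def resolve(url, get_location, max_redirects=50):
    """Follow redirects until a response has no Location header.

    get_location(url) is an assumed callback that returns the Location
    header for a redirect response, or None for a final response.
    """
    for _ in range(max_redirects + 1):
        location = get_location(url)
        if location is None:
            return url
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise TooManyRedirects("gave up after %d redirects" % max_redirects)


# Redirect chain taken from the wget trace at the top of this issue.
hops = {
    "http://RVL4.ecn.purdue.edu/%7ekak/dist/BitVector-2.2.tar.gz?download":
        "https://engineering.purdue.edu/~kak/dist/BitVector-2.2.tar.gz?download",
    "https://engineering.purdue.edu/~kak/dist/BitVector-2.2.tar.gz?download":
        "https://engineering.purdue.edu/kak/dist/BitVector-2.2.tar.gz?download",
}
start = "http://RVL4.ecn.purdue.edu/%7ekak/dist/BitVector-2.2.tar.gz?download"
print(resolve(start, hops.get))
```

No comparison of URLs is needed at all here: a genuine loop of any length simply runs into the hop limit, while a chain like the one above terminates after two redirects.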

@dstufft
Python Packaging Authority member
dstufft commented Jan 10, 2014

Can you verify if this is still the case with pip 1.5?

@dstufft
Python Packaging Authority member
dstufft commented Jan 30, 2014

I believe this should be fixed now. If anyone still sees this, please comment again and we can reopen the issue.

@dstufft dstufft closed this Jan 30, 2014