Pip silently ignores index pages with unexpected content type #6754

chrahunt · 2019-07-20T18:43:39Z

Environment

pip version: 19.1.1
Python version: 3.7.2
OS: Ubuntu Linux 18.04

Description

As mentioned in #6697, if a package index page configured via --index-url returns an unexpected content type, then pip silently ignores it and displays a generic "No matching distribution found" error message.

Expected behavior

pip should display a warning in the event that the content returned by the package index is ill-formed or not understood.

How to Reproduce

repro.sh

#!/bin/sh
cd $(mktemp -d)
python -m venv .venv
. .venv/bin/activate
pip install --upgrade pip
cat <<EOF > server.py
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('content-type', 'application/xhtml+xml')
        self.end_headers()


if __name__ == '__main__':
    server = ThreadingHTTPServer(('localhost', 11111), Handler)
    server.serve_forever()
EOF
python server.py >/dev/null 2>&1 &
echo ========== run 1 ==========
pip install --index-url=http://localhost:11111/ requests
echo ========== run 2 ==========
pip install --verbose --index-url=http://localhost:11111/ requests

Output

run 1

Looking in indexes: http://localhost:11111/
Collecting requests
  ERROR: Could not find a version that satisfies the requirement requests (from versions: none)
ERROR: No matching distribution found for requests

run 2 (verbose)

Created temporary directory: /tmp/user/1000/pip-ephem-wheel-cache-nz_lnbof
Created temporary directory: /tmp/user/1000/pip-req-tracker-68gx3hw_
Created requirements tracker '/tmp/user/1000/pip-req-tracker-68gx3hw_'
Created temporary directory: /tmp/user/1000/pip-install-3cazkrz9
Looking in indexes: http://localhost:11111/
Collecting requests
  1 location(s) to search for versions of requests:
  * http://localhost:11111/requests/
  Getting page http://localhost:11111/requests/
  Starting new HTTP connection (1): localhost:11111
  http://localhost:11111 "GET /requests/ HTTP/1.1" 200 None
  Skipping page http://localhost:11111/requests/ because the GET request got Content-Type: application/xhtml+xml
  ERROR: Could not find a version that satisfies the requirement requests (from versions: none)
Cleaning up...
Removed build tracker '/tmp/user/1000/pip-req-tracker-68gx3hw_'
ERROR: No matching distribution found for requests
Exception information:
Traceback (most recent call last):
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 178, in main
    status = self.run(options, args)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 352, in run
    resolver.resolve(requirement_set)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 131, in resolve
    self._resolve_one(requirement_set, req)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 294, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 242, in _get_abstract_dist_for
    self.require_hashes
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 282, in prepare_linked_requirement
    req.populate_link(finder, upgrade_allowed, require_hashes)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 198, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/index.py", line 792, in find_requirement
    'No matching distribution found for %s' % req
pip._internal.exceptions.DistributionNotFound: No matching distribution found for requests

Specifically, the issue is that the message

Skipping page http://localhost:11111/requests/ because the GET request got Content-Type: application/xhtml+xml

only shows up with --verbose.
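A quick way to confirm which Content-Type an index actually serves, without relying on pip's verbose output, is to request the page directly. The sketch below is only illustrative: it runs the same kind of server as repro.sh in-process on an OS-chosen port, and the names `XHTMLHandler`, `fetch_content_type`, and `demo` are hypothetical helpers, not part of pip.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class XHTMLHandler(BaseHTTPRequestHandler):
    """Answer every GET with an empty application/xhtml+xml response."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/xhtml+xml")
        self.end_headers()

    def log_message(self, format, *args):
        # Suppress the default per-request logging on stderr.
        pass


def fetch_content_type(url):
    """Return the Content-Type header the server sends for `url`."""
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # so the request goes straight to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=10) as response:
        return response.headers.get("Content-Type", "")


def demo():
    """Start the server on a free port, query it once, then shut it down."""
    # Port 0 asks the OS for any free port (assumption: loopback is available).
    server = ThreadingHTTPServer(("127.0.0.1", 0), XHTMLHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch_content_type("http://127.0.0.1:%d/requests/" % port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Pointing `fetch_content_type` at a real index URL instead of the local server shows whether the page would pass a text/html check.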

deveshks · 2020-04-18T19:01:23Z

Is this just a change from logger.debug to logger.warning? If so, I would like to take it up.

sbidoul · 2020-04-19T09:18:15Z

At first glance this should be a useful improvement, yes. We may want to mention the supported content types in the error message.

deveshks · 2020-04-19T09:22:42Z

Thanks, and it looks like text/html is the only acceptable Content-Type, as per

pip/src/pip/_internal/index/collector.py

Lines 98 to 106 in 97f6390

    
def _ensure_html_header(response):
    # type: (Response) -> None
    """Check the Content-Type header to ensure the response contains HTML.

    Raises `_NotHTML` if the content type is not text/html.
    """
    content_type = response.headers.get("Content-Type", "")
    if not content_type.lower().startswith("text/html"):
        raise _NotHTML(content_type, response.request.method)
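
The change discussed above, logging at warning level and naming the supported content type, could look roughly like the following. This is a minimal standalone sketch, not pip's actual code: `NotHTML`, `ensure_html_header`, `page_is_usable`, the logger name, and the message wording are all assumptions made for illustration.

```python
import logging

# Hypothetical logger name; pip uses its own module-level loggers.
logger = logging.getLogger("index_sketch")


class NotHTML(Exception):
    """Stand-in for the `_NotHTML` exception in the snippet above."""

    def __init__(self, content_type, request_desc):
        super().__init__(content_type, request_desc)
        self.content_type = content_type
        self.request_desc = request_desc


def ensure_html_header(content_type, method="GET"):
    """Raise NotHTML unless the content type starts with text/html."""
    if not content_type.lower().startswith("text/html"):
        raise NotHTML(content_type, method)


def page_is_usable(url, content_type):
    """Return True for HTML pages; otherwise log a warning and return False."""
    try:
        ensure_html_header(content_type)
    except NotHTML as exc:
        # warning() instead of debug() makes the message visible without
        # --verbose; the wording here is illustrative only.
        logger.warning(
            "Skipping page %s because the %s request got Content-Type: %s. "
            "The only supported Content-Type is text/html",
            url,
            exc.request_desc,
            exc.content_type,
        )
        return False
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    page_is_usable("http://localhost:11111/requests/", "application/xhtml+xml")
```

Because the check uses a prefix match, a header such as `text/html; charset=utf-8` still passes, while `application/xhtml+xml` triggers the warning.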

triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Jul 20, 2019

chrahunt mentioned this issue Jul 20, 2019

Fails with PEP 503-compliant XML-serialised HTML5 ("XHTML5") directory indices #6697

Closed

chrahunt added C: download About fetching data from PyPI and other sources state: needs discussion This needs some more discussion type: enhancement Improvements to functionality labels Jul 20, 2019

triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Jul 20, 2019

deveshks mentioned this issue Apr 19, 2020

Warn if package index gets unexpected Content-Type #8083

Merged

pradyunsg closed this as completed in #8083 May 23, 2020

lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 24, 2020

lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020
