Pip silently ignores index pages with unexpected content type #6754

Closed
chrahunt opened this issue Jul 20, 2019 · 3 comments · Fixed by #8083
Labels
auto-locked Outdated issues that have been locked by automation C: download About fetching data from PyPI and other sources state: needs discussion This needs some more discussion type: enhancement Improvements to functionality

Comments

@chrahunt
Member

Environment

  • pip version: 19.1.1
  • Python version: 3.7.2
  • OS: Ubuntu Linux 18.04

Description

As mentioned in #6697, if a package index page configured via --index-url returns an unexpected content type, then pip silently ignores it and displays a generic "No matching distribution found" error message.

Expected behavior

pip should display a warning in the event that the content returned by the package index is ill-formed or not understood.

How to Reproduce

repro.sh
#!/bin/sh
cd $(mktemp -d)
python -m venv .venv
. .venv/bin/activate
pip install --upgrade pip
cat <<EOF > server.py
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('content-type', 'application/xhtml+xml')
        self.end_headers()


if __name__ == '__main__':
    server = ThreadingHTTPServer(('localhost', 11111), Handler)
    server.serve_forever()
EOF
python server.py >/dev/null 2>&1 &
echo ========== run 1 ==========
pip install --index-url=http://localhost:11111/ requests
echo ========== run 2 ==========
pip install --verbose --index-url=http://localhost:11111/ requests

Output

run 1
Looking in indexes: http://localhost:11111/
Collecting requests
  ERROR: Could not find a version that satisfies the requirement requests (from versions: none)
ERROR: No matching distribution found for requests
run 2 (verbose)
Created temporary directory: /tmp/user/1000/pip-ephem-wheel-cache-nz_lnbof
Created temporary directory: /tmp/user/1000/pip-req-tracker-68gx3hw_
Created requirements tracker '/tmp/user/1000/pip-req-tracker-68gx3hw_'
Created temporary directory: /tmp/user/1000/pip-install-3cazkrz9
Looking in indexes: http://localhost:11111/
Collecting requests
  1 location(s) to search for versions of requests:
  * http://localhost:11111/requests/
  Getting page http://localhost:11111/requests/
  Starting new HTTP connection (1): localhost:11111
  http://localhost:11111 "GET /requests/ HTTP/1.1" 200 None
  Skipping page http://localhost:11111/requests/ because the GET request got Content-Type: application/xhtml+xml
  ERROR: Could not find a version that satisfies the requirement requests (from versions: none)
Cleaning up...
Removed build tracker '/tmp/user/1000/pip-req-tracker-68gx3hw_'
ERROR: No matching distribution found for requests
Exception information:
Traceback (most recent call last):
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 178, in main
    status = self.run(options, args)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 352, in run
    resolver.resolve(requirement_set)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 131, in resolve
    self._resolve_one(requirement_set, req)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 294, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/resolve.py", line 242, in _get_abstract_dist_for
    self.require_hashes
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 282, in prepare_linked_requirement
    req.populate_link(finder, upgrade_allowed, require_hashes)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 198, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/tmp/user/1000/tmp.ubtUD2yfZP/.venv/lib/python3.7/site-packages/pip/_internal/index.py", line 792, in find_requirement
    'No matching distribution found for %s' % req
pip._internal.exceptions.DistributionNotFound: No matching distribution found for requests

Specifically, the issue is that the message

Skipping page http://localhost:11111/requests/ because the GET request got Content-Type: application/xhtml+xml

only shows up with --verbose.
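The effect can be reproduced with the standard `logging` module alone. The following is a minimal sketch, not pip's actual logging setup: it assumes the console handler at default verbosity behaves like a handler set to `INFO`, and the helper name `emit_skip_message` and the logger names are made up for illustration.

```python
import io
import logging


def emit_skip_message(level):
    """Log the 'Skipping page' message at `level` and return the text that an
    INFO-level handler actually writes out."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    # Assumption: INFO stands in for pip's default (non-verbose) console level.
    handler.setLevel(logging.INFO)

    # Hypothetical logger name; one logger per level keeps the runs isolated.
    logger = logging.getLogger("sketch.index.%d" % level)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers = [handler]

    logger.log(
        level,
        "Skipping page %s because the %s request got Content-Type: %s",
        "http://localhost:11111/requests/",
        "GET",
        "application/xhtml+xml",
    )
    return stream.getvalue()


# At DEBUG the record is filtered out by the handler; at WARNING it is shown.
debug_output = emit_skip_message(logging.DEBUG)
warning_output = emit_skip_message(logging.WARNING)
```

In this sketch `debug_output` is empty while `warning_output` contains the message, which matches the difference between run 1 and run 2 above.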

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Jul 20, 2019
@chrahunt chrahunt added C: download About fetching data from PyPI and other sources state: needs discussion This needs some more discussion type: enhancement Improvements to functionality labels Jul 20, 2019
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Jul 20, 2019
@deveshks
Contributor

Would the fix for this issue just be a change from logger.debug to logger.warning? If so, I would like to take it up.

@sbidoul
Member

sbidoul commented Apr 19, 2020

At first glance this should be a useful improvement, yes. We may want to mention the supported content types in the error message.

@deveshks
Contributor

Thanks. It looks like text/html is the only accepted Content-Type, as per:

def _ensure_html_header(response):
    # type: (Response) -> None
    """Check the Content-Type header to ensure the response contains HTML.
    Raises `_NotHTML` if the content type is not text/html.
    """
    content_type = response.headers.get("Content-Type", "")
    if not content_type.lower().startswith("text/html"):
        raise _NotHTML(content_type, response.request.method)
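For illustration, here is a rough sketch of how the two suggestions in this thread (log at warning level, and mention the supported content types) could fit together. It is not pip's code: `NotHTML`, `ensure_html_header`, `check_index_page`, and `SUPPORTED_CONTENT_TYPES` are hypothetical stand-ins, and the headers are assumed to be a plain dict rather than a real response object.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: text/html is the only accepted type, as in the check quoted above.
SUPPORTED_CONTENT_TYPES = ("text/html",)


class NotHTML(Exception):
    """Hypothetical stand-in for pip's private `_NotHTML` exception."""

    def __init__(self, content_type, request_desc):
        super().__init__(content_type, request_desc)
        self.content_type = content_type
        self.request_desc = request_desc


def ensure_html_header(headers, method="GET"):
    """Raise NotHTML unless the Content-Type starts with a supported type."""
    # A plain dict lookup is case-sensitive; real HTTP header containers
    # usually are not.
    content_type = headers.get("Content-Type", "")
    # str.startswith accepts a tuple of candidate prefixes.
    if not content_type.lower().startswith(SUPPORTED_CONTENT_TYPES):
        raise NotHTML(content_type, method)


def check_index_page(url, headers):
    """Return True if the page is usable; otherwise warn and return False."""
    try:
        ensure_html_header(headers)
    except NotHTML as exc:
        # Warning level, so the message is visible without --verbose.
        logger.warning(
            "Skipping page %s because the %s request got Content-Type: %s. "
            "Supported Content-Type values: %s",
            url,
            exc.request_desc,
            exc.content_type,
            ", ".join(SUPPORTED_CONTENT_TYPES),
        )
        return False
    return True
```

With the headers from the repro server, `check_index_page("http://localhost:11111/requests/", {"Content-Type": "application/xhtml+xml"})` would log the warning and return `False`, while a `text/html; charset=utf-8` response would pass.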

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 24, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020