[2020-resolver & fast-deps] BadZipFile when force-reinstalling packages #8701

Closed
akaihola opened this issue Aug 4, 2020 · 6 comments · Fixed by #8716
Labels
  • kind: crash (For situations where pip crashes)
  • state: needs reproducer (Need to reproduce issue)

Comments

@akaihola
Contributor

akaihola commented Aug 4, 2020

Environment

  • pip version: 20.3.dev0 at commit 5a61475
  • Python version: 3.7.7
  • OS: Fedora 30

Description
Force-reinstalling a cached package causes pip to fail if both the 2020-resolver and fast-deps features are enabled (use-feature = 2020-resolver fast-deps).

I saw this behavior with Black, pip itself, wheel and setuptools. The example below is for Black.

If only one of 2020-resolver and fast-deps is enabled, everything works correctly.

Expected behavior
The package should be reinstalled successfully.

How to Reproduce

  1. Clear the pip cache
  2. Create a virtualenv
  3. Update pip to current master
  4. Enable 2020-resolver and fast-deps using pip config set --site global.use-feature "2020-resolver fast-deps"
  5. Install Black with pip
  6. Install Black again using the --force-reinstall option

Output

$ rm ~/.cache/pip -rf
$ mktmpenv
Using base prefix '/usr'
New python executable in /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/python3
Also creating executable in /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/python
Installing setuptools, pip, wheel...done.
virtualenvwrapper.user_scripts creating /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/predeactivate
virtualenvwrapper.user_scripts creating /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/postdeactivate
virtualenvwrapper.user_scripts creating /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/preactivate
virtualenvwrapper.user_scripts creating /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/postactivate
virtualenvwrapper.user_scripts creating /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/bin/get_env_details
This is a temporary environment. It will be deleted when you run 'deactivate'.
$ python -m pip install -U "pip @ https://github.com/pypa/pip/archive/master.zip"
Collecting pip@ https://github.com/pypa/pip/archive/master.zip
  Using cached https://github.com/pypa/pip/archive/master.zip
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: pip
  Building wheel for pip (PEP 517) ... done
  Created wheel for pip: filename=pip-20.3.dev0-py2.py3-none-any.whl size=1503406 sha256=adbf64c5ebbd34c17295a6c96c5e90e5ca9f35714f75001e1c24c971e9232302
  Stored in directory: /tmp/pip-ephem-wheel-cache-b_6p5pik/wheels/cb/71/7f/677ff1340ac636bc57b0dda6815dade9ad667c5c337aa5dc39
Successfully built pip
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 20.2.1
    Uninstalling pip-20.2.1:
      Successfully uninstalled pip-20.2.1
Successfully installed pip-20.3.dev0
$ pip config set --site global.use-feature "2020-resolver fast-deps"                            
Writing to /home/akaihola/.virtualenvs/tmp-32e020b1a10180f/pip.conf
$ pip install black
WARNING: pip is using lazily downloaded wheels using HTTP range requests to obtain dependency information. This experimental feature is enabled through --use-feature=fast-deps and it is not ready for production.
Collecting black
  Obtaining dependency information from black 19.10b0
Collecting typed-ast>=1.4.0
  Obtaining dependency information from typed-ast 1.4.1
Collecting pathspec<1,>=0.6
  Obtaining dependency information from pathspec 0.8.0
Collecting toml>=0.9.4
  Obtaining dependency information from toml 0.10.1
Collecting attrs>=18.1.0
  Obtaining dependency information from attrs 19.3.0
Collecting click>=6.5
  Obtaining dependency information from click 7.1.2
Collecting appdirs
  Obtaining dependency information from appdirs 1.4.4
Collecting regex
  Obtaining dependency information from regex 2020.7.14
Collecting black
  Downloading black-19.10b0-py36-none-any.whl (97 kB)
     |████████████████████████████████| 97 kB 1.2 MB/s
Collecting typed-ast>=1.4.0
  Downloading typed_ast-1.4.1-cp37-cp37m-manylinux1_x86_64.whl (737 kB)
     |████████████████████████████████| 737 kB 2.5 MB/s
Collecting pathspec<1,>=0.6
  Downloading pathspec-0.8.0-py2.py3-none-any.whl (28 kB)
Collecting toml>=0.9.4
  Downloading toml-0.10.1-py2.py3-none-any.whl (19 kB)
Collecting attrs>=18.1.0
  Downloading attrs-19.3.0-py2.py3-none-any.whl (39 kB)
Collecting click>=6.5
  Downloading click-7.1.2-py2.py3-none-any.whl (82 kB)
     |████████████████████████████████| 82 kB 1.1 MB/s
Collecting appdirs
  Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Collecting regex
  Downloading regex-2020.7.14-cp37-cp37m-manylinux2010_x86_64.whl (660 kB)
     |████████████████████████████████| 660 kB 3.8 MB/s
Installing collected packages: typed-ast, toml, regex, pathspec, click, attrs, appdirs, black
Successfully installed appdirs-1.4.4 attrs-19.3.0 black-19.10b0 click-7.1.2 pathspec-0.8.0 regex-2020.7.14 toml-0.10.1 typed-ast-1.4.1
$ pip install --force-reinstall black
WARNING: pip is using lazily downloaded wheels using HTTP range requests to obtain dependency information. This experimental feature is enabled through --use-feature=fast-deps and it is not ready for production.
Collecting black
  Obtaining dependency information from black 19.10b0
ERROR: black has an invalid wheel, could not read 'black-19.10b0.dist-info/LICENSE' file: BadZipFile('Bad magic number for file header')

The final error for the other packages I tried was:

# wheel (no traceback):
ERROR: wheel has an invalid wheel, could not read 'wheel-0.34.2.dist-info/LICENSE.txt' file: BadZipFile('Bad magic number for file header')

# setuptools and pip:
  File "/home/kaiant/.virtualenvs/tmp-c9ddecd5fb033a7/lib/python3.7/site-packages/pip/_internal/network/lazy_wheel.py", line 46, in dist_from_wheel_url
    zip_file = ZipFile(wheel)  # type: ignore
  File "/usr/lib64/python3.7/zipfile.py", line 1258, in __init__
    self._RealGetContents()
  File "/usr/lib64/python3.7/zipfile.py", line 1353, in _RealGetContents
    raise BadZipFile("Bad magic number for central directory")
zipfile.BadZipFile: Bad magic number for central directory
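
For reference, both error messages above come from Python's zipfile module when the bytes at an expected offset do not start with the right signature. The following is a minimal sketch that reproduces the two messages on a tiny in-memory archive; the helper names and the `demo-1.0.dist-info/LICENSE` member are made up for illustration, and it says nothing about why pip received wrong bytes in the first place.

```python
import io
import zipfile

MEMBER = "demo-1.0.dist-info/LICENSE"  # hypothetical member name


def build_zip():
    """Return the bytes of a tiny zip holding one uncompressed member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr(MEMBER, "example license text")
    return buffer.getvalue()


def error_message(data, member=None):
    """Open the zip, optionally read a member, and return the error text or None."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            if member is not None:
                archive.read(member)
    except zipfile.BadZipFile as exc:
        return str(exc)
    return None


good = build_zip()

# The first member's local file header sits at offset 0; wipe its signature.
bad_header = b"\x00\x00\x00\x00" + good[4:]

# The end-of-central-directory record is the last 22 bytes (no archive comment);
# its bytes 16..19 hold the offset of the central directory.
cd_offset = int.from_bytes(good[-6:-2], "little")
bad_central = good[:cd_offset] + b"\x00\x00\x00\x00" + good[cd_offset + 4:]

print(error_message(good, MEMBER))        # intact archive
print(error_message(bad_header, MEMBER))  # fails when the member is read
print(error_message(bad_central))         # fails when the archive is opened
```

The corrupted local header only surfaces when a member is read, which matches the 'could not read ... LICENSE' errors, while the corrupted central directory fails as soon as ZipFile is constructed, which matches the traceback for setuptools and pip.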
@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Aug 4, 2020
@McSinyx
Contributor

McSinyx commented Aug 5, 2020

Thank you very much for reporting this! However, I can reproduce this on neither the latest release 20.2.1 nor the current master ee4371c. Furthermore, there seems to be a typo in the version you listed: 5a61475 is part of GH-6236, which was merged on 2019-02-05 and released as pip 19.0.2.

The traceback you provided from force-reinstalling setuptools and pip suggests that the implementation of wheel access over HTTP range requests got some segment merging wrong, but that would be difficult to debug without the failure occurring. It would be really nice if you could additionally check whether the crash is reproducible on pip 20.2.0 and 20.2.1.
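
To illustrate what "segment merging" refers to here: a lazy reader that fetches a file piece by piece has to remember which byte ranges it already holds, so that a new read only downloads the gaps. The class below is a minimal sketch of that bookkeeping under my own assumptions; `FetchedRanges` and its method are hypothetical names and this is not pip's actual code.

```python
from bisect import bisect_left, bisect_right


class FetchedRanges:
    """Track disjoint, inclusive byte ranges that were already downloaded."""

    def __init__(self):
        self.starts = []  # sorted start offsets of the stored segments
        self.ends = []    # matching inclusive end offsets

    def fetch(self, start, end):
        """Return the gaps inside [start, end], then record the range as held."""
        # Segments left..right-1 are exactly those overlapping [start, end].
        left = bisect_left(self.ends, start)
        right = bisect_right(self.starts, end)

        gaps = []
        cursor = start
        for i in range(left, right):
            if self.starts[i] > cursor:
                gaps.append((cursor, self.starts[i] - 1))
            cursor = max(cursor, self.ends[i] + 1)
        if cursor <= end:
            gaps.append((cursor, end))

        # Collapse the overlapping segments and the request into one segment.
        # Segments that merely touch (e.g. 0-9 and 10-19) are left separate.
        new_start = min([start] + self.starts[left:right])
        new_end = max([end] + self.ends[left:right])
        self.starts[left:right] = [new_start]
        self.ends[left:right] = [new_end]
        return gaps


ranges = FetchedRanges()
ranges.fetch(10, 19)
ranges.fetch(30, 39)
print(ranges.fetch(0, 34))  # only the two uncovered gaps need downloading
```

An off-by-one in this kind of bookkeeping would place downloaded bytes at the wrong offset, which would produce the same "bad magic number" symptoms as in the traceback.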

@pradyunsg pradyunsg added kind: crash For situations where pip crashes state: needs reproducer Need to reproduce issue labels Aug 5, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Aug 5, 2020
@uranusjr
Member

uranusjr commented Aug 5, 2020

Would you be able to locate black-19.10b0-py36-none-any.whl on disk (I believe pip cache returns where the file is stored) and upload it somewhere?

@akaihola
Contributor Author

akaihola commented Aug 5, 2020

$ pip cache info 
Location: /home/akaihola/.cache/pip/wheels
Size: 0 bytes
Number of wheels: 0

It seems wheels aren't stored, but HTTP responses are. I believe I found the correct one in /home/akaihola/.cache/pip/http/ by grepping for black.

Here's a link to a Dropbox folder with two files:

98494 bytes black-wheel.cached.with-features.6a5ee54c682b7d5a1512cdace1238512882aad7b543d3a326417e714
98493 bytes black-wheel.cached.without-features.6a5ee54c682b7d5a1512cdace1238512882aad7b543d3a326417e714
  • The with-features one was copied after doing the initial pip install black when 2020-resolver and fast-deps were enabled.
  • The without-features one was copied after doing the initial pip install black when only 2020-resolver was enabled.

However, it seems that the differences between these files don't matter, because this fails:

$ pip config set --site global.use-feature "2020-resolver"
$ pip install black
$ pip config set --site global.use-feature "2020-resolver fast-deps"
$ pip install --force-reinstall black  # --> FAILS

and this succeeds:

$ pip config set --site global.use-feature "2020-resolver fast-deps"
$ pip install black
$ pip config set --site global.use-feature "2020-resolver"
$ pip install --force-reinstall black  # --> SUCCEEDS

So based on this, the cached response is always written correctly, but 2020-resolver and fast-deps together fail to use it correctly regardless of whether it was written with that setup or not.

@akaihola
Contributor Author

akaihola commented Aug 5, 2020

I also reproduced this on the python:3.8-slim-buster Docker image. I first created this script:

$HOME/pip-issue-8701.sh:

#!/bin/bash

python -m pip install -U "pip @ https://github.com/pypa/pip/archive/master.zip"
pip install black
pip config set --site global.use-feature "2020-resolver fast-deps"
pip install --force-reinstall black

I then ran that in the image using Podman:

$ podman pull python:3.8-slim-buster
Trying to pull docker.io/library/python:3.8-slim-buster...
Getting image source signatures
Copying blob 7a5d07f2fd13 done  
Copying blob ab14b629693d done  
Copying blob bf5952930446 done  
Copying blob 335be2fee6e0 done  
Copying blob 385bb58d08e6 done  
Copying config d4226ee526 done  
Writing manifest to image destination
Storing signatures
d4226ee526c0b442c94ff127a7f719616118452c52e39a0c9351767683b298b0

$ podman run -v $HOME/pip-issue-8701.sh:/pip-issue-8701.sh:Z python:3.8-slim-buster /pip-issue-8701.sh
Collecting pip@ https://github.com/pypa/pip/archive/master.zip
  Downloading https://github.com/pypa/pip/archive/master.zip
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Building wheels for collected packages: pip
  Building wheel for pip (PEP 517): started
  Building wheel for pip (PEP 517): finished with status 'done'
  Created wheel for pip: filename=pip-20.3.dev0-py2.py3-none-any.whl size=1503405 sha256=36cf42b4b26cf33c87bc611d80cb651c0c755c2626fab0ce7076b29ca255ff13
  Stored in directory: /tmp/pip-ephem-wheel-cache-qgjqx2os/wheels/b2/f0/ae/286fb76d950bd0a0d20bcabbda0f56531389ed5030f038f6b9
Successfully built pip
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 20.2
    Uninstalling pip-20.2:
      Successfully uninstalled pip-20.2
Successfully installed pip-20.3.dev0
Collecting black
  Downloading black-19.10b0-py36-none-any.whl (97 kB)
Collecting regex
  Downloading regex-2020.7.14-cp38-cp38-manylinux2010_x86_64.whl (672 kB)
Collecting click>=6.5
  Downloading click-7.1.2-py2.py3-none-any.whl (82 kB)
Collecting attrs>=18.1.0
  Downloading attrs-19.3.0-py2.py3-none-any.whl (39 kB)
Collecting pathspec<1,>=0.6
  Downloading pathspec-0.8.0-py2.py3-none-any.whl (28 kB)
Collecting typed-ast>=1.4.0
  Downloading typed_ast-1.4.1-cp38-cp38-manylinux1_x86_64.whl (768 kB)
Collecting toml>=0.9.4
  Downloading toml-0.10.1-py2.py3-none-any.whl (19 kB)
Collecting appdirs
  Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Installing collected packages: regex, click, attrs, pathspec, typed-ast, toml, appdirs, black
Successfully installed appdirs-1.4.4 attrs-19.3.0 black-19.10b0 click-7.1.2 pathspec-0.8.0 regex-2020.7.14 toml-0.10.1 typed-ast-1.4.1
Writing to /usr/local/pip.conf
WARNING: pip is using lazily downloaded wheels using HTTP range requests to obtain dependency information. This experimental feature is enabled through --use-feature=fast-deps and it is not ready for production.
Collecting black
  Obtaining dependency information from black 19.10b0
ERROR: black has an invalid wheel, could not read 'black-19.10b0.dist-info/LICENSE' file: BadZipFile('Bad magic number for file header')

@McSinyx
Contributor

McSinyx commented Aug 5, 2020

Thank you @akaihola, this is really helpful; I'll investigate this ASAP. Until then, I want to note that the fast-deps feature is a no-op when used with the legacy resolver (i.e. when 2020-resolver is not enabled) and that shallow wheels are not cached (I'm not sure whether caching them is a good idea; I've never really thought about it).

@McSinyx
Contributor

McSinyx commented Aug 5, 2020

The problem seems to be with incorrect (?) caching of range responses (please accept my apologies, @akaihola, for putting you through creating a container for this; I habitually add --no-cache-dir to every command when testing the fast-deps feature and thus didn't catch the error). The following patch would hot-fix the problem and leave caching to be revisited later:

diff --git a/src/pip/_internal/network/lazy_wheel.py b/src/pip/_internal/network/lazy_wheel.py
index 16be0d29..c03eaeb8 100644
--- a/src/pip/_internal/network/lazy_wheel.py
+++ b/src/pip/_internal/network/lazy_wheel.py
@@ -194,7 +194,8 @@ class LazyZipOverHTTP(object):
     def _stream_response(self, start, end, base_headers=HEADERS):
         # type: (int, int, Dict[str, str]) -> Response
         """Return HTTP response to a range request from start to end."""
-        headers = {'Range': 'bytes={}-{}'.format(start, end)}
+        headers = {'Range': 'bytes={}-{}'.format(start, end),
+                   'Cache-Control': 'no-cache'}
         headers.update(base_headers)
         return self._session.get(self._url, headers=headers, stream=True)
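
One plausible way such a failure can arise, sketched here as a toy model only: if a cache keys entries by URL and ignores the Range header, a range request for a wheel that was already downloaded in full gets the whole cached body back instead of the requested slice. `NaiveUrlCache`, `origin`, and the example URL are hypothetical and do not describe pip's real caching layer; treating 'Cache-Control: no-cache' as a plain bypass is also a simplification.

```python
PAYLOAD = bytes(range(256))  # stand-in for the bytes of a wheel


def origin(url, headers):
    """Fake server: honour 'Range: bytes=start-end' (inclusive), else send all."""
    value = headers.get("Range")
    if value is None:
        return PAYLOAD
    start, end = (int(part) for part in value[len("bytes="):].split("-"))
    return PAYLOAD[start:end + 1]


class NaiveUrlCache:
    """Hypothetical cache keyed by URL only, ignoring the Range header."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(url, headers) -> bytes
        self._store = {}

    def get(self, url, headers):
        # Simplification: 'no-cache' skips both lookup and storage.
        bypass = headers.get("Cache-Control") == "no-cache"
        if not bypass and url in self._store:
            return self._store[url]
        body = self._fetch(url, headers)
        if not bypass:
            self._store[url] = body
        return body


def range_headers(start, end, no_cache):
    """Build request headers the way the patch above does when no_cache is set."""
    headers = {"Range": "bytes={}-{}".format(start, end)}
    if no_cache:
        headers["Cache-Control"] = "no-cache"
    return headers


cache = NaiveUrlCache(origin)
url = "https://example.invalid/demo.whl"

cache.get(url, {})  # a plain install stores the full body
stale = cache.get(url, range_headers(200, 209, no_cache=False))
fresh = cache.get(url, range_headers(200, 209, no_cache=True))

print(len(stale), len(fresh))
```

In this model the request without the extra header receives all 256 bytes where 10 were expected, so any reader that trusts the requested offsets would parse the wrong data; with the header added, the request reaches the origin and returns the right slice. This is consistent with the observation earlier in the thread that the failure depends on the wheel having been cached by a previous install.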

bors bot added a commit to duckinator/emanate that referenced this issue Aug 11, 2020
162: Update pip to 20.2.2 r=duckinator a=pyup-bot


This PR updates [pip](https://pypi.org/project/pip) from **20.2.1** to **20.2.2**.



<details>
  <summary>Changelog</summary>
  
  
   ### 20.2.2
   ```
   ===================

Bug Fixes
---------

- Only attempt to use the keyring once and if it fails, don't try again.
  This prevents spamming users with several keyring unlock prompts when they
  cannot unlock or don't want to do so. (`8090 <https://github.com/pypa/pip/issues/8090>`_)
- Fix regression that distributions in system site-packages are not correctly
  found when a virtual environment is configured with ``system-site-packages``
  on. (`8695 <https://github.com/pypa/pip/issues/8695>`_)
- Disable caching for range requests, which causes corrupted wheels
  when pip tries to obtain metadata using the feature ``fast-deps``. (`8701 <https://github.com/pypa/pip/issues/8701>`_, `8716 <https://github.com/pypa/pip/issues/8716>`_)
- Always use UTF-8 to read ``pyvenv.cfg`` to match the built-in ``venv``. (`8717 <https://github.com/pypa/pip/issues/8717>`_)
- 2020 Resolver: Correctly handle marker evaluation in constraints and exclude
  them if their markers do not match the current environment. (`8724 <https://github.com/pypa/pip/issues/8724>`_)
   ```
   
  
</details>


 

<details>
  <summary>Links</summary>
  
  - PyPI: https://pypi.org/project/pip
  - Changelog: https://pyup.io/changelogs/pip/
  - Homepage: https://pip.pypa.io/
</details>



Co-authored-by: pyup-bot <github-bot@pyup.io>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 12, 2021