New resolver: performance issue when updating virtualenv with pinned requirements #8675

Closed
sbidoul opened this issue Aug 1, 2020 · 4 comments
Labels
resolution: duplicate Duplicate of an existing issue/PR

Comments

@sbidoul
Member

sbidoul commented Aug 1, 2020

Environment

  • pip version: 20.2
  • Python version: 3
  • OS: linux

Description

A common use case for application developers is updating an existing virtualenv from a list of pinned requirements: pip install -r requirements.txt -e . where requirements.txt is a list of name==version lines, as emitted by pip freeze.

In that use case the new resolver is about 10 to 20 times slower than the legacy resolver. With a large requirements.txt (100 or more lines) the slowdown can become crippling: I have an example that goes from 15 seconds to 4.5 minutes.

Expected behavior

A smaller performance impact of the new resolver.

How to Reproduce

Run the following script (at least twice to warm up the cache).

test.sh
#!/bin/bash

cd "$(mktemp -d)" || exit 1

REQS=requirements.txt

cat <<EOF > $REQS
Babel==2.6.0
chardet==3.0.4
decorator==4.3.0
docutils==0.14
ebaysdk==2.1.5
feedparser==5.2.1
freezegun==0.3.11
gevent==1.1.2 ; sys_platform != 'win32' and python_version < '3.7'
gevent==1.3.7 ; sys_platform != 'win32' and python_version >= '3.7'
gevent==1.4.0 ; sys_platform == 'win32'
greenlet==0.4.10 ; python_version < '3.7'
greenlet==0.4.15 ; python_version >= '3.7'
html2text==2018.1.9
idna==2.6
Jinja2==2.10.1
libsass==0.17.0
lxml==3.7.1 ; sys_platform != 'win32' and python_version < '3.7'
lxml==4.3.2 ; sys_platform != 'win32' and python_version >= '3.7'
lxml ; sys_platform == 'win32'
Mako==1.0.7
MarkupSafe==1.1.0
num2words==0.5.6
ofxparse==0.19
passlib==1.7.1
Pillow==5.4.1 ; python_version < '3.7' or sys_platform != 'win32'
Pillow==6.1.0 ; sys_platform == 'win32' and python_version >= '3.7'
polib==1.1.0
psutil==5.6.6
psycopg2==2.7.7; sys_platform != 'win32' and python_version < '3.8'
psycopg2==2.8.3; sys_platform == 'win32' or python_version >= '3.8'
pydot==1.4.1
python-ldap==3.1.0; sys_platform != 'win32'
pyparsing==2.2.0
PyPDF2==1.26.0
pyserial==3.4
python-dateutil==2.7.3
pytz==2019.1
pyusb==1.0.2
qrcode==6.1
reportlab==3.5.13
requests==2.21.0
zeep==3.2.0
python-stdnum==1.8
vobject==0.9.6.1
Werkzeug==0.16.1
XlsxWriter==1.1.2
xlwt==1.3.*
xlrd==1.1.0
pypiwin32 ; sys_platform == 'win32'
EOF

virtualenv -q venv-legacy-resolver
source venv-legacy-resolver/bin/activate
python -m pip -q install -U pip
python -m pip --version
echo "*** install (legacy resolver)"
time pip -q install -r $REQS
echo
echo "*** reinstall (legacy resolver)"
time pip -q install -r $REQS
echo

virtualenv -q venv-2020-resolver
source venv-2020-resolver/bin/activate
python -m pip -q install -U pip
python -m pip --version
echo "*** install (2020 resolver)"
time pip -q install -r $REQS --use-feature=2020-resolver
echo
echo "*** reinstall (2020 resolver)"
time pip -q install -r $REQS --use-feature=2020-resolver
echo

Output

pip 20.2 from /tmp/tmp.448mAdzPzs/venv-legacy-resolver/lib/python3.8/site-packages/pip (python 3.8)
*** install (legacy resolver)

real	0m10,028s
user	0m7,181s
sys	0m0,709s

*** reinstall (legacy resolver)

real	0m0,535s
user	0m0,492s
sys	0m0,044s

pip 20.2 from /tmp/tmp.448mAdzPzs/venv-2020-resolver/lib/python3.8/site-packages/pip (python 3.8)
*** install (2020 resolver)

real	0m12,452s  ==> new/old ≃ 1.25 (ok)
user	0m9,186s
sys	0m0,620s

*** reinstall (2020 resolver)

real	0m6,642s ==> new/old ≃ 12 (problematic)
user	0m4,783s
sys	0m0,132s
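
The new/old ratios annotated above can be recomputed from the "real" lines. The following is a minimal sketch, not part of the original script: to_seconds and ratio are hypothetical helper names, and it assumes bash `time` values of the form 0m6,642s (a comma or a dot as decimal separator) and an available awk.

```shell
#!/bin/bash

# Hypothetical helper: convert a bash `time` value such as "0m6,642s"
# (or "0m6.642s") to seconds with three decimals.
to_seconds() {
    # Normalise the decimal comma, then split on "m" and "s".
    # LC_ALL=C keeps awk from applying locale-specific number parsing.
    printf '%s\n' "$1" | tr ',' '.' |
        LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Hypothetical helper: print new/old rounded to one decimal.
ratio() {
    LC_ALL=C awk -v new="$(to_seconds "$1")" -v old="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", new / old }'
}

# "real" times from the pip 20.2 run above: 2020 resolver vs legacy resolver.
echo "install:   $(ratio 0m12,452s 0m10,028s)"
echo "reinstall: $(ratio 0m6,642s 0m0,535s)"
```

The first argument to ratio is the 2020-resolver time and the second is the legacy-resolver time, so a value near 1 means the two resolvers perform about the same.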
@minusf

minusf commented Aug 14, 2020

I also see a huge performance degradation compared to the old resolver (pip 20.2.2).

Here is an example of creating a fairly involved mailman3 suite venv (130 packages). Only certain packages are pinned in this case.

make venv  73.22s user 9.65s system 84% cpu 1:37.99 total   <---- new resolver

make venv  29.73s user 8.99s system 79% cpu 48.478 total    <---- old resolver

(mentioning #8664 so it shows up there as well)

Django==3.0.9
django-auth-ldap
django-cors-headers
django-debug-toolbar
django-extensions
django-uwsgi
hyperkitty==1.3.3
mailman
mailman-hyperkitty
postorius
psycopg2-binary==2.8.5
supervisor
uWSGI==2.0.19.1
whoosh

@uranusjr
Member

Would you be able to provide the complete output of the corresponding pip install command? The new resolver does a lot of things that the old resolver should have done but never did, and it is difficult to tell from the input requirements alone whether the performance cost is unavoidable.

@uranusjr
Member

Closing in favour of #8664 so we can better keep discussions in one place.

@uranusjr uranusjr added the resolution: duplicate Duplicate of an existing issue/PR label Aug 14, 2020
@sbidoul
Member Author

sbidoul commented Oct 25, 2020

Output with 20.2.4:

pip 20.2.4 from /tmp/tmp.XWMupjCuZs/venv-legacy-resolver/lib/python3.8/site-packages/pip (python 3.8)
*** install (legacy resolver)

real	0m11,052s
user	0m8,054s
sys	0m0,751s

*** reinstall (legacy resolver)

real	0m0,661s
user	0m0,606s
sys	0m0,057s

pip 20.2.4 from /tmp/tmp.XWMupjCuZs/venv-2020-resolver/lib/python3.8/site-packages/pip (python 3.8)
*** install (2020 resolver)

real	0m14,374s
user	0m11,702s
sys	0m0,698s

*** reinstall (2020 resolver)

real	0m0,709s  ==> ~= legacy resolver (great!)
user	0m0,645s
sys	0m0,066s

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 8, 2021