Perf regression: pipenv lock creates many full-tree copies of the project #4403

@mjpieters

Issue description

While trying to debug why pipenv update was taking a monumental amount of time, I noticed that the pipenv/resolver.py process was busy copying data into /tmp/reqlib-src[randomvalue] directories. The project includes several GBs of log files, so this was eating up a lot of /tmp disk space and taking a lot of time.

I traced this to the following SetupInfo.from_ireq() lines:

if is_file and not is_vcs and path is not None and os.path.isdir(path):
    target = os.path.join(kwargs["src_dir"], os.path.basename(path))
    shutil.copytree(path, target, symlinks=True)
    ireq.source_dir = target

This creates multiple copies of the source tree (in apparent defiance of the LRU decorator), even though the project is listed just once, with no other entries under [packages]:

[packages]
project-name = {editable = true,path = "."}
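As an aside on the "defiance of the LRU decorator" remark: the sketch below shows one general way a cached function can still repeat work. It is a guess at a possible mechanism, not a diagnosis of pipenv. `functools.lru_cache` keys on the call arguments, so calls that differ in any argument (for example, a fresh temp directory each time) never hit the cache. The function `expensive_copy` and its parameters are hypothetical.

```python
import functools

# Records every call that actually runs the function body (cache misses).
calls = []


@functools.lru_cache(maxsize=None)
def expensive_copy(path, src_dir):
    """Hypothetical stand-in for a cached function that copies a tree."""
    calls.append((path, src_dir))
    return f"{src_dir}/{path}"


expensive_copy("project-name", "/tmp/reqlib-src-a")
expensive_copy("project-name", "/tmp/reqlib-src-a")  # same arguments: cache hit
expensive_copy("project-name", "/tmp/reqlib-src-b")  # new src_dir: cache miss
```

If each dependency were resolved with its own temp directory, a cache keyed this way would not prevent one copy per dependency.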

It appears to create a copy per dependency; we have 104:

$ pipenv graph --json | grep '"key"' | cut -d':' -f2 | sort -u | wc -l
104

and while the temp dirs get cleaned up once the process finishes, towards the end I counted nearly as many temp directories:

$ ls -1ld /tmp/reqlib-src*/project-name | wc -l
85

I'm assuming at this point that had I put in a breakpoint somewhere I'd have seen 104 directories before pipenv completes. (addendum: my later timing tests show that this isn't quite the case, but still close).

While having log files in a project source directory is not ideal, it was not otherwise a problem as setuptools, .gitignore and MANIFEST.in files are configured to ignore the log files. We can't work around the issue with a symlink either as shutil.copytree() is called with symlinks=True.

Even then, nearly 100 copies of the full project is a huge waste of resources.

Expected result

If pipenv must create an isolated environment, it should either attempt to enumerate the source distribution files to copy, or warn or clearly document that a full copy is created of the whole tree. It should then create just one copy.
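To illustrate the "one copy, and only what is needed" behaviour described above, here is a minimal sketch. It is not pipenv's or requirementslib's code: `copy_project_once`, the `_COPIES` cache, and the default ignore patterns are all hypothetical, and a real fix would more likely derive the file list from the source distribution manifest than from glob patterns.

```python
import os
import shutil

# Hypothetical cache: resolved source path -> location of its single copy.
_COPIES = {}


def copy_project_once(path, src_dir, ignore=("*.log", ".git", "__pycache__")):
    """Copy ``path`` into ``src_dir`` at most once, skipping ignored names."""
    key = os.path.realpath(path)
    cached = _COPIES.get(key)
    if cached is not None and os.path.isdir(cached):
        # Reuse the earlier copy instead of duplicating the tree again.
        return cached
    target = os.path.join(src_dir, os.path.basename(key))
    shutil.copytree(
        key,
        target,
        symlinks=True,
        # Names matching these glob patterns are not copied.
        ignore=shutil.ignore_patterns(*ignore),
    )
    _COPIES[key] = target
    return target
```

A second call for the same project returns the first copy's location even when a different `src_dir` is passed, which is the property the resolver would need to avoid one copy per dependency.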

By temporarily removing the log files, pipenv update completed (albeit still slowly) without completely trashing the filesystem.

Actual result

Either pipenv update times out (in pexpect), or you run out of disk space on your temp partition.


Note: I've cut sensitive information out of the --support output; the project dependencies are simply:

install_requires =
    apache-airflow[celery,postgres,redis] >= 1.10.11
    airflow_multi_dagrun
    airflow-prometheus-exporter

$ pipenv --support

Pipenv version: '2020.6.2'

Pipenv location: '/home/ubuntu/.local/lib/python3.6/site-packages/pipenv'

Python location: '/home/ubuntu/miniconda3/envs/project-name/bin/python'

Python installations found:

  • 3.7.4: /home/ubuntu/miniconda3/bin/python3.7m
  • 3.7.4: /home/ubuntu/miniconda3/bin/python3
  • 3.7.4: /home/ubuntu/miniconda3/bin/python3.7
  • 3.6.9: /usr/bin/python3
  • 3.6.9: /usr/bin/python3.6m
  • 3.6.9: /usr/bin/python3.6
  • 2.7.17: /usr/bin/python2
  • 2.7.17: /usr/bin/python2.7

PEP 508 Information:

{'implementation_name': 'cpython',
 'implementation_version': '3.6.10',
 'os_name': 'posix',
 'platform_machine': 'x86_64',
 'platform_python_implementation': 'CPython',
 'platform_release': '5.3.0-1027',
 'platform_system': 'Linux',
 'platform_version': '#29~18.04.1-Ubuntu SMP Mon Jun 22 15:19:42 UTC 2020',
 'python_full_version': '3.6.10',
 'python_version': '3.6',
 'sys_platform': 'linux'}

System environment variables:

  • CONDA_SHLVL
  • LC_ALL
  • LS_COLORS
  • LD_LIBRARY_PATH
  • CONDA_EXE
  • SSH_CONNECTION
  • LESSCLOSE
  • LANG
  • CONDA_PREFIX
  • S_COLORS
  • _CE_M
  • XDG_SESSION_ID
  • USER
  • PWD
  • HOME
  • CONDA_PYTHON_EXE
  • LC_CTYPE
  • LC_TERMINAL
  • SSH_CLIENT
  • TMUX
  • LC_TERMINAL_VERSION
  • XDG_DATA_DIRS
  • COMBILEXDIR
  • _CE_CONDA
  • LADSPA_PATH
  • CONDA_PROMPT_MODIFIER
  • SSH_TTY
  • MAIL
  • TERM
  • SHELL
  • TMUX_PANE
  • SHLVL
  • LOGNAME
  • XDG_RUNTIME_DIR
  • PATH
  • CONDA_DEFAULT_ENV
  • LESSOPEN
  • _
  • OLDPWD
  • PIP_DISABLE_PIP_VERSION_CHECK
  • PYTHONDONTWRITEBYTECODE
  • PIP_SHIMS_BASE_MODULE
  • PIP_PYTHON_PATH
  • PYTHONFINDER_IGNORE_UNSUPPORTED

Pipenv–specific environment variables:

Debug–specific environment variables:

  • PATH: /home/ubuntu/.local/bin:/home/ubuntu/bin:/home/ubuntu/.local/bin:/home/ubuntu/bin:/home/ubuntu/miniconda3/bin:/home/ubuntu/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
  • SHELL: /bin/bash
  • LANG: C.UTF-8
  • PWD: /home/ubuntu/project-name

Contents of Pipfile ('/home/ubuntu/project-name/Pipfile'):

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]
flake8 = "*"
flake8-bugbear = "*"
black = "==19.10b0"
pre-commit = "*"
pytest = "*"
pytest-cov = "*"

[packages]
project-name = {editable = true,path = "."}

[requires]
python_version = "3.6"
