Environment activation is slow #25555

Closed
adamjstewart opened this issue Aug 22, 2021 · 3 comments · Fixed by #25633

Comments

@adamjstewart
Member

Activating/deactivating Spack environments is incredibly slow:

$ time spack env activate .

real	2m13.037s
user	1m25.584s
sys	0m43.654s
$ time spack env deactivate

real	2m30.974s
user	1m38.090s
sys	0m49.781s

For comparison, for a similarly sized Conda environment:

$ time conda activate azureml_py38

real	0m0.099s
user	0m0.081s
sys	0m0.018s

Unfortunately, pyinstrument doesn't work for bash functions (which are required for env support), and spack --profile doesn't work either.

May be related to #25541, #25306

@adamjstewart
Member Author

This is with a fairly large environment:

spack:
  specs:

  # Shells
  - bash

  # Spack dependencies
  - clingo
  - gnupg
  - graphviz
  - kcov

  # Linux tools
  - watch
  - wget

  # Compression
  - p7zip
  - unrar

  # Software installation tools
  - automake
  - autoconf
  - m4
  - cmake
  - ninja
  - patchelf
  - scons

  # Version control systems
  - mercurial
  - subversion

  # Python libraries
  - python
  - py-azureml-sdk
  - py-black
  - py-cartopy
  - py-cmocean
  - py-codecov
  - py-dask
  - py-fiona
  - py-flake8
  - py-geocube
  - py-geopandas
  - py-geoplot
  - py-inference-schema+numpy
  - py-ipywidgets
  - py-isort+colors
  - py-joblib
  - py-jupyterlab
  - py-matplotlib
  - py-metpy
  - py-mypy
  - py-numpy
  - py-openpyxl
  - py-pandas
  - py-pycocotools
  - py-pygeos
  - py-pyinstrument
  - py-pytest
  - py-pytest-cov
  - py-pytest-mock
  - py-pytorch-sphinx-theme
  - py-pyyaml
  - py-radiant-mlhub
  - py-rarfile
  - py-rasterio
  - py-scikit-learn
  - py-scipy
  - py-seaborn
  - py-setuptools
  - py-shapely
  - py-sphinx
  - py-sphinx-rtd-theme
  - py-sphinxcontrib-programoutput
  - py-statsmodels
  - py-tables
  - py-torch
  - py-torchvision
  - py-twine
  - py-vermin
  - py-wheel
  - py-xarray
  - py-xgboost

  # Research libraries
  - gdal
  - opencv+imgcodecs+python3+tiff+jpeg+png  # no idea why it won't take this from packages.yaml

  concretization: together

@adamjstewart
Member Author

From running spack -d, it looks like the issue is that we are running the code I added in #24095 for every Python package. Coupled with #25306, we are running this recursively for every package.

In the short term, I think we can just cache that result. In the long term, we need to do something differently in that code anyway because distutils is being removed.

@alalazo
Member

alalazo commented Aug 23, 2021

> In the long term, we need to do something differently in that code anyway because distutils is being removed.

I was thinking that it might be difficult to settle on a single strategy for all the Python versions we support and all the Python packages, so maybe we should:

  1. Design an approach that could account for different strategies (distutils + ?)
  2. Have some way to dispatch this call to the right strategy depending on a) Python version in the DAG b) Metadata in the package
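
The two steps above could be sketched roughly as follows. Everything here is an assumption made for illustration: the strategy functions, the `STRATEGIES` registry, the `path_strategy` metadata key, and the version cutoff are hypothetical and do not correspond to anything in Spack. The only real APIs used are `sysconfig.get_path` and the legacy `distutils.sysconfig` helpers.

```python
import sys
import sysconfig


def paths_from_sysconfig():
    """Strategy 1: ask the ``sysconfig`` module."""
    return {
        "include": sysconfig.get_path("include"),
        "purelib": sysconfig.get_path("purelib"),
    }


def paths_from_distutils():
    """Strategy 2: ask legacy distutils (imported lazily, since it is going away)."""
    from distutils import sysconfig as du_sysconfig

    return {
        "include": du_sysconfig.get_python_inc(),
        "purelib": du_sysconfig.get_python_lib(),
    }


# Hypothetical registry of available strategies.
STRATEGIES = {
    "sysconfig": paths_from_sysconfig,
    "distutils": paths_from_distutils,
}

# Illustrative cutoff only; a real policy would need to be decided per version.
SYSCONFIG_MIN_VERSION = (3, 2)


def choose_strategy(python_version, package_metadata=None):
    """Pick a strategy from (a) the Python version and (b) package metadata."""
    metadata = package_metadata or {}
    # (b) An explicit request in the package metadata wins.
    override = metadata.get("path_strategy")
    if override is not None:
        return STRATEGIES[override]
    # (a) Otherwise fall back to a choice based on the Python version.
    if tuple(python_version) >= SYSCONFIG_MIN_VERSION:
        return STRATEGIES["sysconfig"]
    return STRATEGIES["distutils"]


strategy = choose_strategy(sys.version_info[:2])
print(strategy.__name__, sorted(strategy()))
```

Keeping the selection logic in one function means a new strategy only needs a registry entry and, if required, one more branch in `choose_strategy`.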

tgamblin pushed a commit that referenced this issue Aug 26, 2021
This is a direct follow-up to #13557. It caches additional attributes, added in #24095, that are expensive to compute. I had to reopen #25556 as a new PR to invalidate the GitLab CI cache; see #25556 for prior discussion.

### Before

```console
$ time spack env activate .

real	2m13.037s
user	1m25.584s
sys	0m43.654s
$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real	16m3.541s
user	10m28.892s
sys	4m57.816s
$ time spack env deactivate

real	2m30.974s
user	1m38.090s
sys	0m49.781s
```

### After
```console
$ time spack env activate .

real	0m8.937s
user	0m7.323s
sys	0m1.074s
$ time spack env view regenerate
==> Updating view at /Users/Adam/.spack/.spack-env/view

real	2m22.024s
user	1m44.739s
sys	0m30.717s
$ time spack env deactivate

real	0m10.398s
user	0m8.414s
sys	0m1.630s
```
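
For a rough sense of scale, the wall-clock (`real`) times above can be converted to seconds and compared. The snippet below is only arithmetic on the numbers already shown; the helper name `to_seconds` is made up for this example.

```python
import re


def to_seconds(stamp):
    """Convert a `time` stamp such as '2m13.037s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError("unrecognized time stamp: {!r}".format(stamp))
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (before, after) wall-clock times copied from the two listings above.
timings = {
    "spack env activate": ("2m13.037s", "0m8.937s"),
    "spack env view regenerate": ("16m3.541s", "2m22.024s"),
    "spack env deactivate": ("2m30.974s", "0m10.398s"),
}

speedups = {
    command: to_seconds(before) / to_seconds(after)
    for command, (before, after) in timings.items()
}

for command, factor in speedups.items():
    print("{}: {:.1f}x faster".format(command, factor))
# spack env activate: 14.9x faster
# spack env view regenerate: 6.8x faster
# spack env deactivate: 14.5x faster
```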

Fixes #25555
Fixes #25541 

* Speedup environment activation, part 2
* Only query distutils a single time
* Fix KeyError bug
* Make vermin happy
* Manual memoize
* Add comment on cross-compiling
* Use platform-specific include directory
* Fix multiple bugs
* Fix python_inc discrepancy
* Fix import tests