Use the resolve cache to skip installs. #815

Merged
jsirois merged 2 commits into pex-tool:master from the issues/811/cache_chroots branch on Nov 27, 2019

Member

@jsirois jsirois commented Nov 26, 2019

Previously the cache was only used by pip for HTTP queries. Now we also
use the cache to skip wheel installs in their isolated chroots.

Fixes #809
Fixes #811
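
For readers skimming the thread, the idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `install_wheel_cached` and the `install` callable are hypothetical names standing in for the pip subprocess that pex runs. The only behavior taken from the PR is that an existing chroot directory means the install is skipped.

```python
import os


def install_wheel_cached(wheel_file_path, chroots_dir, install):
    """Install a wheel into its own chroot unless a cached install exists.

    `install` is a hypothetical callable(wheel_file_path, target_dir) that
    stands in for the pip subprocess; it is not part of pex's API.
    Returns a (chroot, used_cache) tuple.
    """
    # Each wheel gets its own chroot, named after the wheel file itself.
    wheel_file = os.path.basename(wheel_file_path)
    chroot = os.path.join(chroots_dir, wheel_file)

    if os.path.exists(chroot):
        # A previous run already installed this wheel; skip the install.
        return chroot, True

    install(wheel_file_path, chroot)
    return chroot, False
```

Calling this twice with the same wheel should invoke `install` only once, as long as the first call left a directory behind at the chroot path.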

Contributor

@Eric-Arellano Eric-Arellano left a comment

Yay for the speed up!

if os.path.exists(chroot):
  TRACER.log('Using cached installation of {} at {}'.format(wheel_file, chroot))
else:
  TRACER.log('Installing {} in {}'.format(wheel_file_path, chroot))
Contributor

Nit: `in` makes it sound like you are installing it in the chroot directory. Really, the `chroot` variable here is the specific file location, not the parent directory. A simple fix would be to change the preposition from `in` to `at`.

Better, I think, is to rename `chroot` to something like `chrooted_wheel_path`, or another name that more clearly indicates this is the chrooted file location and not the general chrooted directory location.

Member Author

> Nit: `in` makes it sound like you are installing it in the chroot directory.

I am actually. The chroot directory is like a mini-site-packages that will contain exactly one distribution after this operation, the wheel file we install there (and which is exploded + other things by pip in the process). To be super clear hopefully:

$ rm -rf ~/.pex
$ python -mpex requests -- -c ''
$ tree -d -L 2 ~/.pex/build/chroots/
/home/jsirois/.pex/build/chroots/
├── certifi-2019.9.11-py2.py3-none-any.whl
│   ├── certifi
│   └── certifi-2019.9.11.dist-info
├── chardet-3.0.4-py2.py3-none-any.whl
│   ├── bin
│   ├── chardet
│   └── chardet-3.0.4.dist-info
├── idna-2.8-py2.py3-none-any.whl
│   ├── idna
│   └── idna-2.8.dist-info
├── requests-2.22.0-py2.py3-none-any.whl
│   ├── requests
│   └── requests-2.22.0.dist-info
└── urllib3-1.25.7-py2.py3-none-any.whl
    ├── urllib3
    └── urllib3-1.25.7.dist-info

16 directories

In that example, taking one chroot of ~/.pex/build/chroots/requests-2.22.0-py2.py3-none-any.whl, you'll see that installed in it is exactly one distribution, requests-2.22.0.

Member Author

Perhaps it's confusing that the chroot dir basename is the zipped wheel basename? That is a hack that allows PEXEnvironment to activate the correct set of wheels at runtime as explained in the comment up above.

Contributor

What confused me is thinking of chroot as one singular thing and not realizing you are referring to each wheel as having its own chroot. I was conceptualizing chroot to be the parent folder /home/jsirois/.pex/build/chroots/.

Renaming chroot to wheel_chroot would have helped to figure that out.

pex/resolver.py
Comment on lines +258 to +260
environment = Environment(search_path=[chroot])
for dist_project_name in environment:
  resolved_distributions.extend(environment[dist_project_name])
Contributor

Should this go into the TRACER.timed block below? It seems this is only set up so that the environment markers code works properly.

Member Author

Not easily, since it's part of this loop. Also, it's used to get the pkg_resources.Distribution object, which is part of the return type of the function: pex.resolver.ResolvedDistribution.distribution
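
As an aside, the "exactly one distribution per chroot" property is easy to check from outside pex. The sketch below swaps in the standard library's `importlib.metadata` for the `pkg_resources.Environment` used in the PR, so it is not the PR's code path. `distributions_in_chroot` is a hypothetical helper name.

```python
from importlib import metadata


def distributions_in_chroot(chroot):
    """Return sorted (name, version) pairs for distributions found in `chroot`.

    Assumption: `chroot` is a directory laid out like a mini site-packages,
    i.e. it contains `*.dist-info` directories with a METADATA file.
    """
    return sorted(
        (dist.metadata["Name"], dist.version)
        # Restrict discovery to the chroot instead of the default sys.path.
        for dist in metadata.distributions(path=[chroot])
    )
```

Pointed at the `requests-2.22.0-py2.py3-none-any.whl` chroot from the `tree` listing above, this should report a single `requests` entry, assuming the chroot holds only that distribution's `.dist-info` directory.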

@jsirois jsirois merged commit 5b8a9b4 into pex-tool:master Nov 27, 2019
@jsirois jsirois deleted the issues/811/cache_chroots branch November 27, 2019 00:40
jsirois added a commit that referenced this pull request Dec 6, 2019
The three major phases of a pex resolve: download, build and install,
are delegated to pip subprocesses. As such, we can easily parallelize
these operations only needing to take care of shared portions of the
filesystem the processes might mutate. In the end this is only relevant
to the build and install phases which are natural points to cache
results in a shared filesystem cache. The install phase is already
cached in this way (#815) and we add caching for the build phase as
well, both utilizing (posix in the end) guarantees around `os.rename`.

The atomic shared directory updates are achieved with `AtomicDirectory`
and the subprocess parallelization is achieved using the new `jobs`
module and request / response data object pairs to coordinate parallel
pip jobs.

Example speedups:

+ PEX build using prebuilt wheels (the #811 case):
  ```
  pex \
    --platform=manylinux1-x86_64-cp-35-m \
    --platform=manylinux1-x86_64-cp-36-m \
    --platform=manylinux1-x86_64-cp-37-m \
    --platform=macosx-10.9-x86_64-cp-35-m \
    --platform=macosx-10.9-x86_64-cp-36-m \
    --platform=macosx-10.9-x86_64-cp-37-m \
    numpy==1.17.4 \
    --python-shebang "/usr/bin/env python3" \
    -o numpy-pex1.6.2.pex
  ```
  pex version | cold      | warm
  ------------|-----------|----------
  1.6.12      | 0m41.313s | 0m20.759s
  2.0.2       | 0m53.236s | 0m27.217s
  HEAD        | 0m32.336s | 0m17.596s

+ PEX build including sdists (wheel builds):
  ```
  pex \
    pantsbuild.pants==1.22.0 \
    -o pantsbuild.pants.pex
  ```
  pex version | cold      | warm
  ------------|-----------|----------
  1.6.12      | 0m49.101s | 0m11.788s
  2.0.2       | 1m15.602s | 1m2.476s
  HEAD        | 0m47.174s | 0m15.309s

Fixes #811
Fixes #817
Fixes #818
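
The `os.rename`-based atomic update mentioned in the commit message can be sketched as below. This only illustrates the general technique under stated assumptions; it is not pex's `AtomicDirectory`. The function name `atomic_directory` and the `populate` callable are hypothetical, and the sketch assumes the work directory and the target live on the same filesystem.

```python
import errno
import os
import shutil
import tempfile


def atomic_directory(target_dir, populate):
    """Create `target_dir` atomically; return True if this call created it.

    `populate` is a hypothetical callable(work_dir) that fills in the
    directory contents (for example, by running an install into it).
    """
    if os.path.exists(target_dir):
        # Someone already produced the result; treat it as a cache hit.
        return False

    parent = os.path.dirname(os.path.abspath(target_dir))
    os.makedirs(parent, exist_ok=True)
    # Build in a sibling directory so the final rename stays on one filesystem.
    work_dir = tempfile.mkdtemp(dir=parent, prefix=".work.")
    try:
        populate(work_dir)
        try:
            # Readers see either no target or a fully populated one.
            os.rename(work_dir, target_dir)
        except OSError as e:
            # A concurrent writer finished first and left a populated target.
            if e.errno not in (errno.EEXIST, errno.ENOTEMPTY):
                raise
            return False
        return True
    finally:
        # No-op after a successful rename; cleans up after a failure or lost race.
        shutil.rmtree(work_dir, ignore_errors=True)
```

The design choice here is that all slow work happens in a private directory and only the final rename touches the shared location, so parallel processes never observe a half-written install.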