pip packages not getting cached #25

Open
PLPeeters opened this issue Mar 28, 2018 · 9 comments

Comments

@PLPeeters

As things stand, the pip packages are not getting cached, which often adds unnecessary build time since the packages don't change often. It would be better to run pip install first thing from the Dockerfile instead of in a script to leverage Docker's caching mechanism, as explained here: https://www.aptible.com/documentation/enclave/tutorials/faq/dockerfile-caching/pip-dockerfile-caching.html
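
For illustration, here is a minimal sketch of the ordering being asked for. It assumes a plain `python:3` base image and a top-level `requirements.txt`, neither of which comes from this project; the onbuild image in this repository is laid out differently. The script only writes the Dockerfile into a scratch directory so it can be inspected.

```shell
# Write a hypothetical Dockerfile into a scratch directory.
sketch_dir=$(mktemp -d)
cat > "$sketch_dir/Dockerfile" <<'EOF'
# Assumed base image, not the mod-wsgi-docker onbuild image.
FROM python:3
WORKDIR /app

# Copy only the dependency list first. This layer and the pip layer
# below are rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes to application code invalidate the cache from here down only.
COPY . .
EOF

# To try it, copy the file next to your code and run:
#   docker build -t myapp .
echo "wrote $sketch_dir/Dockerfile"
```

The `--no-cache-dir` flag keeps pip's own download cache out of the layer, so the only thing being cached is the Docker layer itself.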

@GrahamDumpleton
Owner

Packages are not cached because doing that results in them bloating out the image, as they would live in a lower layer. Deleting them in a higher layer doesn't free up any space; the image is still just as fat.

For speeding up build times, you are better off relying on creating a Python wheelhouse which contains wheel versions of all packages that you require. The wheelhouse directory can then be injected into a build in some way and packages installed from it, with fallback to PyPI if necessary. The wheelhouse directory is then deleted in the same layer when done, to avoid bloating the image.
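
A rough sketch of that workflow with plain pip, not taken from this project's build scripts. The function names and the directory arguments are hypothetical; `pip wheel -w` and `pip install --find-links` are standard pip options.

```shell
# Build wheels for everything in a requirements file into a directory.
# $1: requirements file, $2: wheelhouse directory
build_wheelhouse() {
  requirements=$1
  wheelhouse=$2
  mkdir -p "$wheelhouse"
  pip wheel -r "$requirements" -w "$wheelhouse"
}

# Install with the wheelhouse as an extra source of packages. The default
# index (PyPI) stays available as a fallback; add --no-index to forbid it.
# The wheelhouse is removed only if the install succeeds.
install_from_wheelhouse() {
  requirements=$1
  wheelhouse=$2
  pip install --no-cache-dir --find-links "$wheelhouse" -r "$requirements" &&
    rm -rf "$wheelhouse"
}

# Example (not run here):
#   build_wheelhouse requirements.txt ./wheelhouse
#   install_from_wheelhouse requirements.txt ./wheelhouse
```

Inside a Dockerfile, the install and the `rm -rf` would need to happen within a single RUN instruction for the deletion to actually save space.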

@GrahamDumpleton
Owner

GrahamDumpleton commented Mar 28, 2018

FWIW, Glyph has posted on this topic before at:

and I have also posted about it as well.

The newer versions of docker images for Python I had been working on incorporated support for using a Python wheelhouse.

@PLPeeters
Author

I should have chosen a better title; what I mean is that pip packages are not being installed in their own RUN command, which from what I understand effectively prevents Docker from caching the would-be pip install layer. So I don't mean caching the packages in the image itself, which would indeed cause unnecessary bloating.

I'll check out the wheelhouse idea though, seems like an interesting workaround!

@GrahamDumpleton
Owner

In the general case, the other issue is that a requirements.txt file can list a local directory from which to install a package. This could even be the application code itself, as some people like to create a package from their code. In that case the application code has to already be in the image before pip is run. So the order in which things are done is also based on providing one generic solution that works in all cases.
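
To make that case concrete: a requirements.txt might contain a line such as `-e .` or `./mypackage`, which pip can only resolve if the code is already present. The helper below is a rough heuristic for spotting such lines; the function name is made up for illustration and the pattern does not cover every form pip accepts.

```shell
# Succeed if the requirements file has a line that points at a local path:
# one starting with ".", "/" or "file:", optionally after -e/--editable.
# $1: path to a requirements file
has_local_requirement() {
  grep -Eq '^[[:space:]]*((-e|--editable)[[:space:]]+)?(\.|/|file:)' "$1"
}

# Example (not run here):
#   if has_local_requirement requirements.txt; then
#     echo "pip needs the application code to be in the image first"
#   fi
```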

@PLPeeters
Author

That does make sense, although if I'm not mistaken you could probably use a build argument that makes the pip install part run after copying the code to the image (or the other way around, depending on what you want to be the default).
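
A sketch of what such a build argument could look like, assuming a plain `python:3` base image rather than this project's onbuild image. The argument name `INSTALL_BEFORE_COPY` is invented for this example. The script only writes the Dockerfile to a scratch directory.

```shell
# Write a hypothetical Dockerfile that picks the install order via ARG.
sketch_dir=$(mktemp -d)
cat > "$sketch_dir/Dockerfile" <<'EOF'
# Assumed base image, not the mod-wsgi-docker onbuild image.
FROM python:3
WORKDIR /app

# 1 (default): install before copying the code, so the pip layer is cached.
# 0: install after copying, for requirements that refer to local paths.
ARG INSTALL_BEFORE_COPY=1

COPY requirements.txt .
RUN if [ "$INSTALL_BEFORE_COPY" = "1" ]; then pip install --no-cache-dir -r requirements.txt; fi

COPY . .
RUN if [ "$INSTALL_BEFORE_COPY" != "1" ]; then pip install --no-cache-dir -r requirements.txt; fi
EOF

# To switch to the late install:
#   docker build --build-arg INSTALL_BEFORE_COPY=0 -t myapp .
echo "wrote $sketch_dir/Dockerfile"
```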

@PLPeeters
Author

So I tried the wheelhouse approach and I'm running into some issues, so I must have done something wrong somewhere.

I created a .whiskey/wheelhouse directory and ran pip wheel -r ../../requirements.txt from there. I then tried running a build and got the following error:

Sending build context to Docker daemon  58.58MB
Step 1/31 : FROM grahamdumpleton/mod-wsgi-docker:python-2.7-onbuild
# Executing 2 build triggers
 ---> Running in 72e029677ad3
 -----> Detected wheelhouse for pip
 -----> Installing dependencies with pip
The command '/bin/sh -c mod_wsgi-docker-build' returned a non-zero code: 137
Docker build failed. Aborting.

Any clues?

@GrahamDumpleton
Owner

When those images were originally written, the concept of build arguments didn't exist in docker.

As to trying to do the wheelhouse, where are you trying to do that? That image probably doesn't have a new enough pip and also likely lacks the wheel package. I don't recollect ever using it to test wheelhouse builds.

Because Docker Inc blocked me from being able to build that image any more on Docker Hub using automated builds, it has been neglected. The intent was to replace it with a newer image, done differently, that could be built using automated builds, but I have had next to no interest from the Python community in all the work I have been doing on creating better docker images for use with Python, so there has been little incentive.

@PLPeeters
Author

The commands above were run on my local machine. The image I'm running has pip 9.0.1. I'm not sure what I did, but it suddenly worked. I did rebuild my wheelhouse from inside the image instead of from my local machine in order to have the correct wheels.
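
For anyone trying the same thing, building the wheels inside the image can be sketched roughly as below. The function name, the `/src` mount point and the image argument are placeholders, and the sketch assumes `pip` is on the image's PATH; `--entrypoint` is used so that whatever entrypoint the image defines does not get in the way.

```shell
# Build wheels inside the target image so they match its Python and
# system libraries. The project directory must be an absolute path.
# $1: image name, $2: project directory containing requirements.txt
build_wheelhouse_in_image() {
  image=$1
  project_dir=$2
  docker run --rm \
    --entrypoint pip \
    -v "$project_dir":/src \
    -w /src \
    "$image" \
    wheel -r requirements.txt -w .whiskey/wheelhouse
}

# Example (not run here):
#   build_wheelhouse_in_image some/image "$PWD"
```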

@PLPeeters
Author

Okay so even when I add RUN rm -rf .whiskey/wheelhouse at the top of my Dockerfile, the wheelhouse seems to remain in the image somewhere because it's 100 MB larger than when I don't include the wheelhouse... I checked your scripts though, and nothing seems to copy it anywhere, so I'm a bit confused... Any ideas?
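
A likely explanation, going by the earlier comment about layers: the build log above shows the onbuild triggers executing at the FROM step, so the wheelhouse is already committed to a layer before any RUN in the derived Dockerfile executes, and a later `rm -rf` only hides the files. One way to check where the extra size sits is to list per-layer sizes with `docker history`. The function name below is made up for illustration.

```shell
# Print the size and creating command of each layer in an image.
# $1: image name or ID
layer_sizes() {
  docker history --no-trunc --format '{{.Size}} {{.CreatedBy}}' "$1"
}

# Example (not run here):
#   layer_sizes myapp
```

If the large layer turns out to be the one created by the onbuild triggers, the deletion would have to happen inside that same step to have any effect on image size.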
