Use python-builds instead of Miniconda #370
Comments
I'd be happy to have a conversation about this. For context, I used to work at Anaconda on conda/package builds/miniconda, and I've also spent a fair amount of time with docker.

I don't think that changing from miniconda to our own internal python build will be a huge win. Miniconda may duplicate some low-level system libraries, but with docker images being very stripped down already, I don't think there's much overlap. This is a guess and I have not verified it with hard numbers. Our own internal python can only be better by as much as eliminating that overlap saves.

However, neither miniconda nor our internal python build is likely to be stripped down for the sake of creating a small docker image. Debug symbols are generally what take up the most space in these things. You can strip them as a step in creating the docker image. The big downside of stripping debug symbols is that it's harder to make sense of things when they crash. Here's some more info: docker-library/php#297

One possible alternative to miniconda is using micromamba to create environments: https://hub.docker.com/r/mambaorg/micromamba

This doesn't imply that the created environments have stripped binaries - they'll probably also be too big. It's a way to avoid installing miniconda, but the packages that are being installed are the same. If you install a basic python environment with this tool, it'll probably be equivalent to the miniconda base environment (minus conda and any dependencies outside of Python). Information around stripping in conda packages: conda-forge/conda-forge.github.io#520

There's an article at https://uwekorn.com/2021/03/01/deploying-conda-environments-in-docker-how-to-do-it-right.html about trimming content. That's pretty specialized and perhaps fragile, since it entails removing so many files based on domain knowledge.

This is also not an argument that you should keep using conda. This is a means of provisioning a python installation, and it is separate from the question of how you install other software in that installation after it has been provisioned. Using micromamba as linked above will not install conda into the created environment unless you explicitly list it for installation.

The idea of binary stripping applies just as well to any internal python build setup, but it's probably easier to use the official docker python images, where extensive effort has gone into trimming them to minimal size: https://hub.docker.com/_/python. Even so, the compressed images aren't much, if any, better than the Miniconda installs (around 300 MB compressed) - https://hub.docker.com/_/python/tags. I'm not sure if those have stripped binaries.
Ran an experiment. Here's a dockerfile to build an image with micromamba that comes out to 310 MB compressed:
env.yaml (the environment spec for micromamba) has:
Here's a dockerfile using micromamba and stripping that gets down to 212 MB:
env.yaml has:
binutils is transient - it provides the strip tool. If you build this, there are a whole lot of noisy warnings from the stripping, because the input file search is pretty heavy-handed and throws a lot of non-binary stuff at it. Just ignore them.
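If the warning noise is a problem, one option is to narrow the file search so that only real binaries are handed to strip. The following is a minimal sketch of that idea in Python, not the script used in the experiment above. It assumes GNU strip from binutils is on the PATH; the function names and the choice of the `--strip-unneeded` flag are illustrative assumptions. Files are selected by checking for the 4-byte ELF magic number, so text files, Python sources, and static archives are skipped.

```python
import os
import subprocess

# Every ELF binary (executable or shared library) starts with these 4 bytes.
ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == ELF_MAGIC
    except OSError:
        # Unreadable files are treated as non-binaries.
        return False


def find_elf_files(root):
    """Yield paths of regular, non-symlink ELF files below `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so each binary is processed only once.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if is_elf(path):
                yield path


def strip_tree(root, strip_cmd="strip"):
    """Run strip on every ELF file below `root`; return the stripped paths.

    Assumes GNU strip is installed. Failures on individual files are
    skipped rather than aborting the whole run.
    """
    stripped = []
    for path in find_elf_files(root):
        result = subprocess.run(
            [strip_cmd, "--strip-unneeded", path],
            capture_output=True,
        )
        if result.returncode == 0:
            stripped.append(path)
    return stripped
```

Calling `strip_tree` on the environment prefix inside the image build would be the intended use. The same trade-off mentioned earlier still applies: stripped binaries are harder to debug when they crash.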
There is a lot of discussion still going on for this. We will make sure to do this the same way it's happening elsewhere, which is still TBD.
Miniconda can be around 400 MB fully installed, and we install it multiple times in most images. There are a couple of alternatives to Miniconda, but we do have an internal python build setup that we want to consider using here. Investigate using it and see how it works. We will have a follow-up ticket to roll this out more fully, which shouldn't happen yet.