Lighter docker containers #529

eigenraven · 2021-10-28T17:42:12Z

The current way the containers are set up wastes a lot of resources when setting them up on different machines. There could be a faasm-builder container that builds all of the programs (upload, pool_runner, codegen, etc.), and then they all get copied with their dependencies into small runner containers, without any of the build files. We have to be careful about FS structures and shared library paths, but it should be doable, and possibly automatically tested by checking the executables with ldd to make sure all libraries are available. The faasm-builder container could then act as the cli container for rapid iterative development, with resulting executables bind-mounted into the corresponding runner containers.
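A minimal sketch of the ldd check mentioned above, assuming it runs inside the runner container and that the binary paths are supplied by the caller. The helper names `missing_libraries` and `check_binary` are made up for illustration and are not part of faasm.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on one binary and return its unresolved libraries (hypothetical helper)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)
```

A test could call `check_binary` on each copied executable and fail if any returned list is non-empty.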

Switching from ubuntu to debian, or even something like alpine linux, could also massively reduce the size of containers. With apt-based distros, executing the apt commands within one RUN layer, followed by rm -rf /var/lib/apt/lists/* (and maybe more cache directories etc.), can yield smaller images.

Reference for multi stage builds: https://docs.docker.com/develop/develop-images/multistage-build/

Shillaker · 2021-10-29T07:58:53Z

The base image is almost the builder you mention, as it sets up the CMake build that is then used by the child images such as worker and upload. However, at the moment each of these images inherits from base and builds its respective binaries on top (and hence still needs all the build files and deps). We could convert this relatively easily to the builder approach by adding a few lines to base where it builds all the binaries, then have worker and upload be multi-stage builds which copy from base rather than inherit from it.

We already do the mounting of built binaries in the "dev cluster" setup described in the dev docs.

However, before doing this, it would be good to quantify how much gain we can actually get, as it will inevitably introduce more complexity and fragility in ensuring everything is copied over correctly from base, as well as discrepancies between the runtime distro and the build/development environment. We can then make a call as to whether it's worth it.

> Switching from ubuntu to debian, or even something like alpine linux could also massively reduce the size of containers.

AFAICT from my machine, ubuntu:20.04 is about 70MB, debian:stretch-slim is about 55MB, and alpine:3.14 is about 5MB. Although this is a good saving in itself, as a percentage of our overall image size it may not be much. In addition, that ubuntu image may contain some of the dependencies we'd need to install on top of alpine, so I'm not sure how much it would contribute to the overall size. Would be good to weigh up the pros and cons of saving 20 or 60MB respectively.

I would propose getting a rough estimate of the size of stuff we can remove vs. the size of stuff that has to stay. Stuff we can remove is bloat from the underlying distro image, build files and build/test dependencies, and stuff we can't remove is the runtime dependencies and runtime artifacts like wasm files, and python libs used by the runtime. With this we can then make a call as to whether it's worth it. If it would be going from a 1GB image to a 700MB one, I'm not sure, but if it was from 1GB to 200MB it would be a good call.
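As a rough sketch of that estimate, the two scenarios above can be turned into a saved fraction. The function name `projected_saving` is made up for illustration, and 1GB is treated as 1000MB; real inputs would have to come from measuring the image contents.

```python
def projected_saving(total_mb, removable_mb):
    """Return (slimmed size in MB, fraction of the image saved)."""
    if total_mb <= 0 or not 0 <= removable_mb <= total_mb:
        raise ValueError("removable size must lie between 0 and the total size")
    slim_mb = total_mb - removable_mb
    return slim_mb, removable_mb / total_mb


# The two scenarios from the comment above: 1GB -> 700MB and 1GB -> 200MB.
for removable in (300, 800):
    slim, fraction = projected_saving(1000, removable)
    print(f"1000MB -> {slim:.0f}MB ({fraction:.0%} saved)")
```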

eigenraven · 2022-01-04T15:57:22Z

Some numbers from docker refactors I just did in auto-ndp/faasm@51babc0...f4dfacc (splitting base into base and base-runtime, making worker/upload use base-runtime, small tweaks to other containers)

faabric-base: 900MB -> 600MB
faabric-base-runtime: 180MB
sgx: 200MB -> 17MB (made it a two-phase build, with the second image containing just the sdk folder and no ubuntu at all)
cpp-sysroot: 1.2GB -> 700MB (removing old llvm/clang packages that are not used anymore)
base: unchanged, 1.7GB
base-runtime: 530MB
worker, upload: 1.9GB -> 500MB
cli: 3GB -> 2.3GB
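
For comparison, the relative reductions implied by the before/after figures above can be computed as follows. This is only arithmetic on the reported numbers, with 1GB taken as 1000MB; the `reduction` helper is a name made up for illustration.

```python
# Before/after sizes in MB, copied from the list above (1GB taken as 1000MB).
reported = {
    "faabric-base": (900, 600),
    "sgx": (200, 17),
    "cpp-sysroot": (1200, 700),
    "worker, upload": (1900, 500),
    "cli": (3000, 2300),
}


def reduction(before_mb, after_mb):
    """Fraction by which an image shrank relative to its original size."""
    return (before_mb - after_mb) / before_mb


for image, (before, after) in reported.items():
    print(f"{image}: {before}MB -> {after}MB ({reduction(before, after):.1%} smaller)")
```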

At this point worker and upload are 50% apt dependencies, 50% binaries. There might be a couple of unnecessary libraries installed through apt, and the binaries are super large due to symbol tables; if I didn't care about stack traces, those could be stripped down significantly. I have also copied all binaries (except tests) into both containers atm to speed up docker builds, but only half of them are actually necessary in each container.
