Remove compiler toolchain and python #23
We currently include a full compiler toolchain (gcc, autoconf, automake, etc), as well as Python, in our Docker baseimage used for all Pelias Docker images. This used to be required, since we had several Node.js modules that needed to compile native code. However, since `better-sqlite3` added support for prebuilt binaries in [version 6.0](https://github.com/JoshuaWise/better-sqlite3/releases/tag/v6.0.0), I don't think we have any such modules left. Removing all these packages reduces the uncompressed baseimage size from 243MB to 230MB according to Google's [container-diff](https://github.com/GoogleContainerTools/container-diff). That's not huge, but it's not nothing either and if it's free, then why not?
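The size comparison above could also be reproduced with plain Docker commands. This is a minimal sketch, assuming both image tags already exist locally. The `pelias/baseimage:master` tag and both helper function names are hypothetical, and `docker image inspect` reports sizes in bytes, so the numbers may not match `container-diff` output exactly:

```bash
#!/bin/bash

# Print the difference between two byte counts in whole megabytes.
size_savings_mb() {
  local before_bytes=$1 after_bytes=$2
  echo $(( (before_bytes - after_bytes) / 1000000 ))
}

# Look up the uncompressed size of a local image, in bytes.
image_size_bytes() {
  docker image inspect --format '{{.Size}}' "$1"
}

# Example (hypothetical tags, requires a running Docker daemon):
#   size_savings_mb \
#     "$(image_size_bytes pelias/baseimage:master)" \
#     "$(image_size_bytes pelias/baseimage:remove-compiler-toolchain)"
```

With the figures quoted above, `size_savings_mb 243000000 230000000` prints `13`.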
I put together a little script to test that all our Docker images will still work with this change. From a directory with all the Pelias repositories checked out, run the following:

```bash
#!/bin/bash
set -euo pipefail

for i in api placeholder pip-service whosonfirst geonames csv-importer openaddresses openstreetmap schema docker-libpostal_baseimage; do
  pushd "$i"
  # Get a clean, up-to-date master (ignore failures if there is nothing to stash)
  git stash || true
  git checkout master || true
  git pull
  # Point the Dockerfile at the experimental baseimage tag, then build
  sed -i 's/FROM pelias\/baseimage/FROM pelias\/baseimage:remove-compiler-toolchain/' Dockerfile
  docker build . -t "pelias/$i:remove-compiler-toolchain"
  popd
done
```
All the tested Docker images build fine with this change, so I don't think it will break anything! Edit: the |
In pelias/docker-baseimage#23 we're leaning out our Docker baseimage used by all other Pelias images, and hopefully can remove the compiler toolchain altogether. The Polylines Docker images _do_ need `gcc` (but not quite a full compiler toolchain), but didn't follow the convention in our other Dockerfiles of having an `apt-get` step to install it. This adds such a step, and is a little clever in installing `gcc` only temporarily, and only for the `go get` step that requires it. The polylines Docker image is already quite large (950MB uncompressed) since it includes Node.js, Go, and package dependencies for both. Skipping the installation of `gcc` cuts out 120MB of that. Until pelias/docker-baseimage#23 is merged, this change won't really have any impact on the size or operation of this Docker image.
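The temporary `gcc` install described above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the exact commands from the PR: the function name and the `APT_GET` override (useful for a dry run) are hypothetical. In a Dockerfile the same sequence would need to live in a single `RUN` instruction, because files removed in a later layer still count toward the image size.

```bash
#!/bin/bash

# Install gcc, run the given build command, then remove gcc again.
# APT_GET is a hypothetical override; it defaults to the real apt-get.
with_temporary_gcc() {
  local apt_get=${APT_GET:-apt-get}
  $apt_get update &&
    $apt_get install -y gcc &&
    "$@" &&
    $apt_get purge -y gcc &&
    $apt_get autoremove -y
}

# Example (hypothetical build command, run as root inside the image build):
#   with_temporary_gcc go get github.com/missinglink/pbf
```

Setting `APT_GET='echo apt-get'` prints the package commands instead of running them, which is a cheap way to check the ordering.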
After `better-sqlite3` added support for pre-compiled binaries in https://github.com/JoshuaWise/better-sqlite3/releases/tag/v6.0.0, we no longer need to install a compiler toolchain to run `npm install` in our Docker images. pelias/docker-baseimage#23 is working on removing the compiler toolchain from our Pelias baseimages. In order for the toolchain to be removed from the whosonfirst image in particular, we also need to remove those dependencies here. Until that PR is merged, this change is effectively a no-op. After, between the two PRs we reduce the size of the whosonfirst docker image from 490MB to 261MB, an impressive 229MB savings!
After better-sqlite3 added support for [pre-compiled binaries](https://github.com/JoshuaWise/better-sqlite3/releases/tag/v6.0.0), we no longer need to install a compiler toolchain to run npm install in our Docker images. pelias/docker-baseimage#23 removes the compiler toolchain from our Pelias baseimages. In order for the toolchain to be removed from the Placeholder image in particular, we also need to remove those dependencies here. Similar to the whosonfirst repository in pelias/whosonfirst#532, this change by itself is effectively a no-op. After the baseimage removes the compiler toolchain, the size of the Placeholder docker image goes from 495MB to 266MB, an impressive 229MB savings!
The polylines Docker image is a bit of a large one currently, as it includes not just Node.js and a `node_modules` directory, but a full compiler toolchain, an install of the Go language, the dependencies of the `pbf` repository from https://github.com/missinglink/pbf, and the final `pbf` executable that comes from it. All told, this brought the total image size to a whopping 950MB uncompressed. This PR makes use of multi stage builds to run the compiling of the `pbf` executable in a separate container. After this, all the toolchain and dependencies needed can be thrown away, and only the small executable copied to the final image. Using `container-diff` it looks like the image size, uncompressed, after pelias/docker-baseimage#23 as well, will be only 322MB. That's a nice 600MB savings! Before pelias/docker-baseimage#23 the image size still drops to 500MB, still a healthy reduction. Replaces #262
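The multi-stage approach described above can be sketched as follows. This is not the Dockerfile from the PR: the `golang` builder image, the `go install ...@latest` invocation, the `/usr/local/bin/pbf` destination, and the function name are all assumptions for illustration. Only the `pbf` repository and the `pelias/baseimage` name come from the text.

```bash
#!/bin/bash

# Print a two-stage Dockerfile: the first stage compiles pbf with the Go
# toolchain, the second copies only the resulting executable.
print_multistage_dockerfile() {
  cat <<'EOF'
# Stage 1: build the pbf executable where the Go toolchain and gcc exist.
FROM golang AS builder
RUN go install github.com/missinglink/pbf@latest

# Stage 2: final image, with no compiler or Go installation.
FROM pelias/baseimage
COPY --from=builder /go/bin/pbf /usr/local/bin/pbf
EOF
}

# Example (requires Docker; the image tag is hypothetical):
#   print_multistage_dockerfile | docker build -t pelias/polylines:multistage -f - .
```

Everything installed in the `builder` stage is discarded when the build finishes, so only the layers of the second stage contribute to the final image size.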
…encies After pelias/docker-baseimage#23, we will no longer have a compiler toolchain in our Docker baseimage. However, due to the way Docker images work and build upon each other, the biggest wins come from ensuring we don't have a compiler toolchain _anywhere_ in our images. If you think about it, even a single image having a compiler toolchain is the same as the baseimage having it, at least when comparing the total size of all our images. Thankfully, with multistage builds we can easily remove both the C++ compiler toolchain and Golang buildtime dependencies in the libpostal service, similar to pelias/polylines#263. This alone drops the total image size for the libpostal-service from 3.2GB to 2.8GB. Further improvements are possible in the libpostal baseimage.
With pelias/docker-baseimage#23 removing the C++ compiler toolchain from our baseimages, ideally we will remove build time dependencies everywhere to save space on disk and over the network. The libpostal baseimage does require a compiler toolchain to build libpostal, but not to run it. So we can use a multistage image where libpostal is compiled with all its dependencies, but only the build artefacts are kept in the final image. That saves a few hundred MB, but the libpostal GitHub repository is also about 80MB, so we get even more savings there.
I'm going to hit merge on this since it seems safe, but I won't explicitly bump all the end projects to rebuild on this new baseimage just yet. The highly active repositories like API and Placeholder will get this quickly, but I'll take care of all the others after hopefully updating the baseimage to Node.js 16 soon.
The [node-postal](https://github.com/openvenues/node-postal) NPM module requires a full C++ compiler toolchain _and_ python3 to install. After pelias/docker-baseimage#23 and pelias/docker-libpostal_baseimage#5 this toolchain is no longer present in our Docker baseimage. This PR uses a Docker multi-stage build to build _just_ the NPM modules required by the interpolation service while a C++ toolchain is present. The `node_modules` directory can then be copied to the final image without needing a C++ toolchain or python to be present. In addition to saving some space in the final image, this fixes issues people were having with our Docker images, since `node-postal` wasn't functional. Fixes pelias/docker#271
Note that some of our Docker images, like whosonfirst and placeholder, install some of these same compiler toolchain dependencies themselves, so we'll have to remove them there too.
Closes #21