Multi-stage docker build of Haskell webapp
Since Docker 17.05, there is support for multi-stage builds.
This is an example and tutorial for using it to build simple Haskell webapp.
The setup is simple: single
Dockerfile, yet the resulting docker image
is only megabytes large.
Essentially, I read through Best practices for writing Dockerfiles
and made a
A word of warning: If you think Nix is the tool to use, I'm fine with that.
But there's nothing for you in this tutorial.
This is an opinionated setup: Ubuntu and
cabal-install's nix-style build.
Also all non-Haskell dependencies are assumed to be avaliable for install
apt-get. If that's not true in your case, maybe you should
The files are on GitHub: phadej/docker-haskell-example. I refer to files by names, not paste them here.
Assuming you have docker tools installed, there are seven steps to build a docker image:
Write your application. Any web-app would do. The assumptions are that the app
- listens at port 8000
- doesn't daemonize itself
cabal.projectcontaining at least
index-state: 2019-06-17T09:52:09Z with-compiler: ghc-8.4.4 packages: .
index-statemakes builds reproducible enough
with-compilerselect the compiler so it's not the default
.dockerignore. The more stuff you can ignore, the better. Less things to copy to docker build context. Less things would invalidate docker cache. Especially hidden files are not hidden from docker, like editors' temporary files. I hide
.gitdirectory. If you want to burn git-hash look at known issues section.
docker.cabal.configis used in
Dockerfile. In most cases you don't need to edit
Dockerfile. You need, if you need some additional system dependencies. The next step will tell, if you need something.
Build an image with
docker build --build-arg EXECUTABLE=docker-haskell-example --tag docker-haskell-example:latest .
If it fails, due missing library, see next section. You'll need to edit
Dockerfile, and iterate until you get a successful build.
After successful build, you can run the container locally
docker run -ti --publish 8000:8000 docker-haskell-example:latest
This step is important, to test that all runtime dependencies are there.
And try it from another terminal
curl -D - localhost:8000
It should respond something like:
HTTP/1.1 200 OK Transfer-Encoding: chunked Date: Thu, 04 Jul 2019 16:15:37 GMT Server: Warp/3.2.27 Content-Type: application/json;charset=utf-8 ["hello","world"]
What happens in the Dockerfile
Dockerfile is written with a monorepo setup in mind. In other words
setup, where you could build different docker images from a single repository.
That explains the
--build-arg EXECUTABLE= in a
docker build command.
It's also has some comments explaining why particular steps are done.
There are two stages in the
Dockerfile, builder and deployment.
In builder stage we install all build dependencies,
separating them into different
RUNs, so we could avoid
cache invalidation as much as possible.
A general rule: Often changing things have to installed latter.
We install few dependencies from Ubuntu's package repositories. That list is something you'll need to edit once in a while. The assumption is that all non-Haskell stuff comes from there (or some PPA). There's also a corresponding list in deployment stage, there we install only non-dev versions.
# More dependencies, all the -dev libraries # - some basic collection of often needed libs # - also some dev tools RUN apt-get -yq --no-install-suggests --no-install-recommends install \ build-essential \ ca-certificates \ curl \ git \ libgmp-dev \ liblapack-dev \ liblzma-dev \ libpq-dev \ libyaml-dev \ netbase \ openssh-client \ pkg-config \ zlib1g-dev
At some point we reach a point, where we add
*.cabal file. This is something
you might need to edit as well, if you have multiple cabal files in different
# Add a .cabal file to build environment # - it's enough to build dependencies COPY *.cabal cabal.project /build/
We only add these, so we can build dependencies.
# Build package dependencies first # - beware of https://github.com/haskell/cabal/issues/6106 RUN cabal v2-build -v1 --dependencies-only all
and their cache
won't be violated by changes in the actual implementation of the webapp.
This is common idiom in
Issue 6106 might
be triggered if you vendor some dependencies. In that case
change the build command to
RUN cabal v2-build -v1 --dependencies-only some-dependencies
listing as many dependencies (e.g.
warp) as possible.
After dependencies are built, the rest of the source files are added
and the executables are built, stripped, and moved to known location
The deployment image is slick. We pay attention and don't install
development dependencies anymore. In other words we install only runtime dependencies.
E.g. we install
I also tend to install
curl and some other cli tool to help debugging.
In deployment environments where you can shell into the running containers, it
helps if there's something you can do. That feature is useful to debug network
problems for example.
The resulting image is not the smallest possible, but it's not huge either:
REPOSITORY TAG SIZE docker-haskell-example latest 137MB
Cold build is slow.
Rebuilds are reasonably fast,
if you don't touch
Cabal issue #6106 may require you to edit
--dependencies-onlybuild step, as explained above.
Git Hash into built executable. My approach is to ignore whole
.gitdirectory, as it might grow quite large. Maybe unignoring (with
.git/refs(which are relatively small) will make
gitrevand alike work. Please tell me if you try!
Caching of Haskell dependencies is very rudimentary. It could be improved largely, if
/cabal/storecould be copied out after the build, and in before the build. I don't really know how to that in Docker. Any tips are welcome. For example with
docker runone could use volumes, but not with
Look at the example repository. I hope this is useful for you.