This is the source for creating the nextstrain/base
Docker image. Currently
the image is published as nextstrain/base
.
Ideally most pathogen builds are supported by this base image without further customization. The possibility remains, however, for pathogens to define and use an image derived from this base layer. This would be desirable for pathogen builds requiring custom external software, like Python modules or tree-builders.
The image includes the standard Nextstrain components Fauna, Augur, and Auspice, as well as other bioinformatics tools like MAFFT, RAxML, FastTree, IQ-TREE, and TreeTime.
This image is best interacted with using the Nextstrain command-line tool.
To rebuild the image with the latest versions of its software and push to Docker Hub, go to the GitHub Actions workflow, select Run workflow, and confirm.
This is most helpful when you want the image to contain the latest version of a tool whose release does not automatically trigger a new build of the image and you do not need to modify the Dockerfile
.
You can build this image locally during development, but it's important for production releases to happen via CI so a complete multi-platform image is built and validated.
To build this image for local development and testing, run:
make local-image # or just: make
This will leave you with a localhost:5000/nextstrain/base:latest
image loaded
into your local Docker daemon and available to docker run
commands (and thus
nextstrain
commands). Run make
again to update the image after source
modifications.
Alternatively, you can take the steps yourself,
-
Start a local Docker registry.
./devel/start-localhost-registry
It will be served at
localhost:5000
. Optionally, specify another port as an argument. Running a local Docker registry allows us to mimic direct push to a registry done in the GitHub Actions CI workflow. -
Build the image.
./devel/build
By default, this builds for a single platform (
linux/amd64
orlinux/arm64
depending on your Docker server's arch), tags the image withlatest
, and pushes tolocalhost:5000
. See instructions at the top of the script for additional options.If the target platform is different from the build platform, set up emulation before running
./devel/build
. This can be achieved usingtonistiigi/binfmt
. For example, to set up emulation forlinux/arm64
, run:docker run --privileged --rm tonistiigi/binfmt --install arm64
On each subsequent change during your development iterations, you can run just
the ./devel/build
command again.
If you need to force the cached Nextstrain layers to rebuild to, for example,
pick up a new version of augur or auspice, set the CACHE_DATE
environment
variable to a new timestamp first:
export CACHE_DATE=$(date --utc +%Y%m%dT%H%M%SZ)
Otherwise, letting the build process use the cached layers will save you time during development iterations.
Before using the images, they should be checked for any inconsistencies.
./devel/validate-platforms
The output and exit code will tell you whether validation is successful.
Since the images are pushed directly to the local registry, they are not
available to the local Docker daemon after building (i.e.
nextstrain build --image nextstrain/base
does not refer to the latest built
image). To pull the images for local usage, run:
./devel/pull-from-registry
When building with make
, the newly built
localhost:5000/nextstrain/base:latest
image is automatically made available
for you. However, the corresponding base-builder
image is not.
To push images you've built locally to Docker Hub, you can run:
./devel/copy-images -t <tag>
This will copy the Nextstrain images from the local Docker registry to Docker Hub. See instructions at the top of the script for more options.
To add a software program to nextstrain/base
, follow steps in this order:
- Check if it is available via the Ubuntu package manager. You can use
apt-cache search
or Ubuntu Packages Search if you do not have an Ubuntu machine. If available, add it to theapt-get install
command followingFROM … AS final
(example). - Check if it is available via PyPI. You can search on PyPI's
website. If available, add an install command to
the section labeled with
Install programs via pip
. - Check if a pre-built binary for the
linux/amd64
platform (name containslinux
andamd64
/x86_64
) is available on the software's website (e.g. GitHub release assets). If available, add a download command to the section labeled withDownload pre-built programs
.- If a pre-built binary supporting
linux/arm64
(name containslinux
andarm64
/aarch64
) is also available, that should be used conditionally onARG
sTARGETPLATFORM
orTARGETOS
+TARGETARCH
in the Dockerfile. See existing usage of those arguments for examples.
- If a pre-built binary supporting
- The last resort is to build from source. Look for instructions on the
software's website. Add a build command to the section labeled with
Build programs from source
. Note that this can require platform-specific instructions. You should utilize cross-compilation tool available in the builder stage that runs on the build platform.
If possible, pin the software to a specific version. Otherwise, add the
download/install/build command to the section labeled with Add unpinned programs
to ensure the latest version is included in every Docker image build.
If possible, add the program to the builder stage that runs on the build platform to avoid slowness that may arise from emulation.
The smaller the image size, the better. To this end we build upon a "slim" Python image and use a multi-stage build where only artifacts are included in the final image without any of the software required only for compiling, installing, building, etc.
Try to follow Docker best practices for images, although not all apply to our use case, which is somewhat atypical.
The Dockerfile reference documentation is quite handy for looking up the
details of each Dockerfile command (COPY
, ADD
, etc).
Use bash
as the default shell for all stages in the Dockerfile to use handy
modern shell features.
Run bash scripts and Dockerfile commands with the -euo pipefail
options for
proper error handling. That is, these options should be set at the start of each
script and build stage in the Dockerfile.
Every push to this repository triggers a new build of the image with a GitHub Actions workflow. This helps ensure the image builds successfully with the new commits.
Images built from the master
branch are additionally pushed to the Docker
registry. The build instructions used by the workflow are in
this repo's .github/workflows/ci.yml
.
A local test suite of the image's properties and behaviours can be run with:
make test
These tests use Cram, which can be used directly to run individual test files, e.g.:
cram tests/basic.t
Separate integration and validation tests are also run in CI.
To remove all build artifacts and caches:
make clean
This will recover potentially many gigabytes of disk space. Subsequent builds will start with a clean slate.
Less all-inclusive cleaning can be done manually using selected commands found
in ./devel/clean
.