# Building Container Images for High-Performance Computing

This course covers building container images with [Docker](https://www.docker.com) and [Singularity](https://www.sylabs.io/singularity).  It also describes how to use [HPC Container Maker](https://github.com/NVIDIA/hpc-container-maker), a tool that simplifies the process of creating container specification files for High-Performance Computing (HPC).  Among the topics covered are container specification files, the basics of building container images, and techniques for managing the size of container images.

The lab assumes you are familiar with basic Linux shell commands.

Before beginning, please make sure the lab environment is correctly set up by running the two cells below.  To run a cell, select the cell and press Ctrl+Enter or click on the "Run" button in the toolbar.

In [1]:
!docker --version

Docker version 19.03.5, build 633a0ea838


In [2]:
!singularity --version

singularity version 3.5.2+1-gf6aa369


## Why Containers for HPC?

Containers are a very popular technology in IT, and they are just as applicable to High-Performance Computing (HPC).

HPC applications are often host specific.  For instance, building an HPC application on one system and then trying to run that binary on a different HPC system can be a nightmare.  Software dependencies such as MPI and math libraries are likely installed in different locations, may be different versions, or may be missing entirely.  The underlying Linux distribution may not even be the same.

Containers bundle the entire application user-space into a single portable package.  As a result, the application environment is both portable and consistent, agnostic to the underlying system software configuration.  The container images may be deployed widely, and even shared with others, with confidence that the results will be reproducible regardless of the underlying system.

Containers make life simple for both system administrators and end users.  System administrators do not need to maintain the hundreds of interdependent software packages requested by end users.  End users can download a container from a repository such as the [NVIDIA GPU Cloud](https://ngc.nvidia.com), [Docker Hub](https://hub.docker.com), or [Singularity Hub](https://singularity-hub.org) and be running in a matter of minutes, rather than going through the often lengthy process of building software for each specific system.

Downloading and using a container image from a repository is the ideal case.  But what if the application environment of interest is not available?  This course will describe how you can build your own container images from scratch.  After you have successfully built an application container image, consider uploading it to a container repository so that others can benefit from your work.

At the end of the course, you will build a container image for a real GPU-enabled application.  Building the container image for a typical HPC application can take some time.  To speed up the build when we get to that point, we prefetch some of the software components in the background.  Don't worry about the details right now; by the end of the course you should understand what this step is doing.  Go ahead to the next section after evaluating the following cell.

In [4]:
!docker images 

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
ubuntu              16.04               9499db781771        7 weeks ago         131MB
centos              7                   8652b9f0cb4c        2 months ago        204MB


## Container Image Formats

The [Singularity container runtime](https://www.sylabs.io/singularity/) is specifically designed for the High-Performance Computing use case.  Besides features such as running containers without requiring access to a superuser account, the Singularity container image format is a single "flat" file.  That makes Singularity container images very easy to transfer between systems and share across a cluster.

So why does this lab also cover building container images with Docker?  In short, while Singularity has many advantages as a container runtime for HPC, the Docker image builder has many advantages as a container image builder.  The Docker container image is ["layered"](https://github.com/opencontainers/image-spec).  The advantages of "layered" images include a build cache to speed up building container images and multi-stage builds to minimize the size of the final container image by more precisely controlling the image content.

Fortunately, Singularity can easily work with Docker images.  The best practice described in this lab for HPC containers is:

1. Specify the content of container images with [HPC Container Maker](https://github.com/NVIDIA/hpc-container-maker)
2. Build container images with Docker
3. Convert the Docker images to Singularity images
4. Use Singularity to run containers on your HPC system

This lab will cover all four of these topics.

## Building Container Images With Singularity

This part of the course covers how to [build container images with Singularity](https://sylabs.io/guides/3.2/user-guide/build_a_container.html).

Administrative privileges are required to build Singularity container images.  In contrast to Docker, running Singularity containers does not require administrative privileges.  By default, Singularity uses a `setuid` helper program when elevated privileges are needed.

### Building Your First Singularity Image: Hello World!

A [Singularity definition file](https://sylabs.io/guides/3.2/user-guide/definition_files.html) is a plain text file that specifies the instructions to create your container image. By convention this file is named `Singularity.def`, but any name may be used. The definition file syntax resembles the syntax of RPM spec files.

For this first image, we'll use a very simple [definition file](/lab/edit/singularity/Singularity.def) to build a container for the classic ["Hello World!" program](/lab/edit/sources/hello.c). Singularity will build your container image based on the `ubuntu:16.04` container image from Docker Hub. It will try to find the image locally first, then will go to the default repository (Docker Hub) to download it.

The Ubuntu base container on Docker Hub does not include development tools in order to help minimize the size of the image. The definition file installs the GNU C compiler and standard C headers. 

Once the development environment is set up, the "Hello World" program can be built from source.

Build the "Hello World" container image by invoking `singularity build` with the definition file.

In [5]:
!sudo singularity build hello-world.sif singularity/Singularity.def

INFO:    Starting build...
Getting image source signatures
Copying blob be8ec4e48d7f done
Copying blob 33b8b485aff0 done
Copying blob d887158cc58c done
Copying blob 05895bb28c18 done
Copying config 61838c0703 done
Writing manifest to image destination
Storing signatures
2021/01/20 20:14:39  info unpack layer: sha256:be8ec4e48d7f24a9a1c01063e5dfabb092c2c1ec73e125113848553c9b07eb8c
2021/01/20 20:14:40  info unpack layer: sha256:33b8b485aff0509bb0fa67dff6a2aa82e9b7b17e5ef28c1673467ec83edb945d
2021/01/20 20:14:40  info unpack layer: sha256:d887158cc58cbfc3d03cefd5c0b15175fae66ffbf6f28a56180c51cbb5062b8a
2021/01/20 20:14:40  info unpack layer: sha256:05895bb28c18264f614acd13e401b3c5594e12d9fe90d7e52929d3e810e11e97
INFO:    Copying sources/hello.c to /tmp/rootfs-1c996708-5

A quick note on the `singularity build` command line. The first argument is the filename of the resulting container image. By convention, Singularity 2.x container image files have the `.simg` extension, while Singularity 3.x container images have the `.sif` extension. The second argument is the path to the Singularity definition file.

The output `Build complete: hello-world.sif` indicates that the image was built successfully.

Run the containerized "Hello World" program by invoking `singularity exec`.  Note that `sudo` is not required to use the container image.

In [6]:
!singularity exec hello-world.sif /usr/local/bin/hello

Hello world!


The Hello World program run inside the container produces the expected output.

Let's take a closer look at the Hello World container image.

In [7]:
!ls -lh hello-world.sif

-rwxr-xr-x 1 labuser labuser 93M Jan 20 20:15 hello-world.sif


The Hello World program itself is less than 10 kilobytes, yet the Hello World container image is 93 megabytes!  That is roughly 2.6 times the size of the base Ubuntu 16.04 image (36 megabytes), so the compiler and related tools account for over half of the total container size.  But all we really care about is the Hello World program; there is no need to redistribute the compiler (or our source code) to users of the container image.

You could reduce the size of the Singularity container image by [removing the source code and compiler](/lab/edit/singularity/Singularity.def.cleanup) after the Hello World program has been built.  Doing so would reduce the container image size to 36 megabytes. However, more complex programs with runtime dependencies would require more sophisticated cleanup steps to remove unnecessary components while maintaining the needed runtime dependencies.

The Docker image format and build process includes capabilities that help control container image size and more precisely control the content of container images.

### Singularity Summary

The content of Singularity container images is specified in Singularity definition files.

Singularity container images are "flat", not layered like Docker (OCI) images.  Since flat container images are simple files, they are easy to copy and move.  However, building flat container images cannot take advantage of some features available with "layered" images.

## Building Container Images With Docker

This part of the lab covers how to [build container images with Docker](https://docs.docker.com/engine/reference/commandline/build/).

### Building Your First Docker Image

A [Dockerfile is a plain text file](https://docs.docker.com/engine/reference/builder/) that specifies the instructions to create your container image.  For this first image, we'll use a very simple [Dockerfile](/lab/edit/docker/Dockerfile.first).  Docker will build your container image based on the `ubuntu:16.04` container image from Docker Hub.  It will try to find the image locally first, then will go to the default repository (Docker Hub) to download it.  After that is a `RUN` instruction that tells the container builder to run the shell command `date > /build-info.txt` and save the result as part of the container image.

In [8]:
!cat docker/Dockerfile.first

FROM ubuntu:16.04

RUN date > /build-info.txt


In [9]:
!sudo docker build -t first-image -f docker/Dockerfile.first .

Sending build context to Docker daemon  97.36MB
Step 1/2 : FROM ubuntu:16.04
 ---> 9499db781771
Step 2/2 : RUN date > /build-info.txt
 ---> Running in 1b6ac923714b
Removing intermediate container 1b6ac923714b
 ---> 4d8180b4869f
Successfully built 4d8180b4869f
Successfully tagged first-image:latest


A quick note on the `docker build` command line.  The `-t` option specifies the name and tag of the resulting container image, with the name and tag separated by a colon.  If no tag is specified, Docker uses `latest`.  The `-f` option specifies the Dockerfile to build the container from.  And finally, the `.` is the path to use as the build context, i.e., the sandbox where files from the host are accessible during the container image build.
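The name and tag rule can be illustrated with a small Python sketch.  This is only an illustration of the convention described above, not Docker's own parser: the function name `split_image_reference` and the `localhost:5000/app` example are hypothetical, and digests (`@sha256:...`) are deliberately not handled.

```python
from typing import Tuple


def split_image_reference(reference: str) -> Tuple[str, str]:
    """Split 'name[:tag]' into (name, tag), defaulting the tag to 'latest'."""
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    # A colon only separates the tag if it comes after the last '/';
    # otherwise it belongs to a registry address such as 'localhost:5000'.
    if colon > last_slash:
        return reference[:colon], reference[colon + 1:]
    return reference, "latest"


# 'localhost:5000/app' is a made-up reference used to show the registry case.
for reference in ("first-image", "hello-world:exercise", "localhost:5000/app"):
    print(reference, "->", split_image_reference(reference))
```

This is why `-t first-image` in the command above produces an image that Docker reports as `first-image:latest`.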

The output `Successfully tagged first-image:latest` indicates that the image was built successfully. 

Note that each instruction from the Dockerfile is shown as a "Step".  As it builds the container image, Docker tells you which step it is on and gives the intermediate hash of the resulting layer.

Let's check out the newly built image.

In [10]:
!sudo docker run --rm -it first-image cat /build-info.txt

Wed Jan 20 20:17:34 UTC 2021


The date shown should be just a short time ago, when you built the image.  The date in this file corresponds to when the container image was built, not when it is run.

### Image Layering

One of the most important concepts when building container images is *layering*.  Docker builds container images according to the [Open Container Initiative (OCI) image specification](https://github.com/opencontainers/image-spec).  OCI container images are composed of a series of layers. (If you look closely at the output of building the first container image above, you will see that the `ubuntu:16.04` container image itself actually consists of multiple layers.) The layers are applied sequentially, one on top of another, to form the container image that you ultimately see when running a container.

To help illustrate layering, let's [extend the previous Dockerfile](/lab/edit/docker/Dockerfile.second) to add a second `RUN` instruction that appends the Linux kernel version of the system where the container was built to `/build-info.txt`.

In [11]:
!cat docker/Dockerfile.second

FROM ubuntu:16.04

RUN date > /build-info.txt
RUN uname -r >> /build-info.txt


In [12]:
!sudo docker build -t second-image -f docker/Dockerfile.second .

Sending build context to Docker daemon  97.36MB
Step 1/3 : FROM ubuntu:16.04
 ---> 9499db781771
Step 2/3 : RUN date > /build-info.txt
 ---> Using cache
 ---> 4d8180b4869f
Step 3/3 : RUN uname -r >> /build-info.txt
 ---> Running in 94c74a231b70
Removing intermediate container 94c74a231b70
 ---> 1eae4b8ee152
Successfully built 1eae4b8ee152
Successfully tagged second-image:latest


First, note that the first 2 steps were cached.  Docker recognizes that the first 2 instructions have previously been processed, so the corresponding layers do not need to be regenerated.  This is possible due to layering.  The layer cache can significantly speed up building container images.  Recall that the layers are applied sequentially, so the entire history of instructions up to that point must be identical for a cached layer to be used.

The third step, which we just added to the Dockerfile, is not in the cache, so it needs to be performed and a new layer is generated.
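The requirement that the entire instruction history must match can be sketched in Python as a chain of hashes.  This is a toy model under stated assumptions, not Docker's actual cache implementation (which, for example, also considers the content of copied files); the function names `layer_keys` and `plan_build` are hypothetical.

```python
import hashlib


def layer_keys(base_image, instructions):
    """Return one key per instruction; each key depends on all earlier steps."""
    keys = []
    parent = base_image
    for instruction in instructions:
        # Chain the parent key into the hash so that any earlier change
        # produces a different key for every later instruction.
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def plan_build(base_image, instructions, cache):
    """Report 'cached' or 'run' per instruction and record new keys in the cache."""
    plan = []
    for instruction, key in zip(instructions, layer_keys(base_image, instructions)):
        plan.append((instruction, "cached" if key in cache else "run"))
        cache.add(key)
    return plan


cache = set()
first = ["RUN date > /build-info.txt"]
second = first + ["RUN uname -r >> /build-info.txt"]

plan_build("ubuntu:16.04", first, cache)  # build first-image, filling the cache
for instruction, action in plan_build("ubuntu:16.04", second, cache):
    print(f"{action:6} {instruction}")
```

In this model the `date` instruction is reported as cached and the `uname` instruction as run, mirroring the build output above.  Swapping the order of the two instructions would change the first key and therefore invalidate both steps.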

Let's verify that the kernel version is included in the build info file.

In [13]:
!sudo docker run --rm -it second-image cat /build-info.txt

Wed Jan 20 20:17:34 UTC 2021
4.4.0-1102-aws


Docker provides a method to take a closer look at the layers composing a container image.

In [14]:
!sudo docker history second-image

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
1eae4b8ee152        21 seconds ago      /bin/sh -c uname -r >> /build-info.txt          44B                 
4d8180b4869f        3 minutes ago       /bin/sh -c date > /build-info.txt               29B                 
9499db781771        7 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           7 weeks ago         /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B                  
<missing>           7 weeks ago         /bin/sh -c rm -rf /var/lib/apt/lists/*          0B                  
<missing>           7 weeks ago         /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   745B                
<missing>           7 weeks ago         /bin/sh -c #(nop) ADD file:8eef54430e581236e…   131MB               


Your image consists of 7 layers.  The layers are listed in reverse chronological order; the container image you see when running the container is generated by starting from the last layer shown, applying the second to the last layer on top of it, then the third from the last on top of that, and so on. In case of conflicts, a subsequent layer will overwrite content from previous layers.
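As a rough illustration of how the layers combine, the following Python sketch models each layer as a dictionary of files and applies them from oldest to newest.  It is a simplification under stated assumptions: the base image is collapsed into a single placeholder entry, and the function name `flatten` is hypothetical.

```python
# Layers in the order `docker history` lists them: newest first.
history = [
    {"/build-info.txt": "Wed Jan 20 20:17:34 UTC 2021\n4.4.0-1102-aws\n"},
    {"/build-info.txt": "Wed Jan 20 20:17:34 UTC 2021\n"},
    {"/bin/bash": "<placeholder for the ubuntu:16.04 base image content>"},
]


def flatten(history_newest_first):
    """Apply layers oldest to newest; later layers overwrite earlier ones."""
    view = {}
    for layer in reversed(history_newest_first):
        view.update(layer)
    return view


image = flatten(history)
print(image["/build-info.txt"], end="")
```

The flattened view contains the two-line version of `/build-info.txt` from the newest layer, while files that only exist in older layers, such as the placeholder base entry, remain visible.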

The first column shows the layer hash. You can correlate the layer hashes shown here with the `docker build` output above.

The second column shows when the layer was created.  You created the top 2 layers just a few minutes ago, while the other layers correspond to the `ubuntu:16.04` base image and were created longer ago.

The third column shows an abbreviated version of the Dockerfile instruction used to build the corresponding layer.  To see the full instruction, use `docker history --no-trunc`.  The instructions for the top 2 layers match what was specified in the [Dockerfile](/lab/edit/docker/Dockerfile.second).

The fourth column shows the size of the layer.  Why is the layer that appended the kernel version (`uname -r ...`) about 50% larger than the layer that saved the date (44 bytes versus 29 bytes)?

The OCI image specification employs file-level deduplication to handle conflicts.  When a build instruction creates or modifies a file, the entire file is saved in the corresponding layer.  So when the kernel version was appended to the build info file, that layer did not capture just the difference, but rather the whole modified file: the 29 bytes of the date line plus the 15 bytes of the kernel version line, 44 bytes in total.  In this particular case, the file is tiny and the amount of duplicated data is minimal.  But consider the case of a large, 1 GB file.  If a subsequent layer modifies a single byte in that file, the file will account for 2 GB in the container image, even though the file will appear to be "only" 1 GB when running the container.

A best practice arising from file-level deduplication of layers is to put all actions modifying the same set of files in the same Dockerfile instruction.  For example, remove any temporary files in the same instruction in which they are created.
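The size accounting behind this best practice can be sketched with a small Python model.  It assumes a simplified view in which each layer is a dictionary mapping a path to the full size of the file it stores, and a deletion is recorded as a marker that occupies no space; the names `stored_size`, `visible_size`, and `DELETED`, and the file paths, are hypothetical.

```python
GB = 1000 ** 3
DELETED = None  # marker: this layer removes the path from the visible image


def stored_size(layers):
    """Bytes kept in the image: every layer stores whole files."""
    return sum(size for layer in layers for size in layer.values()
               if size is not DELETED)


def visible_size(layers):
    """Bytes seen inside a running container after applying all layers."""
    view = {}
    for layer in layers:
        for path, size in layer.items():
            if size is DELETED:
                view.pop(path, None)
            else:
                view[path] = size
    return sum(view.values())


cases = {
    "two RUN instructions": [{"/build-info.txt": 29}, {"/build-info.txt": 44}],
    "one RUN instruction": [{"/build-info.txt": 44}],
    "1 GB file, one byte modified later": [{"/data.bin": GB}, {"/data.bin": GB}],
    "temp file removed in a later layer": [{"/var/tmp/src.tar": GB},
                                           {"/var/tmp/src.tar": DELETED}],
    "temp file removed in the same layer": [{}],
}

for name, layers in cases.items():
    print(f"{name}: stored={stored_size(layers)} visible={visible_size(layers)}")
```

In this model, removing a temporary file in a later layer hides it from the running container but does not reclaim the space it occupies in the earlier layer.  Only creating and removing it within the same instruction keeps it out of the image entirely.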

Let's modify the Dockerfile so that the [date and kernel version are written to the build info file in the same instruction](/lab/edit/docker/Dockerfile.third).  In the shell, commands can be chained with `&&`, which runs the next command only if the previous one succeeded. (You may have noticed long `RUN` commands connected with `&&` in other Dockerfiles; this is why.)

In [15]:
!sudo docker build -t third-image -f docker/Dockerfile.third .

Sending build context to Docker daemon  97.36MB
Step 1/2 : FROM ubuntu:16.04
 ---> 9499db781771
Step 2/2 : RUN date > /build-info.txt && uname -r >> /build-info.txt
 ---> Running in a3257b7c6661
Removing intermediate container a3257b7c6661
 ---> c0574305c990
Successfully built c0574305c990
Successfully tagged third-image:latest


In [16]:
!sudo docker history third-image

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
c0574305c990        8 seconds ago       /bin/sh -c date > /build-info.txt && uname -…   44B                 
9499db781771        7 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           7 weeks ago         /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B                  
<missing>           7 weeks ago         /bin/sh -c rm -rf /var/lib/apt/lists/*          0B                  
<missing>           7 weeks ago         /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   745B                
<missing>           7 weeks ago         /bin/sh -c #(nop) ADD file:8eef54430e581236e…   131MB               


Notice there is now a single layer for the build info file and the extraneous layer with the duplicated data has been eliminated.

Strike a balance between using lots of individual Dockerfile instructions and using a single instruction.  Lots of individual instructions may produce unnecessarily large container images when touching the same files, but using too few instructions will eliminate the advantages of the build cache to speed up your container builds.

A best practice is to bundle all *related* items into a single layer, but to put unrelated items in separate layers.  For example, install the compiler in one layer and build your source code in another layer (but clean up any temporary object files in the same layer).

### Hello World

Let's put these techniques into practice by constructing a container image for the classic ["Hello World!" program](/lab/edit/sources/hello.c).

#### Exercise

The Ubuntu base container on Docker Hub does not include development tools in order to help minimize the size of the image.  As an exercise, modify the [Dockerfile](/lab/edit/docker/Dockerfile.hello_exercise) to install the GNU C compiler and standard C headers.  For Ubuntu, the command to install packages is `apt-get`.  The packages are named `gcc` and `build-essential`.

In [19]:
!sudo docker build -t hello-world:exercise -f docker/Dockerfile.hello_exercise .

Sending build context to Docker daemon  97.36MB
Step 1/4 : FROM ubuntu:16.04
 ---> 9499db781771
Step 2/4 : RUN apt-get update -y && apt-get install -y --no-install-recommends         build-essential         gcc
 ---> Running in bee95901fdf9
Get:1 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:2 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:3 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [1905 kB]
Get:4 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
Get:5 http://archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]
Get:6 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:7 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:8 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:9 http://security.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [15.9 kB]
Get:10 http://security.ubuntu.com/ubuntu xenial-securit

Verify your solution by running the Hello World program inside the container.

In [20]:
!sudo docker run --rm -it hello-world:exercise /usr/local/bin/hello

Hello world!


#### Solution

If you get stuck, or just want to compare your approach, please see the [solution](/lab/edit/docker/Dockerfile.hello_solution).

Note that the apt package cache is removed in the same step where it is generated, following the recommended best practice of cleaning up temporary and unnecessary files in the same instruction where they are created.

In [21]:
!sudo docker build -t hello-world:solution -f docker/Dockerfile.hello_solution .

Sending build context to Docker daemon  97.38MB
Step 1/4 : FROM ubuntu:16.04
 ---> 9499db781771
Step 2/4 : RUN apt-get update -y &&     apt-get install -y --no-install-recommends         build-essential         gcc &&     rm -rf /var/lib/apt/lists/*
 ---> Running in bffb97e0602a
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:2 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:3 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
Get:4 http://archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]
Get:5 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [1905 kB]
Get:6 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:7 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:8 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:9 http://security.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [15.9 kB]
Get:10 http://se

The Hello World program run inside the container produces the expected output.

In [22]:
!sudo docker run --rm -it hello-world:solution /usr/local/bin/hello

Hello world!


Let's look at the layers in the Hello World container image.

In [23]:
!sudo docker history hello-world:solution

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
b4d4da7a8d8f        12 seconds ago      /bin/sh -c gcc -o /usr/local/bin/hello /var/…   8.6kB               
fd03982f5f64        13 seconds ago      /bin/sh -c #(nop) COPY file:9c7ad162b4358f35…   63B                 
711de476042b        13 seconds ago      /bin/sh -c apt-get update -y &&     apt-get …   174MB               
9499db781771        7 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           7 weeks ago         /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B                  
<missing>           7 weeks ago         /bin/sh -c rm -rf /var/lib/apt/lists/*          0B                  
<missing>           7 weeks ago         /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   745B                
<missing>           7 weeks ago         /bin/sh -c #(nop) ADD file:8eef54430e581236e…   131MB               


The Hello World program itself is less than 10 kilobytes, but the layer containing the compiler and related tools is 174 megabytes.  That single layer accounts for *over half* of the total container size!  But all we really care about is the Hello World program; there is no need to redistribute the compiler (or our source code) to users of the container image.
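The "over half" claim can be checked from the `SIZE` column of the `docker history` output above.  The sketch below assumes that the sizes are reported in decimal units (1 MB = 1,000,000 bytes); the helper name `parse_size` is hypothetical.

```python
# Assumption: decimal units. Multi-letter units are listed before "B" so
# that "174MB" is not mistaken for a plain byte count.
UNITS = {"GB": 1000 ** 3, "MB": 1000 ** 2, "kB": 1000, "B": 1}


def parse_size(text: str) -> float:
    """Convert a size string such as '8.6kB' into a number of bytes."""
    for unit, factor in UNITS.items():
        if text.endswith(unit):
            return float(text[: -len(unit)]) * factor
    raise ValueError(f"unrecognized size: {text!r}")


# SIZE column of `docker history hello-world:solution`, newest layer first.
layer_sizes = ["8.6kB", "63B", "174MB", "0B", "7B", "0B", "745B", "131MB"]

total = sum(parse_size(size) for size in layer_sizes)
compiler = parse_size("174MB")  # the apt-get install layer

print(f"total image size: {total / 1000 ** 2:.0f} MB")
print(f"compiler layer share: {compiler / total:.0%}")
```

With the sizes shown above, the layers add up to roughly 305 MB, of which the compiler layer is about 57%.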

### Multi-Stage Hello World

Docker [multi-stage builds](https://docs.docker.com/develop/develop-images/multistage-build/) are a way to control the size of container images.  In the same Dockerfile, you can define a second stage that is a completely separate container image and copy just the binary and any runtime dependencies from preceding stages into the image.  The output of a multi-stage build is a single container image corresponding to the last stage of the Dockerfile.  The multi-stage Hello World [Dockerfile](/lab/edit/docker/Dockerfile.hello_multistage) shows how a second `FROM` instruction starts a second stage, while artifacts from the preceding stage can still be accessed with `COPY --from`.

In [24]:
!cat docker/Dockerfile.hello_multistage

# The "build" stage of the multi-stage Dockerfile

# Start from a basic Ubuntu 16.04 image
FROM ubuntu:16.04 AS build

RUN apt-get update -y && \
    apt-get install -y --no-install-recommends \
        build-essential \
        gcc && \
    rm -rf /var/lib/apt/lists/*

# Copy Hello World source code into the build stage
COPY sources/hello.c /var/tmp/hello.c

# Build Hello World
RUN gcc -o /usr/local/bin/hello /var/tmp/hello.c

# The "runtime" stage of the multi-stage Dockerfile
# This starts an entirely new container image

# Start from a basic Ubuntu 16.04 image
FROM ubuntu:16.04

# Copy the hello binary from the build stage
COPY --from=build /usr/local/bin/hello /usr/local/bin/hello


In [25]:
!sudo docker build -t hello-world:multistage -f docker/Dockerfile.hello_multistage .

Sending build context to Docker daemon  97.41MB
Step 1/6 : FROM ubuntu:16.04 AS build
 ---> 9499db781771
Step 2/6 : RUN apt-get update -y &&     apt-get install -y --no-install-recommends         build-essential         gcc &&     rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 711de476042b
Step 3/6 : COPY sources/hello.c /var/tmp/hello.c
 ---> Using cache
 ---> fd03982f5f64
Step 4/6 : RUN gcc -o /usr/local/bin/hello /var/tmp/hello.c
 ---> Using cache
 ---> b4d4da7a8d8f
Step 5/6 : FROM ubuntu:16.04
 ---> 9499db781771
Step 6/6 : COPY --from=build /usr/local/bin/hello /usr/local/bin/hello
 ---> c2ed5d8edcd6
Successfully built c2ed5d8edcd6
Successfully tagged hello-world:multistage


In [26]:
!sudo docker history hello-world:multistage

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
c2ed5d8edcd6        4 seconds ago       /bin/sh -c #(nop) COPY file:de1284d9e252911b…   8.6kB               
9499db781771        7 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           7 weeks ago         /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B                  
<missing>           7 weeks ago         /bin/sh -c rm -rf /var/lib/apt/lists/*          0B                  
<missing>           7 weeks ago         /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   745B                
<missing>           7 weeks ago         /bin/sh -c #(nop) ADD file:8eef54430e581236e…   131MB               


In [27]:
!sudo docker run --rm -it hello-world:multistage /usr/local/bin/hello

Hello world!


In [28]:
!docker images

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
hello-world         multistage          c2ed5d8edcd6        36 seconds ago      131MB
hello-world         solution            b4d4da7a8d8f        2 minutes ago       305MB
hello-world         exercise            bbd3329734fd        3 minutes ago       335MB
<none>              <none>              37ba1df9c1a3        5 minutes ago       131MB
third-image         latest              c0574305c990        5 minutes ago       131MB
second-image        latest              1eae4b8ee152        6 minutes ago       131MB
first-image         latest              4d8180b4869f        9 minutes ago       131MB
ubuntu              16.04               9499db781771        7 weeks ago         131MB
centos              7                   8652b9f0cb4c        2 months ago        204MB


The container image generated by the multi-stage build adds only the Hello World program to the base `ubuntu:16.04` image, yielding a significant reduction in the size of the container (131 MB versus 305 MB for the single-stage solution).  Multi-stage builds can also be used to avoid redistributing source code or other build artifacts.  However, keep in mind this is a simple case and more complex cases may have additional runtime dependencies that also need to be copied from one stage to another.  HPC Container Maker can help ensure the necessary runtime dependencies are available in the second stage.
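The effect of the two stages can be modeled in a few lines of Python.  This is only a sketch: each stage is a dictionary of path to size in bytes, the sizes are rounded from the `docker history` output above, and the `/ubuntu-base` and `/usr/lib/gcc` entries are illustrative placeholders rather than real file listings.

```python
MB = 1000 * 1000

# Placeholder for everything contained in the ubuntu:16.04 base image.
base_image = {"/ubuntu-base": 131 * MB}

# Stage 1 ("build"): base image plus compiler, source code, and binary.
build_stage = dict(base_image)
build_stage["/usr/lib/gcc"] = 174 * MB        # placeholder for the toolchain
build_stage["/var/tmp/hello.c"] = 63
build_stage["/usr/local/bin/hello"] = 8600

# Stage 2 ("runtime"): the second FROM starts again from the base image...
runtime_stage = dict(base_image)
# ...and COPY --from=build brings over only the named file.
runtime_stage["/usr/local/bin/hello"] = build_stage["/usr/local/bin/hello"]


def total_mb(stage):
    """Total size of a stage in megabytes."""
    return sum(stage.values()) / MB


print(f"build stage:   {total_mb(build_stage):.0f} MB")
print(f"runtime stage: {total_mb(runtime_stage):.0f} MB")
print("compiler shipped:", "/usr/lib/gcc" in runtime_stage)
```

Only the runtime stage becomes the final image, so neither the compiler nor the source file is redistributed.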

In [29]:
!sudo docker images hello-world

REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
hello-world         multistage          c2ed5d8edcd6        About a minute ago   131MB
hello-world         solution            b4d4da7a8d8f        2 minutes ago        305MB
hello-world         exercise            bbd3329734fd        3 minutes ago        335MB


### Docker Summary

The content of Docker container images is specified in Dockerfiles.

Docker (OCI) container images are layered.  Layering provides a number of advantages, including caching that can speed up builds and reducing disk usage when layers can be shared by several images.  However, layering also requires careful use to avoid pitfalls that can bloat the image size.

Multi-stage builds are a very useful feature for fine tuning the content of container images.

## HPC Container Maker

[HPC Container Maker (HPCCM)](https://github.com/NVIDIA/hpc-container-maker) simplifies the process of creating container specification files.  It specifically addresses the challenges of generating HPC container images.

HPC Container Maker generates Dockerfiles or Singularity definition files from a high level Python recipe. HPCCM recipes have some distinct advantages over "native" container specification formats.

1. A library of HPC building blocks that separate the choice of what to include in a container image from the details of how it's done. The building blocks transparently provide the latest component and container best practices.

2. Python provides increased flexibility over static container specification formats. Python-based recipes can branch, validate user input, and so on, so the same recipe can generate multiple container specifications.

3. Generate either Dockerfiles or Singularity definition files from the same recipe.
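To make the second and third points concrete, here is a toy Python sketch of the idea.  It is not HPCCM and does not use the HPCCM API: the functions `recipe` and `render` are hypothetical stand-ins that only show how one Python description can branch on an input and be rendered in either output format.

```python
def recipe(distro="ubuntu"):
    """Describe the image content; branch on the requested Linux distribution."""
    if distro == "ubuntu":
        base = "ubuntu:16.04"
        install = ("apt-get update -y && "
                   "apt-get install -y --no-install-recommends gcc && "
                   "rm -rf /var/lib/apt/lists/*")
    elif distro == "centos":
        base = "centos:7"
        install = "yum install -y gcc && rm -rf /var/cache/yum/*"
    else:
        # Input validation is easy in Python, unlike in a static format.
        raise ValueError(f"unsupported distro: {distro!r}")
    return {"base": base, "commands": [install]}


def render(spec, fmt="docker"):
    """Render the same description as a Dockerfile or a Singularity definition."""
    if fmt == "docker":
        lines = [f"FROM {spec['base']}", ""]
        lines += [f"RUN {command}" for command in spec["commands"]]
    elif fmt == "singularity":
        lines = ["Bootstrap: docker", f"From: {spec['base']}", "", "%post"]
        lines += [f"    {command}" for command in spec["commands"]]
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    return "\n".join(lines) + "\n"


print(render(recipe("centos"), "docker"))
print(render(recipe("centos"), "singularity"))
```

The real HPCCM building blocks do far more than this, but the separation is the same: the recipe states what goes into the image, and the output format is chosen when the specification is generated.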

### Getting Started

HPCCM is based on the concept of [building blocks](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md).  For instance, there is an [OpenMPI building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#openmpi).  The building blocks combine the best practices of building HPC software components with the best practices of building container images to generate optimized container image specifications.  This lets you easily take advantage of all the existing knowledge of how to best install a component like OpenMPI inside a container image.

Container images are specified as an HPCCM recipe, which a command line tool then converts into a Dockerfile or a Singularity definition file.  An HPCCM recipe is a Python script, usually a very simple one.  But the full power of Python is available, so a recipe can validate input, branch based on the type of build desired, or even search the web to download the latest version of a software package.

To illustrate this, let's start with a simple [example](/lab/edit/hpccm/openmpi.py) of a container image that includes CUDA and OpenMPI.

In [30]:
!hpccm --recipe hpccm/openmpi.py

FROM nvidia/cuda:9.2-devel-centos7

# OpenMPI version 4.0.1
RUN yum install -y \
        bzip2 \
        file \
        hwloc \
        make \
        numactl-devel \
        openssh-clients \
        perl \
        tar \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.1.tar.bz2 && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-4.0.1.tar.bz2 -C /var/tmp -j && \
    cd /var/tmp/openmpi-4.0.1 &&   ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda --without-verbs && \
    make -j$(nproc) && \
    make -j$(nproc) install && \
    rm -rf /var/tmp/openmpi-4.0.1.tar.bz2 /var/tmp/openmpi-4.0.1
ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/usr/local/openmpi/bin:$PATH




When this simple two-line recipe is processed by HPCCM, an optimized Dockerfile is generated.  Notice that the Dockerfile best practices described earlier, such as combining related steps into a single layer and removing temporary files in the same layer in which they are generated, are employed automatically.
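Why the cleanup has to happen in the same layer can be seen with a toy model of image size. The sketch below is not how Docker computes sizes internally; it only assumes that each layer keeps the files it adds and that a later layer can hide a file but not reclaim its space. The file sizes are made-up illustrative numbers.

```python
def image_size(layers):
    """Sum the megabytes retained by each layer.

    Each layer is a list of (size_mb, removed_in_same_layer) tuples.
    A file removed in the layer that created it costs nothing; a file
    removed by a later layer still occupies space in its original layer.
    """
    return sum(size
               for layer in layers
               for size, removed_in_same_layer in layer
               if not removed_in_same_layer)


# Hypothetical sizes (MB): source tarball, build tree, installed files.
TARBALL, BUILD_TREE, INSTALL = 10, 300, 50

# One RUN: download, build, install, and clean up in a single layer.
single_layer = [[(TARBALL, True), (BUILD_TREE, True), (INSTALL, False)]]

# Separate RUNs: the cleanup lands in a later layer, so the earlier
# layers keep the tarball and the build tree.
separate_layers = [[(TARBALL, False)],
                   [(BUILD_TREE, False), (INSTALL, False)],
                   []]  # the "rm -rf" layer adds no files and frees nothing

print(image_size(single_layer))     # 50
print(image_size(separate_layers))  # 360
```

In this model the single-layer variant retains only the installed files, which is the pattern the generated `RUN` instruction follows.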

A Singularity definition file can be generated from the exact same recipe just by specifying the `--format` command line option.

In [31]:
!hpccm --recipe hpccm/openmpi.py --format singularity

BootStrap: docker
From: nvidia/cuda:9.2-devel-centos7
%post
    . /.singularity.d/env/10-docker*.sh

# OpenMPI version 4.0.1
%post
    yum install -y \
        bzip2 \
        file \
        hwloc \
        make \
        numactl-devel \
        openssh-clients \
        perl \
        tar \
        wget
    rm -rf /var/cache/yum/*
%post
    cd /
    mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.1.tar.bz2
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-4.0.1.tar.bz2 -C /var/tmp -j
    cd /var/tmp/openmpi-4.0.1 &&   ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda --without-verbs
    make -j$(nproc)
    make -j$(nproc) install
    rm -rf /var/tmp/openmpi-4.0.1.tar.bz2 /var/tmp/openmpi-4.0.1
%environment
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/usr/local/openmpi/bin:$PATH

HPCCM building blocks are also configurable.  The defaults are suitable for many use cases, but you may need to tailor the container image more precisely.  The OpenMPI building block, for instance, has several [configuration options](https://github.com/NVIDIA/hpc-container-maker/tree/master/docs/building_blocks.md#openmpi).

For example, [this recipe](/lab/edit/hpccm/openmpi-config.py) installs OpenMPI in `/opt`, disables the Fortran interface and InfiniBand support, and selects version 2.1.2.  Also note that the base image is Ubuntu rather than CentOS as in the previous recipe; the building block automatically detects the Linux distribution type and uses `apt-get` rather than `yum` to install its dependencies.

In [32]:
!hpccm --recipe hpccm/openmpi-config.py

FROM nvidia/cuda:9.2-devel-ubuntu16.04

# OpenMPI version 2.1.2
RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        bzip2 \
        file \
        hwloc \
        libnuma-dev \
        make \
        openssh-client \
        perl \
        tar \
        wget && \
    rm -rf /var/lib/apt/lists/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v2.1/downloads/openmpi-2.1.2.tar.bz2 && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-2.1.2.tar.bz2 -C /var/tmp -j && \
    cd /var/tmp/openmpi-2.1.2 &&   ./configure --prefix=/opt/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --disable-fortran --with-cuda --without-verbs && \
    make -j$(nproc) && \
    make -j$(nproc) install && \
    rm -rf /var/tmp/openmpi-2.1.2.tar.bz2 /var/tmp/openmpi-2.1.2
ENV LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/opt/openmpi/bin:$
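The distribution-specific behavior can be pictured with a small stand-alone sketch. This is not HPCCM's implementation: `guess_distro` and `install_command` are hypothetical helpers, and the image-name heuristic is an assumption made purely for illustration. The package names are taken from the two generated Dockerfiles shown in this section.

```python
def guess_distro(image):
    """Guess the package family from a base image name (a rough heuristic)."""
    name = image.lower()
    if 'centos' in name or 'rhel' in name:
        return 'rpm'
    return 'deb'  # assume a Debian/Ubuntu style image otherwise


# Distribution-specific names for the same two dependencies.
PACKAGES = {
    'rpm': ['numactl-devel', 'openssh-clients'],
    'deb': ['libnuma-dev', 'openssh-client'],
}


def install_command(image):
    """Return a shell command that installs the packages and clears caches."""
    distro = guess_distro(image)
    pkgs = ' '.join(PACKAGES[distro])
    if distro == 'rpm':
        return 'yum install -y {} && rm -rf /var/cache/yum/*'.format(pkgs)
    return ('apt-get update -y && '
            'apt-get install -y --no-install-recommends {} && '
            'rm -rf /var/lib/apt/lists/*'.format(pkgs))


print(install_command('nvidia/cuda:9.2-devel-centos7'))
print(install_command('nvidia/cuda:9.2-devel-ubuntu16.04'))
```

The point of the sketch is only that one high-level request ("install these dependencies") can expand into different commands depending on the base image.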

### Reproducing a Bare Metal Environment

Many HPC systems use [environment modules](https://en.wikipedia.org/wiki/Environment_Modules_(software)) to manage their software environment. A user loads the modules corresponding to the desired software environment.

```
$ module load cuda/9.0
$ module load gcc
$ module load openmpi/1.10.7
```

Modules can depend on each other, and in this case, the openmpi module was built with the gcc compiler and with CUDA support enabled.

The Linux distribution and drivers are typically fixed by the system administrator, for instance CentOS 7 and Mellanox OFED 3.4.

The system administrator of the HPC system built and installed these components for their user community. Including a software component in a container image requires knowing how to properly configure and build the component. This is specialized knowledge and can be further complicated when applying container best practices.

_How can this software environment be reproduced in a container image?_

The starting point for any container image is a base image. Since CUDA is required, the base image should be one of the [publicly available CUDA base images](https://hub.docker.com/r/nvidia/cuda/). The CUDA base image corresponding to CUDA 9.0 and CentOS 7 is `nvidia/cuda:9.0-devel-centos7`. So the first line of the HPCCM recipe is:

```python
Stage0 += baseimage(image='nvidia/cuda:9.0-devel-centos7')
```

Note: `Stage0` refers to the first stage of a [multi-stage Docker build](https://docs.docker.com/develop/develop-images/multistage-build/). Multi-stage builds are a technique that can significantly reduce the size of container images. This section will not use multi-stage builds, so the `Stage0` prefix can be considered boilerplate.

The next step is to include the HPCCM building blocks corresponding to the rest of the desired software environment: [Mellanox OFED](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#mlnx_ofed), [gcc](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#gnu), and [OpenMPI](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#openmpi).

The [mlnx_ofed building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#mlnx_ofed) installs the OpenFabrics user space libraries:

```python
Stage0 += mlnx_ofed(version='3.4-1.0.0.0')
```

The [gnu building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#gnu) installs the GNU compiler suite:

```python
compiler = gnu()
Stage0 += compiler
```

Note: The `compiler` variable is defined here so that in the next step the OpenMPI building block can use the GNU compiler toolchain. Since the GNU compiler is typically the default compiler, this is just being explicit about the default behavior.

The [openmpi building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#openmpi) installs OpenMPI, configured to use the desired version, the GNU compiler, and with CUDA and InfiniBand enabled:

```python
Stage0 += openmpi(cuda=True, infiniband=True, toolchain=compiler.toolchain,
                  version='1.10.7')
```

Bringing it all together, the complete recipe corresponding to the bare metal software environment is [cuda-gcc-openmpi.py](/lab/edit/hpccm/cuda-gcc-openmpi.py). The HPCCM recipe has nearly a one-to-one correspondence with the environment module commands. HPCCM strives to provide a similarly high-level interface to environment modules: you need only specify which software components you want inside your container image, without getting into the low-level details of how best to build and configure each component.

Use the `hpccm` command line tool to generate the corresponding Dockerfile or Singularity definition file.

In [33]:
!hpccm --recipe hpccm/cuda-gcc-openmpi.py --format docker

FROM nvidia/cuda:9.0-devel-centos7

# Mellanox OFED version 3.4-1.0.0.0
RUN yum install -y \
        findutils \
        libnl \
        libnl3 \
        numactl-libs \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-3.4-1.0.0.0/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz -C /var/tmp -z && \
    find /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|libibumad|libibumad-devel|libibverbs|libibverbs-devel|libibverbs-utils|libmlx4|libmlx4-devel|libmlx5|libmlx5-devel|librdmacm|librdmacm-devel)-[0-9].*x86_64.rpm" -not -path "*UPSTREAM*" -exec rpm --install {} + && \
    rm -rf /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64

# GNU compiler

In [34]:
!hpccm --recipe hpccm/cuda-gcc-openmpi.py --format singularity

BootStrap: docker
From: nvidia/cuda:9.0-devel-centos7
%post
    . /.singularity.d/env/10-docker*.sh

# Mellanox OFED version 3.4-1.0.0.0
%post
    yum install -y \
        findutils \
        libnl \
        libnl3 \
        numactl-libs \
        wget
    rm -rf /var/cache/yum/*
%post
    cd /
    mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-3.4-1.0.0.0/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz -C /var/tmp -z
    find /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|libibumad|libibumad-devel|libibverbs|libibverbs-devel|libibverbs-utils|libmlx4|libmlx4-devel|libmlx5|libmlx5-devel|librdmacm|librdmacm-devel)-[0-9].*x86_64.rpm" -not -path "*UPSTREAM*" -exec rpm --install {} +
    rm -rf /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tg

Depending on the desired workflow, the next step might be to use a text editor to add the steps to build an HPC application to the Dockerfile or Singularity definition file, or it might be to extend the HPCCM recipe to add the steps to build an HPC application.

#### Exercises

1. Modify [cuda-gcc-openmpi.py](/lab/edit/hpccm/cuda-gcc-openmpi.py) to use version 7 of the GNU compiler.  Refer to the [gnu building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#gnu) documentation for details.

2. Modify [cuda-gcc-openmpi.py](/lab/edit/hpccm/cuda-gcc-openmpi.py) to use the PGI compilers. Change `compiler = gnu()` to `compiler = pgi(eula=True)`. Note: The PGI compiler EULA must be accepted in order to use the [PGI building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#pgi).

3. Modify [cuda-gcc-openmpi.py](/lab/edit/hpccm/cuda-gcc-openmpi.py) so that the Linux distribution is Ubuntu instead of CentOS.  Modify the base image from `nvidia/cuda:9.0-devel-centos7` to `nvidia/cuda:9.0-devel-ubuntu16.04`.

#### HPCCM Python Module

The `hpccm` command line tool is not required.  An HPCCM recipe can also be expressed as a normal Python script using the HPCCM Python module.  The equivalent of the preceding recipe is [script-cuda-gcc-openmpi.py](/lab/edit/hpccm/script-cuda-gcc-openmpi.py).

The "recipe" itself is exactly the same, but the Python script requires additional code to import the Python modules, parse input, and print output, all of which the `hpccm` command line tool handles automatically. In exchange, the script allows precise control over its behavior. For instance, additional command line arguments could be added to specify the compiler version, compiler suite, Linux distribution, and so on. Note that it is also possible to tailor the behavior of recipes processed by the `hpccm` command line tool with user arguments. Another possible enhancement would be to write the output to a file instead of printing it to standard output.
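The extra code a stand-alone script needs is mostly argument parsing and output handling. The sketch below shows only that scaffolding, using the Python standard library. `render_recipe` is a hypothetical placeholder for building the stage with the HPCCM Python module and converting it to text, and the `--output` option is an assumed enhancement, not something the lab script is known to provide.

```python
import argparse
import sys


def render_recipe(container_format):
    """Hypothetical placeholder: a real script would build the stage with
    the HPCCM Python module and return its text for the chosen format."""
    if container_format == 'docker':
        return 'FROM nvidia/cuda:9.0-devel-centos7'
    return 'BootStrap: docker\nFrom: nvidia/cuda:9.0-devel-centos7'


def main(argv):
    parser = argparse.ArgumentParser(description='Recipe script scaffolding')
    parser.add_argument('--format', default='docker',
                        choices=['docker', 'singularity'])
    parser.add_argument('--output', default=None,
                        help='write to this file instead of standard output')
    args = parser.parse_args(argv)

    text = render_recipe(args.format)
    if args.output:
        # Assumed enhancement: write the specification to a file.
        with open(args.output, 'w') as f:
            f.write(text + '\n')
    else:
        sys.stdout.write(text + '\n')
    return text


# Example invocation with an explicit argument list.
main(['--format', 'singularity'])
```

Passing the argument list explicitly keeps the example independent of how the interpreter was started; a real script would typically call `main(sys.argv[1:])`.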

In [35]:
!python3 hpccm/script-cuda-gcc-openmpi.py --format docker

FROM nvidia/cuda:9.0-devel-centos7

# Mellanox OFED version 3.4-1.0.0.0
RUN yum install -y \
        findutils \
        libnl \
        libnl3 \
        numactl-libs \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-3.4-1.0.0.0/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz -C /var/tmp -z && \
    find /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|libibumad|libibumad-devel|libibverbs|libibverbs-devel|libibverbs-utils|libmlx4|libmlx4-devel|libmlx5|libmlx5-devel|librdmacm|librdmacm-devel)-[0-9].*x86_64.rpm" -not -path "*UPSTREAM*" -exec rpm --install {} + && \
    rm -rf /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64.tgz /var/tmp/MLNX_OFED_LINUX-3.4-1.0.0.0-rhel7.2-x86_64

# GNU compiler

### MPI Bandwidth

The [MPI Bandwidth sample program](/lab/edit/sources/mpi_bandwidth.c) from the Lawrence Livermore National Laboratory (LLNL) will be used as a proxy application to illustrate how to use HPCCM recipes to create application containers.

The CentOS 7 base image is sufficient for this example. The Mellanox OFED user space libraries, a compiler, and an MPI library are also needed. For this tutorial section, the GNU compiler and OpenMPI will be used. The corresponding HPCCM recipe is:

```python
Stage0 += baseimage(image='centos:7')
Stage0 += gnu(fortran=False)
Stage0 += mlnx_ofed()
Stage0 += openmpi(cuda=False)
```

The next step is to build the MPI Bandwidth program from source. First the source code must be copied into the container, and then it must be compiled. For both of these steps, HPCCM [primitives](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/primitives.md) will be used. HPCCM primitives are wrappers around the native container specification operations; they translate a conceptual operation into the corresponding syntax of the chosen container format. Primitives also hide many of the behavioral differences between the Docker and Singularity container image build processes, so that behavior is consistent regardless of the output container specification format.

```python
Stage0 += copy(src='sources/mpi_bandwidth.c', dest='/var/tmp/mpi_bandwidth.c')
```

Note: The MPI Bandwidth source code could also be downloaded as part of the container build itself, e.g., using wget. The [MPI Bandwidth example recipe](https://github.com/NVIDIA/hpc-container-maker/blob/master/recipes/mpi_bandwidth.py) distributed with HPCCM does this.
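To make the idea of a primitive concrete, the toy function below renders one conceptual copy operation in either syntax. It is a stand-alone sketch, not HPCCM's `copy` implementation: the name `render_copy` is hypothetical, and the two outputs are simply the standard Dockerfile `COPY` instruction and the Singularity `%files` section.

```python
def render_copy(src, dest, fmt='docker'):
    """Render a file copy for the requested container specification format."""
    if fmt == 'docker':
        return 'COPY {} {}'.format(src, dest)
    if fmt == 'singularity':
        return '%files\n    {} {}'.format(src, dest)
    raise ValueError('unknown format: {}'.format(fmt))


for fmt in ('docker', 'singularity'):
    print(render_copy('sources/mpi_bandwidth.c',
                      '/var/tmp/mpi_bandwidth.c', fmt))
```

The caller states what should happen once, and the format-specific syntax is chosen at rendering time.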

Finally, compile the program binary using the mpicc MPI compiler wrapper.

```python
Stage0 += shell(commands=[
    'mpicc -o /usr/local/bin/mpi_bandwidth /var/tmp/mpi_bandwidth.c'])
```

Note: In a production container image, a cleanup step would typically also be performed to remove the source code and any other build artifacts. That step is skipped here. [Multi-stage Docker builds](https://docs.docker.com/develop/develop-images/multistage-build/) are another approach that separates the application build process from the application deployment.

View the complete [MPI Bandwidth recipe](/lab/edit/hpccm/mpi_bandwidth.py).

To run MPI Bandwidth from a container, first generate the Dockerfile.

In [36]:
!hpccm --recipe hpccm/mpi_bandwidth.py --format docker > Dockerfile.mpi_bandwidth
!cat Dockerfile.mpi_bandwidth

FROM centos:7.6.1810

# GNU compiler
RUN yum install -y \
        gcc \
        gcc-c++ && \
    rm -rf /var/cache/yum/*

# Mellanox OFED version 4.5-1.0.1.0
RUN yum install -y \
        findutils \
        libnl \
        libnl3 \
        numactl-libs \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-4.5-1.0.1.0/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz -C /var/tmp -z && \
    find /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|libibumad|libibumad-devel|libibverbs|libibverbs-devel|libibverbs-utils|libmlx4|libmlx4-devel|libmlx5|libmlx5-devel|librdmacm|librdmacm-devel)-[0-9].*x86_64.rpm" -not -path "*UPSTREAM*" -exec rpm --install {} + && \
    rm -rf /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1

Second, create the Docker container image.  The cell immediately below will load prebuilt (cached) versions of the Docker image layers up to and including OpenMPI to significantly reduce the container image build time.  This is not strictly required, but the MPI Bandwidth container image will take 10-15 minutes to build if the cache is not loaded.

In [37]:
!sudo docker load -i cache/mpi_bandwidth_cache.tar.xz

9d87dbe2: Loading layer  209.5MB/209.5MB
95400fca: Loading layer  124.1MB/124.1MB
770fd8d8: Loading layer  25.24MB/25.24MB
d8094ef1: Loading layer  27.39MB/27.39MB
15666c78: Loading layer  67.98MB/67.98MB
b2683514: Loading layer  13.66MB/13.66MB
Loaded image: mpi_bandwidth:cache
Loaded image ID: sha256:99b7d77949090d107faa15c7a02e6f9be314d3adc30231c15666f68be7e9d42e
Loaded image ID: sha256:5abb39b93e36956855a139e73acb9f1c18ec642589869f71430a92658b304409
Loaded image ID: sha256:53864f8e60acb828158af314d7f358da4ac51c446f07e81fc1f093fdbbaa5446
Loaded image ID: sha256:1636dc6fc5be4baa8778c9f7b7731afdca44dab384d923473e3b1bf4ce4ed22a
Loaded image ID: sha256:ebdb

In [38]:
!docker images

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
hello-world         multistage          c2ed5d8edcd6        9 minutes ago       131MB
hello-world         solution            b4d4da7a8d8f        11 minutes ago      305MB
hello-world         exercise            bbd3329734fd        11 minutes ago      335MB
<none>              <none>              37ba1df9c1a3        14 minutes ago      131MB
third-image         latest              c0574305c990        14 minutes ago      131MB
second-image        latest              1eae4b8ee152        15 minutes ago      131MB
first-image         latest              4d8180b4869f        18 minutes ago      131MB
ubuntu              16.04               9499db781771        7 weeks ago         131MB
centos              7                   8652b9f0cb4c        2 months ago        204MB
mpi_bandwidth       cache               99b7d7794909        12 months ago       454MB


In [40]:
!sudo docker build -t mpi_bandwidth -f Dockerfile.mpi_bandwidth .

Sending build context to Docker daemon  97.43MB
Step 1/9 : FROM centos:7.6.1810
7.6.1810: Pulling from library/centos
Digest: sha256:62d9e1c2daa91166139b51577fe4f4f6b4cc41a3a2c7fc36bd895e2a17a3e4e6
Status: Downloaded newer image for centos:7.6.1810
 ---> f1cb7c7d58b7
Step 2/9 : RUN yum install -y         gcc         gcc-c++ &&     rm -rf /var/cache/yum/*
 ---> Using cache
 ---> da190768f0a2
Step 3/9 : RUN yum install -y         findutils         libnl         libnl3         numactl-libs         wget &&     rm -rf /var/cache/yum/*
 ---> Using cache
 ---> ebdbbe742ec4
Step 4/9 : RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-4.5-1.0.1.0/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz &&     mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz -C /var/tmp -z &&     find /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|

Third, convert the Docker container image to a Singularity container image.  The `docker-daemon` endpoint tells Singularity to use the local Docker image repository.

In [41]:
!singularity build mpi_bandwidth.sif docker-daemon://mpi_bandwidth:latest

INFO:    Starting build...
Getting image source signatures
Copying blob 89169d87dbe2 done
Copying blob 3a6295400fca done
Copying blob 10a4770fd8d8 done
Copying blob 59b815666c78 [====>------------------------------] 9.2MiB / 64.8MiB

Finally, run MPI Bandwidth using Singularity with 2 MPI ranks.

In [None]:
!singularity exec mpi_bandwidth.sif mpirun -n 2 -mca btl_base_warn_component_unused 0 /usr/local/bin/mpi_bandwidth

The exact same container images may also be used for multi-node runs, but that is beyond the scope of this lab. The webinar [Scaling Out HPC Workflows with NGC and Singularity](https://info.nvidia.com/simplfying-workflows-with-singularity-reg-page.html?ondemandrgt=yes) is a good reference for multi-node MPI runs.

#### Exercises

1. Modify [mpi_bandwidth.py](/lab/edit/hpccm/mpi_bandwidth.py) to use MVAPICH2 instead of OpenMPI.  Consult the [MVAPICH2 building block](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md#mvapich2) documentation for more information.

### User Arguments

Using Python to express container specifications is one of the key features of HPCCM. Python recipes can process user input to generate multiple container specification permutations from the same source code.

Consider the case where the CUDA version and OpenMPI version are user specified values. If not specified, default values should be used. In addition, the user supplied values should be verified to be valid version numbers.

The `hpccm` command line tool has the `--userarg` option. Values specified using this option are inserted into a Python dictionary named `USERARG` that can be accessed inside a recipe.

It's similar to the [`ARG` Dockerfile instruction](https://docs.docker.com/engine/reference/builder/#arg), but more powerful since you can process the arguments with Python.  For instance, the input can be validated.
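The default-plus-validation pattern described here can be sketched in plain Python. This is not the lab's `userargs.py`; the helper names are hypothetical, a plain dictionary stands in for `USERARG`, and the regular expression is one assumed way to check a version string. The defaults (CUDA 9.1, OpenMPI 3.1.2) match what the lab's recipe output shows when no user arguments are given.

```python
import re


def get_version(userarg, key, default):
    """Return userarg[key] (or the default) if it looks like a version."""
    value = userarg.get(key, default)
    # Accept dotted numeric versions such as 9.1 or 3.1.2.
    if not re.fullmatch(r'\d+(\.\d+)*', value):
        raise ValueError("invalid version number '{}'".format(value))
    return value


def base_image(userarg):
    """Build the CUDA base image tag from the validated user arguments."""
    cuda = get_version(userarg, 'cuda', '9.1')
    return 'nvidia/cuda:{}-devel-ubuntu16.04'.format(cuda)


# Plain dictionaries stand in for USERARG.
print(base_image({}))                     # nvidia/cuda:9.1-devel-ubuntu16.04
print(base_image({'cuda': '10.0'}))       # nvidia/cuda:10.0-devel-ubuntu16.04
print(get_version({}, 'ompi', '3.1.2'))   # 3.1.2

try:
    base_image({'cuda': 'nine_point_zero'})
except ValueError as err:
    print('ERROR: {}'.format(err))
```

Because the check runs before any container specification is generated, a bad value fails fast instead of producing a Dockerfile that references a nonexistent image tag.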

The [userargs.py](/lab/edit/hpccm/userargs.py) recipe demonstrates user arguments.

In [42]:
!hpccm --recipe hpccm/userargs.py

FROM nvidia/cuda:9.1-devel-ubuntu16.04

# OpenMPI version 3.1.2
RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        bzip2 \
        file \
        hwloc \
        libnuma-dev \
        make \
        openssh-client \
        perl \
        tar \
        wget && \
    rm -rf /var/lib/apt/lists/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v3.1/downloads/openmpi-3.1.2.tar.bz2 && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-3.1.2.tar.bz2 -C /var/tmp -j && \
    cd /var/tmp/openmpi-3.1.2 &&   ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda --without-verbs && \
    make -j$(nproc) && \
    make -j$(nproc) install && \
    rm -rf /var/tmp/openmpi-3.1.2.tar.bz2 /var/tmp/openmpi-3.1.2
ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/usr/local/openmpi/bin:$

In [43]:
!hpccm --recipe hpccm/userargs.py --userarg cuda=10.0 ompi=3.1.3

FROM nvidia/cuda:10.0-devel-ubuntu16.04

# OpenMPI version 3.1.3
RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        bzip2 \
        file \
        hwloc \
        libnuma-dev \
        make \
        openssh-client \
        perl \
        tar \
        wget && \
    rm -rf /var/lib/apt/lists/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://www.open-mpi.org/software/ompi/v3.1/downloads/openmpi-3.1.3.tar.bz2 && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/openmpi-3.1.3.tar.bz2 -C /var/tmp -j && \
    cd /var/tmp/openmpi-3.1.3 &&   ./configure --prefix=/usr/local/openmpi --disable-getpwuid --enable-orterun-prefix-by-default --with-cuda --without-verbs && \
    make -j$(nproc) && \
    make -j$(nproc) install && \
    rm -rf /var/tmp/openmpi-3.1.3.tar.bz2 /var/tmp/openmpi-3.1.3
ENV LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/usr/local/openmpi/bin:

#### Exercise

1. Try specifying invalid or out-of-range user arguments.

In [44]:
!hpccm --recipe hpccm/userargs.py --userarg cuda=nine_point_zero ompi=4.0.0

ERROR: invalid version number 'nine_point_zero'


### Multi-stage Recipes

[Multi-stage Docker builds](https://docs.docker.com/develop/develop-images/multistage-build/) are a very useful capability that separates the application build step from the deployment step. The development toolchain, application source code, and build artifacts are not necessary when deploying the built application inside a container. In fact, they can significantly and unnecessarily increase the size of the container image.

The `hpccm` command line tool automatically creates two stages, `Stage0` and `Stage1`. Most [building blocks](https://github.com/NVIDIA/hpc-container-maker/blob/master/docs/building_blocks.md) provide a `runtime` method to install the corresponding runtime version of a component.

The [multistage.py](/lab/edit/hpccm/multistage.py) recipe installs the GNU compiler in the first (build) stage, but only the corresponding runtime libraries in the second (deployment) stage. Building block settings defined in the first stage are automatically reflected in the second stage.

In [45]:
!hpccm --recipe hpccm/multistage.py

FROM nvidia/cuda:9.0-devel-centos7

# GNU compiler
RUN yum install -y \
        gcc \
        gcc-c++ \
        gcc-gfortran && \
    rm -rf /var/cache/yum/*

FROM nvidia/cuda:9.0-base-centos7

# GNU compiler runtime
RUN yum install -y \
        libgfortran \
        libgomp && \
    rm -rf /var/cache/yum/*


### Multi-stage MPI Bandwidth

By adding just a few more lines to the recipe, the MPI Bandwidth example can be improved from a [single stage recipe](/lab/edit/hpccm/mpi_bandwidth.py) to a [multi-stage recipe](/lab/edit/hpccm/mpi_bandwidth_multistage.py).

In [46]:
!hpccm --recipe hpccm/mpi_bandwidth_multistage.py --format docker > Dockerfile.mpi_bandwidth_multistage
!cat Dockerfile.mpi_bandwidth_multistage

FROM centos:7.6.1810

# GNU compiler
RUN yum install -y \
        gcc \
        gcc-c++ && \
    rm -rf /var/cache/yum/*

# Mellanox OFED version 4.5-1.0.1.0
RUN yum install -y \
        findutils \
        libnl \
        libnl3 \
        numactl-libs \
        wget && \
    rm -rf /var/cache/yum/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-4.5-1.0.1.0/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz && \
    mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz -C /var/tmp -z && \
    find /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|libibumad|libibumad-devel|libibverbs|libibverbs-devel|libibverbs-utils|libmlx4|libmlx4-devel|libmlx5|libmlx5-devel|librdmacm|librdmacm-devel)-[0-9].*x86_64.rpm" -not -path "*UPSTREAM*" -exec rpm --install {} + && \
    rm -rf /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1

In [47]:
!sudo docker build -t mpi_bandwidth:multistage -f Dockerfile.mpi_bandwidth_multistage .

Sending build context to Docker daemon  219.6MB
Step 1/17 : FROM centos:7.6.1810
 ---> f1cb7c7d58b7
Step 2/17 : RUN yum install -y         gcc         gcc-c++ &&     rm -rf /var/cache/yum/*
 ---> Using cache
 ---> da190768f0a2
Step 3/17 : RUN yum install -y         findutils         libnl         libnl3         numactl-libs         wget &&     rm -rf /var/cache/yum/*
 ---> Using cache
 ---> ebdbbe742ec4
Step 4/17 : RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp http://content.mellanox.com/ofed/MLNX_OFED-4.5-1.0.1.0/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz &&     mkdir -p /var/tmp && tar -x -f /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64.tgz -C /var/tmp -z &&     find /var/tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.2-x86_64 -regextype posix-extended -type f -regex ".*(libibmad|libibmad-devel|libibumad|libibumad-devel|libibverbs|libibverbs-devel|libibverbs-utils|libmlx4|libmlx4-devel|libmlx5|libmlx5-devel|librdmacm|librdmacm-devel)-[0-9].*x86_64.rpm" -not

In [48]:
!singularity build mpi_bandwidth_multistage.sif docker-daemon://mpi_bandwidth:multistage

INFO:    Starting build...
Getting image source signatures
Copying blob 89169d87dbe2 done
Copying blob a09033ea1c17 done
Copying blob 6f57b53a7eb4 done
Copying blob 0099eeedd692 [===>-------------------------------] 2.9MiB / 27.5MiB
Copying blob 0f9fa727eb64 done

The multi-stage container image provides the same functionality, but it is smaller because the development environment is not redistributed with the MPI Bandwidth workload.

In [49]:
!singularity exec mpi_bandwidth_multistage.sif mpirun -n 2 -mca btl_base_warn_component_unused 0 /usr/local/bin/mpi_bandwidth


******************** MPI Bandwidth Test ********************
Message start size= 100000 bytes
Message finish size= 1000000 bytes
Incremented by 100000 bytes per iteration
Roundtrips per iteration= 100
MPI_Wtick resolution = 1.000000e-09
************************************************************
task    0 is on f45ad7e41aac partner=   1
task    1 is on f45ad7e41aac partner=   0
************************************************************
***Message size:   100000 *** best  /  avg  / worst (MB/sec)
   task pair:    0 -    1:    5991.43 / 5721.74 / 2685.86 
   OVERALL AVERAGES:          5991.43 / 5721.74 / 2685.86 

***Message size:   200000 *** best  /  avg  / worst (MB/sec)
   task pair:    0 -    1:    6264.49 / 6123.50 / 4924.35 
   OVERALL AVERAGES:          6264.49 / 6123.50 / 4924.35 

***Message size:   300000 *** best  /  avg  / worst (MB/sec)
   task pair:    0 -    1:    6400.27 / 6262.15 / 4896.64 
   OVERALL AVERAGES:          6400.27 / 6262.15 / 4896.64 

***Message size:

In [50]:
!ls -lh mpi_bandwidth*.sif

-rwxr-xr-x 1 labuser labuser 117M Jan 20 20:37 mpi_bandwidth.sif
-rwxr-xr-x 1 labuser labuser  78M Jan 20 20:42 mpi_bandwidth_multistage.sif
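
The saving from the multi-stage build can be quantified from the two sizes in the listing above. The following is a minimal sketch: the helper names `parse_size` and `reduction_percent` are hypothetical, and it assumes the `ls -lh` suffixes denote binary multiples (K, M, G).

```python
# Hypothetical helpers for comparing image sizes as reported by `ls -lh`.
# Assumption: the K/M/G suffixes are binary multiples (1M = 1024**2 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a human-readable size such as '117M' into a number of bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return float(text[:-1]) * UNITS[text[-1].upper()]
    return float(text)


def reduction_percent(before, after):
    """Return the percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


single = parse_size("117M")  # mpi_bandwidth.sif
multi = parse_size("78M")    # mpi_bandwidth_multistage.sif
saving = reduction_percent(single, multi)
print(f"multi-stage image is {saving:.1f}% smaller")
```

With the sizes shown above, the multi-stage image comes out about one third smaller than the single-stage image.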


## miniWeather: A Simple Example Application

The [miniWeather code](https://github.com/mrnorman/miniWeather) mimics the basic dynamics seen in atmospheric weather and climate. The dynamics are dry, compressible, stratified, non-hydrostatic flows dominated by buoyant forces that are relatively small perturbations on a hydrostatic background state.  The equations in this code form the backbone of nearly all fluid dynamics codes, and this particular flavor forms the base of all weather and climate modeling.

With about 500 total lines of code (and only about 200 lines that you care about), it serves as an approachable place to learn parallelization and porting using MPI + X, where X is OpenMP, OpenACC, CUDA, or potentially other approaches to CPU and accelerated parallelization.

To build and run this code, you need MPI, parallel-netcdf, and an OpenACC compiler (PGI).  Fortunately, there are HPCCM building blocks for all of these.

While the [HPCCM `pgi` building block](/lab/edit/hpccm/miniweather_pgi_bb.py) could be used for the compiler, we will use the PGI compiler container image (`nvcr.io/hpc/pgi-compilers:ce`) from the [NVIDIA GPU Cloud](https://ngc.nvidia.com) instead to speed up the build process.  The PGI compiler container image is what you prefetched at the beginning of the course.  The download should be complete now.

The [miniWeather recipe](/lab/edit/hpccm/miniweather.py) uses OpenMPI from the PGI compiler installation.

The first two steps are to generate the Dockerfile from the HPCCM recipe and to build the corresponding Docker container image.

The miniWeather container image will take about 5 minutes to build, assuming the PGI compiler container image was prefetched.

In [53]:
!hpccm --recipe hpccm/miniweather.py > Dockerfile.miniweather
!sudo docker build -t miniweather -f Dockerfile.miniweather .

Sending build context to Docker daemon    301MB
Step 1/17 : FROM nvcr.io/hpc/pgi-compilers:ce AS build
unauthorized: authentication required


The miniWeather recipe uses a multi-stage build to reduce the Docker container image size from approximately 9 gigabytes to about 500 megabytes.  After conversion to Singularity, the final container image size is about 120 megabytes.

Third, convert the Docker container image into a Singularity container image. This allows Singularity to (indirectly) take advantage of multi-stage builds.

In [52]:
!singularity build miniweather.sif docker-daemon://miniweather:latest

INFO:    Starting build...
FATAL:   While performing build: conveyor failed to get: Error loading image from docker engine: Error response from daemon: reference does not exist


Fourth and finally, run the OpenACC version of the code on a single MPI rank, using the `--nv` option to enable GPU support in Singularity.

In [None]:
!singularity exec --nv miniweather.sif mpirun -n 1 -mca btl_base_warn_component_unused 0 /opt/miniWeather/bin/miniWeather_mpi_openacc

The code is configured to run the "injection" case: a narrow jet of fast, slightly cold wind is injected from the left side of the domain into a balanced, neutral atmosphere at rest.  This case has nothing to do with atmospheric flows. It's just here for looks.

In [54]:
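import netCDF4
import matplotlib.pyplot as plt
%matplotlib inline
# Open the NetCDF output file read-only
f = netCDF4.Dataset('output.nc', 'r')
# Select the 'theta' variable and plot the 2D slice at index 90 of its first axis
theta = f.variables['theta']
plt.imshow(theta[90,:,:], origin='lower')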

FileNotFoundError: [Errno 2] No such file or directory: b'output.nc'
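
Each step in this section consumes a file produced by an earlier step, so a failure early in the chain shows up later as a missing file. The following is a minimal sketch of a check that reports the first missing artifact. The function name `first_missing` and the step labels are hypothetical, and it assumes that the miniWeather run writes `output.nc` to the working directory. The Docker build step is not listed because its result lives in the Docker daemon rather than in a file.

```python
from pathlib import Path

# Files in the order the steps above produce them.
# Assumption: `output.nc` is written to the working directory by the run step.
ARTIFACTS = [
    ("generate Dockerfile", "Dockerfile.miniweather"),
    ("convert to Singularity", "miniweather.sif"),
    ("run miniWeather", "output.nc"),
]


def first_missing(artifacts, workdir="."):
    """Return (step, filename) for the first file not found, or None."""
    for step, filename in artifacts:
        if not (Path(workdir) / filename).exists():
            return step, filename
    return None


missing = first_missing(ARTIFACTS)
if missing is None:
    print("all artifacts present; output.nc can be plotted")
else:
    print(f"missing {missing[1]!r}: revisit the step '{missing[0]}'")
```

Running a check like this before the plotting cell points at the step to revisit, rather than surfacing only the final `FileNotFoundError`.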

## Summary

In this lab, you have learned:

1. How to build container images using Singularity.  Singularity container images are single "flat" files, making them easy to use at runtime.

2. How to build container images using Docker.  Image layers are an important concept, enabling cached and multi-stage builds.  However, incorrect use of image layers can lead to unnecessarily large container images.

3. How to use HPC Container Maker, an open source tool that simplifies the specification of container images.  From a Python recipe, it can generate either a Dockerfile or a Singularity definition file.  Python is a more powerful language for expressing container specifications, and the HPCCM building blocks separate the high-level choice of which HPC software components to include in a container image from the low-level complexities.

You should now understand the benefits of building HPC container images using the workflow:

1. Specify the content of container images with [HPC Container Maker](https://github.com/NVIDIA/hpc-container-maker)
2. Build container images with Docker
3. Convert the Docker images to Singularity images
4. Use Singularity to run containers on your HPC system
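
The four steps of this workflow map onto four shell commands. The following is a minimal sketch that assembles them for a given image name without executing anything. The function `workflow_commands` and its parameters are hypothetical; the commands themselves follow the ones used for miniWeather earlier in this lab, including `--nv`, which is only needed for GPU workloads.

```python
import shlex


def quote(args):
    """Join an argument list into a single shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in args)


def workflow_commands(name, recipe, run_args):
    """Return the four workflow steps as shell command strings (not executed)."""
    dockerfile = f"Dockerfile.{name}"
    sif = f"{name}.sif"
    return [
        # 1. Generate a Dockerfile from the HPCCM recipe (hpccm writes to stdout).
        quote(["hpccm", "--recipe", recipe]) + " > " + shlex.quote(dockerfile),
        # 2. Build the Docker image from the generated Dockerfile.
        quote(["sudo", "docker", "build", "-t", name, "-f", dockerfile, "."]),
        # 3. Convert the local Docker image into a Singularity image.
        quote(["singularity", "build", sif, f"docker-daemon://{name}:latest"]),
        # 4. Run the workload with Singularity (--nv enables GPU support).
        quote(["singularity", "exec", "--nv", sif] + list(run_args)),
    ]


steps = workflow_commands(
    "miniweather",
    "hpccm/miniweather.py",
    ["mpirun", "-n", "1", "/opt/miniWeather/bin/miniWeather_mpi_openacc"],
)
for number, command in enumerate(steps, start=1):
    print(f"{number}. {command}")
```

Keeping the image name as the single input makes the naming convention used throughout the lab explicit: `Dockerfile.<name>`, the Docker tag `<name>`, and `<name>.sif`.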

## Appendix: MILC

MILC is part of a set of codes written by the MIMD Lattice Computation (MILC) collaboration used to study quantum chromodynamics (QCD), the theory of the strong interactions of subatomic physics.  It performs simulations of four dimensional SU(3) lattice gauge theory on MIMD parallel machines.  "Strong interactions" are responsible for binding quarks into protons and neutrons and holding them all together in the atomic nucleus.

MILC is a real HPC application code, unlike miniWeather or MPI Bandwidth.

A [MILC recipe is included as an example](https://github.com/NVIDIA/hpc-container-maker/tree/master/recipes/milc) in the HPCCM GitHub repository.  It demonstrates the usefulness of multi-stage recipes. The Docker container image built from the first stage only is 5.93 GB, whereas the container image is only 429 MB when employing the multi-stage build process. Note that building the container image takes 30-45 minutes.

In [55]:
!curl -O https://raw.githubusercontent.com/NVIDIA/hpc-container-maker/master/recipes/milc/milc.py

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3740  100  3740    0     0  25442      0 --:--:-- --:--:-- --:--:-- 25442


In [56]:
!hpccm --recipe milc.py > Dockerfile.milc

In [57]:
!sudo docker build -t milc -f Dockerfile.milc .

Sending build context to Docker daemon    301MB
Step 1/21 : FROM nvcr.io/nvidia/cuda:10.1-devel-ubuntu18.04 AS devel
10.1-devel-ubuntu18.04: Pulling from nvidia/cuda

cc0b8772: Pulling fs layer
fb62ba5f: Pulling fs layer
964ece6a: Pulling fs layer
c6a19124: Pulling fs layer
7e0c259e: Pulling fs layer
e0db918c: Pulling fs layer
9c05e34e: Pulling fs layer
5100467d: Pulling fs layer
d28d9c57: Pulling fs layer
da626d91: Pull complete

In [58]:
!singularity build milc.sif docker-daemon://milc:latest

INFO:    Starting build...
FATAL:   While performing build: conveyor failed to get: Error loading image from docker engine: Error response from daemon: reference does not exist


In the case of MILC, it's much easier and faster to use MILC from the NVIDIA GPU Cloud (NGC).  Rather than build your own container image from scratch, just download the [MILC container from NGC](https://ngc.nvidia.com/catalog/containers/hpc:milc).

In [59]:
!singularity build milc-ngc.sif docker://nvcr.io/hpc/milc:quda0.8-patch4Oct2017

INFO:    Starting build...
Getting image source signatures
Copying blob 473ede7ed136 done
Copying blob c46b5fa4d940 done


In either case, you can easily run MILC on nearly any system using the container image.  First download a sample dataset.

In [60]:
!mkdir -p $HOME/milc-dataset
!curl -o $HOME/milc-dataset/benchmarks.tar http://dli-lms.s3.amazonaws.com/data/l-ac-25-v1/benchmarks.tar
!tar -xvf $HOME/milc-dataset/benchmarks.tar -C $HOME/milc-dataset

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 36.6M  100 36.6M    0     0  38.9M      0 --:--:-- --:--:-- --:--:-- 38.8M
./
./small/
./small/small.warm.out
./small/small.bench.in
./small/README
./small/run_small.pbs
./small/small.warm.in
./small/lat.small.info
./small/lat.small
./small/small.bench.milc.out
./small/small.bench.out
./README
./medium/
./medium/medium.warm.out
./medium/medium.warm.in
./medium/medium.bench.milc.out
./medium/medium.bench.in
./medium/run_medium.pbs
./medium/medium.bench.out
./ratfunc/
./ratfunc/rat.bench
./ratfunc/rat.bench~
./ratfunc/rat.warm
./check_result.pl


Then run the container using Singularity.  The following is configured to use a single GPU.  The exact same container images may also be used for multi-node runs, but that is beyond the scope of this lab. The webinar [Scaling Out HPC Workflows with NGC and Singularity](https://info.nvidia.com/simplfying-workflows-with-singularity-reg-page.html?ondemandrgt=yes) is a good reference for multi-node MPI runs.

The first cell uses the container image you built yourself, while the second cell uses the container image you downloaded from NGC.

In [61]:
!singularity exec --nv milc.sif mpirun -n 1 -mca btl_base_warn_component_unused 0 -wdir $HOME/milc-dataset/small su3_rhmd_hisq -geom 1 1 1 1 small.bench.in

FATAL:   could not open image /dli/task/milc.sif: failed to retrieve path for /dli/task/milc.sif: lstat /dli/task/milc.sif: no such file or directory


In [62]:
!singularity run --nv milc-ngc.sif mpirun -n 1 -mca btl_base_warn_component_unused 0 -wdir $HOME/milc-dataset/small su3_rhmd_hisq -geom 1 1 1 1 small.bench.in

2021/01/20 21:07:27 GPU driver verification failed: Host driver 0.0 not compatible with container: >=410.48, ==384.00


## Appendix: Converting Docker Container Images with Singularity 2.x

The `docker-daemon` endpoint was introduced in Singularity 3.0, so it is not available in Singularity 2.x.  Fortunately, there is a convenient [container to convert local Docker images into Singularity 2.x images](https://hub.docker.com/r/singularityware/docker2singularity) available on Docker Hub.

For example, to convert the MPI Bandwidth container image to a Singularity 2.x `simg`:

In [63]:
!sudo docker run -t --rm --cap-add SYS_ADMIN -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/output singularityware/docker2singularity mpi_bandwidth

Unable to find image 'singularityware/docker2singularity:latest' locally
latest: Pulling from singularityware/docker2singularity

c3bd43c5: Pulling fs layer
eaf8af20: Pulling fs layer
984849c1: Pulling fs layer
0ad88222: Pulling fs layer
20cf6e8a: Pulling fs layer
603b9086: Pulling fs layer
3955f0b9: Pulling fs layer
2ab0bf2e: Pulling fs layer
74f08b1e: Pulling fs layer
18496deb: Pulling fs layer
283160c9: Pulling fs layer
9e90ef6a: Pulling fs layer
a4a6ea79: Pull complete

## Appendix: Terminology

- Container: a running instantiation of a container image

- Container image: a standalone "package" of software that includes everything needed to run an application

- Container runtime: a software framework to run and manage containers and container images.  Examples: Docker, Singularity.

- Container registry: a server hosting container images for download ("pulling").  Examples: Docker Hub, Singularity Hub, NVIDIA GPU Cloud (NGC).

## Appendix: Getting Your System Container Ready

Docker and Singularity have already been set up for you in this lab environment. For more information on installing Singularity on your own system, please see this brief [video](https://www.youtube.com/watch?v=iOLVqqHQsBU).