Felix Abecassis edited this page Jul 27, 2020 · 21 revisions

Overview

Pyxis being a SPANK plugin, the new command-line arguments it introduces are directly added to srun.

$ srun --help
...
      --container-image=[USER@][REGISTRY#]IMAGE[:TAG]|PATH
                              [pyxis] the image to use for the container
                              filesystem. Can be either a docker image given as
                              an enroot URI, or a path to a squashfs file on the
                              remote host filesystem.

      --container-mounts=SRC:DST[:FLAGS][,SRC:DST...]
                              [pyxis] bind mount[s] inside the container. Mount
                              flags are separated with "+", e.g. "ro+rprivate"

      --container-workdir=PATH
                              [pyxis] working directory inside the container
      --container-name=NAME   [pyxis] name to use for saving and loading the
                              container on the host. Unnamed containers are
                              removed after the slurm task is complete; named
                              containers are not. If a container with this name
                              already exists, the existing container is used and
                              the import is skipped.
      --container-save=PATH   [pyxis] Save the container state to a squashfs
                              file on the remote host filesystem.
      --container-mount-home  [pyxis] bind mount the user's home directory.
                              System-level enroot settings might cause this
                              directory to be already-mounted.

      --no-container-mount-home
                              [pyxis] do not bind mount the user's home
                              directory
      --container-remap-root  [pyxis] ask to be remapped to root inside the
                              container. Does not grant elevated system
                              permissions, despite appearances.

      --no-container-remap-root
                              [pyxis] do not remap to root inside the container
      --container-entrypoint  [pyxis] execute the entrypoint from the container
                              image

      --no-container-entrypoint
                              [pyxis] do not execute the entrypoint from the
                              container image

--container-image

This argument activates the Pyxis plugin and containerizes the submitted job. If no container registry is specified, the image will be pulled from Docker Hub:

$ srun --container-image=centos grep PRETTY /etc/os-release
PRETTY_NAME="CentOS Linux 8 (Core)"

You can pull the container image from any container registry, as you would with the docker CLI. Note that in the enroot URI syntax the registry is separated from the image name with a # character:

$ srun --container-image nvcr.io#nvidia/pytorch:20.03-py3
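The image reference format shown in the help text, [USER@][REGISTRY#]IMAGE[:TAG], can be assembled from its parts in a script. The following is a minimal sketch; enroot_uri is a hypothetical helper written for illustration, not something provided by pyxis or enroot.

```shell
# Hypothetical helper (not part of pyxis or enroot): assemble an image
# reference of the form [USER@][REGISTRY#]IMAGE[:TAG].
# Usage: enroot_uri REGISTRY IMAGE [TAG] [USER]; empty parts are skipped.
enroot_uri() {
    registry=${1:-}
    image=${2:-}
    tag=${3:-}
    user=${4:-}
    uri=$image
    if [ -n "$tag" ]; then uri="$uri:$tag"; fi
    if [ -n "$registry" ]; then uri="$registry#$uri"; fi
    if [ -n "$user" ]; then uri="$user@$uri"; fi
    printf '%s\n' "$uri"
}

enroot_uri nvcr.io nvidia/pytorch 20.03-py3
enroot_uri "" ubuntu 20.04
```

The result can then be passed to srun, for example: srun --container-image "$(enroot_uri nvcr.io nvidia/pytorch 20.03-py3)" pwd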

You can use a squashfs file (from --container-save or enroot export) by passing its path as the argument:

$ srun --container-image ~/ubuntu.sqsh

If this file is on a shared filesystem, this avoids pulling the same image on every node of your cluster.

--container-mounts

This argument can be used to expose folders or files from the host system to the container. It is similar to the -v (or --mount type=bind) argument of docker run.

For instance, to bind-mount the /mnt folder from the host as /data inside the container:

$ srun --container-image ubuntu --container-mounts /mnt:/data ls /data

Using the same syntax, you can also mount files:

$ srun --container-image ubuntu --container-mounts /etc/os-release:/host/os-release cat /host/os-release

If the source and destination are identical, you can use the short-form with a single path:

$ srun --container-image ubuntu --container-mounts /mnt ls /mnt

You can also use relative paths (using the job's current working directory):

$ srun --container-image ubuntu --container-mounts ./config:/root/config cat /root/config

Finally, you can use additional mount flags such as ro (read-only), to prevent the container from unintentionally modifying the content from the host:

$ srun --container-image ubuntu --container-mounts /tmp/config:/root/config:ro sh -c 'echo oops > /root/config'
/usr/bin/sh: 1: cannot create /root/config: Read-only file system
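According to the help text, several mounts are separated with commas and several flags with "+". When a job needs many mounts, the value can be built up in a script. This is only a sketch: mount_spec and join_mounts are hypothetical helper names, not part of pyxis.

```shell
# Hypothetical helpers (not part of pyxis): build a --container-mounts value.

# mount_spec SRC DST [FLAG...] prints SRC:DST[:FLAG+FLAG...]
mount_spec() {
    spec="$1:$2"
    shift 2
    flags=""
    for flag in "$@"; do
        # Flags are joined with "+", as described in the srun help text.
        flags="${flags:+$flags+}$flag"
    done
    printf '%s\n' "${spec}${flags:+:$flags}"
}

# join_mounts SPEC... joins individual mount specs with commas.
join_mounts() {
    joined=""
    for spec in "$@"; do
        joined="${joined:+$joined,}$spec"
    done
    printf '%s\n' "$joined"
}

mounts=$(join_mounts \
    "$(mount_spec /mnt /data ro rprivate)" \
    "$(mount_spec /etc/os-release /host/os-release)")
echo "$mounts"
```

The resulting string would be passed as a single argument, for example: srun --container-image ubuntu --container-mounts "$mounts" ls /data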

--container-name

This argument is used to save the state of the container filesystem, in order to reuse it across srun commands. This is similar to docker run --name, and it is useful for running or installing additional tools required by the application.

# The file utility is not installed by default.
$ srun --container-image=ubuntu:20.04 which file
srun: error: luna-0173: task 0: Exited with exit code 1

# The following command creates a named container with the name "myubuntu", starting from the ubuntu 20.04 image.
$ srun --container-image=ubuntu:20.04 --container-name=myubuntu sh -c 'apt-get update && apt-get install -y file'

# Reuse the container filesystem created above; you don't need to specify --container-image anymore.
$ srun --container-name=myubuntu which file
/usr/bin/file

If you don't need to add anything to the container, you can also combine this argument with a no-op command like true. This behaves like docker pull followed by docker create, and prepares the container on all nodes before launching the application.

$ srun --container-image=ubuntu:20.04 --container-name=myubuntu true
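Since a named container is reused across srun commands, the preparation and the actual work can be chained as steps of one batch script. The sketch below assumes this pattern; the SRUN variable and the job_steps function are hypothetical, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of a batch script body. SRUN is a hypothetical variable: it
# defaults to "echo srun" so the commands are only printed; set SRUN=srun
# on a real cluster to execute them.
SRUN=${SRUN:-echo srun}

job_steps() {
    # Step 1: import the image once and install extra tools in the
    # named container "myubuntu".
    $SRUN --container-image=ubuntu:20.04 --container-name=myubuntu \
        sh -c 'apt-get update && apt-get install -y file'

    # Step 2: a later step reuses the container by name, so
    # --container-image is not repeated.
    $SRUN --container-name=myubuntu which file
}

job_steps
```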

If the container is running, --container-name will behave like docker exec. This is particularly useful on the login node of the cluster combined with --jobid, to join a running container without having to ssh to the compute node:

# From a compute node, or inside an sbatch script
$ srun --container-name=myapp --container-mounts /mnt:/data ./myapp

# From the login node
$ srun --jobid=432788 --container-name=myapp findmnt /data
TARGET SOURCE               FSTYPE OPTIONS
/data  /dev/nvme2n1p2[/mnt] ext4   rw,relatime,errors=remount-ro

As you will land in the same container, this approach can be used to debug or profile your app with gdb, perf_events, strace, etc.

--container-save

Exports the container filesystem to a squashfs file after the job completes. This file can then be passed to --container-image.

This option is useful to avoid storming a container registry with requests: when running a large distributed job, all the nodes will pull the image simultaneously (unless the layers are already cached on some nodes).

Instead you can have a single job pull the container image, save it to a parallel filesystem, and then all the nodes can use the squashfs file from this shared filesystem.

$ srun --ntasks=1 --container-image nvcr.io#nvidia/pytorch:20.03-py3 --container-save /lustre/felix/pytorch.sqsh true
$ srun --nodes=128 --container-image /lustre/felix/pytorch.sqsh python train.py
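The single-node import only needs to happen when the squashfs file does not exist yet. The guard below is a minimal sketch of that idea; ensure_sqsh and the SRUN variable are hypothetical names, and by default the command is printed rather than executed.

```shell
# SRUN is a hypothetical variable: it defaults to "echo srun" so the
# command is only printed; set SRUN=srun on a real cluster.
SRUN=${SRUN:-echo srun}

# Hypothetical helper: ensure_sqsh IMAGE_URI SQSH_PATH
# Does nothing if SQSH_PATH already exists; otherwise runs a single task
# that pulls IMAGE_URI and saves it to SQSH_PATH.
ensure_sqsh() {
    uri=$1
    sqsh=$2
    if [ -f "$sqsh" ]; then
        return 0    # already exported, nothing to pull
    fi
    $SRUN --ntasks=1 --container-image "$uri" --container-save "$sqsh" true
}
```

A job script could call ensure_sqsh with the image URI and a path on the shared filesystem, then launch the multi-node step with --container-image pointing at that path.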

This argument can also be useful to save the state of a container across jobs, which is not possible with --container-name.

--container-workdir

By default, the working directory of the job will be taken from the container image (WORKDIR in a Dockerfile). This argument is equivalent to docker run --workdir and allows you to override this path.

$ srun --container-image nvcr.io#nvidia/pytorch:20.03-py3 pwd
/workspace

$ srun --container-image nvcr.io#nvidia/pytorch:20.03-py3 --container-workdir /root pwd
/root

--[no-]container-remap-root

These arguments control whether the user will see themselves as UID 0 (root) or their usual UID when inside the container. This feature relies on the user namespaces feature of the Linux kernel, and thus the container is never granted additional privileges.

$ whoami 
fabecassis

$ srun --container-image ubuntu:20.04 --container-remap-root whoami
root

$ srun --container-image ubuntu:20.04 --no-container-remap-root whoami
fabecassis

Being root inside the container is useful to install packages, or in general for any application that expects to be root (e.g. checks that the UID is 0).

Being yourself inside the container is useful for applications that refuse to run as root (like OpenMPI), or scripts that get confused by the sudden change of UID and home directory.

There is a positive and a negative form for this option, as the behavior when neither argument is used depends on the pyxis configuration. The default configuration is to remap root, in which case --container-remap-root is a no-op.
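Because the default depends on the cluster configuration, a script that cares about its UID may want to check it explicitly rather than assume one mode. A minimal sketch, where uid_mode is a hypothetical function name:

```shell
# Hypothetical helper: report whether the given UID (default: the current
# one, from id -u) is root. Inside a container started with
# --container-remap-root, id -u is expected to print 0.
uid_mode() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        echo "root (UID 0)"
    else
        echo "regular user (UID $uid)"
    fi
}

uid_mode
```

Run inside the container, this shows which of the two modes is in effect for the current job step.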

--[no-]container-mount-home

These arguments control whether the user's home directory should be mounted inside the container. This could also be achieved with --container-mounts.

If you are root inside the container (see above section), the directory will be mounted to /root:

$ srun --container-image ubuntu --container-mount-home --container-remap-root sh -c 'echo $HOME ; ls $HOME'
/root
[...]

If your UID is unchanged inside the container, the directory will be mounted at the same location as on the host:

$ echo $HOME
/home/fabecassis

$ srun --container-image ubuntu --container-mount-home --no-container-remap-root sh -c 'echo $HOME ; ls $HOME'
/home/fabecassis
[...]

Mounting your home directory inside of the container is useful if you need to use code or configuration files (such as your .bashrc) stored outside of the container image.

There is a positive and negative form for this option, as the default behavior (when none of these arguments are used) depends on the enroot setting ENROOT_MOUNT_HOME. Mounting the home directory by default can create problems (such as the user's .bashrc overriding environment variables from the container), so it is not recommended.
