Running code that requires CUDA-enabled GPUs on multiple platforms
====

The Python script [mandelbrot_gpu.py](https://github.com/edwardchalstrey1/turingbench/blob/master/turingbench_python_cuda/mandelbrot_gpu/mandelbrot_gpu.py) creates a Mandelbrot image using Python's ```numba``` package with the CUDA toolkit on GPUs. For our purposes, we only consider the time taken to create the image, which the script prints (see line 57 of [mandelbrot_gpu.py](https://github.com/edwardchalstrey1/turingbench/blob/master/turingbench_python_cuda/mandelbrot_gpu/mandelbrot_gpu.py)).

This code was taken from [*harrism*'s notebook](https://github.com/harrism/numba_examples/blob/master/mandelbrot_numba.ipynb) featured in the [NVIDIA developer blog](https://devblogs.nvidia.com/numba-python-cuda-acceleration/).
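
To make the timed quantity concrete, here is a minimal CPU-only sketch of the same kind of measurement. It is not the linked script: it uses plain Python rather than ```numba``` and CUDA, and the function names, image size and iteration count are illustrative assumptions.

```python
from timeit import default_timer as timer


def mandel(x, y, max_iters):
    """Return the iteration at which c = x + yj escapes, capped at max_iters."""
    c = complex(x, y)
    z = 0j
    for i in range(max_iters):
        z = z * z + c
        if z.real * z.real + z.imag * z.imag >= 4:
            return i
    return max_iters


def create_fractal(min_x, max_x, min_y, max_y, width, height, max_iters):
    """Build a height x width grid of escape counts (a list of rows)."""
    pixel_x = (max_x - min_x) / width
    pixel_y = (max_y - min_y) / height
    image = []
    for row in range(height):
        imag = min_y + row * pixel_y
        image.append(
            [mandel(min_x + col * pixel_x, imag, max_iters) for col in range(width)]
        )
    return image


# Time only the image creation, which is the quantity compared across platforms.
start = timer()
image = create_fractal(-2.0, 1.0, -1.0, 1.0, 150, 100, 20)
elapsed = timer() - start
print("Mandelbrot created in %f s" % elapsed)
```

The GPU version replaces the nested loops with a CUDA kernel, but the measurement is the same idea: read a timer before and after creating the image and print the difference.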

Dockerfile
---

The Dockerfile below specifies an image that can be used to create a container capable of running ```mandelbrot_gpu.py```.

Running a CUDA container requires a machine with at least one CUDA-capable GPU and a driver compatible with the CUDA toolkit version you are using; see the [requirements table](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA#requirements).

The machine running the CUDA container only requires the NVIDIA driver; the CUDA toolkit doesn't have to be installed on it.
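
As a rough pre-flight check for the driver requirement, the sketch below asks ```nvidia-smi``` for the driver version and compares it with a minimum. The helper names are hypothetical, and the minimum of 384.81 for CUDA 9.0 is an assumption that should be checked against the requirements table linked above.

```python
import shutil
import subprocess

# Assumed minimum Linux driver for CUDA 9.0; verify against the requirements table.
MIN_DRIVER_FOR_CUDA_9_0 = "384.81"


def nvidia_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


def driver_at_least(version, minimum):
    """Compare dotted version strings numerically, e.g. '410.48' >= '384.81'."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(version) >= as_tuple(minimum)


version = nvidia_driver_version()
if version is None:
    print("No NVIDIA driver found (nvidia-smi is missing or failed)")
else:
    ok = driver_at_least(version, MIN_DRIVER_FOR_CUDA_9_0)
    print("Driver %s, sufficient for CUDA 9.0: %s" % (version, ok))
```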

A Docker image was built from this Dockerfile and pushed to [Docker Hub](https://cloud.docker.com/u/edwardchalstrey/repository/docker/edwardchalstrey/mandelbrot_gpu) with:

1. ```docker build -t edwardchalstrey/mandelbrot_gpu .```
2. ```docker push edwardchalstrey/mandelbrot_gpu```

It can then be run with Docker, but this requires nvidia-docker to be installed as well:

```nvidia-docker run edwardchalstrey/mandelbrot_gpu```

If your platform doesn't have nvidia-docker, see the [installation instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(version-2.0)#installing-version-20).
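
If you want to launch the container from a script, for example to collect the printed time, a minimal sketch could look like the following. The function names are hypothetical, and the code only wraps the ```nvidia-docker run``` command shown above; it assumes the container writes its timing to standard output.

```python
import shutil
import subprocess

IMAGE = "edwardchalstrey/mandelbrot_gpu"


def build_run_command(image=IMAGE, which=shutil.which):
    """Return the nvidia-docker command for image, or None if nvidia-docker is not on PATH."""
    if which("nvidia-docker") is None:
        return None
    return ["nvidia-docker", "run", image]


def run_container(image=IMAGE):
    """Run the container and return its standard output, or None if it cannot be run."""
    command = build_run_command(image)
    if command is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else None
```

Calling ```run_container()``` on a machine without nvidia-docker returns ```None``` rather than raising, so the caller can point the user to the installation instructions.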

In [1]:
%%writefile Dockerfile
FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04

RUN apt-get update \
  && apt-get install -y wget vim bzip2 curl \
  && rm -rf /var/lib/apt/lists/*

# Install Miniconda
RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O Miniconda.sh && \
    /bin/bash Miniconda.sh -b -p /opt/conda && \
    rm Miniconda.sh

ENV PATH /opt/conda/bin:$PATH

RUN conda install numpy scipy matplotlib numba cudatoolkit=9.0 pyculib -y

COPY mandelbrot_gpu.py /mandelbrot_gpu.py

CMD python3 mandelbrot_gpu.py



Singularity definition file
-----

The definition file below is based on the Docker image already built and pushed to Docker Hub, so it is very simple.

Singularity commands to build from the Docker Hub image and run the result:

1. ```singularity build mandelbrot_gpu.sif Singularity.mandelbrot_gpu```
2. ```singularity run --nv mandelbrot_gpu.sif```

*Note: when run this way, the Singularity container must be started from a directory that contains a file called ```mandelbrot_gpu.py```. Alternatively, you may prefer to leave %files empty and specify the file to run in the run command.*
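
A likely reason for this, assuming the container's run command keeps the relative path from the Dockerfile's ```CMD python3 mandelbrot_gpu.py``` and that Singularity starts the container in the host's current directory, is that a relative path is resolved against wherever the command is launched. The sketch below illustrates only that path-resolution behaviour in plain Python; the helper name is hypothetical and no container is involved.

```python
import os
import tempfile


def resolves_from(cwd, relative_path):
    """Return True if relative_path exists when resolved from the directory cwd."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return os.path.exists(relative_path)
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as with_script, \
        tempfile.TemporaryDirectory() as without_script:
    # Only the first directory contains the script.
    open(os.path.join(with_script, "mandelbrot_gpu.py"), "w").close()
    found = resolves_from(with_script, "mandelbrot_gpu.py")
    missing = resolves_from(without_script, "mandelbrot_gpu.py")

print("found from directory with script:", found)
print("found from directory without script:", missing)
```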

In this case I have built the image with [Singularity Hub](https://www.singularity-hub.org/) by linking it to my [GitHub repo](https://github.com/edwardchalstrey1/turingbench/tree/master/turingbench_python_cuda/mandelbrot_gpu), which contains the definition file, named such that the image will be built on each commit.

A container based on the image can then be run on any platform with Singularity using the following command (the ```--nv``` option makes the NVIDIA GPU available to the container):

```singularity run --nv shub://singularity-hub.org/edwardchalstrey1/turingbench:mandelbrot_gpu```

In [5]:
%%writefile Singularity.mandelbrot_gpu
Bootstrap: docker
From: edwardchalstrey/mandelbrot_gpu

%post
    apt-get -y update

%files      
    mandelbrot_gpu.py /mandelbrot_gpu.py



JADE submission script
----

To run a Singularity container based on this image on the JADE HPC, a [submission script](http://docs.jade.ac.uk/en/latest/jade/scheduler/index.html) was required.

In JADE, make the submission script executable:
```chmod +x jade_sub.sh```

Then run with a command such as this:
```srun --gres=gpu:1 -p small --pty jade_sub.sh```

*Note: this works without loading JADE's CUDA module (e.g. ```module load cuda/9.0```); loading it appears to break the link to the driver.*
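
The ```#SBATCH``` lines in a submission script are ordinary comments as far as bash is concerned. Slurm's ```sbatch``` reads them, whereas a script launched through ```srun``` takes its resources from the command line (here ```--gres=gpu:1```). As a rough aid for checking what a script asks for, the sketch below lists the directives in a shortened copy of the submission script from this section. The parser is a hypothetical helper, not part of Slurm, and only handles the ```--option=value``` form used here.

```python
SCRIPT = """#!/bin/bash
# set the number of nodes
#SBATCH --nodes=1
# set max wallclock time
#SBATCH --time=00:30:00
# set number of GPUs
#SBATCH --gres=gpu:4
module load singularity
"""


def sbatch_directives(script_text):
    """Collect '#SBATCH --key=value' lines into a dict of key -> value."""
    directives = {}
    for line in script_text.splitlines():
        line = line.strip()
        if not line.startswith("#SBATCH"):
            continue
        option = line[len("#SBATCH"):].strip()
        key, _, value = option.partition("=")
        directives[key.lstrip("-")] = value
    return directives


directives = sbatch_directives(SCRIPT)
for key, value in directives.items():
    print("%s = %s" % (key, value))
```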

In [1]:
%%writefile jade_sub.sh
#!/bin/bash

# set the number of nodes
#SBATCH --nodes=1

# set max wallclock time
#SBATCH --time=00:30:00

# set name of job
#SBATCH --job-name=echalstrey_singularity_cuda_test1

# set number of GPUs
#SBATCH --gres=gpu:4

# mail alert at start, end and abortion of execution
#SBATCH --mail-type=ALL

# send mail to this address
#SBATCH --mail-user=echalstrey@turing.ac.uk

# run the application
module load singularity
singularity run --nv shub://singularity-hub.org/edwardchalstrey1/turingbench:mandelbrot_gpu



Results
------

I can now run ```mandelbrot_gpu.py``` on these platforms:

| Platform | Container | Mandelbrot creation time (s) |
|---|---|---|
| Azure (NVIDIA K80) | Docker | 5.369710 |
| Azure (NVIDIA K80) | Singularity 3.2 | 5.734466 |
| CSD3 (NVIDIA V100) | Singularity | |
| JADE (NVIDIA P100) | Singularity 2.4 | 0.934607 |
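
To put the three measured times in relation to each other, the sketch below computes the relative difference between Singularity and Docker on Azure, and the ratio between the Azure Docker run and the JADE run. The variable and function names are illustrative. Each figure comes from a single run, and the platforms differ in GPU and Singularity version, so these numbers are indicative only and not a controlled benchmark.

```python
# Times in seconds, copied from the results table (CSD3 has no measurement yet).
times = {
    ("Azure K80", "Docker"): 5.369710,
    ("Azure K80", "Singularity 3.2"): 5.734466,
    ("JADE P100", "Singularity 2.4"): 0.934607,
}


def relative_difference(value, baseline):
    """Return (value - baseline) / baseline, e.g. 0.05 means 5% slower."""
    return (value - baseline) / baseline


docker_k80 = times[("Azure K80", "Docker")]
singularity_k80 = times[("Azure K80", "Singularity 3.2")]
singularity_p100 = times[("JADE P100", "Singularity 2.4")]

# Same hardware, different container runtime.
overhead = relative_difference(singularity_k80, docker_k80)
# Different hardware and runtime, so this mixes several effects.
ratio = docker_k80 / singularity_p100

print("Singularity vs Docker on Azure K80: %+.1f%%" % (overhead * 100))
print("Azure K80 (Docker) time / JADE P100 (Singularity) time: %.2f" % ratio)
```

On these figures, Singularity on Azure was roughly 7% slower than Docker, and the JADE run was between 5 and 6 times faster than the Azure Docker run.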