## OSU G2G Bandwidth Benchmark with MPI4Py
This example uses [IPCMagic](https://github.com/eth-cscs/ipcluster_magic/tree/master) to run the bandwidth test from the [OSU Micro-Benchmarks](http://mvapich.cse.ohio-state.edu/benchmarks/) with MPI4Py from a Jupyter notebook.
Following [this example](https://mpi4py.readthedocs.io/en/stable/tutorial.html#cuda-aware-mpi-python-gpu-arrays) of CUDA-aware MPI with Python GPU arrays, we adapted the [osu_bw.py](https://github.com/mpi4py/mpi4py/blob/d0228f0397403ff73d8f41d90d97b411efda6128/demo/osu_bw.py) script from the MPI4Py repository so that it uses an array allocated on the GPU, which turns it into a GPU-to-GPU (G2G) bandwidth test.
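
The adapted script itself is not reproduced here, so the following is only a minimal sketch of what such an adaptation could look like, not the actual `osu_bw_cupy.py`. It follows the structure of the upstream `osu_bw.py` and assumes that the GPU buffers are CuPy `uint8` arrays and that the installed MPI4Py and MPI library accept CUDA arrays in buffer specifications. The helper names `message_sizes` and `bandwidth_mb_s` are hypothetical and were introduced for readability.

```python
def message_sizes(max_size=1 << 22):
    """Powers of two from 1 byte up to max_size (inclusive)."""
    sizes = []
    size = 1
    while size <= max_size:
        sizes.append(size)
        size *= 2
    return sizes


def bandwidth_mb_s(size, loop, window_size, seconds):
    """Bandwidth in MB/s (1 MB = 1e6 bytes), using the formula from osu_bw.py."""
    return size / 1e6 * loop * window_size / seconds


def osu_bw(skip=10, loop=100, window_size=64,
           skip_large=2, loop_large=20, window_size_large=64,
           large_message_size=8192, max_msg_size=1 << 22):
    # Imported lazily so the helpers above can be used without MPI or a GPU.
    from mpi4py import MPI
    import cupy as cp

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    if comm.Get_size() != 2:
        raise SystemExit("This test requires exactly two processes"
                         if rank == 0 else None)

    # Assumption: both buffers live in GPU memory and a CUDA-aware MPI
    # library reads from and writes to them directly.
    s_buf = cp.zeros(max_msg_size, dtype=cp.uint8)
    r_buf = cp.zeros(max_msg_size, dtype=cp.uint8)
    cp.cuda.get_current_stream().synchronize()  # buffers ready before MPI uses them

    if rank == 0:
        print("# MPI G2G Bandwidth Test")
        print("# %-8s%20s" % ("Size [B]", "Bandwidth [MB/s]"))

    for size in message_sizes(max_msg_size):
        if size > large_message_size:
            # Fewer iterations for large messages, as in the upstream script
            skip, loop, window_size = skip_large, loop_large, window_size_large
        requests = [MPI.REQUEST_NULL] * window_size
        comm.Barrier()
        if rank == 0:
            s_msg = [s_buf, size, MPI.BYTE]
            r_msg = [r_buf, 4, MPI.BYTE]
            for i in range(loop + skip):
                if i == skip:  # start timing after the warm-up iterations
                    t_start = MPI.Wtime()
                for j in range(window_size):
                    requests[j] = comm.Isend(s_msg, 1, 100)
                MPI.Request.Waitall(requests)
                comm.Recv(r_msg, 1, 101)  # short acknowledgement from rank 1
            t_end = MPI.Wtime()
            bw = bandwidth_mb_s(size, loop, window_size, t_end - t_start)
            print("%-10d%20.2f" % (size, bw))
        elif rank == 1:
            s_msg = [s_buf, 4, MPI.BYTE]
            r_msg = [r_buf, size, MPI.BYTE]
            for i in range(loop + skip):
                for j in range(window_size):
                    requests[j] = comm.Irecv(r_msg, 0, 100)
                MPI.Request.Waitall(requests)
                comm.Send(s_msg, 0, 101)
```

In this sketch the MPI and CuPy imports sit inside `osu_bw()`, so importing the module only defines the functions. To run the file directly as a script, it would additionally need to call `osu_bw()` at the end.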

* From a shell on Piz Daint, the adapted script (`osu_bw_cupy.py`) can be run with the following Slurm job script:
 
```
#!/bin/bash -l

#SBATCH --job-name=osubw
#SBATCH --time=00:05:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account=<project>

# Activate a Python environment that provides CuPy and mpi4py here

# Enable direct communication between GPUs
export MPICH_RDMA_ENABLED_CUDA=1

srun python osu_bw_cupy.py
```

In [1]:
import os
import ipcmagic

In [2]:
os.environ['MPICH_RDMA_ENABLED_CUDA'] = '1'  # Enable direct communication between GPUs

In [3]:
%ipcluster --version

1.1.0


In [4]:
%ipcluster start -n 2

100%|██████████| 2/2 [00:09<00:00,  4.68s/engine]


In [5]:
# Disable IPyParallel's progress bar
%pxconfig --progress-after -1

In [6]:
%%px
# Run this cell remotely on every engine (one per node)
import socket

socket.gethostname()

Out[1:1]: 'nid02239'

Out[0:1]: 'nid02238'

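Since the benchmark is meant to measure transfers between GPUs on different nodes, it can be useful to confirm that no two engines share a host. The following is a small sketch of such a check; `shared_hosts` is a hypothetical helper, and the hostname list is copied by hand from the output above rather than collected from the engines.

```python
from collections import Counter


def shared_hosts(hostnames):
    """Return the hostnames used by more than one engine (empty if none)."""
    counts = Counter(hostnames)
    return sorted(host for host, n in counts.items() if n > 1)


# Hostnames reported by the two engines in the previous cell
hostnames = ["nid02238", "nid02239"]

duplicates = shared_hosts(hostnames)
if duplicates:
    print("Engines sharing a node:", duplicates)
else:
    print("Each engine runs on its own node")
```
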
In [7]:
%%px
from osu_bw_cupy import osu_bw

In [8]:
%%px
osu_bw()

[stdout:0] # MPI G2G Bandwidth Test
# Size [B]    Bandwidth [MB/s]
1                         0.16
2                         0.32
4                         0.62
8                         1.30
16                        2.40
32                        5.16
64                        9.53
128                      18.44
256                      37.60
512                      83.04
1024                    166.59
2048                    332.79
4096                    630.79
8192                   1094.84
16384                  1754.34
32768                  2483.58
65536                  4585.42
131072                 6161.99
262144                 7189.73
524288                 7866.56
1048576                8276.83
2097152                8518.70
4194304                8650.03
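
To summarise a run, the printed table can be parsed back into numbers. The sketch below copies the output above into a string and extracts the peak bandwidth and the smallest message size that reaches at least half of it. The function names `parse_osu_bw` and `half_peak_size` are hypothetical helpers, not part of the benchmark.

```python
results_text = """\
[stdout:0] # MPI G2G Bandwidth Test
# Size [B]    Bandwidth [MB/s]
1                         0.16
2                         0.32
4                         0.62
8                         1.30
16                        2.40
32                        5.16
64                        9.53
128                      18.44
256                      37.60
512                      83.04
1024                    166.59
2048                    332.79
4096                    630.79
8192                   1094.84
16384                  1754.34
32768                  2483.58
65536                  4585.42
131072                 6161.99
262144                 7189.73
524288                 7866.56
1048576                8276.83
2097152                8518.70
4194304                8650.03
"""


def parse_osu_bw(text):
    """Return (size_in_bytes, bandwidth_in_MB_per_s) pairs from the table."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly two fields and start with an integer size;
        # header and prefix lines are skipped.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        rows.append((int(parts[0]), float(parts[1])))
    return rows


def half_peak_size(rows):
    """Smallest message size whose bandwidth is at least half the peak."""
    peak = max(bw for _, bw in rows)
    for size, bw in rows:  # rows are ordered by increasing size
        if bw >= peak / 2:
            return size


rows = parse_osu_bw(results_text)
peak_size, peak_bw = max(rows, key=lambda row: row[1])
print(f"Peak bandwidth: {peak_bw:.2f} MB/s at {peak_size} B")
print(f"Half of the peak is first reached at {half_peak_size(rows)} B")
```

For the run shown here, the peak is 8650.03 MB/s at the largest message size (4194304 B), and half of that is first reached at 65536 B.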


In [ ]:
%ipcluster stop