## OSU G2G Bandwidth Benchmark with MPI4Py
This example uses [IPCMagic](https://github.com/eth-cscs/ipcluster_magic/tree/master) to run a test from the [OSU Bandwidth benchmark](http://mvapich.cse.ohio-state.edu/benchmarks/) with MPI4Py inside a Jupyter notebook.
Following [this example](https://mpi4py.readthedocs.io/en/stable/tutorial.html#cuda-aware-mpi-python-gpu-arrays), we adapted the [osu_bw.py](https://github.com/mpi4py/mpi4py/blob/d0228f0397403ff73d8f41d90d97b411efda6128/demo/osu_bw.py) script from the MPI4Py repository so that it uses arrays allocated on the GPU, which makes it a GPU-to-GPU (G2G) bandwidth test.
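
The adaptation described above can be sketched roughly as follows. This is a simplified illustration, not the actual `osu_bw_cupy.py` used in this notebook: it assumes CuPy and a CUDA-aware MPI4Py build, leaves out some of the upstream script's tuning parameters, and the helper names `message_sizes` and `bandwidth_mb_s` are hypothetical.

```python
def message_sizes(max_size):
    """Powers of two from 1 byte up to and including max_size."""
    sizes = []
    size = 1
    while size <= max_size:
        sizes.append(size)
        size *= 2
    return sizes


def bandwidth_mb_s(size, loop, window_size, elapsed):
    """Bandwidth in MB/s for `loop` timed windows of `window_size` messages."""
    return size / 1e6 * loop * window_size / elapsed


def osu_bw(skip=10, loop=100, window_size=64, max_msg_size=1 << 22):
    # Imported here so the helpers above can be used without MPI or a GPU.
    import cupy as cp
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    if comm.Get_size() != 2:
        raise SystemExit("This test requires exactly two processes")

    # Buffers live in GPU memory; a CUDA-aware MPI can access them directly.
    s_buf = cp.zeros(max_msg_size, dtype=cp.uint8)
    r_buf = cp.zeros(max_msg_size, dtype=cp.uint8)
    # Make sure the GPU has finished initializing the buffers before MPI uses them.
    cp.cuda.get_current_stream().synchronize()

    if rank == 0:
        print("# MPI G2G Bandwidth Test")
        print("# %-8s%20s" % ("Size [B]", "Bandwidth [MB/s]"))

    for size in message_sizes(max_msg_size):
        requests = [MPI.REQUEST_NULL] * window_size
        comm.Barrier()
        if rank == 0:
            for i in range(loop + skip):
                if i == skip:  # the first `skip` iterations are warm-up only
                    t_start = MPI.Wtime()
                for j in range(window_size):
                    requests[j] = comm.Isend([s_buf, size, MPI.BYTE], 1, 100)
                MPI.Request.Waitall(requests)
                comm.Recv([r_buf, 4, MPI.BYTE], 1, 101)  # short acknowledgement
            elapsed = MPI.Wtime() - t_start
            print("%-10d%20.2f" % (size, bandwidth_mb_s(size, loop, window_size, elapsed)))
        else:
            for i in range(loop + skip):
                for j in range(window_size):
                    requests[j] = comm.Irecv([r_buf, size, MPI.BYTE], 0, 100)
                MPI.Request.Waitall(requests)
                comm.Send([s_buf, 4, MPI.BYTE], 0, 101)
```

Compared with a host-memory version, the only GPU-specific parts of this sketch are the `cp.zeros` allocations and the stream synchronization before the first MPI call.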

* From a shell on Piz Daint, the benchmark can be run with the following Slurm job script:

```
#!/bin/bash -l

#SBATCH --job-name=osubw
#SBATCH --time=00:05:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --account=<project>

# source python environment with cupy and mpi4py

export MPICH_RDMA_ENABLED_CUDA=1

srun python osu_bw_cupy.py
```

In [1]:
import os
import ipcmagic

In [2]:
os.environ['MPICH_RDMA_ENABLED_CUDA'] = '1'  # Enable direct communication between GPUs

In [None]:
%ipcluster --version

In [3]:
%ipcluster start -n 2

In [None]:
# Disable IPyParallel's progress bar
%pxconfig --progress-after -1

In [4]:
%%px
import socket

socket.gethostname()

Out[1:1]: 'nid02125'

Out[0:1]: 'nid02124'

In [5]:
%%px
from osu_bw_cupy import osu_bw

In [6]:
%%px
osu_bw()

[stdout:0] # MPI G2G Bandwidth Test
# Size [B]    Bandwidth [MB/s]
1                         0.15
2                         0.30
4                         0.64
8                         1.28
16                        2.46
32                        5.06
64                       10.10
128                      20.37
256                      40.30
512                      78.17
1024                    162.33
2048                    323.49
4096                    630.32
8192                   1094.98
16384                  1609.12
32768                  2340.20
65536                  4557.27
131072                 6142.58
262144                 7284.93
524288                 8002.00
1048576                8384.16
2097152                8628.62
4194304                8753.23
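
To read the table at a glance, a small script can extract the peak bandwidth and the message size at which half of that peak is first reached. The sketch below is only an illustration: the function names `parse_bandwidth_table` and `summarize` are hypothetical, and the sample contains a few rows copied from the output above.

```python
def parse_bandwidth_table(text):
    """Return (size_bytes, mb_per_s) pairs, skipping '#' header lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[stdout:"):
            # Drop an IPyParallel "[stdout:N]" prefix if present
            line = line.split("]", 1)[1].strip()
        if not line or line.startswith("#"):
            continue
        size, bandwidth = line.split()
        rows.append((int(size), float(bandwidth)))
    return rows


def summarize(rows):
    """Peak bandwidth and the smallest message size reaching half of it."""
    peak = max(bw for _, bw in rows)
    half_size = next(size for size, bw in rows if bw >= peak / 2)
    return peak, half_size


# A few rows copied from the benchmark output
sample = """[stdout:0] # MPI G2G Bandwidth Test
# Size [B]    Bandwidth [MB/s]
32768                  2340.20
65536                  4557.27
131072                 6142.58
4194304                8753.23
"""

peak, half_size = summarize(parse_bandwidth_table(sample))
print(f"peak: {peak / 1e3:.2f} GB/s, half of peak reached at {half_size} B")
```

For these rows the peak is 8753.23 MB/s (about 8.75 GB/s), and half of it is first exceeded at 65536 B.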


In [None]:
%ipcluster stop