# Starting a cluster

## Prerequisites
First, install MPI. On Windows, use MS-MPI:
https://msdn.microsoft.com/en-us/library/bb524831(v=vs.85).aspx


## With a profile (not working)
In theory, you should be able to create a profile using
```
ipython profile create --parallel --profile=myprofile
```
and then set
```
c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'
```
in `<IPYTHON-DIR>/profile_myprofile/ipcluster_config.py`. This should then let you start a cluster using
```
ipcluster start --profile=myprofile
```
or, alternatively, through the Clusters tab in Jupyter.


## Without a profile (not working)
Alternatively, skip the profile and run
```
ipcluster start --engines=MPI
```


## Manual start (working)
Neither of the approaches above works for me on Windows. What does work is starting the controller and the engines by hand:

Start a controller using
```
ipcontroller --ip='*'
```
and then start several engines using mpiexec:
```
mpiexec -n 4 ipengine --mpi
```

In [75]:
import ipyparallel

# Attach to a running cluster
cluster = ipyparallel.Client()  # optionally pass profile='mpi'

print('profile:', cluster.profile)
print('Number of ids:', len(cluster.ids))
print("IDs:", cluster.ids)  # Print the engine ids (not OS process ids)

profile: default
Number of ids: 4
IDs: [0, 1, 2, 3]
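
Engines started with `mpiexec` register with the controller asynchronously, so `cluster.ids` can be shorter than expected right after startup. A minimal sketch of a polling helper follows; `wait_for_ids` is a hypothetical name, not part of ipyparallel, and it only assumes a callable that returns the current list of engine ids.

```python
import time


def wait_for_ids(get_ids, expected, timeout=30.0, poll=0.5):
    """Poll get_ids() until it reports at least `expected` engine ids.

    get_ids is any zero-argument callable returning the current ids,
    for example `lambda: cluster.ids` for the client created above.
    Returns the list of ids, or raises TimeoutError if too few show up.
    """
    deadline = time.monotonic() + timeout
    while True:
        ids = list(get_ids())
        if len(ids) >= expected:
            return ids
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "only %d of %d engines registered" % (len(ids), expected))
        time.sleep(poll)
```

With the client from the cell above, this would be called as `wait_for_ids(lambda: cluster.ids, 4)` before running any `%%px` cells.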


In [76]:
%%px

from mpi4py import MPI

comm = MPI.COMM_WORLD

print("Hello! I'm rank %d from %d running in total..." % (comm.rank, comm.size))

comm.Barrier()   # wait for everybody to synchronize _here_

[stdout:0] Hello! I'm rank 0 from 4 running in total...
[stdout:1] Hello! I'm rank 1 from 4 running in total...
[stdout:2] Hello! I'm rank 2 from 4 running in total...
[stdout:3] Hello! I'm rank 3 from 4 running in total...


In [77]:
%%px

from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

print("Starting")
# passing MPI datatypes explicitly
if rank == 0:
    data = numpy.arange(100, dtype='i')
    numpy.random.shuffle(data)
    comm.Send([data, MPI.INT], dest=1, tag=77)
    print("{0}: sent data to 1: {1}".format(rank, data))
elif rank == 1:
    data = numpy.empty(100, dtype='i')
    comm.Recv([data, MPI.INT], source=0, tag=77)
    print("{0}: received data from 0: {1}".format(rank, data))
else:
    print("{0}: idle".format(rank))

[stdout:0] 
Starting
0: sent data to 1: [55 11 19 64 31 41 86 94 16 39 89 40 10 18 24 12  0  7 62 54 20 48 97  2
 72 53 45 44 52 21  9 33 50 43 51 93 14 82 42 26 36 35  4 49  1  6 87 73
  3 58 79 61 77 23 74 32 56 99 67 95 15 46 96 80 63 30 76 34 22 27 60 57
 17 81 69  5 28  8 90 71 38 70 91 78 88 47 98 75 29 66 83 68 84 25 65 59
 85 92 37 13]
[stdout:1] 
Starting
1: received data from 0: [55 11 19 64 31 41 86 94 16 39 89 40 10 18 24 12  0  7 62 54 20 48 97  2
 72 53 45 44 52 21  9 33 50 43 51 93 14 82 42 26 36 35  4 49  1  6 87 73
  3 58 79 61 77 23 74 32 56 99 67 95 15 46 96 80 63 30 76 34 22 27 60 57
 17 81 69  5 28  8 90 71 38 70 91 78 88 47 98 75 29 66 83 68 84 25 65 59
 85 92 37 13]
[stdout:2] 
Starting
2: idle
[stdout:3] 
Starting
3: idle


In [78]:
%%px

# Let's have matplotlib "inline"
%matplotlib inline

# Import the packages we need
import numpy as np
from matplotlib import animation, rc
from matplotlib import pyplot as plt
#import mpld3

import subprocess
import os
import sys
import gc
import datetime

from importlib import reload

sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), 'gpu_ocean')))

import pycuda.driver as cuda
from pycuda.compiler import SourceModule

# Finally, import our simulator
from SWESimulators import Common, CTCS, PlotHelper, IPythonMagic
# Import initial condition and bathymetry generating functions
from SWESimulators.BathymetryAndICs import *

In [79]:
%%px

%setup_logging --out mpitest.log
%cuda_context_handler gpu_ctx

[stderr:0] 
Console logger using level INFO
File logger using level DEBUG to mpitest.log
Python version 3.7.2 (default, Mar 13 2019, 14:18:46) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
Registering gpu_ctx in user workspace
PyCUDA version 2018.1.1
CUDA version (10, 1, 0)
Driver version 10010
Using 'Tesla P100-PCIE-12GB' GPU
Created context handle <44145632>
[stderr:1] 
Console logger using level INFO
File logger using level DEBUG to mpitest.log
Python version 3.7.2 (default, Mar 13 2019, 14:18:46) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
Registering gpu_ctx in user workspace
PyCUDA version 2018.1.1
CUDA version (10, 1, 0)
Driver version 10010
Using 'Tesla P100-PCIE-12GB' GPU
Created context handle <29335216>
[stderr:2] 
Console logger using level INFO
File logger using level DEBUG to mpitest.log
Python version 3.7.2 (default, Mar 13 2019, 14:18:46) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
Registering gpu_ctx in user workspace
PyCUDA version 2018.1.1
CUDA version (10, 1, 0)
Driver version

In [80]:
%%px

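def run_benchmark(simulator):
    # Time one step(2.0) call on the simulator that was passed in
    # (the original used the global `sim` instead of the argument)
    with Common.Timer(simulator.__class__.__name__ + "_" + str(sim_args["nx"])) as timer:
        t = simulator.step(2.0)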
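
`Common.Timer` belongs to the SWESimulators package and its implementation is not shown here. For readers who want to try the timing pattern without that package, here is a rough stand-in built on the standard library. `simple_timer` is a hypothetical name and is not claimed to match the behaviour of `Common.Timer`.

```python
import time
from contextlib import contextmanager


@contextmanager
def simple_timer(tag, results):
    """Store the wall-clock duration of the with-block in results[tag]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the elapsed time even if the block raises
        results[tag] = time.perf_counter() - start


timings = {}
with simple_timer("CTCS_100", timings):
    sum(range(1000))  # stand-in workload instead of simulator.step(2.0)
```

Afterwards `timings["CTCS_100"]` holds the elapsed time in seconds.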

In [81]:
%%px

reload(Common)
reload(CTCS)

# Set the arguments common to all simulators
sim_args = {
    "gpu_ctx": gpu_ctx,
    "nx": 100, "ny": 200,
    "dx": 200.0, "dy": 200.0,
    "dt": 1,
    "g": 9.81,
    "f": 0.0,
    "r": 0.0,
    "write_netcdf": True,
    "ensemble_size": 4,
    "ensemble_member": rank,  # MPI rank from the earlier cell
}
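
Note that `nx` and `ny` count interior cells only. The next cell pads them with ghost cells and allocates `u0` with one extra column and `v0` with one extra row. The shape arithmetic can be sketched as follows; `staggered_shapes` is a hypothetical helper for illustration and is not part of SWESimulators.

```python
def staggered_shapes(nx, ny, ghosts):
    """Return (centre, u, v) array shapes as (rows, cols) tuples.

    ghosts is [north, east, south, west], as in the notebook.
    """
    north, east, south, west = ghosts
    centre = (ny + north + south, nx + east + west)
    u_shape = (centre[0], centre[1] + 1)  # one extra column
    v_shape = (centre[0] + 1, centre[1])  # one extra row
    return centre, u_shape, v_shape


centre, u_shape, v_shape = staggered_shapes(100, 200, [1, 1, 1, 1])
print(centre, u_shape, v_shape)  # (202, 102) (202, 103) (203, 102)
```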

In [82]:
%%px

ghosts = [1,1,1,1] # north, east, south, west
dataShape = (sim_args["ny"] + ghosts[0]+ghosts[2], 
             sim_args["nx"] + ghosts[1]+ghosts[3])

h0 = np.ones(dataShape, dtype=np.float32) * 60.0
eta0 = np.zeros(dataShape, dtype=np.float32)
u0 = np.zeros((dataShape[0], dataShape[1]+1), dtype=np.float32)
v0 = np.zeros((dataShape[0]+1, dataShape[1]), dtype=np.float32)

# Add a central bump to eta0 for testing
addCentralBump(eta0, sim_args["nx"], sim_args["ny"], sim_args["dx"], sim_args["dy"], ghosts)

# Initialize the simulator
ctcs_args = {"H": h0, "eta0": eta0, "hu0": u0, "hv0": v0, "A": 1.0}
sim = CTCS.CTCS(**ctcs_args, **sim_args)

# Run a short timed simulation
run_benchmark(sim)

CompositeError: one or more exceptions from call to method: execute
[0:execute]: ValueError: parallel mode requires MPI enabled netcdf-c
[1:execute]: ValueError: parallel mode requires MPI enabled netcdf-c
[2:execute]: ValueError: parallel mode requires MPI enabled netcdf-c
[3:execute]: ValueError: parallel mode requires MPI enabled netcdf-c
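
As far as I can tell, this `ValueError` comes from the netCDF4 Python package when a file is opened in parallel mode while the underlying netcdf-c library was built without MPI support. A quick check, assuming netCDF4 is what the simulator uses for output: the two `__has_*_support__` attributes are module-level flags in netCDF4-python, and `getattr` with a default guards against versions that lack them. The function name `netcdf_parallel_support` is hypothetical.

```python
def netcdf_parallel_support():
    """Return True/False for parallel I/O support, or None if netCDF4 is missing."""
    try:
        import netCDF4
    except ImportError:
        return None
    # Either flag being True means the build can open files in parallel mode
    return bool(
        getattr(netCDF4, "__has_parallel4_support__", False)
        or getattr(netCDF4, "__has_pnetcdf_support__", False)
    )


print("parallel netCDF support:", netcdf_parallel_support())
```

Running this inside a `%%px` cell would report the result on every engine. A `False` here is consistent with the error above.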

In [58]:
%%px

sim.cleanUp()

[stdout:0] Closing file netcdf_2019_04_24/CTCS_2019_04_24-09_23_21_0.nc ...
[stdout:1] Closing file netcdf_2019_04_24/CTCS_2019_04_24-09_23_21_1.nc ...
[stdout:2] Closing file netcdf_2019_04_24/CTCS_2019_04_24-09_23_21_2.nc ...
[stdout:3] Closing file netcdf_2019_04_24/CTCS_2019_04_24-09_23_21_3.nc ...
