# Starting a cluster

## Prerequisites
First, install MPI. On Windows, use MS-MPI:
https://msdn.microsoft.com/en-us/library/bb524831(v=vs.85).aspx


## With a profile (not working)
In theory, you should be able to create a profile using
```
ipython profile create --parallel --profile=myprofile
```
and then set
```
c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'
```
in ```<IPYTHON-DIR>/profile_myprofile/ipcluster_config.py```. This should then enable you to start a cluster using
```
ipcluster start --profile=myprofile
```
or, alternatively, through the Clusters tab in Jupyter.


## Without a profile (not working)
An alternative is to run
```
ipcluster start --engines=MPI
```


## Manual start (working)
Neither of the two approaches above works for me on Windows. What does work is starting the controller and the engines manually:

Start a controller using
```
ipcontroller --ip='*'
```
and then start several engines using mpiexec:
```
mpiexec -n 4 ipengine --mpi
```
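
If the two commands need to be scripted, they can be assembled as argument lists in Python. The following is only a sketch: the helper name `cluster_commands` is hypothetical, and it just builds the command lines shown above without launching anything. The lists could then be handed to `subprocess.Popen`.

```python
def cluster_commands(n_engines, ip="*"):
    """Build the controller and engine command lines as argument lists.

    Hypothetical helper: it mirrors the two shell commands above and
    does not start any process itself.
    """
    controller = ["ipcontroller", "--ip={}".format(ip)]
    engines = ["mpiexec", "-n", str(n_engines), "ipengine", "--mpi"]
    return controller, engines


controller_cmd, engines_cmd = cluster_commands(4)
print(" ".join(controller_cmd))
print(" ".join(engines_cmd))
```

Passing the arguments as a list avoids shell quoting differences between Windows and Unix shells.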

In [9]:
import ipyparallel

# attach to a running cluster
cluster = ipyparallel.Client()  # optionally: Client(profile='mpi')

print('profile:', cluster.profile)
print('Number of ids:', len(cluster.ids))
print("IDs:", cluster.ids) # Print engine id numbers

profile: default
Number of ids: 4
IDs: [0, 1, 2, 3]
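
Engines started with `mpiexec` may take a moment to register with the controller, so the number of ids can be lower than expected right after startup. A minimal polling sketch follows. The helper `wait_for_engines` and its parameters are hypothetical names introduced here, and it takes any callable that returns the current ids, for example `lambda: cluster.ids`.

```python
import time


def wait_for_engines(get_ids, expected, timeout=30.0, poll=0.5):
    """Poll get_ids() until at least `expected` engines are reported.

    Hypothetical helper. Returns the list of ids, or raises
    TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        ids = list(get_ids())
        if len(ids) >= expected:
            return ids
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "only {} of {} engines registered".format(len(ids), expected))
        time.sleep(poll)
```

With the client from the cell above, this would be called as `wait_for_engines(lambda: cluster.ids, 4)`.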


In [10]:
%%px

from mpi4py import MPI

comm = MPI.COMM_WORLD

print("Hello! I'm rank %d from %d running in total..." % (comm.rank, comm.size))

comm.Barrier()   # wait for everybody to synchronize _here_

[stdout:0] Hello! I'm rank 0 from 4 running in total...
[stdout:1] Hello! I'm rank 1 from 4 running in total...
[stdout:2] Hello! I'm rank 2 from 4 running in total...
[stdout:3] Hello! I'm rank 3 from 4 running in total...


In [11]:
%%px

from mpi4py import MPI
import numpy

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

print("Starting")
# passing MPI datatypes explicitly
if rank == 0:
    data = numpy.arange(100, dtype='i')
    numpy.random.shuffle(data)
    comm.Send([data, MPI.INT], dest=1, tag=77)
    print("{0}: sent data to 1: {1}".format(rank, data))
elif rank == 1:
    data = numpy.empty(100, dtype='i')
    comm.Recv([data, MPI.INT], source=0, tag=77)
    print("{0}: received data from 0: {1}".format(rank, data))
else:
    print("{0}: idle".format(rank))

[stdout:0] 
Starting
0: sent data to 1: [89 32 30 80 77 98 25  3 28 42 13 67  2 87  4 15 74 54 96 63 51  9 36 92
 10 73 71 68  8 75  5 61 47 76 83  0 14 93 49 41 43 72 91 84 34 23 65 31
 69 19 45 17 94 62 53 95 86 99  6 12 27  1 60 48 66 64 59 78 21 44 38 70
 90 97 33 37 57 85 52 29 88  7 24 18 39 56 46 16 22 79 35 55 26 11 81 82
 50 40 20 58]
[stdout:1] 
Starting
1: received data from 0: [89 32 30 80 77 98 25  3 28 42 13 67  2 87  4 15 74 54 96 63 51  9 36 92
 10 73 71 68  8 75  5 61 47 76 83  0 14 93 49 41 43 72 91 84 34 23 65 31
 69 19 45 17 94 62 53 95 86 99  6 12 27  1 60 48 66 64 59 78 21 44 38 70
 90 97 33 37 57 85 52 29 88  7 24 18 39 56 46 16 22 79 35 55 26 11 81 82
 50 40 20 58]
[stdout:2] 
Starting
2: idle
[stdout:3] 
Starting
3: idle


In [12]:
%%px

#Let's have matplotlib "inline"
%matplotlib inline

#Python 2.7 compatibility
from __future__ import print_function

#Import packages we need
import numpy as np
from matplotlib import animation, rc
from matplotlib import pyplot as plt
#import mpld3

import subprocess
import os
import sys
import gc
import datetime

from importlib import reload

sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), 'gpu_ocean')))

import pycuda.driver as cuda
from pycuda.compiler import SourceModule

#Finally, import our simulator
from SWESimulators import Common, CTCS, PlotHelper, IPythonMagic
from SWESimulators.BathymetryAndICs import *

In [13]:
%%px
%setup_logging MPITest.log
%cuda_context_handler gpu_ctx

[stderr:0] 
Registering logging to MPITest.log
Python version 3.5.2 (default, Sep 14 2017, 22:51:06) 
[GCC 5.4.0 20160609]
PyCUDA version 2018.1
CUDA version (7, 5, 0)
Driver version 9010
Using 'Tesla M2090' GPU
Created context handle <54065520>
[stderr:1] 
Registering logging to MPITest.log
Python version 3.5.2 (default, Sep 14 2017, 22:51:06) 
[GCC 5.4.0 20160609]
PyCUDA version 2018.1
CUDA version (7, 5, 0)
Driver version 9010
Using 'Tesla M2090' GPU
Created context handle <48985280>
[stderr:2] 
Registering logging to MPITest.log
Python version 3.5.2 (default, Sep 14 2017, 22:51:06) 
[GCC 5.4.0 20160609]
PyCUDA version 2018.1
CUDA version (7, 5, 0)
Driver version 9010
Using 'Tesla M2090' GPU
Created context handle <40082656>
[stderr:3] 
Registering logging to MPITest.log
Python version 3.5.2 (default, Sep 14 2017, 22:51:06) 
[GCC 5.4.0 20160609]
PyCUDA version 2018.1
CUDA version (7, 5, 0)
Driver version 9010
Using 'Tesla M2090' GPU
Created context handle <43312896>


In [14]:
%%px

def run_benchmark(simulator):
    #Time a single call to step() on the simulator that was passed in
    with Common.Timer(simulator.__class__.__name__ + "_" + str(sim_args["nx"])) as timer:
        t = simulator.step(2.0)
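
`Common.Timer` belongs to the simulator package and its implementation is not shown here. To illustrate the pattern of timing a block with a context manager, here is a stand-alone sketch. The class `SimpleTimer` is a hypothetical stand-in, not the project's timer.

```python
import time


class SimpleTimer:
    """Hypothetical stand-in for a timing context manager."""

    def __init__(self, tag):
        self.tag = tag
        self.elapsed = None

    def __enter__(self):
        # Record the start time when the with-block is entered
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Store the elapsed wall-clock time in seconds
        self.elapsed = time.perf_counter() - self._start
        return False  # do not swallow exceptions raised in the block


with SimpleTimer("demo") as timer:
    sum(range(1000))
print("{}: {:.6f} s".format(timer.tag, timer.elapsed))
```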

In [15]:
%%px

reload(Common)
reload(CTCS)

# Set initial conditions common to all simulators
sim_args = {
    "gpu_ctx": gpu_ctx,
    "nx": 100, "ny": 200,
    "dx": 200.0, "dy": 200.0,
    "dt": 1,
    "g": 9.81,
    "f": 0.0,
    "r": 0.0,
    "write_netcdf": True,
    "ensemble_size": 2,
    "ensemble_member": rank,
}

In [8]:
%%px

ghosts = [1,1,1,1] # north, east, south, west
dataShape = (sim_args["ny"] + ghosts[0]+ghosts[2], 
             sim_args["nx"] + ghosts[1]+ghosts[3])

h0 = np.ones(dataShape, dtype=np.float32) * 60.0
eta0 = np.zeros(dataShape, dtype=np.float32)
u0 = np.zeros((dataShape[0], dataShape[1]+1), dtype=np.float32)
v0 = np.zeros((dataShape[0]+1, dataShape[1]), dtype=np.float32)

#Create a bump in the center of the domain for testing
addCentralBump(eta0, sim_args["nx"], sim_args["ny"], sim_args["dx"], sim_args["dy"], ghosts)

#Initialize simulator
ctcs_args = {"H": h0, "eta0": eta0, "hu0": u0, "hv0": v0, "A": 1.0}
sim = CTCS.CTCS(**ctcs_args, **sim_args)

#Run a timed simulation step
run_benchmark(sim)

CompositeError: one or more exceptions from call to method: execute
[0:execute]: OSError: Permission denied
[1:execute]: AttributeError: NetCDF: Attribute not found
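
Independently of the error above, the array shapes in the initial-condition cell follow from the ghost-cell layout: `h0` and `eta0` cover the domain plus ghost cells, `u0` has one extra column and `v0` one extra row. The sketch below reproduces that arithmetic in plain Python. The function name `staggered_shapes` and the dictionary keys are hypothetical and chosen only for illustration.

```python
def staggered_shapes(nx, ny, ghosts):
    """Compute array shapes as (rows, columns) for the layout used above.

    `ghosts` is ordered north, east, south, west, as in the notebook cell.
    """
    north, east, south, west = ghosts
    cells = (ny + north + south, nx + east + west)  # shape of h0 and eta0
    u = (cells[0], cells[1] + 1)                    # shape of u0
    v = (cells[0] + 1, cells[1])                    # shape of v0
    return {"cells": cells, "u": u, "v": v}


shapes = staggered_shapes(100, 200, [1, 1, 1, 1])
print(shapes["cells"])  # (202, 102)
print(shapes["u"])      # (202, 103)
print(shapes["v"])      # (203, 102)
```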

In [None]:
%%px
sim.cleanUp()