## How to locally run parallel code with mpi4py in an IPython notebook:

## Introduction

This notebook introduces parallel execution. For this, you need to install pymofa locally by executing

    $pip install -e .
    
in the pymofa root directory.

The last update happened on:

In [1]:
import datetime
print(datetime.datetime.now().date())

2018-03-01


The prerequisite for this is a working installation of some MPI distribution.

On Ubuntu or one of its derivatives, I recommend OpenMPI, which can be installed from the repository via the following packages:
    
    libopenmpi-dev, openmpi-bin, openmpi-doc

Now you can already run MPI-enabled code from your shell by calling

    $mpirun -n [number_of_processes] python [script_to_run.py]
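
A minimal script to try this out could look like the following sketch. The file name `hello_mpi.py` and the helper `describe` are hypothetical, and the serial fallback is an addition for convenience so that the script also runs where mpi4py is missing.

```python
# hello_mpi.py (hypothetical file name)
try:
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Assumption: without mpi4py we behave like a single serial process.
    rank, size = 0, 1


def describe(rank, size):
    """Return a short label for this process."""
    return f"process {rank} of {size}"


print(describe(rank, size))
```

Started with

    $mpirun -n 3 python hello_mpi.py

every process prints its own line; the order of the lines is not guaranteed.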

**To use MPI with IPython, one has to install ipyparallel:**

via pip: 

    $pip install ipyparallel

via conda:
    
    $conda install ipyparallel

and then enable the Clusters tab in the IPython notebook via

    $ipcluster nbextension enable
   

**To make MPI accessible via mpi4py in an IPython notebook, one has to do the following:**

Open a shell and start the ipcontroller:

    $ipcontroller 

Open another shell and start a number of engines:

    $mpirun -n [number_of_processes] ipengine --mpi=mpi4py

and then connect to the engines via the following fragment of code:

In [2]:
from ipyparallel import Client
c = Client()
view = c[:]
print(c.ids)

[0, 1, 2]


In [3]:
%%px
import os

def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return root
path = find('02_LocalParallelization.ipynb', '/home/')
print(path)
os.chdir(path)

[stdout:0] /home/barfuss/Documents/Work/Software/pymofa/tutorial
[stdout:1] /home/barfuss/Documents/Work/Software/pymofa/tutorial
[stdout:2] /home/barfuss/Documents/Work/Software/pymofa/tutorial


To make code run on all of our engines (and not just on one), each cell has to start with the [__parallel magic__](https://ipython.org/ipython-doc/3/parallel/magics.html) command *%%px*, as the cell above and the following cells do:

In [4]:
%%px
from mpi4py import MPI
com = MPI.COMM_WORLD
print(com.Get_rank())

[stdout:0] 1
[stdout:1] 0
[stdout:2] 2


Now that we have MPI running and mpi4py recognizing the nodes and their ranks, we can continue with the predator-prey exercise that we know from the first tutorial.


## The basic model
First, define the model:

In [5]:
%%px
import numpy as np

def predprey_model(prey_birth_rate, prey_mortality, 
                   predator_efficiency, predator_death_rate,
                   initial_prey, initial_predators,
                   time_length):
    """Discrete predator prey model."""
    A = -1 * np.ones(time_length)
    B = -1 * np.ones(time_length)
    A[0] = initial_prey
    B[0] = initial_predators
    for t in range(1, time_length):
        A[t] = A[t-1] + prey_birth_rate * A[t-1] - prey_mortality * B[t-1]*A[t-1]
        B[t] = B[t-1] + predator_efficiency * B[t-1]*A[t-1] - predator_death_rate * B[t-1] +\
            0.02 * (0.5 - np.random.rand())
    return A, B
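
Written out, the update rule implemented by `predprey_model` reads as follows. The symbols are shorthand introduced here, not names from the code: $A_t$ and $B_t$ are the prey and predator populations at step $t$, $b$ stands for `prey_birth_rate`, $m$ for `prey_mortality`, $e$ for `predator_efficiency`, $d$ for `predator_death_rate`, and $u_t$ is the uniform random number drawn by `np.random.rand()`.

$$
\begin{aligned}
A_t &= A_{t-1} + b\,A_{t-1} - m\,B_{t-1}A_{t-1},\\
B_t &= B_{t-1} + e\,B_{t-1}A_{t-1} - d\,B_{t-1} + 0.02\,(0.5 - u_t), \qquad u_t \sim \mathcal{U}[0, 1).
\end{aligned}
$$

Ignoring the noise term, both populations stay constant when $A\,(b - m B) = 0$ and $B\,(e A - d) = 0$, which gives the non-trivial fixed point $A^* = d/e$, $B^* = b/m$.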



## pymofa
Then import the experiment_handling class from pymofa and define a run function:

In [6]:
%%px
# imports
from pymofa.experiment_handling import experiment_handling as eh
import itertools as it
import pandas as pd


# Path where to store the simulated data
SAVE_PATH_RAW = "./dummy/pmX01data"


# Defining the experiment execution function
#      it gets the parameters you want to investigate
def RUN_FUNC(prey_birth_rate,
             coupling,
             predator_death_rate,
             initial_pop,
             time_length):
    """Insightful docstring."""
    # optionally pre-process the parameters
    prey_mortality = coupling
    predator_efficiency = coupling
    initial_prey = initial_pop
    initial_predators = initial_pop
    # one could also do more complicated stuff here, e.g. 
    # drawing something from a random distribution
    
    # running the model
    preys, predators = predprey_model(prey_birth_rate,
                                      prey_mortality,
                                      predator_efficiency,
                                      predator_death_rate,
                                      initial_prey,
                                      initial_predators,
                                      time_length)
    
    # preparing the data
    res = pd.DataFrame({"preys": np.array(preys),
                        "predators": np.array(predators)})
    res.index.name = "tstep"
    
    # store run funcs model result
    # store(res)
    
    # determine exit status (if something went wrong)
    # exit status > 0: run passed
    # exit status < 0: run failed
    exit_status = 42
    
    # RUN_FUNC needs to return the exit status and the result
    return exit_status, res


# runfunc result format
RUNFUNC_RESULTSFORM = pd.DataFrame(columns=["predators", "preys"])
RUNFUNC_RESULTSFORM.index.name = "tstep"


# Parameter combinations to investigate
prey_birth_rate = [0.09, 0.1, 0.11]
coupling = [0.1]
predator_death_rate = [0.005, 0.01, 0.05, 0.1]
initial_pop = [1.0, 2.0]
time_length = [1000]

PARAM_COMBS = list(it.product(prey_birth_rate,
                              coupling,
                              predator_death_rate,
                              initial_pop,
                              time_length))


# INDEX 
INDEX = {i: RUN_FUNC.__code__.co_varnames[i]
         for i in range(RUN_FUNC.__code__.co_argcount-1)}
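
The `INDEX` dictionary above is built by introspecting the run function's code object. The following sketch shows what that expression evaluates to; `run_func` and `local_helper` are hypothetical stand-ins with the same argument list as `RUN_FUNC`, not part of pymofa.

```python
def run_func(prey_birth_rate, coupling, predator_death_rate,
             initial_pop, time_length):
    # A local variable: it shows up in co_varnames after the arguments.
    local_helper = coupling * 2
    return local_helper


code = run_func.__code__

# The first co_argcount entries of co_varnames are the argument names.
arg_names = code.co_varnames[:code.co_argcount]

# Same expression as for INDEX above: the range stops one short,
# so the last argument (time_length) is not included.
index = {i: code.co_varnames[i] for i in range(code.co_argcount - 1)}

print(arg_names)
print(index)
```

The dictionary maps argument positions to parameter names, which is a convenient way to label the levels of a result index without typing the names twice.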

Specify the necessary parameters, generate their combinations and feed them to an experiment handle:

In [9]:
%%px

# Sample Size
SAMPLE_SIZE = 4


# initiate handle instance with experiment variables
handle = eh(RUN_FUNC,
            RUNFUNC_RESULTSFORM,
            PARAM_COMBS,
            SAMPLE_SIZE,
            SAVE_PATH_RAW)

[stdout:0] initializing pymofa experiment handle
[stdout:1] 
initializing pymofa experiment handle
detected 3 nodes in MPI environment
boooja
0 of 96 single computations left
[stdout:2] initializing pymofa experiment handle
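
The figure of 96 single computations in the output above can be reproduced by hand. The sketch below assumes that every parameter combination is run `SAMPLE_SIZE` times; the lowercase variable names are chosen here for illustration.

```python
import itertools as it

# Same parameter lists as in the cell that defines PARAM_COMBS.
prey_birth_rate = [0.09, 0.1, 0.11]
coupling = [0.1]
predator_death_rate = [0.005, 0.01, 0.05, 0.1]
initial_pop = [1.0, 2.0]
time_length = [1000]
sample_size = 4

param_combs = list(it.product(prey_birth_rate, coupling,
                              predator_death_rate, initial_pop,
                              time_length))

n_combinations = len(param_combs)        # 3 * 1 * 4 * 2 * 1
n_runs = n_combinations * sample_size    # assumed: one run per sample

print(n_combinations, n_runs)
```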


And finally run the model - now in parallel:

In [8]:
%%time
%%px
# Compute the experiment's raw data
handle.compute()

[stdout:1] 
24 of 96 single computations left
Saving rawdata at /home/barfuss/Documents/Work/Software/pymofa/tutorial/dummy/pmX01data.h5
Splitting calculations to 2 nodes.
Calculating... 4.17%Calculating... 8.33%Calculating... 12.50%Calculating... 16.67%Calculating... 20.83%Calculating... 25.00%Calculating... 29.17%Calculating... 33.33%Calculating... 37.50%Calculating... 41.67%Calculating... 45.83%Calculating... 50.00%Calculating... 54.17%Calculating... 58.33%Calculating... 62.50%Calculating... 66.67%Calculating... 70.83%Calculating... 75.00%Calculating... 79.17%Calculating... 83.33%Calculating... 87.50%Calculating... 91.67%Calculating... 95.83%
Calculating... 100.00%
Calculattion done.
CPU times: user 51.7 ms, sys: 5.22 ms, total: 57 ms
Wall time: 8.71 s


[stderr:0] 
[stderr:2] 


If everything went well, the calculations should have been split between all the engines that you started in the beginning.
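
To get a feeling for what such a split looks like, here is a small sketch that distributes tasks over workers in round-robin order. It is an illustration only, not pymofa's scheduler: the function `split_round_robin` is hypothetical, and the assumption that one of the three processes only coordinates is a guess based on the log above, which reports 3 detected nodes but a split over 2.

```python
def split_round_robin(tasks, n_workers):
    """Assign tasks to workers in round-robin order."""
    return [tasks[i::n_workers] for i in range(n_workers)]


n_processes = 3                # engines started with mpirun -n 3
n_workers = n_processes - 1    # assumption: one process only coordinates
tasks = list(range(24))        # the 24 computations left in the log above

chunks = split_round_robin(tasks, n_workers)
for worker, chunk in enumerate(chunks, start=1):
    print(f"worker {worker}: {len(chunk)} tasks")
```

A static split like this is the simplest scheme; handing out tasks one at a time as workers become free balances the load better when run times differ.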

To run your experiments in scripts outside of an IPython notebook, simply run your experiment script (defining a run function, an experiment handle and calling the compute routine of that handle) with mpirun in a terminal.