# Interactive parallel computing with ipyparallel

ipyparallel is a Python package and collection of scripts for controlling clusters for Jupyter.

ipyparallel contains the following scripts:

- ipcluster - start/stop a cluster
- ipcontroller - start a controller (the hub and schedulers that engines and clients connect to)
- ipengine - start an engine


## Installing/activating ipyparallel

- first install the package: `conda install ipyparallel`
- to launch parallel engines from the dashboard, add `c.NotebookApp.server_extensions.append('ipyparallel.nbextension')` to the file `~/.jupyter/jupyter_notebook_config.py` (create the file if it does not exist)
- then enable the IPython Clusters tab in Jupyter by running `ipcluster nbextension enable --user` in a terminal


### *Exercise*:
- Install and activate ipyparallel!

### Example

In [8]:
import numpy as np

In [11]:
from ipyparallel import Client

In [12]:
rc = Client()

            Controller appears to be listening on localhost, but not on this machine.
            If this is true, you should specify Client(...,sshserver='you@192.168.0.10')
            or instruct your controller to listen on an external IP.


The `ids` attribute of a `Client` instance lists the identifiers of the engines that the client detected.

In [14]:
rc.ids

[0, 1, 2, 3]

### *Exercise*:
- what other attributes does rc have?


### There are several ways to run parallel code

- **Direct interface** - direct access to every engine
- **Load-balanced interface** - submit tasks to a scheduler, which distributes them to the engines depending on load
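A minimal sketch of how the two interfaces map onto view objects, assuming a connected `Client` such as the `rc` created above. `make_views` is a hypothetical helper name used only for illustration; `rc[:]` and `rc.load_balanced_view()` are the ipyparallel calls.

```python
def make_views(rc):
    """Return one view per interface for an already connected Client."""
    # Direct interface: a DirectView that addresses every engine explicitly.
    direct = rc[:]
    # Load-balanced interface: the scheduler decides which engine runs each task.
    balanced = rc.load_balanced_view()
    return direct, balanced
```

With a running cluster, `direct, balanced = make_views(rc)` would give one object for "run this on all engines" work and one for "run this wherever there is capacity" work.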

The `%px` magic executes the given Python code on the engines in parallel.

In [28]:
%px?

In [30]:
%px import os, time

In [31]:
%px a = os.getpid()

In [34]:
%px print(os.getpid())

[stdout:0] 15285
[stdout:1] 15284
[stdout:2] 15283
[stdout:3] 15286
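The same steps can be written without magics. The sketch below assumes a connected `Client` named `rc` and uses a `DirectView` to run the code and then pull the variable `a` back from every engine. `pids_via_direct_view` is a hypothetical helper name.

```python
def pids_via_direct_view(rc):
    """Run code on all engines and fetch the value of `a` from each one."""
    dview = rc[:]  # DirectView over all engines
    # Execute the statements on every engine and wait for completion.
    dview.execute("import os; a = os.getpid()", block=True)
    # Pull the variable back: one value per engine, in engine order.
    return dview.pull("a", block=True)
```

Called as `pids_via_direct_view(rc)`, this should return a list with one process id per engine, the values that `%px print(os.getpid())` prints above.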


Use `--targets` to specify which engines run the code. The option accepts Python slice syntax.

In [36]:
%%px --targets :-1
print(os.getpid())

[stdout:0] 15285
[stdout:1] 15284
[stdout:2] 15283


In [38]:
%load?

In [40]:
v = rc.load_balanced_view()

In [41]:
def sample(n):
    import numpy as np
    # Random coordinates.
    x, y = np.random.rand(2, n)
    # Square distances to the origin.
    r_square = x ** 2 + y ** 2
    # Number of points in the quarter disc.
    return (r_square <= 1).sum()

In [42]:
def pi(n_in, n):
    return 4. * float(n_in) / n
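The factor 4 in `pi` comes from an area ratio. `sample` draws points uniformly in the unit square, and the quarter disc of radius 1 inside that square has area π/4, so the fraction of points that land inside it estimates π/4:

```latex
P\left(x^2 + y^2 \le 1\right) = \frac{\pi/4}{1} = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4\,\frac{n_{\text{in}}}{n}
```

Here $n_{\text{in}}$ is the count returned by `sample(n)` and $n$ is the total number of points.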

In [43]:
n = 100000000

In [44]:
pi(sample(n), n)

3.14160116

In [45]:
%timeit pi(sample(n), n)

1 loop, best of 3: 4.76 s per loop


In [49]:
args = [n // 100] * 100  # integer division: np.random.rand needs an integer size

In [56]:
ar = v.map(sample, args)

In [60]:
ar.ready(), ar.progress

(True, 100)

In [58]:
ar.elapsed, ar.serial_time


(2.149972, 7.719009)
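These two numbers can be turned into a rough speedup figure. The sketch below assumes that `ar.elapsed` is the wall-clock time since submission and `ar.serial_time` is the sum of the compute time spent on the engines. The values are copied from the outputs in this notebook.

```python
elapsed = 2.149972        # ar.elapsed from the output above (seconds)
serial_time = 7.719009    # ar.serial_time from the output above (seconds)
single_process = 4.76     # %timeit result for the serial run (seconds)

# Ratio of total engine compute time to wall-clock time.
engine_speedup = serial_time / elapsed
# Ratio of the single-process run to the parallel wall-clock time.
observed_speedup = single_process / elapsed

print(f"engine time / wall time: {engine_speedup:.2f}x")
print(f"serial run / wall time:  {observed_speedup:.2f}x")
```

The second ratio is the one a user actually experiences. The first is larger here because the 100 small tasks together spent more engine time than the single serial call.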

In [59]:
pi(np.sum(ar.result), n)

TypeError: float() argument must be a string or a number

In [61]:
ar.result

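<bound method AsyncMapResult.result of <AsyncMapResult: sample:finished>>

In this version of ipyparallel `ar.result` is a method, so `np.sum` received a bound method rather than the list of counts, which explains the `TypeError`. Call `ar.result()` or `ar.get()` to retrieve the values.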
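A sketch of the whole parallel estimate with the result fetched through `get()`. It assumes `view` is a load-balanced view such as `v` above. `parallel_pi` and its `sample_func` parameter are hypothetical names introduced for illustration.

```python
def sample(n):
    """Count random points in the unit square that fall inside the quarter disc."""
    import numpy as np  # imported here so the function is self-contained on an engine
    x, y = np.random.rand(2, n)
    return int((x ** 2 + y ** 2 <= 1).sum())


def parallel_pi(view, n, n_chunks=100, sample_func=sample):
    """Estimate pi by mapping `sample_func` over equal-sized chunks on `view`."""
    chunk = n // n_chunks  # integer chunk size
    ar = view.map_async(sample_func, [chunk] * n_chunks)
    counts = ar.get()      # blocks until all tasks finish, returns a list
    total = chunk * n_chunks
    return 4.0 * sum(counts) / total
```

With the objects from this notebook the call would be `parallel_pi(v, n)`.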

TODO: use the example from http://davidmasad.com/blog/simulation-with-ipyparallel/