# Parallelized Feature Location using IPython Parallel

Feature-finding can easily be parallelized: each frame is an independent task, and the tasks can be divided among the available CPUs. IPython parallel makes this very straightforward.

## Install ipyparallel

As of IPython 4.0 (summer 2015), IPython parallel is a separate package. Install it as follows:
```
pip install ipyparallel
```

It is simplest to start a cluster on the CPUs of the local machine. To start one, open a terminal and type:
```
ipcluster start -n 4
```

The number 4 should be replaced by the number of available CPUs. Now you are running a cluster -- it's that easy. More information on IPython parallel is available in [the IPython parallel documentation](http://ipyparallel.readthedocs.io/en/latest/intro.html).
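
If you are unsure how many CPUs the machine has, Python's standard library can report it. The snippet below is a minimal sketch; `n_engines` is a name chosen here for illustration. The count includes logical (hyperthreaded) cores, so the best value for `-n` may be lower on some machines.

```python
import os

# os.cpu_count() returns the number of logical CPUs, or None if it
# cannot be determined; fall back to a single engine in that case.
n_engines = os.cpu_count() or 1

# Print the command to run in a terminal.
print(f"ipcluster start -n {n_engines}")
```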

In [1]:
from ipyparallel import Client
client = Client()
view = client.load_balanced_view()

We can see that there are four cores available.

In [2]:
client[:]

<DirectView [0, 1, 2, 3]>

Use a little magic, ``%%px``, to import trackpy on all cores.

In [3]:
%%px
import trackpy as tp

Do the normal setup now: import trackpy as usual and load the frames to analyze.

In [4]:
import pims
import trackpy as tp

def gray(image):
    return image[:, :, 0]

frames = pims.ImageSequence('../sample_data/bulk_water/*.png', process_func=gray)

Define a function from ``locate`` with all the parameters specified, so the function's only argument is the image to be analyzed. We can map this function directly onto our collection of images. (This is called "currying" a function, hence the choice of name.)

In [5]:
curried_locate = lambda image: tp.locate(image, 13, invert=True)
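
The same fixing of parameters can also be written with `functools.partial` from the standard library. The sketch below uses a hypothetical stand-in function, `fake_locate`, in place of `tp.locate` so that it runs without trackpy or a cluster; `partial_locate` and `toy_frames` are likewise illustrative names, not part of this tutorial's setup.

```python
from functools import partial

def fake_locate(image, diameter, invert=False):
    """Hypothetical stand-in for tp.locate: just reports its arguments."""
    return (len(image), diameter, invert)

# Fix every parameter except the image, as the lambda above does.
partial_locate = partial(fake_locate, diameter=13, invert=True)

# Toy "frames": plain lists standing in for images.
toy_frames = [[0] * 4, [0] * 6]

# The resulting one-argument function can be mapped over the frames.
toy_results = list(map(partial_locate, toy_frames))
print(toy_results)  # [(4, 13, True), (6, 13, True)]
```

Whether a `partial` object or a lambda is more convenient to send to the engines depends on the setup; this sketch only shows that the two forms fix the same arguments.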

In [6]:
view.map(curried_locate, frames[:4])  # Optionally, prime each engine: make it set up FFTW.

<AsyncMapResult: <lambda>>

Compare the time it takes to locate features in the first 32 images with and without parallelization.

In [7]:
%%timeit
amr = view.map_async(curried_locate, frames[:32])
amr.wait_interactive()
results = amr.get()

  32/32 tasks finished after    1 s
done
1 loop, best of 3: 1.3 s per loop


In [8]:
%%timeit
serial_result = list(map(curried_locate, frames[:32]))

1 loop, best of 3: 3.15 s per loop
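
To put the two timings side by side, the speedup and per-engine efficiency can be worked out directly. This is a small arithmetic sketch using the numbers printed above and assuming the four-engine cluster started earlier; the variable names are illustrative.

```python
# Timings reported by %%timeit above, in seconds.
serial_time = 3.15
parallel_time = 1.3
n_engines = 4  # engines started with `ipcluster start -n 4`

# Speedup: how many times faster the parallel run is.
speedup = serial_time / parallel_time

# Efficiency: fraction of the ideal n_engines-fold speedup achieved.
efficiency = speedup / n_engines

print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
# speedup: 2.42x, efficiency: 61%
```

The speedup falls short of 4x, which is expected: sending frames to the engines and collecting results adds overhead that the serial run does not pay.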
