# Tutorial Session 1 - Notebook 4
### Professor: Elwin van 't Wout
### Teaching assistant: Alberto Almuna Morales (alberto.almuna@uc.cl)

The library `joblib` provides functionality for parallel computing. In this notebook, let us look into the different parallel backends.

In [None]:
from joblib import Parallel, delayed

Let us create the tasks to be performed in parallel by the workers:

In [None]:
def my_task(n):
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum

In [None]:
tasks = [delayed(my_task)(i) for i in range(40000)]
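
To see what `delayed` produces, the sketch below inspects a single task. In current `joblib` releases, `delayed(f)(x)` does not call `f`; it returns the tuple `(f, args, kwargs)`, which `Parallel` later unpacks and executes. The tuple layout is an implementation detail rather than a documented guarantee, so treat this as an illustration only.

```python
from joblib import delayed


def my_task(n):
    # Same task as above: sum of the integers 0, 1, ..., n-1.
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum


# Wrapping the call with `delayed` records the function and its arguments
# without executing anything yet.
func, args, kwargs = delayed(my_task)(5)
print(func.__name__, args, kwargs)

# The work only happens once the captured call is executed.
result = func(*args, **kwargs)
print(result)
```

Because nothing runs when the list `tasks` is built, creating it is cheap even for many tasks.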

The `joblib` library offers several parallel backends that can perform the parallel computations. By default, the `loky` backend is used, which follows the *multiprocessing* model: it creates separate Python processes and assigns tasks to each of them. Alternatively, the `threading` backend follows the *multithreading* model and creates several threads within the same process. Generally speaking, creating and managing processes incurs more overhead than threads, but processes are more robust because they do not share memory, which rules out race conditions on shared data.

Since all tasks in this tutorial are independent, there is no risk of race conditions and both backends can be used.

While the parallel tasks are running, open the *Activity Monitor* (macOS) or the *Task Manager* (Windows) to see the different processes.
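
The same observation can be made from within Python by letting every task report the ID of the process that executed it. This is a minimal sketch: `report_pid` is a hypothetical helper and not part of `joblib`. With `threading`, all tasks should report the ID of the parent process, whereas `loky` should report the IDs of separate worker processes.

```python
import os
from joblib import Parallel, delayed


def report_pid(_):
    # Hypothetical helper: return the ID of the process running this task.
    return os.getpid()


parent_pid = os.getpid()

# Threads live inside the parent process, so only one process ID shows up.
thread_pids = set(
    Parallel(n_jobs=4, backend="threading")(delayed(report_pid)(i) for i in range(100))
)

# loky starts separate worker processes, each with its own process ID.
process_pids = set(
    Parallel(n_jobs=4, backend="loky")(delayed(report_pid)(i) for i in range(100))
)

print("parent process:   ", parent_pid)
print("threading backend:", thread_pids)
print("loky backend:     ", process_pids)
```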

In [None]:
n_jobs = 4
batch_size = 1000
verbose = 10

In [None]:
with Parallel(n_jobs=n_jobs, batch_size=batch_size, verbose=verbose, backend='loky') as parallel_pool:
    parallel_results = parallel_pool(tasks)

In [None]:
with Parallel(n_jobs=n_jobs, batch_size=batch_size, verbose=verbose, backend='threading') as parallel_pool:
    parallel_results = parallel_pool(tasks)

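The efficiency of each backend strongly depends on the specific architecture of the computer and the tasks to be performed. For example, in standard CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so pure-Python loops such as `my_task` usually gain little from the `threading` backend.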
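
One way to compare the backends on a given machine is to time the same task list on each of them. The sketch below is only illustrative: `time_backend` is a hypothetical helper, and the task list is smaller than in the cells above to keep the run short. Note that the first `loky` call also pays for starting the worker processes.

```python
import time
from joblib import Parallel, delayed


def my_task(n):
    # Same task as above: sum of the integers 0, 1, ..., n-1.
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum


def time_backend(backend, n_tasks=2000, n_jobs=4, batch_size=100):
    # Hypothetical helper: run the tasks on one backend and time the call.
    tasks = [delayed(my_task)(i) for i in range(n_tasks)]
    start = time.perf_counter()
    results = Parallel(n_jobs=n_jobs, batch_size=batch_size, backend=backend)(tasks)
    elapsed = time.perf_counter() - start
    return results, elapsed


timings = {}
outputs = {}
for backend in ("loky", "threading", "sequential"):
    outputs[backend], timings[backend] = time_backend(backend)
    print(f"{backend:>10}: {timings[backend]:.3f} s")
```

The measured times vary between runs and machines, so repeat the measurement a few times before drawing conclusions.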

Instead of specifying the backend explicitly, one can also nudge `joblib` into using multiprocessing or multithreading by specifying ```prefer='processes'``` or ```prefer='threads'``` when creating the `Parallel` object.
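
A minimal sketch of the `prefer` argument is given below, using a smaller task list than the cells above. As far as the `joblib` documentation describes it, `prefer` is a soft hint: it selects the default backend only when no backend has been chosen explicitly, for example through an enclosing backend context manager.

```python
from joblib import Parallel, delayed


def my_task(n):
    # Same task as above: sum of the integers 0, 1, ..., n-1.
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum


tasks = [delayed(my_task)(i) for i in range(2000)]

# Soft hint: let joblib pick a process-based backend (loky by default).
results_processes = Parallel(n_jobs=4, prefer="processes")(tasks)

# Soft hint: let joblib pick a thread-based backend.
results_threads = Parallel(n_jobs=4, prefer="threads")(tasks)

# Both runs return the results in task order, so the lists should match.
print(results_processes == results_threads)
```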

For testing purposes, it can be useful to force sequential code by using ```backend='sequential'```.

In [None]:
with Parallel(n_jobs=n_jobs, batch_size=batch_size, verbose=verbose, backend='sequential') as parallel_pool:
    parallel_results = parallel_pool(tasks)

In [None]:
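# Direct sequential implementation, timed for comparison:

import time as tm

# perf_counter is a monotonic, high-resolution clock suited to timing code.
ti = tm.perf_counter()
sequential_result = [my_task(i) for i in range(40000)]
tf = tm.perf_counter()

print(f"Elapsed time: {tf - ti:.3f} s")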
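
Since `my_task(n)` sums the integers from 0 to n-1, its result has the closed form n(n-1)/2. The sketch below uses that formula as an independent check that a parallel run returns the same values, in the same order, as the sequential loop. It uses a smaller task list than the cells above to keep the run short, and the `threading` backend is chosen only for illustration.

```python
from joblib import Parallel, delayed


def my_task(n):
    # Same task as above: sum of the integers 0, 1, ..., n-1.
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum


n_tasks = 2000

# Reference values from the plain Python loop and from the closed form.
sequential_result = [my_task(i) for i in range(n_tasks)]
expected = [i * (i - 1) // 2 for i in range(n_tasks)]

# Parallel returns the results in the order in which the tasks were given.
parallel_result = Parallel(n_jobs=4, backend="threading")(
    delayed(my_task)(i) for i in range(n_tasks)
)

print(parallel_result == sequential_result == expected)
```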