# Recitation 1 - Notebook 1
### Professor: Elwin van 't Wout
### Teaching assistant: Alberto Almuna Morales (alberto.almuna@uc.cl)

The library ```joblib``` provides functionality for parallel computing. In this notebook, let us look into the basics of the library.

In [2]:
import numpy as np

The ```joblib``` library has the class ```Parallel```, which provides the basic structure for parallel computing: it creates a pool of workers that can perform tasks in parallel. Let us create such an object.

In [3]:
from joblib import Parallel

In [4]:
parallel_pool = Parallel()
print(parallel_pool)

Parallel(n_jobs=1)


By default, the object is initialized with only a single job. This means that no parallelization will be performed, because only one worker was created. Let us specify the number of jobs explicitly upon creating the worker pool.

In [5]:
parallel_pool = Parallel(n_jobs=2)
print(parallel_pool)

Parallel(n_jobs=2)


The number of workers can also be retrieved through the attribute ```n_jobs```.

In [6]:
print("The number of workers is:", parallel_pool.n_jobs)

The number of workers is: 2
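As a rough guide for choosing `n_jobs`, it can help to check how many cores the machine offers. The sketch below uses `cpu_count` and `effective_n_jobs`, two helpers exported by `joblib`; the variable names are only illustrative, and the printed numbers depend on the machine.

```python
from joblib import Parallel, cpu_count, effective_n_jobs

# Number of CPU cores that joblib detects on this machine.
available_cores = cpu_count()

# n_jobs=-1 is the joblib convention for "use all available cores".
all_cores_pool = Parallel(n_jobs=-1)

# effective_n_jobs resolves a request such as -1 into an actual worker count.
resolved_workers = effective_n_jobs(-1)

print("Detected cores:", available_cores)
print("Workers used for n_jobs=-1:", resolved_workers)
```

Requesting more workers than cores is allowed, but it usually gives no additional speed-up.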


Having created an object that manages a pool of workers, let us specify the tasks to be performed. The tasks can be specified with the decorator `delayed` from the `joblib` library. A *decorator* is a Python function that takes one function and returns another function.
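To make the idea of a decorator concrete, here is a minimal sketch of a function that behaves similarly to `delayed`. The name `my_delayed` is hypothetical and the code is a simplified imitation for illustration only; the actual `joblib` implementation may differ in its details.

```python
def my_delayed(function):
    """Hypothetical, simplified imitation of joblib's `delayed`."""
    def delayed_function(*args, **kwargs):
        # Do not call `function` here; only record what should be called later.
        return function, args, kwargs
    return delayed_function

# Function-call syntax: wrap an existing function.
delayed_pow = my_delayed(pow)
task = delayed_pow(2, 10)

# Decorator syntax: the @ line is shorthand for add = my_delayed(add).
@my_delayed
def add(a, b):
    return a + b

function, args, kwargs = add(1, b=2)
print(task)
print(function(*args, **kwargs))  # 3
```

The wrapped function postpones the computation: the call `add(1, b=2)` only returns a description of the work, and the sum is computed when the recorded function is finally called.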

In [7]:
from joblib import delayed

Here, we will calculate the square root of different values.

In [8]:
parallel_sqrt = delayed(np.sqrt)

The function `parallel_sqrt` is now a wrapper around the NumPy function `sqrt` that `joblib` can interpret. Calling it does not compute a square root; instead, it records the function together with its arguments as a task that can later be assigned to one of the workers in a parallel pool.

Before assigning the function to the workers, we need to specify the input values for which the function will be called. Notice that we need to specify all tasks we would like to perform, but we do not have to specify which task is assigned to which worker. This task assignment is performed automatically by `joblib`.

In [9]:
parallel_tasks = [parallel_sqrt(i) for i in range(10)]

In [10]:
parallel_tasks

[(<ufunc 'sqrt'>, (0,), {}),
 (<ufunc 'sqrt'>, (1,), {}),
 (<ufunc 'sqrt'>, (2,), {}),
 (<ufunc 'sqrt'>, (3,), {}),
 (<ufunc 'sqrt'>, (4,), {}),
 (<ufunc 'sqrt'>, (5,), {}),
 (<ufunc 'sqrt'>, (6,), {}),
 (<ufunc 'sqrt'>, (7,), {}),
 (<ufunc 'sqrt'>, (8,), {}),
 (<ufunc 'sqrt'>, (9,), {})]
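The output above shows that each task is simply a tuple holding a function, its positional arguments, and its keyword arguments. As a small sketch of what a worker does with such a tuple, the tasks can be evaluated one by one without any parallel pool; the name `serial_results` is only illustrative.

```python
import numpy as np
from joblib import delayed

parallel_sqrt = delayed(np.sqrt)
parallel_tasks = [parallel_sqrt(i) for i in range(10)]

# Each task is a (function, args, kwargs) tuple; evaluate them sequentially.
serial_results = [func(*args, **kwargs) for func, args, kwargs in parallel_tasks]

print(serial_results[:3])
```

A serial evaluation like this is a convenient reference for checking the results of the parallel run.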

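The same tasks can also be generated with the built-in `map`, which applies `parallel_sqrt` to every element of a list.

In [11]:
array = list(range(10))
parallel_tasks_2 = map(parallel_sqrt, array)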
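One detail worth keeping in mind is that `map` returns a lazy iterator rather than a list, so the tasks it yields can be consumed only once. The sketch below illustrates this with hypothetical variable names; wrapping the call in `list` gives a task list that can be reused.

```python
import numpy as np
from joblib import delayed

lazy_tasks = map(delayed(np.sqrt), range(3))

# The first pass consumes the iterator; a second pass finds nothing left.
first_pass = list(lazy_tasks)
second_pass = list(lazy_tasks)
print(len(first_pass), len(second_pass))  # 3 0

# Materialize the tasks if they need to be inspected or submitted again.
reusable_tasks = list(map(delayed(np.sqrt), range(3)))
print(len(reusable_tasks))  # 3
```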

With the list of all tasks created, we can ask the parallel pool of workers to perform all tasks in parallel.

In [12]:
parallel_results = parallel_pool(parallel_tasks)

In [13]:
parallel_results_2 = parallel_pool(parallel_tasks_2)

In [14]:
print(parallel_results)

[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]


In [15]:
print(parallel_results_2)

[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]


In [16]:
parallel_results == parallel_results_2

True

The output is indeed the square root of all input values.
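The results are also returned in the order in which the tasks were submitted, even when a later task finishes first. The following sketch illustrates this under the assumption of a hypothetical helper `slow_identity` that waits for a given time before returning its input; the tasks are passed as a generator expression, which `Parallel` accepts just like a list.

```python
import time
from joblib import Parallel, delayed

def slow_identity(value, delay):
    """Hypothetical task: wait `delay` seconds, then return `value`."""
    time.sleep(delay)
    return value

# The first task is the slowest, so it is likely to finish last.
inputs = [(0, 0.2), (1, 0.0), (2, 0.1)]
ordered_results = Parallel(n_jobs=2)(
    delayed(slow_identity)(value, delay) for value, delay in inputs
)

print(ordered_results)  # [0, 1, 2]
```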

### Example of a function with multiple arguments:

In [17]:
def my_task(n, m):
    return n * m

# Two input lists of equal length, so that zip pairs every element.
n = list(range(1, 11))
m = list(range(5, 15))

tasks = [delayed(my_task)(i, j) for i, j in zip(n, m)]

with Parallel(n_jobs=4) as parallel_pool:
    parallel_results = parallel_pool(tasks)
    print(parallel_results)

[5, 12, 21, 32, 45, 60, 77, 96, 117, 140]
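Arguments can also be passed by keyword when building a task. The sketch below, with illustrative variable names, shows that keyword arguments end up in the third entry of the task tuple; it uses `n_jobs=1` so that the tasks run sequentially in the current process.

```python
from joblib import Parallel, delayed

def my_task(n, m):
    return n * m

# Keyword arguments are stored in the third entry of each task tuple.
keyword_tasks = [delayed(my_task)(i, m=i + 4) for i in range(1, 4)]
print(keyword_tasks[0][1:])  # ((1,), {'m': 5})

keyword_results = Parallel(n_jobs=1)(keyword_tasks)
print(keyword_results)  # [5, 12, 21]
```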


In [18]:
tasks_2 = map(delayed(my_task), n, m)

with Parallel(n_jobs=4) as parallel_pool:
    parallel_results = parallel_pool(tasks_2)
    print(parallel_results)

[5, 12, 21, 32, 45, 60, 77, 96, 117, 140]
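The `with` statement used in the cells above is useful when the same pool has to process several batches of tasks, because the `joblib` documentation describes the context manager as a way to reuse the same workers across calls. A minimal sketch, assuming a hypothetical loop over three factors:

```python
from joblib import Parallel, delayed

def my_task(n, m):
    return n * m

totals = []
with Parallel(n_jobs=2) as parallel_pool:
    # The same pool of workers handles every batch inside this block.
    for factor in range(1, 4):
        products = parallel_pool(delayed(my_task)(i, factor) for i in range(5))
        totals.append(sum(products))

print(totals)  # [10, 20, 30]
```

Creating a new `Parallel` object in every iteration would give the same results, but the workers may then have to be started again for each batch.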
