# Tutorial Session 1 - Notebook 3
### Professor: Elwin van 't Wout
### Teaching assistant: Alberto Almuna Morales (alberto.almuna@uc.cl)

The library `joblib` provides functionality for parallel computing. In this notebook, we look at the progress information that it can print.

In [None]:
import numpy as np
from joblib import Parallel, delayed

Let us create the tasks to be performed in parallel by the workers: calculating the square root of several numbers.

In [None]:
tasks = [delayed(np.sqrt)(i) for i in range(100000)]
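
To see what a single task looks like, the following minimal sketch inspects one `delayed` call. It assumes the behaviour of current `joblib` releases, in which `delayed(f)(x)` does not call `f` but returns a tuple holding the function, its positional arguments, and its keyword arguments. Here `math.sqrt` stands in for `np.sqrt` so that the sketch does not depend on NumPy.

```python
import math
from joblib import delayed

# Wrapping a function with `delayed` postpones the call: nothing is computed yet.
task = delayed(math.sqrt)(16)

# In current joblib releases a task is a (function, args, kwargs) tuple.
func, args, kwargs = task
print(func.__name__, args, kwargs)

# The computation only happens when the task is actually executed.
result = func(*args, **kwargs)
print(result)
```

Creating the list of tasks is therefore cheap: the square roots are only computed once the tasks are passed to a `Parallel` object.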

When creating a `Parallel` object with `joblib`, the parameter `verbose` can be set to an integer. The higher the number, the more output the library provides.

In [None]:
with Parallel(n_jobs=2, verbose=0) as parallel_pool:
    parallel_results = parallel_pool(tasks)

For a value of zero, no output will be printed. Positive values provide output.

In [None]:
with Parallel(n_jobs=2, verbose=1) as parallel_pool:
    parallel_results = parallel_pool(tasks)

The library prints the number of workers used and every now and then a progress report. A higher value of `verbose` will give increasingly more information at increasingly shorter intervals.

In [None]:
with Parallel(n_jobs=2, verbose=10) as parallel_pool:
    parallel_results = parallel_pool(tasks)

Notice that `joblib` reports that it is adjusting the batch size automatically. The batch size is the number of tasks that the library dynamically hands to a worker in one go. By default, it starts with a batch size of one: a single task (in our case, one square-root calculation) is assigned to a worker, and when the worker finishes, it is assigned the next task. Assigning new tasks to workers incurs **overhead**. Calculating a square root is very fast, which the library detects, so it increases the batch size by a factor of two. A batch of two tasks is then given to a worker at once.
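
The idea of a batch can be illustrated without `joblib`. The sketch below is a conceptual illustration only, not the library's internal implementation: the hypothetical helper `make_batches` groups a list of tasks into chunks of a given size, each of which would be handed to a worker in one go.

```python
from itertools import islice

def make_batches(items, batch_size):
    """Group `items` into consecutive lists of at most `batch_size` elements."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

numbers = list(range(10))

# Batch size one: ten hand-offs to the workers, one task each.
batches_of_one = list(make_batches(numbers, 1))
# Batch size four: only three hand-offs, the last batch being smaller.
batches_of_four = list(make_batches(numbers, 4))

print(len(batches_of_one), "batches of size one")
print(batches_of_four)
```

Fewer hand-offs mean less scheduling overhead per task, which is why larger batches pay off when the individual tasks are very cheap.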

A fixed batch size can be set with the `batch_size` parameter of the `Parallel` class.

In [None]:
with Parallel(n_jobs=2, verbose=10, batch_size=1000) as parallel_pool:
    parallel_results = parallel_pool(tasks)
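
To get a feeling for the overhead, one could time the same workload with different batch sizes. The following is a minimal sketch: the helper `timed_run`, the task count of 2000, and the fixed batch size of 500 are choices made here for illustration, and `math.sqrt` replaces `np.sqrt` to avoid the NumPy dependency. The measured times depend on the machine, so no particular numbers should be expected.

```python
import math
import time
from joblib import Parallel, delayed

def timed_run(batch_size, n_tasks=2000):
    """Run n_tasks square roots on two workers and return (results, seconds)."""
    tasks = [delayed(math.sqrt)(i) for i in range(n_tasks)]
    start = time.perf_counter()
    results = Parallel(n_jobs=2, batch_size=batch_size)(tasks)
    return results, time.perf_counter() - start

# Warm-up run so that starting the worker processes is less likely to distort the first timing.
Parallel(n_jobs=2)(delayed(math.sqrt)(i) for i in range(2))

# Compare a batch size of one, the default "auto", and a large fixed batch size.
timings = {}
for batch_size in (1, "auto", 500):
    results, seconds = timed_run(batch_size)
    timings[batch_size] = (results, seconds)
    print(f"batch_size={batch_size!r}: {seconds:.3f} s")
```

The results are identical in all three cases; only the time spent on distributing the tasks differs.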

Let us increase the verbosity even more.

In [None]:
with Parallel(n_jobs=2, verbose=50) as parallel_pool:
    parallel_results = parallel_pool(tasks)

Notice that the output now has a white background instead of a red one: for a verbosity of fifty or higher, the output is printed to `stdout` rather than `stderr`. This can be useful in advanced use cases that need control over the output stream.
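
The change of output stream can be checked by capturing `stdout`. This is a minimal sketch, assuming that a verbosity of 50 is sent to `stdout` while a verbosity of 10 is sent to `stderr`, as observed in the cells above. The helper `captured_stdout` is introduced here for illustration, and a single worker with five tasks is used to keep the run short.

```python
import contextlib
import io
import math
from joblib import Parallel, delayed

def captured_stdout(verbose):
    """Run a few tasks and return whatever joblib wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        Parallel(n_jobs=1, verbose=verbose)(
            delayed(math.sqrt)(i) for i in range(5)
        )
    return buffer.getvalue()

low = captured_stdout(10)   # progress messages are expected on stderr
high = captured_stdout(50)  # progress messages are expected on stdout

print("verbose=10 wrote to stdout:", bool(low))
print("verbose=50 wrote to stdout:", bool(high))
```

Capturing the stream in this way is one option for storing the progress reports in a log instead of showing them in the notebook.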

Also, you can see the effect of the batch size increase in the output: the number of tasks completed indeed increases with the batch size.

In [None]:
# Activity:

def my_task(n):
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum

tasks = ...

In [None]:
with Parallel(n_jobs=2, verbose=10) as parallel_pool:
    parallel_results = parallel_pool(tasks)