# Tutorial Session 1 - Notebook 3
### Professor: Elwin van 't Wout
### Teaching assistant: Alberto Almuna Morales (alberto.almuna@uc.cl)

The `joblib` library provides functionality for parallel computing. In this notebook, we look at the progress information it can print while running tasks.

In [1]:
import numpy as np
from joblib import Parallel, delayed

Let us create the tasks to be performed in parallel by the workers: calculating the square root of several numbers.

In [2]:
tasks = [delayed(np.sqrt)(i) for i in range(100000)]
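
To see what such a task looks like, the sketch below inspects a single one. It assumes that, as in current `joblib` versions, `delayed(f)(x)` does not call `f` but returns a plain tuple of the function, its positional arguments, and its keyword arguments. It uses `math.sqrt` instead of `np.sqrt` only to keep the example free of extra dependencies.

```python
import math

from joblib import delayed

# Creating the task does not compute anything yet.
task = delayed(math.sqrt)(16.0)

# Assumption: the task is a (function, args, kwargs) tuple.
func, args, kwargs = task
print(func.__name__, args, kwargs)

# A worker later evaluates the task in essentially this way.
result = func(*args, **kwargs)
print(result)
```

Building the list `tasks` is therefore cheap; the actual work only happens once the list is handed to a `Parallel` object.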

When creating a `Parallel` object with `joblib`, the parameter `verbose` can be set to an integer. The higher the number, the more output the library prints.

In [4]:
with Parallel(n_jobs=2, verbose=0) as parallel_pool:
    parallel_results = parallel_pool(tasks)

For a value of zero, no output will be printed. Positive values provide output.

In [5]:
with Parallel(n_jobs=2, verbose=1) as parallel_pool:
    parallel_results = parallel_pool(tasks)

[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done 12284 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done 100000 out of 100000 | elapsed:    0.2s finished


The library prints the backend and the number of workers used, followed by occasional progress reports. Higher values of `verbose` give more information at shorter intervals.

In [6]:
with Parallel(n_jobs=2, verbose=10) as parallel_pool:
    parallel_results = parallel_pool(tasks)

[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0150s.) Setting batch_size=2.
[Parallel(n_jobs=2)]: Done   4 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0170s.) Setting batch_size=4.
[Parallel(n_jobs=2)]: Done  16 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0060s.) Setting batch_size=8.
[Parallel(n_jobs=2)]: Batch computation too fast (0.0080s.) Setting batch_size=16.
[Parallel(n_jobs=2)]: Done  44 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0070s.) Setting batch_size=32.
[Parallel(n_jobs=2)]: Done 156 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0070s.) Setting batch_size=64.
[Parallel(n_jobs=2)]: Batch computation too fast (0.0070s.) Setting batch_size=128.
[Parallel(n_jobs=2)]: Done 508 tasks     

Notice that `joblib` reports that it is adjusting the batch size automatically. The batch size is the number of tasks dispatched to a worker at once. By default, it starts with a batch size of one. This means that one task (in our case, one square-root calculation) is assigned to a worker, and when the worker finishes, it is assigned the next task. Assigning new tasks to workers incurs **overhead**. Here, calculating a square root is very fast; the library detects this and doubles the batch size, so that a batch of two tasks is given to a worker at once. As the log shows, this doubling is repeated as long as the batches still finish too quickly.
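
The doubling seen in the log can be mimicked with a toy model. The sketch below is not `joblib`'s actual heuristic; it only assumes that the batch size doubles while a batch finishes faster than some target duration. The function name `doubling_schedule`, the per-task cost of one millisecond, and the target of 0.2 seconds are hypothetical values chosen for illustration.

```python
def doubling_schedule(task_seconds, target_batch_seconds, max_batch_size=1024):
    """Toy model: double the batch size while a batch runs shorter than the target."""
    sizes = [1]  # start with one task per batch, as in the log
    while (sizes[-1] * task_seconds < target_batch_seconds
           and sizes[-1] < max_batch_size):
        sizes.append(sizes[-1] * 2)
    return sizes


# Hypothetical numbers: each task takes 1 ms, a batch should last at least 0.2 s.
schedule = doubling_schedule(task_seconds=0.001, target_batch_seconds=0.2)
print(schedule)
```

In this model, cheap tasks lead to many doublings before the batches are long enough, while a task that already takes longer than the target keeps the batch size at one.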

Notice that a fixed batch size can be used by specifying the parameter `batch_size` when creating the `Parallel` object.

In [7]:
with Parallel(n_jobs=2, verbose=10, batch_size=1000) as parallel_pool:
    parallel_results = parallel_pool(tasks)

[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done 1004 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done 4004 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done 9004 tasks      | elapsed:    0.1s
[Parallel(n_jobs=2)]: Done 14004 tasks      | elapsed:    0.1s
[Parallel(n_jobs=2)]: Done 21004 tasks      | elapsed:    0.1s
[Parallel(n_jobs=2)]: Done 28004 tasks      | elapsed:    0.1s
[Parallel(n_jobs=2)]: Done 37004 tasks      | elapsed:    0.2s
[Parallel(n_jobs=2)]: Done 46004 tasks      | elapsed:    0.2s
[Parallel(n_jobs=2)]: Done 57004 tasks      | elapsed:    0.2s
[Parallel(n_jobs=2)]: Done 68004 tasks      | elapsed:    0.3s
[Parallel(n_jobs=2)]: Done 81004 tasks      | elapsed:    0.3s
[Parallel(n_jobs=2)]: Done 94004 tasks      | elapsed:    0.4s
[Parallel(n_jobs=2)]: Done 99093 tasks      | elapsed:    0.4s
[Parallel(n_jobs=2)]: Done 100000 out of 100000 | elapsed:    0.4s finished
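
The overhead of assigning tasks can be made visible by timing the same work with two fixed batch sizes. The following is a minimal sketch, not a careful benchmark: the helper `timed_run` and the task count of 5000 are choices made here for illustration, and the timings will differ from machine to machine. The warm-up call assumes that the worker processes are reused between calls, so that their start-up cost does not distort the first measurement.

```python
import math
import time

from joblib import Parallel, delayed


def timed_run(batch_size, n_tasks=5000):
    """Run n_tasks square roots with a fixed batch size; return results and seconds."""
    tasks = [delayed(math.sqrt)(i) for i in range(n_tasks)]
    start = time.perf_counter()
    results = Parallel(n_jobs=2, batch_size=batch_size)(tasks)
    return results, time.perf_counter() - start


# Warm-up so that starting the workers is not counted in the timings below.
Parallel(n_jobs=2)(delayed(math.sqrt)(i) for i in range(10))

results_small, seconds_small = timed_run(batch_size=1)
results_large, seconds_large = timed_run(batch_size=1000)

print(f"batch_size=1:    {seconds_small:.3f} s")
print(f"batch_size=1000: {seconds_large:.3f} s")
```

Both runs return the same results in the same order; only the number of dispatches differs. For tasks as cheap as a square root, one would expect the run with `batch_size=1` to be noticeably slower.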


Let us increase the verbosity even more.

In [8]:
with Parallel(n_jobs=2, verbose=50) as parallel_pool:
    parallel_results = parallel_pool(tasks)

[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0093s.) Setting batch_size=2.
[Parallel(n_jobs=2)]: Done   2 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done   3 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done   4 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done   6 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0069s.) Setting batch_size=4.
[Parallel(n_jobs=2)]: Done   8 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done  10 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done  12 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done  16 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0063s.) Setting batch_size=8.
[Parallel(n_jobs=2)]: Done  20 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done  24 tasks      | elapsed:    0.0s
[Paralle

Notice that the output now has a white background instead of a red one. For a verbosity of fifty or higher, the output is printed to `stdout` instead of `stderr`. This can be useful for advanced use cases where one needs control over the output stream.

Also, you can see the effect of the batch size increase in the output: the number of tasks completed between consecutive reports grows with the batch size.
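
The change of output stream can also be checked outside the notebook's colouring by capturing both streams. The sketch below assumes that `joblib` writes its progress messages from the main process through `sys.stdout` or `sys.stderr`, so that the standard-library redirection helpers can intercept them. The helper `capture_streams` and the task count of 20 are chosen here for illustration.

```python
import contextlib
import io
import math

from joblib import Parallel, delayed


def capture_streams(verbose):
    """Run a small parallel job; return the text written to stdout and to stderr."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        Parallel(n_jobs=2, verbose=verbose)(
            delayed(math.sqrt)(i) for i in range(20)
        )
    return out.getvalue(), err.getvalue()


out_10, err_10 = capture_streams(verbose=10)
out_50, err_50 = capture_streams(verbose=50)

print("verbose=10 wrote to stdout:", bool(out_10), "| to stderr:", bool(err_10))
print("verbose=50 wrote to stdout:", bool(out_50), "| to stderr:", bool(err_50))
```

The same approach can be used to send the progress reports to a log file instead of the console.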

In [5]:
# Activity:

def my_task(n):
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum

tasks = [delayed(my_task)(i) for i in range(100000)]

In [6]:
with Parallel(n_jobs=2, verbose=10) as parallel_pool:
    parallel_results = parallel_pool(tasks)

[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0043s.) Setting batch_size=2.
[Parallel(n_jobs=2)]: Done   4 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0101s.) Setting batch_size=4.
[Parallel(n_jobs=2)]: Done  16 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0080s.) Setting batch_size=8.
[Parallel(n_jobs=2)]: Batch computation too fast (0.0080s.) Setting batch_size=16.
[Parallel(n_jobs=2)]: Done  44 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0080s.) Setting batch_size=32.
[Parallel(n_jobs=2)]: Done 156 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Batch computation too fast (0.0080s.) Setting batch_size=64.
[Parallel(n_jobs=2)]: Batch computation too fast (0.0080s.) Setting batch_size=128.
[Parallel(n_jobs=2)]: Done 508 tasks     
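
When interpreting the output of the activity, it may help to note what `my_task` computes. The loop adds the integers from 0 to n - 1, which equals n(n - 1)/2, and its running time grows roughly linearly with n. The sketch below checks the closed form for small n; the helper `closed_form` is introduced here for illustration only.

```python
def my_task(n):
    # Same loop as in the activity: sum of the integers 0, 1, ..., n - 1.
    my_sum = 0
    for m in range(n):
        my_sum += m
    return my_sum


def closed_form(n):
    # Gauss formula for 0 + 1 + ... + (n - 1).
    return n * (n - 1) // 2


checked = [n for n in range(200) if my_task(n) == closed_form(n)]
print(len(checked), "of 200 values agree")
```

Unlike the square roots, these tasks are not equally expensive: the first tasks in the list are almost free, while the last ones each loop over tens of thousands of integers. A batch size that suits the start of the run is therefore not necessarily suitable near the end.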