## Parallelization

To use parallelization in this framework, we need to define two things:

- A function that receives exactly one parameter and returns an output
- A generator that yields the inputs

In [1]:
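# FUNCTION: takes exactly one parameter and returns an output
def f(x):
    return sum([x + k for k in range(1000)])

# GENERATOR: yields the inputs that are passed to the function one by one
def g():
    for i in range(10):
        yield i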

Then, we can use a parallelization implementation to run the function for each parameter yielded by the generator.

### Serial

`Serial` does not parallelize; it simply runs a `for` loop over the inputs. It is often used as the default placeholder when no parallelization is needed.

In [2]:
from azcausal.core.parallelize import Serial

parallelize = Serial()

parallelize(f, g())

[499500,
 500500,
 501500,
 502500,
 503500,
 504500,
 505500,
 506500,
 507500,
 508500]
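As a rough illustration of the serial behaviour described above, the following sketch applies a function to every generated input in a plain loop. The helper name `run_serial` is hypothetical and is not part of `azcausal`; it only mirrors the idea, assuming the results are collected in input order.

```python
def f(x):
    # same toy function as above: 1000 * x + 499500
    return sum(x + k for k in range(1000))


def g():
    # same toy generator as above: yields 0, 1, ..., 9
    for i in range(10):
        yield i


def run_serial(func, inputs):
    # hypothetical helper (not the azcausal implementation):
    # call func once per input, one after another, keeping the input order
    return [func(x) for x in inputs]


results = run_serial(f, g())
print(results[:3])  # [499500, 500500, 501500]
```

Because nothing runs concurrently, a loop like this is convenient for debugging before switching to a parallel variant.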

### Pool

`Pool` uses Python's threads or processes to run tasks in parallel.

In [3]:
from azcausal.core.parallelize import Pool

# the mode must be either `thread` or `process`
mode = 'thread'

# the number of workers (by default #cores-1)
max_workers = None

parallelize = Pool(mode=mode, max_workers=max_workers)

parallelize(f, g())

[499500,
 500500,
 501500,
 502500,
 503500,
 504500,
 505500,
 506500,
 507500,
 508500]
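To make the thread-based idea more concrete, here is a minimal sketch built on the standard library's `concurrent.futures.ThreadPoolExecutor`. It is an assumption that a thread pool of this kind resembles what `Pool` does in `thread` mode; the helper `run_in_threads` is hypothetical and not part of `azcausal`.

```python
from concurrent.futures import ThreadPoolExecutor


def f(x):
    # same toy function as above: 1000 * x + 499500
    return sum(x + k for k in range(1000))


def g():
    # same toy generator as above: yields 0, 1, ..., 9
    for i in range(10):
        yield i


def run_in_threads(func, inputs, max_workers=None):
    # hypothetical helper (not the azcausal implementation):
    # submit one task per input to a pool of worker threads
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns the results in the order of the inputs
        return list(executor.map(func, inputs))


results = run_in_threads(f, g(), max_workers=4)
print(results[:3])  # [499500, 500500, 501500]
```

For CPU-bound pure-Python functions, threads are usually limited by the global interpreter lock, so a process-based pool may be the better choice. In that case the function and its inputs need to be picklable.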

### Joblib

`Joblib` uses the well-known joblib library for parallelization.

In [4]:
from azcausal.core.parallelize import Joblib

n_jobs = None

parallelize = Joblib(n_jobs=n_jobs)

parallelize(f, g())

[499500,
 500500,
 501500,
 502500,
 503500,
 504500,
 505500,
 506500,
 507500,
 508500]
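For comparison, the same computation can be written directly against joblib's own `Parallel` and `delayed` API. This is only a sketch of the underlying pattern, assuming joblib is installed; how the `Joblib` wrapper calls the library internally is not shown here, and the values `n_jobs=2` and `prefer="threads"` are illustrative choices.

```python
from joblib import Parallel, delayed


def f(x):
    # same toy function as above: 1000 * x + 499500
    return sum(x + k for k in range(1000))


def g():
    # same toy generator as above: yields 0, 1, ..., 9
    for i in range(10):
        yield i


# delayed(f)(x) wraps each call as a task; Parallel runs the tasks
# on two workers and returns the results as a list in input order
results = Parallel(n_jobs=2, prefer="threads")(delayed(f)(x) for x in g())
print(results[:3])  # [499500, 500500, 501500]
```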