# Python Multiprocessing

## Learning Objectives

By the end of this lesson, learners will be able to:

- Understand the concept of embarrassingly parallel problems and how fractal generation fits into this category.
- Set up a serial approach for generating a fractal, such as the Julia set, using complex numbers and grid creation.
- Implement a function in Python to calculate the Julia set convergence for each element in the complex grid.
- Implement Python's `multiprocessing` library to parallelize a fractal generation task within a single code instance.
- Set up a pool of worker processes using `Pool(processes=n_processes)` and delegate tasks across these processes with the `p.map()` function.
- Use `functools.partial` to manage function parameters that remain constant across parallel tasks, optimizing code reuse.
- Divide a computational grid into slices and assign each slice to a worker process to handle independently.
- Close a pool of processes in Python's multiprocessing model once tasks are completed, resuming the main program.
- Evaluate the performance of the multiprocessing approach by timing code execution with varying numbers of slices and processes, and compare results with the serial version in `fractal_complete.py`.


## CPU bound example with Python multiprocessing

The problem we will attempt to solve is constructing a fractal. This kind of problem is often described as "embarrassingly parallel": each element of the result has no dependency on any of the other elements, so we can solve the problem in parallel without too much difficulty. Let's get started by creating a new script, `fractal.py`.

### Setting up our problem

Let's first think about our problem in serial - we want to construct the [Julia set](https://en.wikipedia.org/wiki/Julia_set) fractal, so we need to create a grid of complex numbers to operate over. We can create a simple function to do this:

```python
# fractal.py
import numpy as np

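def complex_grid(extent, n_cells):
    # Sample the interval [-extent, extent) with a spacing of extent / n_cells
    mesh_range = np.arange(-extent, extent, extent / n_cells)
    # Imaginary parts come from the first argument, real parts from the second
    x, y = np.meshgrid(mesh_range * 1j, mesh_range)
    z = x + y

    return z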
```
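As a quick check of what `complex_grid` returns, the sketch below builds a tiny grid. It assumes the two-argument form `complex_grid(extent, n_cells)` that is used when the function is called later in this lesson; the values `extent=2` and `n_cells=4` are chosen only for illustration.

```python
import numpy as np


def complex_grid(extent, n_cells):
    # Points from -extent up to (but not including) extent, spaced extent / n_cells apart
    mesh_range = np.arange(-extent, extent, extent / n_cells)
    x, y = np.meshgrid(mesh_range * 1j, mesh_range)
    return x + y


grid = complex_grid(extent=2, n_cells=4)
print(grid.shape)  # (8, 8): there are 2 * n_cells points along each axis
print(grid[0, 0])  # (-2-2j): the corner with the most negative real and imaginary parts
```

By the same reasoning, the lesson's `cells = 2000` should give a grid of roughly 4000 by 4000 points, which is about 16 million complex numbers to iterate over.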

Now, we can create a function that will calculate the Julia set convergence for each element in the complex grid:

```python
import warnings

...

def julia_set(grid, num_iter, c):

    fractal = np.zeros(np.shape(grid))

    # Iterate through the operation z := z**2 + c.
    for j in range(num_iter):
        grid = grid ** 2 + c
        # Catch the overflow warning because it's annoying
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            index = np.abs(grid) < np.inf
        fractal[index] = fractal[index] + 1

    return fractal
```

This function counts, for each element in the complex grid, how many iterations of `z = z**2 + c` it survives before diverging to infinity (if it ever does). The function itself is not the focus of this exercise as much as it is a way to make the computer perform some work! Let's use these functions to set up our problem in serial, without any parallelism:

```python

...

c = -0.8 - 0.22 * 1j
extent = 2
cells = 2000

grid = complex_grid(extent, cells)
fractal = julia_set(grid, 80, c)
```
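To make the counting concrete, here is a minimal check on two hand-picked points, assuming `c = 0` so the behaviour is easy to predict: `z = 0` never grows, while `z = 2` is squared repeatedly until it overflows. The function body is a condensed copy of `julia_set` with `num_iter` and `c` passed as arguments; the warning filter is widened to cover the whole loop purely to keep the output quiet.

```python
import warnings

import numpy as np


def julia_set(grid, num_iter, c):
    fractal = np.zeros(np.shape(grid))
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence overflow warnings from escaping points
        for _ in range(num_iter):
            grid = grid ** 2 + c
            index = np.abs(grid) < np.inf  # True where the value is still finite
            fractal[index] = fractal[index] + 1
    return fractal


points = np.array([0 + 0j, 2 + 0j])
counts = julia_set(points, num_iter=80, c=0)
print(counts[0])       # 80.0: this point never escapes, so every iteration is counted
print(counts[1] < 80)  # True: this point overflows to infinity after a few iterations
```

Points with low counts escape quickly; points that reach `num_iter` stayed finite for the whole run, and it is this variation that produces the fractal pattern.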

If we run the Python script (`python fractal.py`), it takes a few seconds to complete (this will vary depending on your machine), so we can already see that we are making our computer work reasonably hard with just a few lines of code. The `time` command gives a simple overview of how much time and how many resources are being used:

```
$ time python parallel_fractal_complete.py
python parallel_fractal_complete.py  5.96s user 3.37s system 123% cpu 7.558 total
```
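The `time` command measures the whole run, including interpreter start-up and imports. To time only the calculation, one option is `time.perf_counter` from the standard library. The sketch below is an illustration rather than part of the lesson's script: `timed` and `busy_work` are hypothetical helpers, with `busy_work` standing in for the fractal calculation.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return its result together with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def busy_work(n):
    """Stand-in for the fractal calculation (hypothetical helper)."""
    return sum(i * i for i in range(n))


result, elapsed = timed(busy_work, 100_000)
print(f"busy_work took {elapsed:.4f} s")
```

In the fractal script, the same wrapper could be applied as `timed(julia_set, grid, 80, c)`.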



````{note}
We can also visualise the Julia set with the code snippet:

```python
import matplotlib.pyplot as plt

...

plt.imshow(fractal, extent=[-extent, extent, -extent, extent], aspect='equal')
plt.show()
```

but doing so will impact the numbers returned when we time our function, so it's important to remember this before trying to measure how long the function takes.
````

### Download Complete Serial File 
[Download complete serial fractal example file](complete_files/fractal_complete.py)


### Parallelising the problem with multiprocessing

In `multiprocessing_fractal_complete.py`, the previous fractal example has been implemented using `multiprocessing` from the Python standard library.

For the multiprocessing model, we set up a *pool* of workers, `Pool(processes=n_processes)`, assigned to `p`.
The work can then be delegated out to these workers using the [`p.map()`](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map) method.
This `map` method (analogous to the built-in [`map`](https://docs.python.org/3/library/functions.html#map)) takes two arguments: a function to run (our fractal function) and a collection of inputs to pass to it (different regions of the grid, to be processed in parallel).
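A minimal sketch of this pattern, using a toy `square` function as a stand-in for the fractal calculation (`square` and `parallel_squares` are hypothetical names, not part of the lesson's script). The `if __name__ == "__main__":` guard is needed on platforms that start workers by spawning a fresh interpreter, such as Windows and macOS.

```python
from multiprocessing import Pool


def square(x):
    # Must be defined at module level so the worker processes can find it
    return x * x


def parallel_squares(values, n_processes=2):
    # The with-block shuts the pool down once map() has returned
    with Pool(processes=n_processes) as p:
        return p.map(square, values)


if __name__ == "__main__":
    print(parallel_squares(range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

`p.map()` returns the results as a list in the same order as the inputs, regardless of which worker finished first.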

````{note}
To pass in the parameters that don't change over grid regions, we've used [`functools.partial`](https://docs.python.org/3/library/functools.html#functools.partial):

```python
partial_julia_set = partial(julia_set, num_iter=80, c=-0.83 - 0.22 * 1j)
```

This is essentially equivalent to defining a new function:

```python
def partial_julia_set(grid):
    return julia_set(grid, num_iter=80, c=-0.83 - 0.22 * 1j)
```

You may be familiar with *lambda* expressions, but these cannot be passed in to the `multiprocessing.Pool.map` function because they cannot be pickled for transfer to the worker processes.
````

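The difference between `partial` and a lambda can be seen directly with the `pickle` module, which is what `multiprocessing` relies on to send work to the pool. This is a small sketch under that assumption; `scale` is a hypothetical stand-in for `julia_set`.

```python
import pickle
from functools import partial


def scale(value, factor):
    return value * factor


# partial() fixes factor=2, leaving a one-argument callable suitable for map()
double = partial(scale, factor=2)
print(double(21))  # 42

# A partial of a module-level function survives a pickle round trip
restored = pickle.loads(pickle.dumps(double))
print(restored(21))  # 42

# A lambda doing the same job cannot be pickled
try:
    pickle.dumps(lambda value: scale(value, factor=2))
    lambda_is_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_is_picklable = False
print(lambda_is_picklable)  # False
```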
In this script, we have split up the grid into `n_slices` vertical slices and created a pool of `n_processes` workers.
Each worker takes a slice, calculates the result, which is collected into `fractals`, and then moves on to a new slice.
When there are no more slices to work on, the pool is *closed* and the main program resumes.
We can see how much this speeds up the code by timing the full script with different values of `n_slices` and `n_processes`.
Compare these numbers against the previous serial example in `fractal_complete.py`.
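The steps above could be put together roughly as follows. This is a sketch under stated assumptions, not the contents of `multiprocessing_fractal_complete.py`: the use of `np.array_split` and `np.concatenate` to cut and reassemble the grid, the helper name `parallel_julia_set`, and the small grid size are all choices made here for illustration.

```python
import time
import warnings
from functools import partial
from multiprocessing import Pool

import numpy as np


def complex_grid(extent, n_cells):
    mesh_range = np.arange(-extent, extent, extent / n_cells)
    x, y = np.meshgrid(mesh_range * 1j, mesh_range)
    return x + y


def julia_set(grid, num_iter, c):
    fractal = np.zeros(np.shape(grid))
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence overflow warnings from escaping points
        for _ in range(num_iter):
            grid = grid ** 2 + c
            index = np.abs(grid) < np.inf
            fractal[index] = fractal[index] + 1
    return fractal


def parallel_julia_set(grid, num_iter, c, n_slices, n_processes):
    # Cut the grid into vertical slices (groups of columns)
    slices = np.array_split(grid, n_slices, axis=1)
    partial_julia_set = partial(julia_set, num_iter=num_iter, c=c)
    with Pool(processes=n_processes) as p:
        fractals = p.map(partial_julia_set, slices)
    # map() preserves input order, so the slices can be joined back side by side
    return np.concatenate(fractals, axis=1)


if __name__ == "__main__":
    c = -0.8 - 0.22 * 1j
    grid = complex_grid(extent=2, n_cells=200)
    for n_processes in (1, 2, 4):
        start = time.perf_counter()
        fractal = parallel_julia_set(grid, 80, c, n_slices=8, n_processes=n_processes)
        elapsed = time.perf_counter() - start
        print(f"{n_processes} processes: {elapsed:.3f} s")
```

On a grid this small, the cost of starting processes and transferring slices may outweigh the gain, so increasing `n_cells` should give a clearer comparison.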

### Download Complete Multiprocessing File
[Download complete multiprocessing_fractal example file](complete_files/multiprocessing_fractal_complete.py)

```python
from jupyterquiz import display_quiz
display_quiz("questions/summary_multiprocessing.json")
```