# Part 1: Multiprocessing

In this first mini-tutorial, we are going to look at Python's `multiprocessing` module. It is part of the Python standard library, so it does not need any further installation.

Multiprocessing is a useful approach to parallelism when you can divide a task into separate sub-tasks that need minimal communication with each other while they run. Multiprocessing avoids the limitation of Python's Global Interpreter Lock (GIL) by creating separate operating-system-level processes, each running its own instance of the Python interpreter. If tasks do require communication between OS-level processes, the `multiprocessing` module has functionality to coordinate this. However, a workload that requires significant communication between tasks is likely to incur a lot of additional overhead.
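As a minimal sketch of the "separate processes" idea, the snippet below asks a small pool of workers to report their operating-system process IDs. The function name `report_process_id` and the pool size of 2 are illustrative choices, not part of the tutorial's own code. The `if __name__ == "__main__":` guard is there because some platforms start workers by re-importing the script.

```python
import os
from multiprocessing import Pool


def report_process_id(task_number):
    """Return the task number together with the ID of the process that ran it."""
    return task_number, os.getpid()


if __name__ == "__main__":
    # Two worker processes share four small tasks (both numbers are arbitrary).
    with Pool(processes=2) as pool:
        results = pool.map(report_process_id, range(4))

    print("parent process:", os.getpid())
    for task_number, process_id in results:
        print("task", task_number, "ran in process", process_id)
```

The worker process IDs should differ from the parent's, which shows that each task ran in its own Python interpreter rather than in a thread of the parent.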

A common use of the `multiprocessing` module is to parallelise a CPU-bound problem over a set of operating system processes. I/O-bound problems can also be solved using `multiprocessing`, though there are other alternatives that are more appropriate for them. (We do not cover them in these tutorials, but you may wish to investigate `asyncio`, which has been part of the standard library since Python 3.4.)

In this tutorial, we are going to work through a set of examples that explore the `multiprocessing` module by calculating pi using a Monte Carlo approach. It's a simple problem that parallelises easily, although it may not be the most efficient way of actually calculating pi in practice!

## Estimating Pi with a parallel Monte Carlo method

An interesting way to calculate pi is to imagine throwing darts or arrows at a square target with a circle printed on it. If we assume that where we hit the target is random (we are not very good archers or darts players), then the proportion of arrows landing inside the circle can be used to estimate pi. That proportion approaches the ratio of the circle's area to the square's area, which is pi/4 when the circle just fits inside the square, so pi is roughly four times the proportion of hits.

The workload can be split evenly across a number of processes, each one running a separate Python instance, on a separate CPU core.

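To get a reasonable estimate of pi using this method, we need to throw a lot of darts. The statistical error shrinks only in proportion to one over the square root of the number of throws, so around 10,000 darts typically pins down just the first one or two decimal places, and each further decimal place needs roughly 100 times as many throws. There are of course much better methods for estimating pi, but this is a nice example of using the `multiprocessing` module.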
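As a rough guide to how accuracy depends on the number of darts, the sketch below computes the expected standard error of the estimate. It assumes the estimator is four times the fraction of darts that land inside the circle, and that each dart is an independent trial with hit probability pi/4. The helper name `pi_estimate_standard_error` is hypothetical.

```python
import math


def pi_estimate_standard_error(number_of_throws):
    """Standard error of 4 * (hits / throws) when each throw hits with probability pi/4."""
    hit_probability = math.pi / 4.0
    variance_of_fraction = hit_probability * (1.0 - hit_probability) / number_of_throws
    return 4.0 * math.sqrt(variance_of_fraction)


for number_of_throws in (100, 10000, 1000000, 100000000):
    print(number_of_throws, round(pi_estimate_standard_error(number_of_throws), 5))
```

For 10,000 throws this gives a standard error of about 0.016, and multiplying the number of throws by 100 divides the error by 10.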

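With the Monte Carlo method, we can use the Pythagorean theorem to test whether a dart landing at the point (x, y) is inside a circle of radius 1 centred on the origin: its distance from the centre must be no more than the radius.

`sqrt(x^2 + y^2) <= 1`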

https://en.wikipedia.org/wiki/Pythagorean_theorem

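Since we are using the _unit circle_ (a circle of radius 1) and only sampling points with x and y between 0 and 1, we are working with the quarter of the circle that lies inside the unit square drawn from the circle's centre. Both sides of the inequality are non-negative, so we can simplify it further by squaring them and taking out the square root operation: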

`x^2 + y^2 <= 1`

The code to calculate this is as follows:



In [2]:
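import random

def estimate_number_points_in_quarter_circle(number_estimates):
    """Count random points in the unit square that fall inside the quarter circle."""
    number_trials_in_quarter_circle = 0
    for step in range(int(number_estimates)):
        # Pick a random point in the unit square [0, 1] x [0, 1].
        x = random.uniform(0, 1)
        y = random.uniform(0, 1)
        # True counts as 1 and False as 0 when added to an integer.
        is_in_unit_circle = x * x + y * y <= 1.0
        number_trials_in_quarter_circle += is_in_unit_circle
    return number_trials_in_quarter_circle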

This is the Python implementation of the core of our pi estimator using the circle method: it counts how many of the random points land inside the quarter circle.
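To show how the count turns into an estimate, here is a small serial sketch. The quarter circle covers pi/4 of the unit square, so the estimate is four times the fraction of hits. The counting function is repeated under a shorter, hypothetical name so that the block runs on its own, and the fixed seed is only there to make repeated runs give the same number.

```python
import random


def count_points_in_quarter_circle(number_estimates):
    """Same counting logic as the function above, repeated so this block is self-contained."""
    hits = 0
    for _ in range(int(number_estimates)):
        x = random.uniform(0, 1)
        y = random.uniform(0, 1)
        hits += x * x + y * y <= 1.0
    return hits


random.seed(0)  # assumption: a fixed seed, used only for reproducibility
number_of_throws = 10000
hits = count_points_in_quarter_circle(number_of_throws)

# Fraction of hits approximates pi/4, so multiply by 4 to estimate pi.
pi_estimate = 4.0 * hits / number_of_throws
print(pi_estimate)
```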

To solve this problem with parallelism, running a simulation with 10,000 dart throws, we could apportion the work between the number of CPU cores we have available to us, and do the computation in parallel. So, on a four-core CPU system, we could do 2,500 dart throws on each CPU core. 
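The apportioning step can be sketched as a small helper that splits a total number of throws into one block per worker. The name `split_samples` is hypothetical, and handing any remainder to the first few workers is just one reasonable choice when the total does not divide evenly.

```python
def split_samples(total_samples, number_of_workers):
    """Split total_samples into number_of_workers blocks whose sizes differ by at most 1."""
    base, remainder = divmod(int(total_samples), number_of_workers)
    # The first `remainder` workers each take one extra sample.
    return [base + 1 if worker < remainder else base
            for worker in range(number_of_workers)]


print(split_samples(10000, 4))  # [2500, 2500, 2500, 2500]
print(split_samples(10, 4))     # [3, 3, 2, 2]
```

Each block size can then be passed to a separate worker process, and the per-worker hit counts summed before dividing by the total number of throws.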

Now we need to use the `multiprocessing` module to apportion our work in the function above between separate parallel processes.

In [3]:
from multiprocessing import Pool

number_samples_in_total = 1e8
number_parallel_block = 4