<hr style="border-width:4px; border-color:coral"></hr>

# Practice

<hr style="border-width:4px; border-color:coral"></hr>

This notebook will cover:

* Getting timing results

* Running a large number of independent jobs using a Queue.



In [2]:
import multiprocessing as mp

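Starting with Python 3.8, the default start method on macOS is `spawn` (Windows has always used `spawn`).  However, `spawn` does not work well in Jupyter notebooks, because functions defined in notebook cells cannot be imported by the newly spawned process.  The following code sets the start method to `fork`, which is available on Linux and macOS.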

In [4]:
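start_method = mp.get_start_method()
if start_method != 'fork':
    print(f"Start method was {start_method}.  Setting it to 'fork'")
    # get_start_method() fixes the start method as a side effect, so
    # force=True is needed to change it without a RuntimeError
    mp.set_start_method('fork', force=True)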
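
As an alternative to changing the global start method, `multiprocessing` also provides context objects. The sketch below is only an illustration: it lists the start methods available on the current platform and builds a context that uses `fork` where it exists. The names `available` and `ctx` are chosen here for illustration and are not used elsewhere in the notebook.

```python
import multiprocessing as mp

# Start methods supported on this platform, e.g. 'fork', 'spawn', 'forkserver'
available = mp.get_all_start_methods()
print(available)

# allow_none=True reports the current setting without fixing it as a side effect
print(mp.get_start_method(allow_none=True))

# A context carries its own start method and leaves the global setting alone.
# Assumption: 'fork' is available (Linux/macOS); otherwise use the default context.
ctx = mp.get_context('fork') if 'fork' in available else mp.get_context()
print(ctx.get_start_method())
```

A context exposes the same API as the module itself (`ctx.Process`, `ctx.Queue`, `ctx.Pipe`, ...), so it can be used wherever `mp` is used.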

Test your setup with this small example:

In [5]:
def debug():
    print("In process debug")
    
p = mp.Process(target=debug)    

p.start()

p.join()

print("All Done!")

In process debug
All Done!



## Timing results using `timeit`

<hr style="border-width:4px; border-color:black"></hr>

One of the main reasons to use multiprocessing is to speed up our code.  To see how well our multiprocessing is working, we need to be able to determine the execution time for a given job.   We will do this using the `timeit` module.  

You can read more on `timeit` in the notebook `00_timeit`, available on Canvas. 

To see how `timeit` works, let's time a process whose execution time we know exactly. 

### Timing a sleep timer

In [6]:
import time

In [9]:
%%timeit -n 1 -r 1

time.sleep(1)

1.01 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


To get more accurate statistics, it is advisable to run the same code several times.  This can be done with the `timeit` flags `-n` (number of loops per run) and `-r` (number of runs).  `%%timeit` reports the *average* time per loop, along with the standard deviation across the runs. 

In [11]:
%%timeit -n 5 -r 1

time.sleep(1)

1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 5 loops each)


If we don't specify any parameters, `timeit` will choose default values, which might take a while.

In [14]:
%%timeit

time.sleep(1)

1 s ± 1.54 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
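
Outside of a notebook cell, roughly the same measurement can be made with the standard `timeit` module. The sketch below assumes a shortened nap of 0.01 s so that it finishes quickly; the helper `nap` and the variables `totals` and `per_loop` are names made up for this example.

```python
import time
import timeit

def nap():
    time.sleep(0.01)   # short nap so the example finishes quickly

# repeat=3 plays the role of -r 3; number=5 plays the role of -n 5
totals = timeit.repeat(nap, repeat=3, number=5)

# Each entry is the total time for 5 loops; divide to get the time per loop
per_loop = [t / 5 for t in totals]
mean = sum(per_loop) / len(per_loop)
print(f"best: {min(per_loop):.4f} s, mean: {mean:.4f} s per loop")
```

Unlike `%%timeit`, `timeit.repeat` returns the raw totals, so the timings can be stored in a list for later plotting.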


### Timing processes

Run the code below for $P = 1,2,4,8,\dots,1024$ jobs.  Store the timing results in an array.  

What behavior do you see?  

In [28]:
from numpy import *
from matplotlib.pyplot import *   # figure, clf, loglog, ... used for plotting below

In [26]:
%%timeit -n 1 -r 1

def naptime():
    time.sleep(5)  

P = 1024
jobs = []
for i in range(P):
    p = mp.Process(target=naptime)
    jobs.append(p)   

for p in jobs:
    p.start()
    
for p in jobs:
    p.join()     # Wait for each job to join 
        
print("All done ")

All done 
12.5 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [29]:
# Store timing results for P = 1,2,4,8,16,.... number of processes.  
T = []
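
One possible way to collect these timings programmatically is to wrap the start/join loop in a function and time it with `time.perf_counter`. This is only a sketch under stated assumptions: it uses a `fork` context (Linux/macOS), a shortened nap of 0.1 s, and a small sweep of $P$. The names `time_jobs`, `nap` and `T_demo` are hypothetical.

```python
import multiprocessing as mp
import time

ctx = mp.get_context('fork')   # assumption: 'fork' is available (Linux/macOS)

def naptime(nap):
    time.sleep(nap)

def time_jobs(P, nap=0.1):
    """Return the wall-clock seconds needed to start and join P sleeping processes."""
    jobs = [ctx.Process(target=naptime, args=(nap,)) for _ in range(P)]
    t0 = time.perf_counter()
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()
    return time.perf_counter() - t0

# Small demonstration sweep; the exercise itself uses P = 1, 2, 4, ..., 1024
T_demo = [time_jobs(P) for P in (1, 2, 4, 8)]
print(T_demo)
```

Since every job only sleeps, any growth in the measured time with $P$ comes from the cost of creating and joining processes.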

Make a log-log plot of *latency* vs. $P$.  What do you observe? 

In [None]:
figure(1)
clf()

P = 2**arange(11)   # 1,2,4,8,16,..,1024

# Plot latency vs. P



## Queues

<hr style="border-width:4px; border-color:black"></hr>


Suppose we have $N$ tasks that need to be completed.  These tasks might be

* Processing $N$ data files, 

* Multiplying a matrix by $N$ vectors, 

* Creating $N$ plots

* Run a simulation with $N$ different choices of parameters

Suppose also that these tasks are independent: the order in which they are carried out doesn't matter, and the tasks do not need to communicate with each other. Finally, assume that each task takes roughly the same amount of time.   

Assume that we have $P$ workers to carry out these tasks, with $P \ll N$ (i.e. $P$ is *much* smaller than $N$). Then we have to distribute the $N$ tasks more or less evenly among the $P$ workers.  

Below we do this first using $P$ pipes to communicate tasks to each worker. 

### Create a task list

In [30]:
# Create task list        
N = 10    
task_list = [f"Task {i}" for i in range(N)]    
print(task_list)

['Task 0', 'Task 1', 'Task 2', 'Task 3', 'Task 4', 'Task 5', 'Task 6', 'Task 7', 'Task 8', 'Task 9']
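
A simple way to split such a list evenly is round-robin assignment, in which worker $w$ takes tasks $w, w+P, w+2P, \dots$. The sketch below assumes $N = 10$ and $P = 4$; the name `chunks` is made up for this example.

```python
N, P = 10, 4
task_list = [f"Task {i}" for i in range(N)]

# Worker w takes tasks w, w+P, w+2P, ... (round-robin)
chunks = [task_list[w::P] for w in range(P)]
for w, chunk in enumerate(chunks):
    print(w, chunk)
```

With $N = 10$ and $P = 4$, two workers receive three tasks and two workers receive two, so no worker has more than one task more than any other.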


### Define a worker

In [31]:
def do_task_pipe(conn):
    pname = mp.current_process().name
    # ....

### Set up processes and connections

In [32]:
# Create one pipe (a pair of connection objects) for each of the P workers
P = 4    
pipes = [mp.Pipe() for i in range(P)]    # "list comprehension"


# TODO : Create P processes;  pass each process a connection
workers = []

# TODO : Start processes

# TODO : Distribute N tasks to P workers

# TODO : Wait for processes to complete

print("Everybody is done!")
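
For comparison, here is one possible way the pipe-based pattern could look. It is a sketch, not the intended solution: it assumes a `fork` context, hands out tasks round-robin, uses `None` as an end-of-work signal, and has each worker send back the list of tasks it handled. The names `pipe_worker`, `tasks`, `procs` and `completed` are hypothetical.

```python
import multiprocessing as mp

ctx = mp.get_context('fork')   # assumption: 'fork' is available (Linux/macOS)

def pipe_worker(conn):
    pname = mp.current_process().name
    done = []
    while True:
        task = conn.recv()        # blocks until the parent sends something
        if task is None:          # None is the "no more work" signal
            break
        print(f"{pname} working on {task}")
        done.append(task)
    conn.send(done)               # report back which tasks were handled
    conn.close()

N, P = 10, 4
tasks = [f"Task {i}" for i in range(N)]
pipes = [ctx.Pipe() for _ in range(P)]        # each pipe is (parent_end, child_end)
procs = [ctx.Process(target=pipe_worker, args=(child,)) for _, child in pipes]
for p in procs:
    p.start()

for i, task in enumerate(tasks):              # round-robin distribution
    pipes[i % P][0].send(task)
for parent, _ in pipes:                       # tell every worker to stop
    parent.send(None)

completed = [parent.recv() for parent, _ in pipes]
for p in procs:
    p.join()
print("Everybody is done!")
```

Note that the parent decides in advance which worker gets which task, so a slow worker cannot hand its remaining tasks to an idle one.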

<hr style="border-width:4px; border-color:coral"></hr>

## Using a Queue

<hr style="border-width:4px; border-color:coral"></hr>

A `multiprocessing.Queue` takes care of distributing the tasks for us.  

* A "queue", as the name suggests, is a list of tasks.  As items in the queue become available, they can be taken off the queue and completed.  

* Queues are built on top of pipes. 

In [None]:
queue = mp.Queue()

# Put tasks on the queue

# Remove tasks from the queue
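
The two basic queue operations are `put` and `get`. A minimal sketch, run in a single process and using the illustrative names `q` and `items`:

```python
import multiprocessing as mp

q = mp.Queue()

# Put tasks on the queue
for item in ["a", "b", "c"]:
    q.put(item)

# Remove tasks from the queue; get() blocks until an item is available,
# and the timeout guards against waiting forever
items = [q.get(timeout=5) for _ in range(3)]
print(items)   # ['a', 'b', 'c']
```

Items put by a single process come off the queue in the order in which they were put on (first in, first out).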

We can use a queue instead of pipes to send tasks to worker processes. 

### Define a worker (queue)

In [33]:
def do_task_queue(queue):
    pname = mp.current_process().name
    # ....

In [34]:
# TODO : Create P processes;  pass the queue to each process
workers = []

# TODO : Start processes

# TODO :  Put tasks on the queue

# TODO : Wait for processes to complete

print("Everybody is done!")

Everybody is done!
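
One possible way to fill in the pattern above is sketched here. It assumes a `fork` context, uses one `None` per worker as a stop signal, and adds a second queue so that workers can report what they did. The names `queue_worker`, `tasks`, `results` and `finished` are hypothetical, not part of the exercise.

```python
import multiprocessing as mp

ctx = mp.get_context('fork')   # assumption: 'fork' is available (Linux/macOS)

def queue_worker(tasks, results):
    pname = mp.current_process().name
    while True:
        task = tasks.get()          # blocks until a task is available
        if task is None:            # None is the "no more work" signal
            break
        results.put((pname, task))  # report which process handled the task

N, P = 10, 4
tasks = ctx.Queue()
results = ctx.Queue()

workers = [ctx.Process(target=queue_worker, args=(tasks, results)) for _ in range(P)]
for w in workers:
    w.start()

for i in range(N):                  # put tasks on the queue
    tasks.put(f"Task {i}")
for _ in range(P):                  # one stop signal per worker
    tasks.put(None)

# Drain the results before joining, so no worker is left blocked on a full pipe
finished = [results.get() for _ in range(N)]
for w in workers:
    w.join()
print("Everybody is done!")
```

Because each worker pulls its next task from the shared queue as soon as it is free, the work balances itself without the parent assigning tasks to specific workers.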


<hr style="border-width:4px; border-color:coral"></hr>

## Shared arrays

<hr style="border-width:4px; border-color:coral"></hr>

We can share data among processes using `mp.Value` (a single shared value) and `mp.Array` (a shared array). 

In [39]:
import math
x = mp.Value('d',math.pi)

print(x.value)

3.141592653589793


In [40]:
def double_value(x):
    x.value *= 2

p = mp.Process(target=double_value, args=(x,))

p.start()

p.join()


print(x.value)

6.283185307179586
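
A shared array works in much the same way. In the sketch below, which assumes a `fork` context, four processes each write to their own slot of an array of doubles; the names `fill_slot`, `arr` and `procs` are made up for this example.

```python
import multiprocessing as mp

ctx = mp.get_context('fork')   # assumption: 'fork' is available (Linux/macOS)

def fill_slot(arr, i):
    arr[i] = i * i             # each process writes only to its own slot

arr = ctx.Array('d', 4)        # four doubles, initialised to 0.0
procs = [ctx.Process(target=fill_slot, args=(arr, i)) for i in range(4)]
for p in procs:
    p.start()
for p in procs:
    p.join()

print(arr[:])   # [0.0, 1.0, 4.0, 9.0]
```

Since every process touches a different element, no extra locking is needed here. When several processes update the same element, the update can be wrapped in `with arr.get_lock():` to avoid lost updates.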
