# Multiprocessing and Multithreading

#### cloned from https://github.com/brianjp93/Multiprocessing-tuts

###### Goal
<ul>
<li>Speed up code by using multiple processes</li>
</ul>

###### Options
<ul>
<li>Multithreading</li>
<li>Multiprocessing</li>
</ul>

#### Multithreading

Threads help when the program has
<ul>
<li>Lots of time spent waiting around for a response
<ul>
<li>Network requests: HTTP GET, POST, PUT</li>
</ul>
</li>
<li>
Lots of I/O (Read, Write, Send, Recv...)
</li>
</ul>
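A minimal sketch of the waiting case, assuming `time.sleep` is a fair stand-in for a slow network call. The names `fake_request` and `run_threaded` are made up for this illustration. While one thread sleeps, the others are free to run, so the total time should stay close to a single delay instead of the sum of all of them.

```python
import time
from threading import Thread


def fake_request(delay, results, index):
    """Hypothetical stand-in for a network call: wait, then store a result."""
    time.sleep(delay)  # the thread is idle here, so other threads can run
    results[index] = "response {}".format(index)


def run_threaded(num_requests, delay):
    """Start one thread per simulated request and wait for all of them."""
    results = [None] * num_requests
    threads = [
        Thread(target=fake_request, args=(delay, results, i))
        for i in range(num_requests)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread has finished
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_threaded(num_requests=5, delay=0.2)
    print("{} requests finished in {:.2f} seconds".format(len(results), elapsed))
```

Run one after another, the five simulated requests would need about one second in total.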

##### Still Bound by the Global Interpreter Lock (GIL)
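In standard CPython builds the GIL lets only one thread execute Python bytecode at a time, so threads usually do not speed up pure-Python number crunching. The sketch below is one way to check this on your own machine. `count_down`, `time_serial` and `time_threaded` are hypothetical helper names, and no timings are quoted here because they depend on the hardware and the Python version.

```python
import time
from threading import Thread


def count_down(n):
    """Pure-Python CPU work: no I/O, so the thread never sits idle."""
    while n > 0:
        n -= 1


def time_serial(n, repeats):
    """Run count_down `repeats` times, one after another."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats):
    """Run count_down in `repeats` threads at the same time."""
    threads = [Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2000000
    print("serial:   {:.2f} s".format(time_serial(n, 2)))
    print("threaded: {:.2f} s".format(time_threaded(n, 2)))
```

If the two numbers come out about the same, the threads were taking turns rather than running in parallel.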

## CPU Bound Threading

In [19]:
from __future__ import division
from threading import Thread
import multiprocessing
from multiprocessing import Process
import time

Make a list with 50 million 10s.

In [2]:
myLen = 10000000*5
myList = [10]*myLen
myList[:15]

[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

In [3]:
def squared(num):
    return num**2

def squareList(lst):
    for i in lst:
        squared(i)  # result is discarded; only the time spent matters here

# Square all of the numbers in the list.

Just a regular for loop.

In [4]:
def p_plain(list_len):
    myList = [10]*list_len

    start = time.time()  # wall-clock time before the work starts
    squareList(myList)
    serialprocesstime = time.time() - start
    # NOTE: "10 million" is hard-coded in this message; the real count is list_len
    print("Squaring 10 million numbers took {} seconds.".format(round(serialprocesstime,2)))

    return serialprocesstime

# Squaring with multiprocessing

Let's run the same squaring function as before, but this time with multiprocessing.

According to the Python docs, examples that use a Pool will not work in the interactive interpreter, because the child processes must be able to import the __main__ module.

This seems to extend to an IPython notebook as well. The function below starts Process objects directly instead.
<a href="https://docs.python.org/3/library/multiprocessing.html#using-a-pool-of-workers">python pool docs</a>
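For comparison, here is a minimal sketch of the Pool approach as it would look in a standalone script rather than a notebook. It assumes the file is run directly (for example `python pool_demo.py`, a hypothetical filename), so the worker processes can import `squared` from the main module. `square_with_pool` is a name made up for this example.

```python
from multiprocessing import Pool


def squared(num):
    return num ** 2


def square_with_pool(numbers, num_workers=2):
    """Square every number using a pool of worker processes."""
    with Pool(processes=num_workers) as pool:
        # map splits the input across the workers and returns results in order
        return pool.map(squared, numbers)


if __name__ == "__main__":
    # the guard keeps child processes from re-running this block on import
    print(square_with_pool([1, 2, 3, 4]))
```

Unlike the bare Process version, `pool.map` hands the return values back to the parent, which is often the more convenient option when the results are needed.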

In [5]:
def p_multi(list_len):
    num_processes = multiprocessing.cpu_count()  # number of logical CPUs reported by the OS

    process_list = []  # processes to be run simultaneously

    # each process squares its own list of list_len // num_processes numbers
    # (any remainder from the integer division is dropped)
    myList = [10]*(list_len//num_processes)

    for _ in range(num_processes):
        # create one Process per CPU, each running the target function on its share
        p = Process(target=squareList, args=(myList,))
        process_list.append(p)

    start = time.time()
    for p in process_list:
        p.start()  # launch the child process

    for p in process_list:
        p.join()  # block until this process has finished

    squareprocesstime = time.time() - start
    # NOTE: "10 million" is hard-coded in this message; the real count is list_len
    print("Squaring 10 million numbers took {} seconds with {} processes.".format(round(squareprocesstime,2),num_processes))

    return squareprocesstime
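The function above only times the work: whatever the child processes compute is thrown away, and the integer division drops any leftover items. Below is a rough sketch of one way to split a list without losing items and get a result back from each process through a Queue. `split_evenly`, `sum_of_squares` and `parallel_sum_of_squares` are hypothetical names for this example, not part of the original notebook.

```python
import multiprocessing
from multiprocessing import Process, Queue


def split_evenly(items, num_chunks):
    """Split items into num_chunks slices whose lengths differ by at most one."""
    size, extra = divmod(len(items), num_chunks)
    chunks = []
    start = 0
    for i in range(num_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk, queue):
    """Worker: square each number and send the partial sum to the parent."""
    queue.put(sum(n ** 2 for n in chunk))


def parallel_sum_of_squares(items, num_processes):
    queue = Queue()
    processes = [
        Process(target=sum_of_squares, args=(chunk, queue))
        for chunk in split_evenly(items, num_processes)
    ]
    for p in processes:
        p.start()
    # read one result per process before joining
    partial_sums = [queue.get() for _ in processes]
    for p in processes:
        p.join()  # wait for each child process to exit
    return sum(partial_sums)


if __name__ == "__main__":
    numbers = [10] * 100003  # deliberately not divisible by most process counts
    print(parallel_sum_of_squares(numbers, multiprocessing.cpu_count()))
```

The results are read from the queue before calling `join()`, following the multiprocessing guideline to drain a queue before joining the processes that wrote to it.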

# Limitations

The job needs to run long enough that the speed increase outweighs the cost of setting up all the multiprocessing.

For example:
<ul>
<li>Squaring a list of 10K numbers will be slightly slower with multiprocessing</li>
<li>Squaring a list of 100K numbers will be about even</li>
<li>Squaring a list of 1M numbers or more is where multiprocessing begins to be faster</li>
<li>The best possible speed boost is probably close to the number of cores (e.g. about 2x on 2 cores)</li>
</ul>
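One rough way to think about these thresholds is a toy cost model: the parallel run pays a fixed setup cost and then does the serial work divided by the number of processes. This is a simplifying assumption, not a measurement, and the 0.05 second overhead below is a made-up number. `estimated_speedup` is a hypothetical helper.

```python
def estimated_speedup(serial_time, num_processes, overhead):
    """Speedup under the toy model: parallel_time = overhead + serial_time / n."""
    parallel_time = overhead + serial_time / num_processes
    return serial_time / parallel_time


if __name__ == "__main__":
    # hypothetical numbers: 0.05 s of fixed setup cost, 4 processes
    for serial_time in (0.01, 0.1, 1.0, 10.0):
        speedup = estimated_speedup(serial_time, num_processes=4, overhead=0.05)
        print("serial {:>5} s -> estimated speedup {:.2f}x".format(serial_time, speedup))
```

Under this model short jobs come out slower than the serial version, and the speedup approaches the number of processes only as the job gets long, which matches the pattern in the list above.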

In [17]:
myLen = 15**5  # 759,375 numbers; the "10 million" in the printed messages is hard-coded
speed_increase = p_plain(myLen) / p_multi(myLen)
print(f"Multiprocessing was {round(speed_increase,2)}X faster.")

Squaring 10 million numbers took 0.21 seconds.
Squaring 10 million numbers took 0.13 seconds with 4 processes.
Multiprocessing was 1.65X faster.
