### Parallelism in Python

Here we look at parallel programming in more detail and at the facilities 
Python offers for making our code use more than one CPU or CPU core at a 
time (always within the boundaries of a single machine). 

The main goal here will be speed for CPU-intensive problems, and responsiveness for I/O-intensive code.

Let's start by writing a simple program that makes use of multiple threads to 
download data from the Web. 

In [2]:
from threading import Thread
from queue import Queue
import urllib.request

In [36]:
links = [
    'https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2017-financial-year-provisional/Download-data/annual-enterprise-survey-2017-financial-year-provisional-size-bands-csv.csv',
    'https://www.stats.govt.nz/assets/Uploads/Household-living-costs-price-indexes/Household-living-costs-price-indexes-September-2018-quarter/Download-data/household-living-costs-price-indexes-sep18qtr-time-series-indexes.csv',
    'https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2017-financial-year-provisional/Download-data/annual-enterprise-survey-2017-financial-year-provisional-csv.csv',
]

In [37]:
links

['https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2017-financial-year-provisional/Download-data/annual-enterprise-survey-2017-financial-year-provisional-size-bands-csv.csv',
 'https://www.stats.govt.nz/assets/Uploads/Household-living-costs-price-indexes/Household-living-costs-price-indexes-September-2018-quarter/Download-data/household-living-costs-price-indexes-sep18qtr-time-series-indexes.csv',
 'https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2017-financial-year-provisional/Download-data/annual-enterprise-survey-2017-financial-year-provisional-csv.csv']

In [43]:
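def get_content(act_url, outq):
    """Download act_url and put the tuple (url, body) on the queue outq."""
    with urllib.request.urlopen(act_url) as res:
        body = res.read()  # raw bytes of the response
    outq.put((act_url, body))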

In [52]:
outputq = Queue()

In [53]:
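for link in links:
    # One thread per URL; each thread puts its result on outputq.
    t = Thread(target=get_content,
               kwargs={'act_url': link,
                       'outq': outputq})
    t.daemon = True  # a daemon thread does not keep the interpreter alive
    t.start()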

In [54]:
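for _ in links:
    # get() blocks until some thread has delivered a result, so results
    # arrive in completion order, not necessarily in the order of links.
    link, body = outputq.get()
    print(link, body[:100])
    outputq.task_done()
outputq.join()  # returns at once: every item has been marked done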

https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2017-financial-year-provisional/Download-data/annual-enterprise-survey-2017-financial-year-provisional-size-bands-csv.csv b'year,industry_code_ANZSIC,industry_name_ANZSIC,rme_size_grp,variable,value,unit\r\n2011,A,"Agriculture'
https://www.stats.govt.nz/assets/Uploads/Household-living-costs-price-indexes/Household-living-costs-price-indexes-September-2018-quarter/Download-data/household-living-costs-price-indexes-sep18qtr-time-series-indexes.csv b'hlpi_name,series_ref,quarter,hlpi,nzhec,nzhec_name,nzhec_short,level,index,change.q,change.a\r\nAll ho'
https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2017-financial-year-provisional/Download-data/annual-enterprise-survey-2017-financial-year-provisional-csv.csv b'Year,Industry_aggregation_NZSIOC,Industry_code_NZSIOC,Industry_name_NZSIOC,Units,Variable_code,Varia'
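
The queue-based pattern above can be tried without network access. What follows is a minimal sketch, not the notebook's own code: `fake_fetch` is a hypothetical stand-in for `get_content` that sleeps instead of downloading, and the job names and delays are made up for illustration. It shows how results come off the queue as the threads finish, and it joins the threads explicitly instead of marking them as daemons.

```python
import time
from queue import Queue
from threading import Thread

def fake_fetch(name, delay, outq):
    """Hypothetical stand-in for get_content: sleep instead of downloading."""
    time.sleep(delay)            # simulates waiting on the network
    outq.put((name, delay))

# Made-up job names and delays (in seconds), slowest first.
jobs = [('slow', 0.3), ('medium', 0.2), ('fast', 0.1)]

outq = Queue()
threads = [Thread(target=fake_fetch, args=(name, delay, outq))
           for name, delay in jobs]
for t in threads:
    t.start()

# get() blocks until the next result is available.
finished = [outq.get()[0] for _ in jobs]

for t in threads:
    t.join()                     # wait for every thread to exit

print(finished)
```

With these delays the names will typically be printed in order of increasing delay, which is the reverse of the order in which the threads were started.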


In [55]:
# no threads: download the files one after the other

from time import time

q = Queue()
t0 = time()
for p in links:
    get_content(p, q)
dt = time() - t0
print(dt)

2.706188440322876


In [56]:
# threads: download all files concurrently

t0 = time()

for link in links:
    t = Thread(target=get_content,
               kwargs={'act_url': link,
                       'outq': outputq})
    t.daemon = True
    t.start()

for _ in links:
    link, body = outputq.get()  # blocks until a download has finished
    outputq.task_done()
outputq.join()

dt = time() - t0
print(dt)

0.9815263748168945
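
The speed-up measured above comes from overlapping the time each download spends waiting. A rough way to see the same effect without touching the network is sketched below. `simulated_download` is a hypothetical function that only sleeps, and the 0.2-second delay and the job count are arbitrary assumptions. The threads are joined directly here instead of reporting through a queue.

```python
import time
from threading import Thread

DELAY = 0.2   # assumed wait per simulated download, in seconds
N_JOBS = 3    # assumed number of downloads

def simulated_download():
    """Hypothetical placeholder for an I/O-bound call: it only waits."""
    time.sleep(DELAY)

# Sequential: the waits add up.
t0 = time.perf_counter()
for _ in range(N_JOBS):
    simulated_download()
sequential = time.perf_counter() - t0

# Threaded: the waits overlap.
t0 = time.perf_counter()
threads = [Thread(target=simulated_download) for _ in range(N_JOBS)]
for t in threads:
    t.start()
for t in threads:
    t.join()      # wait until every thread has finished
threaded = time.perf_counter() - t0

print(f"sequential: {sequential:.2f} s, threaded: {threaded:.2f} s")
```

The sequential time should come out near `N_JOBS * DELAY` and the threaded time near `DELAY`, which mirrors the roughly threefold difference in the measurements above.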
