Now we are going to look at how Python can be used for concurrent programming. Even though Python is somewhat limited by the GIL (Global Interpreter Lock), it still has a basic level of concurrency support.

Concurrency is supported in Python via the following concepts:
1. `threads`
2. `asyncio`
3. `process`


We'll look at an example where downloading items from the internet can be made faster with threads.

In [4]:
import os
import time
import sys
from pathlib import Path

import requests

In [11]:
POP20_CC = ('CN IN US ID BR PK NG BD RU JP '
'MX PH VN ET EG DE IR TR CD FR').split()

In [10]:
BASE_URL = 'http://flupy.org/data/flags'

DEST_DIR = Path('download/')
if not DEST_DIR.exists():
    DEST_DIR.mkdir()
    print('download dir created!')

In [26]:
def save_flag(img, filename):
    path = DEST_DIR/filename
    with open(path, 'wb') as fp:
        fp.write(img)
        
def get_flag(cc):
    url = f"{BASE_URL}/{cc.lower()}/{cc.lower()}.gif"
    resp = requests.get(url)
    # Fail loudly on HTTP errors instead of saving an error page as a .gif
    resp.raise_for_status()
    return resp.content

def show(text):
    print(text, end=' ')
    sys.stdout.flush()
    
def download_many(cc_list):
    for cc in sorted(cc_list):
        img = get_flag(cc)
        show(cc)
        save_flag(img, cc.lower()+'.gif')
        
    return len(cc_list)

def main(download_many):
    t0 = time.time()
    count = download_many(POP20_CC)
    elapsed = time.time() - t0
    msg = '\n{} flags downloaded in {:.2f}s'
    print(msg.format(count, elapsed))
    

In [27]:
main(download_many)

BD BR CD CN DE EG ET FR ID IN IR JP MX NG PH PK RU TR US VN 
20 flags downloaded in 4.61s


### Downloading with concurrent.futures

The main features of `concurrent.futures` are `ThreadPoolExecutor` and `ProcessPoolExecutor`. These abstract the inner workings of threads and processes so we can work with a simple API.

In [22]:
from concurrent import futures
MAX_WORKERS = 20

def download_one(cc):
    img = get_flag(cc)
    show(cc)
    save_flag(img, cc.lower()+'.gif')
    return cc

def download_many(cc_list):
    workers = min(MAX_WORKERS, len(cc_list))
    with futures.ThreadPoolExecutor(workers) as executor:
        res = executor.map(download_one, sorted(cc_list))
        
    return len(list(res))

In [25]:
main(download_many)

BR BD EG FR ET CD IN DE ID MX NGRU  IRCN  PHJPTR   US VN PK 
20 flags downloaded in 0.59s


As you can see, there is a major speedup simply from using concurrency: the same 20 downloads drop from 4.61s to 0.59s.
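
One detail of `executor.map` is easy to miss in the jumbled output above: the country codes are printed in whatever order the threads finish, but `map` still yields results in the order of its input. The sketch below illustrates this with a hypothetical `slow_echo` function that uses `time.sleep` as a stand-in for a download; the delays are made up for the illustration.

```python
import time
from concurrent import futures

finished = []  # completion order, recorded as a side effect


def slow_echo(delay):
    """Hypothetical stand-in for a download: sleep, then return the delay."""
    time.sleep(delay)
    finished.append(delay)  # list.append is safe to call from several threads in CPython
    return delay


delays = [0.3, 0.1, 0.2]

with futures.ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, not completion order
    results = list(executor.map(slow_echo, delays))

print('completion order:', finished)
print('map order:       ', results)
```

The shortest delay normally completes first, yet `results` always matches the order of `delays`.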

### Where are the Futures?

Futures encapsulate pending operations so that they can be put in queues, their state of completion can be queried, and their results (or exceptions) can be retrieved when available.

They are similar to the `Promise` object in JavaScript.

Futures come in two flavors: `concurrent.futures.Future` and `asyncio.Future`. Both support `.done()`, `.add_done_callback()` and `.result()`.
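
As a minimal sketch of those three methods on a `concurrent.futures.Future`, the snippet below submits one task and inspects its future. The names `slow_square` and `on_done` are hypothetical helpers invented for this illustration.

```python
import time
from concurrent import futures

callback_log = []


def slow_square(n):
    """Hypothetical slow task: sleep briefly, then return n squared."""
    time.sleep(0.2)
    return n * n


def on_done(future):
    # Called with the future itself once it has finished
    callback_log.append(future.result())


with futures.ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(slow_square, 7)
    future.add_done_callback(on_done)
    was_done_early = future.done()  # non-blocking; very likely False while the task sleeps
    value = future.result()         # blocks until the result is available
    is_done_now = future.done()     # True once result() has returned

print(was_done_early, value, is_done_now, callback_log)
```

Note that `.done()` never blocks, while `.result()` waits for the task to finish (and re-raises any exception the task raised).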

Several functions in both libraries return futures; others use them in their implementations but most of these are hidden from the user.


To get a practical look at futures, we'll rewrite the example above to create and consume them explicitly.

In [30]:
def download_many(cc_list):
    cc_list = cc_list[:5]
    with futures.ThreadPoolExecutor(max_workers=3) as executor:
        to_do = []
        for cc in sorted(cc_list):
            future = executor.submit(download_one, cc)
            to_do.append(future)
            msg = f'Scheduled for {cc}: {future}'
            print(msg)
            
        results = []
        for future in futures.as_completed(to_do):
            res = future.result()
            print(f'{future} result :{res}')
            results.append(res)
            
        return len(results)

In [31]:
main(download_many)

Scheduled for BR: <Future at 0x7f68251fced0 state=running>
Scheduled for CN: <Future at 0x7f6824bb28d0 state=running>
Scheduled for ID: <Future at 0x7f6824b0c390 state=running>
Scheduled for IN: <Future at 0x7f682517ced0 state=pending>
Scheduled for US: <Future at 0x7f682517c690 state=pending>
CN BR <Future at 0x7f6824bb28d0 state=finished returned str> result :CN
<Future at 0x7f68251fced0 state=finished returned str> result :BRID 
<Future at 0x7f6824b0c390 state=finished returned str> result :ID
US <Future at 0x7f682517c690 state=finished returned str> result :US
IN <Future at 0x7f682517ced0 state=finished returned str> result :IN

5 flags downloaded in 0.58s


Strictly speaking, we are still running in a single process, and the GIL lets only one thread execute Python bytecode at a time. We still get a good boost because this is an I/O-bound operation.

For I/O-bound operations the Python interpreter releases the GIL, which means other threads can run while one is waiting. Even a function like `time.sleep()` releases the GIL.
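
A quick way to see this effect, sketched here with a hypothetical `nap` function and made-up durations, is to compare five sleeps run one after another against the same five sleeps run in five threads. If `time.sleep()` held the GIL, both versions would take about the same time.

```python
import time
from concurrent import futures


def nap(seconds):
    """Hypothetical blocking task: sleeping releases the GIL."""
    time.sleep(seconds)
    return seconds


NAPS = [0.2] * 5

# Sequential: the sleeps add up (roughly 5 x 0.2s)
t0 = time.perf_counter()
sequential = [nap(s) for s in NAPS]
sequential_elapsed = time.perf_counter() - t0

# Threaded: the sleeps overlap, so the total is close to a single sleep
t0 = time.perf_counter()
with futures.ThreadPoolExecutor(max_workers=len(NAPS)) as executor:
    threaded = list(executor.map(nap, NAPS))
threaded_elapsed = time.perf_counter() - t0

print(f'sequential: {sequential_elapsed:.2f}s')
print(f'threaded:   {threaded_elapsed:.2f}s')
```

Exact timings vary by machine, but the threaded run should take a fraction of the sequential one.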

However, if you want to leverage all the CPU cores, you can use `ProcessPoolExecutor`.

In [34]:
def download_many(cc_list):
    with futures.ProcessPoolExecutor() as executor:
        res = executor.map(download_one, sorted(cc_list))
    
    return len(list(res))

In [35]:
main(download_many)

BDBRCNCD    EGDEET   FR IDIN  JP IR MXNG  PK PH RU TRUS  VN 
20 flags downloaded in 1.85s


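As you may have noticed, this is not as effective as threads, mainly because `ProcessPoolExecutor` defaults to as many workers as the machine has CPUs, and my system has only 4, compared with the 20 threads used earlier. For other CPU intensive tasks 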