## Concurrency with Futures
This chapter focuses on the concurrent.futures library introduced in Python 3.2.

## A Sequential Download Script

In [6]:
import os
import time
import sys
import requests

POP20_CC = ('CN IN US ID BR PK NG BD RU JP MX PH VN ET EG DE IR TR CD FR').split()
BASE_URL = 'http://flupy.org/data/flags'
DEST_DIR = 'downloads/'

def save_flag(img, filename):
    path = os.path.join(DEST_DIR, filename)
    with open(path, 'wb') as fp:
        fp.write(img)

def get_flag(cc):
    url = '{}/{cc}/{cc}.gif'.format(BASE_URL, cc=cc.lower())
    resp = requests.get(url)
    # Fail loudly on HTTP errors instead of saving an error page as a .gif
    resp.raise_for_status()
    return resp.content

def show(text):
    print(text, end=' ')
    sys.stdout.flush()

def download_many(cc_list):
    for cc in sorted(cc_list):
        image = get_flag(cc)
        show(cc)
        save_flag(image, cc.lower() + '.gif')
    return len(cc_list)

def create_downloads_folder():
    # Create the same directory that save_flag writes to
    os.makedirs(DEST_DIR, exist_ok=True)

def main(download_many):
    create_downloads_folder()
    t0 = time.time()
    count = download_many(POP20_CC)
    elapsed = time.time() - t0
    msg = '\n{} flags downloaded in {:.2f}s'
    print(msg.format(count, elapsed))

if __name__ == '__main__':
    main(download_many)


BD BR CD CN DE EG ET FR ID IN IR JP MX NG PH PK RU TR US VN 
20 flags downloaded in 18.58s


This serves as a baseline for comparing the other scripts. 

## Downloading with concurrent.futures
The main features of the concurrent.futures package are the ThreadPoolExecutor and ProcessPoolExecutor classes.

In [8]:
from concurrent import futures

MAX_WORKERS = 20

def download_one(cc):
    image = get_flag(cc)
    show(cc)
    save_flag(image, cc.lower() + '.gif')
    return cc

def download_many(cc_list):
    workers = min(MAX_WORKERS, len(cc_list))
    with futures.ThreadPoolExecutor(workers) as executor:
        res = executor.map(download_one, sorted(cc_list))
    return len(list(res))

if __name__ == '__main__':
    main(download_many)

BD ID DEEGFR CN JPRUBR TR  MX INVNNG ET    IRPKUSCD PH      
20 flags downloaded in 1.42s


That's a substantial improvement in download speed. The easiest way to run the downloads concurrently is the ThreadPoolExecutor.map method, which download_many uses here.
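The scrambled country codes in the output above show the order in which the threads finished, not the order of the results. As a minimal sketch of that distinction, the example below uses a hypothetical `slow_echo` function (a stand-in for `download_one`, not part of the original script) to record both the completion order and the order in which `executor.map` hands back the results.

```python
import time
from concurrent import futures


def slow_echo(delay):
    """Hypothetical stand-in for download_one: wait, then return the input."""
    time.sleep(delay)
    return delay


def run_in_order(delays):
    completion_order = []

    def task(delay):
        result = slow_echo(delay)
        completion_order.append(result)  # records when each task finishes
        return result

    with futures.ThreadPoolExecutor(max_workers=len(delays)) as executor:
        # map() yields results in the order of the inputs,
        # regardless of which task finishes first
        results = list(executor.map(task, delays))
    return results, completion_order


results, completion_order = run_in_order([0.3, 0.2, 0.1])
print(results)           # same order as the input list
print(completion_order)  # typically the shortest delay first
```

Because `map` returns results in input order, one slow task can hold back results that are already available. That is harmless here, since `download_many` only counts them.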

## Where Are the Futures?
As of Python 3.4, there are two classes named Future in the standard library: concurrent.futures.Future and asyncio.Future. They serve the same purpose: an instance of either Future class represents a deferred computation that may or may not have completed. This is similar to the Deferred class in Twisted, the Future class in Tornado, and Promise objects in various JavaScript libraries.
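The scripts so far never touch a Future directly, because `executor.map` hides them. As a rough sketch of what happens underneath, the example below creates futures explicitly with `executor.submit` and collects them with `futures.as_completed`. The `square` function is a hypothetical stand-in for a slow task such as `download_one`.

```python
from concurrent import futures


def square(n):
    """Hypothetical stand-in for a slow task such as download_one."""
    return n * n


with futures.ThreadPoolExecutor(max_workers=3) as executor:
    # submit() schedules the callable and returns a Future right away
    to_do = {executor.submit(square, n): n for n in (2, 3, 4)}

    results = {}
    # as_completed() yields each future once its computation has finished
    for future in futures.as_completed(to_do):
        n = to_do[future]
        # result() returns immediately here because the future is done
        results[n] = future.result()

print(results)
```

Unlike `map`, this pattern delivers each result as soon as it is ready, at the cost of tracking which future belongs to which input.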

Strictly speaking, none of the concurrent scripts we tested so far can perform downloads in parallel. The concurrent.futures examples are limited by the GIL, and the flags_asyncio.py script is single-threaded.

## Blocking I/O and the GIL
The CPython interpreter is not thread-safe internally, so it has a Global Interpreter Lock (GIL), which allows only one thread at a time to execute Python bytecodes. That’s why a single Python process usually cannot use multiple CPU cores at the same time.

However, all standard library functions that perform blocking I/O release the GIL when waiting for a result from the OS. This means Python programs that are I/O bound can benefit from using threads at the Python level: while one Python thread is waiting for a response from the network, the blocked I/O function releases the GIL so another thread can run.
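A small experiment can make this visible without touching the network. The sketch below assumes that `time.sleep` is a fair substitute for blocking I/O, since it also releases the GIL while waiting. The names `blocking_wait` and `timed_waits` are illustrative only.

```python
import time
from concurrent import futures


def blocking_wait(seconds):
    time.sleep(seconds)  # releases the GIL while waiting, like blocking I/O
    return seconds


def timed_waits(n_tasks, seconds):
    """Run n_tasks blocking waits in threads and measure the total time."""
    t0 = time.perf_counter()
    with futures.ThreadPoolExecutor(max_workers=n_tasks) as executor:
        results = list(executor.map(blocking_wait, [seconds] * n_tasks))
    return results, time.perf_counter() - t0


results, elapsed = timed_waits(5, 0.2)
print('{} waits of 0.2s finished in {:.2f}s'.format(len(results), elapsed))
```

Run one after another, five waits of 0.2s would need at least a full second. With threads, the total should stay close to a single wait, because the threads spend that time blocked rather than holding the GIL.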

That’s why David Beazley says: 'Python threads are great at doing nothing.'

## Launching Processes with concurrent.futures
Both ProcessPoolExecutor and ThreadPoolExecutor implement the generic Executor interface, so it’s very easy to switch from a thread-based to a process-based solution using concurrent.futures.

In [12]:
import math
from concurrent import futures

MAX_WORKERS = 20

input_list = range(20)


def run_thread_pool(input_list):
    workers = min(MAX_WORKERS, len(input_list))
    with futures.ThreadPoolExecutor(workers) as executor:
        res = executor.map(math.sqrt, input_list)
    return res


def run_process_pool(input_list):
    with futures.ProcessPoolExecutor() as executor:
        res = executor.map(math.sqrt, input_list)
    return res


In [13]:
%timeit run_thread_pool(input_list)

18 ms ± 3.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [14]:
%timeit run_process_pool(input_list)

547 ms ± 136 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


There is overhead in using separate processes, and for a task as cheap as math.sqrt that overhead dominates the run time. The value of ProcessPoolExecutor is in CPU-intensive jobs. max_workers is an optional argument in ProcessPoolExecutor, and most of the time we don’t use it: the default is the number of CPUs returned by os.cpu_count().
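To see where a process pool could pay off, the task needs to be CPU-bound. The sketch below is one possible setup, not a benchmark: `count_primes_below` and `count_many` are hypothetical names, and the executor class is passed as a parameter to show how the shared Executor interface makes the two pools interchangeable.

```python
from concurrent import futures


def count_primes_below(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_many(limits, executor_cls=futures.ProcessPoolExecutor,
               max_workers=None):
    """Apply count_primes_below to each limit using the given executor class."""
    # max_workers=None lets the executor choose its own default
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(count_primes_below, limits))
```

In a script, call `count_many([200000] * 8)` under an `if __name__ == '__main__':` guard, since a process pool may re-import the main module in its workers on some platforms. Passing `futures.ThreadPoolExecutor` as `executor_cls` runs the same work in threads, which makes it easy to time both variants on inputs large enough to outweigh the process start-up cost.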

## Experimenting with Executor.map
