# CH 17 - Concurrency

## TOC<a id='toc'></a>
* [Ch17 Notes](#ch17_notes)

### CH17 Notes <a id='ch17_notes'></a>
[toc](#toc)
### Concurrency

* the `requests` library by Kenneth Reitz is more powerful and easier to use than the `urllib.request` module from the Python 3 standard library
    * is considered a model Pythonic API

## Downloading with concurrent.futures
* main features are `ThreadPoolExecutor` and `ProcessPoolExecutor` classes
    - implement interfaces that allow you to submit callables for execution in different threads or processes respectively
    - they manage an internal pool of threads or processes and a queue of tasks to be executed.

example code:
```
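from concurrent import futures

MAX_WORKERS = 20  # upper bound on the number of threads (value assumed here)


def download_many(cc_list):
    # download_one is assumed to be defined elsewhere; it handles one item
    workers = min(MAX_WORKERS, len(cc_list))
    with futures.ThreadPoolExecutor(workers) as executor:
        res = executor.map(download_one, sorted(cc_list))

    return len(list(res))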
```

* the `executor.__exit__` method calls `executor.shutdown(wait=True)`, which blocks until all threads are done.
* `executor.map` returns a generator that can be iterated over to retrieve the value returned by each function call, in the same order as the inputs.
    - if any function call raised an exception, that exception is raised when `next()` on the generator tries to retrieve the corresponding result
* very often, the body of a sequential `for` loop is refactored into a separate function so that it can be called either sequentially or concurrently.
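
The behavior of `executor.map` described above can be sketched as follows. This is a minimal illustration, not the chapter's download script: `slow_square` and `fail_on_three` are hypothetical stand-ins for a real task such as `download_one`.

```python
import time
from concurrent import futures


def slow_square(n):
    """Hypothetical task: sleep briefly, then return n squared."""
    time.sleep(0.05 * (5 - n))  # smaller n finishes later
    return n * n


def fail_on_three(n):
    """Hypothetical task that raises for one specific input."""
    if n == 3:
        raise ValueError("bad input: 3")
    return n


with futures.ThreadPoolExecutor(max_workers=4) as executor:
    # Results come back in the order of the inputs, not completion order.
    ordered = list(executor.map(slow_square, [1, 2, 3, 4]))

    # No exception is raised here, when the calls are scheduled...
    results = executor.map(fail_on_three, [1, 2, 3, 4])

collected = []
error = None
try:
    for value in results:  # ...only here, when the failing result is retrieved
        collected.append(value)
except ValueError as exc:
    error = exc

print(ordered)    # [1, 4, 9, 16]
print(collected)  # [1, 2]
print(error)      # bad input: 3
```

Once the exception has been raised the generator is exhausted, so the result for input `4` is never retrieved even though that call completed.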

## What are Futures?
* as of Python 3.4 there are two classes named `Future` in the standard library: `concurrent.futures.Future` and `asyncio.Future`
    - an instance of either class represents a deferred computation that may or may not have completed.
    - their state of completion can be queried, and their results (or exceptions) can be retrieved when available.
* You and I should not create them! They should be instantiated exclusively by the concurrency framework.
    - they are instantiated when some work is scheduled: scheduling a callable (for example with `Executor.submit()`) typically returns a future.
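* both have a non-blocking `.done()` method that reports whether the future is done, but usually you call `.add_done_callback()` with a callable instead of polling; the callable is invoked with the future as its only argument once the future is done.
* there is also a `.result()` method which, if the future is done, returns the result or re-raises the exception raised by the callable.
    - if the future is not done, `.result()` behaves very differently for the two classes: `concurrent.futures.Future.result()` blocks the calling thread and accepts an optional `timeout` argument; `asyncio.Future.result()` does not support a timeout and does not block (it raises `InvalidStateError` if the result is not available yet). With asyncio the preferred way to get the result is `yield from` (`await` in Python 3.5+), which doesn't work with `concurrent.futures.Future` instances.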
* Executor.map returns an iterator in which `__next__` calls the `result` method of each future.
* `concurrent.futures` has an `as_completed()` function that takes an iterable of futures and returns an iterator that yields futures as they are done.
    - they are yielded in whatever order they finish.
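
A minimal sketch of the `concurrent.futures.Future` methods listed above. The `work` function and its delays are hypothetical; the futures themselves are created by `executor.submit()`, never by hand.

```python
import time
from concurrent import futures


def work(delay):
    """Hypothetical task: sleep for `delay` seconds and return it."""
    time.sleep(delay)
    return delay


finished = []  # filled in by the callback as futures complete

with futures.ThreadPoolExecutor(max_workers=3) as executor:
    # submit() schedules the callable and returns a Future immediately.
    to_do = [executor.submit(work, d) for d in (0.3, 0.1, 0.2)]

    for future in to_do:
        # The callback receives the finished future as its only argument.
        future.add_done_callback(lambda f: finished.append(f.result()))

    # as_completed() yields futures in the order they finish,
    # so calling result() on each one returns without waiting.
    completion_order = [f.result() for f in futures.as_completed(to_do)]

print(completion_order)               # typically [0.1, 0.2, 0.3]
print(all(f.done() for f in to_do))   # True
print(to_do[0].result(timeout=1))     # 0.3 (already done, so no waiting)
```

The futures were submitted with delays `0.3, 0.1, 0.2`, but `as_completed()` hands them back shortest first, while `to_do` keeps the submission order.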

## Blocking I/O and the GIL
* The CPython interpreter is not thread-safe internally, so it has a Global Interpreter Lock (GIL), which allows only one thread at a time to execute Python bytecodes. That's why a single Python process usually cannot use multiple CPU cores at the same time.
    - when we write Python code, we have no control over the GIL, but a built-in function or an extension written in C can release the GIL. (This complicates the code of the library considerably, so most authors don't do it.)
* however, all standard library functions that perform blocking I/O release the GIL while waiting for a result from the OS. So while one Python thread is waiting for a response from the network (or for any other blocking I/O, such as a disk read), the blocked I/O function releases the GIL so another thread can run.
    - time.sleep() function also releases the GIL.
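
The effect of releasing the GIL can be observed with a small timing experiment. This is only a sketch: `blocking_call` is a hypothetical stand-in that uses `time.sleep()` in place of real network I/O, and the exact timings depend on the machine.

```python
import time
from concurrent import futures


def blocking_call(_):
    """Stand-in for blocking I/O: time.sleep() releases the GIL."""
    time.sleep(0.2)


start = time.perf_counter()
with futures.ThreadPoolExecutor(max_workers=5) as executor:
    list(executor.map(blocking_call, range(5)))
elapsed = time.perf_counter() - start

# Five 0.2 s waits run one after another would take about 1.0 s.
# Because each waiting thread releases the GIL, the waits overlap
# and the total stays close to 0.2 s.
print(f"elapsed: {elapsed:.2f}s")
```

If the function did CPU-bound work in pure Python instead of sleeping, the threads would take turns holding the GIL and the total time would not drop in the same way.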