# [1] [Concurrency](https://en.wikipedia.org/wiki/Concurrency_(computer_science)) in Python

A task in computer programming can be defined as a piece of work that the computer performs over a period of time. Tasks consume computer resources (memory, CPUs, I/O channels, etc.), which are limited. Each task uses these resources to a degree and in a pattern that depend on the task itself.

Tasks can be classified into two types, CPU-bound and I/O-bound:

1. I/O-bound tasks usually must wait for I/O communications. Waiting means that the task cannot make progress until some piece of data has arrived. During this time, it is a good idea to dedicate the unused resources to a different task that is not waiting. Switching resources between tasks can be done by the OS or by the programmer. The advantage of implementing the switching yourself is that communication between tasks is easier and there is more control over the resources.

2. CPU-bound tasks, on the flip side, are those that do not need to wait for I/O operations and keep the CPU working all the time (in fact, the execution time of the task is determined by the CPU resources available). In the ideal case, using N identical CPUs is expected to divide the running time by N. Notice that in this case no CPU switching is needed.

Concurrency is the act of running several tasks during the same interval of time:

1. If a task A is I/O-bound, the resources are reassigned to a different task B while A waits for the I/O. What the user sees is several tasks running concurrently, although only one of the tasks is actually active at any given moment.

2. If a task is CPU-bound, the only way to speed it up is to dedicate several CPUs to it at the same time, in [parallel](https://en.wikipedia.org/wiki/Parallel_computing).

Notice that it is also possible to use parallelism for I/O-bound tasks (see [channel bonding](https://www.makeuseof.com/tag/internet-speed-channel-bonding/)).

## Concurrency solutions in Python

Python provides three alternatives for running more than one task concurrently: threads, processes, and coroutines.

1. [threading](https://docs.python.org/3/library/threading.html). A [thread](https://en.wikipedia.org/wiki/Thread_(computing)) is a sequence of instructions that is typically executed concurrently (and in parallel, if the *computing context* allows it) with other threads [of the same process space](https://pymotw.com/3/threading/index.html#module-threading). However, [CPU-bound tasks are not a good fit for Python threads](http://eli.thegreenplace.net/2011/12/27/python-threads-communication-and-stopping), because the Global Interpreter Lock (GIL) used in [CPython](https://en.wikipedia.org/wiki/CPython) allows only one thread to execute Python code at a time. Concurrency based on threads is also called *preemptive multitasking*. In Python, threads are useful for solving I/O-bound problems.

2. [multiprocessing](https://docs.python.org/3/library/multiprocessing.html). Parallel computations in Python should be done in multiple [processes](https://en.wikipedia.org/wiki/Process_%28computing%29), not threads, because processes allow the parallel execution of tasks ([multiprocessing](https://en.wikipedia.org/wiki/Multiprocessing)) if enough computing resources are available.
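
As a minimal sketch of running CPU-bound work in several processes, the following uses `concurrent.futures.ProcessPoolExecutor` from the standard library. The function names `count_primes` and `count_primes_in_parallel`, the trial-division workload, and the default of 4 workers are assumptions chosen only for illustration.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it.
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def count_primes_in_parallel(limits, workers=4):
    """Run one count_primes() call per limit, spread over worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the same order as `limits`.
        return list(pool.map(count_primes, limits))
```

When used from a script, the call (for example `count_primes_in_parallel([50_000, 60_000])`) should sit under an `if __name__ == "__main__":` guard, because some platforms start the worker processes by importing the main module again.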



### [Coroutines](https://docs.python.org/3/library/asyncio-task.html#coroutines)

A [coroutine](https://en.wikipedia.org/wiki/Coroutine) is a function that voluntarily gives up control of the CPU (*yield*s) to a different coroutine, by explicitly indicating where to stop and, if necessary, which task to wake up. Coroutines remember their running context when they resume. Coroutines are also called cooperative tasks, and therefore concurrency based on coroutines is also called *cooperative multitasking*.
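
The following is a small sketch of cooperative multitasking with `asyncio`. The names `worker` and `main` and the shared `log` list are hypothetical; the point is only to show that each `await` is an explicit place where a coroutine hands control back so that another one can run.

```python
import asyncio


async def worker(name, steps, log):
    """A coroutine that gives up control after every step."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        # `await` marks the explicit point where this coroutine pauses
        # and lets the event loop resume another one.
        await asyncio.sleep(0)


async def main():
    log = []
    # Run both coroutines concurrently in a single thread.
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The entries logged by `A` and `B` interleave, even though only one coroutine is running at any moment and no threads are involved.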

## When to use them (especially in Python)
In general, threads and processes are useful when the problem to solve contains blocking instructions (basically, I/O operations that wait for a data transfer to complete, such as the arrival of a packet on a socket). Coroutines are interesting when the programmer explicitly specifies the points in the code where execution must be transferred between tasks (coroutines).
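
To illustrate the blocking case, here is a hedged sketch in which `time.sleep()` stands in for a blocking I/O call. The names `fetch` and `run_in_threads` and the delays are assumptions for the example, not part of any real API.

```python
import threading
import time


def fetch(name, delay, results):
    """Simulate a blocking I/O call (e.g. waiting for a packet)."""
    time.sleep(delay)  # this thread blocks here; the others keep running
    results[name] = f"{name} done"


def run_in_threads(jobs):
    """Start one thread per (name, delay) job and wait for all of them."""
    results = {}
    threads = [
        threading.Thread(target=fetch, args=(name, delay, results))
        for name, delay in jobs
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads([("a", 0.2), ("b", 0.2), ("c", 0.2)])
    print(results, f"{elapsed:.2f}s")
```

Because the threads wait at the same time, the total elapsed time should be close to the longest single delay rather than the sum of all delays.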

## Threads vs coroutines in Python
In Python, the GIL allows only one thread to run at a time. Moreover, the overhead of the OS switching between threads at arbitrary, fine-grained points is higher than the negligible overhead of switching between coroutines. Therefore, when the problem is divided into subtasks (coroutines) and there is a dependence between them (some must wait for data that the others provide), coroutines are more efficient. Another advantage of coroutines is that no support from the OS is needed to implement concurrency.
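
A dependence between subtasks can be sketched with an `asyncio.Queue`: one coroutine produces data and another waits for it. The names `producer`, `consumer`, and `pipeline`, the doubling step, and the `None` sentinel are illustrative assumptions.

```python
import asyncio


async def producer(queue, items):
    """Put each item on the queue, then a sentinel to signal the end."""
    for item in items:
        await queue.put(item)  # suspends while the queue is full
    await queue.put(None)  # sentinel: no more data


async def consumer(queue):
    """Wait for data from the producer and process it as it arrives."""
    received = []
    while True:
        item = await queue.get()  # suspends until data is available
        if item is None:
            return received
        received.append(item * 2)


async def pipeline(items):
    # maxsize=1 forces the two coroutines to alternate.
    queue = asyncio.Queue(maxsize=1)
    _, doubled = await asyncio.gather(producer(queue, items), consumer(queue))
    return doubled


if __name__ == "__main__":
    print(asyncio.run(pipeline([1, 2, 3])))
```

Control passes between the two coroutines only at the `await` points, so no locks are needed around the shared queue.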

## What happens with the process space
Threads and coroutines share the same process space (the same memory segments), and therefore it is trivial to share data between threads and coroutines. However, processes by definition do not share their process spaces. To overcome this limitation, the Python Library provides data-sharing solutions between processes that are run in the same Python module. In any case, interprocess communication is slower and more difficult to implement than its thread/coroutine counterparts.
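
The difference can be sketched as follows, assuming `multiprocessing.Queue` as one of the data-sharing mechanisms the standard library offers. The function names are hypothetical: the thread version writes directly into an ordinary list, while the process version has to send its results back through a queue.

```python
import multiprocessing
import threading


def append_squares(numbers, shared_list):
    """Thread worker: writes straight into memory shared with the caller."""
    for n in numbers:
        shared_list.append(n * n)


def squares_with_thread(numbers):
    results = []  # an ordinary list, visible to both threads
    t = threading.Thread(target=append_squares, args=(numbers, results))
    t.start()
    t.join()
    return results


def send_squares(numbers, out_queue):
    """Process worker: has no shared memory, so it sends results back."""
    for n in numbers:
        out_queue.put(n * n)


def squares_with_process(numbers):
    out_queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=send_squares, args=(numbers, out_queue))
    p.start()
    # Read the results before join() so the child is never left blocked
    # on a queue that nobody drains.
    results = [out_queue.get() for _ in numbers]
    p.join()
    return results
```

As with any use of `multiprocessing`, a script calling `squares_with_process()` should do so under an `if __name__ == "__main__":` guard.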