# General idea

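Concurrent programming is different from parallel programming. Parallelism means executing multiple tasks at literally the same time, for example on two different CPU cores. Concurrency means dealing with multiple tasks over the same period of time, which can be done even on a single core by queuing the tasks and switching between them. In Rob Pike's words, concurrency is about dealing with lots of things at once, while parallelism is about doing lots of things at once. In that sense, parallelism is a special case of concurrency.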

Creating a thread or a process is not cheap, so we avoid starting a new one for every task. Instead, we usually keep a thread in an infinite loop waiting for data to process, which makes it a worker. For this to work, we need a way to send and receive data (messages) between threads.
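
The worker idea can be sketched with the standard `queue` and `threading` modules. This is only a minimal illustration: the names `worker` and `run_worker`, the squaring task, and the use of `None` as a stop signal are assumptions made for the example, not part of any particular design.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Loop forever, squaring each number received, until a stop signal arrives."""
    while True:
        item = inbox.get()      # blocks until a message is available
        if item is None:        # None is our (arbitrary) stop signal
            break
        outbox.put(item * item)


def run_worker(numbers):
    """Send numbers to a single worker thread and collect its results."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)             # ask the worker to leave its loop
    thread.join()               # wait until the worker has finished
    results = []
    while not outbox.empty():   # safe here: no other thread uses outbox anymore
        results.append(outbox.get())
    return results


print(run_worker([1, 2, 3]))  # [1, 4, 9]
```

The thread is created once and reused for every message, so the cost of starting it is paid only one time.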

# The jargon

- Concurrency: The ability to deal with multiple pending tasks, making progress on them one at a time or in parallel. Also called multitasking.
- Parallelism: The ability to execute multiple operations at the same time. Requires a CPU with multiple cores, multiple CPUs, or a cluster of machines.
- Execution units: Objects that run code concurrently, each with its own state and call stack. Python natively supports three kinds: processes, threads, and coroutines.
- Process: An instance of a program in execution, using memory and a slice of the CPU's time. A modern operating system routinely manages hundreds of processes, each with its own private memory space. Processes talk to each other through pipes, sockets, or memory-mapped files. All of these channels carry only raw bytes, so Python objects must be serialized before they are sent. Serialization is not cheap, and not every Python object can be serialized. A process can spawn subprocesses, which are isolated from the parent process and from each other.
- Thread: An execution unit inside a process. A process starts with a single main thread, which can call the OS API to create more threads. All threads of a process share the same memory space, so objects can be shared between them without serialization. This makes sharing data easy, but it can also cause data corruption when several threads update the same object at the same time.
- Coroutine: A function that can suspend its own execution and resume later. Classic coroutines are built from generator functions and suspend with the keyword `yield`, while native coroutines are defined with `async def` and suspend with `await`. Coroutines usually run in a single thread, under the supervision of an event loop. Because they must give up control voluntarily, any blocking code in a coroutine blocks the event loop and every other coroutine, unlike threads and processes, which the OS can interrupt at any time. On the other hand, a coroutine consumes fewer resources than a thread or a process.
- Queue: A data structure, usually first in, first out (FIFO), that execution units use to exchange data.
- Lock: An object that execution units use to synchronize access to shared data. While one unit holds the lock, the others must wait before reading or writing that data. The most common kind is the mutex (mutual exclusion).
- Contention: Conflict between multiple execution units that need to access the same resource.
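
To make the two kinds of coroutine from the list above concrete, here is a small sketch. The function names (`averager`, `fetch`, `main`) and the delays are made up for illustration. `asyncio.sleep()` stands in for any non-blocking wait.

```python
import asyncio


def averager():
    """Classic coroutine: a generator that receives values through send()."""
    total, count, average = 0.0, 0, None
    while True:
        value = yield average   # suspend here until the caller sends a value
        total += value
        count += 1
        average = total / count


async def fetch(name: str, delay: float, log: list) -> str:
    """Native coroutine: each await hands control back to the event loop."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # non-blocking wait; time.sleep() would stall everyone
    log.append(f"{name} done")
    return name


async def main() -> list:
    log = []
    # Both coroutines run in the same thread, interleaved by the event loop.
    await asyncio.gather(fetch("slow", 0.2, log), fetch("fast", 0.05, log))
    return log


avg = averager()
next(avg)               # prime the generator: run it up to the first yield
print(avg.send(10))     # 10.0
print(avg.send(20))     # 15.0

print(asyncio.run(main()))
# ['slow start', 'fast start', 'fast done', 'slow done']
```

The log shows that `fast` finishes while `slow` is still suspended, even though only one thread is involved.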

# How does it work in Python?

1. Every instance of the Python interpreter is a process. You can start new Python processes with the `multiprocessing` or `concurrent.futures` libraries. The `subprocess` library is designed to launch processes that run external programs.
2. The Python interpreter uses a single thread to run both the user code and the garbage collector. You can start new threads with the `threading` or `concurrent.futures` libraries.
3. Access to object reference counts and other internal interpreter state is controlled by the GIL (Global Interpreter Lock). At any moment, only one thread can hold the GIL, i.e., only one thread can execute Python code, regardless of how many cores the CPU has.
4. By default, the interpreter pauses the running thread and releases the GIL every 5 ms, to prevent any thread from holding it indefinitely.
5. Python code has no direct control over the GIL, but code written against Python's C API, such as a built-in function or a C extension, can release it.
6. Every standard library function that performs a syscall (a call to a kernel function) releases the GIL. This includes disk reads and writes, network transfers, and `time.sleep()`. Many CPU-intensive functions in libraries such as NumPy and SciPy also release the GIL.
7. Extensions that use the C API can also start non-Python threads, which are not affected by the GIL.
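8. The GIL has little effect on network programming with threads, because I/O functions release the GIL and network latency is far higher than the latency of reading or writing memory.
9. Contention over the GIL slows down Python threads that do intensive processing. For this type of task, sequential code in a single thread is simpler and faster.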
10. To run CPU-intensive Python code on multiple cores, you need to use multiple processes.
11. The GIL does not affect coroutines, because by default they all run in the same thread as the event loop that supervises them.
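
Two of the points above can be checked from plain Python: the switch interval and the fact that `time.sleep()` releases the GIL. The sketch below is a rough experiment, not a benchmark. The helper `timed_naps` and the chosen numbers are assumptions for illustration, and the exact timing will vary by machine.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_naps(n_threads: int, nap: float) -> float:
    """Call time.sleep(nap) in n_threads threads; return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Each sleeping thread releases the GIL, so the naps overlap.
        list(pool.map(time.sleep, [nap] * n_threads))
    return time.perf_counter() - start


# Interval after which the interpreter asks the running thread to release the GIL.
# It is 0.005 (5 ms) unless changed with sys.setswitchinterval().
print(sys.getswitchinterval())

elapsed = timed_naps(4, 0.2)
# Expect a value close to 0.2 s rather than 4 * 0.2 = 0.8 s.
print(f"4 naps of 0.2 s took {elapsed:.2f} s")
```

If the GIL were held during the sleep, the four naps would run one after another. Replacing `time.sleep` with a pure-Python CPU-bound function should remove most of the overlap, which is the case where multiple processes are the better tool.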