# Concurrency and Parallelism
Take, for instance, a baker baking two different cakes. To bake these
cakes, we need to preheat our oven. Preheating can take tens of minutes depending on the oven and the baking temperature, but we don’t need to wait for our oven to
preheat before starting other tasks, such as mixing the flour and sugar together with
eggs. We can do other work until the oven beeps, letting us know it is preheated.

We also don’t need to finish the first cake before starting work on the second.
We can start one cake batter, put it in a stand mixer, and start preparing the second batter while the first batter finishes mixing. In this model, we’re
switching between different tasks concurrently. This switching between tasks (doing
something else while the oven heats, switching between two different cakes) is concurrent behavior.
<img src="PdfIamage.png" height="500">
With concurrency, we have multiple tasks happening at the same time, but only
one we’re actively doing at a given point in time. With parallelism, we have multiple tasks
happening and are actively doing more than one simultaneously.
<img src="PdfImage.png" height="500">
 With concurrency, we switch between running two applications. With
parallelism, we actively run two applications simultaneously.


## Preemptive multitasking
In this model, we let the operating system decide when to switch between the pieces of
work being executed, via a process called time slicing. When the operating system
switches between work, we call it preempting.

How this mechanism works under the hood is up to the operating system itself. It is
primarily achieved by using either multiple threads or multiple processes.
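
As a rough sketch of preemptive multitasking with threads, the example below starts two threads that never hand over control themselves. The names `count_up`, `log`, and `run_two_threads` are illustrative assumptions, not from the text. When each thread runs is decided for us, so the interleaving recorded in `log` can differ from run to run.

```python
import threading

log: list[str] = []  # shared record of which thread ran each step


def count_up(name: str, steps: int) -> None:
    # No explicit hand-off here: the thread may be preempted at any point.
    for i in range(steps):
        log.append(f"{name}:{i}")


def run_two_threads(steps: int = 1000) -> list[str]:
    log.clear()
    threads = [
        threading.Thread(target=count_up, args=(name, steps))
        for name in ("A", "B")
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return list(log)


if __name__ == "__main__":
    entries = run_two_threads()
    # Every step from both threads is recorded; only the order is unspecified.
    print(len(entries), "entries recorded")
```

Only the total number of entries is predictable here; the order in which `A` and `B` entries appear is not something the code controls.
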
## Cooperative multitasking
In this model, instead of relying on the operating system to decide when to switch
between tasks, we explicitly code points in our
application where we can let other tasks run. The tasks in our application operate in a
model where they cooperate, explicitly saying, “I’m pausing my task for a while; go ahead
and run other tasks.” asyncio uses cooperative multitasking to achieve concurrency.
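
A minimal sketch of explicit pause points, assuming two hypothetical coroutines named `worker` that share a `log` list. Each `await asyncio.sleep(0)` is the task saying "go ahead and run other tasks"; without it, the first worker would finish all its steps before the second started.

```python
import asyncio


async def worker(name: str, steps: int, log: list[str]) -> None:
    for i in range(steps):
        log.append(f"{name}:{i}")
        # Explicit pause point: hand control back so other tasks can run.
        await asyncio.sleep(0)


async def main() -> list[str]:
    log: list[str] = []
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


def run_cooperative() -> list[str]:
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_cooperative())  # ['A:0', 'B:0', 'A:1', 'B:1']
```

Because the switches happen only at the `await`, the interleaving is decided by the code rather than by the operating system.
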


# Process
A process is a running instance of an application with its own memory space that other
applications cannot access. An example of creating a Python process would be running a simple “hello
world” application or typing python at the command line to start up the REPL
(read-eval-print loop). Multiple processes can run on a single machine. If we are on a machine that has a
CPU with multiple cores, we can execute multiple processes at the same time. If we
are on a CPU with only one core, we can still have multiple applications running
concurrently, through time slicing.
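
To make "creating a Python process" concrete, here is a small sketch that starts a second interpreter with the standard `subprocess` module and compares process IDs. The helper name `child_pid` is an assumption for illustration. It shows only that the two interpreters are separate processes; it does not demonstrate memory isolation directly.

```python
import os
import subprocess
import sys


def child_pid() -> int:
    # Start a second Python process and ask it for its own process ID.
    completed = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(completed.stdout.strip())


if __name__ == "__main__":
    print("parent process:", os.getpid())
    print("child process: ", child_pid())
```
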

# Thread
Threads can be thought of as lighter-weight processes. In addition, they are the smallest construct that can be managed by an operating system. They do not have their own
memory as does a process; instead, they share the memory of the process that created
them. Threads are associated with the process that created them. A process will always
have at least one thread associated with it, usually known as the main thread. A process
can also create other threads, which are more commonly known as worker or background threads.

When we run a normal Python application, we create a process as well as a main
thread that will be responsible for running our Python application.
<img src="process-vs-thread3.png" height="500">
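
The shared-memory point can be sketched as follows, assuming a hypothetical module-level list `shared` and a worker thread named `worker-1`. The worker appends to the very same list object that the main thread wrote to, which is possible because both threads live in one process.

```python
import threading

shared: list[str] = []  # lives in the process's memory, visible to all its threads


def worker() -> None:
    # The worker writes to the same list object the main thread uses.
    shared.append(threading.current_thread().name)


def run_worker() -> list[str]:
    shared.clear()
    shared.append(threading.main_thread().name)
    thread = threading.Thread(target=worker, name="worker-1")
    thread.start()
    thread.join()
    return list(shared)


if __name__ == "__main__":
    print(run_worker())  # ['MainThread', 'worker-1']
```
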



# Understanding the global interpreter lock
The global interpreter lock, abbreviated GIL and pronounced gill, is a controversial
topic in the Python community. Briefly, the GIL prevents one Python process from
executing more than one Python bytecode instruction at any given time. This means
that even if we have multiple threads on a machine with multiple cores, a Python
process can have only one thread running Python code at a time. In a world where we
have CPUs with multiple cores, this can pose a significant challenge for Python developers looking to take advantage of multithreading to improve the performance of
their application.

<span style="font-weight:bold;">NOTE</span> Multiprocessing can run multiple bytecode instructions concurrently
because each Python process has its own GIL.



So why does the GIL exist? The answer lies in how memory is managed in CPython. In
CPython, memory is managed primarily by a process known as reference counting.
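
Reference counting can be observed from Python with `sys.getrefcount`. The sketch below is CPython-specific, and the helper name `refcount_delta` is an assumption. Absolute counts vary between interpreter versions, so it reports only how the count changes when a second name is bound to an object and then removed.

```python
import sys


def refcount_delta() -> tuple[int, int]:
    data = []                        # one name refers to the new list
    before = sys.getrefcount(data)
    alias = data                     # a second name now refers to the same list
    with_alias = sys.getrefcount(data)
    del alias                        # dropping the name releases that reference
    after = sys.getrefcount(data)
    # Report changes relative to the starting count.
    return with_alias - before, after - before


if __name__ == "__main__":
    print(refcount_delta())  # (1, 0) on CPython
```

When an object's count drops to zero, CPython can free its memory, which is why an incorrect count is dangerous.
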
<br>

<span style="font-weight:bold;">What is CPython?</span>
CPython is the reference implementation of Python. By reference implementation we
mean it is the standard implementation of the language and is used as the reference
for proper behavior of the language. There are other implementations of Python such
as Jython, which is designed to run on the Java Virtual Machine, and IronPython,
which is designed for the .NET framework.


The conflict with threads arises because the reference-counting implementation in CPython is not
thread-safe. When we say CPython is not thread-safe, we mean that if two or more threads modify a shared variable, that variable may end in an unexpected state. This unexpected
state depends on the order in which the threads access the variable, commonly known
as a race condition. Race conditions can arise when two threads need to reference a
Python object at the same time.

As shown in figure 1.6, if two threads increment the reference count at one time,
we could face a situation where one thread causes the reference count to be zero
when the object is still in use by the other thread.
<img src="GIL_benefit.png" height="500">


A race condition where two threads try to increment a reference count simultaneously. Instead of an expected count of two, we get one.
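
The lost update in the caption can be replayed step by step. This is a single-threaded simulation, not real threading: the variables `thread_a_read` and `thread_b_read` are illustrative stand-ins for the value each thread would have read before either wrote its result back.

```python
def simulate_lost_update() -> int:
    ref_count = 0

    # Both "threads" read the current value before either writes back.
    thread_a_read = ref_count
    thread_b_read = ref_count

    # Each writes its stale value plus one; the second write overwrites the first.
    ref_count = thread_a_read + 1
    ref_count = thread_b_read + 1
    return ref_count


if __name__ == "__main__":
    print(simulate_lost_update())  # 1, although two increments were performed
```
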

The following listing times two CPU-bound Fibonacci calculations run one after the other.

```python
import time


def print_fib(number: int) -> None:
    # Naive recursive Fibonacci: deliberately slow, CPU-bound work.
    def fib(n: int) -> int:
        if n == 1:
            return 0
        elif n == 2:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

    print(f'fib({number}) is {fib(number)}')


def fibs_no_threading():
    print_fib(40)
    print_fib(41)


start = time.time()
fibs_no_threading()
end = time.time()
print(f'Completed in {end - start:.4f} seconds.')
```

```
fib(40) is 63245986
fib(41) is 102334155
Completed in 38.2830 seconds.
```


The same two calculations, each run in its own thread:

```python
import threading
import time


def print_fib(number: int) -> None:
    def fib(n: int) -> int:
        if n == 1:
            return 0
        elif n == 2:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

    # Call fib inside the thread so each thread does real CPU-bound work.
    print(f'fib({number}) is {fib(number)}')


def fibs_with_threads():
    fortieth_thread = threading.Thread(target=print_fib, args=(40,))
    forty_first_thread = threading.Thread(target=print_fib, args=(41,))
    fortieth_thread.start()
    forty_first_thread.start()
    # Wait for both threads to finish before stopping the clock.
    fortieth_thread.join()
    forty_first_thread.join()


start_threads = time.time()
fibs_with_threads()
end_threads = time.time()
print(f'Threads took {end_threads - start_threads:.4f} seconds.')
```



Our threaded version took almost the same amount of time. In fact, it was even a little
slower! This is almost entirely due to the GIL and the overhead of creating and managing threads. While it is true the threads run concurrently, only one of them is allowed to
run Python code at a time due to the lock. While one thread holds the lock, the other
waits for it, which negates the value of multiple threads for CPU-bound work.


The global interpreter lock is released when I/O operations happen. This lets us
employ threads to do concurrent work when it comes to I/O, but not for CPU-bound
Python code itself (there are some notable exceptions that release the GIL for CPU-bound work in certain circumstances, and we’ll look at these in a later chapter). To
illustrate this, let’s use an example of reading the status code of a web page.
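
A self-contained sketch of that idea follows. So that it runs without internet access, it assumes a stand-in "web page": a local `http.server` instance that waits `DELAY` seconds before answering. `SlowHandler`, `read_status`, `compare`, and `DELAY` are all illustrative names, not part of the original example. It fetches the status code twice in sequence and then twice from separate threads, using only the standard library.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

DELAY = 0.2  # seconds the stand-in server waits before replying (assumption)


class SlowHandler(BaseHTTPRequestHandler):
    """Stand-in for a remote web page: waits, then answers 200 OK."""

    def do_GET(self) -> None:
        time.sleep(DELAY)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format: str, *args) -> None:
        pass  # silence per-request logging


# Skip any system proxy so requests go straight to the local server.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def read_status(url: str, statuses: list, index: int) -> None:
    # Blocking network I/O: CPython releases the GIL while waiting here.
    with opener.open(url, timeout=5) as response:
        statuses[index] = response.status


def compare(requests: int = 2) -> dict:
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    try:
        sequential = [None] * requests
        start = time.perf_counter()
        for i in range(requests):
            read_status(url, sequential, i)
        sequential_time = time.perf_counter() - start

        threaded = [None] * requests
        threads = [
            threading.Thread(target=read_status, args=(url, threaded, i))
            for i in range(requests)
        ]
        start = time.perf_counter()
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        threaded_time = time.perf_counter() - start
    finally:
        server.shutdown()
        server.server_close()
    return {
        "sequential": sequential,
        "threaded": threaded,
        "sequential_time": sequential_time,
        "threaded_time": threaded_time,
    }


if __name__ == "__main__":
    result = compare()
    print(f"Sequential: {result['sequential']} in {result['sequential_time']:.2f} s")
    print(f"Threaded:   {result['threaded']} in {result['threaded_time']:.2f} s")
```

With the waits overlapping, the threaded run would be expected to finish in roughly the time of one request rather than two, though exact timings depend on the machine.
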


# asyncio and the GIL
asyncio exploits the fact that I/O operations release the GIL to give us concurrency,
even with only one thread. When we utilize asyncio we create objects called coroutines.
A coroutine can be thought of as a lightweight thread. Much like we can
have multiple threads running at the same time, each with their own concurrent I/O
operation, we can have many coroutines running alongside one another. While we are
waiting for our I/O-bound coroutines to finish, we can still execute other Python
code, thus giving us concurrency. It is important to note that asyncio does not circumvent the GIL, and we are still subject to it. If we have a CPU-bound task, we still need
to use multiple processes to execute it concurrently (which can be done with asyncio
itself); otherwise, we will cause performance issues in our application. Now that we
know it is possible to achieve concurrency for I/O with only a single thread, let’s dive
into the specifics of how this works with non-blocking sockets.
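
As a small sketch of single-threaded concurrency, the example below runs two coroutines whose waits overlap. `asyncio.sleep` stands in for waiting on a non-blocking socket, and the names `fake_io` and `run_single_threaded` are assumptions for illustration. Each coroutine also reports the thread it ran on, which shows that no extra threads are involved.

```python
import asyncio
import threading
import time


async def fake_io(name: str, delay: float) -> tuple[str, str]:
    # asyncio.sleep stands in for waiting on a non-blocking socket.
    await asyncio.sleep(delay)
    return name, threading.current_thread().name


async def main() -> tuple[list[tuple[str, str]], float]:
    start = time.perf_counter()
    # Both waits are in flight at once, so they overlap instead of adding up.
    results = await asyncio.gather(fake_io("first", 0.2), fake_io("second", 0.2))
    return list(results), time.perf_counter() - start


def run_single_threaded() -> tuple[list[tuple[str, str]], float]:
    return asyncio.run(main())


if __name__ == "__main__":
    results, elapsed = run_single_threaded()
    print(results)
    print(f"Elapsed: {elapsed:.2f} s")
```

Both results carry the same thread name, and the elapsed time would be expected to sit near one delay rather than the sum of both.
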

Socket
<img src="socket-thread.png" height="500">


## How an event loop works
An event loop is at the heart of every asyncio application. Event loops are a fairly common design pattern in many systems and have existed for quite some time. Event loops run asynchronous tasks and callbacks, perform network I/O operations, and run subprocesses.

In asyncio, the event loop keeps a queue of tasks instead of messages. Tasks are wrappers around a coroutine. A coroutine can pause execution when it hits an I/O-bound
operation and will let the event loop run other tasks that are not waiting for I/O operations to complete.

When we create an event loop, we create an empty queue of tasks. We can then add
tasks into the queue to be run. Each iteration of the event loop checks for tasks that need
to be run and will run them one at a time until a task hits an I/O operation. At that time
the task will be “paused,” and we instruct our operating system to watch any sockets for
I/O to complete. On every iteration of the
event loop, we’ll check to see if any of our I/O has completed; if it has, we’ll “wake up”
any tasks that were paused and let them finish running.
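
The loop described above can be sketched in a few dozen lines. This is a toy model, not asyncio's actual implementation: tasks are plain generators that `yield` the socket they are waiting on, and the standard `selectors` module asks the operating system which sockets are readable. The names `reader`, `run_loop`, and `demo` are assumptions for illustration.

```python
import selectors
import socket
from collections import deque
from typing import Generator, Optional


def reader(name: str, sock: socket.socket, log: list,
           reply_to: Optional[socket.socket] = None) -> Generator:
    log.append(f"{name}: waiting")
    yield sock  # pause here until the loop reports that sock is readable
    message = sock.recv(1024).decode()
    log.append(f"{name}: received {message}")
    if reply_to is not None:
        reply_to.sendall(b"pong")  # this makes another task's socket readable


def run_loop(tasks: list) -> None:
    ready = deque(tasks)  # tasks that can run right now
    selector = selectors.DefaultSelector()
    paused = 0
    while ready or paused:
        # Run each ready task until it pauses on I/O or finishes.
        while ready:
            task = ready.popleft()
            try:
                sock = next(task)
            except StopIteration:
                continue  # task finished
            selector.register(sock, selectors.EVENT_READ, task)
            paused += 1
        # Ask the OS which watched sockets have data, then wake those tasks.
        if paused:
            for key, _ in selector.select():
                selector.unregister(key.fileobj)
                paused -= 1
                ready.append(key.data)
    selector.close()


def demo() -> list:
    log: list = []
    a_recv, a_send = socket.socketpair()
    b_recv, b_send = socket.socketpair()
    for sock in (a_recv, b_recv):
        sock.setblocking(False)
    b_send.sendall(b"ping")  # data for task-2 is already waiting
    run_loop([
        reader("task-1", a_recv, log),
        reader("task-2", b_recv, log, reply_to=a_send),
    ])
    for sock in (a_recv, a_send, b_recv, b_send):
        sock.close()
    return log


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In the demo, both tasks pause on their sockets, `task-2` wakes first because its data has already arrived, and its reply is what later wakes `task-1`.
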