# Process and Thread

A process and a thread are both units of execution, but they differ in how they use memory and resources, how they communicate, and how much overhead they incur.

## Definition

- A **process** is an instance of a program that runs in its own memory space and has its own resources.
- A **thread** is a unit of execution inside a process. It shares the memory space and resources of that process with the other threads of the same process.

## Memory and Resources

- A process has its own **address space**, which is the range of memory locations that it can access.
- A thread shares the address space of the process it belongs to, which means that it can access the same variables and data structures as other threads of the same process. Each thread still has its own stack and program counter.
- A process has its own **resources**, such as file descriptors, sockets, and pipes, that it can use to interact with the system or other processes.
- A thread shares the resources of the process it belongs to, which means that it can use the same files, sockets, and pipes as other threads of the same process.
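
The sharing described above can be illustrated with a minimal sketch. The names `collect_from_threads` and `worker` are hypothetical, chosen only for this example. Every thread appends to the same list and reports the same process ID, which is consistent with all of them living in one address space.

```python
import os
import threading


def collect_from_threads(count=3):
    """Start `count` threads that all write into one shared list."""
    shared_items = []          # one object, visible to every thread
    lock = threading.Lock()    # serializes the appends

    def worker(name):
        # Each thread records its name and the ID of the process it runs in
        with lock:
            shared_items.append((name, os.getpid()))

    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}",))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared_items)


items = collect_from_threads()
print([name for name, _ in items])
print("All threads ran in one process:", {pid for _, pid in items} == {os.getpid()})
```

A separate process would get its own copy of such a list, so changes made there would not be visible here without explicit inter-process communication.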

## Communication

- A process is **isolated** from other processes, which means that it cannot directly access or modify the memory or resources of another process.
- A thread can **communicate** with other threads of the same process through shared variables, usually coordinated with synchronization primitives such as locks and semaphores, or through thread-safe queues.
- A process can communicate with another process using inter-process communication methods, such as message passing, shared memory, or remote procedure calls.
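
As a rough sketch of inter-thread communication, the example below passes values from a producer thread to a consumer thread through a `queue.Queue` from the standard library. The function names and the `None` sentinel convention are assumptions made for this illustration.

```python
import queue
import threading


def producer(q, items):
    """Put every item on the queue, then a sentinel to signal the end."""
    for item in items:
        q.put(item)
    q.put(None)  # sentinel: no more items will follow


def consumer(q, results):
    """Take items off the queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * item)


def run_pipeline(items):
    q = queue.Queue()   # thread-safe FIFO queue
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline([1, 2, 3]))  # prints [1, 4, 9]
```

The queue does its own locking, so neither thread needs an explicit lock here. Processes can follow a similar pattern, but the data has to cross the process boundary instead of being shared directly.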

## Performance

- A process takes more time and resources to **create**, **terminate**, and **switch** than a thread, because the operating system has to set up and manage a separate address space and set of resources for it.
- A thread is cheaper to create, terminate, and switch, because it reuses the address space and resources of its process.
- A process can run multiple threads in **parallel** on multiple CPU cores, which can improve the performance of CPU-bound tasks, such as heavy computations or data processing.
- A single thread executes only one instruction stream at a **time**, so parallelism requires several threads or processes scheduled on different cores. Threads are also well suited to I/O-bound tasks, such as web scraping or network requests, because one thread can wait for data while the other threads of the program keep running.


# Python Multithreading and Multiprocessing

Python offers two techniques for achieving concurrency and parallelism: multithreading and multiprocessing.

## Concurrency vs Parallelism

Concurrency means that multiple tasks make progress during overlapping time periods, but not necessarily at the same instant. For example, a single CPU core can switch between different tasks quickly, giving the illusion that they run simultaneously.

Parallelism means that multiple tasks can run simultaneously on multiple CPU cores. For example, a quad-core CPU can run four tasks at the same time, achieving parallelism.
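
To make the idea of concurrency without parallelism concrete, here is a small sketch of a round-robin scheduler built on generators. It runs on a single thread and is only an illustration. The names `task` and `run_round_robin` are hypothetical, and real operating system schedulers are far more involved.

```python
from collections import deque


def task(name, steps):
    """A task that yields control back to the scheduler after every step."""
    for i in range(1, steps + 1):
        yield f"{name} step {i}"


def run_round_robin(tasks):
    """Run one step of each task in turn until all tasks are finished."""
    ready = deque(tasks)
    log = []
    while ready:
        current = ready.popleft()
        try:
            log.append(next(current))
        except StopIteration:
            continue              # task finished, do not reschedule it
        ready.append(current)     # task has more work, put it back in line
    return log


for line in run_round_robin([task("A", 2), task("B", 3)]):
    print(line)
```

The steps of A and B interleave (A step 1, B step 1, A step 2, B step 2, B step 3), even though only one step executes at any moment. Both tasks make progress concurrently, but nothing runs in parallel.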

## Multithreading

Multithreading allows you to create multiple threads within a single process, which share the same memory space and resources. This is useful for tasks that are I/O-bound, such as web scraping or network requests, because the threads can wait for the data without blocking the main thread.
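
The I/O-bound case can be sketched with `concurrent.futures.ThreadPoolExecutor` from the standard library. In this example, `time.sleep` stands in for a blocking network call, and `fake_download` and `run_threaded` are hypothetical names used only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(delay):
    """Pretend to wait on the network for `delay` seconds."""
    time.sleep(delay)
    return delay


def run_threaded(delays):
    """Run all waits in separate threads and measure the total wall time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_download, delays))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = run_threaded([0.2, 0.2, 0.2])
print(f"{len(results)} tasks finished in {elapsed:.2f} s")
```

Because the three waits overlap, the total time should be close to the longest single wait rather than the sum of all three.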

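However, multithreading in the standard CPython interpreter is limited by the Global Interpreter Lock (GIL), which allows only one thread at a time to execute Python bytecode. The GIL is released while a thread waits on blocking I/O, which is why threads still help with I/O-bound tasks. For CPU-bound tasks, such as heavy computations or data processing, multithreading generally gives no speedup, because the threads take turns instead of running in parallel.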
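
One way to observe this limitation is to time the same CPU-bound work run sequentially and then in two threads. The sketch below uses hypothetical helper names (`count_down`, `timed`, `run_sequential`, `run_threaded`) and prints the timings without assuming specific values, since they depend on the machine and the interpreter build.

```python
import time
from concurrent.futures import ThreadPoolExecutor

N = 2_000_000  # loop iterations per task (arbitrary, chosen for illustration)


def count_down(n):
    """Pure-Python busy loop: CPU-bound, no I/O."""
    while n > 0:
        n -= 1
    return n


def timed(fn):
    """Call fn() and return its result with the elapsed wall time."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_sequential():
    return [count_down(N) for _ in range(2)]


def run_threaded():
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(count_down, [N, N]))


sequential, t_seq = timed(run_sequential)
threaded, t_thr = timed(run_threaded)
print(f"Sequential: {t_seq:.2f} s")
print(f"Threaded:   {t_thr:.2f} s")
```

On a standard CPython build the two timings are typically similar, because the threads take turns holding the GIL.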

## Multiprocessing

Multiprocessing allows you to create multiple processes, each with its own memory space and Python interpreter. This enables you to run CPU-bound tasks in parallel, taking advantage of multiple CPU cores. Because each process has its own interpreter and its own GIL, the GIL no longer limits the program as a whole.

However, multiprocessing has more overhead than multithreading: processes are more expensive to create and terminate, and data exchanged between them must be serialized and copied instead of being shared directly.
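
A minimal sketch of the multiprocessing approach, using `concurrent.futures.ProcessPoolExecutor` from the standard library. The function `sum_of_squares` and the input sizes are assumptions made for illustration. The `if __name__ == "__main__":` guard matters because some start methods re-import the main module in each worker process.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(n):
    """CPU-bound work: sum i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def main():
    inputs = [10, 100, 1000]
    # Each input is handed to a worker process; results come back in order
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(sum_of_squares, inputs))
    for n, total in zip(inputs, results):
        print(f"sum_of_squares({n}) = {total}")


if __name__ == "__main__":
    main()
```

The arguments and return values are pickled to cross the process boundary, which is part of the communication overhead mentioned above. For small tasks like these, that overhead can outweigh the benefit of running in parallel.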


In [10]:
import os
import time

In [11]:
# Number of logical CPU cores reported by the operating system
CPU_CORES = os.cpu_count()
print(f"CPU CORES: {CPU_CORES}")

CPU CORES: 4


In [15]:
# Start performance counter
start = time.perf_counter()

def do_something():
    """Simulate a blocking task by sleeping for one second."""
    print("Sleeping 1 second")
    time.sleep(1)
    print("Done sleeping !!!")

# Run the task twice, one call after the other
do_something()
do_something()

# Finish performance counter
finish = time.perf_counter()

print(f"\nFinished in {round(finish - start)} seconds")

Sleeping 1 second
Done sleeping !!!
Sleeping 1 second
Done sleeping !!!

Finished in 2 seconds
