## Processes vs Threads

# 1. Program

> **A program is passive code on disk.**

* Example: `app.py`
* Stored as:
  * Instructions
  * Static data
* Has **no execution**, no state, no resources

Until loaded, it **cannot do anything**.

---

# 2. Process

> **A process is a running instance of a program.**

When you run:

```bash
python app.py
```

The OS creates a **process**:

* Private virtual address space
* File descriptors
* Heap, stack
* PID
* Signal handlers

### Key property

> Processes are **isolated** from each other.

They can communicate only through explicit channels, such as:

* Files
* Pipes
* Sockets
* Shared memory (explicit)
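As a rough illustration of explicit communication, the sketch below has a parent process talk to a child over pipes (the child's stdin and stdout). The child program and the `ask_child` helper are made up for this example; they are not part of any particular application.

```python
import subprocess
import sys

# Hypothetical child program: reads lines from stdin and echoes them upper-cased.
CHILD_CODE = "import sys\nfor line in sys.stdin:\n    print(line.strip().upper())"

def ask_child(message):
    """Start a separate process and exchange data with it through pipes."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=message + "\n",   # sent through the child's stdin pipe
        capture_output=True,    # stdout comes back through another pipe
        text=True,
        check=True,
    )
    return result.stdout.strip()

reply = ask_child("hello from parent")
print(reply)  # HELLO FROM PARENT
```

The parent never touches the child's memory; everything it learns arrives through the pipe.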

---

# 3. Thread

> **A thread is a path of execution inside a process.**

A process may have:

* One thread (single-threaded)
* Many threads (multi-threaded)

Threads in a process:

* Share memory
* Share file descriptors
* Share heap
* Have **separate stacks**

📌 This is why threads need **locks**.
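A minimal sketch of the sharing rules above: every thread appends to the same list object on the heap, while the `local` variable inside `worker` belongs to that thread's own stack. The names `shared` and `worker` are illustrative only.

```python
import threading

shared = []  # one list object on the heap, visible to every thread

def worker(n):
    local = n * n         # local variable: private to this thread's stack
    shared.append(local)  # shared object: every thread mutates the same list

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))  # [0, 1, 4, 9]
```

The append order depends on scheduling, which is why the result is sorted before printing.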

---
![](https://raw.githubusercontent.com/Ankush-Chander/Tech-Talks/34c46f2c52eb3bf2afe9ad8cabae956c835a9c74/img/concurrency/process_thread.jpg)

---
# 4. Concurrency (system-level)

> **Concurrency means multiple execution paths overlap in time.**

This does **not** require multiple CPUs.

### On a single CPU:

* OS context-switches between threads
* Each makes progress
* Interleaving occurs

### On multiple CPUs:

* Threads literally run in parallel

### Important distinction

| Term        | Meaning                |
| ----------- | ---------------------- |
| Concurrency | Overlap in time        |
| Parallelism | Simultaneous execution |

* Concurrency ⇒ correctness problem
* Parallelism ⇒ performance opportunity

---

# 5. Why concurrency is hard

Because of:

* Shared memory
* Shared files
* Shared I/O
* Non-deterministic scheduling

Classic problem:

```text
read → modify → write
```

interleaved between threads → corruption.
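To make the lost update concrete, the sketch below first replays one bad interleaving by hand (so the result is deterministic), then shows the usual fix with a lock. The variable names and the deposit amounts are invented for illustration.

```python
import threading

# Part 1: replay one bad interleaving by hand, so the outcome is deterministic.
balance = 100
a_seen = balance        # thread A reads 100
b_seen = balance        # thread B reads 100 before A has written
balance = a_seen + 10   # thread A writes 110
balance = b_seen + 10   # thread B writes 110, overwriting A's update
print(balance)          # 110, although two deposits of 10 were made

# Part 2: with a lock, each read-modify-write runs as one critical section.
state = {"counter": 0}
lock = threading.Lock()

def increment(times):
    for _ in range(times):
        with lock:
            state["counter"] += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["counter"])  # 40000
```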

---

# 6. Concurrency at OS vs application

### OS guarantees

* Filesystem integrity
* Process isolation
* Atomicity of some individual syscalls (limited, e.g. `rename`)

### OS does NOT guarantee

* Logical correctness
* Application invariants
* Ordering across syscalls

📌 That's the programmer's job.
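One way to see the difference between a single atomic syscall and a sequence of syscalls: checking whether a file exists and then creating it takes two steps that another process can slip between, whereas opening with `O_CREAT | O_EXCL` does the check and the creation in one call. The `claim` helper below is a hypothetical name for this sketch.

```python
import os
import tempfile

def claim(path):
    """Create path atomically; return True only for the caller that created it."""
    try:
        # O_EXCL makes "check that it does not exist" and "create" one atomic step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True

with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "claimed")
    first = claim(marker)
    second = claim(marker)

print(first, second)  # True False
```

What the winner does after claiming the file is still application logic that the OS knows nothing about.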

---

# 7. Concurrency in web servers (general)

A web server:

* Accepts many connections
* Must handle them **concurrently**
* Cannot block on one request

Common models:

* Process-per-request
* Thread-per-request
* Event-loop (async)

---

# 8. Concurrency in FastAPI (important)

> **FastAPI handlers execute concurrently by default.**

This surprises many people.

### Why?

FastAPI runs on **ASGI servers** (e.g., Uvicorn).

ASGI is designed for concurrency.

---

## 8.1 `async def` endpoints

```python
@app.get("/")
async def handler():
    ...
```

* Runs in an **event loop**
* Multiple requests interleave at `await` points
* No automatic serialization

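Interleaving happens only at `await` points: a handler body with no `await` runs to completion without yielding, and blocks the event loop while it runs.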
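The interleaving at `await` points can be reproduced with plain coroutines, without FastAPI. In this sketch, `unsafe_handler` and `safe_handler` are hypothetical stand-ins for endpoint functions, and `asyncio.sleep(0)` stands in for any real awaited I/O.

```python
import asyncio

async def unsafe_handler(state):
    current = state["hits"]       # read
    await asyncio.sleep(0)        # suspension point: another handler runs here
    state["hits"] = current + 1   # write based on a now stale read

async def safe_handler(state, lock):
    async with lock:              # the whole read-modify-write is one critical section
        current = state["hits"]
        await asyncio.sleep(0)
        state["hits"] = current + 1

async def main():
    unsafe = {"hits": 0}
    await asyncio.gather(unsafe_handler(unsafe), unsafe_handler(unsafe))

    safe = {"hits": 0}
    lock = asyncio.Lock()         # created inside the running event loop
    await asyncio.gather(safe_handler(safe, lock), safe_handler(safe, lock))
    return unsafe["hits"], safe["hits"]

unsafe_hits, safe_hits = asyncio.run(main())
print(unsafe_hits, safe_hits)  # 1 2
```

Two calls ran in each case, but the unsafe version recorded only one hit because both handlers read the counter before either wrote it.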

---

## 8.2 `def` endpoints

```python
@app.get("/")
def handler():
    ...
```

* Executed in a **thread pool**
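* Multiple threads run concurrently (in standard CPython the GIL makes them interleave rather than execute bytecode in parallel, but a switch can happen at almost any point)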
* Share memory

📌 This is real multi-threading.
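The effect can be approximated with the standard library's `ThreadPoolExecutor`, used here only as a stand-in for whatever thread pool the server uses for sync endpoints. The `handler` function is hypothetical; the barrier lets it return only when two calls are in flight at the same time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# The barrier opens only when two handler calls are waiting on it together.
barrier = threading.Barrier(2, timeout=5)

def handler():
    barrier.wait()                # would time out if the calls ran one after another
    return threading.get_ident()  # identifies the thread that ran this call

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(handler) for _ in range(2)]
    thread_ids = {f.result() for f in futures}

print(len(thread_ids))  # 2
```

Both calls completed, so they must have overlapped, and they ran on two distinct threads that see the same module-level state.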

---

## 8.3 Multiple workers

```bash
uvicorn app:app --workers 4
```

* 4 OS processes
* No shared memory
* All may write the same files

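This is the **most dangerous** mode for storage engines: each worker has its own in-memory state, and an in-process lock cannot coordinate writes to the same file across workers.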
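Coordinating across processes needs a lock that lives outside any single process. One possible approach, assuming a POSIX system, is an advisory file lock via `fcntl.flock`. The `append_record` helper below is hypothetical, and the lock only helps if every writer uses it.

```python
import fcntl  # POSIX only
import os
import tempfile

def append_record(path, record):
    """Append record under an exclusive advisory lock; return its start offset."""
    with open(path, "ab") as f:
        fcntl.flock(f, fcntl.LOCK_EX)        # blocks while another process holds the lock
        try:
            offset = f.seek(0, os.SEEK_END)  # end of file cannot move while we hold the lock
            f.write(record)
            f.flush()                        # push the bytes out before releasing the lock
            return offset
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "database.bin")
    offsets = [append_record(path, b"abc"), append_record(path, b"defg")]
    size = os.path.getsize(path)

print(offsets, size)  # [0, 3] 7
```

This protects the file only. An in-memory index kept by each worker would still diverge from the others and would need to be rebuilt from the file or moved to shared storage.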

---

# 9. Concurrency in FastAPI (summary table)

| Level      | Concurrency            |
| ---------- | ---------------------- |
| Requests   | Concurrent             |
| Handlers   | Concurrent             |
| Threads    | Yes (sync endpoints)   |
| Event loop | Yes (async endpoints)  |
| Processes  | Optional (`--workers`) |

---

# 10. Why this matters for your storage engine

Your FastAPI app:

* Has shared memory (`KEY_OFFSET_MAP`)
* Has shared file (`database.bin`)
* Has concurrent handlers

Without synchronization:

* Duplicate offsets
* Torn reads
* Lost updates
* Corruption

FastAPI does not synchronize access to shared state; it effectively **assumes stateless handlers**.

Storage engines are **stateful**.

That mismatch is the core issue.
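A minimal sketch of one way to close that gap within a single process: treat "find the end of the file, append, update the map" as one critical section. `KEY_OFFSET_MAP` and `database.bin` come from the notes above; `WRITE_LOCK`, `put`, `get`, and the record layout are assumptions made for this example, not the actual engine.

```python
import os
import tempfile
import threading

KEY_OFFSET_MAP = {}            # key -> (offset, length); shared by all handlers
WRITE_LOCK = threading.Lock()  # hypothetical lock guarding the file tail and the map

def put(path, key, value):
    """Append value and record where it landed, as one critical section."""
    with WRITE_LOCK:
        with open(path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # stable while the lock is held
            f.write(value)
        # The map is updated only after the bytes are written and the file is closed.
        KEY_OFFSET_MAP[key] = (offset, len(value))

def get(path, key):
    """Read a value back using the offset recorded by put."""
    offset, length = KEY_OFFSET_MAP[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

# Usage: eight threads write concurrently to a temporary stand-in for database.bin.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "database.bin")
    writers = [
        threading.Thread(target=put, args=(db_path, f"k{i}", f"value-{i}".encode()))
        for i in range(8)
    ]
    for t in writers:
        t.start()
    for t in writers:
        t.join()
    results = {key: get(db_path, key) for key in KEY_OFFSET_MAP}

offsets = [offset for offset, _ in KEY_OFFSET_MAP.values()]
print(len(set(offsets)) == len(offsets))  # True: no duplicate offsets
```

A `threading.Lock` covers sync handlers running as threads in one process. It does nothing across `--workers` processes.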

---

# 11. One-line mental model

> **Programs are passive.
> Processes own resources.
> Threads interleave execution.
> FastAPI makes concurrency the default.**

---


## Context managers

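```python
from contextlib import contextmanager

@contextmanager
def visit(dest):
    # Setup: runs when the `with` block is entered.
    print("ticket check")
    print(f"fly to {dest}")
    try:
        yield  # the body of the `with` block runs here
    finally:
        # Teardown: runs on exit, even if the body raises.
        print("fly back")
```

```python
with visit("paris"):
    print("yay Eiffel Tower!")
```

```text
ticket check
fly to paris
yay Eiffel Tower!
fly back
```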

