
---

# ⚡ Async Patterns & Pitfalls (Avoid Blocking)

> **Intent** → Achieve **high concurrency** without starving the event loop. Keep requests snappy under load.

---

## 🧭 What Should Be Async

* **I/O-bound work**: DB queries, HTTP calls, file/network streams
* **Short-lived tasks** inside a single request/response lifecycle
* **Reusable clients**: one shared `AsyncClient` per app and one DB session per request, both provided via DI

---

## ❌ Common Pitfalls (and fixes)

* **Blocking calls on the event loop** (e.g., `requests`, heavy crypto/image ops)

  * ✅ Use async libs (`httpx`, SQLAlchemy async, async drivers) or offload to a thread (`asyncio.to_thread`), a process pool, or a worker queue.
* **CPU-bound work on event loop**

  * ✅ Move to **background workers** (Celery/RQ/Arq/process pool).
* **Unbounded `gather()`** → connection pool exhaustion

  * ✅ **Limit concurrency** with an `asyncio.Semaphore`; batch requests.
* **Missing timeouts/retries** → hung tasks

  * ✅ Always set **timeouts**, **retry with backoff + jitter**.
* **Creating clients per call**

  * ✅ **Reuse** clients (DB/HTTP) via DI; close on teardown.
* **Large payloads in memory**

  * ✅ Stream uploads/downloads; chunk processing.
* **Swallowing cancellations**

  * ✅ Clean up, then re-raise `asyncio.CancelledError`; never catch it and carry on.
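
As a minimal sketch of the first fix, the snippet below offloads a blocking call with `asyncio.to_thread` while a heartbeat task keeps running on the loop. `blocking_io` and `heartbeat` are hypothetical stand-ins, not part of any library; the sleep durations are arbitrary.

```python
import asyncio
import time


def blocking_io(seconds: float) -> str:
    """Hypothetical stand-in for a blocking library call."""
    time.sleep(seconds)  # blocks whichever thread runs it
    return "done"


async def heartbeat(ticks: list[float], interval: float = 0.01) -> None:
    """Append a timestamp each time the event loop schedules this task."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def main() -> tuple[str, int]:
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The blocking call runs in the default thread pool, so the loop
    # stays free to run the heartbeat in the meantime.
    result = await asyncio.to_thread(blocking_io, 0.2)
    beat.cancel()
    # We cancelled the task ourselves; collect its outcome without raising.
    await asyncio.gather(beat, return_exceptions=True)
    return result, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_io(0.2)` directly inside `main` would freeze the heartbeat for the whole 200 ms. Threads only help with blocking I/O; CPU-heavy work still belongs in a process pool or worker queue.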

---

## 🔌 DB & HTTP Patterns

* Use **async drivers** (`asyncpg`, `aiosqlite`) with SQLAlchemy async sessions.
* **Pool wisely**: match pool sizes to DB limits; watch timeouts & overflow.
* For outbound HTTP: **httpx.AsyncClient** with connection-pool limits (`httpx.Limits` caps the whole pool, not each host), explicit timeouts, and application-level retries.
* Prefer **cursor pagination** and **streaming** for large result sets.
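
Cursor (keyset) pagination combined with streaming can be sketched as an async generator. This is an illustration under assumptions: `ROWS` and `fetch_page` are hypothetical stand-ins for a table and a query of the form `WHERE id > :after_id ORDER BY id LIMIT :limit`.

```python
import asyncio
from typing import AsyncIterator, Optional

ROWS = [{"id": i} for i in range(1, 11)]  # hypothetical table contents


async def fetch_page(after_id: Optional[int], limit: int) -> list[dict]:
    """Hypothetical query: rows with id > after_id, ordered by id, capped at limit."""
    await asyncio.sleep(0)  # yield to the loop, as a real async driver would
    start = after_id or 0
    return [row for row in ROWS if row["id"] > start][:limit]


async def iter_rows(page_size: int = 4) -> AsyncIterator[dict]:
    """Yield rows page by page so at most `page_size` rows are held at once."""
    cursor: Optional[int] = None
    while True:
        page = await fetch_page(cursor, page_size)
        if not page:
            return
        for row in page:
            yield row
        cursor = page[-1]["id"]  # keyset cursor: the last id seen


async def main() -> list[int]:
    return [row["id"] async for row in iter_rows()]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Unlike `OFFSET`, the cursor keeps each page query cheap regardless of how deep the caller has paged.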

---

## 🎛️ Concurrency Control

* **Bounded parallelism**: semaphore limits around fan-out tasks.
* **Rate limiting**: per user/key/route to protect resources.
* **Backpressure**: queue size limits; drop/defer low-priority work.
* **Circuit breaker**: open on repeated failures; fast-fail until healthy.
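
Bounded parallelism can be sketched with an `asyncio.Semaphore` wrapped around `gather()`. The helper name `bounded_gather`, the `Tracker` class, and `fake_call` are assumptions made for this illustration.

```python
import asyncio
from typing import Awaitable, Iterable


async def bounded_gather(coros: Iterable[Awaitable], limit: int) -> list:
    """Run all coroutines, but let at most `limit` of them execute at once."""
    sem = asyncio.Semaphore(limit)

    async def run(coro: Awaitable):
        async with sem:  # waits here once `limit` calls are in flight
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))


class Tracker:
    """Records the peak number of concurrently running calls."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0


async def fake_call(i: int, tracker: Tracker) -> int:
    """Hypothetical outbound call that holds a 'connection' for 10 ms."""
    tracker.active += 1
    tracker.peak = max(tracker.peak, tracker.active)
    await asyncio.sleep(0.01)
    tracker.active -= 1
    return i * 2


async def main() -> tuple[list[int], int]:
    tracker = Tracker()
    calls = [fake_call(i, tracker) for i in range(20)]
    results = await bounded_gather(calls, limit=5)
    return results, tracker.peak


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Set `limit` at or below the connection pool size so the fan-out queues on the semaphore instead of exhausting the pool.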

---

## 🕒 Timeouts, Retries, Backoff

* **Client timeouts**: set connect/read/write explicitly; never leave a call unbounded.
* **Exponential backoff + jitter** to avoid thundering herd.
* **Idempotency keys** for safe retries on POST-like actions.
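
One possible shape for timeouts plus backoff is sketched below: a per-attempt timeout via `asyncio.wait_for`, with exponential backoff and full jitter between attempts. The `retry` helper, its defaults, and `make_flaky` are assumptions for illustration, and treating only timeouts and `ConnectionError` as retryable is a simplification.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry(
    op: Callable[[], Awaitable[T]],
    *,
    attempts: int = 4,
    timeout: float = 1.0,
    base: float = 0.1,
    cap: float = 2.0,
) -> T:
    """Call `op` with a per-attempt timeout; back off with full jitter on failure."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(op(), timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: uniform delay in [0, min(cap, base * 2**attempt)].
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            await asyncio.sleep(delay)


def make_flaky(failures: int):
    """Hypothetical operation that fails `failures` times, then succeeds."""
    calls = {"n": 0}

    async def op() -> str:
        calls["n"] += 1
        if calls["n"] <= failures:
            raise ConnectionError("transient failure")
        return "ok"

    return op, calls


if __name__ == "__main__":
    op, calls = make_flaky(2)
    print(asyncio.run(retry(op, base=0.01)), calls["n"])
```

`asyncio.CancelledError` is deliberately not caught, so cancellation still propagates through the retry loop. Only wrap operations that are idempotent or carry an idempotency key.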

---

## 🧹 Resource Lifecycle

* Open request-scoped resources (DB sessions) **once per request** and close them on **teardown** (DI); create app-scoped ones (HTTP clients, pools) once in lifespan.
* Avoid global singletons with hidden state; prefer explicit injection.
* Clean up on **cancellation** (close cursors/streams, rollback transactions).
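
Cleanup on cancellation might look like the sketch below: roll back, then re-raise so the caller still sees the cancellation. `FakeTransaction` is a hypothetical stand-in for a real transaction object.

```python
import asyncio


class FakeTransaction:
    """Hypothetical stand-in for a DB transaction."""

    def __init__(self) -> None:
        self.state = "open"

    async def commit(self) -> None:
        self.state = "committed"

    async def rollback(self) -> None:
        self.state = "rolled_back"


async def do_work(tx: FakeTransaction, started: asyncio.Event) -> None:
    try:
        started.set()
        await asyncio.sleep(10)  # long-running I/O that gets cancelled
        await tx.commit()
    except asyncio.CancelledError:
        await tx.rollback()  # release resources first...
        raise  # ...then propagate; never swallow the cancellation


async def main() -> tuple[str, bool]:
    tx = FakeTransaction()
    started = asyncio.Event()
    task = asyncio.create_task(do_work(tx, started))
    await started.wait()  # make sure the work is in flight
    task.cancel()  # e.g. the client disconnected
    await asyncio.gather(task, return_exceptions=True)
    return tx.state, task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Real code would roll back on ordinary exceptions as well; the sketch isolates the cancellation path.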

---

## 🧪 Testing Async

* Use **AsyncClient** and lifespan-aware app factories.
* Simulate **timeouts/failures**; assert retries/backoff logic.
* Load-test critical endpoints under realistic concurrency; watch p95/p99 latency.
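
A rough sketch of asserting retry/backoff logic without real waiting: inject the sleep function and record the delays. `call_with_retry` and `FlakyDependency` are hypothetical, and jitter is left out so the expected delays are deterministic.

```python
import asyncio


async def call_with_retry(op, *, attempts=3, base=0.5, sleep=asyncio.sleep):
    """Hypothetical helper under test; `sleep` is injectable for tests."""
    for attempt in range(attempts):
        try:
            return await op()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            await sleep(base * 2 ** attempt)  # exponential backoff, no jitter


class FlakyDependency:
    """Simulates a dependency that times out `failures` times, then answers."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    async def __call__(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("simulated timeout")
        return "ok"


async def test_retries_then_succeeds() -> None:
    delays: list[float] = []

    async def fake_sleep(delay: float) -> None:
        delays.append(delay)  # record the delay instead of waiting

    dep = FlakyDependency(failures=2)
    result = await call_with_retry(dep, sleep=fake_sleep)

    assert result == "ok"
    assert dep.calls == 3  # two simulated timeouts, one success
    assert delays == [0.5, 1.0]  # backoff doubled between attempts


if __name__ == "__main__":
    asyncio.run(test_retries_then_succeeds())
```

The test coroutine runs under plain `asyncio.run` here; adapt it to whichever async test runner the project uses.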

---

## 📊 Observability

* Track **event loop lag**, **pool metrics**, **in-flight tasks**, **timeout rates**.
* Add **request IDs**; trace DB/HTTP spans with duration & errors.
* Alert on **error spikes**, **slow endpoints**, **retry storms**.
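
Event loop lag can be approximated by timing how late a short `asyncio.sleep` wakes up. The sketch below is one simple way to sample it; `measure_loop_lag` is a hypothetical helper, and the `time.sleep` call exists only to provoke lag.

```python
import asyncio
import time


async def measure_loop_lag(interval: float, samples: int) -> list[float]:
    """Return how much later than `interval` each sleep actually woke up."""
    loop = asyncio.get_running_loop()
    lags: list[float] = []
    for _ in range(samples):
        start = loop.time()
        await asyncio.sleep(interval)
        lags.append(max(0.0, loop.time() - start - interval))
    return lags


async def main() -> list[float]:
    monitor = asyncio.create_task(measure_loop_lag(0.01, 5))
    await asyncio.sleep(0)  # let the monitor begin its first sleep
    time.sleep(0.1)  # deliberately block the loop (the bug being detected)
    return await monitor


if __name__ == "__main__":
    lags = asyncio.run(main())
    print(f"max lag: {max(lags) * 1000:.1f} ms")
```

In a service, a task like this would run for the app's lifetime and export each sample to the metrics backend rather than returning a list.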

---

## ✅ Outcome

Your app remains **responsive under load**: async I/O done right, CPU work off the loop, bounded concurrency, and robust timeouts/retries, all backed by metrics and traces.
