# 05 — Code Architecture Patterns for AI Apps

This notebook shows **practical architecture patterns** for LLM apps that you can evolve from a workshop demo to small/medium services, and then to enterprise-grade systems.

We will cover:
- **Monolithic modular** (clean and minimal)
- **Clean layers** (small/medium, recommended)
- **Hexagonal / Ports & Adapters** (larger teams)
- **RAG-first structure** (when retrieval is a core concern)

> All code snippets are intentionally minimal. The goal is clarity and *separation of concerns*.

## Monolithic modular

A compact structure where each file has a clear responsibility. Great for **workshops** and **small prototypes** that still need maintainability.

```text
.
├─ .data/                 # persistent state (SQLite)
├─ app/
│  ├─ __init__.py
│  ├─ config.py           # .env loading (models, paths, prompts)
│  ├─ schema.py           # pydantic request/response
│  ├─ graph.py            # LangGraph: state, nodes, trimming, LLM
│  └─ main.py             # FastAPI: lifespan + /chat endpoint
│  
├─ .env
├─ pyproject.toml
└─ README.md
```

### Principles
- **Separation of concerns**: `config` (env), `schema` (contracts), `graph` (conversational logic), `main` (server).
- **Loosely coupled**: the graph does **not** know FastAPI; FastAPI does **not** know how you trim messages.
- **Short, readable files**.
- **Scalable without rewrites**: change model, add tools, or switch checkpointer without editing the endpoint.

> With FastAPI's lifespan hook handling startup and shutdown, this layout does not need a separate `runtime.py`.
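As a rough sketch of the lifespan pattern: an async context manager builds shared resources before the server starts handling requests and releases them on shutdown. The listing uses only the standard library so it runs anywhere; `build_graph` is a hypothetical placeholder, and a bare object with a `.state` attribute stands in for the FastAPI app. With FastAPI installed, the same function would be passed as `FastAPI(lifespan=lifespan)`.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


def build_graph() -> dict:
    """Hypothetical stand-in for compiling the graph and opening the checkpointer."""
    return {"ready": True}


@asynccontextmanager
async def lifespan(app):
    # Startup: build shared resources once and attach them to app.state.
    app.state.graph = build_graph()
    try:
        yield
    finally:
        # Shutdown: release resources (close DB connections, flush logs, ...).
        app.state.graph = None


async def demo() -> tuple[bool, bool]:
    # A bare namespace plays the role of the FastAPI app in this sketch.
    app = SimpleNamespace(state=SimpleNamespace(graph=None))
    async with lifespan(app):
        available_during = app.state.graph is not None
    released_after = app.state.graph is None
    return available_during, released_after


# In a notebook cell with a running event loop, use `await demo()` instead.
result = asyncio.run(demo())
```

Endpoints would then read the graph from `request.app.state.graph` rather than from a module-level global.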

### Minimalist variant (for demos/workshops)

```text
app/
  ├─ main.py        # FastAPI + graph in the same file (all-in-one)
  └─ config.py
.env, pyproject.toml, README.md
```

**Pros**: very simple; the whole app can be read in one pass.

**Cons**: grows messy once you add tools, RAG, or multiple endpoints.

## Clean layers (small/medium — recommended)

A modular layout that scales well while staying straightforward.

```text
app/
  ├─ api/
  │   ├─ http.py          # endpoints & routers
  │   └─ deps.py          # FastAPI dependencies
  ├─ core/
  │   ├─ config.py        # settings, load .env
  │   ├─ schema.py        # Pydantic models
  │   ├─ prompts.py       # centralized system prompts
  │   └─ utils.py
  ├─ graph/
  │   ├─ state.py         # TypedDict/Pydantic: graph state
  │   ├─ nodes.py         # graph nodes (respond, tools, summary)
  │   ├─ memory.py        # checkpointers, trimming
  │   └─ build.py         # build_graph() factory
  ├─ services/
  │   ├─ agents.py        # agent logic
  │   └─ rag.py           # retrieval logic (optional)
  └─ main.py              # FastAPI app + router registration
```

**Pros**: modular, testable, easy to extend (add summary node, plug RAG).

**Cons**: more files than monolithic modular.
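One way to picture the testability claim is a `build_graph()` factory that receives its dependencies instead of importing them. The sketch below is an assumption-laden illustration: a linear pipeline of plain functions stands in for a LangGraph `StateGraph`, and `ChatState`, `make_trim_node`, `make_respond_node`, and `fake_llm` are hypothetical names.

```python
from typing import Callable, TypedDict


# graph/state.py: the state that flows between nodes
class ChatState(TypedDict):
    messages: list[str]
    reply: str


LLM = Callable[[list[str]], str]        # hypothetical contract: messages in, text out
Node = Callable[[ChatState], ChatState]


# graph/memory.py: trimming policy
def make_trim_node(max_messages: int) -> Node:
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")

    def trim(state: ChatState) -> ChatState:
        return {**state, "messages": state["messages"][-max_messages:]}

    return trim


# graph/nodes.py: a node that calls the injected model
def make_respond_node(llm: LLM) -> Node:
    def respond(state: ChatState) -> ChatState:
        return {**state, "reply": llm(state["messages"])}

    return respond


# graph/build.py: the factory wires nodes together in order
def build_graph(llm: LLM, max_messages: int = 3) -> Node:
    nodes = [make_trim_node(max_messages), make_respond_node(llm)]

    def run(state: ChatState) -> ChatState:
        for node in nodes:
            state = node(state)
        return state

    return run


# tests/: a fake model keeps the test fast and deterministic
def fake_llm(messages: list[str]) -> str:
    return f"echo:{messages[-1]} (context={len(messages)})"


graph = build_graph(fake_llm, max_messages=2)
out = graph({"messages": ["a", "b", "c"], "reply": ""})
```

Because the model arrives as an argument, production code can pass a real client while tests keep the fake, and neither path touches `api/`.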

## Hexagonal / Ports & Adapters (enterprise / larger teams)

Strong decoupling: domain logic stays independent of frameworks and vendors.

```text
app/
├─ domain/              # business rules (e.g., memory policies)
├─ application/         # use cases (invoke graph, validate inputs)
├─ adapters/
│   ├─ http/            # FastAPI
│   ├─ llms/            # OpenAI, Azure, Anthropic (behind a common interface)
│   ├─ memory/          # SqliteSaver, PostgresSaver, MemorySaver
│   └─ retrievers/      # FAISS, Pinecone, etc.
└─ infra/
   ├─ logging.py
   ├─ tracing.py
   └─ settings.py
```

**Pros**: swap LLM/DB/vector store without touching domain.

**Cons**: more ceremony; only worth it if you expect significant growth or multiple teams.
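A small sketch of the idea, assuming `typing.Protocol` is used to declare the ports. The use case depends only on those ports, and the adapters here are deliberately trivial in-process fakes; `LLMPort`, `MemoryPort`, `ChatUseCase`, `InMemoryMemory`, and `UppercaseLLM` are all hypothetical names, not part of any library.

```python
from typing import Protocol


# Ports: interfaces owned by the application, not by any vendor SDK
class LLMPort(Protocol):
    def complete(self, messages: list[str]) -> str: ...


class MemoryPort(Protocol):
    def load(self, thread_id: str) -> list[str]: ...
    def append(self, thread_id: str, message: str) -> None: ...


# application/: the use case sees only the ports
class ChatUseCase:
    def __init__(self, llm: LLMPort, memory: MemoryPort) -> None:
        self._llm = llm
        self._memory = memory

    def handle(self, thread_id: str, user_message: str) -> str:
        if not user_message.strip():
            raise ValueError("message must not be empty")
        self._memory.append(thread_id, user_message)
        reply = self._llm.complete(self._memory.load(thread_id))
        self._memory.append(thread_id, reply)
        return reply


# adapters/memory/: in-process stand-in for a SQLite or Postgres saver
class InMemoryMemory:
    def __init__(self) -> None:
        self._threads: dict[str, list[str]] = {}

    def load(self, thread_id: str) -> list[str]:
        return list(self._threads.get(thread_id, []))

    def append(self, thread_id: str, message: str) -> None:
        self._threads.setdefault(thread_id, []).append(message)


# adapters/llms/: toy model; a real adapter would wrap a vendor client
class UppercaseLLM:
    def complete(self, messages: list[str]) -> str:
        return messages[-1].upper()


memory = InMemoryMemory()
use_case = ChatUseCase(UppercaseLLM(), memory)
first_reply = use_case.handle("t1", "hello")
```

Replacing `UppercaseLLM` with a vendor-backed adapter, or `InMemoryMemory` with a database-backed one, leaves `ChatUseCase` unchanged.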

## RAG-first structure (when retrieval is central)

When your app is primarily about **retrieval + generation**, make that explicit from day one.

```text
app/
  ├─ api/
  │   └─ chat.py
  ├─ core/
  │   ├─ config.py
  │   └─ schema.py
  ├─ graph/
  │   ├─ nodes_chat.py
  │   ├─ nodes_rag.py     # nodes that call retrievers
  │   └─ build.py
  ├─ rag/
  │   ├─ ingest.py        # chunking + embeddings
  │   ├─ retriever.py     # retrieval logic
  │   ├─ store.py         # vector DB connection (FAISS, Pinecone…)
  │   └─ schema.py        # chunk/document formats
  └─ main.py
```

**Pros**: clear boundaries when teams split work (RAG vs chat/agents).

**Cons**: extra complexity if you don’t actually need RAG yet.
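To make the `rag/` split concrete, here is a toy end-to-end pass through ingest, store, and retriever. It is a sketch under loud assumptions: word counts stand in for real embeddings, a Python list stands in for FAISS or Pinecone, and every name (`Chunk`, `chunk_text`, `embed`, `InMemoryStore`, `retrieve`) is hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


# rag/schema.py: chunk format
@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str


# rag/ingest.py: chunking + embeddings
def chunk_text(doc_id: str, text: str, size: int = 8) -> list[Chunk]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [Chunk(doc_id, " ".join(words[i:i + size])) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


# rag/store.py: stand-in for a vector database
class InMemoryStore:
    def __init__(self) -> None:
        self.items: list[tuple[Chunk, Counter]] = []

    def add(self, chunks: list[Chunk]) -> None:
        self.items.extend((chunk, embed(chunk.text)) for chunk in chunks)


# rag/retriever.py: retrieval logic
def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(store: InMemoryStore, query: str, k: int = 2) -> list[Chunk]:
    q = embed(query)
    ranked = sorted(store.items, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


store = InMemoryStore()
store.add(chunk_text("faq", "Checkpointers persist conversation state between requests"))
store.add(chunk_text("ops", "Vector stores index embeddings for similarity search"))
top = retrieve(store, "how is conversation state persisted", k=1)
```

A node in `graph/nodes_rag.py` would call `retrieve(...)` and place the returned chunks into the prompt, so the graph never needs to know which vector store sits behind it.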

## Practical advice for your current repo

- ✅ Keep **config.py / schema.py / graph.py / main.py**: clear and effective.
- 🧹 **Remove `runtime.py`** if you already use lifespan, so the same wiring does not live in two places.
- 🧪 Add `tests/` (at least one test for `/chat`).
- 🩺 Consider `app/api/health.py` with `/healthz` and `/readyz`.
- 📦 If you plan to use Docker, add `Dockerfile` and `docker-compose.yml`; add `alembic/` for migrations if you move to Postgres for persistence.
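For the health endpoints, a common convention (assumed here, not stated above) is that `/healthz` reports liveness while `/readyz` reports whether dependencies are usable. The sketch keeps the logic as plain functions that a router in `app/api/health.py` could wrap; the check names and the 200/503 status codes are illustrative choices.

```python
from typing import Callable

Check = Callable[[], bool]  # hypothetical: returns True when a dependency is usable


def healthz() -> dict:
    """Liveness: the process is up and able to answer."""
    return {"status": "ok"}


def readyz(checks: dict[str, Check]) -> tuple[int, dict]:
    """Readiness: every dependency check must pass, otherwise report 503."""
    results: dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A crashing check counts as "not ready" instead of failing the endpoint.
            results[name] = False
    ready = all(results.values())
    body = {"status": "ready" if ready else "not ready", "checks": results}
    return (200 if ready else 503), body


def broken_db() -> bool:
    raise ConnectionError("database unreachable")


ok_status, ok_body = readyz({"graph": lambda: True})
bad_status, bad_body = readyz({"graph": lambda: True, "db": broken_db})
```

Keeping the checks as injected callables means the readiness logic can be unit-tested without starting the server.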

> Start simple (monolithic modular), then move to clean layers as you add features. Keep **graph** and **API** independent by design.