An LLM-powered cooking and recipe Q&A application built with LangGraph, FastAPI, and Next.js.
```bash
# 1. Clone and set up environment
git clone https://github.com/ir272/cooking-chatbot.git
cd cooking-chatbot
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY (required)

# 2. Start backend
cd backend
uv sync
uv run uvicorn app.main:app --reload --port 8001

# 3. Start frontend (in a new terminal)
cd frontend
pnpm install
pnpm dev --port 3001

# 4. Open http://localhost:3001
```

Prerequisites: Python 3.12+, Node.js 22+, uv (`curl -LsSf https://astral.sh/uv/install.sh | sh`), pnpm (`corepack enable`).
Docker alternative:
```bash
docker compose up --build
# Backend: http://localhost:8001 | Frontend: http://localhost:3001
```

| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | Yes | OpenAI API key for GPT-4o-mini |
| `TAVILY_API_KEY` | No | Tavily search API key (DuckDuckGo is used as fallback) |
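These variables are loaded in `app/config.py` via Pydantic Settings. A minimal sketch of that loading, assuming pydantic-settings v2 (any details beyond the two keys above are illustrative, not the project's exact code):

```python
# Sketch of .env loading with pydantic-settings; mirrors the table above.
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str  # required: validation fails fast if missing
    tavily_api_key: str | None = None  # optional: DuckDuckGo fallback otherwise
```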
```
User → Next.js (React) → POST /api/chat → FastAPI → LangGraph Agent
                                                        │
                                                        ▼
                                                 classify_query
                                                  │           │
                                            off_topic     (if cooking)
                                                  │           │
                                                  ▼           ▼
                                           reject_query   agent node
                                                  │       (LLM + tools)
                                                  ▼           │
                                                 END    ┌─────┼─────────┐
                                                        ▼     ▼         ▼
                                                    Tavily  DuckDuckGo  Cookware
                                                    Search  Search      Check
                                                        │      │          │
                                                        └──┬───┴──────────┘
                                                           ▼
                                                    SSE stream back
                                                      to frontend
```
```
cooking-chatbot/
├── backend/                     # Python: FastAPI + LangGraph
│   ├── app/
│   │   ├── config.py            # Pydantic Settings (env loading)
│   │   ├── main.py              # FastAPI app + SSE/REST endpoints
│   │   ├── graphs/cooking.py    # LangGraph StateGraph (classify → agent/reject)
│   │   ├── prompts/system.py    # System prompts
│   │   ├── schemas/chat.py      # Pydantic request/response models
│   │   └── tools/               # cookware.py (lookup), search.py (web search)
│   ├── tests/                   # pytest test suite
│   ├── Dockerfile
│   └── pyproject.toml
├── frontend/                    # TypeScript: Next.js 16 + Tailwind CSS 4
│   ├── src/
│   │   ├── app/                 # Pages, API proxy route, global styles
│   │   ├── components/chat/     # ChatContainer, ChatMessage, ChatInput, SuggestedPrompts
│   │   ├── hooks/use-chat.ts    # SSE streaming hook
│   │   ├── lib/api.ts           # Raw fetch + ReadableStream client
│   │   └── types/chat.ts        # Message interface
│   ├── Dockerfile
│   └── package.json
├── docker-compose.yml           # Full-stack orchestration
├── .github/workflows/ci.yml     # CI pipeline (lint, test, build, docker)
├── CLAUDE.md                    # AI agent project context
└── AGENTS.md                    # AI agent contribution guidelines
```
- `classify_query`: a lightweight LLM call determines whether the user's question is cooking-related or off-topic (~200 ms)
- `reject_query`: off-topic queries receive a polite redirect message (no agent loop cost)
- `agent`: the main LLM node with bound tools processes cooking queries
- `tools`: LangGraph's `ToolNode` executes tool calls (web search, cookware check)

The agent loops between the `agent` and `tools` nodes until it has a final answer, then streams it back via SSE (see the sketch below).
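A minimal sketch of that wiring, with the node internals stubbed out (the real nodes call the LLM; the bodies below are illustrative, not the project's code):

```python
# Sketch of the classify → agent/reject routing; stubs replace real LLM calls.
from typing import Annotated, TypedDict

from langchain_core.messages import AIMessage, AnyMessage
from langgraph.graph import END, StateGraph
from langgraph.graph.message import add_messages


class CookingState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]
    is_cooking: bool


def classify_query(state: CookingState) -> dict:
    # Real app: ~200 ms structured-output LLM call; stubbed as a keyword check.
    text = state["messages"][-1].content.lower()
    return {"is_cooking": any(w in text for w in ("cook", "recipe", "bake"))}


def reject_query(state: CookingState) -> dict:
    return {"messages": [AIMessage("I can only help with cooking questions!")]}


def agent(state: CookingState) -> dict:
    # Real app: LLM with bound tools, looping through a ToolNode until done.
    return {"messages": [AIMessage("Here's how to make scrambled eggs...")]}


graph = StateGraph(CookingState)
graph.add_node("classify_query", classify_query)
graph.add_node("reject_query", reject_query)
graph.add_node("agent", agent)
graph.set_entry_point("classify_query")
graph.add_conditional_edges(
    "classify_query",
    lambda s: "agent" if s["is_cooking"] else "reject_query",
)
graph.add_edge("reject_query", END)
graph.add_edge("agent", END)  # the real graph loops agent ↔ tools before END
app = graph.compile()
```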
| Layer | Technology | Version |
|---|---|---|
| LLM Orchestration | LangGraph (custom StateGraph) | 0.3+ |
| LLM | GPT-4o-mini via LangChain | – |
| Backend | FastAPI + Uvicorn | 0.115+ |
| Streaming | SSE via sse-starlette | 2.2+ |
| Frontend | Next.js (App Router) | 16.x |
| Styling | Tailwind CSS | 4.x |
| Language | Python 3.13 / TypeScript 5.x | – |
| Package Mgmt | uv (Python) / pnpm (Node) | – |
| Containerization | Docker + Docker Compose | – |
The classification gate (cooking vs. off-topic) requires routing before the agent loop. A custom StateGraph makes this control flow explicit and auditable. `create_react_agent` has no concept of pre-loop classification, so we'd need to embed topic filtering into the system prompt, making it invisible, untestable, and harder to debug.
Trade-off: More boilerplate to set up the graph, but full control over the execution path.
At $0.15/$0.60 per 1M input/output tokens, it's roughly 30x cheaper than GPT-4o with excellent tool-calling reliability. A cooking Q&A bot doesn't need frontier reasoning; the tools (search, cookware check) do the heavy lifting. The model is easily swappable via the `MODEL_NAME` config.
Trade-off: Less creative/nuanced responses than GPT-4o, but the cost savings are enormous for a Q&A use case.
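Because the model name comes from configuration, the swap is a one-line change; a sketch, assuming the `MODEL_NAME` setting maps to an env var:

```python
# Sketch: the model is read from config, so upgrading to GPT-4o is an env change.
import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model=os.getenv("MODEL_NAME", "gpt-4o-mini"))
```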
A dedicated async LLM call for classification enables fail-fast rejection of off-topic queries (~200ms) instead of running the full agent loop (1-5s with tool calls). It uses structured output for deterministic routing and is independently testable.
Trade-off: Adds one extra LLM call per request. At $0.15/1M tokens for a ~20-token classification, this costs <$0.01 per 1000 requests, negligible next to the savings from not running the full agent on off-topic queries.
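A sketch of what that structured-output classification can look like in LangChain (the schema and prompt are illustrative, not the project's exact code):

```python
# Sketch of a fail-fast topic classifier using structured output.
from langchain_openai import ChatOpenAI
from pydantic import BaseModel


class TopicCheck(BaseModel):
    is_cooking: bool


classifier = ChatOpenAI(model="gpt-4o-mini", temperature=0).with_structured_output(
    TopicCheck
)

result = classifier.invoke("Is this question about cooking? 'How do I fix a flat tire?'")
print(result.is_cooking)  # False -> route straight to reject_query
```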
Chat streaming is unidirectional (server → client). SSE works over standard HTTP, plays nicely with load balancers/proxies, auto-reconnects, and requires ~10 lines of frontend code vs. WebSocket lifecycle management. Next.js API routes proxy the SSE stream transparently.
Trade-off: No bidirectional communication, but chat doesn't need it β user messages are standard POST requests.
Tavily is purpose-built for LLM applications with pre-extracted, citation-ready content. DuckDuckGo serves as a zero-config fallback requiring no API key, ensuring the bot always has search capability even without a Tavily key.
Trade-off: DuckDuckGo results are less structured than Tavily's, but functional. Only one is active at a time (not both) to avoid duplicate results and latency.
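A sketch of that fallback logic, assuming the `langchain-tavily` and `langchain-community` wrappers (the real `tools/search.py` may differ):

```python
# Sketch: prefer Tavily when a key is configured, else fall back to DuckDuckGo.
import os

from langchain_community.tools import DuckDuckGoSearchRun


def web_search(query: str) -> str:
    """Search the web; only one provider runs per call."""
    if os.getenv("TAVILY_API_KEY"):
        from langchain_tavily import TavilySearch  # optional dependency

        return str(TavilySearch(max_results=3).invoke({"query": query}))
    return DuckDuckGoSearchRun().invoke(query)
```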
The frontend calls its own /api/chat route, which proxies to FastAPI. This keeps the backend URL server-side only, eliminates CORS for the browser, and enables future middleware (rate limiting, auth, logging) without modifying the backend.
Trade-off: Adds a hop, but it's in-process so latency is negligible.
Zero additional dependencies for consuming SSE on the frontend. The data flow from backend to UI is fully transparent, with no SDK magic to debug. The `streamChat` function is ~60 lines of explicit, readable code.
Trade-off: More code than useChat() from Vercel AI SDK, but no vendor lock-in and full control over parsing behavior.
10-100x faster than pip, deterministic lockfile (uv.lock), Docker cache-friendly. Rapidly becoming the standard Python tooling choice.
Trade-off: Newer tool, less community content than pip/poetry, but the developer experience improvement is substantial.
Health check endpoint. Returns `{"status": "ok"}`.
Non-streaming chat endpoint.
Request: `{"message": "How do I make scrambled eggs?", "thread_id": "optional-uuid"}`
Response: `{"message": "Here's how to make scrambled eggs...", "thread_id": "uuid"}`
SSE streaming chat endpoint. Same request format as `/chat`. Frames arrive as:
data: {"token": "Here's", "thread_id": "uuid"}
data: {"token": " how", "thread_id": "uuid"}
...
data: [DONE]
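A minimal sketch of an endpoint that produces frames in this shape with sse-starlette (the real endpoint streams LangGraph tokens; the generator below is a stand-in):

```python
# Sketch of the SSE frame format above, served via sse-starlette.
import json

from fastapi import FastAPI
from pydantic import BaseModel
from sse_starlette.sse import EventSourceResponse

app = FastAPI()


class ChatRequest(BaseModel):
    message: str
    thread_id: str | None = None


@app.post("/chat/stream")
async def chat_stream(req: ChatRequest):
    async def event_gen():
        for token in ("Here's", " how", "..."):  # stand-in for LLM tokens
            yield {"data": json.dumps({"token": token, "thread_id": "uuid"})}
        yield {"data": "[DONE]"}

    return EventSourceResponse(event_gen())
```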
| Component | AWS Service | Purpose |
|---|---|---|
| Containers | ECS Fargate | Serverless container orchestration for backend + frontend |
| Routing | Application Load Balancer | Path-based routing: /api/* → backend, /* → frontend |
| Registry | ECR | Docker image storage |
| CDN | CloudFront | Edge caching for static frontend assets |
| Secrets | Secrets Manager | API keys (OPENAI_API_KEY, TAVILY_API_KEY) injected via ECS task definition valueFrom |
| Config | SSM Parameter Store | Non-sensitive config (model name, temperature) |
| Networking | VPC | Private subnets for ECS tasks, public subnets for ALB |
| TLS | ACM | HTTPS certificate on ALB, HTTP → HTTPS redirect |
| CI/CD | GitHub Actions | Build → test → push to ECR → deploy to ECS (blue/green rolling updates) |
- CORS restricted to frontend origin only
- HTTPS enforced via ALB with ACM certificate
- Input validation via Pydantic v2 (max 2000 chars, whitespace rejection)
- Classification gate rejects non-cooking queries before they reach the agent
- Sanitized errors: raw exception details are logged server-side, never exposed to clients
- Non-root containers: both Dockerfiles use unprivileged `USER` directives
- `.dockerignore` prevents `.env` secrets from being baked into images
- Rate limiting (planned): `slowapi` middleware at 30 req/min per IP (a sketch follows this list)
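A sketch of how the planned `slowapi` limit could be wired (an assumption, since it isn't implemented yet):

```python
# Sketch of the planned per-IP rate limit with slowapi (not yet in the codebase).
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


@app.post("/chat")
@limiter.limit("30/minute")
async def chat(request: Request):  # slowapi requires the Request parameter
    return {"message": "..."}
```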
- Phase 1: API key authentication via an `X-API-Key` header + FastAPI middleware (a sketch follows this list)
- Phase 2: JWT tokens for user-specific sessions and conversation history persistence
- Phase 3: OAuth 2.0 integration for third-party login
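Phase 1 could be as small as a header check in middleware; a hypothetical sketch (the `API_KEY` env var is an assumption):

```python
# Hypothetical sketch of Phase 1: reject chat requests without a valid X-API-Key.
import os

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()


@app.middleware("http")
async def require_api_key(request: Request, call_next):
    if request.url.path.startswith("/chat"):
        if request.headers.get("X-API-Key") != os.environ.get("API_KEY"):
            return JSONResponse({"detail": "Invalid API key"}, status_code=401)
    return await call_next(request)
```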
| Edge Case | How It's Handled |
|---|---|
| Off-topic queries | Classification gate rejects with friendly redirect |
| Missing cookware | Tool returns available alternatives; agent suggests modifications |
| Empty/whitespace message | Pydantic validation rejects (min_length=1 + whitespace strip) |
| Very long message | Pydantic validation rejects (max_length=2000) |
| Tavily rate limits | DuckDuckGo search serves as automatic fallback |
| LLM hallucination | Search tools ground responses in real web results |
| Streaming disconnect | Frontend handles incomplete streams; error state displayed |
| Concurrent conversations | Thread-isolated via thread_id with MemorySaver (swap to PostgresSaver for production) |
| Ambiguous classification | Defaults to "cooking"; better to attempt an answer than reject incorrectly |
| Backend unavailable | Frontend API proxy returns 502 with user-friendly message |
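The message-validation rows above correspond to constraints like these (a sketch; the real `schemas/chat.py` may name things differently):

```python
# Sketch of the request validation: reject empty/whitespace, cap at 2000 chars.
from pydantic import BaseModel, Field, field_validator


class ChatRequest(BaseModel):
    message: str = Field(min_length=1, max_length=2000)
    thread_id: str | None = None

    @field_validator("message")
    @classmethod
    def reject_whitespace_only(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("message must not be empty or whitespace")
        return v
```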
```bash
# Backend
cd backend
uv run pytest -v         # Run tests
uv run ruff check .      # Lint
uv run ruff format .     # Format

# Frontend
cd frontend
pnpm exec tsc --noEmit   # Type check
pnpm lint                # ESLint
pnpm build               # Production build
```

Tests cover tool behavior (cookware lookup, case sensitivity, edge cases), API endpoints (health, validation), and graph classification routing.
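For example, an endpoint-validation test might look like this (a sketch using FastAPI's TestClient; the test body is illustrative):

```python
# Sketch of an API validation test: whitespace-only messages should 422.
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)


def test_rejects_whitespace_only_message():
    resp = client.post("/chat", json={"message": "   "})
    assert resp.status_code == 422  # Pydantic validation error
```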
These findings come from a comprehensive code review. None of them affect the core user experience in local development, but they would need to be addressed before a production deployment.
| Priority | Issue | Location |
|---|---|---|
| High | `Settings()` is instantiated at import time; a missing `OPENAI_API_KEY` crashes the app before any useful error | `app/config.py` |
| High | No try/except around `build_graph()` in the lifespan hook; a startup failure gives no clear log | `app/main.py` |
| High | The streaming endpoint's outer scope has no error guard; it can return HTTP 200 with an empty body | `app/main.py` |
| Medium | `CookingState(total=False)` makes `messages` optional unintentionally | `app/graphs/cooking.py` |
| Medium | No test coverage for the `/chat/stream` endpoint | `tests/` |
| Medium | `MemorySaver` breaks silently with multiple uvicorn workers (no shared state) | `app/graphs/cooking.py` |
| Low | `@pytest.fixture` on an async fixture is deprecated in pytest-asyncio; use `@pytest_asyncio.fixture` | `tests/conftest.py` |
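The first High item has a standard fix: defer instantiation behind a cached accessor so a missing key fails at first use with a clear error rather than at import (a sketch, assuming the existing `Settings` class):

```python
# Sketch of the lazy-Settings fix for the import-time crash.
from functools import lru_cache

from app.config import Settings


@lru_cache
def get_settings() -> Settings:
    return Settings()  # validation error now surfaces at first call, in context
```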
| Priority | Issue | Location |
|---|---|---|
| High | No `AbortController` on the SSE fetch; a hung backend locks the user out permanently | `lib/api.ts` |
| Medium | `setTimeout` in `CopyButton` is not cleared on unmount (minor memory leak) | `chat-message.tsx` |
| Medium | The `ReactMarkdown` `components` prop is recreated on every render (unnecessary re-renders) | `chat-message.tsx` |
| Medium | No `aria-live` region for screen reader announcements on new messages | `chat-container.tsx` |
| Low | The textarea is missing an `aria-label` for accessibility | `chat-input.tsx` |
| Priority | Issue | Location |
|---|---|---|
| High | The backend Dockerfile doesn't copy the app source into the final image; the container crashes on start | `backend/Dockerfile` |
| High | The frontend Dockerfile is missing `HOSTNAME=0.0.0.0`; the container is unreachable in Docker | `frontend/Dockerfile` |
| Medium | `tailwind.config.ts` is dead code under Tailwind v4 (config lives in `globals.css`) and should be deleted | `frontend/tailwind.config.ts` |
| Medium | The Docker CI job doesn't depend on the backend/frontend jobs passing first | `.github/workflows/ci.yml` |
| Low | The `docker-compose.yml` healthcheck uses `curl`, which isn't in `python:3.13-slim` | `docker-compose.yml` |
Interactive docs are available at `GET /docs` while the backend is running. Export a static schema:

```bash
cd backend && uv run python scripts/export_openapi.py   # → backend/openapi.json
```