Cooking Chatbot

An LLM-powered cooking and recipe Q&A application built with LangGraph, FastAPI, and Next.js.

Quick Start

# 1. Clone and set up environment
git clone https://github.com/ir272/cooking-chatbot.git
cd cooking-chatbot
cp .env.example .env
# Edit .env → add your OPENAI_API_KEY (required)

# 2. Start backend
cd backend
uv sync
uv run uvicorn app.main:app --reload --port 8001

# 3. Start frontend (in a new terminal)
cd frontend
pnpm install
pnpm dev --port 3001

# 4. Open http://localhost:3001

Prerequisites: Python 3.12+, Node.js 22+, uv (curl -LsSf https://astral.sh/uv/install.sh | sh), pnpm (corepack enable)

Docker alternative:

docker compose up --build
# Backend: http://localhost:8001 | Frontend: http://localhost:3001

Environment Variables

| Variable       | Required | Description                                            |
|----------------|----------|--------------------------------------------------------|
| OPENAI_API_KEY | Yes      | OpenAI API key for GPT-4o-mini                         |
| TAVILY_API_KEY | No       | Tavily search API key (DuckDuckGo is used as fallback) |
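
For reference, a minimal sketch of how these variables might be loaded in app/config.py, assuming pydantic-settings; everything beyond the two variable names above is illustrative.

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str                # OPENAI_API_KEY (required)
    tavily_api_key: str | None = None  # TAVILY_API_KEY (optional; DuckDuckGo fallback)

settings = Settings()  # note: import-time instantiation is flagged under Known Issues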

Architecture

User → Next.js (React) → POST /api/chat → FastAPI → LangGraph Agent
                                                         │
                                     ┌───────────────────┴─────────────┐
                                     ▼                                 ▼
                              classify_query                      (if cooking)
                                     │                                 │
                                off_topic?                        agent node
                                     │                           (LLM + tools)
                                     ▼                                 │
                              reject_query                 ┌───────────┼───────────┐
                                     │                     ▼           ▼           ▼
                                    END                 Tavily    DuckDuckGo   Cookware
                                                        Search      Search       Check
                                                           │           │           │
                                                           └───────────┴───────────┘
                                                                       ▼
                                                                SSE stream back
                                                                 to frontend

Monorepo Structure

cooking-chatbot/
├── backend/                  # Python: FastAPI + LangGraph
│   ├── app/
│   │   ├── config.py         # Pydantic Settings (env loading)
│   │   ├── main.py           # FastAPI app + SSE/REST endpoints
│   │   ├── graphs/cooking.py # LangGraph StateGraph (classify → agent/reject)
│   │   ├── prompts/system.py # System prompts
│   │   ├── schemas/chat.py   # Pydantic request/response models
│   │   └── tools/            # cookware.py (lookup), search.py (web search)
│   ├── tests/                # pytest test suite
│   ├── Dockerfile
│   └── pyproject.toml
├── frontend/                 # TypeScript: Next.js 16 + Tailwind CSS 4
│   ├── src/
│   │   ├── app/              # Pages, API proxy route, global styles
│   │   ├── components/chat/  # ChatContainer, ChatMessage, ChatInput, SuggestedPrompts
│   │   ├── hooks/use-chat.ts # SSE streaming hook
│   │   ├── lib/api.ts        # Raw fetch + ReadableStream client
│   │   └── types/chat.ts     # Message interface
│   ├── Dockerfile
│   └── package.json
├── docker-compose.yml        # Full-stack orchestration
├── .github/workflows/ci.yml  # CI pipeline (lint, test, build, docker)
├── CLAUDE.md                 # AI agent project context
└── AGENTS.md                 # AI agent contribution guidelines

Graph Flow

  1. classify_query: A lightweight LLM call determines whether the user's question is cooking-related or off-topic (~200ms)
  2. reject_query: Off-topic queries receive a polite redirect message (no agent loop cost)
  3. agent: The main LLM node with bound tools processes cooking queries
  4. tools: LangGraph's ToolNode executes tool calls (web search, cookware check)
  5. The agent loops through tools until it has a final answer, then streams it back via SSE (see the wiring sketch after this list)
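
A condensed sketch of how this flow could be wired as a LangGraph StateGraph; node names follow the steps above, while the state fields and stub node bodies are illustrative.

from typing import Annotated
from typing_extensions import TypedDict
from langchain_core.messages import BaseMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition

class CookingState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    is_cooking: bool

def classify_query(state):  # step 1: lightweight structured LLM call
    return {"is_cooking": True}

def reject_query(state):    # step 2: canned redirect, no agent loop
    return {"messages": []}

def agent(state):           # step 3: LLM with bound tools
    return {"messages": []}

graph = StateGraph(CookingState)
graph.add_node("classify_query", classify_query)
graph.add_node("reject_query", reject_query)
graph.add_node("agent", agent)
graph.add_node("tools", ToolNode([]))  # step 4: web search + cookware check tools go here

graph.add_edge(START, "classify_query")
graph.add_conditional_edges(
    "classify_query",
    lambda s: "agent" if s["is_cooking"] else "reject_query",
)
graph.add_conditional_edges("agent", tools_condition)  # routes to "tools" or END
graph.add_edge("tools", "agent")                       # step 5: loop until final answer
graph.add_edge("reject_query", END)

cooking_graph = graph.compile(checkpointer=MemorySaver())  # thread-isolated memory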

Tech Stack

| Layer             | Technology                    | Version |
|-------------------|-------------------------------|---------|
| LLM Orchestration | LangGraph (custom StateGraph) | 0.3+    |
| LLM               | GPT-4o-mini via LangChain     | -       |
| Backend           | FastAPI + Uvicorn             | 0.115+  |
| Streaming         | SSE via sse-starlette         | 2.2+    |
| Frontend          | Next.js (App Router)          | 16.x    |
| Styling           | Tailwind CSS                  | 4.x     |
| Language          | Python 3.13 / TypeScript 5.x  | -       |
| Package Mgmt      | uv (Python) / pnpm (Node)     | -       |
| Containerization  | Docker + Docker Compose       | -       |

Design Decisions & Trade-offs

Custom StateGraph over create_react_agent

The classification gate (cooking vs. off-topic) requires routing before the agent loop. A custom StateGraph makes this control flow explicit and auditable. create_react_agent has no concept of pre-loop classification, so we'd need to embed topic filtering into the system prompt, making it invisible, untestable, and harder to debug.

Trade-off: More boilerplate to set up the graph, but full control over the execution path.

GPT-4o-mini as the LLM

At $0.15/$0.60 per 1M input/output tokens, it's 30x cheaper than GPT-4o with excellent tool-calling reliability. A cooking Q&A bot doesn't need frontier reasoning; the tools (search, cookware check) do the heavy lifting. The model is easily swappable via the MODEL_NAME config.

Trade-off: Less creative/nuanced responses than GPT-4o, but the cost savings are enormous for a Q&A use case.

Separate Classification Node

A dedicated async LLM call for classification enables fail-fast rejection of off-topic queries (~200ms) instead of running the full agent loop (1-5s with tool calls). It uses structured output for deterministic routing and is independently testable.

Trade-off: Adds one extra LLM call per request. At $0.15/1M tokens for a ~20-token classification, this costs <$0.01 per 1000 requests, negligible next to the savings from not running the full agent on off-topic queries.
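
A sketch of what that structured-output classifier might look like, assuming langchain-openai; the schema name and prompt wording are illustrative.

from langchain_openai import ChatOpenAI
from pydantic import BaseModel

class TopicCheck(BaseModel):
    is_cooking: bool

classifier = ChatOpenAI(model="gpt-4o-mini", temperature=0).with_structured_output(TopicCheck)

async def classify(message: str) -> bool:
    result = await classifier.ainvoke(
        f"Is this question about cooking, food, or recipes? Question: {message}"
    )
    return result.is_cooking  # deterministic bool drives the graph routing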

SSE over WebSockets

Chat streaming is unidirectional (server→client). SSE works over standard HTTP, plays nicely with load balancers/proxies, auto-reconnects, and requires ~10 lines of frontend code vs. WebSocket lifecycle management. Next.js API routes proxy the SSE stream transparently.

Trade-off: No bidirectional communication, but chat doesn't need it β€” user messages are standard POST requests.
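
A minimal sketch of the streaming endpoint shape, assuming sse-starlette; run_agent stands in for the compiled graph's token stream, and the request model is abbreviated.

import json

from fastapi import FastAPI
from pydantic import BaseModel
from sse_starlette.sse import EventSourceResponse

app = FastAPI()

class ChatRequest(BaseModel):  # abbreviated; real model lives in schemas/chat.py
    message: str
    thread_id: str | None = None

async def run_agent(message, thread_id):  # stand-in for the LangGraph token stream
    for token in ("Here's", " how", "..."):
        yield token

@app.post("/chat/stream")
async def chat_stream(req: ChatRequest):
    async def events():
        async for token in run_agent(req.message, req.thread_id):
            yield {"data": json.dumps({"token": token, "thread_id": req.thread_id})}
        yield {"data": "[DONE]"}
    return EventSourceResponse(events())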

Tavily + DuckDuckGo Fallback Search

Tavily is purpose-built for LLM applications with pre-extracted, citation-ready content. DuckDuckGo serves as a zero-config fallback requiring no API key, ensuring the bot always has search capability even without a Tavily key.

Trade-off: DuckDuckGo results are less structured than Tavily's, but functional. Only one is active at a time (not both) to avoid duplicate results and latency.
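
A sketch of that fallback logic, assuming the tavily-python and duckduckgo-search packages; the actual wiring in tools/search.py may differ.

from duckduckgo_search import DDGS

def web_search(query: str, tavily_key: str | None = None) -> list[dict]:
    if tavily_key:
        # Tavily: LLM-ready, citation-friendly results
        from tavily import TavilyClient
        return TavilyClient(api_key=tavily_key).search(query, max_results=5)["results"]
    # DuckDuckGo: zero-config fallback, no API key needed
    return DDGS().text(query, max_results=5)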

Next.js API Route Proxy

The frontend calls its own /api/chat route, which proxies to FastAPI. This keeps the backend URL server-side only, eliminates CORS for the browser, and enables future middleware (rate limiting, auth, logging) without modifying the backend.

Trade-off: Adds a hop, but it's in-process so latency is negligible.

Raw fetch + ReadableStream (no Vercel AI SDK)

Zero additional dependencies for consuming SSE on the frontend. The data flow from backend to UI is fully transparent, with no SDK magic to debug. The streamChat function is ~60 lines of explicit, readable code.

Trade-off: More code than useChat() from Vercel AI SDK, but no vendor lock-in and full control over parsing behavior.

uv for Python Packaging

10-100x faster than pip, deterministic lockfile (uv.lock), Docker cache-friendly. Rapidly becoming the standard Python tooling choice.

Trade-off: Newer tool, less community content than pip/poetry, but the developer experience improvement is substantial.

API Reference

GET /health

Health check endpoint. Returns {"status": "ok"}.

POST /chat

Non-streaming chat endpoint.

Request:  {"message": "How do I make scrambled eggs?", "thread_id": "optional-uuid"}
Response: {"message": "Here's how to make scrambled eggs...", "thread_id": "uuid"}

POST /chat/stream

SSE streaming chat endpoint. Same request format as /chat.

data: {"token": "Here's", "thread_id": "uuid"}
data: {"token": " how", "thread_id": "uuid"}
...
data: [DONE]
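
For illustration, a minimal Python consumer of this stream protocol, assuming httpx; the frontend does the equivalent with fetch and a ReadableStream.

import json

import httpx

with httpx.stream(
    "POST",
    "http://localhost:8001/chat/stream",
    json={"message": "How do I make scrambled eggs?"},
    timeout=None,
) as resp:
    for line in resp.iter_lines():
        if not line.startswith("data: "):
            continue
        payload = line.removeprefix("data: ")
        if payload == "[DONE]":
            break
        print(json.loads(payload)["token"], end="", flush=True)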

Deployment Plan (AWS)

| Component  | AWS Service               | Purpose                                                                              |
|------------|---------------------------|--------------------------------------------------------------------------------------|
| Containers | ECS Fargate               | Serverless container orchestration for backend + frontend                            |
| Routing    | Application Load Balancer | Path-based routing: /api/* → backend, /* → frontend                                  |
| Registry   | ECR                       | Docker image storage                                                                 |
| CDN        | CloudFront                | Edge caching for static frontend assets                                              |
| Secrets    | Secrets Manager           | API keys (OPENAI_API_KEY, TAVILY_API_KEY) injected via ECS task definition valueFrom |
| Config     | SSM Parameter Store       | Non-sensitive config (model name, temperature)                                       |
| Networking | VPC                       | Private subnets for ECS tasks, public subnets for ALB                                |
| TLS        | ACM                       | HTTPS certificate on ALB, HTTP→HTTPS redirect                                        |
| CI/CD      | GitHub Actions            | Build → test → push to ECR → deploy to ECS (blue/green rolling updates)              |

Security

  • CORS restricted to frontend origin only
  • HTTPS enforced via ALB with ACM certificate
  • Input validation via Pydantic v2 (max 2000 chars, whitespace rejection)
  • Classification gate rejects non-cooking queries before they reach the agent
  • Sanitized errors β€” raw exception details are logged server-side, never exposed to clients
  • Non-root containers β€” both Dockerfiles use unprivileged USER directives
  • .dockerignore prevents .env secrets from being baked into images
  • Rate limiting (planned) β€” slowapi middleware at 30 req/min per IP

Auth (Planned)

  • Phase 1: API key authentication via X-API-Key header + FastAPI middleware
  • Phase 2: JWT tokens for user-specific sessions and conversation history persistence
  • Phase 3: OAuth 2.0 integration for third-party login

Edge Cases

| Edge Case                | How It's Handled                                                                      |
|--------------------------|---------------------------------------------------------------------------------------|
| Off-topic queries        | Classification gate rejects with a friendly redirect                                  |
| Missing cookware         | Tool returns available alternatives; agent suggests modifications                     |
| Empty/whitespace message | Pydantic validation rejects (min_length=1 + whitespace strip)                         |
| Very long message        | Pydantic validation rejects (max_length=2000)                                         |
| Tavily rate limits       | DuckDuckGo search serves as automatic fallback                                        |
| LLM hallucination        | Search tools ground responses in real web results                                     |
| Streaming disconnect     | Frontend handles incomplete streams; error state displayed                            |
| Concurrent conversations | Thread-isolated via thread_id with MemorySaver (swap to PostgresSaver for production) |
| Ambiguous classification | Defaults to "cooking": better to attempt an answer than reject incorrectly            |
| Backend unavailable      | Frontend API proxy returns 502 with a user-friendly message                           |

Testing

# Backend
cd backend
uv run pytest -v          # Run tests
uv run ruff check .       # Lint
uv run ruff format .      # Format

# Frontend
cd frontend
pnpm exec tsc --noEmit    # Type check
pnpm lint                 # ESLint
pnpm build                # Production build

Tests cover tool behavior (cookware lookup, case sensitivity, edge cases), API endpoints (health, validation), and graph classification routing.
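
As one concrete example of the endpoint-validation coverage, a sketch of the kind of test described, assuming FastAPI's TestClient; the real suite lives in backend/tests/.

from fastapi.testclient import TestClient

from app.main import app  # import path per the tree above

def test_whitespace_message_is_rejected():
    client = TestClient(app)
    resp = client.post("/chat", json={"message": "   "})
    assert resp.status_code == 422  # Pydantic validation error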

Known Issues & Future Improvements

Findings from a comprehensive code review. None of these affect the core user experience in local development, but they would need to be addressed before a production deployment.

Backend

| Priority | Issue                                                                                              | Location              |
|----------|------------------------------------------------------------------------------------------------------|-----------------------|
| High     | Settings() instantiated at import time: crashes with no useful error if OPENAI_API_KEY is missing     | app/config.py         |
| High     | No try/except around build_graph() in lifespan: a startup failure gives no clear log                  | app/main.py           |
| High     | Streaming endpoint outer scope has no error guard: can return HTTP 200 with an empty body             | app/main.py           |
| Medium   | CookingState(total=False) makes messages optional unintentionally                                     | app/graphs/cooking.py |
| Medium   | No test coverage for the /chat/stream endpoint                                                        | tests/                |
| Medium   | MemorySaver breaks silently with multiple uvicorn workers (no shared state)                           | app/graphs/cooking.py |
| Low      | @pytest.fixture on an async fixture is deprecated in pytest-asyncio; use @pytest_asyncio.fixture      | tests/conftest.py     |
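
One conventional fix for the first item, sketched on the assumption that Settings lives in app/config.py: construct it lazily behind a cached accessor, so a missing key fails with a clear validation error at first use rather than at import.

from functools import lru_cache

from app.config import Settings

@lru_cache
def get_settings() -> Settings:
    return Settings()  # validation error surfaces here, with a clear traceback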

Frontend

| Priority | Issue                                                                              | Location           |
|----------|--------------------------------------------------------------------------------------|--------------------|
| High     | No AbortController on the SSE fetch: a hung backend locks the user out permanently   | lib/api.ts         |
| Medium   | setTimeout in CopyButton not cleared on unmount (minor memory leak)                  | chat-message.tsx   |
| Medium   | ReactMarkdown components prop recreated every render (unnecessary re-renders)        | chat-message.tsx   |
| Medium   | No aria-live region for screen reader announcements on new messages                  | chat-container.tsx |
| Low      | Textarea missing aria-label for accessibility                                        | chat-input.tsx     |

Infrastructure

| Priority | Issue                                                                                            | Location                    |
|----------|----------------------------------------------------------------------------------------------------|-----------------------------|
| High     | Backend Dockerfile doesn't copy app source into the final image: container crashes on start         | backend/Dockerfile          |
| High     | Frontend Dockerfile missing HOSTNAME=0.0.0.0: container unreachable in Docker                       | frontend/Dockerfile         |
| Medium   | tailwind.config.ts is dead code under Tailwind v4 (config lives in globals.css); should be deleted  | frontend/tailwind.config.ts |
| Medium   | Docker CI job doesn't depend on the backend/frontend jobs passing first                             | .github/workflows/ci.yml    |
| Low      | docker-compose.yml healthcheck uses curl, which isn't in python:3.13-slim                           | docker-compose.yml          |

OpenAPI Schema

Interactive docs are served at GET /docs while the backend is running. Export a static schema:

cd backend && uv run python scripts/export_openapi.py   # → backend/openapi.json
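
The export script is likely just a dump of FastAPI's generated schema; a plausible shape, assuming app is importable from app.main:

import json

from app.main import app

with open("openapi.json", "w") as f:
    json.dump(app.openapi(), f, indent=2)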
