A multi-service AI assistant platform built for teaching production system design patterns, scalability, and security. Includes a polished React frontend with streaming chat, a system dashboard, and batch job management.
```mermaid
graph TB
    Browser([Browser]) --> Frontend[React Frontend\n:5173 dev / :3000 prod]
    Frontend -->|"/api/* /health"| Nginx[Nginx Reverse Proxy :80]
    Nginx -->|"/api/* /health /docs"| Gateway[API Gateway :8000]
    Nginx -->|"/*"| FrontendNginx[Frontend nginx :3000]
    Gateway -->|HTTP| ModelService[Model Service :8001]
    Gateway -->|HTTP + SSE proxy| ModelService
    Gateway -->|HTTP| WorkerService[Worker Service :8002]
    ModelService -->|Groq SDK| Groq[Groq Cloud API]
    WorkerService -->|HTTP| ModelService
    WorkerService --> Queue[(In-Memory Queue)]

    subgraph baseline["Backend Services"]
        Gateway
        ModelService
        WorkerService
        Queue
    end

    subgraph shared["Shared Module"]
        Config[Config Management]
        Logger[Structured Logging]
        Errors[Error Handling]
        Schemas[Pydantic Schemas]
    end

    Gateway -.-> shared
    ModelService -.-> shared
    WorkerService -.-> shared

    style Frontend fill:#3b82f6,color:#fff
    style Groq fill:#10b981,color:#fff
```
- **Dev mode:** Vite on `:5173` proxies `/api/*` and `/health` directly to the API Gateway on `:8000`, so no Nginx is needed.
- **Docker mode:** Nginx on `:80` routes `/api/*` to the Gateway and `/*` to the Frontend container.
Streaming chat (primary flow):
- Browser sends `POST /api/v1/generate/stream` through the Vite proxy (dev) or Nginx (prod)
- API Gateway proxies the request as an SSE stream to Model Service
- Model Service calls the Groq API with `stream=True`, yielding tokens as `data: token\n\n` events
- Tokens flow back through the Gateway to the browser in real time
- Frontend parses SSE events and appends each token to the chat message
Synchronous generation:
- Browser sends `POST /api/v1/generate` to the API Gateway
- Gateway adds a request ID, logs the request, and proxies to Model Service
- Model Service calls Groq API for inference and returns the full response
- Gateway returns the JSON result to the browser
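The request-ID step can be sketched as a small helper (hypothetical; the gateway's real middleware and header name may differ):

```python
import uuid

def with_request_id(headers):
    """Return a copy of the headers carrying an x-request-id for tracing.

    Reuses the caller's ID when one is present, so the same ID follows the
    request from Gateway to Model Service in the structured logs.
    """
    out = dict(headers)
    out.setdefault("x-request-id", str(uuid.uuid4()))
    return out
```

Propagating one ID across both services is what lets a single log query reconstruct the full request path.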
Batch jobs (async):
- Browser sends `POST /api/v1/jobs` to the API Gateway
- Gateway forwards to Worker Service, which enqueues the job and returns `202 Accepted`
- Background worker picks up the job, calls Model Service for each prompt, and updates progress
- Browser polls `GET /api/v1/jobs/{id}` every 2 seconds for status and results
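The browser's interval polling translates directly into a loop; here is a sketch, with `fetch_status` standing in for the `GET /api/v1/jobs/{id}` call (the status names are assumptions):

```python
import time

def poll_job(fetch_status, job_id, interval=2.0, timeout=120.0):
    """Poll a job until it reaches a terminal state, mirroring the 2s browser polling.

    fetch_status(job_id) is assumed to return the job dict from the API,
    e.g. {"status": "running", ...}.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status(job_id)
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {timeout}s")
```

A timeout keeps a stuck job from polling forever; production clients often also back off the interval as the job ages.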
Prerequisites: Python 3.11+, Node.js 20+, Git
```bash
# 1. Clone and setup
bash scripts/setup.sh

# 2. Activate virtual environment (created during the setup.sh run)
source venv/Scripts/activate   # Windows Git Bash
# source venv/bin/activate     # macOS / Linux

# 3. Configure environment (optional if done already)
cp .env.example .env
# Edit .env: set GROQ_API_KEY=your-key (or USE_MOCK=true for offline mode)
```

```bash
# 1. Run all services
make run

# 2. Verify
curl http://localhost:8000/health

# Optional test
curl -X POST http://localhost:8000/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain microservices in one sentence"}'
```

**Important:** Always activate the virtual env (`source venv/Scripts/activate`) before running `make run`; the system Python does not have uvicorn installed.
```bash
cd frontend
npm install
npm run dev
```

Navigate to http://localhost:5173 and you should see the Prodigon chat interface with streaming chat, the dashboard, and batch jobs.
```bash
make run-docker   # Starts all services + frontend + nginx on :80
```

```
prod-ai-system-design/
├── baseline/                  # Backend services
│   ├── api_gateway/           # Public API entry point (:8000)
│   ├── model_service/         # LLM inference via Groq (:8001)
│   ├── worker_service/        # Async job processing (:8002)
│   ├── shared/                # Config, logging, schemas, errors, HTTP client
│   ├── infra/                 # Nginx reverse proxy config
│   ├── protos/                # gRPC definitions (Task 1)
│   ├── tests/                 # Integration tests
│   └── docker-compose.yml
├── frontend/                  # React + Vite SPA
│   ├── src/                   # Components, stores, hooks, API client
│   ├── Dockerfile             # Multi-stage build (Node → Nginx)
│   └── nginx.conf             # SPA routing config
├── architecture/              # Architecture documentation (v0)
├── workshop/                  # Teaching materials
│   ├── part1_design_patterns/ # Tasks 1-4 (complete)
│   ├── part2_scalability/     # Tasks 5-8 (pending)
│   └── part3_security/        # Tasks 9-11 (pending)
├── scripts/                   # setup.sh, run_all.sh, check_health.sh
├── .env.example
├── Makefile
└── pyproject.toml
```
| Part | Task | Topic |
|---|---|---|
| I | 1 | REST APIs vs gRPC |
| I | 2 | Microservices vs Monolith |
| I | 3 | Batch vs Real-time vs Streaming |
| I | 4 | FastAPI Dependency Injection |
| II | 5 | Code Profiling & Optimization |
| II | 6 | Concurrency & Parallelism |
| II | 7 | Memory Management |
| II | 8 | Load Balancing & Caching |
| III | 9 | Authentication vs Authorization |
| III | 10 | Securing API Endpoints |
| III | 11 | Secrets Management |
```bash
# Backend
make run              # Start all backend services
make run-docker       # Run everything with Docker Compose
make test             # Run pytest
make health           # Check service health
make lint             # Run ruff linter

# Frontend
make install-frontend # npm install
make run-frontend     # Start Vite dev server (:5173)
make build-frontend   # Production build

# General
make setup            # Install Python dependencies
make clean            # Remove caches and build artifacts
make help             # Show all commands
```

Backend:
- Python 3.11+ with FastAPI
- Groq API (`llama-3.3-70b-versatile`) for LLM inference
- structlog for structured JSON logging
- Pydantic v2 for config and validation
- httpx for async HTTP and SSE proxy streaming
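As a hedged sketch of how Pydantic v2 can back `.env`-driven configuration (the field names below mirror the setup steps but are assumptions, not the repo's actual schema):

```python
from pydantic import BaseModel, Field

class Settings(BaseModel):
    """Illustrative settings model; not the shared module's real config class."""
    groq_api_key: str = ""
    use_mock: bool = False  # offline mode mentioned in the setup steps
    model_name: str = "llama-3.3-70b-versatile"
    request_timeout_s: float = Field(default=30.0, gt=0)

# Values would typically come from os.environ / a .env file; Pydantic
# coerces string inputs such as "true" into typed fields at validation time.
```

Validating once at startup turns missing or malformed environment variables into a single clear error instead of scattered runtime failures.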
Frontend:
- React 18 + TypeScript with Vite
- Zustand for state management (chat sessions, settings, health, jobs)
- Tailwind CSS for styling with dark mode support
- react-markdown + react-syntax-highlighter for AI response rendering
Infrastructure:
- Docker + docker-compose for containerization
- Nginx as reverse proxy (API routing + SSE support)
- Redis (stubbed for Workshop Task 8)
For detailed architecture documentation, see `architecture/README.md`:
- System Overview — high-level architecture and tech stack
- Getting Started — detailed setup guide with troubleshooting
- API Reference — complete endpoint documentation
- Data Flow — request lifecycle diagrams
- Design Decisions — why things are built this way