A multi-agent AI platform that reviews system architecture descriptions using RAG-grounded specialist agents.
The System Design Reviewer is an automated architecture assessment tool designed to evaluate system design proposals against established best practices.
Core Workflow:
- A developer submits a system design architecture description (e.g., "A highly available e-commerce backend using Next.js, Postgres, and Redis").
- A Supervisor Agent analyzes the request and delegates the review to specialized sub-agents.
- Specialist Agents (Scalability, Security, Database, Cost, DevOps) analyze the architecture independently, grounding their insights in a local RAG pipeline (vector database of architecture best practices).
- A Judge Agent aggregates the specialist reviews, synthesizes a final verdict, assigns a score out of 100, and generates an improved architectural diagram.
- The review is streamed to the frontend in real time via Server-Sent Events (SSE).
(Placeholders — replace with actual UI screenshots)
- Landing Page: Submit your architecture details in the sleek modern UI.
- Live Streaming Review: Watch the LangGraph agents stream their analysis in real time as SSE chunks arrive.
- Final Report: View the aggregated Judge score, individual specialist feedback, and generated Mermaid.js architecture diagrams.
The application uses an asynchronous, event-driven architecture based on LangGraph to orchestrate LLM agents.
- Frontend (Next.js) sends a POST request containing the architecture description to the backend API.
- Backend (FastAPI) initiates a LangGraph execution session and returns an SSE stream.
- The stream continuously pushes discrete node events (e.g., `agent:thinking`, `agent:output`, `judge:scores`) back to the frontend.
- Supervisor: Dynamically determines which specialists are required for the given design.
- Specialists: Execute in parallel. Each queries the local Qdrant vector database to fetch relevant architectural principles (RAG).
- Judge: Waits for all parallel specialists to finish, synthesizes the results, and finalizes the output.
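The supervisor, parallel specialists, and judge form a fan-out/fan-in pattern. The sketch below illustrates that shape with plain `asyncio` rather than LangGraph, so it is not the project's actual graph definition. The keyword rule in `supervisor`, the fixed score of 70, and the averaging in the judge step are all hypothetical stand-ins for LLM calls.

```python
import asyncio


def supervisor(description: str) -> list[str]:
    """Pick specialists with a toy keyword rule (the real supervisor uses an LLM)."""
    text = description.lower()
    selected = ["scalability", "security"]  # hypothetical always-on reviewers
    if "postgres" in text or "redis" in text:
        selected.append("database")
    return selected


async def specialist(name: str, description: str) -> dict:
    """Stand-in for one specialist node; a real node would query RAG and the LLM."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"agent": name, "score": 70, "notes": f"{name} review of: {description}"}


async def review(description: str) -> dict:
    selected = supervisor(description)
    # Fan out: all selected specialists run concurrently.
    reviews = await asyncio.gather(*(specialist(n, description) for n in selected))
    # Fan in: the judge step runs only after every specialist has returned.
    overall = round(sum(r["score"] for r in reviews) / len(reviews))
    return {"specialists": list(reviews), "overall_score": overall}
```

Calling `asyncio.run(review("Next.js, Postgres, and Redis"))` returns one entry per selected specialist plus an overall score. `asyncio.gather` preserves input order, so the judge sees reviews in the order the supervisor chose them.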
- Frontend: Next.js App Router deployed to Vercel/Render.
- Backend: FastAPI + LangGraph deployed to Render (Free Tier).
- Database: Neon Serverless PostgreSQL.
- Cache: Upstash Serverless Redis.
- Vector Store: Qdrant Cloud.
| Category | Technology | Version | Description |
|---|---|---|---|
| Frontend | Next.js, React | 14.2.16 | App Router, Server-Sent Events |
| Styling | Vanilla CSS | - | Modern CSS Variables, Animations |
| Backend | FastAPI | 0.115.5 | High-performance async Python framework |
| AI / Orchestration | LangGraph, LangChain | 0.2.51 | Multi-agent coordination |
| LLMs | Gemini 1.5 Flash | latest | High-speed, cost-effective inference |
| RAG Embeddings | FastEmbed | 0.4.1 | Local BAAI/bge-small-en-v1.5 |
| Vector DB | Qdrant | 1.12.1 | Semantic search storage |
| Relational DB | Neon (PostgreSQL) | - | Persistent application state |
| Caching | Upstash (Redis) | 1.3.0 | Ephemeral state / rate limiting |
| DevOps | Docker, Make | - | Containerization & automation |
arch-review-platform/
├── backend/
│ ├── agents/ # LangGraph workflows, supervisor, and specialist nodes
│ ├── api/ # FastAPI route handlers (health, ingest, reviews SSE)
│ ├── models/ # Pydantic schemas and shared data models
│ ├── rag/ # FastEmbed embedding logic and Qdrant retriever
│ ├── scripts/ # Utility scripts (e.g., seed_corpus)
│ ├── services/ # Infrastructure clients (Neon, Upstash, Qdrant)
│ └── tests/ # Pytest unit and integration suites
├── frontend/
│ ├── app/ # Next.js App Router (pages, layouts, globals.css)
│ ├── components/ # Reusable UI React components (MarkdownRenderer, Button)
│ └── lib/ # React hooks (useSSE)
├── infra/
│ ├── docker-compose.yml # Local development container orchestration
│ ├── Dockerfile.backend # Backend deployment container
│ ├── Dockerfile.frontend # Frontend deployment container
│ └── render.yaml # Render infrastructure as code
├── evals/ # RAGAS evaluation datasets and runners
├── Makefile # Developer commands automation
└── pyproject.toml # Python dependencies, Ruff, MyPy, and Pytest configs
- Docker & Docker Compose
- Python 3.12+
- Node.js 20+
- Copy the example environment file with `cp .env.example .env`, then open `.env` and add your `GOOGLE_API_KEY`, Neon DB URL, etc.
- Install dependencies with `make install`. This runs `pip install -e ".[dev]"`, `npm ci` in frontend, and installs pre-commit hooks.
- Start Postgres, Redis, and Qdrant locally with `make dev`.
  - Backend runs on http://localhost:8000
  - Frontend runs on http://localhost:3000
- Seed the corpus with `make seed-corpus`. This script downloads the local FastEmbed models and populates Qdrant.
| Variable | Required | Description | Example |
|---|---|---|---|
| `GOOGLE_API_KEY` | Yes | Gemini API key for LLM inference | `AIzaSy...` |
| `DATABASE_URL` | Yes | Neon PostgreSQL connection string | `postgresql://user:pass@ep-rest.neon.tech/db` |
| `QDRANT_URL` | Yes | Qdrant Cloud or local URL | `http://localhost:6333` |
| `QDRANT_API_KEY` | No | Required if using Qdrant Cloud | `secret_key` |
| `UPSTASH_REDIS_REST_URL` | Yes | Redis URL | `https://eu1-cool-redis.upstash.io` |
| `UPSTASH_REDIS_REST_TOKEN` | Yes | Redis auth token | `token123` |
| `LANGCHAIN_API_KEY` | No | LangSmith tracing key (Dev only) | `ls__...` |
| `NEXT_PUBLIC_BACKEND_URL` | No | Frontend pointer to API | `http://localhost:8000` |
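
A startup check can catch a missing variable before the first request fails. The following is a minimal sketch, not the project's actual settings loader. The `missing_settings` helper is hypothetical, and treating any non-localhost `QDRANT_URL` as Qdrant Cloud is an assumption made for illustration.

```python
from typing import Mapping

# Variables marked "Yes" in the table above.
REQUIRED = (
    "GOOGLE_API_KEY",
    "DATABASE_URL",
    "QDRANT_URL",
    "UPSTASH_REDIS_REST_URL",
    "UPSTASH_REDIS_REST_TOKEN",
)


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    # Assumption: a non-localhost Qdrant URL means Qdrant Cloud, which needs a key.
    url = env.get("QDRANT_URL", "")
    if url and "localhost" not in url and not env.get("QDRANT_API_KEY", "").strip():
        missing.append("QDRANT_API_KEY")
    return missing
```

Passing `os.environ` to `missing_settings` at startup and exiting when the list is non-empty gives a clear error message instead of a connection failure later in the pipeline.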
`GET /api/v1/health`
- Response: `200 OK`
- Returns: JSON object with `status` and `version` to confirm backend wake-up.

`POST /api/v1/reviews/stream`
- Request Body: `{"architecture_description": "NextJS frontend, Python backend..."}`
- Response: `text/event-stream`
- Events:
  - `server:warming`: Backend is initializing.
  - `agent:thinking`: Specialist agent started execution.
  - `agent:output`: Specialist agent completed review chunk.
  - `judge:scores`: Final aggregated scores.
  - `review:complete`: Stream finalized with complete JSON dump.
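
For clients other than the bundled frontend, the stream can be read with a small SSE frame parser. This is a minimal sketch that follows the standard `event:` / `data:` framing. The event names come from the list above, but the JSON payload shapes in `SAMPLE` are hypothetical, since the actual payload schema is not documented here.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event, data) pairs from SSE text lines; a blank line ends a frame."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            # Split on the first colon only, so "agent:thinking" stays intact.
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical payloads, for illustration only.
SAMPLE = (
    "event: agent:thinking\n"
    'data: {"agent": "security"}\n'
    "\n"
    "event: judge:scores\n"
    'data: {"overall": 82}\n'
    "\n"
)

for name, payload in parse_sse(SAMPLE.splitlines()):
    print(name, json.loads(payload))
```

In a real client, the lines would come from a streaming HTTP response instead of `SAMPLE.splitlines()`, and the loop would dispatch on the event name.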
The orchestration flow is powered by LangGraph.
- Supervisor Agent: Parses user input to extract the tech stack and decides which specialists to activate.
- Specialist Agents:
- Scalability: Evaluates bottleneck mitigation and horizontal scaling.
- Security: Analyzes auth flows, data at rest, and attack vectors.
- Database: Reviews schema design, caching layers, and transaction isolation.
- Cost: Flags expensive cloud primitives and suggests budget alternatives.
- DevOps: Reviews deployment, CI/CD, and observability.
- Judge Agent: Aggregates all specialist findings, guarantees formatting consistency, and assigns an overarching score.
Note: There are currently no human-in-the-loop checkpoints required for this read-only review pipeline.
- Ingestion: Raw architectural documentation is read via `backend.scripts.seed_corpus`.
- Embeddings: Uses `BAAI/bge-small-en-v1.5` via the `fastembed` Python package. This runs entirely locally to save API costs.
- Vector DB: Chunks are stored in Qdrant.
- Retrieval: During review execution, each specialist agent invokes the retriever to fetch relevant context, which is then passed to the Gemini LLM as part of the system prompt.
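
The retrieve-then-prompt step can be illustrated without any external services. This is a toy sketch: in the real pipeline the vectors come from `fastembed` and the nearest-neighbour search is done by Qdrant, whereas here both are replaced by hand-written 3-dimensional vectors and a brute-force cosine ranking. The function names, the corpus, and the prompt wording are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def retrieve(query_vec: list[float], corpus: list[dict], k: int = 2) -> list[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(corpus, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return [c["text"] for c in ranked[:k]]


def build_system_prompt(role: str, passages: list[str]) -> str:
    """Place retrieved passages in the system prompt for one specialist."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"You are the {role} reviewer. Ground your review in:\n{context}"


# Made-up vectors standing in for real embeddings.
CORPUS = [
    {"text": "Use read replicas for read-heavy workloads.", "vector": [1.0, 0.0, 0.0]},
    {"text": "Encrypt data at rest.", "vector": [0.0, 1.0, 0.0]},
    {"text": "Cache hot keys in Redis.", "vector": [0.8, 0.0, 0.6]},
]

prompt = build_system_prompt("Database", retrieve([0.9, 0.0, 0.1], CORPUS))
```

Only the top-k passages reach the prompt, which keeps the context short and focused on the specialist's domain.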
This project uses containerized deployment optimized for free-tier platforms.
- Frontend: Can be deployed seamlessly to Vercel via GitHub integrations.
- Backend: Containerized via `infra/Dockerfile.backend` and deployed to Render.
- Cold starts on Render: Because the Render free tier sleeps after 15 minutes of inactivity, the frontend sends a pre-flight `/api/v1/health` request and displays a "Warming up the server..." state until the container boots (~30s).
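
The same wait-for-wake-up logic is useful in scripts or CI jobs that call a sleeping backend. The frontend implements this in the browser, so the Python sketch below is only an illustration of the polling pattern. `wait_until_healthy` and `http_probe` are hypothetical helpers, and the timeout and interval defaults are assumptions.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if the health endpoint answers 200 within 5 seconds."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe() until it succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

A caller might run `wait_until_healthy(lambda: http_probe("http://localhost:8000/api/v1/health"))` before posting a review. Injecting `probe` and `sleep` keeps the loop testable without a network.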
- Linting: Python code uses `Ruff`. TypeScript uses `ESLint`.
- Type Checking: Python uses `MyPy` (`make typecheck`). TypeScript uses `tsc`.
- Formatting: Python uses `Ruff format`. Frontend uses `Prettier` (`npm run format`).
- Testing: `pytest`. Must maintain >80% coverage.
- Branching: Branch off `main`, use semantic commit messages (e.g., `feat:`, `fix:`).
If the frontend drops the stream connection with a timeout error on the very first run, `fastembed` is probably still downloading the `bge-small` embedding model locally.
Fix: Wait 1-2 minutes for the download to finish, then hit submit again.
If you run `pytest tests` and get connection refused errors, the Docker services are probably not running.
Fix: Run `make dev` in a separate terminal before running integration tests.
Ensure you are using a valid Google AI Studio key and that it supports the `gemini-flash-latest` model.
If `npm run lint` fails with `'_err' is defined but never used`, use the ES2019 optional catch binding and omit the variable: `catch { console.error("error") }`.
- Multi-agent LangGraph execution pipeline
- Next.js dynamic SSE streaming frontend
- Pytest test suite with >80% coverage
- Docker + Render deployment configurations
- Agent evaluations using RAGAS
- Mermaid.js auto-generation from the Judge agent rendered dynamically in the UI
- Frontend unit testing suite (Jest / React Testing Library)
- User authentication and saving review histories
- Fork the repo and create your branch (`git checkout -b feature/my-feature`).
- Run `make install` to set up pre-commit hooks.
- Make your changes and ensure `make lint`, `make typecheck`, and `make test` all pass.
- If modifying the frontend, ensure `cd frontend && npm run lint` and `npm run format:check` pass.
- Submit a PR outlining your architectural changes.