A production-ready Retrieval-Augmented Generation (RAG) AI Agent built with LangGraph Server, LangChain, LlamaIndex, FAISS, and OpenAI. Features official Agent Chat UI with comprehensive analytics and monitoring.
Build your own AI agent with a custom knowledge base: a production-ready solution for deploying conversational AI applications with document retrieval capabilities.
Try it now: AVideo AI Agent - A fully functional AI assistant specialized in the AVideo Platform documentation, demonstrating enterprise-grade RAG capabilities with 500+ documents indexed.
Why This Project Stands Out:
- ⚡ Performance Optimized: Incremental document processing with SHA256 hash tracking; only changed files are re-processed, saving compute time and costs
- 🏢 Production-Grade Architecture: Enterprise-ready, with ClickHouse analytics capable of ingesting millions of events per second, comprehensive error tracking, and Grafana dashboards
- 🔒 Security-First Design: Implements API authentication, rate limiting (20 req/min by default), input validation, and cost monitoring to prevent abuse
- 📊 Observable & Debuggable: Real-time metrics tracking response times (p50/p90/p95/p99 percentiles), token usage, tool calls, and session analytics
- 🔧 Developer Experience: Single `.env` configuration, automatic document ingestion on startup, hot reload in dev mode, and comprehensive error logging
- 📈 Scalable Foundation: Containerized microservices architecture ready for horizontal scaling, with async operations throughout
Technical Achievements:
- Built a complete RAG pipeline from scratch (ingestion → embedding → retrieval → generation)
- Integrated five major technologies (LangGraph, LangChain, LlamaIndex, FAISS, OpenAI) into a cohesive system
- Implemented professional analytics stack comparable to commercial products
- Designed modular architecture enabling easy customization for different knowledge domains
- 🤖 Production-Ready Agent: LangGraph-powered agent with streaming responses and conversation memory
- 📚 Smart RAG Pipeline: LlamaIndex document parsing (PDF/Markdown) → OpenAI embeddings → FAISS vector search with similarity ranking
- 💬 Modern Chat UI: Official LangChain Agent Chat interface with real-time WebSocket streaming
- 📈 Enterprise Analytics: ClickHouse (capable of 10M+ events/sec) + Grafana dashboards with 90-day retention and auto-cleanup
- 🔐 Security Hardened: JWT authentication, configurable rate limiting, input sanitization, and usage cost tracking
- 🐳 Fully Containerized: 6 microservices orchestrated with Docker Compose, with health checks and graceful shutdown
- 🚀 Zero-Config Start: Single `.env` file, automatic document ingestion, and pre-configured analytics dashboards
- 🐳 Docker Ready: Complete containerized deployment
- 📋 Easy Setup: Single `.env` configuration for all services
See detailed architecture documentation in ARCHITECTURE.md.
```
┌─────────────────┐       ┌─────────────────┐       ┌─────────────────┐
│                 │       │                 │       │                 │
│  Agent Chat UI  │──────▶│    LangGraph    │──────▶│    RAG Agent    │
│   (Next.js)     │  WS   │     Server      │       │   (LangGraph)   │
│   Port 3001     │◀──────│    Port 2024    │◀──────│                 │
└─────────────────┘       └────────┬────────┘       └────────┬────────┘
                                   │                         │
                                   │                         ▼
                                   │                 ┌───────────────┐
                                   │                 │  search_docs  │
                                   │                 │     Tool      │
                                   │                 └───────┬───────┘
          ┌────────────────────────┤                         │
          ▼                        ▼                         ▼
┌──────────────┐          ┌──────────────┐          ┌───────────────┐
│   Grafana    │◀─────────│  ClickHouse  │          │     FAISS     │
│  Dashboards  │          │  Analytics   │          │  Vector Store │
│  Port 3002   │          │      DB      │          └───────┬───────┘
└──────────────┘          └──────────────┘                  │
                               ┌────────────────────────────┤
                               ▼                            ▼
                      ┌───────────────┐            ┌───────────────┐
                      │    OpenAI     │            │   Ingestion   │
                      │    LLM API    │            │   Pipeline    │
                      └───────────────┘            └───────┬───────┘
                                                           │
                                                   ┌───────▼───────┐
                                                   │  PDF/MD Files │
                                                   │  ./data/docs  │
                                                   └───────────────┘
```
Document Ingestion Pipeline:
- PDF/Markdown → Extract text → Chunk into segments → Generate embeddings → Store in FAISS
- SHA256 hash tracking enables incremental updates (only changed files are re-processed)
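The hash-tracking step can be sketched in Python. This is a minimal illustration of the idea, not the project's actual implementation; the manifest filename and location are assumptions.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file's contents so changes can be detected cheaply."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(docs_dir: Path, manifest_path: Path) -> list[Path]:
    """Return only the documents whose hash differs from the stored manifest,
    then update the manifest. Unchanged files are skipped entirely."""
    seen = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    changed = []
    for doc in sorted(docs_dir.glob("*")):
        if doc.suffix.lower() not in {".pdf", ".md"}:
            continue
        digest = file_sha256(doc)
        if seen.get(doc.name) != digest:
            changed.append(doc)
            seen[doc.name] = digest
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps(seen, indent=2))
    return changed
```

Running ingestion twice over the same directory makes the second pass a no-op until a file's contents actually change.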
RAG Agent Flow:
- User question → LangGraph routes to the agent → Agent uses the `search_documents` tool
- FAISS similarity search → Retrieve relevant chunks → LLM generates an answer with citations
- Response streams back to UI in real-time
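Conceptually, the similarity-search step ranks stored chunk embeddings by how close they are to the query embedding and keeps the top matches. Here is a dependency-free sketch of that ranking; FAISS performs the same operation at scale with optimized indexes, and the function names and toy vectors below are illustrative only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec: list[float],
          chunks: list[tuple[str, list[float]]],
          k: int = 3) -> list[str]:
    """Return the k chunk texts whose embeddings are most similar to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The retrieved texts are then passed to the LLM as context so the generated answer can cite its sources.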
- Docker & Docker Compose (handles all dependencies)
- OpenAI API Key (Get one here)
```bash
# Clone the repository
git clone https://github.com/YPT-ME/AIAgent/
cd AIAgent

# Copy and configure the SINGLE .env file (at the project root)
cp .env.example .env

# Edit .env and add your OpenAI API key
# OPENAI_API_KEY=sk-...
```

Note: All environment variables are now centralized in the root `.env` file. There is no need for separate `.env` files in the backend/ or ui/ directories.
Important: Add your documents BEFORE starting the services for automatic ingestion.
```bash
# Add your PDF or Markdown files to data/docs/
cp your-documents.pdf data/docs/
cp your-documentation.md data/docs/

# Supported formats: PDF (.pdf) and Markdown (.md)
```

```bash
# Build and start all services
docker compose up --build

# Or run in detached mode
docker compose up --build -d
```

The services will be available at:
- LangGraph Server: http://localhost:2024
- Agent Chat UI: http://localhost:3001
- Analytics API: http://localhost:9081
- Grafana Dashboard: http://localhost:3002 (admin/admin)
- ClickHouse: http://localhost:8123
Note: Documents in data/docs/ are automatically ingested on first startup!
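The services above are wired together with Docker Compose. As a rough sketch of what such a compose file can look like, with the ports from the list above: the service names, build paths, and images here are assumptions for illustration, not the project's actual `docker-compose.yml`.

```yaml
# Illustrative sketch only — consult the repository's real docker-compose.yml.
services:
  langgraph-server:
    build: ./backend            # assumed build context
    ports: ["2024:2024"]
    env_file: .env
    volumes:
      - ./data/docs:/app/data/docs   # documents ingested on startup
  agent-chat-ui:
    build: ./ui                 # assumed build context
    ports: ["3001:3000"]
    depends_on: [langgraph-server]
  analytics-api:
    build: ./analytics          # assumed build context
    ports: ["9081:9081"]
    depends_on: [clickhouse]
  clickhouse:
    image: clickhouse/clickhouse-server
    ports: ["8123:8123"]
  grafana:
    image: grafana/grafana
    ports: ["3002:3000"]
    depends_on: [clickhouse]
```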
```bash
# Check ingestion status
docker compose exec langgraph-server python -m src.ingestion.ingest --status

# Manually re-ingest documents if needed
docker compose exec langgraph-server python -m src.ingestion.ingest --docs-dir /app/data/docs
```

For detailed development setup, testing, and contribution guidelines, see:
- CONTRIBUTING.md - Development workflows
- ARCHITECTURE.md - Technical deep dive
You can customize your agent's behavior by creating your own system prompt:
```bash
# Copy the example template
cp backend/system_prompt.example.txt backend/system_prompt.txt

# Edit it with your custom instructions
nano backend/system_prompt.txt

# Restart the backend to apply changes
docker compose restart langgraph-server
```

What you can customize:
- Agent's personality and tone
- How it handles different types of questions
- Citation formats and source references
- Specific domain knowledge or rules
- Response style and structure
If backend/system_prompt.txt doesn't exist, the agent will automatically use backend/system_prompt.example.txt as a fallback.
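The fallback behavior can be expressed as a short loader. This is a sketch of the described logic, not the project's actual code; the final default string is a placeholder of my own, used only if neither file exists.

```python
from pathlib import Path

# Placeholder default — not the project's real built-in prompt.
DEFAULT_PROMPT = "You are a helpful documentation assistant."


def load_system_prompt(backend_dir: Path) -> str:
    """Prefer system_prompt.txt, fall back to system_prompt.example.txt,
    and use a built-in default only if neither file exists."""
    for name in ("system_prompt.txt", "system_prompt.example.txt"):
        candidate = backend_dir / name
        if candidate.exists():
            return candidate.read_text(encoding="utf-8").strip()
    return DEFAULT_PROMPT
```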
LangGraph Server provides RESTful endpoints for thread management, streaming responses, and health checks.
```bash
# View logs
docker compose logs -f

# Restart services
docker compose restart

# Stop all services
docker compose down
```

Built-in real-time analytics with ClickHouse + Grafana:
- 📊 Response times, token usage, and tool calls
- 👥 User engagement and session tracking
- 🚨 Error monitoring and alerting
- ⚡ Handles millions of events per second
Access Dashboard: http://localhost:3002 (admin/admin)
Required:
- `OPENAI_API_KEY` - Your OpenAI API key
- `LANGGRAPH_API_KEY` - Server authentication key (minimum 32 characters)

Optional: Customize models, chunk sizes, rate limits, analytics settings, and more.
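A minimal `.env` might look like the following. Only the two required keys come from this document; the commented-out optional names are illustrative guesses, so check the full environment variables reference for the real ones.

```ini
# Required
OPENAI_API_KEY=sk-...
LANGGRAPH_API_KEY=<random-string-of-at-least-32-characters>

# Optional (variable names below are illustrative, not confirmed)
# OPENAI_MODEL=gpt-4o
# CHUNK_SIZE=1000
# RATE_LIMIT_PER_MINUTE=20
```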
📖 Complete Environment Variables Reference
See detailed system architecture, scalability considerations, and design decisions:
📖 ARCHITECTURE.md
Backend: LangGraph, LangChain, LlamaIndex, FAISS, OpenAI, Python 3.11+
Frontend: Next.js, TypeScript, Agent Chat UI
Analytics: ClickHouse, Grafana
Infrastructure: Docker, Docker Compose
Want to contribute? Read CONTRIBUTING.md • Code of Conduct
Security is important to us. Please review our Security Policy for:
- Reporting vulnerabilities
- Security best practices
- API key management
- Production security guidelines
This project is licensed under the MIT License - see the LICENSE file for details.
If this project helped you, please:
- ⭐ Star this repository
- 🐛 Report bugs and suggest features
- 🤝 Contribute code improvements
- 📢 Share with others learning AI development
- Issues: GitHub Issues
- Email: developer@ypt.me
Daniel Neto - Full-stack developer specializing in AI/ML applications and video platform solutions.
- 🌐 GitHub: @DanielnetoDotCom
- 💼 Project Maintainer & Lead Developer
- LangChain - LLM application framework
- LangGraph - Agent orchestration
- Agent Chat UI - Official chat interface
- LlamaIndex - Document parsing
- FAISS - Vector search
- OpenAI - LLM provider
- ClickHouse - Analytics database
- Grafana - Monitoring dashboards
RAG Retrieval-Augmented-Generation AI-Agent LangChain LangGraph OpenAI GPT-4 LLM Chatbot Virtual-Assistant Document-QA Knowledge-Base Vector-Database FAISS Embeddings Semantic-Search LlamaIndex Python TypeScript Next.js Docker Microservices ClickHouse Grafana Analytics Real-Time Streaming WebSocket Enterprise-AI Production-Ready Chat-UI PDF-Parser Markdown NLP Machine-Learning AI-Application Conversational-AI LangGraph-Server FastAPI React Full-Stack DevOps Monitoring Observability
Built with ❤️ as a production-ready foundation for building custom AI agents
