
RAG AI Agent Monorepo

A production-ready Retrieval-Augmented Generation (RAG) AI agent built with LangGraph Server, LangChain, LlamaIndex, FAISS, and OpenAI, featuring the official Agent Chat UI plus comprehensive analytics and monitoring.

Build your own AI agent on a custom knowledge base: a production-ready solution for deploying conversational AI applications with document-retrieval capabilities.

License: MIT · Python 3.11+ · Production Ready · Enterprise Grade · Docker

📸 Screenshots

💬 AI Chat Interface - Real-time Streaming Responses


📊 Analytics Dashboard - Performance Monitoring with Grafana


🎥 Live Demo

Try it now: AVideo AI Agent - A fully functional AI assistant specialized in the AVideo Platform documentation, demonstrating enterprise-grade RAG capabilities with 500+ documents indexed.

🎯 Key Highlights

Why This Project Stands Out:

  • ⚡ Performance Optimized: Incremental document processing with SHA256 hash tracking - only re-processes changed files, saving compute time and costs
  • 🏢 Production-Grade Architecture: Enterprise-ready with ClickHouse analytics processing millions of events/second, comprehensive error tracking, and Grafana dashboards
  • 🔐 Security-First Design: Implements API authentication, rate limiting (20 req/min default), input validation, and cost monitoring to prevent abuse
  • 📊 Observable & Debuggable: Real-time metrics tracking response times (p50/p90/p95/p99 percentiles), token usage, tool calls, and session analytics
  • 🔧 Developer Experience: Single .env configuration, automatic document ingestion on startup, hot reload in dev mode, comprehensive error logging
  • 🚀 Scalable Foundation: Containerized microservices architecture ready for horizontal scaling, with async operations throughout
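The incremental processing described above boils down to comparing a stored hash per file against the current one and re-processing only the files that differ. A minimal sketch (the manifest format and function names are illustrative, not the project's actual code):

```python
import hashlib
import json
from pathlib import Path

def file_sha256(path: Path) -> str:
    """Hash a file's contents so changes can be detected cheaply."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()

def changed_files(docs_dir: Path, manifest_path: Path) -> list[Path]:
    """Return only the documents whose hash differs from the stored manifest."""
    manifest = {}
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    changed = []
    for doc in sorted(docs_dir.glob("*")):
        if doc.suffix.lower() not in {".pdf", ".md"}:
            continue
        digest = file_sha256(doc)
        if manifest.get(doc.name) != digest:
            # New or modified file: queue it and record the fresh hash.
            changed.append(doc)
            manifest[doc.name] = digest
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return changed
```

Unchanged files never reach the embedding step, which is where the compute and API-cost savings come from.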

Technical Achievements:

  • Built complete RAG pipeline from scratch (ingestion → embedding → retrieval → generation)
  • Integrated 5 major technologies (LangGraph, LangChain, LlamaIndex, FAISS, OpenAI) into cohesive system
  • Implemented professional analytics stack comparable to commercial products
  • Designed modular architecture enabling easy customization for different knowledge domains

✨ Core Features

  • 🤖 Production-Ready Agent: LangGraph-powered agent with streaming responses and conversation memory
  • 📚 Smart RAG Pipeline: LlamaIndex document parsing (PDF/Markdown) → OpenAI embeddings → FAISS vector search with similarity ranking
  • 💬 Modern Chat UI: Official LangChain Agent Chat interface with real-time WebSocket streaming
  • 📊 Enterprise Analytics: ClickHouse (10M+ events/sec capability) + Grafana dashboards with 90-day retention and auto-cleanup
  • 🔒 Security Hardened: JWT authentication, configurable rate limiting, input sanitization, and usage cost tracking
  • 🐳 Fully Containerized: 6 microservices orchestrated with Docker Compose, health checks, and graceful shutdown
  • 🚀 Zero-Config Start: Single .env file, automatic document ingestion, and pre-configured analytics dashboards

πŸ—οΈ Architecture

See the detailed architecture documentation, scalability considerations, and design decisions in ARCHITECTURE.md.

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│                 │     │                 │     │                 │
│  Agent Chat UI  │────▶│  LangGraph      │────▶│   RAG Agent     │
│   (Next.js)     │ WS  │  Server         │     │   (LangGraph)   │
│   Port 3001     │◀────│  Port 2024      │◀────│                 │
└─────────────────┘     └─────────┬───────┘     └────────┬────────┘
                                  │                      │
                                  │                      ▼
        ┌─────────────────────────┤              ┌───────────────┐
        │                         │              │  search_docs  │
        │                         │              │     Tool      │
        │                         │              └───────┬───────┘
        ▼                         ▼                      │
┌──────────────┐         ┌──────────────┐                ▼
│   Grafana    │◀────────│  ClickHouse  │        ┌───────────────┐
│  Dashboards  │         │  Analytics   │◀───────│     FAISS     │
│  Port 3002   │         │      DB      │        │ Vector Store  │
└──────────────┘         └──────────────┘        └───────┬───────┘
                                                         │
                         ┌───────────────────────────────┤
                         │                               │
                         ▼                               ▼
                 ┌───────────────┐               ┌───────────────┐
                 │    OpenAI     │               │   Ingestion   │
                 │    LLM API    │               │   Pipeline    │
                 └───────────────┘               └───────┬───────┘
                                                         │
                                                 ┌───────┴───────┐
                                                 │ PDF/MD Files  │
                                                 │  ./data/docs  │
                                                 └───────────────┘

🔄 How It Works

Document Ingestion Pipeline:

  1. PDF/Markdown → Extract text → Chunk into segments → Generate embeddings → Store in FAISS
  2. SHA256 hash tracking enables incremental updates (only changed files are re-processed)
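The chunking step in (1) can be sketched as a fixed-size split with overlap; the real pipeline uses LlamaIndex's parsers, so the sizes here are illustrative defaults:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split extracted text into overlapping segments for embedding.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from either neighboring chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # advance by the non-overlapping part
    return chunks
```

Each chunk is then embedded once and stored in FAISS alongside its source-document metadata.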

RAG Agent Flow:

  1. User question → LangGraph routes to agent → Agent uses search_documents tool
  2. FAISS similarity search → Retrieve relevant chunks → LLM generates answer with citations
  3. Response streams back to UI in real-time
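Conceptually, the similarity search in step 2 ranks stored chunk embeddings against the query embedding. A toy cosine-similarity version of what FAISS does over an optimized index structure:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k chunks most similar to the query."""
    scored = sorted(
        ((cosine(query_vec, v), cid) for cid, v in chunk_vecs.items()),
        reverse=True,
    )
    return [cid for _, cid in scored[:k]]
```

FAISS replaces this linear scan with index structures that stay fast at millions of vectors, but the ranking idea is identical.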

📋 Prerequisites

  • Docker & Docker Compose (handles all dependencies)
  • OpenAI API Key (create one at platform.openai.com)

πŸ› οΈ Quick Start

1. Clone and Setup Environment

# Clone the repository
git clone https://github.com/YPT-ME/AIAgent/
cd AIAgent

# Copy and configure the SINGLE .env file (at project root)
cp .env.example .env

# Edit .env and add your OpenAI API key
# OPENAI_API_KEY=sk-...

Note: All environment variables are now centralized in the root .env file. No need for separate .env files in backend/ or ui/ directories.

2. Add Your Documents

Important: Add your documents BEFORE starting the services for automatic ingestion.

# Add your PDF or Markdown files to data/docs/
cp your-documents.pdf data/docs/
cp your-documentation.md data/docs/

# Supported formats: PDF (.pdf) and Markdown (.md)

3. Run with Docker Compose (Recommended)

# Build and start all services
docker compose up --build

# Or run in detached mode
docker compose up --build -d

The services will be available at:

  • Agent Chat UI: http://localhost:3001
  • LangGraph Server: http://localhost:2024
  • Grafana Dashboards: http://localhost:3002

Note: Documents in data/docs/ are automatically ingested on first startup!

4. Verify Ingestion (Optional)

# Check ingestion status
docker compose exec langgraph-server python -m src.ingestion.ingest --status

# Manually re-ingest documents if needed
docker compose exec langgraph-server python -m src.ingestion.ingest --docs-dir /app/data/docs

🔧 Local Development

For detailed development setup, testing, and contribution guidelines, see CONTRIBUTING.md.

🎨 Customizing the Agent

You can customize your agent's behavior by creating your own system prompt:

# Copy the example template
cp backend/system_prompt.example.txt backend/system_prompt.txt

# Edit with your custom instructions
nano backend/system_prompt.txt

# Restart the backend to apply changes
docker compose restart langgraph-server

What you can customize:

  • Agent's personality and tone
  • How it handles different types of questions
  • Citation formats and source references
  • Specific domain knowledge or rules
  • Response style and structure

If backend/system_prompt.txt doesn't exist, the agent will automatically use backend/system_prompt.example.txt as a fallback.
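That fallback fits in a few lines; a sketch with the paths from this README (the project's actual loader may differ):

```python
from pathlib import Path

def load_system_prompt(
    custom: str = "backend/system_prompt.txt",
    fallback: str = "backend/system_prompt.example.txt",
) -> str:
    """Use the custom prompt when present, else fall back to the example."""
    path = Path(custom)
    if not path.exists():
        path = Path(fallback)
    return path.read_text(encoding="utf-8")
```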

🔌 API Reference

LangGraph Server provides RESTful endpoints for thread management, streaming responses, and health checks.

📚 Full API Documentation
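Against a locally running server on port 2024, a typical interaction creates a thread and then streams a run into it. The endpoint paths below follow the LangGraph Server API, but treat the exact payload shape and the assistant id as assumptions to verify against the generated API docs:

```python
import json
import urllib.request

def build_run_payload(assistant_id: str, question: str) -> dict:
    """Request body for POST /threads/{thread_id}/runs/stream."""
    return {
        "assistant_id": assistant_id,
        "input": {"messages": [{"role": "user", "content": question}]},
        "stream_mode": "messages",
    }

def create_thread(base_url: str = "http://localhost:2024") -> str:
    """Create a conversation thread and return its id (requires a live server)."""
    req = urllib.request.Request(
        f"{base_url}/threads",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["thread_id"]
```

In practice the official LangGraph SDK (`langgraph_sdk`) wraps these calls; the raw HTTP version is shown only to make the request shape visible.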

🐳 Useful Commands

# View logs
docker compose logs -f

# Restart services
docker compose restart

# Stop all services
docker compose down

📊 Analytics & Monitoring

Built-in real-time analytics with ClickHouse + Grafana:

  • 📈 Response times, token usage, tool calls
  • 👥 User engagement and session tracking
  • 🚨 Error monitoring and alerting
  • ⚡ Handles millions of events/second

Access Dashboard: http://localhost:3002 (admin/admin)

📚 Detailed Analytics Guide

βš™οΈ Configuration

Required:

  • OPENAI_API_KEY - Your OpenAI API key
  • LANGGRAPH_API_KEY - Server authentication (min 32 chars)

Optional: Customize models, chunk sizes, rate limits, analytics settings, and more.

📋 Complete Environment Variables Reference
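The two required variables can be validated at startup along the lines below; the key names and the 32-character minimum come from this README, while the function itself is an illustrative sketch:

```python
import os

def load_required_config() -> dict[str, str]:
    """Fail fast if the required environment variables are missing or weak."""
    openai_key = os.environ.get("OPENAI_API_KEY", "")
    server_key = os.environ.get("LANGGRAPH_API_KEY", "")
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    if len(server_key) < 32:
        raise RuntimeError("LANGGRAPH_API_KEY must be at least 32 characters")
    return {"OPENAI_API_KEY": openai_key, "LANGGRAPH_API_KEY": server_key}
```

Failing fast at startup beats discovering a missing key on the first user request.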

πŸ—οΈ Architecture

See detailed system architecture, scalability considerations, and design decisions:

πŸ“š ARCHITECTURE.md

Technology Stack

Backend: LangGraph, LangChain, LlamaIndex, FAISS, OpenAI, Python 3.11+
Frontend: Next.js, TypeScript, Agent Chat UI
Analytics: ClickHouse, Grafana
Infrastructure: Docker, Docker Compose

🤝 Contributing

Want to contribute? Read CONTRIBUTING.md • Code of Conduct

🔒 Security

Security is important to us. Please review our Security Policy for:

  • Reporting vulnerabilities
  • Security best practices
  • API key management
  • Production security guidelines

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🌟 Show Your Support

If this project helped you, please:

  • ⭐ Star this repository
  • πŸ› Report bugs and suggest features
  • 🀝 Contribute code improvements
  • πŸ“– Share with others learning AI development

📞 Contact & Discussion

πŸ‘¨β€πŸ’» About the Author

Daniel Neto - Full-stack developer specializing in AI/ML applications and video platform solutions.

πŸ™ Acknowledgments


🏷️ Keywords & Topics

RAG Retrieval-Augmented-Generation AI-Agent LangChain LangGraph OpenAI GPT-4 LLM Chatbot Virtual-Assistant Document-QA Knowledge-Base Vector-Database FAISS Embeddings Semantic-Search LlamaIndex Python TypeScript Next.js Docker Microservices ClickHouse Grafana Analytics Real-Time Streaming WebSocket Enterprise-AI Production-Ready Chat-UI PDF-Parser Markdown NLP Machine-Learning AI-Application Conversational-AI LangGraph-Server FastAPI React Full-Stack DevOps Monitoring Observability


Built with ❤️ as a production-ready foundation for building custom AI agents
