A full-stack Retrieval-Augmented Generation (RAG) chat application that enables users to interact with an AI assistant using their own documents and web sources as context.
- Interactive Chat: Real-time conversation with AI assistant
- Multi-Format Support: PDFs, DOCX, images (with OCR), and web URLs
- Swappable LLM Models: Switch between LLM models to suit different use cases (see the sketch after this list)
- Advanced Model Features: Select models include built-in web search, reasoning, code execution, and more
- Session Management: Per-user isolated conversations and documents
- Markdown and LaTeX Support: Rich chats with support for bullet points, bold/italic text, code blocks, and complex math using LaTeX
- Vector Search: Document chunking and retrieval using embeddings
- RAG Pipeline: Intelligent context retrieval with overflow prevention
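By way of illustration, a single chat turn against GroqCloud looks roughly like this (a minimal sketch using the groq Python SDK; the model name is just the sample default from the configuration below, not necessarily what the app uses at runtime):

```python
# Minimal GroqCloud chat call (assumes `pip install groq` and GROQ_API_KEY set).
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

completion = client.chat.completions.create(
    model="llama-3.1-70b-versatile",  # swappable per request
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
    temperature=0.7,
)
print(completion.choices[0].message.content)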
- Framework: Flask
- LLM: GroqCloud API
- Embeddings: HuggingFace (all-mpnet-base-v2, etc.)
- Vector Store: Chroma with persistent storage (see the retrieval sketch after this list)
- Document Processing: pypdf, python-docx, EasyOCR
- Web Scraping: Dynamic JS detection and website loading with Selenium + BeautifulSoup
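To make the retrieval path concrete, here is a minimal sketch of the chunk-embed-retrieve flow with Chroma and a sentence-transformers model (illustrative only; the project's actual chroma.py/embeddings.py may differ):

```python
# Sketch of chunking + vector retrieval (assumes `pip install chromadb sentence-transformers`).
# Defaults mirror the .env sample below: CHUNK_SIZE=300, CHUNK_OVERLAP=100, RAG_TOP_K=5.
import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")
client = chromadb.PersistentClient(path="chroma")  # persistent storage on disk
collection = client.get_or_create_collection("docs")

def chunk(text: str, size: int = 300, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def add_document(name: str, text: str) -> None:
    """Embed a document's chunks and store them in the collection."""
    chunks = chunk(text)
    collection.add(
        ids=[f"{name}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=model.encode(chunks).tolist(),
    )

def retrieve(query: str, top_k: int = 5) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    result = collection.query(
        query_embeddings=model.encode([query]).tolist(),
        n_results=top_k,
    )
    return result["documents"][0]
```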
- Framework: React + TypeScript
- Build Tool: Vite
- Styling: Tailwind CSS
- Components: ShadCN UI
- Markdown Rendering: React Markdown + Syntax Highlighting
```
llm/
├── backend/                  # Python Flask API
│   ├── app.py                # Main Flask application & route handlers
│   ├── llm.py                # LLM chat and retrieval logic
│   ├── chroma.py             # Vector database management
│   ├── db.py                 # Session database caching
│   ├── history.py            # Chat history with timestamps
│   ├── embeddings.py         # Embedding initialization
│   ├── extractors.py         # Document extraction (PDF, DOCX, OCR, URLs)
│   ├── logger.py             # Centralized logging
│   ├── config.py             # Configuration management
│   ├── app_types.py          # Type definitions (renamed to avoid shadowing stdlib)
│   └── requirements.txt      # Python dependencies
│
├── frontend/                 # React TypeScript application
│   ├── src/
│   │   ├── App.tsx           # Main application component
│   │   ├── main.tsx          # Entry point
│   │   ├── constants.tsx     # Frontend constants
│   │   ├── components/
│   │   │   ├── chat-interface.tsx   # Main chat UI
│   │   │   ├── chat-input.tsx       # Message input & source selection
│   │   │   ├── chat-messages.tsx    # Message display
│   │   │   ├── source-manager.tsx   # File/URL management
│   │   │   └── ui/                  # Reusable UI components
│   │   ├── lib/
│   │   │   └── utils.ts      # Utility functions
│   │   └── types/
│   │       ├── chat.ts       # Chat message typedef
│   │       └── session.ts    # Session typedef
│   ├── package.json
│   ├── vite.config.ts
│   └── tsconfig.json
├── .env.sample               # Sample environment file
├── start.sh                  # One-click script to set up and run the entire project
├── stop.sh                   # One-click script to stop the project
└── README.md                 # This file
```
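For orientation, the kind of route app.py exposes looks roughly like this (an illustrative sketch only, not the actual implementation):

```python
# Illustrative Flask route in the style of backend/app.py (not the real code).
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api/chat", methods=["POST"])
def chat():
    # Per-user isolation: each request carries a Session-ID header
    session_id = request.headers.get("Session-ID", "default")
    payload = request.get_json(force=True)
    # ...retrieve context for payload["selectedSources"] and call the LLM here...
    return jsonify({"reply": f"(session {session_id}) you said: {payload['message']}"})

if __name__ == "__main__":
    app.run(port=5050)
```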
There are two provided bash shell scripts (start.sh and stop.sh) for easy setup and reproduction on any system. To get up and running in one step:

```bash
# Clone repository
git clone https://github.com/spaceshark123/llm.git
cd llm

# Create .env file
cp .env.sample .env
# IMPORTANT: Edit .env and create/add GROQ_API_KEY from https://console.groq.com/keys

./start.sh
```
To set up and run manually instead:

```bash
# Clone repository
git clone https://github.com/spaceshark123/llm.git
cd llm

# Create .env file
cp .env.sample .env
# IMPORTANT: Edit .env and create/add GROQ_API_KEY from https://console.groq.com/keys

# Backend setup
cd backend
pip install -r requirements.txt

# Frontend setup
cd ../frontend
npm install
```

```bash
# Terminal 1 - Backend
cd backend
python app.py
```

```bash
# Terminal 2 - Frontend
cd frontend
npm run dev
```

Visit http://localhost:5173 in your browser.
- Upload a PDF or paste a URL
- Type a question in the chat
- Select the sources you want to include
- Watch as the AI answers with context
- Try switching models for faster, more detailed, or more accurate responses
Send a chat message:

```bash
curl -X POST http://localhost:5050/api/chat \
  -H "Session-ID: my-session" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is this about?",
    "selectedSources": [{"name": "document.pdf", "size": 1024}]
  }'
```

Upload a source:

```bash
curl -X POST http://localhost:5050/api/sources \
  -H "Session-ID: my-session" \
  -F "file=@document.pdf"
```

Get chat history:

```bash
curl -X GET http://localhost:5050/api/history \
  -H "Session-ID: my-session"
```
Create a .env file with the same format as the provided .env.sample. Make sure to set the GROQ_API_KEY value by creating an account/API key at GroqCloud:

```bash
# Overall Settings
GROQ_API_KEY=your_api_key_here
DATA_PATH="data"
TEMP_PATH="temp"
CHROMA_PATH="chroma"
BACKEND_PORT=5050
VITE_API_URL=http://localhost:5050/api
USER_AGENT="Mozilla/5.0"
# LLM Settings
TEMPERATURE=0.7
# Initial default model (users can change this in the frontend)
MODEL_NAME="llama-3.1-70b-versatile"
# Maximum prompt length in characters
VITE_MAX_PROMPT_LENGTH=5000
# RAG Settings
RAG_ENABLED=True
RAG_TOP_K=5
CHUNK_SIZE=300
CHUNK_OVERLAP=100
# MAIN CHOICES: all-mpnet-base-v2, sentence-transformers/all-MiniLM-L6-v2, etc.
EMBEDDING_MODEL="all-mpnet-base-v2"
```

A full list of models usable for the MODEL_NAME field, along with rate limits for each (RPM/RPD, TPM/TPD), can be found in the GroqCloud documentation.
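On the backend, config.py reads these values from the environment; a minimal sketch of what that loading might look like, assuming python-dotenv (the actual config.py may differ):

```python
# Sketch of .env-driven configuration (assumes `pip install python-dotenv`).
# Names mirror the variables above; the project's real config.py may differ.
import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the current working directory

GROQ_API_KEY = os.environ["GROQ_API_KEY"]  # required; fail fast if missing
MODEL_NAME = os.getenv("MODEL_NAME", "llama-3.1-70b-versatile")
TEMPERATURE = float(os.getenv("TEMPERATURE", "0.7"))
RAG_ENABLED = os.getenv("RAG_ENABLED", "True").lower() == "true"
RAG_TOP_K = int(os.getenv("RAG_TOP_K", "5"))
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "300"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "100"))
```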
For issues, questions, or suggestions, please refer to the documentation or open an issue on the project's GitHub Issues page.