🔍 RAG Document Chat

Chat with your documents using Retrieval-Augmented Generation (RAG). Upload PDFs, Word docs, or text files and ask questions — powered by LangChain, FastAPI, and a React frontend.

Demo

🚀 Features

  • Upload and index multiple documents (PDF, DOCX, TXT)
  • Semantic search using vector embeddings (FAISS / ChromaDB)
  • Conversational memory — ask follow-up questions
  • Source citations with every answer
  • Clean React UI with streaming responses

🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python, FastAPI, LangChain |
| Embeddings | OpenAI text-embedding-ada-002 / HuggingFace |
| Vector Store | FAISS (local) / ChromaDB |
| LLM | OpenAI GPT-4 / Claude (Anthropic) |
| Frontend | React, Vite, TailwindCSS |
| Deployment | Railway (backend) + Vercel (frontend) |

📁 Project Structure

```
rag-document-chat/
├── backend/
│   ├── app/
│   │   ├── main.py          # FastAPI entry point
│   │   ├── routes/
│   │   │   ├── upload.py    # Document upload & indexing
│   │   │   └── chat.py      # Chat & retrieval endpoint
│   │   ├── services/
│   │   │   ├── embedder.py  # Embedding logic
│   │   │   ├── retriever.py # Vector store retrieval
│   │   │   └── llm.py       # LLM chain setup
│   │   └── models.py        # Pydantic schemas
│   ├── requirements.txt
│   └── .env.example
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── ChatWindow.jsx
│   │   │   ├── FileUpload.jsx
│   │   │   └── MessageBubble.jsx
│   │   ├── App.jsx
│   │   └── main.jsx
│   ├── package.json
│   └── .env.example
└── docker-compose.yml
```
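To picture the API surface implied by `models.py` and the `chat.py` route, here is a minimal sketch of the request/response shapes. It uses stdlib dataclasses purely for illustration; the repo itself defines these as Pydantic schemas, and the field names (`question`, `session_id`, `answer`, `sources`) are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical payload shapes for the /chat endpoint.
# The real models.py uses Pydantic and may name fields differently.
@dataclass
class ChatRequest:
    question: str
    session_id: str  # ties follow-up questions to conversational memory

@dataclass
class Citation:
    document: str    # source file name
    page: int        # page reference returned with each answer

@dataclass
class ChatResponse:
    answer: str
    sources: list[Citation] = field(default_factory=list)
```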

⚡ Quick Start

Backend

```bash
cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env      # Add your API keys
uvicorn app.main:app --reload
```

Frontend

```bash
cd frontend
npm install
cp .env.example .env
npm run dev
```

Open http://localhost:5173 in your browser.

🔑 Environment Variables

```bash
# backend/.env
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here   # optional, for Claude
CHROMA_PERSIST_DIR=./chroma_db
MAX_UPLOAD_SIZE_MB=20
```
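One way the backend might read these values is sketched below. The variable names match `backend/.env` above, but the loader itself (and the choice to fail fast on a missing `OPENAI_API_KEY`) is an illustration, not the repo's actual code.

```python
import os

def load_settings() -> dict:
    """Illustrative settings loader for the variables in backend/.env."""
    settings = {
        "openai_api_key": os.getenv("OPENAI_API_KEY"),
        "anthropic_api_key": os.getenv("ANTHROPIC_API_KEY"),  # optional, for Claude
        "chroma_persist_dir": os.getenv("CHROMA_PERSIST_DIR", "./chroma_db"),
        "max_upload_size_mb": int(os.getenv("MAX_UPLOAD_SIZE_MB", "20")),
    }
    if not settings["openai_api_key"]:
        raise RuntimeError("OPENAI_API_KEY is required")
    return settings
```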

🧠 How It Works

  1. Upload — Documents are chunked and embedded into a vector store
  2. Query — User question is embedded and top-k similar chunks are retrieved
  3. Generate — LLM generates an answer grounded in retrieved context
  4. Cite — Response includes source document + page references
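Steps 1 and 2 can be sketched in a few lines of self-contained Python. This is a toy illustration only: `embed` is a bag-of-words counter standing in for text-embedding-ada-002 or a HuggingFace model, and `retrieve` does a brute-force cosine scan standing in for FAISS/ChromaDB. In the real pipeline, steps 3 and 4 would then pass the retrieved chunks into the LLM prompt.

```python
import math
import re
from collections import Counter

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Step 1: split a document into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Step 2: embed the question and return the top-k most similar chunks."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Swapping `embed` for a real embedding model and `retrieve` for a vector-store query gives the production pipeline; the control flow stays the same.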

📊 Architecture

```
User Question
     │
     ▼
[Embedding Model] ──► [Vector Store] ──► [Top-K Chunks]
                                               │
                                               ▼
                                    [LLM + System Prompt]
                                               │
                                               ▼
                                    Answer + Source Citations
```

🚢 Deployment

```bash
# Backend → Railway
railway login && railway up

# Frontend → Vercel
vercel --prod
```

📄 License

MIT
