A production-ready, full-stack Retrieval-Augmented Generation (RAG) system with context-aware conversations. Upload documents (PDF/DOCX/TXT) and chat with an AI that remembers your conversation history.
- **Multi-Format Support:** PDF, DOCX, TXT, MD files
- **Context-Aware AI:** remembers the last 6 conversation exchanges
- **Vector Search:** ChromaDB-powered semantic retrieval
- **Intelligent Chunking:** 1500-character chunks with 200-character overlap
- **Document Management:** track indexed documents
- **Production Ready:** health checks, rate limiting, error handling
- **Fully Dockerized:** one-command deployment
- **Cloud Ready:** AWS/GCP/Azure deployment configurations
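The 1500/200 chunking scheme above can be sketched in a few lines of Python. This is a hypothetical helper for illustration; the repo's actual implementation lives in `backend/app/utils.py` and may differ:

```python
def chunk_text(text: str, size: int = 1500, overlap: int = 200) -> list[str]:
    """Split text into overlapping character chunks.

    Each chunk is at most `size` characters and shares its first
    `overlap` characters with the end of the previous chunk, so that
    sentences straddling a boundary still appear intact in one chunk.
    """
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last chunk reached the end of the text
        start += size - overlap  # advance by 1300 chars of new text
    return chunks
```

Overlap trades a little index size for retrieval quality: a fact split across two chunks is still fully contained in at least one of them.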
```
          ┌───────────────────────────┐
          │         Frontend          │
          │  React + Tailwind (Vite)  │
          └─────────────┬─────────────┘
                        │
                        ▼
          ┌───────────────────────────┐
          │        FastAPI API        │
          │  (Document & Chat APIs)   │
          └─────────────┬─────────────┘
                        │
        ┌───────────────┼───────────────┐
        ▼               ▼               ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│   MongoDB    │ │   ChromaDB   │ │ Gemini/OpenAI│
│  (metadata)  │ │ (embeddings) │ │ (generation) │
└──────────────┘ └──────────────┘ └──────────────┘
```
- Docker & Docker Compose (recommended)
- OR Python 3.12+ & Node.js 20+
- Google Gemini API Key (Get one here)
```bash
git clone https://github.com/satvikmishra44/PanScienceRAG.git
cd PanScienceRAG
docker-compose up --build
```

This launches:
| Service | Description | URL |
|---|---|---|
| Backend (FastAPI) | Core API | http://localhost:8000 |
| Frontend (React) | UI Dashboard | http://localhost:5173 |
Open the frontend at http://localhost:5173 (after running the Docker command above) and use the application directly. As simple as that.
To stop all services:

```bash
docker-compose down
```

Check API health:

```bash
curl http://localhost:8000/ping
```

Expected response:

```json
{"status": "ok", "service": "RAG-pipeline"}
```

Base URL: `http://localhost:8000`
Endpoint: /ingest
Method: POST
Uploads and indexes a document for retrieval and querying.
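The same upload can also be scripted from Python using only the standard library. This is an illustrative sketch, not repo code; the multipart field name `file` matches the curl example in this section:

```python
import json
import urllib.request
import uuid


def build_multipart(filename: str, data: bytes) -> tuple[bytes, dict]:
    """Build a multipart/form-data body with a single 'file' field."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return body, headers


if __name__ == "__main__":
    with open("research_paper.pdf", "rb") as f:
        body, headers = build_multipart("research_paper.pdf", f.read())
    req = urllib.request.Request(
        "http://localhost:8000/ingest", data=body, headers=headers
    )  # urllib defaults to POST when data is supplied
    print(json.loads(urllib.request.urlopen(req).read()))
```

In practice the `requests` library makes this a one-liner, but the stdlib version avoids an extra dependency.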
```bash
curl -X POST "http://localhost:8000/ingest" \
  -F "file=@research_paper.pdf"
```

Response:

```json
{
  "status": "success",
  "doc_id": "66e432f78df123abc",
  "chunks": 98
}
```

Endpoint: /query
Method: POST
Ask questions based on the indexed documents.
Supports chat history for contextual and conversational responses.
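One common way a `history` field drives contextual answers is to trim it to the last 6 exchanges and fold it into the model prompt alongside the retrieved chunks. The sketch below is hypothetical; the repo's actual prompt template in `backend/app/rag.py` may differ:

```python
MAX_EXCHANGES = 6  # matches the "remembers last 6 exchanges" feature


def build_prompt(query: str, history: list[dict], contexts: list[str]) -> str:
    """Fold retrieved context and recent chat history into one prompt."""
    # One exchange = a user turn plus an ai turn, so keep 2 * 6 entries.
    recent = history[-2 * MAX_EXCHANGES:]
    convo = "\n".join(f"{turn['role']}: {turn['text']}" for turn in recent)
    return (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n\n".join(contexts) + "\n\n"
        "Conversation so far:\n" + convo + "\n\n"
        f"user: {query}\nai:"
    )
```

Capping the history keeps the prompt within the model's context window while still letting follow-up questions like "what about its applications?" resolve correctly.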
Request body:

```json
{
  "query": "What are the applications of quantum entanglement?",
  "top_k": 4,
  "history": [
    {"role": "user", "text": "Tell me about quantum mechanics"},
    {"role": "ai", "text": "Quantum mechanics studies the behavior of matter and energy..."}
  ]
}
```

Response:

```json
{
  "status": "success",
  "answer": "Quantum entanglement enables applications in quantum computing, teleportation, and cryptography...",
  "sources": [
    {
      "text": "Quantum entanglement is a phenomenon...",
      "meta": {"source_filename": "quantum_intro.pdf"},
      "distance": 0.12
    }
  ]
}
```

Endpoint: /documents
Method: GET
```bash
curl http://localhost:8000/documents
```

Endpoint: /ping
Method: GET
```bash
curl http://localhost:8000/ping
```

Response:

```json
{"status": "ok", "service": "RAG-pipeline"}
```

Project structure:

```
PanScience/
├── backend/
│   ├── app/
│   │   ├── main.py        # FastAPI entry point
│   │   ├── db.py          # Database + LLM initialization
│   │   ├── rag.py         # RAG ingestion and query logic
│   │   └── utils.py       # File handling and text chunking
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   └── components/
│   │       ├── Chat.jsx
│   │       ├── DocManager.jsx
│   │       └── Landing.jsx
│   ├── package.json
│   └── Dockerfile
├── docker-compose.yaml
└── README.md
```
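The retrieval step in `rag.py` is handled by ChromaDB in the repo, but at its core it is nearest-neighbour search over chunk embeddings. A pure-Python illustration of the idea, with `distance` shaped like the field returned by `/query` (function and field names here are illustrative, not repo API):

```python
import math


def top_k(query_vec: list[float], chunks: list[dict], k: int = 4) -> list[dict]:
    """Rank chunks by cosine distance to the query embedding, keep the k closest."""

    def cos_dist(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return 1.0 - dot / norm  # 0.0 = identical direction, 2.0 = opposite

    ranked = sorted(chunks, key=lambda c: cos_dist(query_vec, c["embedding"]))
    return [
        {"text": c["text"], "distance": cos_dist(query_vec, c["embedding"])}
        for c in ranked[:k]
    ]
```

A vector store like ChromaDB does the same ranking with approximate indexes so it stays fast at millions of chunks; the linear scan above is only for clarity.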
```bash
docker-compose up --build
```
```bash
docker-compose down
```

| Layer | Technology |
|---|---|
| Frontend | React (Vite), TailwindCSS |
| Backend | FastAPI, Python |
| Database | MongoDB |
| Vector Store | ChromaDB |
| LLM | Gemini / OpenAI / Claude |
| Containerization | Docker, Docker Compose |
- Ensure MongoDB and ChromaDB services are running before ingesting documents.
- Large PDFs are automatically chunked for efficient retrieval.