A Retrieval-Augmented Generation (RAG) system that transforms your documents into an intelligent, queryable knowledge base — enabling accurate, context-aware answers grounded in your own data.
The RAG Knowledge Assistant combines large language models with semantic document retrieval to answer questions from your custom knowledge sources. Unlike a standard LLM, it grounds each response in your actual documents, which sharply reduces hallucinations and makes every answer traceable to its sources.
| Standard LLM | RAG Knowledge Assistant |
|---|---|
| Answers from training data only | Answers from your documents |
| May hallucinate facts | Grounded in retrieved context |
| Static knowledge cutoff | Always up-to-date with your data |
| No source attribution | Cites source documents |
```
                       ┌─────────────────────────────────────┐
                       │             RAG Pipeline            │
                       │                                     │
Your Documents ───────►│  INGESTION                          │
(PDF, TXT, etc.)       │   ├── Load & parse documents        │
                       │   ├── Chunk text                    │
                       │   └── Embed & store in vector DB    │
                       │                  │                  │
User Query ───────────►│  RETRIEVAL                          │
                       │   ├── Embed query                   │
                       │   ├── Similarity search             │
                       │   └── Fetch top-k relevant chunks   │
                       │                  │                  │
                       │  GENERATION                         │
                       │   ├── Augment prompt with context   │
                       │   └── Generate grounded answer      │
                       └─────────────────────────────────────┘
                                          │
                                  Answer + Sources
```
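The sketch below walks the same three stages in miniature. It is a minimal illustration, assuming the OpenAI Python SDK (v1+) plus `faiss-cpu` and `numpy`; the sample chunks, prompt wording, and helper names are invented for the example and are not the repository's actual modules.

```python
# Minimal RAG flow: embed chunks, index them, retrieve, generate.
# Illustrative sketch only; assumes `pip install openai faiss-cpu numpy`
# and OPENAI_API_KEY in the environment. Names are not the repo's modules.
import numpy as np
import faiss
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts with the OpenAI embeddings endpoint."""
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")

# INGESTION: embed document chunks and store them in a FAISS index.
chunks = [
    "Q3 revenue grew 23% year-over-year, driven by subscriptions.",
    "Headcount increased by 12 engineers in the platform team.",
]
vectors = embed(chunks)
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# RETRIEVAL: embed the query and fetch the top-k nearest chunks.
question = "How did revenue change in Q3?"
_, ids = index.search(embed([question]), 2)   # (distances, indices)
context = "\n".join(chunks[i] for i in ids[0])

# GENERATION: augment the prompt with the retrieved context.
completion = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer only from the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(completion.choices[0].message.content)
```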
```
rag-knowledge-assistant/
│
├── api/                   # REST API layer
│   ├── app.py             # FastAPI / Flask application
│   ├── routes.py          # Endpoint definitions
│   └── schemas.py         # Request / response models
│
├── ingestion/             # Document ingestion pipeline
│   ├── loader.py          # Document loaders (PDF, TXT, DOCX, web)
│   ├── chunker.py         # Text chunking strategies
│   └── embedder.py        # Embedding generation & vector storage
│
├── retrieval/             # Semantic retrieval engine
│   ├── retriever.py       # Vector similarity search
│   ├── reranker.py        # Result reranking
│   └── generator.py       # LLM response generation with context
│
├── Dockerfile             # Docker image definition
├── docker-compose.yml     # Multi-service orchestration
├── requirements.txt       # Python dependencies
└── README.md
```
| Layer | Technology |
|---|---|
| API | FastAPI / Flask |
| Embeddings | OpenAI text-embedding-ada-002 / HuggingFace |
| Vector Store | FAISS / ChromaDB / Pinecone |
| LLM | OpenAI GPT-4 / GPT-3.5 |
| Document Parsing | LangChain / LlamaIndex |
| Containerization | Docker + Docker Compose |
| Language | Python 3.9+ |
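Since several vector stores are supported, the backend can be chosen at runtime from the `VECTOR_STORE` setting. Below is one possible shape for such a factory, a sketch only: the `faiss` and `chromadb` calls are real library APIs, but the factory itself is hypothetical (the project's actual `embedder.py` interface may differ) and the Pinecone branch is omitted.

```python
# Hypothetical backend factory keyed on the VECTOR_STORE setting.
# The two backends expose different native APIs; a thin adapter
# (omitted here) would normally unify add/query behind one interface.
import os

def make_vector_store(dim: int):
    backend = os.getenv("VECTOR_STORE", "faiss")
    if backend == "faiss":
        import faiss
        return faiss.IndexFlatL2(dim)       # add()/search() on float32 arrays
    if backend == "chroma":
        import chromadb
        client = chromadb.Client()          # in-memory Chroma instance
        return client.get_or_create_collection("documents")
    raise ValueError(f"Unsupported VECTOR_STORE: {backend}")
```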
- Python 3.9+
- Docker & Docker Compose
- OpenAI API key (or compatible LLM provider)
```bash
git clone https://github.com/workgarimaswami/rag-knowledge-assistant.git
cd rag-knowledge-assistant
cp .env.example .env
```

Edit `.env`:

```env
OPENAI_API_KEY=your_openai_api_key_here
VECTOR_STORE=faiss               # faiss | chroma | pinecone
EMBEDDING_MODEL=text-embedding-ada-002
LLM_MODEL=gpt-4
CHUNK_SIZE=500
CHUNK_OVERLAP=50
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

Ingest your documents, then start the API:

```bash
python ingestion/loader.py --source ./documents/
python api/app.py
```

The API is now available at `http://localhost:8000`.
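If you need the same settings inside Python, a common pattern is to read them from the environment with defaults matching the configuration table below. This is a sketch assuming the `python-dotenv` package; the repository's actual config handling may differ.

```python
# Read the .env-backed settings, with defaults matching the config table.
# python-dotenv is an assumed convenience; plain os.getenv also works.
import os
from dotenv import load_dotenv

load_dotenv()  # pulls .env into the process environment

CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "500"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "50"))
TOP_K = int(os.getenv("TOP_K", "5"))
VECTOR_STORE = os.getenv("VECTOR_STORE", "faiss")
LLM_MODEL = os.getenv("LLM_MODEL", "gpt-4")
```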
```bash
# Build and start all services
docker-compose up --build

# Run in detached mode
docker-compose up -d

# Stop services
docker-compose down
```

Services started:

- `api`: REST API on port `8000`
- `vector-db`: vector store on port `6333` (if using ChromaDB)
`POST /ingest`

```json
{
  "source": "path/to/document.pdf",
  "metadata": { "category": "finance", "author": "Jane Doe" }
}
```

`POST /query`

```json
{
  "question": "What are the key findings from the Q3 report?",
  "top_k": 5
}
```

Response:

```json
{
  "answer": "According to the Q3 report, revenue grew by 23% year-over-year...",
  "sources": [
    { "document": "q3_report.pdf", "page": 4, "relevance_score": 0.94 }
  ]
}
```

`GET /health` returns:

```json
{ "status": "ok", "documents_indexed": 142 }
```
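For a quick smoke test from Python, the endpoints above can be exercised with `requests`. The payload shapes mirror the examples; the base URL assumes the default local port.

```python
# Client-side smoke test of the REST endpoints shown above.
import requests

BASE = "http://localhost:8000"

# Index a document (path and metadata are illustrative).
requests.post(f"{BASE}/ingest", json={
    "source": "path/to/document.pdf",
    "metadata": {"category": "finance", "author": "Jane Doe"},
}).raise_for_status()

# Ask a question against the indexed corpus.
resp = requests.post(f"{BASE}/query", json={
    "question": "What are the key findings from the Q3 report?",
    "top_k": 5,
})
resp.raise_for_status()
body = resp.json()
print(body["answer"])
for src in body["sources"]:
    print(f'- {src["document"]} (p.{src["page"]}, score {src["relevance_score"]})')
```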
| Parameter | Default | Description |
|---|---|---|
| `CHUNK_SIZE` | `500` | Token size per document chunk |
| `CHUNK_OVERLAP` | `50` | Overlap between consecutive chunks |
| `TOP_K` | `5` | Number of retrieved chunks per query |
| `VECTOR_STORE` | `faiss` | Backend vector store |
| `LLM_MODEL` | `gpt-4` | Language model for generation |
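To make `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, here is a minimal sliding-window chunker. It counts whitespace-separated words rather than model tokens, a simplification; the actual `chunker.py` may use a real tokenizer such as tiktoken.

```python
# Minimal sliding-window chunker illustrating CHUNK_SIZE / CHUNK_OVERLAP.
# Splits on whitespace "tokens" for simplicity; a production chunker
# would typically count model tokens instead.
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split whitespace tokens into overlapping windows."""
    tokens = text.split()
    if not tokens:
        return []
    step = chunk_size - chunk_overlap
    return [
        " ".join(tokens[i : i + chunk_size])
        for i in range(0, max(len(tokens) - chunk_overlap, 1), step)
    ]
```

With the defaults, a 1,200-word document yields three chunks, each starting 450 words after the previous one, so consecutive chunks share 50 words of context.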
- 📄 PDF (`.pdf`)
- 📝 Plain text (`.txt`)
- 📋 Word documents (`.docx`)
- 🌐 Web URLs
- 📊 Markdown (`.md`)
- 🗂️ CSV / JSON
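Loaders for these are typically dispatched on file extension. The sketch below covers a few of the formats above using `pypdf` and `python-docx`; both are assumptions about the dependency set, and the real `loader.py` may delegate to LangChain or LlamaIndex document loaders instead.

```python
# Extension-based loader dispatch for a few of the formats above.
# pypdf / python-docx are assumed dependencies, not confirmed ones.
from pathlib import Path

def load_document(path: str) -> str:
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        from pypdf import PdfReader
        return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    if suffix == ".docx":
        import docx
        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    if suffix in {".txt", ".md"}:
        return Path(path).read_text(encoding="utf-8")
    raise ValueError(f"Unsupported format: {suffix}")
```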
Run the test suite:

```bash
pytest tests/ -v
```

Planned features:

- Multi-modal support (images, tables)
- Streaming responses
- Conversation memory / chat history
- Web UI / chat interface
- Support for multiple vector store backends
- Document versioning and update tracking
- Fork the repository
- Create your feature branch (`git checkout -b feature/your-feature`)
- Commit your changes (`git commit -m 'Add your feature'`)
- Push to the branch (`git push origin feature/your-feature`)
- Open a Pull Request
This project is licensed under the MIT License — see the LICENSE file for details.
Garima Swami · @workgarimaswami
⭐ If this project helped you build something awesome, give it a star!