A production-grade Retrieval-Augmented Generation (RAG) chatbot built with LangChain, FAISS, Groq, and Streamlit. Upload documents or scrape websites and chat with your knowledge base using free LLMs.
- Document Loaders — PDF (PyPDF2), DOCX, TXT, Markdown
- Smart Chunking — Recursive, Character, Token-based strategies
- Free Embeddings —
sentence-transformers/all-MiniLM-L6-v2(runs locally) - Hybrid Search — FAISS dense + BM25 sparse with Reciprocal Rank Fusion
- Free LLMs — Groq (llama-3.1, mixtral, gemma) + HuggingFace
- RAG Evaluation — RAGAS framework (Faithfulness, Answer Relevancy, Context Precision)
- Web Scraping — BeautifulSoup + Selenium fallback
- Multilingual — Auto-detect + translate (13 languages)
- Export Chat — TXT, CSV, JSON, PDF
- Conversation Memory — Sliding window history with standalone query rewriting
git clone https://github.com/ronitgulia/RAG-Chatbot
cd RAG-Chatbotpip install -r requirements.txtSign up at console.groq.com — it's completely free.
streamlit run app.py- Paste your Groq API key in the sidebar → click Connect LLM
- Go to Documents tab → upload files or scrape URLs
- Go to Chat tab → ask questions
- View Evaluation tab for quality metrics
- Export your conversation in any format
RagChatBot/
├── app.py # Streamlit UI (Chat, Documents, Evaluation, Export)
├── rag_pipeline.py # Core orchestrator
├── document_loader.py # PDF, DOCX, TXT loaders
├── text_chunker.py # LangChain text splitters
├── embeddings.py # Sentence-transformers wrapper
├── vector_store.py # FAISS + BM25 hybrid store
├── llm_provider.py # Groq + HuggingFace providers
├── evaluation.py # RAGAS evaluation pipeline
├── web_scraper.py # Web scraping module
├── multilingual.py # Language detection & translation
├── export_utils.py # Chat export utilities
├── styles.py # Custom Streamlit CSS
├── config.py # Centralized configuration
├── requirements.txt # Dependencies
└── .env.example # Environment variable template
| Metric | Description |
|---|---|
| Faithfulness | Is the answer grounded in retrieved context? |
| Answer Relevancy | Does the answer address the question? |
| Context Precision | Are retrieved chunks relevant to the query? |
| Model | Context | Best For |
|---|---|---|
llama-3.1-8b-instant |
128k | Fast responses (default) |
llama-3.3-70b-versatile |
128k | Best quality |
deepseek-r1-distill-llama-70b |
128k | Reasoning tasks |
gemma2-9b-it |
8k | Balanced |
mixtral-8x7b-32768 |
32k | Long documents |
MIT License