An interactive RAG (Retrieval-Augmented Generation) chatbot that lets you chat with your own documents, powered by Google Gemini, LangChain, and a user-friendly Streamlit interface.
- Python 3.9 or higher
- Git
- A Google API key for the Gemini model
# Clone the repository
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
# Create and activate virtual environment
# Windows
python -m venv venv
.\venv\Scripts\Activate.ps1
# macOS / Linux
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py

Your default web browser will automatically open the application.
In the left sidebar, paste your Google API Key into the "Enter your Google API Key" field. The key is securely stored in st.session_state for the duration of your session and is never permanently saved.
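For reference, the sketch below shows one way such a sidebar field can be kept in Streamlit's session state; the widget label matches the UI, but the structure and the google_api_key session key are assumptions rather than the exact code in app.py.

```python
import streamlit as st

# Sketch only: collect the key in the sidebar and keep it in memory for this session.
with st.sidebar:
    api_key = st.text_input("Enter your Google API Key", type="password")
    if api_key:
        st.session_state["google_api_key"] = api_key  # never written to disk

# Block the rest of the app until a key has been provided.
if "google_api_key" not in st.session_state:
    st.info("Add your Google API key in the sidebar to get started.")
    st.stop()
```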
- Drag and drop files (.pdf, .txt, .docx, .csv) into the uploader
- Upload multiple files at once if needed
- Click the "Process Documents" button
The application will display real-time status updates as it:
- Ingests files
- Chunks the text
- Generates and stores embeddings (one way to surface these updates is sketched below)
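One plausible way to wire the "Process Documents" button to these status updates is Streamlit's st.status container; the labels and the placeholder pipeline calls below are illustrative, not the actual app.py code.

```python
import streamlit as st

if st.button("Process Documents"):
    with st.status("Processing documents...", expanded=True) as status:
        st.write("Ingesting files...")
        # extract and clean text from the uploaded files
        st.write("Chunking the text...")
        # split the cleaned text into overlapping chunks
        st.write("Generating and storing embeddings...")
        # embed the chunks and save the FAISS index to disk
        status.update(label="Documents processed. Ask away!", state="complete")
```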
Once processing is complete, type your questions into the chat interface at the bottom of the screen. The chatbot will provide context-aware answers based on your documents.
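The chat loop itself can be built from Streamlit's chat primitives, as in this minimal sketch; answer_question is a hypothetical stand-in for the project's retrieval-and-generation call, not a function from the codebase.

```python
import streamlit as st

def answer_question(question: str) -> str:
    """Placeholder for the retrieval + generation step described below."""
    return "(answer synthesized from your documents)"

if "messages" not in st.session_state:
    st.session_state["messages"] = []

# Replay the conversation so far.
for msg in st.session_state["messages"]:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

# Handle a new question from the chat box at the bottom of the screen.
if question := st.chat_input("Ask a question about your documents"):
    st.session_state["messages"].append({"role": "user", "content": question})
    answer = answer_question(question)
    st.session_state["messages"].append({"role": "assistant", "content": answer})
    st.rerun()
```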
The application implements a complete Retrieval-Augmented Generation pipeline that enhances LLM capabilities by grounding responses in your specific data.
Raw text is extracted from uploaded files and cleaned to remove inconsistencies, non-standard characters, and formatting issues. This ensures high-quality data for downstream processing.
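As an illustration, PDF extraction and cleanup might look like the following; the function names and cleaning rules are assumptions, and src/data_ingest.py may differ.

```python
import re
from pypdf import PdfReader

def extract_pdf_text(path: str) -> str:
    """Concatenate the raw text of every page in a PDF."""
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def clean_text(text: str) -> str:
    """Normalize whitespace and drop non-standard characters."""
    text = re.sub(r"[^\x20-\x7E\n]", " ", text)  # keep printable ASCII and newlines
    text = re.sub(r"[ \t]+", " ", text)          # collapse runs of spaces and tabs
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```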
Large documents are split into smaller, manageable chunks with intentional overlaps. The overlap helps preserve semantic context that would otherwise be lost at chunk boundaries.
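For example, using LangChain's RecursiveCharacterTextSplitter (the chunk size and overlap values here are illustrative, not the project's actual settings):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # characters per chunk (illustrative value)
    chunk_overlap=150,  # overlap carried across boundaries to preserve context
)

cleaned_text = "..."  # output of the ingestion/cleaning step
chunks = splitter.split_text(cleaned_text)
```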
- Each text chunk is converted into a numerical vector using the `sentence-transformers/all-MiniLM-L6-v2` model
- Vectors capture the semantic meaning of the text
- All vectors are stored in a FAISS (Facebook AI Similarity Search) index for fast, efficient searching (sketched below)
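A minimal sketch of this step using LangChain's wrappers; the vectorstore directory name matches the project layout below, but the exact calls in embed_store.py may differ.

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

chunks = ["..."]  # output of the chunking step

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_texts(chunks, embedding=embeddings)
vectorstore.save_local("vectorstore")  # persisted index, reloaded on later runs
```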
When you ask a question:
- Your question is converted into a vector
- The FAISS index searches for the most semantically similar text chunks from your documents
- Retrieved chunks are formatted with your question into a detailed prompt
- Google Gemini Pro generates an answer based solely on this context (see the sketch below)
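Put together, the query path could look roughly like this; the prompt wording, the number of retrieved chunks, and the model identifier are assumptions rather than the exact contents of retrive_generate.py.

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_google_genai import ChatGoogleGenerativeAI

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
index = FAISS.load_local("vectorstore", embeddings, allow_dangerous_deserialization=True)

question = "What does the report conclude?"
docs = index.similarity_search(question, k=4)            # most similar chunks
context = "\n\n".join(doc.page_content for doc in docs)

prompt = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say so.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key="YOUR_API_KEY")
answer = llm.invoke(prompt).content
print(answer)
```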
rag_tutorial/
│
├── app.py # Main Streamlit interface
├── main.py # Optional CLI pipeline execution
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore configuration
├── README.md # Documentation
│
├── vectorstore/ # FAISS index storage directory
│
├── src/ # Core RAG pipeline modules
│ ├── data_ingest.py # File extraction and text cleaning
│ ├── chunk_data.py # Document chunking logic
│ ├── embed_store.py # Embedding and FAISS indexing
│ └── retrive_generate.py # Retrieval and answer generation
│
└── data/ # Sample documents (optional)
🎨 Interactive UI — Clean, intuitive interface built with Streamlit
📄 Multi-Format Support — Upload .pdf, .txt, .docx, and .csv files
🔐 Secure API Handling — Session-based key management, no persistent storage
⚡ Fast Retrieval — FAISS-powered vector search with instant results
💬 Real-Time Q&A — Get answers synthesized directly from your documents
🧩 Modular Design — Organized codebase that's easy to understand and extend
🔄 Persistent Storage — Embeddings saved locally for quick reloading (source documents discarded for privacy)
| Component | Technology |
|---|---|
| Frontend | Streamlit |
| LLM | Google Gemini Pro |
| Framework | LangChain |
| Embeddings | sentence-transformers/all-MiniLM-L6-v2 (Hugging Face) |
| Vector Store | FAISS |
| File Processing | pypdf, python-docx, pandas |
The following features are planned for future releases:
- Support for additional document types (.pptx, .html)
- UI option to select different embedding models and LLMs
- Display source chunks alongside answers for verification
- Smart caching to prevent re-embedding of unchanged files