This project is a conversational AI system that combines Retrieval-Augmented Generation (RAG) and Conversational RAG (CRAG) behind a single toggle, built with Streamlit, PostgreSQL, and either OpenAI or BGE embeddings.
Built as an evolution of a production-grade internal tool, this repo now serves as a high-quality, reusable personal project for document-based chatbot development.
- RAG mode: embeds each user query and retrieves the most relevant document chunks.
- Supports both OpenAI and BGE embeddings.
- Stateless in RAG mode: each query is processed independently.
- CRAG mode: maintains the full user-bot conversation history (stored in PostgreSQL).
- Retrieval and response generation are contextualized using recent dialogue.
- Recent history is formatted automatically and embedded into the prompt.
- Vector storage and history: PostgreSQL
- Embedding models: OpenAI or BAAI/bge-base-en-v1.5
- Streamlit UI with RAG/CRAG toggle
- File ingestion, chunking, and OCR-ready hooks
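
The chunking and top-chunk retrieval steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: `chunk_text`, `cosine_similarity`, and `top_k_chunks` are hypothetical names, the chunk sizes are arbitrary, and the toy vectors stand in for real OpenAI or BGE embeddings.

```python
import math


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, embedded_chunks, k=3):
    """Return the k chunk texts whose embeddings are most similar to the query.

    embedded_chunks is a list of (text, embedding) pairs. In the real app the
    embeddings would come from the configured model and be stored in PostgreSQL.
    """
    scored = [(cosine_similarity(query_vec, emb), text) for text, emb in embedded_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy usage with 2-dimensional stand-in embeddings.
pieces = chunk_text("abcdefghij", chunk_size=4, overlap=2)
store = [("about cats", [1.0, 0.0]), ("about dogs", [0.0, 1.0]), ("about pets", [1.0, 1.0])]
best = top_k_chunks([1.0, 0.0], store, k=2)
```

Overlapping chunks are one common way to avoid cutting a relevant sentence in half at a chunk boundary; whether this project overlaps its chunks is not stated in the README.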
             ┌──────────────────────┐
             │     Streamlit UI     │ ◄── User interaction (Chat.py)
             └────────┬─────────────┘
                      │
        ┌─────────────▼──────────────┐
        │  RAG / CRAG Mode Selector  │ ◄── User selects mode
        └──────┬────────────┬────────┘
               │            │
┌──────────────▼──┐     ┌───▼────────────────┐
│  RAG Retriever  │     │   CRAG Retriever   │
│ (rag_utils1.py) │     │ (crag_utils_pg.py) │
└─────────────────┘     └────────────────────┘
         │                        │
    ┌────▼────┐             ┌─────▼──────┐
    │ Embeds  │             │ History +  │
    │ Chunks  │             │ Contextual │
    └────┬────┘             │ Retrieval  │
         │                  └────────────┘
    ┌────▼──────────────────────────────┐
    │   PostgreSQL: chunks + history    │
    └───────────────────────────────────┘
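
The CRAG path in the diagram (history plus contextual retrieval) could assemble its prompt along these lines. This is a hedged sketch only: `format_history` and `build_prompt` are hypothetical names, the prompt layout is an assumption, and the actual logic in crag_utils_pg.py may differ.

```python
def format_history(history, max_turns=3):
    """Render the most recent (user, bot) turns as plain text lines."""
    # Guard max_turns <= 0 explicitly: history[-0:] would return the whole list.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for user_msg, bot_msg in recent:
        lines.append(f"User: {user_msg}")
        lines.append(f"Assistant: {bot_msg}")
    return "\n".join(lines)


def build_prompt(question, context_chunks, history, max_turns=3):
    """Combine recent dialogue, retrieved chunks, and the new question."""
    parts = []
    history_block = format_history(history, max_turns)
    if history_block:
        parts.append("Conversation so far:\n" + history_block)
    parts.append("Context:\n" + "\n---\n".join(context_chunks))
    parts.append("Question: " + question)
    return "\n\n".join(parts)


# Toy usage: in the app the history rows would be read from PostgreSQL.
history = [
    ("hi", "hello"),
    ("what is RAG?", "Retrieval-Augmented Generation."),
]
prompt = build_prompt("and CRAG?", ["CRAG adds dialogue context."], history, max_turns=1)
```

Passing an empty history yields a prompt with only the context and question, which matches the stateless RAG behavior.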
my-chat-app/
├── Chat_with_CRAG_PostgreSQL.py         # Streamlit frontend with RAG/CRAG toggle
├── rag_utils1_cleaned_no_comments.py    # RAG utilities (chunking, store, retrieve)
├── crag_utils_pg.py                     # CRAG utilities with PostgreSQL backend
├── .env                                 # Environment variables (API keys, DB config)
├── requirements.txt                     # Python dependencies
└── README.md

git clone https://github.com/YOUR_USERNAME/my-chat-app.git
cd my-chat-app

pip install -r requirements.txt

Create a .env file in root:
OPENAI_API_KEY=your_openai_key
PG_HOST=localhost
PG_PORT=5432
PG_DB=ragdb
PG_USER=raguser
PG_PASS=ragpass

streamlit run Chat_with_CRAG_PostgreSQL.py

- Add login control (auth.py) for real user security
- Integrate OCR pipeline using pytesseract and PIL
- Serve on cloud platforms (e.g., Streamlit Cloud, Docker, GCP, AWS)
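
The PG_* settings from the .env file can be read into a connection configuration roughly like this. It is a sketch under assumptions: `load_pg_config` and `to_dsn` are hypothetical helpers, not functions from this repo, and the variables are assumed to already be in the process environment (for example loaded by python-dotenv before this code runs).

```python
import os


def load_pg_config(env=None):
    """Collect PostgreSQL settings from environment variables.

    The fallback values mirror the sample .env in this README; they are
    illustrative defaults, not something the app is known to rely on.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("PG_HOST", "localhost"),
        "port": int(env.get("PG_PORT", "5432")),
        "dbname": env.get("PG_DB", "ragdb"),
        "user": env.get("PG_USER", "raguser"),
        "password": env.get("PG_PASS", ""),
    }


def to_dsn(cfg):
    """Build a libpq-style keyword/value connection string.

    Values containing spaces or quotes would need quoting, which this
    sketch does not handle.
    """
    return " ".join(f"{key}={value}" for key, value in cfg.items())


# Toy usage with an explicit mapping instead of the real environment.
sample_env = {
    "PG_HOST": "db",
    "PG_PORT": "5433",
    "PG_DB": "x",
    "PG_USER": "u",
    "PG_PASS": "p",
}
cfg = load_pg_config(sample_env)
dsn = to_dsn(cfg)
```

Accepting an optional mapping keeps the helper testable without touching real environment variables.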
This project is licensed under the MIT License — free to use, modify, and distribute.
Created by [Your Name] as a personal showcase of full-stack applied AI, with production-quality structure and deployment readiness.