A powerful AI-driven system integrating Retrieval-Augmented Generation (RAG), LangGraph, OpenAI embeddings, and Supabase for efficient document retrieval, processing, and conversational AI workflows.
- Document Ingestion: Load and split HTML documents into smaller chunks.
- Vector Storage: Store embeddings in Supabase for efficient retrieval.
- RAG Pipeline: Retrieve relevant context and generate answers using OpenAI.
- LangGraph Integration (Coming Soon): Implement graph-based LLM workflows.
- Chatbot Capabilities: Extendable to real-time conversational AI.
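The ingestion step above can be sketched in plain Python. The helper names and chunk sizes below are illustrative only, not this project's actual API: the real pipeline presumably uses BeautifulSoup and a LangChain text splitter, but the same idea fits in a few stdlib lines:

```python
from html.parser import HTMLParser  # stdlib stand-in for BeautifulSoup


class TextExtractor(HTMLParser):
    """Collects the visible text nodes from an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows, as a text splitter would."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


html = (
    "<html><body><h1>RAG</h1>"
    "<p>Retrieval-Augmented Generation grounds answers in retrieved context.</p>"
    "</body></html>"
)
extractor = TextExtractor()
extractor.feed(html)
text = " ".join(extractor.parts)
chunks = split_into_chunks(text, chunk_size=40, overlap=10)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side.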
- Python (Primary Language)
- OpenAI API (LLM & Embeddings)
- Supabase (Vector Database & Storage)
- LangChain & LangGraph (AI Workflow & Retrieval)
- BeautifulSoup (HTML Parsing)
- Clone the Repository

  ```bash
  git clone https://github.com/prinzeval/graph-rag-pipeline.git
  cd graph-rag-pipeline
  ```

- Install Dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Set Up Environment Variables

  Create a `.env` file and add your credentials:

  ```
  OPENAI_API_KEY=your_openai_api_key
  SUPABASE_URL=your_supabase_url
  SUPABASE_KEY=your_supabase_key
  ```

- Run the Notebook

  If using Jupyter Notebook, run:

  ```bash
  jupyter notebook
  ```

  Then open the notebook and execute the cells step by step.
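The notebook presumably reads these variables with `python-dotenv`; purely to illustrate what loading a `.env` file does, a minimal stdlib equivalent looks like this (no quoting or escaping rules, unlike the real library):

```python
import os


def load_env(path=".env"):
    """Minimal .env loader (stand-in for python-dotenv's load_dotenv)."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            # setdefault: variables already in the environment win over the file
            os.environ.setdefault(key.strip(), value.strip())


if os.path.exists(".env"):
    load_env()
```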
- Load & Split Documents: Extract text from HTML files.
- Generate & Store Embeddings: Store document vectors in Supabase.
- Retrieve Relevant Chunks: Match user queries with stored documents.
- Generate Answer: Use OpenAI LLM to create responses.
- LangGraph Integration (Upcoming): Advanced AI workflow management.
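The retrieval step works by embedding the query and ranking the stored chunk vectors by similarity. Supabase (via `pgvector`) does this ranking server-side, but the idea reduces to cosine similarity. A toy sketch with made-up 3-dimensional "embeddings" (real OpenAI embeddings have 1536 dimensions, and the chunks/vectors here are invented for illustration):

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy stand-ins for stored (chunk, embedding) rows in Supabase.
store = [
    ("Supabase stores the vectors", [0.9, 0.1, 0.0]),
    ("OpenAI generates the answer", [0.1, 0.9, 0.1]),
    ("LangGraph support is planned", [0.0, 0.2, 0.9]),
]


def retrieve(query_embedding, k=1):
    """Return the k stored chunks most similar to the query embedding."""
    ranked = sorted(
        store,
        key=lambda row: cosine_similarity(query_embedding, row[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]
```

The retrieved chunks are then pasted into the prompt as context, and the OpenAI LLM generates the final answer from them.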
- ✅ Implement RAG-based document retrieval
- ✅ Store embeddings in Supabase
- ⏳ Integrate LangGraph for workflow automation
- ⏳ Develop chatbot functionalities
- ⏳ Optimize for real-time inference
Feel free to submit issues, suggestions, or pull requests!
This project is licensed under the MIT License.
🚀 Built with Passion 🚀