0xMihirK/EduRAG

🎓 EduRAG Voice Assistant

An AI-powered educational voice assistant that answers questions from your knowledge base using Retrieval-Augmented Generation.


🌟 What is this?

EduRAG ingests your educational documents (NCERT textbooks, notes, PDFs converted to text, and so on), creates semantic embeddings, and lets you ask questions by voice. A LiveKit-powered voice agent retrieves the most relevant passages and answers in natural language, in either English-only or Hindi/English bilingual mode.

Key Features

| Feature | Description |
|---|---|
| 🔍 Semantic Search | Find answers by meaning, not just keywords; powered by the all-MiniLM-L6-v2 sentence-transformers model |
| 🎤 Voice Agent | Talk to your knowledge base using LiveKit real-time voice infrastructure |
| 🌐 Bilingual | English-only or Hindi/English code-switching agent |
| 📄 Auto-Ingest | Drop .txt, .md, .py, .json, or .csv files into documents/ and ingest with one command |
| 🔒 Change Detection | MD5 hashing skips re-processing unchanged files |
| 💾 Local-first | SQLite database; no external vector DB required |
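
The change-detection feature can be pictured with a minimal sketch like the one below. It is an illustration under stated assumptions, not the project's actual code: the function names and the in-memory `known_hashes` dictionary are hypothetical stand-ins for whatever the SQLite layer stores.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, block_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in blocks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def needs_ingest(path: Path, known_hashes: dict[str, str]) -> bool:
    """Return True when the file is new or its content hash has changed.

    `known_hashes` is a hypothetical in-memory stand-in for the hashes
    the project would keep in its database.
    """
    current = file_md5(path)
    if known_hashes.get(str(path)) == current:
        return False  # unchanged: skip re-chunking and re-embedding
    known_hashes[str(path)] = current
    return True
```

Calling `needs_ingest` twice on an untouched file returns `True` and then `False`, so the expensive embedding step runs only when the content actually changes.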

🏗️ Architecture

graph LR
    subgraph Client
        A[🎤 User Voice]
    end

    subgraph LiveKit Cloud
        B[WebRTC Room]
    end

    subgraph Voice Agent
        C[English Agent]
        D[Hindi/English Agent]
    end

    subgraph RAG Engine
        E[Embeddings<br/>all-MiniLM-L6-v2]
        F[Text Chunking<br/>Sentence Boundaries]
        G[SQLite Database]
    end

    subgraph AI Services
        H[Google Gemini<br/>LLM]
        I[Deepgram<br/>STT]
        J[Cartesia<br/>English TTS]
        K[Sarvam AI<br/>Hindi STT & TTS]
    end

    A <-->|WebRTC| B
    B <--> C
    B <--> D
    C --> E
    D --> E
    E <--> G
    F --> G
    C <--> H
    D <--> H
    C <--> I
    C <--> J
    D <--> K

Tech Stack: Python 3.10+ · LiveKit Agents · Sentence-Transformers · Deepgram · Google Gemini · Cartesia TTS · Sarvam AI · SQLite
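
To make the Embeddings and SQLite boxes in the diagram concrete, here is a small sketch of similarity search over vectors stored in SQLite. Everything specific in it is an assumption made for illustration: the `chunks` table, storing vectors as JSON text, and the toy 3-dimensional vectors (the real all-MiniLM-L6-v2 model produces 384-dimensional embeddings). The defaults of 5 results and a 0.3 threshold mirror the values listed under Configuration.

```python
import json
import math
import sqlite3


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query_vec, top_k=5, threshold=0.3):
    """Score every stored chunk against the query and keep the best matches."""
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vec, json.loads(emb)), text) for text, emb in rows]
    hits = [pair for pair in scored if pair[0] >= threshold]
    return sorted(hits, reverse=True)[:top_k]


# Hypothetical schema and toy vectors, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (text TEXT, embedding TEXT)")
toy = {
    "Plants make food by photosynthesis.": [0.9, 0.1, 0.0],
    "The French Revolution began in 1789.": [0.0, 0.2, 0.9],
}
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [(text, json.dumps(vec)) for text, vec in toy.items()],
)

# A query vector pointing along the first axis matches only the first chunk.
results = search(conn, [1.0, 0.0, 0.0])
```

A linear scan like this is reasonable for a local, single-user knowledge base; it is the kind of trade-off that makes an external vector database unnecessary at small scale.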


🤖 AI Service Providers

The project uses a modular provider architecture. Here are the current and alternative options:

LLM (Large Language Model)

| Provider | Model | Used In | API Key Env |
|---|---|---|---|
| Google Gemini ✅ | gemini-2.5-flash | Both agents | GOOGLE_API_KEY |
| Groq | llama-3.3-70b, mixtral-8x7b | (alternative) | GROQ_API_KEY |
| OpenAI | gpt-4o, gpt-4o-mini | (alternative) | OPENAI_API_KEY |
| Anthropic | claude-sonnet-4-20250514 | (alternative) | ANTHROPIC_API_KEY |

STT (Speech-to-Text)

| Provider | Model | Used In | API Key Env |
|---|---|---|---|
| Groq ✅ | whisper-large-v3-turbo | English agent | GROQ_API_KEY |
| Sarvam AI ✅ | saarika:v2.5 | Hindi agent | SARVAM_API_KEY |
| Deepgram | nova-2 | (alternative) | DEEPGRAM_API_KEY |
| Google Cloud | chirp | (alternative) | GOOGLE_API_KEY |

TTS (Text-to-Speech)

| Provider | Model / Voice | Used In | API Key Env |
|---|---|---|---|
| Deepgram ✅ | aura-2-thalia-en | English agent | DEEPGRAM_API_KEY |
| Sarvam AI ✅ | anushka (Hindi) | Hindi agent | SARVAM_API_KEY |
| Cartesia | sonic-2 | (alternative) | CARTESIA_API_KEY |
| ElevenLabs | eleven_multilingual_v2 | (alternative) | ELEVENLABS_API_KEY |

VAD (Voice Activity Detection)

| Provider | Model | Used In |
|---|---|---|
| Silero ✅ | silero-vad | Both agents (runs locally, no API key needed) |

✅ = Currently configured in the codebase. To switch providers, update the entrypoint() function in the respective agent file.


📦 Project Structure

Rag-Mcp/
├── rag_mcp/                  # Core Python package
│   ├── config.py             #   Centralized config (frozen dataclass)
│   ├── database.py           #   SQLite layer (context-managed)
│   ├── embeddings.py         #   Embedding model (lazy-loaded)
│   ├── chunking.py           #   Sentence-boundary text chunking
│   ├── rag_engine.py         #   MCPRAGTool: high-level RAG API
│   ├── cli.py                #   CLI (ingest / search / stats / list)
│   └── agents/               #   LiveKit voice agents
│       ├── base.py           #     Shared logic + search tool
│       ├── english_agent.py  #     English-only agent
│       └── multi_agent.py    #     Hindi/English bilingual agent
├── documents/                # Knowledge base (NCERT textbooks)
├── voice-agent.py            # ★ Primary entry point
├── pyproject.toml            # Project metadata & tool config
├── requirements.txt          # pip dependencies
├── .env.example              # API key template
└── .gitignore

Note

The documents/ folder currently contains textbooks from NCERT (National Council of Educational Research and Training) covering subjects like Science, Maths, History, Economics, English Literature, Business Studies, and more. You can replace or add your own .txt, .md, or .csv files to build a custom knowledge base.
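
Before documents are embedded they are split at sentence boundaries, as noted for chunking.py above. The sketch below shows one plausible way to do that. It is not the project's implementation: it assumes chunk size and overlap are measured in characters (the README does not say whether they are characters or tokens), and the function name is hypothetical.

```python
import re


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Group whole sentences into chunks of about `chunk_size` characters.

    Each new chunk starts with the last `overlap` characters of the
    previous chunk, so context carries across the boundary. A single
    sentence longer than `chunk_size` is kept whole rather than cut.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            # Carry a character-based tail forward (it may start mid-word).
            tail = current[-overlap:] if overlap > 0 else ""
            current = f"{tail} {sentence}".strip()
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries keeps each chunk readable on its own, which matters when a retrieved passage is read aloud by the voice agent.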


🚀 Getting Started

Prerequisites

Installation

# 1. Clone the repository
git clone https://github.com/0xMihirK/EduRAG.git
cd EduRAG

# 2. Create a virtual environment
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS / Linux

# 3. Install dependencies
pip install -r requirements.txt

# 4. Download required model files
python voice-agent.py download-files

# 5. Set up API keys
copy .env.example .env        # Windows
# cp .env.example .env        # macOS / Linux
# Then edit .env with your actual keys
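
After editing .env, a quick check for missing keys can save a failed first run. The sketch below is hypothetical: the key names come from the provider tables above, but the grouping per agent is an assumption, and .env.example remains the authoritative list. It only inspects environment variables, so the .env file must already be loaded into the environment by whatever mechanism the project uses.

```python
import os

# Key names taken from the provider tables in this README. Which keys each
# agent really needs is an assumption; check .env.example for the full list.
REQUIRED_KEYS = {
    "english": ["GOOGLE_API_KEY", "GROQ_API_KEY", "DEEPGRAM_API_KEY"],
    "hindi": ["GOOGLE_API_KEY", "SARVAM_API_KEY"],
}


def missing_keys(language: str, env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS[language] if not env.get(name)]
```

For example, `missing_keys("hindi")` returns an empty list once both keys are set.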

📖 Usage

🎤 Voice Agent

Start the voice assistant with LiveKit:

# Download model files (first time only)
python voice-agent.py download-files

# English agent (default)
python voice-agent.py dev

# Hindi / English bilingual agent
python voice-agent.py --language hindi dev

Then connect to the LiveKit room via the LiveKit Playground or your own frontend.
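
The commands above place --language before the LiveKit subcommand (dev, download-files). One way an entry point could peel off that option and pass the rest through is sketched below. This is an illustrative guess, not the contents of voice-agent.py: the function name is hypothetical, and accepting "english" as an explicit value is an assumption (the README only shows "hindi" and an English default).

```python
import argparse


def split_language(argv: list[str]) -> tuple[str, list[str]]:
    """Extract --language and return it with the untouched remaining arguments."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument(
        "--language",
        choices=["english", "hindi"],  # "english" as a value is an assumption
        default="english",
    )
    args, remaining = parser.parse_known_args(argv)
    return args.language, remaining
```

The remaining arguments (for example `["dev"]`) could then be handed to the agent framework's own command-line handling unchanged.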

🔧 CLI: Document Management

# Ingest all documents from the documents/ folder
python -m rag_mcp --action ingest

# Semantic search
python -m rag_mcp --action search --query "Explain photosynthesis"

# Database statistics
python -m rag_mcp --action stats

# List ingested documents
python -m rag_mcp --action list
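
The flags used above (--action with ingest, search, stats, or list, plus --query) map naturally onto argparse. The following is a minimal sketch of such a parser, assuming that --query is required only for search; the real cli.py may validate differently, and the function names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="rag_mcp")
    parser.add_argument(
        "--action",
        required=True,
        choices=["ingest", "search", "stats", "list"],
    )
    parser.add_argument("--query", help="search text; used with --action search")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse arguments and reject a search that has no query (an assumption)."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.action == "search" and not args.query:
        parser.error("--action search requires --query")
    return args
```

With this sketch, `parse_args(["--action", "search", "--query", "Explain photosynthesis"])` yields a namespace whose `action` is `"search"` and whose `query` holds the question text.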

⚙️ Configuration

All settings can be overridden via environment variables:

| Setting | Env Variable | Default |
|---|---|---|
| Database path | RAG_DB_PATH | rag_database.db |
| Embedding model | RAG_MODEL_NAME | all-MiniLM-L6-v2 |
| Documents dir | RAG_DOCUMENTS_DIR | documents |
| Chunk size | RAG_CHUNK_SIZE | 500 |
| Chunk overlap | RAG_CHUNK_OVERLAP | 50 |
| Similarity threshold | RAG_SIMILARITY_THRESHOLD | 0.3 |
| Top-K results | RAG_TOP_K | 5 |

See .env.example for the full list of required API keys.
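
The project structure describes config.py as a frozen dataclass, and the settings above give its environment variables and defaults. A sketch of how the two could fit together follows. The class name, field names, and the `from_env` helper are hypothetical; only the variable names and default values come from this README.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    """Immutable settings; field names are illustrative, defaults are from the README."""

    db_path: str = "rag_database.db"
    model_name: str = "all-MiniLM-L6-v2"
    documents_dir: str = "documents"
    chunk_size: int = 500
    chunk_overlap: int = 50
    similarity_threshold: float = 0.3
    top_k: int = 5

    @classmethod
    def from_env(cls, env=None) -> "RagConfig":
        """Build a config, letting environment variables override the defaults."""
        env = os.environ if env is None else env
        return cls(
            db_path=env.get("RAG_DB_PATH", cls.db_path),
            model_name=env.get("RAG_MODEL_NAME", cls.model_name),
            documents_dir=env.get("RAG_DOCUMENTS_DIR", cls.documents_dir),
            chunk_size=int(env.get("RAG_CHUNK_SIZE", cls.chunk_size)),
            chunk_overlap=int(env.get("RAG_CHUNK_OVERLAP", cls.chunk_overlap)),
            similarity_threshold=float(
                env.get("RAG_SIMILARITY_THRESHOLD", cls.similarity_threshold)
            ),
            top_k=int(env.get("RAG_TOP_K", cls.top_k)),
        )
```

Freezing the dataclass means a setting cannot be changed by accident after startup; any attempt to assign to a field raises `dataclasses.FrozenInstanceError`.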


🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m "Add amazing feature"
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

🙏 Acknowledgements

  • Sarvam AI: Hindi speech-to-text (saarika:v2.5) and text-to-speech models powering the bilingual agent
  • LiveKit: Real-time voice infrastructure
  • Deepgram: English speech-to-text and text-to-speech (aura-2)
  • Google Gemini: Large language model for response generation
  • Groq: Ultra-fast Whisper STT inference
  • Sentence-Transformers: Semantic embedding models

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.


Built with ❤️ for education
