An AI-powered educational voice assistant that answers questions from your knowledge base using Retrieval-Augmented Generation.
EduRAG ingests your educational documents (NCERT textbooks, notes, PDFs converted to text, and so on), creates semantic embeddings, and lets you ask questions by voice. A LiveKit-powered voice agent retrieves the most relevant passages and answers in natural language, in either English-only or Hindi/English bilingual mode.
| Feature | Description |
|---|---|
| Semantic Search | Find answers by meaning, not just keywords, using the all-MiniLM-L6-v2 sentence-transformers model |
| Voice Agent | Talk to your knowledge base using LiveKit real-time voice infrastructure |
| Bilingual | English-only or Hindi/English code-switching agent |
| Auto-Ingest | Drop .txt, .md, .py, .json, or .csv files into documents/ and ingest with one command |
| Change Detection | MD5 hashing skips re-processing unchanged files |
| Local-first | SQLite database; no external vector DB required |
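
The Change Detection row can be illustrated with a short sketch. This is a minimal example rather than the project's actual code: the names `file_md5` and `files_to_ingest` and the in-memory `known_hashes` dictionary are assumptions made for illustration (the project presumably records its hashes in the SQLite database instead).

```python
import hashlib
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in blocks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def files_to_ingest(paths: list[Path], known_hashes: dict[str, str]) -> list[Path]:
    """Return only the files whose digest differs from the recorded one.

    known_hashes (hypothetical store) maps a file path to the digest seen at
    the previous ingest; it is updated in place for every file returned.
    """
    changed = []
    for path in paths:
        digest = file_md5(path)
        if known_hashes.get(str(path)) != digest:
            known_hashes[str(path)] = digest
            changed.append(path)
    return changed
```

Calling `files_to_ingest` twice on an unchanged file returns the file the first time and an empty list the second time, which is the behavior the feature table describes.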
graph LR
subgraph Client
A[User Voice]
end
subgraph LiveKit Cloud
B[WebRTC Room]
end
subgraph Voice Agent
C[English Agent]
D[Hindi/English Agent]
end
subgraph RAG Engine
E[Embeddings<br/>all-MiniLM-L6-v2]
F[Text Chunking<br/>Sentence Boundaries]
G[SQLite Database]
end
subgraph AI Services
H[Google Gemini<br/>LLM]
I[Deepgram<br/>STT]
J[Cartesia<br/>English TTS]
K[Sarvam AI<br/>Hindi STT & TTS]
end
A <-->|WebRTC| B
B <--> C
B <--> D
C --> E
D --> E
E <--> G
F --> G
C <--> H
D <--> H
C <--> I
C <--> J
D <--> K
Tech Stack: Python 3.10+ · LiveKit Agents · Sentence-Transformers · Deepgram · Google Gemini · Cartesia TTS · Sarvam AI · SQLite
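
As a rough illustration of the retrieval path in the diagram (agent to embeddings to SQLite), the sketch below ranks stored chunks by cosine similarity against a query vector. It rests on several assumptions: the `chunks` table schema, the JSON encoding of vectors, and the `search` function are hypothetical, and tiny hand-written vectors stand in for real all-MiniLM-L6-v2 embeddings so the example runs with the standard library alone. The `top_k` and `threshold` defaults mirror the documented defaults for RAG_TOP_K and RAG_SIMILARITY_THRESHOLD.

```python
import json
import math
import sqlite3


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(conn, query_vector, top_k=5, threshold=0.3):
    """Return up to top_k (score, text) pairs scoring at least threshold."""
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = []
    for text, embedding_json in rows:
        score = cosine_similarity(query_vector, json.loads(embedding_json))
        if score >= threshold:
            scored.append((score, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical schema: one row per chunk, vector stored as a JSON array.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (text TEXT, embedding TEXT)")
toy_chunks = {
    "Plants make food using sunlight.": [0.9, 0.1, 0.0],
    "The French Revolution began in 1789.": [0.0, 0.2, 0.9],
    "Chlorophyll absorbs light energy.": [0.8, 0.3, 0.1],
}
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [(text, json.dumps(vector)) for text, vector in toy_chunks.items()],
)

# Toy query vector pointing along the first axis ("photosynthesis-like").
results = search(conn, [1.0, 0.0, 0.0])
for score, text in results:
    print(f"{score:.2f}  {text}")
```

With these toy vectors the history sentence scores 0 and is filtered out by the threshold, while the two biology sentences are returned in descending order of similarity.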
The project uses a modular provider architecture. Here are the current and alternative options:
LLM providers:

| Provider | Model | Used In | API Key Env |
|---|---|---|---|
| Google Gemini ✅ | gemini-2.5-flash | Both agents | GOOGLE_API_KEY |
| Groq | llama-3.3-70b, mixtral-8x7b | (alternative) | GROQ_API_KEY |
| OpenAI | gpt-4o, gpt-4o-mini | (alternative) | OPENAI_API_KEY |
| Anthropic | claude-sonnet-4-20250514 | (alternative) | ANTHROPIC_API_KEY |

Speech-to-text (STT) providers:

| Provider | Model | Used In | API Key Env |
|---|---|---|---|
| Groq ✅ | whisper-large-v3-turbo | English agent | GROQ_API_KEY |
| Sarvam AI ✅ | saarika:v2.5 | Hindi agent | SARVAM_API_KEY |
| Deepgram | nova-2 | (alternative) | DEEPGRAM_API_KEY |
| Google Cloud | chirp | (alternative) | GOOGLE_API_KEY |

Text-to-speech (TTS) providers:

| Provider | Model / Voice | Used In | API Key Env |
|---|---|---|---|
| Deepgram ✅ | aura-2-thalia-en | English agent | DEEPGRAM_API_KEY |
| Sarvam AI ✅ | anushka (Hindi) | Hindi agent | SARVAM_API_KEY |
| Cartesia | sonic-2 | (alternative) | CARTESIA_API_KEY |
| ElevenLabs | eleven_multilingual_v2 | (alternative) | ELEVENLABS_API_KEY |

Voice activity detection (VAD):

| Provider | Model | Used In |
|---|---|---|
| Silero ✅ | silero-vad | Both agents (runs locally, no API key needed) |

✅ = currently configured in the codebase. To switch providers, update the entrypoint() function in the respective agent file.
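
Before starting an agent it can help to confirm that the keys for the currently configured providers are set. The sketch below is an illustration only: the `REQUIRED_KEYS` mapping and the `missing_keys` helper are hypothetical names, and the mapping simply restates the provider tables above. Any other credentials listed in .env.example are outside its scope.

```python
import os

# Env variable needed by each currently configured provider, per the tables.
# Silero VAD runs locally and needs no key, so it is not listed.
REQUIRED_KEYS = {
    "english": {
        "Google Gemini (LLM)": "GOOGLE_API_KEY",
        "Groq (STT)": "GROQ_API_KEY",
        "Deepgram (TTS)": "DEEPGRAM_API_KEY",
    },
    "hindi": {
        "Google Gemini (LLM)": "GOOGLE_API_KEY",
        "Sarvam AI (STT and TTS)": "SARVAM_API_KEY",
    },
}


def missing_keys(agent: str, env=None) -> list[str]:
    """Return the env variable names that are unset or empty for an agent."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS[agent].values() if not env.get(name)]


if __name__ == "__main__":
    for agent in REQUIRED_KEYS:
        print(agent, "missing:", missing_keys(agent) or "none")
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.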
Rag-Mcp/
├── rag_mcp/                  # Core Python package
│   ├── config.py             # Centralized config (frozen dataclass)
│   ├── database.py           # SQLite layer (context-managed)
│   ├── embeddings.py         # Embedding model (lazy-loaded)
│   ├── chunking.py           # Sentence-boundary text chunking
│   ├── rag_engine.py         # MCPRAGTool: high-level RAG API
│   ├── cli.py                # CLI (ingest / search / stats / list)
│   └── agents/               # LiveKit voice agents
│       ├── base.py           # Shared logic + search tool
│       ├── english_agent.py  # English-only agent
│       └── multi_agent.py    # Hindi/English bilingual agent
├── documents/                # Knowledge base (NCERT textbooks)
├── voice-agent.py            # Primary entry point
├── pyproject.toml            # Project metadata & tool config
├── requirements.txt          # pip dependencies
├── .env.example              # API key template
└── .gitignore
Note: The documents/ folder currently contains textbooks from NCERT (National Council of Educational Research and Training) covering subjects such as Science, Maths, History, Economics, English Literature, Business Studies, and more. You can replace them with, or add, your own .txt, .md, or .csv files to build a custom knowledge base.
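
When documents are ingested, the text is split into chunks at sentence boundaries. The sketch below shows one way this could work and is not taken from chunking.py: the function names are hypothetical, sizes are assumed to be measured in characters, and the sentence splitter is a simple punctuation heuristic. The defaults match the documented chunk size (500) and overlap (50).

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows '.', '!' or '?' (a simple heuristic)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Group whole sentences into chunks of at most chunk_size characters.

    A new chunk starts with the trailing sentences of the previous chunk that
    fit within `overlap` characters. A single sentence longer than chunk_size
    is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > chunk_size:
            chunks.append(" ".join(current))
            # Carry over trailing sentences that fit in the overlap budget.
            carried: list[str] = []
            for previous in reversed(current):
                if len(" ".join([previous] + carried)) > overlap:
                    break
                carried.insert(0, previous)
            # Drop the overlap if it would push the next chunk over the limit.
            if len(" ".join(carried + [sentence])) > chunk_size:
                carried = []
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries keeps each chunk readable on its own, and the overlap gives neighboring chunks shared context so an answer that spans a boundary is less likely to be lost.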
# 1. Clone the repository
git clone https://github.com/0xMihirK/EduRAG.git
cd EduRAG
# 2. Create a virtual environment
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # macOS / Linux
# 3. Install dependencies
pip install -r requirements.txt
# 4. Download required model files
python voice-agent.py download-files
# 5. Set up API keys
copy .env.example .env # Windows
# cp .env.example .env # macOS / Linux
# Then edit .env with your actual keys

Start the voice assistant with LiveKit:
# Download model files (first time only)
python voice-agent.py download-files
# English agent (default)
python voice-agent.py dev
# Hindi / English bilingual agent
python voice-agent.py --language hindi dev

Then connect to the LiveKit room via the LiveKit Playground or your own frontend.
# Ingest all documents from the documents/ folder
python -m rag_mcp --action ingest
# Semantic search
python -m rag_mcp --action search --query "Explain photosynthesis"
# Database statistics
python -m rag_mcp --action stats
# List ingested documents
python -m rag_mcp --action list

All settings can be overridden via environment variables:
| Setting | Env Variable | Default |
|---|---|---|
| Database path | RAG_DB_PATH | rag_database.db |
| Embedding model | RAG_MODEL_NAME | all-MiniLM-L6-v2 |
| Documents dir | RAG_DOCUMENTS_DIR | documents |
| Chunk size | RAG_CHUNK_SIZE | 500 |
| Chunk overlap | RAG_CHUNK_OVERLAP | 50 |
| Similarity threshold | RAG_SIMILARITY_THRESHOLD | 0.3 |
| Top-K results | RAG_TOP_K | 5 |
See .env.example for the full list of required API keys.
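
The environment-variable overrides map naturally onto a frozen dataclass, which is how config.py is described in the project layout. The following is a sketch under assumptions, not the project's code: the class name `RAGConfig`, its field names, and `load_config` are hypothetical, while the variable names and defaults come from the settings table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RAGConfig:
    """Immutable settings; defaults match the documented values."""
    db_path: str = "rag_database.db"
    model_name: str = "all-MiniLM-L6-v2"
    documents_dir: str = "documents"
    chunk_size: int = 500
    chunk_overlap: int = 50
    similarity_threshold: float = 0.3
    top_k: int = 5


def load_config(env=None) -> RAGConfig:
    """Build a config, letting RAG_* variables override the defaults."""
    env = os.environ if env is None else env
    defaults = RAGConfig()
    return RAGConfig(
        db_path=env.get("RAG_DB_PATH", defaults.db_path),
        model_name=env.get("RAG_MODEL_NAME", defaults.model_name),
        documents_dir=env.get("RAG_DOCUMENTS_DIR", defaults.documents_dir),
        chunk_size=int(env.get("RAG_CHUNK_SIZE", defaults.chunk_size)),
        chunk_overlap=int(env.get("RAG_CHUNK_OVERLAP", defaults.chunk_overlap)),
        similarity_threshold=float(
            env.get("RAG_SIMILARITY_THRESHOLD", defaults.similarity_threshold)
        ),
        top_k=int(env.get("RAG_TOP_K", defaults.top_k)),
    )
```

For example, `load_config({"RAG_TOP_K": "8"})` yields a config with `top_k` set to 8 and every other field at its default. Because the dataclass is frozen, settings cannot be changed by accident after loading.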
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit your changes: git commit -m "Add amazing feature"
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
- Sarvam AI: Hindi speech-to-text (saarika:v2.5) and text-to-speech models powering the bilingual agent
- LiveKit: Real-time voice infrastructure
- Deepgram: English speech-to-text and text-to-speech (aura-2)
- Google Gemini: Large language model for response generation
- Groq: Ultra-fast Whisper STT inference
- Sentence-Transformers: Semantic embedding models
This project is licensed under the MIT License; see the LICENSE file for details.
Built with ❤️ for education