# NCERT AI TUTOR


# How to use?

**Step 1: Create a virtual environment**

python -m venv .venv

**Step 2: Activate the Virtual Environment**

source .venv/bin/activate

**Step 3: Install Requirements**

pip install --upgrade pip
pip install -r requirements.txt

**Step 4: Create .env**

Create a `.env` file in the project root containing all required API keys.

OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
GOOGLE_API_KEY=your_google_api_key_here
HUGGINGFACE_API_TOKEN=your_huggingface_token_here
LANGCHAIN_API_KEY=your_langchain_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
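
Before starting the backend, it can help to confirm that the file is complete. The sketch below is a stdlib-only illustration and is not part of the repository: the function names `parse_env` and `missing_keys` are hypothetical, and it assumes that every key in the template is required and that unfilled values still begin with `your_`, as they do in the template.

```python
from pathlib import Path

# Key names copied from the .env template above.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
    "HUGGINGFACE_API_TOKEN",
    "LANGCHAIN_API_KEY",
    "TAVILY_API_KEY",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith("your_")
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    problems = missing_keys(parse_env(text))
    if problems:
        print("Missing or placeholder keys:", ", ".join(problems))
    else:
        print("All required keys are set.")
```

Saved under any name and run from the project root, the script lists the keys that still need real values.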

**Step 5: Create vector store**

python scripts/build_vector_store.py

**Step 6: Start FastAPI backend**

uvicorn app.backend:app --reload

**Step 7: Start Streamlit UI**

streamlit run app/ui.py

# Design Decisions

## Vector Store: FAISS vs Chroma

This Retrieval-Augmented Generation (RAG) tutoring system depends on its vector store for efficient and accurate semantic search over educational content. After evaluating multiple options, FAISS (Facebook AI Similarity Search) was chosen over Chroma for the following reasons:

1. **Performance and Scalability**
FAISS is a highly optimized library developed by Meta AI Research. It is designed for high-performance similarity search and clustering of dense vectors. The library supports several advanced indexing strategies such as Inverted File Indexes (IVF), Product Quantization (PQ), and Hierarchical Navigable Small World (HNSW) graphs, which are well-suited for large-scale deployments. This makes FAISS a more robust option for handling large volumes of text data, such as educational textbooks spanning multiple grades and subjects.

2. **Stability and Maturity**
FAISS has been widely adopted in academic and industrial applications since its release. That track record makes it a stable, predictable choice for production environments, which is particularly valuable in educational tools, where the accuracy and speed of retrieval shape the user experience.

3. **Ease of Integration and Persistence**
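The LangChain framework has a well-supported FAISS integration, including simple methods for saving and loading vector indexes to disk (`save_local()` and `load_local()`). This simplifies the development workflow and enables persistent local storage without a server backend, which suits lightweight deployments and offline environments. Note that recent LangChain versions require passing `allow_dangerous_deserialization=True` to `load_local()`, because part of the saved index is stored with pickle; only load index files you created yourself.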

4. **Local-First Design**
Since this application is intended to run locally or on user-hosted machines (for example, in schools or other low-resource environments), FAISS is well-suited due to its low dependency footprint and fully local execution model. Chroma can also run in-process with local persistence, but it is commonly deployed in client-server mode and brings a larger set of dependencies to manage metadata and indexes.

5. **Comparison with Chroma**
Chroma offers a developer-friendly interface and built-in metadata filtering (backed by SQLite in current releases; earlier releases used DuckDB), which makes it well-suited to rapid prototyping or cloud-native applications that benefit from RESTful APIs and server persistence. This use case, however, emphasizes speed, simplicity, and offline usability, and on those criteria FAISS is the better fit for the project's deployment goals.
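
To make the inverted file (IVF) idea from the Performance and Scalability point concrete, here is a minimal pure-Python sketch. It is not FAISS code and not how this project builds its index: the function names are hypothetical, and the centroids are fixed by hand, whereas FAISS learns them with k-means during a training step. The point is only to show why IVF scales: a query scans the `nprobe` closest lists rather than every vector.

```python
def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Return the k closest vector ids, scanning only the nprobe nearest lists."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]


# Toy data: two well-separated clusters and one hand-picked centroid per cluster.
vectors = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
lists = build_ivf(vectors, centroids)

# With nprobe=1 only the second cluster's list is scanned.
nearest = ivf_search([4.9, 5.0], vectors, centroids, lists, k=1, nprobe=1)
print(nearest)
```

Raising `nprobe` trades speed for recall: with `nprobe` equal to the number of lists, the search degenerates to an exhaustive scan.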

In summary, FAISS was selected due to its **speed, reliability, local storage capabilities, and production-grade performance**, all of which are essential for building a scalable and responsive educational RAG application. Chroma remains a valuable alternative for future iterations if server-side querying or live metadata filtering becomes necessary.
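
Until such a move is needed, simple metadata filtering can be approximated on top of FAISS by over-fetching results and filtering them afterwards. The sketch below illustrates that pattern in plain Python. The helper name `filtered_top_k` and the `grade`, `subject`, and `chunk_id` metadata fields are assumptions made for illustration, not fields this project is known to store. LangChain's FAISS wrapper offers comparable `filter` and `fetch_k` arguments on its search methods, though the exact behavior should be checked against the installed version.

```python
def filtered_top_k(scored_hits, wanted, k=3):
    """Keep the k closest hits whose metadata matches every key in `wanted`.

    scored_hits: list of (distance, metadata) pairs; lower distance is closer.
    """
    matches = [
        (dist, meta)
        for dist, meta in scored_hits
        if all(meta.get(key) == value for key, value in wanted.items())
    ]
    return sorted(matches, key=lambda hit: hit[0])[:k]


# Hypothetical over-fetched search results (more than the k we finally want).
hits = [
    (0.12, {"grade": 8, "subject": "science", "chunk_id": "a"}),
    (0.15, {"grade": 10, "subject": "science", "chunk_id": "b"}),
    (0.31, {"grade": 8, "subject": "maths", "chunk_id": "c"}),
    (0.40, {"grade": 8, "subject": "science", "chunk_id": "d"}),
]

top = filtered_top_k(hits, {"grade": 8, "subject": "science"}, k=2)
print([meta["chunk_id"] for _, meta in top])
```

The weakness of post-filtering is that a narrow filter can leave fewer than `k` results unless enough candidates are fetched up front, which is one reason built-in filtering may eventually justify a different store.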