# 🎓 Project-Based Learning Task: Build a Personalized Learning System using LLM + FAISS + Streamlit

## 🔰 Project Objective
Develop an intelligent learning assistant that:
- Accepts documents from users (PDF, DOCX, TXT)
- Extracts and indexes educational content
- Supports question answering, topic explanation, assignments, and more
- Uses the LLM to detect and decline non-educational queries
- Delivers all of this through an intuitive **Streamlit web app**

---

## 🗂️ Task Breakdown

### ✅ Task 1: Environment Setup and Dependency Management
**Goal:** Prepare the working environment with required packages.

- Install the required libraries using pip (`langchain-community` provides LangChain's FAISS vector store wrapper):  
  `pip install streamlit langchain langchain-community faiss-cpu pdfplumber python-docx python-dotenv`

- Install Gemini LLM-specific packages:  
  `pip install langchain-google-genai`

- Set up a `.env` file to store your Gemini API key:  
  `GOOGLE_API_KEY=your-api-key-here`
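
One way to read the key at runtime is sketched below. The helper name `get_api_key` is hypothetical, not part of the task spec. It loads `.env` through `python-dotenv` when that package is available and fails early with a clear message if the key is missing.

```python
import os


def get_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, loading .env first if possible."""
    try:
        # python-dotenv copies KEY=value pairs from .env into os.environ.
        # Variables that are already set are left untouched by default.
        from dotenv import load_dotenv

        load_dotenv()
    except ImportError:
        pass  # fall back to whatever is already in the environment

    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file.")
    return key
```

Listing `.env` in `.gitignore` keeps the key out of version control.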

---

### ✅ Task 2: Create `processor.py` – Text Extraction from Files

**Goal:** Extract plain text from `.pdf`, `.docx`, and `.txt`.

**Requirements:**
- Use `pdfplumber` for PDFs, `python-docx` (imported as `docx`) for Word files, and plain file I/O for text files.
- Write the function `extract_text(file_path)` that detects the file extension and routes it to the correct handler.
- Test with different file formats.
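
A minimal sketch of the extension-based routing described above. The private helper names and the `HANDLERS` table are illustrative choices; only `extract_text(file_path)` comes from the task. The third-party imports are deferred into the handlers so that `.txt` files can be processed even when a PDF or DOCX library is missing.

```python
from pathlib import Path


def _read_pdf(path: Path) -> str:
    import pdfplumber  # imported lazily; only needed for PDFs

    with pdfplumber.open(str(path)) as pdf:
        # extract_text() can return None for pages without a text layer
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def _read_docx(path: Path) -> str:
    import docx  # provided by the python-docx package

    return "\n".join(p.text for p in docx.Document(str(path)).paragraphs)


def _read_txt(path: Path) -> str:
    # Replace undecodable bytes instead of failing on unusual encodings
    return path.read_text(encoding="utf-8", errors="replace")


# Maps a lowercase file extension to the function that can read it
HANDLERS = {".pdf": _read_pdf, ".docx": _read_docx, ".txt": _read_txt}


def extract_text(file_path) -> str:
    """Detect the file extension and route to the matching handler."""
    path = Path(file_path)
    handler = HANDLERS.get(path.suffix.lower())
    if handler is None:
        raise ValueError(f"Unsupported file type: {path.suffix or '(none)'}")
    return handler(path)
```

Supporting another format later only requires one more entry in `HANDLERS`.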

---

### ✅ Task 3: Create `embedder.py` – FAISS Indexing and Search

**Goal:** Split extracted document text into chunks, embed each chunk, and index the vectors with FAISS for semantic search.

**Requirements:**
- Use LangChain's `RecursiveCharacterTextSplitter` to chunk large text.
- Embed chunks using `GoogleGenerativeAIEmbeddings`.
- Store and load vector database using FAISS.
- Implement `create_faiss_index(text)` and `load_faiss_index()` functions.
- Add `get_similar_chunks(query, vector_db, embedding_model, top_k=5)` function.
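
The three functions could be sketched as follows. Several details are assumptions rather than requirements: the embedding model name, the chunk size and overlap, the extra optional parameters, the helper `get_embedding_model`, and the choice to return chunk text as plain strings. LangChain imports are deferred into the functions so the module can be imported without the heavy dependencies installed.

```python
INDEX_DIR = "data/faiss_index"
EMBEDDING_MODEL = "models/embedding-001"  # assumed name; use one your API key supports


def get_embedding_model():
    """Hypothetical helper: build the Gemini embedding client (reads GOOGLE_API_KEY)."""
    from langchain_google_genai import GoogleGenerativeAIEmbeddings

    return GoogleGenerativeAIEmbeddings(model=EMBEDDING_MODEL)


def create_faiss_index(text, index_dir=INDEX_DIR, embedding_model=None):
    """Chunk the text, embed the chunks, save the index to disk, and return it."""
    from langchain_community.vectorstores import FAISS
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # Sizes are starting points to tune, not recommended values
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    chunks = splitter.split_text(text)
    if not chunks:
        raise ValueError("No text to index.")
    vector_db = FAISS.from_texts(chunks, embedding_model or get_embedding_model())
    vector_db.save_local(index_dir)
    return vector_db


def load_faiss_index(index_dir=INDEX_DIR, embedding_model=None):
    """Load a previously saved index from disk."""
    from langchain_community.vectorstores import FAISS

    return FAISS.load_local(
        index_dir,
        embedding_model or get_embedding_model(),
        # The saved docstore is a pickle; only load index files you created yourself
        allow_dangerous_deserialization=True,
    )


def get_similar_chunks(query, vector_db, embedding_model, top_k=5):
    """Return the text of the top_k chunks closest to the query."""
    query_vector = embedding_model.embed_query(query)
    docs = vector_db.similarity_search_by_vector(query_vector, k=top_k)
    return [doc.page_content for doc in docs]
```

Embedding the query explicitly is what gives the `embedding_model` parameter in the required signature a job; `vector_db.similarity_search(query, k=top_k)` would be an equivalent shortcut.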

---

### ✅ Task 4: Create `llm_handler.py` – Prompt Engineering + LLM Integration

**Goal:** Handle all educational generation logic.

**Requirements:**
- Use Gemini (`GoogleGenerativeAI`) to:
  - Classify educational vs. non-educational queries
  - Generate:
    - Topic Explanations
    - Assignments
    - Exam Questions
    - Interview Questions
- Implement `generate_response(...)` based on `help_type`.
- Implement `is_educational_with_llm(...)`.
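
A possible shape for this module is sketched below. The task leaves both signatures open, so the parameter lists here are assumptions, as are the prompt wording, the model name, and the helpers `get_llm`, `build_prompt`, and `_ask`. Passing the `llm` object in as an argument keeps the functions easy to test with a stub.

```python
# One instruction per help_type; the wording is illustrative and meant to be tuned
PROMPTS = {
    "Topic Explanation": "Explain the topic below clearly for a student, with one example.",
    "Assignment": "Write a practice assignment with five tasks on the topic below.",
    "Exam Questions": "Write five exam questions with brief model answers on the topic below.",
    "Interview Questions": "Write five interview questions with brief answers on the topic below.",
}


def get_llm(model="gemini-1.5-flash"):  # assumed model name; substitute one you can access
    from langchain_google_genai import GoogleGenerativeAI

    return GoogleGenerativeAI(model=model)


def _ask(llm, prompt):
    result = llm.invoke(prompt)
    # GoogleGenerativeAI returns a str; chat models return a message with .content
    return str(getattr(result, "content", result)).strip()


def is_educational_with_llm(query, llm):
    """Ask the LLM for a YES/NO verdict on whether the query is educational."""
    prompt = (
        "Answer with exactly one word, YES or NO. "
        "Is the following query related to education or learning?\n\n"
        f"Query: {query}"
    )
    return _ask(llm, prompt).upper().startswith("YES")


def build_prompt(query, help_type, context=""):
    """Combine the instruction for help_type, optional retrieved context, and the query."""
    if help_type not in PROMPTS:
        raise ValueError(f"Unknown help_type: {help_type!r}")
    parts = [PROMPTS[help_type]]
    if context:
        parts.append(f"Base your answer on this source material:\n{context}")
    parts.append(f"Topic or question: {query}")
    return "\n\n".join(parts)


def generate_response(query, help_type, llm, context=""):
    return _ask(llm, build_prompt(query, help_type, context))
```

Anything other than a reply starting with "YES" is treated as non-educational, so an unexpected or malformed reply blocks the query rather than letting it through.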

---

### ✅ Task 5: Build `app.py` – Streamlit Interface

**Goal:** Create an interactive frontend for your educational assistant.

**Requirements:**
- Add a title and interaction type options using `st.selectbox`.
- Enable document upload, processing, and vector creation.
- Enable direct query mode for LLM-only assistance.
- Use FAISS for retrieval and send results + query to `generate_response()`.
- Display the result using `st.markdown()`, and report failures with `st.error()` or `st.warning()` instead of letting the app crash.
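
The requirements above might be wired together roughly as follows. This is a sketch under several assumptions: the widget labels, both model names, the hypothetical `answer_query` helper, the argument order passed to `generate_response` and `is_educational_with_llm`, and `create_faiss_index` returning the vector store. For brevity it rebuilds the index on every upload, which Task 6 asks you to avoid.

```python
from pathlib import Path

HELP_TYPES = ["Topic Explanation", "Assignment", "Exam Questions", "Interview Questions"]
UPLOAD_DIR = Path("data/uploaded_docs")


def answer_query(query, help_type, classify, generate, retrieve=None):
    """Gate the query, optionally fetch context, then generate. Returns (ok, text)."""
    if not query.strip():
        return False, "Please enter a question or topic."
    if not classify(query):
        return False, "This assistant only answers educational queries."
    context = "\n\n".join(retrieve(query)) if retrieve else ""
    return True, generate(query, help_type, context)


def main():
    # Imported here so answer_query() can be tested without Streamlit installed
    import streamlit as st
    from langchain_google_genai import GoogleGenerativeAI, GoogleGenerativeAIEmbeddings

    # Your own modules from Tasks 2-4
    import embedder
    import llm_handler
    import processor

    st.title("Personalized Learning Assistant")
    help_type = st.selectbox("Interaction type", HELP_TYPES)
    uploaded = st.file_uploader("Upload a document (optional)", type=["pdf", "docx", "txt"])
    query = st.text_input("Your question or topic")
    if not st.button("Generate"):
        return

    try:
        llm = GoogleGenerativeAI(model="gemini-1.5-flash")  # assumed model name
        retrieve = None  # stays None in direct (LLM-only) mode
        if uploaded is not None:
            UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
            path = UPLOAD_DIR / Path(uploaded.name).name
            path.write_bytes(uploaded.getvalue())
            with st.spinner("Indexing document..."):
                # Assumes create_faiss_index() returns the vector store
                vector_db = embedder.create_faiss_index(processor.extract_text(path))
            emb = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # assumed

            def retrieve(q):
                return embedder.get_similar_chunks(q, vector_db, emb, top_k=5)

        ok, text = answer_query(
            query,
            help_type,
            # Argument orders below are assumptions; the task leaves them open
            classify=lambda q: llm_handler.is_educational_with_llm(q, llm),
            generate=lambda q, h, ctx: llm_handler.generate_response(q, h, llm, ctx),
            retrieve=retrieve,
        )
        (st.markdown if ok else st.warning)(text)
    except Exception as exc:  # show any failure in the UI instead of a traceback
        st.error(f"Something went wrong: {exc}")
```

Streamlit runs the script from top to bottom on every interaction, so the real `app.py` would end with a module-level `main()` call and be started with `streamlit run app.py`.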

---

### ✅ Task 6: Directory Setup & Flow Integration

**Goal:** Organize the project structure and ensure smooth I/O.

**Requirements:**
- Create folders: `data/uploaded_docs/`, `data/faiss_index/`.
- Confirm file saving works correctly after upload.
- Confirm FAISS index is reused and not regenerated unnecessarily.
- Add useful success/warning/error messages.
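
One way to meet the reuse requirement is to store a hash of the source document next to the index and rebuild only when it changes. The sketch below uses only the standard library. The function names and the `source.sha256` marker file are hypothetical; `index.faiss` is the file name LangChain's `save_local` writes by default.

```python
import hashlib
from pathlib import Path

UPLOAD_DIR = Path("data/uploaded_docs")
INDEX_DIR = Path("data/faiss_index")


def ensure_dirs(*dirs):
    """Create each directory (and parents) if it does not exist yet."""
    for d in dirs:
        Path(d).mkdir(parents=True, exist_ok=True)


def save_upload(name, data, upload_dir=UPLOAD_DIR):
    """Write uploaded bytes to disk and return the saved path."""
    ensure_dirs(upload_dir)
    # Path(name).name drops directory components from the client-supplied name
    target = Path(upload_dir) / Path(name).name
    target.write_bytes(data)
    return target


def index_is_current(data, index_dir=INDEX_DIR):
    """True if a saved index exists and was built from exactly these bytes."""
    index_dir = Path(index_dir)
    marker = index_dir / "source.sha256"
    return (
        (index_dir / "index.faiss").exists()
        and marker.exists()
        and marker.read_text() == hashlib.sha256(data).hexdigest()
    )


def mark_index_built(data, index_dir=INDEX_DIR):
    """Record which document bytes the saved index was built from."""
    ensure_dirs(index_dir)
    (Path(index_dir) / "source.sha256").write_text(hashlib.sha256(data).hexdigest())
```

In the app, the flow would be: if `index_is_current(data)` is true, call `load_faiss_index()`; otherwise call `create_faiss_index(text)` followed by `mark_index_built(data)`, and report which path was taken with `st.success()` or `st.warning()`.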

---

### ✅ Task 7: Testing & Debugging

**Goal:** Validate the system functionality.

**Requirements:**
- Upload a sample educational document.
- Try asking for:
  - Explanation of a topic from the document
  - Assignment generation
  - Exam questions
- Try a non-educational query and verify it’s blocked.
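
The manual checks above can also be scripted. The harness below is a sketch: the sample queries are illustrative, and `ask(query, help_type)` stands for any callable you supply that returns an `(ok, text)` pair, which is an assumed interface rather than part of the task.

```python
CASES = [
    # (query, help_type, should_be_blocked)
    ("Explain photosynthesis", "Topic Explanation", False),
    ("Create an assignment on Newton's laws", "Assignment", False),
    ("Write exam questions about binary search", "Exam Questions", False),
    ("Which pizza place is best near me?", "Topic Explanation", True),
]


def run_smoke_checks(ask, cases=CASES):
    """Run each case through ask() and return a list of failure messages."""
    failures = []
    for query, help_type, should_block in cases:
        ok, text = ask(query, help_type)
        if should_block and ok:
            failures.append(f"Not blocked: {query!r}")
        elif not should_block and not (ok and text.strip()):
            failures.append(f"No answer for: {query!r} ({help_type})")
    return failures
```

An empty list means every case behaved as expected; with a real LLM behind `ask`, expect the occasional misclassification and treat failures as prompts to refine the classifier prompt.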

---

## 🧠 Learning Outcomes
- LangChain for Embedding + Retrieval
- Gemini LLM for content generation
- File processing and chunking logic
- Vector search with FAISS
- Prompt engineering based on content type
- Full-stack Streamlit app development

---

## 🏁 Final Submission Checklist
Before submitting, ensure:
- All `.py` files are clean and commented
- `.env` file is NOT shared
- Project runs locally without errors
- A sample document and, if possible, a demo video are attached
