<a href="https://colab.research.google.com/github/Rohanrathod7/StudyBuddy-Pro-AI-Powered-PDF-RAG-Study-Planner-Agent/blob/main/Notebook/StudyBuddy_01.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 1.1 Project Overview
StudyBuddy Pro is an intelligent, multi-capability AI agent designed to help students transform unstructured study materials (PDF notes, syllabi, research papers) into actionable learning tasks and personalized study plans.  
It seamlessly integrates **Retrieval-Augmented Generation (RAG)** with a **Task Planner Agent** to enable context-grounded Q&A and multi-day study schedule generation.

Gemini serves as the central reasoning engine, supported by custom tools for PDF processing, chunking, vector search, and task planning. Text extraction, embedding, and retrieval run inside the notebook session, while answer generation calls the Gemini API; the whole workflow is intended to run at no cost.


## 1.2 End-to-End System Flow
### **1. User Input**
A user can:
- Ask a question about study content, *or*
- Request a study plan, *or*
- Combine both (e.g., "Explain topic X and make a 3-day plan")

---

### **2. Intent Classifier Agent**
A lightweight Gemini classifier determines the intent:
- **PDF\_QA** → PDF question-answering  
- **PLAN** → study planning request  
- **BOTH** → requires both flows  

The classified intent determines which agent pipelines run.

---

### **3. PDF RAG Agent Pipeline**
Used for answering content-based questions.

1. **PDF Extraction Tool**  
   Extracts raw text from the uploaded PDF.

2. **Chunking Tool**  
   Splits text into overlapping chunks for more accurate semantic retrieval.

3. **Embedding Tool**  
   Converts chunks into vector embeddings (SentenceTransformers).

4. **FAISS Vector Store**  
   Enables fast semantic search over the document content.

5. **RAG Retrieval Tool**  
   Fetches the top relevant passages for the user's question.

6. **Gemini Answer Generation**  
   Gemini produces an answer grounded *only* in the retrieved context.

---

### **4. Study Planner Agent Pipeline**
Used for creating schedules and managing the user’s tasks.

1. **Task Extraction (optional)**  
   Gemini parses PDF topics into actionable study tasks.

2. **add_task Tool**  
   Stores tasks in session memory.

3. **list_tasks Tool**  
   Retrieves tasks for planning.

4. **generate_study_plan Tool**  
   Distributes tasks over n days within a daily hour budget.

5. **Gemini Polishing**  
   Converts the raw schedule into a clear, friendly study plan that reflects the user's stated preferences.

---

### **5. Memory & State Handling**
- `TASKS` list persists user tasks across interactions within a session.  
- A `preferences` string carries settings like preferred study hours into the planner.  
- The FAISS embeddings index holds the vectorized study material for retrieval.

---

### **6. Final Response Assembly**
Depending on the intent, the system outputs:

- **Grounded RAG answer**  
- **Personalized multi-day study plan**  
- **Or a combined response**  

---



## 1.3 Problem Statement
Students often receive study materials as large, unstructured PDFs—syllabi, textbooks, research papers, lecture notes—which are difficult to navigate and convert into actionable study plans.  
This leads to:

- Poor understanding of key concepts  
- Misallocation of study time  
- Overwhelm during exams  
- Lack of personalized guidance  
- Difficulty in structuring long-term study goals  

**Goal:**  
Build an AI Agent that can:

1. **Understand** what’s in the student’s PDFs  
2. **Answer questions** grounded in their notes (RAG)  
3. **Convert topics** into structured tasks  
4. **Create personalized study schedules**  
5. **Retain memory** of tasks and preferences  

This bridges the gap between raw study material and efficient learning.

---

## 1.4 Experimental Setup & Evaluation Plan
### **1. System Components**
- **LLM:** Gemini 2.5 Flash (fast, cost-free on Colab/Kaggle)  
- **Vector Database:** FAISS CPU  
- **Embeddings Model:** SentenceTransformer `all-MiniLM-L6-v2`  
- **PDF Processing:** PyPDF  
- **Agent Logic:** Python-based multi-agent orchestration  
- **Tools:** Custom tools for extraction, embeddings, retrieval, task planning  

---

### **2. Functionality Evaluation**
#### **A. PDF RAG Accuracy**
- Test 10–15 comprehension questions about the uploaded PDF  
- Expected output: Correct answers grounded strictly in the retrieved context  
- Validation metrics:  
  - *Context precision*  
  - *No hallucination*  
  - *Clarity of explanation*  
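
One way to make *context precision* measurable, as a minimal sketch: treat it as the fraction of retrieved chunks that a reviewer marked relevant to the question. The function name `context_precision` and the labeling format (a set of chunk indices) are assumptions for illustration, not part of the notebook.

```python
def context_precision(retrieved_ids, relevant_ids):
    """Fraction of retrieved chunk ids that appear in the hand-labeled relevant set."""
    if not retrieved_ids:
        return 0.0
    relevant = set(relevant_ids)
    hits = sum(1 for cid in retrieved_ids if cid in relevant)
    return hits / len(retrieved_ids)

# Hypothetical labels: chunks 0 and 2 were judged relevant for this question.
score = context_precision(retrieved_ids=[0, 1, 2], relevant_ids={0, 2})
print(round(score, 2))  # 2 of 3 retrieved chunks are relevant -> 0.67
```

Averaging this score over the 10–15 test questions would give a single number to compare across chunk sizes or values of `k`.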

#### **B. Study Plan Quality**
- Provide sample task sets and test plan generation with varying constraints:  
  - Time availability  
  - Preferred hours  
  - Number of days  
- Evaluate based on:  
  - Logical distribution of work  
  - Completeness of schedule  
  - User constraints respected  
  - Readability of final output  

---

### **3. Performance Evaluation**
- **Embedding generation time**  
- **Retrieval speed** (FAISS search time)  
- **Gemini response latency**  
- **Overall workflow execution time**  
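
These timings could be collected with a small context manager. The sketch below uses only the standard library; `timed` and `TIMINGS` are hypothetical names, and the workload inside the `with` block is a stand-in for a real stage such as the FAISS search.

```python
import time
from contextlib import contextmanager

TIMINGS = {}  # hypothetical store: stage name -> list of elapsed seconds

@contextmanager
def timed(stage):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        TIMINGS.setdefault(stage, []).append(time.perf_counter() - start)

# Stand-in workload; in the notebook this would wrap e.g. the retrieval call.
with timed("retrieval"):
    sum(i * i for i in range(10_000))

# Report the mean duration per stage.
print({stage: f"{sum(v) / len(v):.6f}s" for stage, v in TIMINGS.items()})
```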

---

### **4. Stability & Reliability Tests**
- Multi-turn interactions  
- Memory persistence  
- Edge cases: empty PDFs, unusually large PDFs, low-content PDFs  
- Planner behavior with too many or too few tasks  

---

### **5. Observability Metrics**
- Logged tool calls  
- Retrieved chunk previews  
- Intent classifier outputs  
- Task distribution logs  

These logs help verify correctness and make debugging easier.
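
Tool-call logging of this kind can be sketched with a decorator. `logged_tool`, `TOOL_LOG`, and the stand-in tool `count_words` are hypothetical names introduced for illustration; the notebook itself only prints a debug line for the intent.

```python
import functools

TOOL_LOG = []  # hypothetical in-memory log of tool calls

def logged_tool(fn):
    """Wrap a tool so each call records its name, arguments, and a result preview."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        TOOL_LOG.append({
            "tool": fn.__name__,
            "args": args,
            "kwargs": kwargs,
            "preview": repr(result)[:80],  # truncate long outputs such as chunks
        })
        return result
    return wrapper

@logged_tool
def count_words(text):  # stand-in tool for demonstration
    return len(text.split())

count_words("retrieval augmented generation")
print(TOOL_LOG[-1]["tool"], TOOL_LOG[-1]["preview"])  # count_words 3
```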

---

### **6. Success Criteria**
The agent is considered successful if:

- Answers are grounded in PDF content (no hallucination)  
- Study plan is reasonable, personalized, and executable  
- Tools work consistently and reliably  
- Multi-agent routing behaves correctly  
- Interactions are smooth in a multi-turn setting  

---

## 2. Install & Import Libraries


In [2]:
!pip install -q google-genai sentence-transformers faiss-cpu pypdf



In [3]:
import os
from google.genai import Client, types

from sentence_transformers import SentenceTransformer
import faiss
from pypdf import PdfReader

import textwrap
import numpy as np


## 3. Configure Gemini

In [4]:
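# Read the key from the environment (or a Colab/Kaggle secret); the fallback string is only a placeholder
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "GEMINI_API_KEY")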
client = Client(api_key=GEMINI_API_KEY)
MODEL_NAME = "gemini-2.5-flash"


In [5]:
# Minimal helper; the next cell redefines ask_gemini with system-instruction support.
def ask_gemini(prompt):
    res = client.models.generate_content(
        model=MODEL_NAME,   # reuse the model configured above
        contents=prompt
    )
    return res.text


In [6]:
from typing import Optional

def ask_gemini(prompt: str, system_instruction: Optional[str] = None) -> str:
    """Send a prompt to Gemini; the system instruction, if any, is prepended as plain text."""
    full_prompt = ""
    if system_instruction:
        full_prompt += system_instruction + "\n\n"
    full_prompt += prompt

    contents = [
        types.Content(role="user", parts=[types.Part(text=full_prompt)])
    ]

    res = client.models.generate_content(
        model=MODEL_NAME,
        contents=contents,
        config=types.GenerateContentConfig(
            temperature=0.4,  # low temperature for more consistent answers
        )
    )
    # res.text joins the text parts of the first candidate; fall back to "" if it is empty
    return res.text or ""

## 4. PDF Upload & Text Extraction

In [7]:
from google.colab import files

uploaded = files.upload()   # select your PDF
pdf_path = list(uploaded.keys())[0]
pdf_path


Saving archiflow_ai_project_notes.pdf to archiflow_ai_project_notes.pdf


'archiflow_ai_project_notes.pdf'

In [8]:
def extract_text_from_pdf(path: str) -> str:
    reader = PdfReader(path)
    pages = []
    for page in reader.pages:
        text = page.extract_text()
        if text:
            pages.append(text)
    return "\n\n".join(pages)

raw_text = extract_text_from_pdf(pdf_path)
print(raw_text[:2000])  # preview


ArchiFlow AI – GenAI System Design Generator
1. Project Idea Overview
- Input: Natural language requirements or raw content (e.g., app idea, notes, assignment text).
- Output: Structured system design plus diagrams (architecture, flowcharts, ERD) generated via LLM.
2. MVP Scope (2–3 Days)
- Frontend: Next.js + React + Tailwind + basic UI.
- Backend: FastAPI with a single LLM-powered endpoint.
- Core Features:
  * Text → System Architecture (sections: overview, components, APIs, DB, deployment).
  * Text → Diagrams in Mermaid format (architecture & flow diagrams).
  * Editable Mermaid code block in UI so users can tweak diagrams.
  * Regenerate-per-section buttons (e.g., regenerate only DB or APIs).
  * Simple, clean deployment (Vercel for frontend, Render/Railway for backend).
3. Future Advanced Features (For Later)
- Full custom diagram editor (drag & drop, Figma-style) using React Flow / tldraw.
- Real-time collaboration with WebSockets + Y.js/Liveblocks/Supabase Realtime.
- User acc

## 5. Chunking, Embeddings & Vector Store (RAG Setup)

### 5.1 Chunk the text

In [9]:
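def chunk_text(text, max_tokens=400, overlap=50):
    # Simple word-based chunking (max_tokens and overlap are counted in words), good enough for this project
    if overlap >= max_tokens:
        # otherwise `start` would never advance and the loop would not terminate
        raise ValueError("overlap must be smaller than max_tokens")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + max_tokens
        chunk = " ".join(words[start:end])
        chunks.append(chunk)
        start = end - overlap  # step forward, keeping `overlap` words of shared context
    return chunks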

chunks = chunk_text(raw_text, max_tokens=300, overlap=50)
len(chunks)


3
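
To make the overlap concrete, here is a toy sketch of the same sliding-window idea on a 10-word input. The helper `window_starts` is hypothetical; it only reports where each chunk begins and ends, and it assumes `overlap` is smaller than the window size.

```python
def window_starts(n_words, size, overlap):
    """Return (start, end) word offsets for a sliding window with overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    spans = []
    start = 0
    while start < n_words:
        end = min(start + size, n_words)
        spans.append((start, end))
        start += size - overlap  # each step advances by size - overlap words
    return spans

print(window_starts(n_words=10, size=4, overlap=1))
# [(0, 4), (3, 7), (6, 10), (9, 10)]
```

Each span starts one word before the previous one ends. The last span `(9, 10)` is a short tail already covered by `(6, 10)`; dropping such tails is a possible refinement.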

### 5.2 Build embeddings + FAISS index

In [10]:
embed_model = SentenceTransformer("all-MiniLM-L6-v2")

embeddings = embed_model.encode(chunks, show_progress_bar=True)
embeddings = np.array(embeddings).astype("float32")

dim = embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(embeddings)

print("Index size:", index.ntotal)


Index size: 3


### 5.3 Retrieval helper

In [11]:
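def retrieve_relevant_chunks(query: str, k: int = 5):
    # Cap k at the index size: FAISS pads missing results with -1,
    # which would otherwise index chunks[-1] and repeat the last chunk.
    k = min(k, index.ntotal)
    query_emb = embed_model.encode([query]).astype("float32")
    distances, indices = index.search(query_emb, k)
    selected_chunks = [chunks[i] for i in indices[0] if i >= 0]
    return selected_chunks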
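
`IndexFlatL2` performs exact (brute-force) search by squared Euclidean distance. A minimal pure-Python sketch of that ranking follows, using made-up 2-D vectors in place of real embeddings. `top_k_l2` is a hypothetical name, and the `min(k, len(vectors))` clamp is there because asking for more neighbors than there are stored vectors cannot yield more real results.

```python
def top_k_l2(query, vectors, k):
    """Return indices of the k vectors closest to `query` by squared L2 distance."""
    k = min(k, len(vectors))  # never ask for more neighbors than exist
    dists = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), i)
        for i, vec in enumerate(vectors)
    ]
    return [i for _, i in sorted(dists)[:k]]

toy_vectors = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1]]  # illustrative only
print(top_k_l2([0.0, 0.1], toy_vectors, k=5))  # [0, 2, 1]
```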


## 6. RAG Question-Answering Agent

### 6.1 System prompt for RAG agent

In [12]:
RAG_SYSTEM_PROMPT = """
You are StudyBuddy RAG Agent.
You must ONLY answer using the provided context from the student's PDF notes/syllabus.
If the answer is not in the context, say you don't know.

Always:
- Explain in simple terms.
- Refer to context logically, but don't hallucinate details.
"""


### 6.2 Function to answer questions using RAG

In [13]:
def answer_with_rag(user_query: str) -> str:
    context_chunks = retrieve_relevant_chunks(user_query, k=5)
    context_text = "\n\n---\n\n".join(context_chunks)

    prompt = f"""
Context from student's PDF:
{context_text}

---
User question: {user_query}

Using ONLY the context above, answer the question clearly.
If the answer is missing, say: "I couldn't find this in your notes."
"""

    return ask_gemini(prompt, system_instruction=RAG_SYSTEM_PROMPT)


In [14]:
answer_with_rag("What is MVP Scope")


'The MVP Scope for the ArchiFlow AI project includes:\n\n*   **Frontend:** Built with Next.js, React, Tailwind, and a basic user interface.\n*   **Backend:** Uses FastAPI with a single LLM-powered endpoint.\n*   **Core Features:**\n    *   Converts text into System Architecture, including sections like overview, components, APIs, database, and deployment.\n    *   Generates diagrams in Mermaid format, specifically architecture and flow diagrams.\n    *   Provides an editable Mermaid code block in the UI for users to adjust diagrams.\n    *   Allows regeneration of specific sections (e.g., only the database or APIs).\n*   **Deployment:** Simple and clean deployment using Vercel for the frontend and Render/Railway for the backend.'

## 7. Task Planner Tools & Agent

### 7.1 Simple in-memory task store

In [15]:
from datetime import date, timedelta

TASKS = []

def add_task(title: str, est_hours: float = 1.0, due_day_offset: int = None):
    """Add a study task. due_day_offset = days from today (optional)."""
    due_date = date.today() + timedelta(days=due_day_offset) if due_day_offset is not None else None
    TASKS.append({
        "title": title,
        "est_hours": est_hours,
        "due_date": due_date
    })

def list_tasks():
    return TASKS


### 7.2 Helper to distribute tasks into a study plan

In [16]:
def generate_study_plan(num_days: int, hours_per_day: float = 2.0):
    """
    Very simple planner: assign tasks in order across days, respecting the daily hour budget.
    A task may be split across days; hours beyond num_days * hours_per_day are left unscheduled.
    """
    plan = { (date.today() + timedelta(days=i)).isoformat(): [] for i in range(num_days) }

    remaining_hours = {
        day: hours_per_day for day in plan.keys()
    }

    for task in TASKS:
        needed = task["est_hours"]
        for day in plan.keys():
            if needed <= 0:
                break
            if remaining_hours[day] <= 0:
                continue
            assign_hours = min(remaining_hours[day], needed)
            plan[day].append({"title": task["title"], "hours": float(assign_hours)})
            remaining_hours[day] -= assign_hours
            needed -= assign_hours

    return plan
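
A worked example of the same greedy fill, written as a self-contained variant that uses day indices instead of dates and also reports hours that did not fit. `distribute` and its return shape are assumptions for illustration, not the notebook's function.

```python
def distribute(tasks, num_days, hours_per_day):
    """Greedy fill: place (title, hours) tasks in order, splitting across days."""
    plan = [[] for _ in range(num_days)]
    remaining = [hours_per_day] * num_days
    unscheduled = {}
    for title, needed in tasks:
        for day in range(num_days):
            if needed <= 0:
                break
            if remaining[day] <= 0:
                continue
            hours = min(remaining[day], needed)
            plan[day].append((title, hours))
            remaining[day] -= hours
            needed -= hours
        if needed > 0:
            unscheduled[title] = needed  # did not fit in the available days
    return plan, unscheduled

plan, leftover = distribute([("Read ch. 1", 3.0), ("Solve problems", 2.0)],
                            num_days=2, hours_per_day=2.0)
print(plan)      # [[('Read ch. 1', 2.0)], [('Read ch. 1', 1.0), ('Solve problems', 1.0)]]
print(leftover)  # {'Solve problems': 1.0}
```

With 5 hours of tasks and a 4-hour total budget, one hour cannot be placed. Surfacing that remainder lets the caller warn the student or extend the plan.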


### 7.3 Let Gemini turn syllabus topics into tasks

In [17]:
TASK_SYSTEM_PROMPT = """
You are StudyBuddy Planner Agent.
Your job is to turn raw syllabus/notes text into a list of atomic study tasks.

Rules:
- Each task should be small (1–2 hours).
- Include the topic name and type (e.g., "Read", "Revise", "Solve problems").
- Return output as bullet points.
"""

def extract_tasks_from_pdf(sample_chunk_count: int = 5):
    # take a subset of chunks to avoid flooding
    sample_text = "\n\n".join(chunks[:sample_chunk_count])
    prompt = f"""
Here is a sample of the student's syllabus/notes:

{sample_text}

From this, list concrete study tasks the student should complete.
"""
    response = ask_gemini(prompt, system_instruction=TASK_SYSTEM_PROMPT)
    return response
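
Since `TASK_SYSTEM_PROMPT` asks for bullet points, the response text can be turned into task titles with a small parser. This is a sketch that assumes each bullet starts with `-`, `*`, or `•`; real model output may deviate from that, and `parse_bullets` is a hypothetical helper not defined in the notebook.

```python
import re

def parse_bullets(response):
    """Extract bullet-point lines from a model response as a list of titles."""
    titles = []
    for line in response.splitlines():
        # A bullet marker, at least one space, then the title up to its last non-space character.
        match = re.match(r"^\s*[-*•]\s+(.*\S)", line)
        if match:
            titles.append(match.group(1))
    return titles

sample = """Here are your tasks:
- Read: MVP scope
* Revise: deployment options
"""
print(parse_bullets(sample))  # ['Read: MVP scope', 'Revise: deployment options']
```

Each parsed title could then be passed to `add_task(title)` to populate `TASKS` before planning.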


### 7.4 Gemini-generated natural language study plan

In [18]:
def pretty_study_plan(plan_dict, preferences_text: str = ""):
    # Convert plan dict to readable text and let Gemini polish it
    raw_plan_lines = []
    for day, tasks in plan_dict.items():
        raw_plan_lines.append(f"Day {day}:")
        if not tasks:
            raw_plan_lines.append("  - Free / buffer")
        else:
            for t in tasks:
                raw_plan_lines.append(f"  - {t['title']} (~{t['hours']}h)")
    raw_plan_text = "\n".join(raw_plan_lines)

    prompt = f"""
User preferences: {preferences_text}

Raw plan:
{raw_plan_text}

Rewrite this as a friendly, structured {len(plan_dict)}-day study plan with headings and bullet points.
"""
    return ask_gemini(prompt, system_instruction="You are a helpful study planning assistant.")


## 8. Controller Agent (Combining RAG + Planner)

### 8.1 Intent classifier

In [19]:
INTENT_SYSTEM_PROMPT = """
You are an intent classifier for the StudyBuddy Agent.
You must classify the user's message into one of:
- PDF_QA  -> asking about concepts from the notes/PDF
- PLAN    -> asking for schedule, plan, tasks, or time management
- BOTH    -> if the user wants both understanding and planning
Return exactly one word: PDF_QA, PLAN, or BOTH.
"""

def classify_intent(message: str) -> str:
    res = ask_gemini(message, system_instruction=INTENT_SYSTEM_PROMPT)
    res_clean = res.strip().upper()
    if "BOTH" in res_clean:
        return "BOTH"
    if "PLAN" in res_clean:
        return "PLAN"
    return "PDF_QA"
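
The substring checks above would also match a reply such as "EXPLANATION", which contains "PLAN". A stricter variant, as a sketch: split the reply into tokens and accept only an exact label, falling back to `PDF_QA` as the original does. `parse_intent` and `VALID_INTENTS` are hypothetical names.

```python
import re

VALID_INTENTS = ("BOTH", "PLAN", "PDF_QA")

def parse_intent(raw):
    """Map a raw classifier reply to one label via whole-token matching."""
    tokens = re.findall(r"[A-Z_]+", (raw or "").upper())
    for label in VALID_INTENTS:  # same precedence as the substring version
        if label in tokens:
            return label
    return "PDF_QA"  # default when no exact label is present

print(parse_intent("plan\n"))        # PLAN
print(parse_intent("Explanation"))   # PDF_QA
print(parse_intent("Intent: BOTH"))  # BOTH
```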


### 8.2 Main agent function

In [20]:
def studybuddy_agent(message: str, num_days: int = 7, hours_per_day: float = 2.0, preferences: str = ""):
    intent = classify_intent(message)
    print(f"[DEBUG] Intent: {intent}")

    responses = []

    if intent in ("PDF_QA", "BOTH"):
        answer = answer_with_rag(message)
        responses.append("📘 **Answer from your notes:**\n" + answer)

    if intent in ("PLAN", "BOTH"):
        # for demo, we assume tasks already exist in TASKS (you can also auto-fill from extract_tasks_from_pdf)
        plan = generate_study_plan(num_days=num_days, hours_per_day=hours_per_day)
        pretty = pretty_study_plan(plan, preferences_text=preferences)
        responses.append("📅 **Personalized Study Plan:**\n" + pretty)

    return "\n\n---\n\n".join(responses)


## 9. Demo Section – Example Interactions

In [21]:
# 1. Ask a pure PDF question
print(studybuddy_agent("Explain MVP."))

# 2. Ask for planning
print(studybuddy_agent(
    "I have an exam in 10 days, please create a plan to cover all topics.",
    num_days=10,
    hours_per_day=3.0,
    preferences="I prefer studying in the evening and want revisions before the exam."
))

# 3. Combined request
print(studybuddy_agent(
    "Explain overfitting from my notes and also give me a plan to revise it over the next 3 days.",
    num_days=3,
    hours_per_day=1.5
))


[DEBUG] Intent: PDF_QA
📘 **Answer from your notes:**
The MVP, or Minimum Viable Product, is the initial version of the 'ArchiFlow AI' project that focuses on core features to get it stable and deployed first.

According to the notes, the MVP includes:
*   **Input:** Natural language requirements or raw content.
*   **Output:** Structured system design (overview, components, APIs, DB, deployment) and diagrams in Mermaid format (architecture & flow diagrams).
*   **User Interface:** A basic Next.js + React + Tailwind UI with an editable Mermaid code block for users to tweak diagrams.
*   **Functionality:** Buttons to regenerate specific sections (e.g., only DB or APIs).
*   **Deployment:** Simple, clean deployment using Vercel for the frontend and Render/Railway for the backend.

The key design principle related to MVP is to "Build MVP first: text → structured design → Mermaid diagrams → simple editing" and only add advanced features after the MVP is stable and deployed.
[DEBUG] Intent: 