<a href="https://www.kaggle.com/code/thanveerkhan/ai-tutor-bot?scriptVersionId=235085722" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
import os
import google.generativeai as genai
from langchain.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import PyPDFLoader
from kaggle_secrets import UserSecretsClient

In [2]:
# Fetch the key once from Kaggle Secrets and reuse it for both clients.
api_key = UserSecretsClient().get_secret("GOOGLE_API_KEY")
genai.configure(api_key=api_key)
os.environ["GOOGLE_API_KEY"] = api_key  # read by GoogleGenerativeAIEmbeddings

In [3]:
# 📄 Step 4: Load and embed a PDF
def load_and_embed_docs(pdf_path, save_path="vector_store"):
    loader = PyPDFLoader(pdf_path)
    pages = loader.load()

    text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    docs = text_splitter.split_documents(pages)

    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    vectordb = FAISS.from_documents(docs, embedding=embeddings)

    vectordb.save_local(save_path)
    return vectordb, len(docs)
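
To make the `chunk_size=500, chunk_overlap=50` settings concrete, here is a minimal sketch of overlapping character windows. It is a simplification and not what `CharacterTextSplitter` does internally: the real splitter first splits on a separator (`"\n\n"` by default) and then merges pieces, so its chunk boundaries will differ. The helper name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; it ignores separators.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "x" * 1200
print([len(piece) for piece in chunk_text(sample)])  # [500, 500, 300]
```

Each window starts 450 characters after the previous one, so the last 50 characters of one chunk reappear at the start of the next. That repetition is what keeps a sentence that straddles a boundary retrievable from at least one chunk.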

In [4]:
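# 📁 Step 5: Load vector DB from disk
def load_vector_store(path="vector_store"):
    # Use the same embedding model that built the index, or query vectors will not match.
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    # FAISS metadata is stored with pickle, so only load an index you created yourself.
    return FAISS.load_local(path, embeddings, allow_dangerous_deserialization=True)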
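
Saving the index only pays off if later runs reuse it instead of re-embedding the PDF. One possible pattern, sketched below, is to load when the folder exists and build otherwise. The helper `get_or_build_index` and the two stand-in functions are hypothetical; they are stubs so the example runs without an API key. In the notebook you would pass `load_vector_store` and a small wrapper around `load_and_embed_docs` instead.

```python
import os
import tempfile


def get_or_build_index(path, build_fn, load_fn):
    """Load the index at `path` if it exists and is non-empty; otherwise build it."""
    if os.path.isdir(path) and os.listdir(path):
        return load_fn(path)
    return build_fn(path)


# Stand-ins (assumptions for this sketch) that mimic building and loading an index.
def fake_build(path):
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "index.faiss"), "w") as f:
        f.write("placeholder")
    return "built"


def fake_load(path):
    return "loaded"


with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "vector_store")
    print(get_or_build_index(store, fake_build, fake_load))  # first call builds
    print(get_or_build_index(store, fake_build, fake_load))  # second call loads
```

With the real functions, the call might look like `get_or_build_index("vector_store", lambda p: load_and_embed_docs("my_downloaded_pdf.pdf", save_path=p)[0], load_vector_store)`.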

In [5]:
# 💬 Step 6: Ask a question with Gemini + RAG
def query_gemini_with_rag(vectordb, user_query, explain_eli5=False):
    relevant_docs = vectordb.similarity_search(user_query, k=3)
    context = "\n".join([doc.page_content for doc in relevant_docs])
    style = "Explain like I’m 5." if explain_eli5 else "Answer clearly like a tutor."

    prompt = f"""You are a helpful tutor. {style}
Use the following context to answer:\n{context}\n\nQuestion: {user_query}"""

    model = genai.GenerativeModel("models/gemini-1.5-flash")
    response = model.generate_content(prompt)
    return response.text
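
The prompt assembly in `query_gemini_with_rag` can be pulled into a pure function, which makes it easy to inspect what the model actually receives and to handle the case where retrieval returns nothing. This is a sketch under assumptions, not the notebook's method: `build_rag_prompt`, the numbered `[1]`, `[2]` chunk labels, and the fallback text are all hypothetical additions.

```python
def build_rag_prompt(chunks, user_query, explain_eli5=False):
    """Assemble the tutor prompt from retrieved text chunks (plain strings)."""
    style = "Explain like I'm 5." if explain_eli5 else "Answer clearly like a tutor."
    if chunks:
        # Number each chunk so the boundaries between passages stay visible.
        context = "\n\n".join(
            f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
        )
    else:
        # Assumption: tell the model explicitly that retrieval found nothing.
        context = "(no relevant context was retrieved)"
    return (
        f"You are a helpful tutor. {style}\n"
        f"Use the following context to answer:\n{context}\n\n"
        f"Question: {user_query}"
    )


print(build_rag_prompt(["Water evaporates.", "Vapor condenses."], "What is condensation?"))
```

Inside `query_gemini_with_rag`, the chunks would come from `[doc.page_content for doc in relevant_docs]`.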

In [6]:
# 🧠 Step 7: Generate multiple choice quiz
def generate_quiz_questions(topic):
    prompt = f"""
You are an AI tutor. Create a multiple choice quiz with exactly 5 questions on the topic: "{topic}".
Each question must have four options labeled A), B), C), D).
At the end of each question, provide the correct answer on a separate line like: "Answer: B".

Format strictly like:
1. Question text?
A) Option A
B) Option B
C) Option C
D) Option D
Answer: B
Repeat for 5 questions only. Do not add explanations or extra text.
    """
    model = genai.GenerativeModel("models/gemini-1.5-flash")
    response = model.generate_content(prompt)
    return response.text


In [7]:
# ✅ Step 8: Evaluate quiz answers
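def evaluate_quiz(questions, user_answers):
    # `questions` is a list of dicts with an "answer" key, e.g. {"answer": "B"}.
    # `user_answers` is a list of strings such as "B" or "b) Option B".
    correct = 0
    total = len(questions)
    feedback = []

    for i, (q, user_ans) in enumerate(zip(questions, user_answers)):
        correct_ans = q["answer"].strip().upper()
        # Compare case-insensitively so "b" counts as "B".
        is_correct = user_ans.strip().upper().startswith(correct_ans)

        if is_correct:
            correct += 1
            feedback.append(f"✅ Q{i+1}: Correct! ({user_ans})")
        else:
            feedback.append(f"❌ Q{i+1}: Incorrect. Your answer: {user_ans} | Correct: {correct_ans}")

    # Guard against ZeroDivisionError when the question list is empty.
    score_percent = round((correct / total) * 100, 2) if total else 0.0
    summary = f"\nYour Score: {correct}/{total} ({score_percent}%)\n"
    return "\n".join(feedback) + summary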
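
`evaluate_quiz` expects each question as a dict with an `"answer"` key, while `generate_quiz_questions` returns one raw string, so something has to convert between the two. The notebook does not show that step; the parser below is a minimal sketch of one way to do it, assuming the model follows the requested format (`1. ...`, `A) ...`, `Answer: B`). The name `parse_quiz` and the `"question"` and `"options"` keys are hypothetical.

```python
import re


def parse_quiz(raw_text):
    """Turn the quiz text into a list of {"question", "options", "answer"} dicts."""
    questions = []
    current = None
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        q_match = re.match(r"^(\d+)\.\s*(.+)$", line)
        opt_match = re.match(r"^([A-D])\)\s*(.+)$", line)
        ans_match = re.match(r"^Answer:\s*([A-D])\b", line, flags=re.IGNORECASE)
        if q_match:
            # A numbered line starts a new question.
            current = {"question": q_match.group(2), "options": {}, "answer": None}
            questions.append(current)
        elif current is not None and opt_match:
            current["options"][opt_match.group(1)] = opt_match.group(2)
        elif current is not None and ans_match:
            current["answer"] = ans_match.group(1).upper()
    # Drop anything incomplete rather than guessing at missing pieces.
    return [q for q in questions if q["answer"] and len(q["options"]) == 4]


sample = """1. What is the primary purpose of photosynthesis?
A) To release energy from glucose
B) To convert light energy into chemical energy
C) To break down water molecules
D) To produce oxygen for respiration
Answer: B
"""
parsed = parse_quiz(sample)
print(parsed[0]["answer"])  # B
```

The parsed list can then be passed as `questions` to `evaluate_quiz`, alongside the learner's answers. Because model output is not guaranteed to follow the format, it is worth checking `len(parsed)` before scoring.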


In [8]:
# loading data for RAG
import gdown

file_id = "1B-B6xKd0Zw2ShPnDU7Pu89muSbVo0Jvm"
url = f"https://drive.google.com/uc?id={file_id}"
output_path = "my_downloaded_pdf.pdf"

gdown.download(url, output_path, quiet=False)


Downloading...
From: https://drive.google.com/uc?id=1B-B6xKd0Zw2ShPnDU7Pu89muSbVo0Jvm
To: /kaggle/working/my_downloaded_pdf.pdf
100%|██████████| 1.22k/1.22k [00:00<00:00, 2.10MB/s]


'my_downloaded_pdf.pdf'

In [9]:
vectordb, chunks = load_and_embed_docs("my_downloaded_pdf.pdf")
print(f"Embedded {chunks} chunks.")

Embedded 1 chunks.


In [10]:
# Example: Ask a question using RAG
response = query_gemini_with_rag(vectordb, "What are the main ideas in the document?")
print(response)

Okay, let's break down the main ideas in this document about the water cycle.  There are two key takeaways:

1. **The Water Cycle's Process:** The document explains the *four main stages* of the water cycle: evaporation (water turning into vapor), condensation (vapor forming clouds), precipitation (water falling from clouds), and collection (water gathering in bodies of water and on land).  This describes *how* water moves.

2. **The Water Cycle's Importance:**  The text highlights that the water cycle is a *continuous process* crucial for distributing *fresh water* around the Earth. This explains *why* the water cycle is important.

So, in short, the main ideas are the *steps* of the water cycle and its *role in distributing fresh water*.  Does that make sense?  Do you have any other questions about the water cycle or this explanation?



In [11]:
# Example: Generate a quiz
quiz = generate_quiz_questions("Photosynthesis")
print(quiz)

1. What is the primary purpose of photosynthesis?
A) To release energy from glucose
B) To convert light energy into chemical energy
C) To break down water molecules
D) To produce oxygen for respiration
Answer: B

2. Which of the following is NOT a reactant in photosynthesis?
A) Carbon dioxide
B) Water
C) Glucose
D) Light energy
Answer: C

3.  Where does photosynthesis primarily occur in plants?
A) Roots
B) Stems
C) Leaves
D) Flowers
Answer: C

4. Chlorophyll is crucial for photosynthesis because it:
A) Absorbs carbon dioxide
B) Absorbs light energy
C) Releases oxygen
D) Transports water
Answer: B

5. What is the name of the carbohydrate produced during photosynthesis?
A) Protein
B) Lipid
C) Glucose
D) Nucleic acid
Answer: C

