<a href="https://colab.research.google.com/github/redanzo/GDSC---Colab-Gemini/blob/main/GDSC_03_25_2025_SOLUTION_Build_With_AI_Mastering_Gemini_API_for_Studies.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Step 1: Setting Up the Environment
#### Install the libraries needed for PDF parsing, the Gemini API, and image handling.

In [None]:
!pip install google-generativeai pymupdf Pillow

Collecting pymupdf
  Downloading pymupdf-1.25.4-cp39-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (3.4 kB)
Downloading pymupdf-1.25.4-cp39-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (20.0 MB)
Installing collected packages: pymupdf
Successfully installed pymupdf-1.25.4


### Step 2: Importing Required Libraries

In [None]:
import google.generativeai as genai
import fitz  # PyMuPDF for PDF parsing
from PIL import Image
import io
import json
import time
import random
from google.colab import files, userdata

### Step 3: Configuring Gemini
##### Set up the Gemini API key as a Colab secret: click the key icon in the sidebar and add your API key under the name `GOOGLE_API_KEY`.

In [None]:
from google.colab import userdata

# Ensure you have set up your API key in Colab's 'Secrets' (userdata).
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
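
The cell above only works inside Colab. As a minimal sketch, the helper below reads the secret from Colab when it is available and otherwise falls back to an environment variable of the same name. The function name `load_api_key` and the environment-variable fallback are assumptions for illustration, not part of the original notebook.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the API key from Colab Secrets, or from the environment as a fallback."""
    try:
        # Only importable inside Colab; raises if the secret is missing or access is denied
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Assumption: outside Colab the key is exported as an environment variable
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found under the name {name!r}.")
    return key
```

The result could then be passed to `genai.configure(api_key=load_api_key())`, so a missing key fails with a clear message instead of an authentication error on the first request.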

In [None]:
text_model = genai.GenerativeModel('gemini-2.0-flash')
vision_model = genai.GenerativeModel('gemini-2.0-flash')

### Step 4: "Main Method"

In [None]:
def study_assistant():
    print("Welcome to the AI-Powered Study Assistant!")
    print("You can upload a PDF file (slides, notes, or presentations) to generate flashcards and quizzes.")

    while True:
        print("\nMenu:")
        print("1. Upload and process a PDF file")
        print("2. Exit")

        choice = input("Choose an option: ")

        if choice == '1':
            upload_and_process_pdf()
        elif choice == '2':
            print("Goodbye!")
            break
        else:
            print("Invalid choice. Please try again.")

### Step 5: Uploading and Processing PDF Files

In [None]:
def upload_and_process_pdf():
    from google.colab import files

    uploaded = files.upload()

    pdf_file_path = None
    for file_name, file_content in uploaded.items():
        if file_name.lower().endswith('.pdf'):  # accept .PDF as well as .pdf
            with open(file_name, "wb") as f:
                f.write(file_content)
            pdf_file_path = file_name
            print(f"✅ Uploaded: {file_name}")
            break

    if not pdf_file_path:
        print("❌ No PDF uploaded.")
        return

    # Step 1: Extract text and images
    text, images, page_data = extract_from_pdf(pdf_file_path)  # page_data holds per-page text and images (not used below)
    print(f"📄 Extracted {len(text)} characters of text and {len(images)} images.\n")

    # Step 2: Analyze images (if any)
    image_insights = []
    if images:
        print("🖼️ Analyzing images with Gemini...")
        image_insights = analyze_images(images)

    # Step 3: Main interaction menu
    while True:
        print("\n🔧 What would you like to do?")
        print("1. Generate Flashcards")
        print("2. Generate Sample Questions")
        print("3. Show Image Analysis")
        print("4. Exit")

        choice = input("Enter your choice (1-4): ").strip()

        if choice == "1":
            flashcards = create_flashcards(text)
            print("\n🧠 Flashcards:\n")
            if isinstance(flashcards, list):
                for card in flashcards:
                    print(f"Front: {card['front']}\nBack: {card['back']}\n")
            else:
                print(flashcards)

        elif choice == "2":
            questions = generate_quiz_questions(text)
            print("\n📝 Sample Questions:\n")
            if isinstance(questions, list):
                for q in questions:
                    print(f"Q: {q['question']}\nA: {q['answer']}\n")
            else:
                print(questions)

        elif choice == "3":
            if not image_insights:
                print("No images to analyze.")
            else:
                print("\n📸 Image Analysis Results:\n")
                for i, result in enumerate(image_insights, 1):
                    print(f"Image {i}: {result}\n")

        elif choice == "4":
            print("👋 Exiting. Thanks!")
            break

        else:
            print("⚠️ Invalid choice. Please try again.")

### Step 6: Creating a Function to Extract Text and Images from PDF

In [None]:
def extract_from_pdf(pdf_path):
    text = ""         # Store all extracted text
    images = []       # Store all images found in the PDF
    page_data = []    # Store detailed data for each page (text + images)

    print("🔍 Reading PDF and extracting content...")

    # Open the PDF using PyMuPDF
    pdf_document = fitz.open(pdf_path)

    # Go through each page of the PDF
    for page_num in range(len(pdf_document)):
        page = pdf_document.load_page(page_num)

        # Extract all text from the current page
        page_text = page.get_text()
        text += page_text

        page_images = []  # Images from this specific page
        image_list = page.get_images(full=True)  # Get list of images on the page
        print(f"📄 Page {page_num + 1}: Found {len(image_list)} images.")

        # Loop through all images on the page
        for img_index, img in enumerate(image_list):
            xref = img[0]
            base_image = pdf_document.extract_image(xref)
            image_bytes = base_image["image"]

            try:
                # Convert the image bytes to a PIL Image object.
                # Pillow (the maintained fork of PIL, the Python Imaging Library) lets Python open, resize, and edit images.
                image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
                images.append(image)
                page_images.append(image)
            except Exception as e:
                print(f"⚠️ Failed to process image {img_index} on page {page_num + 1}: {e}")

        # Save page text and images in structured format
        page_data.append({
            "page_number": page_num + 1,
            "text": page_text.strip(),
            "images": page_images
        })

    pdf_document.close()  # Release the file handle once every page has been read

    print(f"✅ Finished extracting {len(text)} characters of text and {len(images)} images.")
    return text.strip(), images, page_data
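
`extract_from_pdf` returns `page_data`, but the rest of the notebook never reads it. As a small sketch of how that per-page structure could be used, the hypothetical helper `summarize_pages` below reports text length and image count for each page. The sample data is hand-written in the same shape and is not output from a real PDF.

```python
def summarize_pages(page_data):
    """Return one (page_number, text_length, image_count) tuple per page."""
    return [
        (page["page_number"], len(page["text"]), len(page["images"]))
        for page in page_data
    ]


# Hand-written sample in the same shape that extract_from_pdf builds
sample = [
    {"page_number": 1, "text": "Intro to memory", "images": []},
    {"page_number": 2, "text": "", "images": ["<PIL image placeholder>"]},
]
for number, chars, image_count in summarize_pages(sample):
    print(f"Page {number}: {chars} characters, {image_count} images")
```

A summary like this makes it easy to spot pages with no extractable text, which usually means the page is a scanned image.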


### Step 7: Analyzing Images

In [None]:
def analyze_images(images):
    analyzed_results = []

    print(f"🔎 Sending {len(images)} images to Gemini for analysis...")

    # Loop through each image
    for idx, image in enumerate(images):
        # Prompt to describe the image
        prompt = (
            "You're an AI tutor. Carefully analyze this image from study material. "
            "Describe the image, explain its relevance to the subject matter, and identify any key concepts or diagrams. "
            "If it's not related to education, mention that too."
        )

        print(f"\n📸 Analyzing image {idx + 1}...")
        result = generate_response(prompt, images=[image])  # Send to Gemini
        analyzed_results.append(result)  # Store result

    print("✅ Image analysis complete.")
    return analyzed_results


### Step 8: Creating a Function to Generate AI Responses

In [None]:
def generate_response(prompt, images=None):
    try:
        if images:
            print("🧠 Generating multimodal response using Gemini...")
            response = vision_model.generate_content([prompt] + images)
        else:
            print("🧠 Generating text-only response using Gemini...")
            response = text_model.generate_content(prompt)
        return response.text
    except Exception as e:
        print("❌ Error while generating response:", e)
        return "Error: Could not generate response."
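
`generate_response` gives up after the first failure, and the notebook imports `time` and `random` without using them. One possible use, sketched here under the assumption that failures are transient (rate limits, network errors), is a retry wrapper with exponential backoff. The name `call_with_retries` and its parameters are hypothetical.

```python
import random
import time


def call_with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on an exception, wait and retry up to `attempts` times in total."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the last error
            # Exponential backoff (1x, 2x, 4x, ...) plus up to 0.5 s of random jitter
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

Inside `generate_response`, the model call could be wrapped as `call_with_retries(lambda: text_model.generate_content(prompt))`. The wrapper has to go around the model call itself, because `generate_response` catches exceptions and returns an error string.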

### Step 9: Converting Extracted Text into Flashcards

In [None]:
def create_flashcards(text):
    """
    Ask user for flashcard style and number, generate flashcards using Gemini API.
    Each flashcard has 'front' and 'back'.
    """

    # Ask user for style
    print("\n🎴 Choose flashcard style:")
    print("1. Vocabulary Only")
    print("2. Mixed (Q&A, fill-in-the-blank, etc.)")
    choice = input("Enter 1 or 2: ").strip()

    # Ask user for how many flashcards
    num = input("How many flashcards would you like to generate? ").strip()
    num = int(num) if num.isdigit() else 10  # Default to 10 if invalid input

    # Create prompt based on selected style
    if choice == "1":
        style = "vocab"
        prompt = (
            f"You're an AI tutor. From the following study material, extract {num} important vocabulary words or technical terms. "
            "For each, provide a short and simple definition. Format your response as a list of flashcards, each with:\n"
            "- front: The term or word\n"
            "- back: A simple definition\n\n"
            f"Study Material:\n{text}"
        )
    else:
        style = "mixed"
        prompt = (
            f"You're an AI tutor. Based on the following study material, create {num} flashcards in a variety of styles, such as:\n"
            "- Question & Answer\n"
            "- Fill in the Blank\n"
            "- True or False\n"
            "- Short Concept Checks\n\n"
            "Each flashcard should have:\n"
            "- front: A question or prompt\n"
            "- back: The answer or explanation\n\n"
            "Please return your answer as a list of flashcards like this:\n"
            "[\n"
            "  {\"front\": \"What is the capital of France?\", \"back\": \"Paris\"},\n"
            "  {\"front\": \"True or False: Water is an element.\", \"back\": \"False\"},\n"
            f"  ... (total of {num} flashcards)\n"
            "]\n\n"
            f"Study Material:\n{text}"
        )

    # Call Gemini to generate flashcards
    response = generate_response(prompt)

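    # Try to parse structured output
    try:
        # Gemini often wraps JSON in a Markdown code fence; strip it before parsing
        fence = "`" * 3
        cleaned = response.strip()
        if cleaned.startswith(fence):
            cleaned = cleaned.split("\n", 1)[-1].rsplit(fence, 1)[0]
        flashcards = json.loads(cleaned)
        if isinstance(flashcards, list) and all("front" in card and "back" in card for card in flashcards):
            return flashcards
    except Exception:
        pass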

    # Fallback if parsing fails
    print("⚠️ Could not parse structured flashcards, returning raw text.")
    return response
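
The sample run at the end of this notebook shows the model's reply starting with a Markdown code fence tagged `json`, and plain `json.loads` cannot parse that. The sketch below is one way to make parsing tolerant of the fence and to validate the expected keys in one place. `parse_json_list` is a hypothetical helper, not part of the original notebook.

```python
import json
import re


def parse_json_list(response, required_keys):
    """Parse a model reply into a list of dicts, or return None if it does not fit."""
    # Drop a leading Markdown fence (optionally tagged json) and a trailing one
    cleaned = re.sub(r"^\s*`{3}(?:json)?\s*|\s*`{3}\s*$", "", response)
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError:
        return None
    if isinstance(data, list) and all(
        isinstance(item, dict) and all(key in item for key in required_keys)
        for item in data
    ):
        return data
    return None
```

With this helper, flashcards could be checked with `parse_json_list(response, ["front", "back"])` and quiz questions with `parse_json_list(response, ["question", "answer"])`, falling back to the raw text when the result is `None`.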

### Step 10: Generating Dynamic Quiz Questions

In [None]:
def generate_quiz_questions(text):
    """
    Ask user for number of quiz questions, generate using Gemini.
    Returns a list of {"question": ..., "answer": ...}
    """

    # Ask user for number of questions
    num = input("How many questions would you like to generate? ").strip()
    num = int(num) if num.isdigit() else 5  # Default to 5 if invalid

    # Prompt for Gemini
    prompt = (
        f"You're a helpful AI tutor. Create {num} multiple choice or short-answer quiz questions "
        "based on the following study material. Each question must have a correct answer.\n\n"
        "Format your response like this:\n"
        "[\n"
        "  {\"question\": \"What is the capital of France?\", \"answer\": \"Paris\"},\n"
        f"  ... (total of {num} questions)\n"
        "]\n\n"
        f"Study Material:\n{text}"
    )

    # Generate with Gemini
    response = generate_response(prompt)

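    # Try parsing Gemini response as JSON
    try:
        # Gemini often wraps JSON in a Markdown code fence; strip it before parsing
        fence = "`" * 3
        cleaned = response.strip()
        if cleaned.startswith(fence):
            cleaned = cleaned.split("\n", 1)[-1].rsplit(fence, 1)[0]
        questions = json.loads(cleaned)
        if isinstance(questions, list) and all("question" in q and "answer" in q for q in questions):
            return questions
    except Exception:
        pass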

    # Fallback if parsing fails
    print("⚠️ Failed to parse structured questions. Returning raw Gemini output.")
    return response
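
Both the flashcard and quiz prompts embed the entire extracted text. For very long PDFs, or simply to keep each request small, the text could be split first and each piece sent separately. The sketch below is one simple approach; `chunk_text` and the 8000-character default are assumptions chosen for illustration, not limits taken from the Gemini API.

```python
def chunk_text(text, max_chars=8000):
    """Split text into pieces of at most max_chars, preferring paragraph boundaries."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single oversized paragraph is cut into fixed-size slices
        while len(paragraph) > max_chars:
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `generate_quiz_questions` or `create_flashcards` in turn and the resulting lists concatenated.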

## Final Step: Run Code

In [None]:
study_assistant()

Welcome to the AI-Powered Study Assistant!
You can upload a PDF file (slides, notes, or presentations) to generate flashcards and quizzes.

Menu:
1. Upload and process a PDF file
2. Exit
Choose an option: 1


Saving Lecture-Notes-The-Mental-Process.pdf to Lecture-Notes-The-Mental-Process.pdf
✅ Uploaded: Lecture-Notes-The-Mental-Process.pdf
🔍 Reading PDF and extracting content...
📄 Page 1: Found 0 images.
📄 Page 2: Found 0 images.
📄 Page 3: Found 0 images.
✅ Finished extracting 10168 characters of text and 0 images.
📄 Extracted 10166 characters of text and 0 images.


🔧 What would you like to do?
1. Generate Flashcards
2. Generate Sample Questions
3. Show Image Analysis
4. Exit
Enter your choice (1-5): 3
No images to analyze.

🔧 What would you like to do?
1. Generate Flashcards
2. Generate Sample Questions
3. Show Image Analysis
4. Exit
Enter your choice (1-5): 1

🎴 Choose flashcard style:
1. Vocabulary Only
2. Mixed (Q&A, fill-in-the-blank, etc.)
Enter 1 or 2: 2
How many flashcards would you like to generate? 3
🧠 Generating text-only response using Gemini Pro...
⚠️ Could not parse structured flashcards, returning raw text.

🧠 Flashcards:

```json
[
  {
    "front": "What are the two key men