# Socratic AI - Personalized Educational Assistant using Gemini

## Problem Statement

Many students today rely on generative AI to quickly get direct answers, which can lead to passive learning and hinder the development of critical thinking skills. Traditional AI tutoring often provides immediate solutions, depriving learners of the opportunity to explore their own thought processes. This project addresses this challenge by transforming the educational experience into an active process of self-discovery.

## Mission Statement

Develop a personalized Socratic tutor that guides students through a process of inquiry and reflection, leading them to discover solutions on their own, rather than simply being handed the answer. By constraining the AI to ask precise, open-ended questions, we aim to foster deeper understanding and encourage independent problem solving.

## Use Case

This project builds a personalized Socratic tutor using Google’s Gemini API. The tutor evaluates a student’s current knowledge through interactive quizzes and then engages them in a tailored dialogue. Instead of providing direct answers, the tutor uses carefully crafted questions to probe the student’s thinking, helping to expose gaps in understanding and promote meaningful learning outcomes.


## Solution Overview

The solution leverages the Gemini API’s powerful natural language capabilities along with structured output to create a dynamic and interactive educational tool:

1. Quiz Generation:
Gemini is used in JSON mode to automatically generate quizzes that assess a student’s grasp of key concepts. The structured quiz output ensures consistency and forms the baseline for personalized instruction.

2. Socratic Dialogue:
After the quiz, the tutor engages the student through a Socratic dialogue. This process employs constrained AI behavior: carefully tuned system instructions ensure that the tutor asks short, guided questions rather than delivering lengthy explanations. This constraint helps channel the student’s thinking toward exploring underlying principles rather than relying on straightforward answers.

3. Context Maintenance:
The system maintains the conversation history to retain context across multiple turns. This personalized context allows the AI tutor to adapt its line of questioning based on the student’s previous responses and demonstrated understanding.

4. Evaluation and Feedback:
An evaluation module uses a two-step process to generate detailed feedback on the tutoring interaction. The tutor’s response is first analyzed in verbose natural language and then distilled into a concise rating. This helps ensure that the feedback aligns with the educational goals and Socratic method.

## Imports and Setup

This cell imports the necessary libraries and sets up the Gemini API client. It also retrieves the API key from Kaggle secrets.

In [1]:
!pip uninstall google-genai -y
!pip install google-genai==1.7.0

Found existing installation: google-genai 0.2.2
Uninstalling google-genai-0.2.2:
  Successfully uninstalled google-genai-0.2.2
Collecting google-genai==1.7.0
  Downloading google_genai-1.7.0-py3-none-any.whl.metadata (32 kB)
Collecting anyio<5.0.0,>=4.8.0 (from google-genai==1.7.0)
  Downloading anyio-4.9.0-py3-none-any.whl.metadata (4.7 kB)
Downloading google_genai-1.7.0-py3-none-any.whl (144 kB)
Downloading anyio-4.9.0-py3-none-any.whl (100 kB)
Installing collected packages: anyio, google-genai
  Attempting uninstall: anyio
    Found existing installation: anyio 3.7.1
    Uninstalling anyio-3.7.1:
      Successfully uninstalled anyio-3.7.1
Successfully installed anyio-4.9.0 google-genai-1.7.0


In [2]:
from google import genai
from google.genai import types
from IPython.display import HTML, Markdown, display
from google.api_core import retry
from kaggle_secrets import UserSecretsClient

import typing_extensions as typing
import json
import pydantic
import enum

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
client = genai.Client(api_key=GOOGLE_API_KEY)

print(genai.__version__)

1.7.0


## Data Structure for Chat History

These classes define the structure for storing chat turns and their content. The `SimplePart` class is a simple wrapper for text content, and the `ChatTurn` class represents a single turn in the conversation, including the role (`user` or `model`) and the content.

These classes are used to maintain a structured history of the conversation.

In [3]:
# Define a simple Part-like class to wrap text.
class SimplePart:
    def __init__(self, text):
        self.text = text

# A single conversation turn; its text is also wrapped in a SimplePart under `parts`.
class ChatTurn:
    def __init__(self, role, content):
        self.role = role
        self.content = content
        self.parts = [SimplePart(content)]
    
    def __repr__(self):
        return f"ChatTurn(role={self.role!r}, content={self.content!r})"
        
# ANSI color codes for console output
class Colors:
    RED = '\033[91m'
    GREEN = '\033[92m'
    YELLOW = '\033[93m'
    BLUE = '\033[94m'
    RESET = '\033[0m'
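
As a quick illustration of how these classes behave, the sketch below builds two turns and flattens them into plain dictionaries. The classes are repeated so the snippet runs on its own; the `to_dicts` helper and the `role`/`parts`/`text` dictionary layout are illustrative assumptions, not something the notebook defines.

```python
class SimplePart:
    def __init__(self, text):
        self.text = text

class ChatTurn:
    def __init__(self, role, content):
        self.role = role
        self.content = content
        self.parts = [SimplePart(content)]

    def __repr__(self):
        return f"ChatTurn(role={self.role!r}, content={self.content!r})"

def to_dicts(turns):
    """Hypothetical helper: flatten ChatTurn objects into plain dicts."""
    return [
        {"role": t.role, "parts": [{"text": p.text} for p in t.parts]}
        for t in turns
    ]

demo_history = [
    ChatTurn("user", "What is a neuron?"),
    ChatTurn("model", "What do you already know about it?"),
]
print(demo_history[0])
print(to_dicts(demo_history)[1])
```

A flat representation like this can be handy for logging or saving a session to disk.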

## Socratic Prompt - System Instructions

This cell defines the core prompt that guides the AI's behavior. It outlines the AI's role as a Socratic tutor, provides guidelines for interaction, and includes example phrases. The cell ends with a single test request that passes the prompt as the system instruction.

In [4]:
socratic_prompt = """
Role: You are a thoughtful and patient tutor whose goal is to help the student master concepts through independent problem solving. Your job is to encourage critical thinking, provide detailed feedback, and promote metacognitive awareness.

Guidelines:

1.  Never immediately provide the answer to the student's question.
2.  First, ask the student to clearly explain what steps they've already taken and precisely where they're getting stuck.
3.  Provide incremental hints or small nudges, not complete steps. Guide the student to think critically about the next logical action.
4.  Error Analysis and Feedback:
    -If the student makes a mistake, point out gently where and why the misunderstanding occurred.
    -Explain the underlying misconceptions and provide clear, concise explanations.
    -Provide the student with resources that can help them understand the problem better.
5.  Metacognitive Prompts:
    -Frequently prompt students to summarize their current understanding before moving forward.
    -Ask students to reflect on their learning process and identify their strengths and weaknesses.
    -Ask questions that promote self-evaluation, such as:
        -"What strategies did you use to solve this problem?"
        -"What could you have done differently?"
        -"How confident are you in your understanding of this concept?"
        -"What are some areas where you feel you need more practice?"
6.  Adaptive Questioning:
    -Adapt the difficulty of your questions based on the student's responses.
    -If the student demonstrates a strong understanding, introduce more challenging questions.
    -If the student is struggling, provide simpler questions and additional support.
7.  If the student directly asks for the final solution, respectfully decline and instead redirect them by providing another hint or asking guiding questions.
8.  Be encouraging and supportive throughout your interactions. Reinforce effort, progress, and persistence.

Example phrases you can use:

-   "Can you explain your thinking so far?"
-   "That's an interesting approach; what might you try next?"
-   "You're on the right track, but check your previous step carefully, do you see anything unusual?"
-   "Let’s slow down here. What information from the problem haven't you used yet?"
-   "What are some strategies you used to come to that conclusion?"
-   "Where do you think your understanding is breaking down?"
-   "Explain this concept as if you were teaching it to a friend."

Remember: your primary goal is to help the student develop problem-solving skills, confidence, and a deeper understanding, rather than simply providing solutions.
"""
response = client.models.generate_content(
    model="gemini-2.0-flash",
    config=types.GenerateContentConfig(
        system_instruction=socratic_prompt),
    contents="Hello there"
)

## ask_agent Function

This cell defines `ask_agent`, which sends user messages to the Gemini API, maintains the conversation history, and returns the AI's responses.

In [5]:
history = []
def ask_agent(user_message: str, model: str = "gemini-2.0-flash"):
    """
    Send a user's message to Gemini, maintaining the multi-turn chat history,
    then return the AI's response.
    """
    # Create a chat session seeded with the prior turns and the Socratic
    # system instructions. The new message is not in the history yet, so
    # send_message() does not deliver it to the model twice.
    chat = client.chats.create(
        model=model,
        config=types.GenerateContentConfig(system_instruction=socratic_prompt),
        history=list(history),
    )
    
    # Send the message to the model
    agent_response = chat.send_message(user_message)
    
    # Convert the response to text
    response_text = agent_response.text if hasattr(agent_response, "text") else str(agent_response)
    
    # Record both sides of the exchange as ChatTurn objects
    history.append(ChatTurn(role="user", content=user_message))
    history.append(ChatTurn(role="model", content=response_text))
    
    # Return the entire response object (so you can access .text, etc.)
    return agent_response
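
The history bookkeeping can be tried without calling the API. The sketch below replaces the Gemini chat with a stand-in function (`fake_send`, hypothetical) and keeps turns as `(role, text)` tuples rather than `ChatTurn` objects. It only illustrates that each exchange should add exactly one user turn and one model turn.

```python
def fake_send(prior_turns, message):
    """Hypothetical stand-in for the model: reports how much context it saw."""
    return f"(saw {len(prior_turns)} earlier turns) Tell me more about: {message}"

def ask_offline(turns, user_message, send=fake_send):
    # The model receives the prior turns plus the new message...
    reply = send(list(turns), user_message)
    # ...and only afterwards are both sides of the exchange recorded.
    turns.append(("user", user_message))
    turns.append(("model", reply))
    return reply

demo_turns = []
first = ask_offline(demo_turns, "gradient descent")
second = ask_offline(demo_turns, "learning rate")
print(second)
print(len(demo_turns))
```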

## Typing Definitions for Quiz

This cell defines the Pydantic models for quiz questions and quizzes, ensuring structured JSON output.

In [6]:
class QuizQuestion(pydantic.BaseModel):
    question: str
    options: list[str]
    answer: str

class Quiz(pydantic.BaseModel):
    quiz: list[QuizQuestion]

## Quiz Generation

This function generates a quiz in JSON format using Gemini's JSON mode. It takes a topic as input and returns a list of quiz questions.

In [7]:
def generate_quiz(topic: str) -> list[QuizQuestion]:
    """
    Uses Gemini's one shot 'generate_content' and JSON mode approach to produce a quiz in JSON format.
    Returns a list of quiz questions.
    """

    # 7 questions for now, but need to find a better way to evaluate?
    quiz_generation_prompt = f"""
    Generate a short quiz (7 questions) to assess the student's understanding of basic concepts about {topic}.
    Let them know that you are first going to generate a short quiz to determine their understanding of the topic to help them further.
    Structure the quiz in JSON format with the following schema:

    {{
        "quiz": [
            {{
                "question": "...", 
                "options": ["...", "...", "...", "..."], 
                "answer": "..."
            }}
        ]
    }}

    Each question must be multiple choice with exactly 4 options.
    """

    # Manually construct the schema (had issues with JSON format earlier, solution for now)
    quiz_schema = {
        "type": "object",
        "properties": {
            "quiz": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "question": {"type": "string"},
                        "options": {"type": "array", "items": {"type": "string"}},
                        "answer": {"type": "string"}
                    },
                    "required": ["question", "options", "answer"]
                }
            }
        },
        "required": ["quiz"]
    }

    response = client.models.generate_content(
        model='gemini-2.0-flash',
        config=types.GenerateContentConfig(
            temperature=0.1,
            response_mime_type="application/json",
            response_schema=quiz_schema, # Using the manual schema
        ),
        contents=quiz_generation_prompt
    )

    # Parse the JSON response
    quiz_data = json.loads(response.text)
    return [QuizQuestion(**q) for q in quiz_data["quiz"]]
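
The response schema constrains field names and types, but not the rule that each question has exactly four options, nor that the answer matches one of them. A minimal sketch of a post-parse check follows, using only the standard library. The function name `check_quiz_payload` and the sample JSON are illustrative assumptions.

```python
import json

def check_quiz_payload(raw_json: str) -> list[str]:
    """Return a list of human-readable problems found in a quiz payload."""
    problems = []
    data = json.loads(raw_json)
    for i, q in enumerate(data.get("quiz", []), start=1):
        options = q.get("options", [])
        if len(options) != 4:
            problems.append(f"Question {i}: expected 4 options, got {len(options)}")
        if q.get("answer") not in options:
            problems.append(f"Question {i}: answer is not one of the options")
    return problems

sample = json.dumps({
    "quiz": [
        {"question": "2 + 2 = ?", "options": ["3", "4", "5", "6"], "answer": "4"},
        {"question": "3 * 3 = ?", "options": ["6", "9"], "answer": "nine"},
    ]
})
quiz_problems = check_quiz_payload(sample)
print(quiz_problems)
```

An empty list means the payload satisfies both rules; a non-empty one could be used to trigger a regeneration of the quiz.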


## administer_quiz Function

This cell is responsible for displaying the quiz questions in the console, collecting user answers, and returning a list of the user's responses.

In [8]:
def administer_quiz(quiz: list[QuizQuestion]) -> list[str]:
    """
    Display each question, gather user input for each.
    """
    user_answers = []
    
    print(f"{Colors.GREEN}Hi! We will start with a short quiz to gauge your current understanding:\n{Colors.RESET}")
    
    for i, q in enumerate(quiz):
        print(f"{Colors.GREEN}Q{i+1}: {q.question}{Colors.RESET}")
        for j, opt in enumerate(q.options):
            print(f"{Colors.GREEN}    {j+1}. {opt}{Colors.RESET}")
        choice = input(f"{Colors.BLUE}Enter the number of your choice: {Colors.RESET}")
        
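        try:
            idx = int(choice) - 1
            # Reject out-of-range numbers; "0" would otherwise wrap to the last option
            if not 0 <= idx < len(q.options):
                raise ValueError(choice)
            user_answers.append(q.options[idx])
        except ValueError:
            user_answers.append("INVALID")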
        print()
    return user_answers
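
The choice handling can be separated from `input()` so that it is easy to test. The sketch below is one way to do that; `parse_choice` is a hypothetical helper, not part of the notebook. It treats non-numeric and out-of-range entries as "INVALID", including `0`, which plain list indexing would otherwise map to the last option.

```python
def parse_choice(choice: str, options: list[str]) -> str:
    """Map a 1-based menu choice to its option text, or "INVALID"."""
    try:
        idx = int(choice.strip()) - 1
    except ValueError:
        return "INVALID"
    # Guard the range explicitly: Python would accept -1 as "last item".
    if 0 <= idx < len(options):
        return options[idx]
    return "INVALID"

demo_options = ["AND", "OR", "XOR", "NOT"]
print(parse_choice("2", demo_options))
print(parse_choice("0", demo_options))
print(parse_choice("two", demo_options))
```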

## evaluate_quiz Function

This cell is responsible for evaluating the user's quiz answers and returning a score and a list of per-question results.

In [9]:
def evaluate_quiz(quiz: list[QuizQuestion], user_answers: list[str]):
    """
    Compare the user's answers to the correct ones. 
    Return a score and a list of per-question results.
    """
    score = 0
    results = []
    for i, question in enumerate(quiz):
        if i < len(user_answers) and user_answers[i] == question.answer:
            score += 1
            results.append(f"Question {i+1}: Correct!")
        else:
            results.append(f"Question {i+1}: Incorrect! Correct answer: {question.answer}")
    return score, results
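
A small worked example of the comparison, using stand-in question objects so it runs without the API. `SimpleNamespace` replaces `QuizQuestion` here purely for illustration, and `score_quiz` is a condensed restatement of the same logic rather than the notebook's function.

```python
from types import SimpleNamespace

def score_quiz(quiz, user_answers):
    """Count exact string matches between given and expected answers."""
    results = []
    for i, question in enumerate(quiz):
        given = user_answers[i] if i < len(user_answers) else None
        results.append(given == question.answer)
    return sum(results), results

demo_quiz = [
    SimpleNamespace(question="1 AND 0?", options=["0", "1"], answer="0"),
    SimpleNamespace(question="1 OR 0?", options=["0", "1"], answer="1"),
    SimpleNamespace(question="NOT 1?", options=["0", "1"], answer="0"),
]
# The second answer is wrong and the third is missing entirely.
demo_score, demo_results = score_quiz(demo_quiz, ["0", "0"])
print(demo_score, demo_results)
```

Because the comparison is an exact string match, a quiz whose `answer` field holds a letter such as "B" instead of the option text would mark every response as incorrect.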

## Socratic Dialogue Evaluation Prompt and Rating Enum

This cell defines the prompt used to evaluate the AI's performance in a Socratic dialogue. It outlines the evaluation criteria, rating rubric, and evaluation steps.

The `SocraticRating` enum class provides a structured way to represent the evaluation ratings.

In [10]:
# Define the evaluation prompt for Socratic Dialogue
SOCRATIC_EVAL_PROMPT = """
Instruction: You are an expert evaluator assessing the quality of responses generated by an AI-based Socratic tutor. The purpose of a Socratic tutor is to help learners arrive at solutions through guided questioning and critical thinking, promoting self-discovery rather than directly providing the answer.

Please ignore any irrelevant topics. Only evaluate the following input and AI response with respect to the subject: "{subject}".

Carefully read the user input and the AI generated response provided. Evaluate the quality of the response based on the criteria outlined below. Provide step by step explanations for your rating, strictly using the given Rating Rubric.

Metric Definition: You will evaluate the response quality specifically based on its effectiveness as a Socratic dialogue. The response should encourage active thinking, self-discovery, and critical analysis, guiding learners to derive answers independently rather than explicitly providing solutions.

Criteria:

- Socratic Engagement:    
    - Does the response primarily use questions and prompts that stimulate critical thinking and encourage the learner to reflect deeply?
    - Does the response avoid directly providing the answer, guiding the learner toward self-discovery instead?
        
- Relevance and Groundedness:
    - Does the response clearly relate to the user's input and the learning goal (specifically regarding "{subject}")?
    - Does the response avoid introducing irrelevant or external information?
        
- Clarity and Understandability:
    - Is the response phrased clearly and understandably, making it easy for the learner to grasp and engage with the questions posed?
        
- Depth and Thoughtfulness:
    - Does the response show thoughtful consideration of the learner’s level of understanding, adapting complexity accordingly?
    - Does it encourage deeper analysis rather than superficial engagement?

Rating Rubric:
- 5 (Excellent): The response thoroughly embodies the Socratic method, strongly encourages critical thinking, is highly relevant, clear, and thoughtfully engages the learner.
    
- 4 (Good): The response effectively uses Socratic questioning, is relevant, clear, and generally thoughtful, though it might have minor areas for improved engagement or depth.
    
- 3 (Moderate): The response shows basic Socratic engagement and relevance but lacks clarity or sufficient depth to fully encourage meaningful reflection or self-discovery.
    
- 2 (Poor): The response is minimally Socratic, provides too much direct guidance or answers explicitly, lacks clarity, or has limited relevance.
    
- 1 (Very Poor): The response fails to utilize the Socratic method, directly provides answers without prompting self-discovery, is unclear, irrelevant, or not useful in stimulating reflection.

Evaluation Steps:
STEP 1: Carefully analyze the AI generated response according to each criterion (Socratic Engagement, Relevance and Groundedness, Clarity and Understandability, Depth and Thoughtfulness).
STEP 2: Assign a rating from the rubric and provide a clear step by step explanation justifying your evaluation.

User Input:
{user_message}

AI Generated Response:
{ai_response}
"""


# A structured enum class for Socratic Dialogue ratings
class SocraticRating(enum.Enum):
    EXCELLENT = '5'
    GOOD = '4'
    ADEQUATE = '3'
    POOR = '2'
    VERY_POOR = '1'
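
Because the enum values are the strings '1' to '5', a member can be looked up directly from its value. The sketch below repeats the enum so it runs on its own; `rating_from_text` is a hypothetical helper that returns `None` for anything outside the rubric.

```python
import enum

class SocraticRating(enum.Enum):
    EXCELLENT = '5'
    GOOD = '4'
    ADEQUATE = '3'
    POOR = '2'
    VERY_POOR = '1'

def rating_from_text(text: str):
    """Hypothetical helper: look up a rating by value, or None if unknown."""
    try:
        return SocraticRating(text.strip())
    except ValueError:
        return None

print(rating_from_text("4"))
print(rating_from_text(" 5\n").name)
print(rating_from_text("great"))
```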

## Function to Evaluate Socratic Dialogue

This function evaluates the AI's responses in a Socratic dialogue using the Gemini API. It takes the user's message, AI's response, and the subject as input, and returns a verbose evaluation and a structured rating.

This function uses a two-step approach:

1.  Generates a detailed evaluation based on the `SOCRATIC_EVAL_PROMPT`.
2.  Converts the final score from the evaluation into a `SocraticRating` enum.

In [11]:
def eval_socratic_dialogue(user_message, ai_response, subject):
    """Evaluates the AI's response in a Socratic dialogue using generate_content."""
    
    formatted_prompt = SOCRATIC_EVAL_PROMPT.format(
        subject=subject,
        user_message=user_message,
        ai_response=ai_response
    )
    
    # Step 1: Generate verbose evaluation
    first_response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=formatted_prompt,
        config=types.GenerateContentConfig()
    )
    verbose_eval = first_response.text
    
    # Step 2: Convert final score to enum
    second_prompt = (
        "You produced the following evaluation text:\n\n"
        f"{verbose_eval}\n\n"
        "Now, please convert the final score into an enum from 1 to 5. "
        "Use only the rating rubric's standard enumerations: 1, 2, 3, 4, or 5."
    )
    structured_output_config = types.GenerateContentConfig(
        response_mime_type="text/x.enum",
        response_schema=SocraticRating
    )
    second_response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=second_prompt,
        config=structured_output_config
    )
    structured_eval = second_response.parsed
    
    return verbose_eval, structured_eval
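
Because `str.format` ignores keyword arguments that have no matching placeholder, it can be worth checking that a template really contains every field passed to it. The sketch below does this with `string.Formatter` from the standard library; `missing_placeholders` and the one-line demo template are illustrative assumptions.

```python
from string import Formatter

def missing_placeholders(template: str, field_names: list[str]) -> set[str]:
    """Return the names in field_names that never appear as {name} in template."""
    present = {name for _, name, _, _ in Formatter().parse(template) if name}
    return set(field_names) - present

demo_template = 'Evaluate the response with respect to "{subject}".'
# format() raises no error for the unused keyword argument...
rendered = demo_template.format(subject="loops", ai_response="What does range(3) yield?")
# ...so the check below is what reveals that ai_response was dropped.
dropped = missing_placeholders(demo_template, ["subject", "ai_response"])
print(rendered)
print(dropped)
```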

## Main Workflow - start_learning_session

This cell orchestrates the entire learning session: generating the quiz, administering it, evaluating the results, and initiating the Socratic dialogue.

In [12]:
def start_learning_session(topic: str, doQuiz: bool, turn_on_verbose: bool):
    """
    - Generates a quiz for the given topic (If wanted)
    - Administers the quiz (collects user answers) (If wanted)
    - Evaluates the quiz (If wanted)
    - Appends a summary of quiz results to the conversation (If wanted)
    - Begins a Socratic dialogue with ask_agent
    """
    
    if doQuiz:
        # Step 1: Generate the quiz
        quiz = generate_quiz(topic)

        # Step 2: Present quiz to user & collect answers
        user_answers = administer_quiz(quiz)

        # Step 3: Evaluate the quiz
        score, results = evaluate_quiz(quiz, user_answers)
        summary = f"Quiz Results ({score}/{len(quiz)})\n" + "\n".join(results)

    
        # Step 4: Append quiz summary to chat
        history.append(ChatTurn(role="user", content=summary))

        # Now let's begin the Socratic conversation referencing the quiz result
        prompt_for_agent = (
            f"Based on this quiz result, let's begin the Socratic dialogue on {topic}.\n"
            f"Quiz summary:\n{summary}.\n"
            "This is where the user currently stands with their knowledge on the topic. "
            "We can use this to identify potential weaknesses and address them in a socratic way. "
            "Please keep your responses concise, focusing on short, guided questions. "
            "Start by trying to understand what the user knows already about the topic, "
            "unless the user explicitly requests more detail."
        )

        # Send the initial prompt
        initial_response = ask_agent(prompt_for_agent)
        print(f"{Colors.GREEN}AI: {initial_response.text}{Colors.RESET}")

    else:    
        prompt_for_agent = (
            f"We are ready to start our Socratic dialogue on the following topic: {topic}"
        )

        initial_response = ask_agent(prompt_for_agent)
        print(f"{Colors.GREEN}AI: {initial_response.text}{Colors.RESET}")

    # Conversation loop
    while True:
        user_input = input(f"{Colors.BLUE}User Input:{Colors.RESET} ")
        if user_input.lower() in ["exit", "quit", "bye"]:
            print(f"{Colors.GREEN}AI: Goodbye!{Colors.RESET}")
            break
        
        # Get the AI response
        agent_response = ask_agent(user_input)
        print(f"{Colors.GREEN}AI: {agent_response.text}{Colors.RESET}")

        # Evaluate the AI response
        verbose_eval, structured_eval = eval_socratic_dialogue(
            user_message=user_input,
            ai_response=agent_response.text,
            subject=topic
        )

        # Display evaluation
        print(f"\n{Colors.RED}Evaluation:{Colors.RESET}")
        if turn_on_verbose:
            print(f"{Colors.RED}{verbose_eval}{Colors.RESET}")
        print(f"{Colors.RED}Rating: {structured_eval.value}{Colors.RESET}\n")


## Example Learning Sessions

This cell demonstrates how to start a learning session with different topics. The `start_learning_session` function initiates the quiz, dialogue, and evaluation process.

**Important:** The cell below initiates an interactive session which uses input(). This will fail during non-interactive Kaggle "Save Version" runs. So I have left this part commented out. Uncomment these lines when running the notebook interactively yourself.

In [13]:
# --- IMPORTANT ---
# The line below initiates an interactive session which uses input().
# This will fail during non-interactive Kaggle 'Save Version' runs.
# Uncomment the line ONLY when running the notebook interactively yourself.
# -----------------

# start_learning_session("How does an MP (McCulloch-Pitts) neuron work?", True, False)
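
For non-interactive runs, one option is to script the answers that `input()` would otherwise collect. The sketch below uses `unittest.mock.patch` from the standard library on a small stand-in loop. `echo_until_exit` is hypothetical and only mimics the shape of the conversation loop, so applying the same idea to `start_learning_session` is an untested assumption.

```python
from unittest.mock import patch

def echo_until_exit():
    """Stand-in for an input()-driven loop; collects lines until 'exit'."""
    seen = []
    while True:
        line = input("User Input: ")
        if line.lower() in ["exit", "quit", "bye"]:
            break
        seen.append(line)
    return seen

# Each call to input() returns the next scripted value.
with patch("builtins.input", side_effect=["What is a threshold?", "exit"]):
    scripted = echo_until_exit()
print(scripted)
```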