# Feedback & Explanation Generation

**Project:** Synthetic Data Creation: Survey and Synthesis  
**Method Group:** Educational and Pedagogical Data Synthesis  
**Sub-method:** Feedback & Explanation Generation (FEG)  
**Author:** Prajna Penmetsa

**Goal:**  
Generate personalized feedback and explanatory responses for synthetic learner profiles.  
- The Gemini 2.5 Flash model, called through its REST API, analyzes each learner’s misconceptions, reasoning, and correctness to produce structured feedback that targets conceptual gaps and encourages progression.


In [1]:
from dotenv import load_dotenv
import os, json, requests, time
from tqdm import tqdm
import pandas as pd

load_dotenv()
API_KEY = os.getenv("GEMINI_API_KEY")
assert API_KEY, "❌ GEMINI_API_KEY not found. Please check your .env file."

MODEL = "gemini-2.5-flash"
URL = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent?key={API_KEY}"

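def call_gemini(prompt, timeout=60):
    """Send one prompt to Gemini; return the generated text, or None on any failure."""
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    try:
        r = requests.post(URL, json=payload, timeout=timeout)
    except requests.RequestException as e:
        print("❌ Request failed:", e)
        return None
    if not r.ok:
        print("❌ Error:", r.status_code, r.text)
        return None
    try:
        return r.json()["candidates"][0]["content"]["parts"][0]["text"]
    except (KeyError, IndexError, ValueError) as e:
        print("❌ Unexpected response shape:", e)  # e.g. a response with no candidates
        return None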

os.makedirs("outputs/feg", exist_ok=True)
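
Because `call_gemini` returns `None` on failure, a transient error silently drops that learner from the run. The following is a minimal sketch of a retry wrapper with exponential backoff. The helper name `call_with_retries` and its parameters are hypothetical and not part of the original notebook; the `sleep` argument exists only so the delay can be replaced in tests.

```python
import time

def call_with_retries(call, prompt, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call `call(prompt)` until it returns a non-None result or attempts run out."""
    for attempt in range(max_attempts):
        result = call(prompt)
        if result is not None:
            return result
        # Wait before the next attempt: base_delay, 2*base_delay, 4*base_delay, ...
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None
```

Under these assumptions, the generation loop could call `call_with_retries(call_gemini, prompt)` in place of `call_gemini(prompt)`.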

In [2]:
# Load learner profiles from SLM output
with open("../synthetic-learner-modeling/outputs/slm/synthetic_learners.json", "r", encoding="utf-8") as f:
    learners = json.load(f)

print(f"✅ Loaded {len(learners)} learner profiles.")
print("Sample:", learners[0]["student_name"], "-", learners[0]["learning_level"])

✅ Loaded 5 learner profiles.
Sample: Alice - beginner


In [4]:
def make_feedback_prompt(learner):
    name = learner["student_name"]
    level = learner["learning_level"]
    misconceptions = "; ".join(learner["misconceptions"])
    responses = "\n".join([
        f"Q: {r['question']}\nA: {r['student_answer']}\nReasoning: {r['reasoning']}\nCorrectness: {r['correctness']}"
        for r in learner["responses"]
    ])

    prompt = f"""
You are an intelligent tutoring system providing personalized feedback to a student.

Student Name: {name}
Learning Level: {level}
Common Misconceptions: {misconceptions}

Student Responses:
{responses}

Based on the above, generate structured feedback in JSON format:

{{
  "student_name": "{name}",
  "conceptual_feedback": "Explain the correct concept clearly and address the student's misunderstanding.",
  "motivational_feedback": "A short encouraging message.",
  "example_explanation": "Give a simple example or analogy related to the student’s misconception.",
  "next_practice_recommendation": "Suggest a short exercise or step to reinforce understanding."
}}

Ensure explanations are domain-accurate, pedagogically sound, and concise.
Return only the JSON object for this student.
"""
    return prompt

In [5]:
feedback_data = []

for learner in tqdm(learners, desc="Generating feedback"):
    prompt = make_feedback_prompt(learner)
    response = call_gemini(prompt)
    if response:
        feedback_data.append({"student_name": learner["student_name"], "feedback_raw": response})
    time.sleep(2)  # polite pacing

Generating feedback: 100%|██████████| 5/5 [01:12<00:00, 14.44s/it]


In [6]:
import re

parsed_feedback = []

for fb in feedback_data:
    # Greedy match from the first "{" to the last "}" in the raw model output.
    match = re.search(r"\{.*\}", fb["feedback_raw"], re.DOTALL)
    if not match:
        print(f"⚠️ No JSON object found for {fb['student_name']}")
        continue
    try:
        parsed_feedback.append(json.loads(match.group(0)))
    except json.JSONDecodeError as e:
        print(f"⚠️ Parse error for {fb['student_name']}: {e}")

# Save results
with open("outputs/feg/feedback_generated.json", "w", encoding="utf-8") as f:
    json.dump(parsed_feedback, f, indent=2, ensure_ascii=False)

print(f"✅ Parsed and saved structured feedback for {len(parsed_feedback)} learners.")

✅ Parsed and saved structured feedback for 5 learners.
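
The regular expression in the parsing cell spans from the first `{` to the last `}`, so it can fail if the model adds text containing braces after the JSON. One possible alternative, sketched here, scans for the first position where the standard library's `json.JSONDecoder.raw_decode` succeeds. The helper name `extract_first_json_object` is hypothetical and was not used in the run reported here.

```python
import json

def extract_first_json_object(text):
    """Return the first decodable JSON object found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None
```

Because decoding always starts at a `{`, any value returned is a `dict`; surrounding Markdown fences or trailing commentary are skipped rather than parsed.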


In [7]:
for fb in parsed_feedback[:3]:
    print(f"🧑‍🏫 Feedback for {fb['student_name']}:")
    print("Conceptual:", fb["conceptual_feedback"])
    print("Motivational:", fb["motivational_feedback"])
    print("Example:", fb["example_explanation"])
    print("Next Step:", fb["next_practice_recommendation"])
    print("-" * 80)

🧑‍🏫 Feedback for Alice:
Conceptual: Great job identifying the parts of a fraction in your second answer, Alice! When we compare fractions like 1/3 and 1/5, it's easy to think bigger numbers mean bigger pieces. However, the bottom number (the denominator) tells us how many *equal* pieces the whole is divided into. The more pieces you divide something into, the smaller each individual piece becomes.
Motivational: You're showing good thinking, Alice, and it's completely normal to have these kinds of questions when learning fractions!
Example: Let's use your cake idea! Imagine you have two identical cakes. If you cut one cake into 3 equal slices, and the other into 5 equal slices. Each slice from the cake cut into 3 pieces (like 1/3) will be a much bigger chunk than each slice from the cake cut into 5 pieces (like 1/5). So, 1/3 is actually larger.
Next Step: For your next step, try drawing two identical rectangles. Divide one into 2 equal parts and shade 1 part (1/2). Divide the ot

### Observations & Results

**1. Structure and Validity**  
- All five feedback entries followed the specified JSON schema with consistent fields:  
  (`student_name`, `conceptual_feedback`, `motivational_feedback`, `example_explanation`, `next_practice_recommendation`).  
- No formatting or parsing issues were detected.  
- Each feedback instance directly aligned with the learner’s misconceptions and reasoning patterns from SLM, showing strong contextual continuity.
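
The structural check described above can be made explicit. The following is a small sketch, assuming each parsed record is a `dict` and that every field should be a non-empty string; the helper name `find_schema_problems` is hypothetical and was not part of the original run.

```python
REQUIRED_FIELDS = (
    "student_name",
    "conceptual_feedback",
    "motivational_feedback",
    "example_explanation",
    "next_practice_recommendation",
)

def find_schema_problems(record):
    """Return the required fields that are missing, empty, or not strings."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(field)
    return problems
```

A record passes when the returned list is empty, so a check over the whole run could be written as `all(not find_schema_problems(fb) for fb in parsed_feedback)`.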

**2. Pedagogical Precision and Differentiation**  
- **Personalization:** Each feedback piece references the learner by name and tailors explanations to their specific misconceptions.  
  - *Alice:* Addressed denominator–numerator comparison through a concrete cake analogy.  
  - *Ben:* Clarified why denominators stay constant when adding fractions.  
  - *David:* Corrected division-of-fractions misconception with “invert and multiply.”  
- **Conceptual Depth:** Explanations move beyond correctness to highlight *why* reasoning errors occur, often grounding concepts in visuals or analogies (e.g., “two identical cakes,” “apple analogy,” “pizza serving model”).  
- **Progression-Aware Feedback:** The tone and complexity scale with learner level, from foundational guidance for beginners to metacognitive prompts for advanced learners like *Emily*.

**3. Motivational and Affective Design**  
- Every entry includes a supportive motivational statement, balancing cognitive correction with emotional reinforcement.  
- The tone is consistent with modern tutoring principles: encouraging, empathetic, and specific.

**4. Pedagogical Completeness**  
- Each JSON record provides a full pedagogical loop:
  - *Diagnosis* (learner misconception → conceptual feedback)  
  - *Explanation* (example or analogy)  
  - *Action* (next practice recommendation)  
- This aligns with formative assessment frameworks used in adaptive tutoring systems, enabling immediate feedback-driven learning cycles.

**5. Evaluation Summary**

| Metric | Observation |
|:--|:--|
| Structural fidelity | Excellent – valid JSON for all outputs |
| Contextual alignment | High – feedback maps cleanly to learner misconceptions |
| Pedagogical clarity | Strong – conceptual and actionable explanations |
| Motivational tone | Consistent – empathetic and supportive across profiles |
| Adaptivity | High – scaled feedback by learning level |

**6. Overall Insight**  
The generated feedback exemplifies *pedagogically intelligent synthetic data*: each entry mirrors realistic teacher responses that are both corrective and motivational.  
This confirms that large models can not only model **learner behavior** (SLM) but also simulate **teacher feedback loops**, producing structured, domain-accurate, and emotionally balanced instructional data suitable for integration into adaptive tutoring pipelines.

### Run Metadata
- Date: November 4th, 2025  
- Model: `gemini-2.5-flash`  
- Endpoint: `v1beta REST API`  
- Input Source: `synthetic_learners.json` (from SLM)  
- Learner Profiles Processed: 5  
- Output File: `outputs/feg/feedback_generated.json`  
- Temperature: API default (not set in the request)  
- Author: Prajna Penmetsa
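
The run above relied on the API's default sampling settings. For more reproducible output, the request body could set them explicitly through the `generationConfig` object of the Gemini REST API. The following is a sketch only: the helper name `build_payload` and the default value `0.2` are illustrative assumptions, not settings used in this run.

```python
def build_payload(prompt, temperature=0.2, json_only=True):
    """Build a generateContent request body with explicit sampling settings."""
    generation_config = {"temperature": temperature}
    if json_only:
        # Ask the API to return JSON text rather than free-form prose.
        generation_config["responseMimeType"] = "application/json"
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": generation_config,
    }
```

The resulting dictionary would replace the `payload` built inside `call_gemini`, and the chosen temperature could then be recorded in the run metadata.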