<a href="https://colab.research.google.com/github/Latifa2025-star/311calls/blob/main/Transforming_Math_Education_Adapting_Word_Problems_for_Diverse_Learners_Using_Transformer_Models_Nov29.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Adapting Grade-School Math Word Problems with Transformers

**Course:** Data Analytics  
**Goal of this notebook:**  
We use a real math word-problem dataset (GSM8K) and large language models (LLMs) to:

1. Load and explore a small set of elementary math word problems.  
2. “Tweak” each problem for three learner profiles:  
   - **ADHD**  
   - **ELL (English Language Learner)**  
   - **Intellectual Disability (ID)**  
3. Generate step-by-step math solutions and then **adapt the explanations** for each profile.  
4. Present the results in clear, teacher-friendly visual layouts that can be used in a classroom or IEP meeting.

By the end of this notebook you will see, for any selected math problem:

- The **original GSM8K problem**  
- The **adapted versions** of the problem for ADHD / ELL / ID  
- The **original GSM8K solution**  
- The **adapted explanations** for each learner profile


In [1]:
# Step 1 – Import core libraries and configure display

import os
import json
import pandas as pd
import numpy as np

# For charts and plots (later)
import matplotlib.pyplot as plt

# For NLP tools (tokenization, etc.)
import nltk

# For pretty HTML display inside the notebook
from IPython.display import HTML, display

print("✅ Environment ready:")
print("- pandas version:", pd.__version__)
print("- numpy version:", np.__version__)


✅ Environment ready:
- pandas version: 2.2.2
- numpy version: 2.0.2


## Step 2 – Load the Grade-School Math (GSM8K) Dataset

In this step we load a real-world dataset of grade-school math word problems called **GSM8K**.

- The file we use is: `grade-school-math-master.zip` (downloaded earlier from GitHub).
- Inside the ZIP there is a folder called `grade_school_math/data/` that contains the problems in JSONL format.
- We will upload the ZIP into Colab, extract it, and quickly check that the files are in the right place.

This dataset will be the **starting point for all our adaptations**.


In [2]:
# Step 2.1 – Upload and extract the GSM8K ZIP file

from google.colab import files
import os, zipfile

print("👉 Please choose the file: grade-school-math-master.zip from your computer.")

uploaded = files.upload()  # opens a file picker

# Get the uploaded filename (we expect just one)
zip_name = list(uploaded.keys())[0]
print(f"\n✅ Uploaded file: {zip_name}")

# Create a folder to hold the extracted data
base_dir = "gsm8k"
os.makedirs(base_dir, exist_ok=True)

# Extract the ZIP into gsm8k/
with zipfile.ZipFile(zip_name, 'r') as zf:
    zf.extractall(base_dir)

print("\n✅ Extraction complete!")
print("Contents of the top-level folder:")
print(os.listdir(base_dir))


👉 Please choose the file: grade-school-math-master.zip from your computer.


Saving grade-school-math-master.zip to grade-school-math-master.zip

✅ Uploaded file: grade-school-math-master.zip

✅ Extraction complete!
Contents of the top-level folder:
['grade-school-math-master']
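
The later cells work with a DataFrame named `df`, but the cells that build it are not shown here. As a minimal sketch of one way to read the extracted JSONL data, the helper below parses one JSON object per line. The name `read_jsonl` and the `DATA_FILE` path (including the `train.jsonl` file name) are assumptions based on the folder layout described above, not the notebook's confirmed loading code.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # ignore empty lines between records
                records.append(json.loads(line))
    return records


# Assumed location inside the extracted ZIP; adjust if your copy differs.
DATA_FILE = (
    Path("gsm8k") / "grade-school-math-master"
    / "grade_school_math" / "data" / "train.jsonl"
)

if DATA_FILE.exists():
    records = read_jsonl(DATA_FILE)
    print(f"Loaded {len(records)} problems; keys: {sorted(records[0])}")
```

Passing the resulting list to `pd.DataFrame(...)` would give a table with one row per problem. Which subset of problems the notebook keeps is not specified in this section.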


## Step 3 – Inspect the Original GSM8K Word Problems (Raw)

Before cleaning or transforming anything, we first look at the data **exactly as it appears** in the GSM8K dataset.

Each row contains:
- `question` – the original math word problem text  
- `answer` – the original worked-out solution: step-by-step reasoning with calculator annotations in `<<...>>`, followed by the final answer after `####`

In the table below, we show the first few problems **without any cleaning**, so we can see the raw format of the dataset.
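
The raw `answer` strings mix the reasoning steps, calculator annotations such as `<<48/2=24>>`, and a final line starting with `####`. A minimal sketch of separating these parts is shown here; `split_gsm8k_answer` is a hypothetical helper name that does not appear elsewhere in this notebook.

```python
import re


def split_gsm8k_answer(answer: str):
    """Split a GSM8K answer into (readable steps, final answer).

    Calculator annotations of the form <<...>> are removed from the steps,
    and the text after '####' is returned as the final answer.
    """
    steps, _, final = answer.partition("####")
    steps = re.sub(r"<<.*?>>", "", steps).strip()
    return steps, final.strip()


example = (
    "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
    "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
    "#### 72"
)
steps, final = split_gsm8k_answer(example)
print(steps)
print("Final answer:", final)
```

Keeping the final answer separate makes it possible to check later that an adapted explanation still ends with the same number.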


In [8]:
# Step 3 – Raw preview of original GSM8K problems + solutions

raw_preview = df[["question", "answer"]].head(3).copy()
raw_preview.insert(0, "Problem #", range(1, len(raw_preview) + 1))

raw_style = """
<style>
.raw-table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Segoe UI', Arial, sans-serif;
    font-size: 15px;
}

/* Header */
.raw-table thead th {
    background: #2D9CDB;
    color: white;
    padding: 8px;
    text-align: left;
}

/* Cells */
.raw-table td {
    padding: 10px 8px;
    border-bottom: 1px solid #ddd;
    vertical-align: top;
}

/* Problem # column */
.raw-table td:first-child,
.raw-table th:first-child {
    width: 90px;
    text-align: center;
}

/* Make question and answer visually distinct */
.raw-table td:nth-child(2) {
    background-color: #F7FAFF;
    font-weight: 500;
}
.raw-table td:nth-child(3) {
    background-color: #FFFFFF;
}
</style>
"""

html_raw = raw_preview.to_html(
    classes="raw-table",
    escape=True,   # show text exactly as in the original file
    index=False,
)

display(HTML(raw_style + html_raw))


Problem #,question,answer
1,"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72
2,"Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?","Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.\nWorking 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10.\n#### 10"
3,"Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?","In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50.\nBetty's grandparents gave her 15 * 2 = $<<15*2=30>>30.\nThis means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more.\n#### 5"


In [10]:
# Step 3 – Raw preview of original GSM8K problems + solutions (clean and readable)

raw_preview = df[["question", "answer"]].head(3).copy()
raw_preview.insert(0, "Problem #", range(1, len(raw_preview) + 1))

# Convert \n to <br> for readable HTML
raw_preview["answer"] = raw_preview["answer"].str.replace("\n", "<br>", regex=False)

custom_style = """
<style>
.raw-table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Segoe UI', Arial, sans-serif;
    font-size: 15px;
    table-layout: fixed;
}

/* Header row */
.raw-table thead th {
    background: #2D9CDB;
    color: #ffffff !important;
    padding: 12px;
    text-align: left;
    font-size: 16px;
    font-weight: 700;
}

/* Problem # column header */
.raw-table thead th:first-child {
    text-align: center;
}

/* Problem # cells */
.raw-table td:first-child {
    width: 90px;
    text-align: center;
    font-weight: 700;
    color: #1a1a1a !important;     /* DARK TEXT */
    background: #E8F1FF !important; /* LIGHT BLUE BACKGROUND */
    border-right: 1px solid #d0d0d0;
}

/* Question cells */
.raw-table td:nth-child(2) {
    background: #F7FAFF;
    font-weight: 500;
    color: #222;
    padding: 14px 16px;
}

/* Answer cells */
.raw-table td:nth-child(3) {
    background: #ffffff;
    color: #333;
    padding: 14px 16px;
    line-height: 1.5;
    white-space: normal;
    border-left: 1px solid #e0e0e0;
}
</style>
"""

from IPython.display import HTML
display(HTML(custom_style + raw_preview.to_html(classes="raw-table", escape=False, index=False)))



Problem #,question,answer
1,"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",Natalia sold 48/2 = <<48/2=24>>24 clips in May. Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May. #### 72
2,"Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?","Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute. Working 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10. #### 10"
3,"Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?","In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50. Betty's grandparents gave her 15 * 2 = $<<15*2=30>>30. This means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more. #### 5"


In [11]:
raw_preview = df[["question", "answer"]].head(3).copy()
raw_preview.insert(0, "Problem #", range(1, len(raw_preview) + 1))

# Clean answer formatting
raw_preview["answer"] = raw_preview["answer"].str.replace("\n", "<br>", regex=False)

custom_style = """
<style>
.raw-table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Segoe UI', Arial, sans-serif;
    font-size: 15px;
    table-layout: fixed;
}

/* HEADER */
.raw-table thead th {
    background: #2D9CDB;
    color: white !important;
    padding: 10px;
    font-weight: 700;
    border: 1px solid #d0d0d0;
}

/* Narrow Problem # column */
.raw-table th:first-child,
.raw-table td:first-child {
    width: 60px !important;          /* fixed narrow width */
    max-width: 60px !important;
    min-width: 60px !important;
    text-align: center;
    font-weight: 700;
    color: #000;
    background: #E8F1FF !important;
    border-right: 1px solid #cccccc;
}

/* Question column */
.raw-table td:nth-child(2) {
    padding: 14px;
    background: #F7FAFF;
    color: #222;
    line-height: 1.45;
}

/* Answer column */
.raw-table td:nth-child(3) {
    padding: 14px;
    background: #ffffff;
    color: #333;
    line-height: 1.45;
    white-space: normal;
}

/* Add subtle row borders */
.raw-table td {
    border-bottom: 1px solid #e0e0e0;
}
</style>
"""

from IPython.display import HTML
display(HTML(custom_style + raw_preview.to_html(classes="raw-table", escape=False, index=False)))


Problem #,question,answer
1,"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",Natalia sold 48/2 = <<48/2=24>>24 clips in May. Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May. #### 72
2,"Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?","Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute. Working 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10. #### 10"
3,"Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?","In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50. Betty's grandparents gave her 15 * 2 = $<<15*2=30>>30. This means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more. #### 5"


In [12]:
raw_preview = df[["question", "answer"]].head(3).copy()
raw_preview.insert(0, "Problem #", range(1, len(raw_preview) + 1))

# Clean answer formatting
raw_preview["answer"] = raw_preview["answer"].str.replace("\n", "<br>", regex=False)

custom_style = """
<style>
.raw-table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Segoe UI', Arial, sans-serif;
    font-size: 15px;
    table-layout: fixed;
}

/* HEADER */
.raw-table thead th {
    background: #2D9CDB;
    color: white !important;
    padding: 10px;
    font-weight: 700;
    border: 1px solid #d0d0d0;
}

/* Narrow Problem # column, now blue */
.raw-table th:first-child,
.raw-table td:first-child {
    width: 60px !important;
    max-width: 60px !important;
    min-width: 60px !important;
    text-align: center;
    font-weight: 700;
    color: white !important;           /* change text to white for contrast */
    background: #2D9CDB !important;    /* blue background */
    border-right: 1px solid #cccccc;
}

/* Question column */
.raw-table td:nth-child(2) {
    padding: 14px;
    background: #F7FAFF;
    color: #222;
    line-height: 1.45;
}

/* Answer column */
.raw-table td:nth-child(3) {
    padding: 14px;
    background: #ffffff;
    color: #333;
    line-height: 1.45;
    white-space: normal;
}

/* Add subtle row borders */
.raw-table td {
    border-bottom: 1px solid #e0e0e0;
}
</style>
"""

from IPython.display import HTML
display(HTML(custom_style + raw_preview.to_html(classes="raw-table", escape=False, index=False)))


Problem #,question,answer
1,"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",Natalia sold 48/2 = <<48/2=24>>24 clips in May. Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May. #### 72
2,"Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?","Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute. Working 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10. #### 10"
3,"Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?","In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50. Betty's grandparents gave her 15 * 2 = $<<15*2=30>>30. This means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more. #### 5"


## Step 4 – Create a Cleaned Version of the Word Problems

The original `question` text sometimes contains:
- Extra spaces or line breaks  
- Slightly inconsistent punctuation  

For modeling and adaptation, we create a cleaner version called `question_clean` where we:
- Normalize spaces  
- Strip leading/trailing whitespace  
- Keep the **math content and numbers exactly the same**

Below we create `question_clean` and preview the first few cleaned problems in a neat table.
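
One way to check the promise that numbers stay exactly the same is to compare the numeric tokens before and after cleaning. The sketch below is an illustration only; `numbers_in` and `numbers_preserved` are hypothetical helper names, and the regular expression covers plain integers and decimals, not every possible number format.

```python
import re

# Integers and simple decimals such as 48, 0.5 or 120
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def numbers_in(text: str):
    """Return the numeric tokens of `text` in reading order."""
    return NUMBER_RE.findall(text)


def numbers_preserved(original: str, cleaned: str) -> bool:
    """True when both texts contain the same numbers in the same order."""
    return numbers_in(original) == numbers_in(cleaned)


# Whitespace-only changes keep the numbers intact.
print(numbers_preserved("She  ran 0.5 miles.\n", "She ran 0.5 miles."))
# A cleaning step that splits a decimal would be flagged.
print(numbers_preserved("She ran 0.5 miles.", "She ran 0. 5 miles."))
```

Running such a check over every row after cleaning gives quick evidence that the text normalization did not touch the math content.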


In [14]:
import re
from IPython.display import HTML, display

# --- Create cleaned question text ---

def clean_text(text: str) -> str:
    # collapse multiple spaces/newlines into a single space
    text = re.sub(r"\s+", " ", text)
    # ensure a space after a period that is directly followed by a letter;
    # periods inside numbers (e.g. 0.5) are left untouched so values never change
    text = re.sub(r"\.(?=[A-Za-z])", ". ", text)
    # trim spaces at start/end
    return text.strip()

# Apply cleaning
df["question_clean"] = df["question"].apply(clean_text)

# --- Build a nice preview table for the cleaned questions ---

clean_preview = df[["question_clean"]].head(5).copy()
clean_preview.insert(0, "Problem #", range(1, len(clean_preview) + 1))

clean_style = """
<style>
.clean-table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Segoe UI', Arial, sans-serif;
    font-size: 15px;
    table-layout: fixed;
}

/* header */
.clean-table thead th {
    background: #4C7FF0;
    color: #ffffff;
    padding: 10px;
    font-weight: 700;
    border: 1px solid #d0d0d0;
}

/* narrow number column */
.clean-table th:first-child,
.clean-table td:first-child {
    width: 60px;
    max-width: 60px;
    min-width: 60px;
    text-align: center;
    font-weight: 700;
    color: #ffffff;
    background: #4C7FF0;
}

/* cleaned question column */
.clean-table td:nth-child(2) {
    padding: 12px 14px;
    background: #F7FAFF;
    color: #222;
    line-height: 1.45;
    text-align: left;       /* left-align all lines */
    white-space: normal;    /* allow wrapping */
}

/* row borders */
.clean-table td {
    border-bottom: 1px solid #e0e0e0;
}
</style>
"""

html_clean = clean_preview.to_html(
    classes="clean-table",
    escape=False,
    index=False,
)

display(HTML(clean_style + html_clean))


Problem #,question_clean
1,"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
2,"Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?"
3,"Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?"
4,"Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?"
5,James writes a 3-page letter to 2 different friends twice a week. How many pages does he write a year?


## Step 5 – Connect to OpenAI GPT-4o (Securely)

We use the OpenAI API to adapt each math word problem and explanation for:

- ADHD
- English Language Learners (ELL)
- Intellectual Disability (ID)

For security, the API key is **not written in the notebook**.  
Instead, it is stored as a Colab *secret* named `OPENAI_API_KEY`, and we read it at runtime.
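
If the notebook is ever run outside Colab, the secrets panel is not available. A minimal sketch of a lookup that falls back to an environment variable is shown below; `get_api_key` is a hypothetical helper, and the broad `except` reflects an assumption that Colab signals a missing or inaccessible secret by raising an exception.

```python
import os


def get_api_key(name: str = "OPENAI_API_KEY"):
    """Return the key from Colab secrets if possible, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        return os.environ.get(name)
    try:
        return userdata.get(name)
    except Exception:
        # Assumption: a missing secret or denied notebook access raises here.
        return os.environ.get(name)


key = get_api_key()
print("API key found" if key else "No API key configured")
```

Either way, the key itself is never written into a notebook cell or printed.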


In [15]:
from google.colab import userdata
from openai import OpenAI

# Read the API key from Colab secrets (you already created OPENAI_API_KEY)
api_key = userdata.get("OPENAI_API_KEY")

if not api_key:
    raise ValueError(
        "No API key found. Go to the 🔑 Secrets panel and create one named 'OPENAI_API_KEY'."
    )

# Create the OpenAI client
client = OpenAI(api_key=api_key)

print("✅ OpenAI client is ready to use!")


✅ OpenAI client is ready to use!


## Step 6 – Define GPT-4o Functions for Adapting Problems and Explanations

We now create two helper functions:

1. `adapt_problem_with_gpt(question, profile)`  
   - Input: a math word problem and a learner profile (ADHD, ELL, or Intellectual Disability).  
   - Output: a **rewritten version of the problem**, keeping:
     - all numbers the same  
     - the math meaning the same  
     - only the language and scaffolding changed.

2. `adapt_explanation_with_gpt(question, teacher_solution, profile)`  
   - Input: the original problem, the original step-by-step solution, and the learner profile.  
   - Output: a **step-by-step explanation** tailored to that student profile,
     using clear and supportive language.

Both functions use the `gpt-4o` model through the OpenAI API.
We will call them later for specific problems and display the results in a formatted HTML layout.


In [16]:
def adapt_problem_with_gpt(question: str, profile: str) -> str:
    """
    Rewrite the given math word problem for a specific learner profile (ADHD, ELL, ID)
    without changing the numbers or the math meaning.
    """
    prompt = f"""
You are an expert special education math teacher.

Rewrite the following math word problem for a student with this profile: **{profile}**.

Rules:
- KEEP all numbers exactly the same.
- KEEP the math meaning and operations the same.
- DO NOT solve the problem.
- Use short, clear sentences.
- Add gentle scaffolding or clarifying phrases appropriate for the profile.

Original problem:
\"\"\"{question}\"\"\"

Write ONLY the adapted problem text (no extra comments).
"""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.4,
        max_tokens=250,
    )

    return response.choices[0].message.content.strip()


def adapt_explanation_with_gpt(question: str,
                               teacher_solution: str,
                               profile: str) -> str:
    """
    Rewrite the step-by-step solution so that it is easier to understand
    for a specific learner profile (ADHD, ELL, ID).
    """
    prompt = f"""
You are an expert special education math teacher.

A student with this profile: **{profile}** is working on a word problem.

Here is the original problem:
\"\"\"{question}\"\"\"

Here is the teacher's step-by-step solution (chain-of-thought):
\"\"\"{teacher_solution}\"\"\"

Your task:
- Explain the solution in a way that fits the student profile.
- Use simple, encouraging language.
- Keep the same final numeric answer.
- Break the explanation into clear, numbered or bulleted steps.
- Do NOT change the numbers or the math operations.

Write ONLY the adapted explanation for the student.
"""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.4,
        max_tokens=350,
    )

    return response.choices[0].message.content.strip()

print("✅ Adaptation functions defined (problems + explanations).")


✅ Adaptation functions defined (problems + explanations).
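
The prompts ask the model to keep all numbers exactly the same, but a prompt rule is not a guarantee. As a rough sketch of an automatic check, the hypothetical helper `missing_numbers` below lists numbers from the original problem that no longer appear as digits in the adapted text. It would also flag harmless rewrites such as "twelve" for "12", so a non-empty result is a prompt for human review, not proof of an error.

```python
import re
from collections import Counter

# Integers and simple decimals written with digits
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def missing_numbers(original: str, adapted: str):
    """Return numbers from `original` that are absent (or too rare) in `adapted`."""
    wanted = Counter(NUMBER_RE.findall(original))
    found = Counter(NUMBER_RE.findall(adapted))
    # Counter subtraction keeps only numbers that appear fewer times in `adapted`.
    return sorted((wanted - found).elements())


original = "Weng earns $12 an hour for babysitting. She did 50 minutes of babysitting."
good = "Weng gets $12 for 1 hour of babysitting. She worked 50 minutes."
bad = "Weng gets twelve dollars for one hour. She worked 50 minutes."

print(missing_numbers(original, good))
print(missing_numbers(original, bad))
```

Extra numbers in the adapted text (for example step labels) are deliberately ignored, since scaffolding often adds them.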


## Step 7 – Full Adaptation Demo for One Math Problem

In this step we:

1. Choose a problem from the GSM8K subset.  
2. Use GPT-4o to:
   - Adapt the **word problem** for:
     - ADHD  
     - English Language Learners (ELL)  
     - Intellectual Disability (ID)
   - Adapt the **step-by-step solution** (explanation) for the same three profiles.
3. Present everything in a single, well-organized view:

- Original problem  
- Adapted problems (3 profiles)  
- Original GSM8K solution  
- Adapted explanations (3 profiles)
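
Because this step requests three adapted problems and three adapted explanations for every run, re-running it on the same problem repeats the same API requests. A minimal sketch of an in-memory cache that avoids this is shown below; `make_cached` is a hypothetical wrapper, and the stand-in function only imitates a two-argument adaptation function so the example runs without an API key.

```python
def make_cached(adapt_fn):
    """Wrap a (question, profile) -> text function with a simple dict cache."""
    cache = {}

    def cached(question: str, profile: str) -> str:
        key = (question, profile)
        if key not in cache:
            # Only the first request for this pair reaches the wrapped function.
            cache[key] = adapt_fn(question, profile)
        return cache[key]

    return cached


# Stand-in for an API-backed function, used here only for demonstration.
def fake_adapt(question: str, profile: str) -> str:
    print(f"(calling model for {profile})")
    return f"[{profile}] {question}"


cached_adapt = make_cached(fake_adapt)
cached_adapt("How many clips?", "ADHD")  # triggers one call
cached_adapt("How many clips?", "ADHD")  # served from the cache
```

With a nonzero temperature each real call can return different wording, so caching also keeps the displayed adaptation stable between runs of the same session.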

> **Note:** This cell makes 6 API calls (3 for the problems, 3 for the explanations),
> so it uses some of your OpenAI credit.


In [17]:
from IPython.display import HTML, display

# -------------------------------
# 1. Choose which problem to demo
# -------------------------------
PROBLEM_INDEX = 0   # 0 = first problem, 1 = second, etc.

row = df.iloc[PROBLEM_INDEX]
orig_question = row["question"]
clean_question = row["question_clean"]
orig_answer = row["answer"]  # chain-of-thought + final answer

print(f"Generating adaptations for problem index {PROBLEM_INDEX}...\n(This will make 6 GPT-4o calls.)")

# -------------------------------
# 2. Generate adapted problems
# -------------------------------
adhd_problem = adapt_problem_with_gpt(clean_question, "ADHD")
ell_problem = adapt_problem_with_gpt(clean_question, "English Language Learner (ELL)")
id_problem  = adapt_problem_with_gpt(clean_question, "Intellectual Disability")

# -------------------------------
# 3. Generate adapted explanations
# -------------------------------
# Clean the original answer for readability in HTML
orig_answer_html = orig_answer.replace("\n", "<br>")

adhd_expl = adapt_explanation_with_gpt(clean_question, orig_answer, "ADHD")
ell_expl  = adapt_explanation_with_gpt(clean_question, orig_answer, "English Language Learner (ELL)")
id_expl   = adapt_explanation_with_gpt(clean_question, orig_answer, "Intellectual Disability")

# Turn explanations into HTML-friendly text
adhd_expl_html = adhd_expl.replace("\n", "<br>")
ell_expl_html  = ell_expl.replace("\n", "<br>")
id_expl_html   = id_expl.replace("\n", "<br>")

# -------------------------------
# 4. Build fancy HTML layout
# -------------------------------
css = """
<style>
.adapt-container {
    font-family: 'Segoe UI', Arial, sans-serif;
    margin: 10px 0 30px 0;
}

/* main title bar */
.adapt-header {
    background: linear-gradient(90deg, #4C7FF0, #7F5AF0);
    color: white;
    padding: 14px 18px;
    font-size: 20px;
    font-weight: 700;
    border-radius: 10px 10px 0 0;
}

/* section title */
.section-title {
    margin-top: 18px;
    margin-bottom: 6px;
    font-size: 16px;
    font-weight: 700;
    color: #1f2933;
    display: flex;
    align-items: center;
    gap: 8px;
}

/* small pill label */
.pill {
    display: inline-block;
    padding: 2px 8px;
    border-radius: 999px;
    font-size: 11px;
    font-weight: 600;
    color: #ffffff;
    background: #4C7FF0;
}

/* card */
.card {
    background: #ffffff;
    border-radius: 10px;
    padding: 14px 16px;
    box-shadow: 0 1px 4px rgba(15, 23, 42, 0.12);
    margin-bottom: 8px;
}

/* muted text under titles */
.helper-text {
    font-size: 12px;
    color: #6b7280;
    margin-bottom: 6px;
}

/* layout for three-column sections */
.three-col {
    display: grid;
    grid-template-columns: repeat(3, 1fr);
    gap: 10px;
}

/* profile badges */
.badge {
    display: inline-block;
    padding: 2px 8px;
    border-radius: 999px;
    font-size: 11px;
    font-weight: 600;
    color: white;
}

.badge-adhd { background: #F97316; }
.badge-ell  { background: #10B981; }
.badge-id   { background: #6366F1; }

/* text in cards */
.card p {
    margin: 4px 0;
    line-height: 1.45;
    font-size: 14px;
    color: #111827;
    text-align: left;
}

/* code-like area for original solution */
.solution-box {
    font-family: "Consolas", "Courier New", monospace;
    font-size: 13px;
    background: #F9FAFB;
    border-radius: 8px;
    padding: 10px 12px;
    white-space: normal;
    line-height: 1.45;
}
</style>
"""

html = f"""
<div class="adapt-container">

  <div class="adapt-header">
    📘 Adapted Math Problem #{PROBLEM_INDEX + 1}
  </div>

  <!-- Original Problem -->
  <div class="section-title">
    📌 Original Problem <span class="pill">from GSM8K</span>
  </div>
  <div class="card">
    <p>{clean_question}</p>
  </div>

  <!-- Adapted Problems -->
  <div class="section-title">
    ✏️ Adapted Problem Text for Each Learner Profile
  </div>
  <div class="helper-text">
    The math meaning and numbers are kept the same. Only wording, clarity, and scaffolding are adjusted.
  </div>

  <div class="three-col">
    <div class="card">
      <span class="badge badge-adhd">ADHD</span>
      <p>{adhd_problem}</p>
    </div>
    <div class="card">
      <span class="badge badge-ell">ELL</span>
      <p>{ell_problem}</p>
    </div>
    <div class="card">
      <span class="badge badge-id">Intellectual Disability</span>
      <p>{id_problem}</p>
    </div>
  </div>

  <!-- Original GSM8K Solution -->
  <div class="section-title">
    📚 Original Dataset Solution (Teacher / GSM8K)
  </div>
  <div class="helper-text">
    This is the chain-of-thought style solution that comes with the dataset.
  </div>
  <div class="card solution-box">
    {orig_answer_html}
  </div>

  <!-- Adapted Explanations -->
  <div class="section-title">
    💡 Adapted Solution Explanations for Each Profile
  </div>
  <div class="helper-text">
    All three explanations follow the same math steps and final answer, but the language is tailored to each learner.
  </div>

  <div class="three-col">
    <div class="card">
      <span class="badge badge-adhd">ADHD Explanation</span>
      <p>{adhd_expl_html}</p>
    </div>
    <div class="card">
      <span class="badge badge-ell">ELL Explanation</span>
      <p>{ell_expl_html}</p>
    </div>
    <div class="card">
      <span class="badge badge-id">ID Explanation</span>
      <p>{id_expl_html}</p>
    </div>
  </div>

</div>
"""

display(HTML(css + html))


Generating adaptations for problem index 0...
(This will make 6 GPT-4o calls.)
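
The layout cell places dataset text and model output directly into an HTML template, so characters such as `<` and `&` are interpreted by the browser instead of being shown as written. A minimal sketch of a safer conversion follows; `to_safe_html` is a hypothetical helper name, and it relies only on the standard-library `html.escape`.

```python
from html import escape


def to_safe_html(text: str) -> str:
    """Escape HTML-special characters, then turn newlines into <br> tags."""
    return escape(text).replace("\n", "<br>")


sample = "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n#### 72"
print(to_safe_html(sample))
```

Importing `escape` by name avoids a clash with the notebook's own variable called `html`, which holds the page template.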


In [18]:
from IPython.display import HTML, display, clear_output
import ipywidgets as widgets

# ---------------------------
# Helper: run adaptation for a given problem number (1 to len(df))
# ---------------------------
def run_adaptation(problem_number: int):
    idx = problem_number - 1
    if idx < 0 or idx >= len(df):
        raise ValueError(f"Problem number must be between 1 and {len(df)}")

    row = df.iloc[idx]
    clean_question = row["question_clean"]
    orig_answer = row["answer"]

    # For HTML display of the original solution
    orig_answer_html = orig_answer.replace("\n", "<br>")

    print(f"Generating adaptations for Problem #{problem_number} (index {idx})...")
    print("This makes 6 GPT-4o calls (3 problems + 3 explanations). Please wait a few seconds.\n")

    # ---- GPT calls: adapted problems ----
    adhd_problem = adapt_problem_with_gpt(clean_question, "ADHD")
    ell_problem  = adapt_problem_with_gpt(clean_question, "English Language Learner (ELL)")
    id_problem   = adapt_problem_with_gpt(clean_question, "Intellectual Disability")

    # ---- GPT calls: adapted explanations ----
    adhd_expl = adapt_explanation_with_gpt(clean_question, orig_answer, "ADHD")
    ell_expl  = adapt_explanation_with_gpt(clean_question, orig_answer, "English Language Learner (ELL)")
    id_expl   = adapt_explanation_with_gpt(clean_question, orig_answer, "Intellectual Disability")

    # To HTML (line breaks)
    adhd_expl_html = adhd_expl.replace("\n", "<br>")
    ell_expl_html  = ell_expl.replace("\n", "<br>")
    id_expl_html   = id_expl.replace("\n", "<br>")

    # -------------------------------
    # Fancy CSS: bigger fonts + colors
    # -------------------------------
    css = """
    <style>
    .adapt-container {
        font-family: 'Segoe UI', Arial, sans-serif;
        margin: 10px 0 40px 0;
    }

    /* main title bar */
    .adapt-header {
        background: linear-gradient(90deg, #2563EB, #7C3AED);
        color: white;
        padding: 16px 20px;
        font-size: 22px;
        font-weight: 700;
        border-radius: 14px 14px 0 0;
        box-shadow: 0 2px 6px rgba(15, 23, 42, 0.25);
    }

    /* section title */
    .section-title {
        margin-top: 20px;
        margin-bottom: 6px;
        font-size: 18px;
        font-weight: 700;
        color: #111827;
        display: flex;
        align-items: center;
        gap: 8px;
    }

    /* small pill label */
    .pill {
        display: inline-block;
        padding: 3px 10px;
        border-radius: 999px;
        font-size: 12px;
        font-weight: 600;
        color: #ffffff;
        background: #0EA5E9;
    }

    /* card */
    .card {
        background: #ffffff;
        border-radius: 12px;
        padding: 16px 18px;
        box-shadow: 0 2px 6px rgba(15, 23, 42, 0.15);
        margin-bottom: 10px;
    }

    /* muted text under titles */
    .helper-text {
        font-size: 14px;
        color: #6b7280;
        margin-bottom: 6px;
    }

    /* layout for three-column sections */
    .three-col {
        display: grid;
        grid-template-columns: repeat(3, 1fr);
        gap: 12px;
    }

    /* profile badges */
    .badge {
        display: inline-block;
        padding: 3px 10px;
        border-radius: 999px;
        font-size: 12px;
        font-weight: 600;
        color: white;
        margin-bottom: 6px;
    }

    .badge-adhd { background: #F97316; }     /* orange */
    .badge-ell  { background: #10B981; }     /* green  */
    .badge-id   { background: #6366F1; }     /* indigo */

    /* text in cards */
    .card p {
        margin: 4px 0;
        line-height: 1.55;
        font-size: 16px;
        color: #111827;
        text-align: left;
    }

    /* code-like area for original solution */
    .solution-box {
        font-family: "Consolas", "Courier New", monospace;
        font-size: 15px;
        background: #F9FAFB;
        border-radius: 10px;
        padding: 12px 14px;
        white-space: normal;
        line-height: 1.5;
        color: #111827;
    }

    /* stack the three columns when the viewport is narrower than 1100px */
    @media (max-width: 1100px) {
        .three-col {
            grid-template-columns: 1fr;
        }
    }
    </style>
    """

    html = f"""
    <div class="adapt-container">

      <div class="adapt-header">
        📘 Adapted Math Problem #{problem_number}
      </div>

      <!-- Original Problem -->
      <div class="section-title">
        📌 Original Problem <span class="pill">from GSM8K</span>
      </div>
      <div class="card">
        <p>{clean_question}</p>
      </div>

      <!-- Adapted Problems -->
      <div class="section-title">
        ✏️ Adapted Problem Text for Each Learner Profile
      </div>
      <div class="helper-text">
        The math meaning and numbers stay the same. Only wording, clarity, and scaffolding are adjusted.
      </div>

      <div class="three-col">
        <div class="card">
          <span class="badge badge-adhd">ADHD Version</span>
          <p>{adhd_problem}</p>
        </div>
        <div class="card">
          <span class="badge badge-ell">ELL Version</span>
          <p>{ell_problem}</p>
        </div>
        <div class="card">
          <span class="badge badge-id">Intellectual Disability Version</span>
          <p>{id_problem}</p>
        </div>
      </div>

      <!-- Original GSM8K Solution -->
      <div class="section-title">
        📚 Original Dataset Solution (Teacher / GSM8K)
      </div>
      <div class="helper-text">
        This is the chain-of-thought style solution that comes with the GSM8K dataset.
      </div>
      <div class="card solution-box">
        {orig_answer_html}
      </div>

      <!-- Adapted Explanations -->
      <div class="section-title">
        💡 Adapted Solution Explanations for Each Profile
      </div>
      <div class="helper-text">
        All three explanations follow the same math steps and final answer, but the language is tailored to the learner.
      </div>

      <div class="three-col">
        <div class="card">
          <span class="badge badge-adhd">ADHD Explanation</span>
          <p>{adhd_expl_html}</p>
        </div>
        <div class="card">
          <span class="badge badge-ell">ELL Explanation</span>
          <p>{ell_expl_html}</p>
        </div>
        <div class="card">
          <span class="badge badge-id">ID Explanation</span>
          <p>{id_expl_html}</p>
        </div>
      </div>

    </div>
    """

    display(HTML(css + html))


# ---------------------------
# Widgets: choose problem + button
# ---------------------------
problem_input = widgets.BoundedIntText(
    value=1,
    min=1,
    max=len(df),
    description='Problem #',
    style={'description_width': '90px'},
    layout=widgets.Layout(width='220px')
)

run_button = widgets.Button(
    description='Generate Adaptations',
    button_style='primary',   # blue style
    layout=widgets.Layout(width='230px')
)

output_area = widgets.Output()

def on_run_clicked(b):
    with output_area:
        clear_output()
        run_adaptation(problem_input.value)

run_button.on_click(on_run_clicked)

controls = widgets.HBox([problem_input, run_button])

display(widgets.VBox([
    widgets.HTML(f"<h3 style='font-family:Segoe UI, Arial; margin-bottom:6px;'>🔍 Choose a problem (1–{len(df)}) to view its adapted versions:</h3>"),
    controls,
    output_area
]))



In [20]:
from IPython.display import HTML, display
import pandas as pd

# Select how many problems you want to display (e.g., 5)
n = 5
demo_df = df[["question", "answer"]].head(n).copy()

# Convert to HTML with custom styling
html = (
    demo_df.to_html(
        escape=False,
        index=False,
        justify="left"
    )
    .replace("<table", "<table style='width:100%; font-size:16px; border-collapse:collapse;'")
    .replace("<th>", "<th style='padding:12px; text-align:left; border-bottom:2px solid #444;'>")
    .replace("<td>", "<td style='padding:12px; text-align:left; vertical-align:top; border-bottom:1px solid #ddd;'>")
)

display(HTML(html))


question,answer
"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72
"Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?","Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.\nWorking 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10.\n#### 10"
"Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?","In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50.\nBetty's grandparents gave her 15 * 2 = $<<15*2=30>>30.\nThis means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more.\n#### 5"
"Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?","Maila read 12 x 2 = <<12*2=24>>24 pages today.\nSo she was able to read a total of 12 + 24 = <<12+24=36>>36 pages since yesterday.\nThere are 120 - 36 = <<120-36=84>>84 pages left to be read.\nSince she wants to read half of the remaining pages tomorrow, then she should read 84/2 = <<84/2=42>>42 pages.\n#### 42"
James writes a 3-page letter to 2 different friends twice a week. How many pages does he write a year?,He writes each friend 3*2=<<3*2=6>>6 pages a week\nSo he writes 6*2=<<6*2=12>>12 pages every week\nThat means he writes 12*52=<<12*52=624>>624 pages a year\n#### 624
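The GSM8K answers shown above mix three things: the worked steps, calculator annotations such as `<<48/2=24>>`, and the final answer after the `####` marker. Below is a minimal sketch of how we could separate them before showing a solution to a teacher. The helper name `split_gsm8k_answer` is hypothetical and is not part of the dataset's own tooling.

```python
import re

def split_gsm8k_answer(answer: str) -> tuple[str, str]:
    """Return (readable_steps, final_answer) from a GSM8K answer string."""
    # Drop calculator annotations such as <<48/2=24>>
    cleaned = re.sub(r"<<[^>]*>>", "", answer)
    # Everything after the '####' marker is the final answer
    steps, _, final = cleaned.partition("####")
    return steps.strip(), final.strip()

sample = (
    "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
    "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
    "#### 72"
)
steps, final = split_gsm8k_answer(sample)
print(steps)
print("Final answer:", final)
```

Keeping the final answer separate makes it easy to compare against whatever an adapted explanation ends with.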


### Step A – Add an Easier Elementary Dataset (ASDiv)

To get simpler, early-elementary word problems, we add a second dataset:

- **ASDiv (Academia Sinica Diverse Math Word Problems)** – a public research dataset with many
  easy one-step problems (add, subtract, simple multiply/divide).

**What I did before this step:**

1. Opened the GitHub repo: `https://github.com/chaochun/nlu-asdiv-dataset`
2. Went into the `dataset` folder.
3. Downloaded the file `ASDiv.xml` to my computer.

Now we upload that file into Colab so we can parse it and build a
DataFrame similar to GSM8K.


In [21]:
# Step A.1 – Upload ASDiv.xml into Colab

from google.colab import files

print("➡️ Please upload the file ASDiv.xml that you downloaded from GitHub.")
uploaded_asdiv = files.upload()

print("\n✅ Uploaded files:")
for name, info in uploaded_asdiv.items():
    print(f"- {name} ({len(info)} bytes)")


➡️ Please upload the file ASDiv.xml that you downloaded from GitHub.


Saving ASDiv.xml to ASDiv.xml

✅ Uploaded files:
- ASDiv.xml (947401 bytes)


## Step A.2 – Load ASDiv (Grades 1–6) and Focus on Grade 1–2 Problems

Now we switch to a second dataset: **ASDiv** (Academia Sinica Diverse Math Word Problems).

- Format (in `ASDiv.xml`):

  ```xml
  <Problem ID="nluds-0001" Grade="1" Source="http://www.k5learning.com">
      <Body>Seven red apples and two green apples are in the basket.</Body>
      <Question>How many apples are in the basket?</Question>
      <Solution-Type>Addition</Solution-Type>
      <Answer>9 (apples)</Answer>
      <Formula>7+2=9</Formula>
  </Problem>
  ```


In [24]:

import xml.etree.ElementTree as ET
import pandas as pd
from IPython.display import HTML, display

# ---------- Helper to strip XML namespaces ----------
def strip_ns(tag: str) -> str:
    """Remove XML namespace, keep the local tag name."""
    if "}" in tag:
        return tag.split("}", 1)[1]
    return tag

# ---------- 1. Parse ASDiv.xml ----------
xml_path = "ASDiv.xml"   # file you uploaded

tree = ET.parse(xml_path)
root = tree.getroot()

records = []

# Walk through ALL elements and pick those whose local tag name is "Problem"
for elem in root.iter():
    if strip_ns(elem.tag) == "Problem":
        rec = {}
        # Attributes (ID, Grade, Source, etc.)
        for attr in elem.attrib:
            rec[attr.lower()] = elem.attrib.get(attr, "")

        # Child elements (Body, Question, Answer, etc.)
        for child in elem:
            name = strip_ns(child.tag).lower().replace("-", "_")
            text = (child.text or "").strip()
            rec[name] = text

        records.append(rec)

df_asdiv = pd.DataFrame(records)

print(f"✅ Loaded {len(df_asdiv)} ASDiv problems")

if not df_asdiv.empty:
    print("\nGrades distribution:")
    display(df_asdiv["grade"].value_counts().sort_index())
else:
    print("❌ No problems found. Check that ASDiv.xml is the correct file.")

# ---------- 2. Focus on Grade 1–2 ----------
easy_df = df_asdiv[df_asdiv["grade"].isin(["1", "2"])].reset_index(drop=True)
print(f"\n✅ Grade 1–2 subset: {len(easy_df)} problems")

# Create combined text column for display.
# reindex() supplies an empty column if "body" or "question" is missing,
# so we never call .fillna() on a plain string default.
text_cols = easy_df.reindex(columns=["body", "question"]).fillna("")
easy_df["full_problem"] = (text_cols["body"] + " " + text_cols["question"]).str.strip()

preview = easy_df[["grade", "full_problem", "answer"]].head(5).copy()
preview.insert(0, "Problem #", range(1, len(preview) + 1))

# ---------- 3. Pretty HTML table ----------
style = """
<style>
.easy-table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Segoe UI', system-ui, -apple-system, sans-serif;
    font-size: 18px;  /* bigger font for easy reading */
    table-layout: fixed;
}

/* header */
.easy-table thead th {
    background: linear-gradient(90deg, #4F46E5, #06B6D4);
    color: white;
    padding: 10px;
    font-weight: 700;
    border: 1px solid #d0d0d0;
}

/* Problem # column */
.easy-table th:first-child,
.easy-table td:first-child {
    width: 80px;
    max-width: 80px;
    min-width: 80px;
    text-align: center;
    font-weight: 700;
    color: #1E293B;
    background: #E0ECFF;
}

/* Grade column */
.easy-table td:nth-child(2) {
    width: 80px;
    max-width: 80px;
    min-width: 80px;
    text-align: center;
    font-weight: 600;
    color: #2563EB;
    background: #F1F5F9;
}

/* Problem text */
.easy-table td:nth-child(3) {
    padding: 12px 16px;
    background: #F8FAFC;
    color: #111827;
    line-height: 1.6;
    word-wrap: break-word;
}

/* Answer column */
.easy-table td:nth-child(4) {
    padding: 12px 16px;
    background: #FFFFFF;
    color: #047857;
    font-weight: 600;
    white-space: normal;
}

/* row borders */
.easy-table td {
    border-bottom: 1px solid #e2e8f0;
}
</style>
"""

html = style + preview.to_html(
    classes="easy-table",
    escape=False,
    index=False
)

HTML(html)



✅ Loaded 2305 ASDiv problems

Grades distribution:


grade,count
1,195
2,340
3,808
4,301
5,146
6,515



✅ Grade 1–2 subset: 535 problems


Problem #,grade,full_problem,answer
1,1,Seven red apples and two green apples are in the basket. How many apples are in the basket?,9 (apples)
2,1,Ellen has six more balls than Marin. Marin has nine balls. How many balls does Ellen have?,15 (balls)
3,1,Janet has nine oranges and Sharon has seven oranges. How many oranges do Janet and Sharon have together?,16 (oranges)
4,1,Allan brought two balloons and Jake brought four balloons to the park. How many balloons did Allan and Jake have in the park?,6 (balloons)
5,1,Adam has five more apples than Jackie. Jackie has nine apples. How many apples does Adam have?,14 (apples)
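The `answer` column in the preview above stores the number and its unit together, for example `9 (apples)`. A minimal sketch of splitting the two is shown here, assuming answers follow the `number (unit)` pattern seen in these five rows. The helper `split_asdiv_answer` is hypothetical, and answers in any other format are returned as `(None, None)` so they can be reviewed by hand.

```python
import re

def split_asdiv_answer(answer: str):
    """Split an answer like '9 (apples)' into (value, unit)."""
    match = re.match(r"^\s*(-?\d+(?:\.\d+)?)\s*(?:\(([^)]*)\))?\s*$", answer)
    if match is None:
        return None, None  # unusual format: leave for manual review
    value = float(match.group(1))
    unit = match.group(2) or ""
    return value, unit

for raw in ["9 (apples)", "15 (balls)", "16 (oranges)"]:
    print(raw, "->", split_asdiv_answer(raw))
```

Having the numeric value on its own makes it possible to check later that an adapted explanation still ends at the same number.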


### Step 3 – Adapt an Easy (Grade 1–2) Word Problem for ADHD, ELL, and ID

In this step we:

1. Take one problem from the **Grade 1–2 ASDiv subset** (`easy_df`).
2. Send the *original problem + answer* to GPT-4o-mini.
3. Ask it to return:
   - A clear, **teacher-style step-by-step solution**.
   - An **ADHD-friendly** version of the problem and explanation.
   - An **ELL-friendly** version of the problem and explanation.
   - An **Intellectual Disability (ID)**-friendly version of the problem and explanation.
4. Display everything in a **single, colorful, easy-to-read layout**.

You can type a problem number (from **1** to **535**) and the notebook will show all the adaptations for that problem.
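Because the model's reply is parsed as JSON, it can help to confirm that every expected field actually came back before rendering the layout. The sketch below assumes the seven key names this notebook asks for in its prompt. `missing_fields` is a hypothetical helper, and the sample reply is made up for illustration.

```python
import json

EXPECTED_KEYS = [
    "teacher_solution",
    "adhd_problem", "adhd_explanation",
    "ell_problem", "ell_explanation",
    "id_problem", "id_explanation",
]

def missing_fields(raw_reply) -> list[str]:
    """Return the expected keys that are absent or empty in a JSON reply."""
    try:
        data = json.loads(raw_reply)
    except (json.JSONDecodeError, TypeError):
        return list(EXPECTED_KEYS)  # nothing usable came back
    if not isinstance(data, dict):
        return list(EXPECTED_KEYS)
    return [
        key for key in EXPECTED_KEYS
        if not isinstance(data.get(key), str) or not data[key].strip()
    ]

# Made-up partial reply: only two of the seven fields are present
reply = json.dumps({
    "teacher_solution": "1. Add 7 and 2.",
    "adhd_problem": "7 red apples. 2 green apples. How many apples?",
})
print(missing_fields(reply))
```

A non-empty result is a signal to retry the request or to show the original problem text in place of the missing adaptation.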


In [25]:
import json
from IPython.display import HTML, display

# ---------- 1. Function: ask GPT to adapt the problem ----------

def adapt_asdiv_with_gpt(problem_text: str, answer_text: str, grade: str = "1"):
    """
    Uses GPT-4o-mini to generate:
      - teacher_solution
      - ADHD / ELL / ID adapted problems & explanations.

    Returns a dict with keys:
      teacher_solution,
      adhd_problem, adhd_explanation,
      ell_problem, ell_explanation,
      id_problem, id_explanation
    """
    system_msg = (
        "You are an expert elementary math teacher and special-education specialist. "
        "You write clear, kind, step-by-step explanations for young students in grades 1–2. "
        "You ALWAYS keep the math meaning and numbers the same as the original problem."
    )

    user_prompt = f"""
Here is an elementary math word problem and its correct answer.

GRADE: {grade}
PROBLEM:
\"\"\"{problem_text}\"\"\"

CORRECT ANSWER (do not change the result, but you may restate it in words):
\"\"\"{answer_text}\"\"\"

Please do the following:

1. **Teacher Solution (Step-by-step)**
   - Write a clear, friendly, step-by-step solution that a teacher might show on the board.
   - Use simple sentences, and include the key math operations.

2. **ADHD Version**
   - Rewrite the PROBLEM text for a student with ADHD.
   - Use short sentences, clear structure, and keep the student focused on the important numbers.
   - Then give a short, step-by-step EXPLANATION that matches the original answer.

3. **ELL Version**
   - Rewrite the PROBLEM for an English Language Learner.
   - Use simple vocabulary and short sentences.
   - Avoid idioms or confusing phrasing.
   - Then give a short explanation that is easy to read and still leads to the same final answer.

4. **Intellectual Disability (ID) Version**
   - Rewrite the PROBLEM with very simple language.
   - Break ideas into small chunks.
   - Then give a step-by-step explanation with **very small steps** (1 idea per sentence).

IMPORTANT RULES:
- Do NOT change the math situation or the numbers.
- Do NOT introduce pictures in the text; just describe them with words if needed.
- The final numeric answer must agree with the given correct answer.

Return your result as **valid JSON** with this exact structure:

{{
  "teacher_solution": "...",
  "adhd_problem": "...",
  "adhd_explanation": "...",
  "ell_problem": "...",
  "ell_explanation": "...",
  "id_problem": "...",
  "id_explanation": "..."
}}
"""

    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.4,
        max_tokens=900,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": system_msg},
            {"role": "user", "content": user_prompt},
        ],
    )

    content = resp.choices[0].message.content
    try:
        data = json.loads(content)
    except (json.JSONDecodeError, TypeError):
        # Fallback: wrap whole content as teacher_solution if parsing fails.
        # TypeError covers the case where the API returns no content (None).
        data = {
            "teacher_solution": content or "",
            "adhd_problem": problem_text,
            "adhd_explanation": "",
            "ell_problem": problem_text,
            "ell_explanation": "",
            "id_problem": problem_text,
            "id_explanation": "",
        }
    return data


# ---------- 2. Function: pretty HTML layout for one problem ----------

def show_adapted_asdiv_problem(problem_number: int):
    """
    problem_number: 1-based index into easy_df (Grade 1–2 subset).
    """
    n = len(easy_df)
    if problem_number < 1 or problem_number > n:
        raise ValueError(f"Problem number must be between 1 and {n}")

    row = easy_df.iloc[problem_number - 1]

    grade = row.get("grade", "")
    original_problem = row.get("full_problem", "").strip()
    original_answer = row.get("answer", "").strip()

    # Call GPT to get adaptations
    adaptations = adapt_asdiv_with_gpt(
        problem_text=original_problem,
        answer_text=original_answer,
        grade=grade,
    )

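    # Escape model output so characters like "<" or "&" display as text
    # instead of being interpreted as HTML. The import is local because the
    # name `html` is reused below for the page string.
    from html import escape

    teacher_solution = escape(str(adaptations.get("teacher_solution", "")).strip())
    adhd_problem = escape(str(adaptations.get("adhd_problem", "")).strip())
    adhd_explanation = escape(str(adaptations.get("adhd_explanation", "")).strip())
    ell_problem = escape(str(adaptations.get("ell_problem", "")).strip())
    ell_explanation = escape(str(adaptations.get("ell_explanation", "")).strip())
    id_problem = escape(str(adaptations.get("id_problem", "")).strip())
    id_explanation = escape(str(adaptations.get("id_explanation", "")).strip())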

    # ---------- HTML + CSS ----------
    css = """
    <style>
    .asdiv-wrapper {
        font-family: 'Segoe UI', system-ui, -apple-system, sans-serif;
        color: #0f172a;
        line-height: 1.6;
        font-size: 18px;
    }
    .asdiv-header {
        background: linear-gradient(90deg, #4F46E5, #06B6D4);
        color: white;
        padding: 14px 20px;
        border-radius: 12px 12px 0 0;
        font-size: 22px;
        font-weight: 700;
        display: flex;
        justify-content: space-between;
        align-items: center;
    }
    .asdiv-header span.grade-pill {
        background: rgba(15,23,42,0.25);
        padding: 6px 12px;
        border-radius: 999px;
        font-size: 14px;
    }
    .asdiv-section {
        background: #ffffff;
        border-radius: 0 0 12px 12px;
        box-shadow: 0 10px 25px rgba(15,23,42,0.08);
        margin-bottom: 24px;
        padding: 18px 22px;
    }
    .asdiv-section-title {
        font-size: 19px;
        font-weight: 700;
        margin-bottom: 10px;
        display: flex;
        align-items: center;
        gap: 8px;
    }
    .badge-original {
        background: #DBEAFE;
        color: #1D4ED8;
        padding: 2px 10px;
        border-radius: 999px;
        font-size: 13px;
        font-weight: 600;
    }
    .badge-teacher {
        background: #FEF3C7;
        color: #92400E;
        padding: 2px 10px;
        border-radius: 999px;
        font-size: 13px;
        font-weight: 600;
    }
    .badge-adhd {
        background: #FEE2E2;
        color: #B91C1C;
        padding: 2px 10px;
        border-radius: 999px;
        font-size: 13px;
        font-weight: 600;
    }
    .badge-ell {
        background: #E0F2FE;
        color: #0369A1;
        padding: 2px 10px;
        border-radius: 999px;
        font-size: 13px;
        font-weight: 600;
    }
    .badge-id {
        background: #E0F2F1;
        color: #047857;
        padding: 2px 10px;
        border-radius: 999px;
        font-size: 13px;
        font-weight: 600;
    }

    .asdiv-grid-3 {
        display: grid;
        grid-template-columns: repeat(3, minmax(0, 1fr));
        gap: 18px;
        margin-top: 4px;
    }
    .asdiv-card {
        background: #f9fafb;
        border-radius: 12px;
        padding: 14px 16px;
        border: 1px solid #e5e7eb;
    }
    .asdiv-card-title {
        font-weight: 700;
        margin-bottom: 6px;
        font-size: 17px;
        display: flex;
        align-items: center;
        gap: 6px;
    }
    .asdiv-card-body {
        font-size: 17px;
        white-space: pre-wrap;
    }
    .asdiv-answer-pill {
        display: inline-block;
        margin-top: 8px;
        padding: 4px 10px;
        border-radius: 999px;
        background: #DCFCE7;
        color: #166534;
        font-size: 14px;
        font-weight: 600;
    }
    .asdiv-subnote {
        font-size: 14px;
        color: #64748b;
        margin-top: 6px;
    }
    </style>
    """

    html = f"""
    {css}
    <div class="asdiv-wrapper">
      <div class="asdiv-header">
        <div>üìò Adapted Grade 1‚Äì2 Problem #{problem_number}</div>
        <span class="grade-pill">Grade {grade}</span>
      </div>

      <!-- Original problem + answer -->
      <div class="asdiv-section">
        <div class="asdiv-section-title">
          <span>üßæ Original Problem</span>
          <span class="badge-original">from ASDiv</span>
        </div>
        <div>{original_problem}</div>
        <div class="asdiv-answer-pill">Correct answer: {original_answer}</div>
      </div>

      <!-- Teacher solution -->
      <div class="asdiv-section">
        <div class="asdiv-section-title">
          <span>üßë‚Äçüè´ Teacher Solution (Step-by-step)</span>
          <span class="badge-teacher">generated by GPT</span>
        </div>
        <div style="white-space: pre-wrap;">{teacher_solution}</div>
      </div>

      <!-- Adapted problem texts -->
      <div class="asdiv-section">
        <div class="asdiv-section-title">
          <span>üß© Adapted Problem Texts for Each Learner Profile</span>
        </div>
        <div class="asdiv-subnote">
          The math meaning and numbers are kept the same. Only wording, structure, and support level change.
        </div>
        <div class="asdiv-grid-3">
          <div class="asdiv-card">
            <div class="asdiv-card-title">⚡ ADHD Version <span class="badge-adhd">ADHD</span></div>
            <div class="asdiv-card-body">{adhd_problem}</div>
          </div>
          <div class="asdiv-card">
            <div class="asdiv-card-title">🌎 ELL Version <span class="badge-ell">ELL</span></div>
            <div class="asdiv-card-body">{ell_problem}</div>
          </div>
          <div class="asdiv-card">
            <div class="asdiv-card-title">🧠 Intellectual Disability Version <span class="badge-id">ID</span></div>
            <div class="asdiv-card-body">{id_problem}</div>
          </div>
        </div>
      </div>

      <!-- Adapted explanations -->
      <div class="asdiv-section">
        <div class="asdiv-section-title">
          <span>üßÆ Adapted Solution Explanations</span>
        </div>
        <div class="asdiv-subnote">
          All three explanations follow the same math steps and final answer, but language and scaffolding are tailored.
        </div>
        <div class="asdiv-grid-3">
          <div class="asdiv-card">
            <div class="asdiv-card-title">⚡ ADHD Explanation <span class="badge-adhd">ADHD</span></div>
            <div class="asdiv-card-body">{adhd_explanation}</div>
          </div>
          <div class="asdiv-card">
            <div class="asdiv-card-title">🌎 ELL Explanation <span class="badge-ell">ELL</span></div>
            <div class="asdiv-card-body">{ell_explanation}</div>
          </div>
          <div class="asdiv-card">
            <div class="asdiv-card-title">🧠 ID Explanation <span class="badge-id">ID</span></div>
            <div class="asdiv-card-body">{id_explanation}</div>
          </div>
        </div>
      </div>
    </div>
    """

    display(HTML(html))


# ---------- 3. Ask the user which problem to show ----------

total_easy = len(easy_df)
print(f"Grade 1–2 subset has {total_easy} problems.")

try:
    user_choice = int(input(f"Enter a problem number between 1 and {total_easy}: ").strip())
except ValueError:
    user_choice = 1

# Clamp to range
user_choice = max(1, min(total_easy, user_choice))

print(f"\n🔍 Showing Grade 1–2 problem #{user_choice}...\n")
show_adapted_asdiv_problem(user_choice)


Grade 1–2 subset has 535 problems.
Enter a problem number between 1 and 535: 55

🔍 Showing Grade 1–2 problem #55...
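The prompt tells the model not to change the numbers, but nothing in the notebook verifies that it obeyed. Here is a rough sketch of such a check, which compares the numbers found in the original and adapted texts. It is only a heuristic: the word list is a small assumption covering zero to twelve, and a word like "one" used as a pronoun would be miscounted. `numbers_in` and `same_numbers` are hypothetical helper names.

```python
import re

# Assumed small vocabulary of number words common in Grade 1-2 problems
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10", "eleven": "11", "twelve": "12",
}

def numbers_in(text: str) -> list[str]:
    """Return all numbers in a text (digits or number words), sorted."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower())
    return sorted(
        NUMBER_WORDS.get(tok, tok)
        for tok in tokens
        if tok in NUMBER_WORDS or tok[0].isdigit()
    )

def same_numbers(original: str, adapted: str) -> bool:
    """True when both texts mention exactly the same numbers."""
    return numbers_in(original) == numbers_in(adapted)

original = "Seven red apples and two green apples are in the basket."
kept = "There are 7 red apples. There are 2 green apples."
changed = "There are 7 red apples. There are 3 green apples."
print(same_numbers(original, kept))
print(same_numbers(original, changed))
```

A `False` result does not prove the adaptation is wrong, but it flags the problem for a teacher to review before use.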



In [27]:
# 🔁 1. Small GPT helper that returns plain text
def gpt_text(prompt, temperature=0.3, max_tokens=300):
    """Send one user prompt to the chat model and return the reply as stripped text."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # or gpt-4o if you prefer
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens,
    )
    # message.content can be None, so fall back to an empty string
    return (resp.choices[0].message.content or "").strip()


# 🔁 2. Adaptation functions (teacher + profiles)
def make_teacher_solution(problem_text, answer_text, grade_label):
    prompt = f"""
You are a kind Grade {grade_label} math teacher.

Write a short, clear, STEP-BY-STEP explanation for this word problem.
- Use very simple sentences.
- Start each step on its OWN LINE with a number like: "1. ...", "2. ...", etc.
- Do NOT put multiple steps on the same line.
- Keep the math consistent with the correct answer.
- Do NOT show any LaTeX or special symbols, just normal text.

Problem:
{problem_text}

Correct answer (for your reference only): {answer_text}
"""
    return gpt_text(prompt, temperature=0.25, max_tokens=350)


def make_adapted_problem(problem_text, profile, grade_label):
    prompt = f"""
You are adapting a Grade {grade_label} math word problem for a student with this profile:
{profile}

Rewrite the problem:
- KEEP all numbers and math the SAME.
- KEEP the question type the SAME.
- Use 2–4 short, simple sentences.
- Put each sentence on its OWN LINE (use line breaks).
- Do NOT include solution steps.

Original problem:
{problem_text}
"""
    return gpt_text(prompt, temperature=0.4, max_tokens=220)


def make_adapted_explanation(problem_text, answer_text, profile, grade_label):
    prompt = f"""
You are explaining a Grade {grade_label} math word problem to a student with this profile:
{profile}

Write a friendly, step-by-step explanation:
- Use 4–7 short steps.
- Start each step on its OWN LINE with "1.", "2.", "3.", etc.
- Do NOT put multiple steps on the same line.
- Use very simple language.
- End with the final answer.

Problem:
{problem_text}

Correct answer: {answer_text}
"""
    return gpt_text(prompt, temperature=0.35, max_tokens=380)


# 🔁 3. Small helper to escape text and preserve line breaks inside HTML
def to_html(text: str) -> str:
    if text is None:
        return ""
    return (
        str(text)
        .strip()
        .replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\n", "<br>")
    )


# 🔁 4. Main renderer for one Grade 1–2 problem
from IPython.display import HTML, display

def show_grade_12_problem(problem_num: int):
    """
    Show one adapted Grade 1–2 ASDiv problem with:
    - Original problem + answer
    - Teacher solution
    - Adapted problem texts (ADHD, ELL, ID)
    - Adapted explanations (ADHD, ELL, ID)
    """
    if problem_num < 1 or problem_num > len(easy_df):
        print(f"❌ Please choose a number between 1 and {len(easy_df)}.")
        return

    row = easy_df.iloc[problem_num - 1]
    grade_label = row["grade"]
    full_problem = row["full_problem"]
    answer_text = row["answer"]

    print(f"🔎 Showing Grade 1–2 problem #{problem_num} (Grade {grade_label})...\n")

    # ---- Generate all GPT texts ----
    teacher_sol = make_teacher_solution(full_problem, answer_text, grade_label)

    adhd_problem = make_adapted_problem(full_problem, "ADHD (needs very clear, short instructions and focus on key numbers).", grade_label)
    ell_problem  = make_adapted_problem(full_problem, "English Language Learner (simple vocabulary, clear sentence structure).", grade_label)
    id_problem   = make_adapted_problem(full_problem, "Intellectual Disability (very short sentences, lots of clarity and repetition).", grade_label)

    adhd_expl = make_adapted_explanation(full_problem, answer_text, "ADHD", grade_label)
    ell_expl  = make_adapted_explanation(full_problem, answer_text, "English Language Learner", grade_label)
    id_expl   = make_adapted_explanation(full_problem, answer_text, "Intellectual Disability", grade_label)

    # ---- HTML + CSS (bigger fonts, clean layout, line breaks preserved) ----
    css = """
    <style>
    .grade-card {
        font-family: system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
        font-size: 18px;
        color: #0f172a;
        line-height: 1.6;
    }
    .grade-header {
        background: linear-gradient(90deg, #1d4ed8, #22c55e);
        color: white;
        padding: 12px 20px;
        border-radius: 10px 10px 0 0;
        font-weight: 700;
        display: flex;
        justify-content: space-between;
        align-items: center;
    }
    .grade-badge {
        background: rgba(255,255,255,0.18);
        padding: 4px 10px;
        border-radius: 999px;
        font-size: 14px;
    }
    .grade-section {
        background: #ffffff;
        border: 1px solid #e5e7eb;
        border-top: none;
        padding: 16px 20px;
    }
    .grade-section + .grade-section {
        border-top: 1px solid #e5e7eb;
    }
    .sec-title {
        font-weight: 700;
        margin-bottom: 6px;
    }
    .sec-subtitle {
        font-size: 13px;
        color: #6b7280;
        margin-bottom: 10px;
    }
    .pill {
        display: inline-flex;
        align-items: center;
        gap: 6px;
        padding: 3px 10px;
        border-radius: 999px;
        font-size: 12px;
        background: #e5f2ff;
        color: #1d4ed8;
        margin-left: 8px;
    }
    .pill span {
        font-weight: 600;
    }
    .card-row {
        display: grid;
        grid-template-columns: repeat(3, minmax(0, 1fr));
        gap: 14px;
        margin-top: 8px;
    }
    .mini-card {
        background: #f8fafc;
        border-radius: 10px;
        padding: 10px 12px;
        border: 1px solid #e2e8f0;
        font-size: 16px;
    }
    .mini-title {
        font-weight: 700;
        font-size: 14px;
        margin-bottom: 4px;
        display: flex;
        align-items: center;
        gap: 6px;
    }
    .tag-adhd { color: #ea580c; }
    .tag-ell  { color: #16a34a; }
    .tag-id   { color: #7c3aed; }
    .answer-pill {
        display: inline-block;
        margin-top: 6px;
        padding: 2px 8px;
        border-radius: 999px;
        font-size: 13px;
        background: #ecfdf5;
        color: #15803d;
        font-weight: 600;
    }
    </style>
    """

    html = f"""
    <div class="grade-card">

      <div class="grade-header">
        <div>üìò Adapted Grade 1‚Äì2 Problem #{problem_num}</div>
        <div class="grade-badge">Grade {grade_label}</div>
      </div>

      <div class="grade-section">
        <div class="sec-title">📖 Original Problem <span class="pill"><span>from</span> ASDiv</span></div>
        <div>{to_html(full_problem)}</div>
        <div class="answer-pill">Correct answer: {to_html(answer_text)}</div>
      </div>

      <div class="grade-section">
        <div class="sec-title">👩‍🏫 Teacher Solution (Step-by-step) <span class="pill" style="background:#fef3c7;color:#92400e;">generated by GPT</span></div>
        <div>{to_html(teacher_sol)}</div>
      </div>

      <div class="grade-section">
        <div class="sec-title">🧩 Adapted Problem Texts for Each Learner Profile</div>
        <div class="sec-subtitle">The math meaning and numbers are kept the same. Only wording, structure, and support level change.</div>

        <div class="card-row">
          <div class="mini-card">
            <div class="mini-title"><span class="tag-adhd">⚡ ADHD Version</span></div>
            <div>{to_html(adhd_problem)}</div>
          </div>

          <div class="mini-card">
            <div class="mini-title"><span class="tag-ell">🟢 ELL Version</span></div>
            <div>{to_html(ell_problem)}</div>
          </div>

          <div class="mini-card">
            <div class="mini-title"><span class="tag-id">🧠 Intellectual Disability Version</span></div>
            <div>{to_html(id_problem)}</div>
          </div>
        </div>
      </div>

      <div class="grade-section">
        <div class="sec-title">📝 Adapted Solution Explanations</div>
        <div class="sec-subtitle">All three explanations solve the same problem and reach the same final answer, but the language is tailored to each learner.</div>

        <div class="card-row">
          <div class="mini-card">
            <div class="mini-title"><span class="tag-adhd">⚡ ADHD Explanation</span></div>
            <div>{to_html(adhd_expl)}</div>
          </div>

          <div class="mini-card">
            <div class="mini-title"><span class="tag-ell">🟢 ELL Explanation</span></div>
            <div>{to_html(ell_expl)}</div>
          </div>

          <div class="mini-card">
            <div class="mini-title"><span class="tag-id">🧠 ID Explanation</span></div>
            <div>{to_html(id_expl)}</div>
          </div>
        </div>
      </div>

    </div>
    """

    display(HTML(css + html))


# 🔁 5. Ask which problem to show
print(f"Grade 1–2 subset has {len(easy_df)} problems.")
try:
    choice = int(input(f"Enter a problem number between 1 and {len(easy_df)}: ").strip())
except ValueError:
    choice = 1  # fall back to the first problem on non-numeric input
show_grade_12_problem(choice)


Grade 1–2 subset has 535 problems.
Enter a problem number between 1 and 535: 530
🔎 Showing Grade 1–2 problem #530 (Grade 2)...

