# Phase 4 — Integrating Generative AI  
### SWE485 — Student Performance Predictor  

**Simulation Output Version (No API Required)**

This notebook completes the required outcomes of **Phase 4**:

- Two prompt templates  
- Simulation of Generative-AI-like outputs  
- Template comparison and quality analysis  
- Clear justification for the final selected template  
- Compatibility with the Phase 3 performance clusters  

---


In [13]:
import pandas as pd

# Load dataset (make sure StudentsPerformance.csv is uploaded to Colab)
df = pd.read_csv('/content/StudentsPerformance.csv')

df.head()


|  | gender | race/ethnicity | parental level of education | lunch | test preparation course | math score | reading score | writing score |
|---|---|---|---|---|---|---|---|---|
| 0 | female | group B | bachelor's degree | standard | none | 72 | 72 | 74 |
| 1 | female | group C | some college | standard | completed | 69 | 90 | 88 |
| 2 | female | group B | master's degree | standard | none | 90 | 95 | 93 |
| 3 | male | group A | associate's degree | free/reduced | none | 47 | 57 | 44 |
| 4 | male | group C | some college | standard | none | 76 | 78 | 75 |


## Step 1 — Reuse Clusters from Phase 3  

We define simple performance clusters based on **math score**:

- High Performance → math score ≥ 80  
- Moderate Performance → 60 ≤ math score < 80  
- Low Performance → math score < 60  

Then we map each cluster ID to a descriptive label.


In [15]:
def assign_cluster(score):
    if score >= 80:
        return 0  # High Performance
    elif score >= 60:
        return 1  # Moderate Performance
    else:
        return 2  # Low Performance

# Apply clustering based on math score
df['cluster'] = df['math score'].apply(assign_cluster)

cluster_labels = {
    0: 'High Performance',
    1: 'Moderate Performance',
    2: 'Low Performance'
}

df['cluster_label'] = df['cluster'].map(cluster_labels)

df[['math score', 'reading score', 'writing score', 'cluster_label']].head()


|  | math score | reading score | writing score | cluster_label |
|---|---|---|---|---|
| 0 | 72 | 72 | 74 | Moderate Performance |
| 1 | 69 | 90 | 88 | Moderate Performance |
| 2 | 90 | 95 | 93 | High Performance |
| 3 | 47 | 57 | 44 | Low Performance |
| 4 | 76 | 78 | 75 | Moderate Performance |
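
The threshold rule above can also be written without a row-by-row function call. The following is a minimal sketch, assuming the same cut-offs (60 and 80). `label_by_math_score` is a hypothetical helper name, and the small DataFrame only stands in for `StudentsPerformance.csv`.

```python
import pandas as pd

def label_by_math_score(scores: pd.Series) -> pd.Series:
    """Bin math scores into three groups (assumed cut-offs: 60 and 80)."""
    return pd.cut(
        scores,
        # right=False makes each bin closed on the left: [-inf, 60), [60, 80), [80, inf)
        bins=[float("-inf"), 60, 80, float("inf")],
        labels=["Low Performance", "Moderate Performance", "High Performance"],
        right=False,
    ).astype(str)

# Illustrative scores, including the boundary values 60 and 80
sample = pd.DataFrame({"math score": [72, 69, 90, 47, 76, 60, 80]})
sample["cluster_label"] = label_by_math_score(sample["math score"])

# Count how many students fall into each group
group_sizes = sample["cluster_label"].value_counts()
print(sample)
print(group_sizes)
```

Checking the boundary scores (60 and 80) is a quick way to confirm that the bins agree with the `>=` comparisons in `assign_cluster`.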


## Step 2 — Define Two Prompt Templates  

We create **two different prompt templates** that would be sent to a Generative AI model:

- **Template A:** General motivational recommendations based on the performance group.  
- **Template B:** A more detailed weekly study plan that uses actual math/reading/writing scores.


In [17]:
def prompt_template_a(row):
    return f"General study recommendations for a student in group: {row['cluster_label']}."

def prompt_template_b(row):
    return (
        f"Weekly plan based on > Math:{row['math score']} "
        f"Reading:{row['reading score']} "
        f"Writing:{row['writing score']} "
        f"group:{row['cluster_label']}"
    )

# Show example prompts for the first 3 students
for idx, row in df.head(3).iterrows():
    print("Student index:", idx)
    print("Template A:", prompt_template_a(row))
    print("Template B:", prompt_template_b(row))
    print("-" * 70)


Student index: 0
Template A: General study recommendations for a student in group: Moderate Performance.
Template B: Weekly plan based on > Math:72 Reading:72 Writing:74 group:Moderate Performance
----------------------------------------------------------------------
Student index: 1
Template A: General study recommendations for a student in group: Moderate Performance.
Template B: Weekly plan based on > Math:69 Reading:90 Writing:88 group:Moderate Performance
----------------------------------------------------------------------
Student index: 2
Template A: General study recommendations for a student in group: High Performance.
Template B: Weekly plan based on > Math:90 Reading:95 Writing:93 group:High Performance
----------------------------------------------------------------------
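
One way to see the difference in personalization is to count how many distinct prompts each template produces. The sketch below is illustrative only: it re-declares the two template functions and uses the first three students shown above, and the `students`, `prompts`, and `distinct` names are assumptions made for this example.

```python
import pandas as pd

# First three students from the dataset preview, with their cluster labels
students = pd.DataFrame({
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
    "cluster_label": [
        "Moderate Performance",
        "Moderate Performance",
        "High Performance",
    ],
})

def prompt_template_a(row):
    return f"General study recommendations for a student in group: {row['cluster_label']}."

def prompt_template_b(row):
    return (
        f"Weekly plan based on > Math:{row['math score']} "
        f"Reading:{row['reading score']} "
        f"Writing:{row['writing score']} "
        f"group:{row['cluster_label']}"
    )

# Build one prompt per student for each template
prompts = pd.DataFrame({
    "Template A": students.apply(prompt_template_a, axis=1),
    "Template B": students.apply(prompt_template_b, axis=1),
})

# Number of distinct prompts per template
distinct = prompts.nunique()
print(distinct)
```

Template A depends only on the cluster label, so it can never yield more distinct prompts than there are clusters, while Template B varies with each student's scores.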


## Step 3 — Simulated Generative AI Outputs  

Instead of calling a real Generative AI API, we **simulate** realistic responses for each performance cluster:

- High Performance  
- Moderate Performance  
- Low Performance  

The simulated outputs represent what a Generative AI model (e.g., GPT) *could* return.


In [18]:
simulated_results = [
    {
        "cluster": "High Performance",
        "Template A": "• Continue challenging tasks\n• Maintain consistency\n• Join enrichment programs",
        "Template B": "• Math: Advanced challenges 4 days/week\n• Reading: Deep analysis twice/week\n• Writing: Weekly essay practice"
    },
    {
        "cluster": "Moderate Performance",
        "Template A": "• Improve time management\n• Attend tutoring sessions\n• Daily organized practice",
        "Template B": "• Math: Basics revision daily\n• Reading: Summary exercises\n• Writing: Guided writing tasks"
    },
    {
        "cluster": "Low Performance",
        "Template A": "• Ask for teacher help\n• Bite-size study sessions\n• Build confidence",
        "Template B": "• Math: Remedial practice daily\n• Reading: Guided reading daily\n• Writing: Grammar foundations"
    }
]

simulated_df = pd.DataFrame(simulated_results)
simulated_df


|  | cluster | Template A | Template B |
|---|---|---|---|
| 0 | High Performance | • Continue challenging tasks\n• Maintain consi... | • Math: Advanced challenges 4 days/week\n• Rea... |
| 1 | Moderate Performance | • Improve time management\n• Attend tutoring s... | • Math: Basics revision daily\n• Reading: Summ... |
| 2 | Low Performance | • Ask for teacher help\n• Bite-size study sess... | • Math: Remedial practice daily\n• Reading: Gu... |
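
Because each simulated response is keyed by cluster, it can be attached to individual students with a join on the cluster label. The following is a minimal sketch, assuming the Template B texts from the cell above. The `students` rows and the `with_plan` name are illustrative, not part of the original notebook.

```python
import pandas as pd

# Simulated Template B responses, one per performance cluster
simulated_df = pd.DataFrame([
    {
        "cluster": "High Performance",
        "Template B": "• Math: Advanced challenges 4 days/week\n• Reading: Deep analysis twice/week\n• Writing: Weekly essay practice",
    },
    {
        "cluster": "Moderate Performance",
        "Template B": "• Math: Basics revision daily\n• Reading: Summary exercises\n• Writing: Guided writing tasks",
    },
    {
        "cluster": "Low Performance",
        "Template B": "• Math: Remedial practice daily\n• Reading: Guided reading daily\n• Writing: Grammar foundations",
    },
])

# A few students with cluster labels already assigned (illustrative rows)
students = pd.DataFrame({
    "math score": [72, 69, 90, 47],
    "cluster_label": [
        "Moderate Performance",
        "Moderate Performance",
        "High Performance",
        "Low Performance",
    ],
})

# A left join keeps every student and attaches the response for their cluster
with_plan = students.merge(
    simulated_df, left_on="cluster_label", right_on="cluster", how="left"
).drop(columns="cluster")

# print() renders the embedded newlines as separate bullet lines
print(with_plan.loc[2, "Template B"])
```

In this simulation, students in the same cluster receive the same text. A real model call with Template B could vary the response with each student's scores.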


## Step 4 — Template Comparison & Analysis  

### Template A  
- Simple and motivational  
- Easy to generate  
- Good for quick, general advice  

**Limitations:**  
- Not highly personalized  
- Does not use the student's exact scores  

---

### Template B  
- Uses actual **math, reading, and writing scores**  
- Generates **specific, actionable steps** (e.g., how many days per week)  
- Better at guiding **Moderate** and **Low** performance students toward improvement  

**Conclusion:**  
Template B is the **stronger** template: it is more detailed, more personalized, and more relevant to the project goal  
(*improving student performance through concrete study plans*).
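
The qualitative comparison can be backed by a few simple text measurements. The sketch below is one possible heuristic, not the method used in this notebook: it counts bullets, whitespace-separated tokens, and how many of the three subjects each simulated output names. `describe`, `SUBJECTS`, `metrics`, and `summary` are hypothetical names.

```python
import pandas as pd

# Simulated outputs from Step 3
simulated_results = [
    {
        "cluster": "High Performance",
        "Template A": "• Continue challenging tasks\n• Maintain consistency\n• Join enrichment programs",
        "Template B": "• Math: Advanced challenges 4 days/week\n• Reading: Deep analysis twice/week\n• Writing: Weekly essay practice",
    },
    {
        "cluster": "Moderate Performance",
        "Template A": "• Improve time management\n• Attend tutoring sessions\n• Daily organized practice",
        "Template B": "• Math: Basics revision daily\n• Reading: Summary exercises\n• Writing: Guided writing tasks",
    },
    {
        "cluster": "Low Performance",
        "Template A": "• Ask for teacher help\n• Bite-size study sessions\n• Build confidence",
        "Template B": "• Math: Remedial practice daily\n• Reading: Guided reading daily\n• Writing: Grammar foundations",
    },
]

SUBJECTS = ("math", "reading", "writing")

def describe(text: str) -> dict:
    """Return simple, assumption-laden text measurements for one output."""
    lowered = text.lower()
    return {
        # Non-empty lines; each line is one bullet in the simulated outputs
        "bullets": sum(1 for line in text.splitlines() if line.strip()),
        # Whitespace-separated tokens (bullet markers are counted too)
        "words": len(text.split()),
        # How many of the three subjects are mentioned by name
        "subjects_named": sum(subject in lowered for subject in SUBJECTS),
    }

rows = []
for result in simulated_results:
    for template in ("Template A", "Template B"):
        rows.append({
            "cluster": result["cluster"],
            "template": template,
            **describe(result[template]),
        })

metrics = pd.DataFrame(rows)

# Average each measurement per template
summary = metrics.groupby("template")[["bullets", "words", "subjects_named"]].mean()
print(summary)
```

These counts only describe the simulated texts. They do not measure whether the advice would actually help a student.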


## Step 5 — Final Selected Template  

- **Chosen Template:** Template B — *Subject-based Weekly Plan*  
- **Reason:**  
  - More personalized (uses real scores)  
  - Provides concrete, weekly actions for each subject  
  - Better supports the overall objective of the system:  
    > Helping students improve their performance with clear, tailored advice  

