# Experiment 2 — Add Contextual Features

**Goal:** Predict `final_grade` using academic, behavioral **AND** contextual features.  
**New features added:** `internet_access`, `travel_time`, `extra_activities`  

We test whether environmental/contextual factors improve prediction compared to Experiment 1.  
This is a **controlled feature experiment**: same models, same data split, more features.

In [None]:
# --- Imports ---
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, ConfusionMatrixDisplay, f1_score

import warnings
# Note: this silences every warning, including scikit-learn's ConvergenceWarning.
warnings.filterwarnings('ignore')

print('All imports loaded successfully!')

## Step 1 — Load Cleaned Data
We load the same preprocessed train and test CSVs used in Experiment 1.

In [None]:
# Load the cleaned datasets
train_df = pd.read_csv('../datasets/train_cleaned.csv')
test_df  = pd.read_csv('../datasets/test_cleaned.csv')

print('Training data shape :', train_df.shape)
print('Test data shape     :', test_df.shape)
print('\nColumns available   :', list(train_df.columns))
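
Before selecting features, it can help to confirm that both files actually contain every column the experiment needs. The sketch below is a minimal illustration on toy frames; the helper name `missing_columns` and the toy values are assumptions made for this example, not part of the project code.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that are absent from df, in order."""
    return [col for col in required if col not in df.columns]

# Toy frames only (hypothetical values, not read from the cleaned CSVs).
toy_train = pd.DataFrame({'study_hours': [2.0], 'internet_access': [1], 'final_grade': [3]})
toy_test = pd.DataFrame({'study_hours': [1.5], 'final_grade': [2]})

required = ['study_hours', 'internet_access', 'final_grade']
print('Missing in train:', missing_columns(toy_train, required))
print('Missing in test :', missing_columns(toy_test, required))
```

In the notebook, the same helper could be called on `train_df` and `test_df` with the full feature list plus `'final_grade'`.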

## Step 2 — Select Features (Academic + Behavioral + Contextual)

For Experiment 2, we keep all features from Experiment 1 and **add** three contextual features:
- `internet_access` — does the student have internet at home?
- `travel_time` — how long does it take to commute?
- `extra_activities` — is the student involved in extra-curricular activities?

**Why add these?**  
We want to test whether environmental factors improve the model's ability to predict `final_grade`.
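
One quick way to get a first impression before training is to compare the mean target value across the levels of each contextual feature. The sketch below runs on a small made-up frame and assumes `final_grade` is an ordinal integer code; the helper name `mean_target_by_feature` and all values are illustrative assumptions.

```python
import pandas as pd

def mean_target_by_feature(df, features, target):
    """Return {feature: Series of mean target value per feature level}."""
    return {f: df.groupby(f)[target].mean() for f in features}

# Toy data only (hypothetical values, not taken from train_cleaned.csv).
toy = pd.DataFrame({
    'internet_access':  [1, 1, 0, 0, 1, 0],
    'travel_time':      [1, 2, 3, 3, 1, 2],
    'extra_activities': [0, 1, 0, 1, 1, 0],
    'final_grade':      [5, 4, 2, 1, 4, 3],
})

summary = mean_target_by_feature(
    toy, ['internet_access', 'travel_time', 'extra_activities'], 'final_grade'
)
for name, means in summary.items():
    print(f'\nMean final_grade by {name}:')
    print(means)
```

A visible gap between group means only hints that a feature may carry signal; the model comparison remains the actual test.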

In [None]:
# --- Experiment 1 features (Academic + Behavioral) ---
exp1_features = [
    'study_hours',
    'attendance_percentage',
    'math_score',
    'science_score',
    'english_score',
    'study_method_coaching',
    'study_method_group study',
    'study_method_mixed',
    'study_method_notes',
    'study_method_online videos',
    'study_method_textbook'
]

# --- NEW contextual features for Experiment 2 ---
contextual_features = [
    'internet_access',
    'travel_time',
    'extra_activities'
]

# Combine both feature sets
exp2_features = exp1_features + contextual_features

X_train = train_df[exp2_features]
X_test  = test_df[exp2_features]

y_train = train_df['final_grade']
y_test  = test_df['final_grade']

print('Experiment 2 feature count :', len(exp2_features))
print('X_train shape              :', X_train.shape)
print('X_test  shape              :', X_test.shape)
print('\nNew contextual features added:', contextual_features)
print('\nTarget distribution (train):')
print(y_train.value_counts().sort_index())

## Step 3 — Train Models

We retrain the **same two classifiers** from Experiment 1 on the extended feature set:
1. **Logistic Regression** — linear baseline model
2. **Decision Tree** — non-linear model that captures complex patterns

Using the same models allows a **fair comparison** — any change in performance is due to the new features, not a different algorithm.
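
The controlled setup can be sketched as a single helper that fits a fresh copy of the same estimator on different column subsets and scores each on the same test rows. This is a minimal sketch on synthetic data; `evaluate_feature_set` is a hypothetical helper name, and the columns `a`, `b`, and `label` are made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

def evaluate_feature_set(model, train, test, features, target):
    """Fit an unfitted copy of `model` on `features` and score it on `test`."""
    fitted = clone(model).fit(train[features], train[target])
    preds = fitted.predict(test[features])
    return {
        'accuracy': accuracy_score(test[target], preds),
        'weighted_f1': f1_score(test[target], preds, average='weighted'),
    }

# Synthetic data only: 'a' determines the label, 'b' is pure noise.
rng = np.random.default_rng(0)
frame = pd.DataFrame({'a': rng.normal(size=200), 'b': rng.normal(size=200)})
frame['label'] = (frame['a'] > 0).astype(int)
train, test = frame.iloc[:150], frame.iloc[150:]

# Same estimator settings and same split for both runs; only the columns differ.
model = LogisticRegression(max_iter=1000, random_state=42)
base = evaluate_feature_set(model, train, test, ['a'], 'label')
extended = evaluate_feature_set(model, train, test, ['a', 'b'], 'label')
print('base     :', base)
print('extended :', extended)
```

Using `clone` keeps the two runs independent, so the second fit does not reuse any state from the first.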

In [None]:
# Model 1: Logistic Regression
lr_model = LogisticRegression(max_iter=1000, random_state=42)
lr_model.fit(X_train, y_train)

print('Logistic Regression trained successfully!')

In [None]:
# Model 2: Decision Tree
dt_model = DecisionTreeClassifier(random_state=42)
dt_model.fit(X_train, y_train)

print('Decision Tree trained successfully!')

## Step 4 — Evaluate Models

We evaluate both models using the same metrics as Experiment 1:
- **Accuracy**: share of all test samples predicted correctly
- **Precision**: for each grade, the share of samples predicted as that grade that truly have it
- **Recall**: for each grade, the share of samples that truly have that grade and were predicted as it
- **F1 Score**: harmonic mean of precision and recall, computed per grade; more informative than accuracy when classes are imbalanced
- **Confusion Matrix**: visual breakdown of predicted vs. actual grades
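
To make the multi-class metrics concrete, the sketch below computes per-class precision, recall, and F1 by hand, then the support-weighted F1 (the same averaging that `f1_score(..., average='weighted')` applies). The labels are toy values and the helper names are assumptions for this example.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)}, treating each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # true samples per label
    predicted = Counter(y_pred)    # predicted samples per label
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    out = {}
    for label in labels:
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        recall = correct[label] / support[label] if support[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        out[label] = (precision, recall, f1, support[label])
    return out

def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each by its true-sample count."""
    metrics = per_class_metrics(y_true, y_pred)
    return sum(f1 * n for _, _, f1, n in metrics.values()) / len(y_true)

# Toy labels only (not project data).
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]

for label, (p, r, f1, n) in per_class_metrics(y_true, y_pred).items():
    print(f'class {label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f} support={n}')
print(f'weighted F1 = {weighted_f1(y_true, y_pred):.4f}')
```

In this toy case accuracy is 5/6 while the weighted F1 is about 0.8444, which shows that the two numbers need not coincide.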

In [None]:
# Generate predictions on the test set
lr_predictions = lr_model.predict(X_test)
dt_predictions = dt_model.predict(X_test)

print('Predictions generated for both models!')

In [None]:
# Logistic Regression Evaluation
print('LOGISTIC REGRESSION — Results (Exp 2)')

lr_accuracy = accuracy_score(y_test, lr_predictions)
print(f'\nAccuracy: {lr_accuracy:.4f} ({lr_accuracy*100:.2f}%)')

print('\nClassification Report:')
print(classification_report(y_test, lr_predictions))

In [None]:
# Decision Tree Evaluation
print('DECISION TREE — Results (Exp 2)')

dt_accuracy = accuracy_score(y_test, dt_predictions)
print(f'\nAccuracy: {dt_accuracy:.4f} ({dt_accuracy*100:.2f}%)')

print('\nClassification Report:')
print(classification_report(y_test, dt_predictions))

In [None]:
# Confusion Matrices
grade_labels = ['f(0)', 'e(1)', 'd(2)', 'c(3)', 'b(4)', 'a(5)']

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Logistic Regression confusion matrix
ConfusionMatrixDisplay.from_predictions(
    y_test, lr_predictions,
    display_labels=grade_labels,
    cmap='Blues',
    ax=axes[0]
)
axes[0].set_title('Logistic Regression (Exp 2)')

# Decision Tree confusion matrix
ConfusionMatrixDisplay.from_predictions(
    y_test, dt_predictions,
    display_labels=grade_labels,
    cmap='Greens',
    ax=axes[1]
)
axes[1].set_title('Decision Tree (Exp 2)')

plt.suptitle('Experiment 2 — Confusion Matrices', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

## Step 5 — Compare Results (Experiment 1 vs Experiment 2)

Now we compare the metrics from both experiments side-by-side.  
We use the **recorded metrics from Experiment 1** to build a comparison table.

In [None]:
# ---- Experiment 1 results (recorded from experiment_1.ipynb) ----
exp1_lr_acc = 0.7697
exp1_lr_f1  = 0.7698
exp1_dt_acc = 0.6643
exp1_dt_f1  = 0.6644

# ---- Experiment 2 results ----
lr_f1 = f1_score(y_test, lr_predictions, average='weighted')
dt_f1 = f1_score(y_test, dt_predictions, average='weighted')

# Build comparison table
comparison = pd.DataFrame({
    'Feature Set': [
        'Exp 1 (Academic + Behavioral)',
        'Exp 2 (+ Contextual)',
        'Exp 1 (Academic + Behavioral)',
        'Exp 2 (+ Contextual)'
    ],
    'Model': [
        'Logistic Regression',
        'Logistic Regression',
        'Decision Tree',
        'Decision Tree'
    ],
    'Accuracy': [
        round(exp1_lr_acc, 4),
        round(lr_accuracy, 4),
        round(exp1_dt_acc, 4),
        round(dt_accuracy, 4)
    ],
    'Weighted F1': [
        round(exp1_lr_f1, 4),
        round(lr_f1, 4),
        round(exp1_dt_f1, 4),
        round(dt_f1, 4)
    ]
})

print('=' * 70)
print('Experiment 1 vs Experiment 2 — Results Comparison')
print('=' * 70)
print(comparison.to_string(index=False))
print('=' * 70)

## Observations

- Adding contextual features (`internet_access`, `travel_time`, `extra_activities`) allows us to test whether **environmental factors** contribute to predicting `final_grade`.
- By comparing metrics across experiments, we can determine if the additional features provide **meaningful improvement** or just add noise.
- **Logistic Regression** may benefit if the new features have a linear relationship with the target.
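- **Decision Tree** may capture more complex interactions between the new contextual features and existing ones, but an unpruned tree can also split on uninformative features and overfit, which would lower its test metrics.
- Because the models, the train/test split, and the random seeds are the same as in Experiment 1, any difference in the metrics can be attributed to the three added features rather than to a change in algorithm or data.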
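
To read the comparison more directly, the per-model change between experiments can be computed from a table shaped like `comparison`. The sketch below is one possible way to do that; `metric_deltas` is a hypothetical helper, and every number in the toy table is a placeholder rather than a result from either experiment.

```python
import pandas as pd

def metric_deltas(table, metrics=('Accuracy', 'Weighted F1')):
    """Return Exp 2 minus Exp 1 for each metric, one row per model."""
    is_exp2 = table['Feature Set'].str.startswith('Exp 2')
    exp1 = table[~is_exp2].set_index('Model')[list(metrics)]
    exp2 = table[is_exp2].set_index('Model')[list(metrics)]
    return (exp2 - exp1).round(4)

# PLACEHOLDER numbers only: they illustrate the table layout, not real results.
toy = pd.DataFrame({
    'Feature Set': [
        'Exp 1 (Academic + Behavioral)', 'Exp 2 (+ Contextual)',
        'Exp 1 (Academic + Behavioral)', 'Exp 2 (+ Contextual)',
    ],
    'Model': [
        'Logistic Regression', 'Logistic Regression',
        'Decision Tree', 'Decision Tree',
    ],
    'Accuracy':    [0.70, 0.75, 0.60, 0.58],
    'Weighted F1': [0.69, 0.74, 0.61, 0.58],
})

deltas = metric_deltas(toy)
print(deltas)
```

In the notebook this would be called as `metric_deltas(comparison)`. A small positive or negative delta on a single train/test split may be within run-to-run noise, so it is safer to treat such differences as inconclusive.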