# 📈 Step 4: Model Evaluation and Visualization

In this notebook, we visualize and interpret the performance of each trained model (Linear Regression, Random Forest, XGBoost) using MAE, RMSE, and R². We also compare true vs predicted values and analyze residuals.

In [None]:
# 1. Import required libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Apply seaborn's whitegrid style to all plots in this notebook
sns.set_theme(style="whitegrid")

## Load Model Performance Metrics
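We assume the previous notebook saved its summary DataFrame (model names as the index; `MAE`, `RMSE`, and `R2` as columns) to `model_performance.csv` in the `models/` directory. If it did not, save that DataFrame to CSV from the previous notebook before running the cell below; otherwise the cell falls back to hard-coded values.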

In [None]:
# 2. Load model performance metrics
perf_path = '../models/model_performance.csv'
if os.path.exists(perf_path):
    summary = pd.read_csv(perf_path, index_col=0)
else:
    # Fallback: hard-coded placeholder values used only when the CSV is missing;
    # replace them with the metrics computed in the previous notebook
    summary = pd.DataFrame({
        'MAE': {'Linear Regression': 2.1, 'Random Forest': 1.8, 'XGBoost': 1.7},
        'RMSE': {'Linear Regression': 2.7, 'Random Forest': 2.2, 'XGBoost': 2.1},
        'R2': {'Linear Regression': 0.65, 'Random Forest': 0.72, 'XGBoost': 0.74}
    })
print(summary)
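
If the previous notebook did not save the summary, it can be rebuilt from the test targets and each model's predictions. The following is a minimal sketch, not the previous notebook's actual code: the helper names `regression_metrics` and `build_summary` are hypothetical, and the toy arrays are illustrative values rather than real model outputs. The metrics are computed with NumPy so the definitions stay visible.

```python
import numpy as np
import pandas as pd

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R2 for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    ss_res = np.sum(errors ** 2)                     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        'MAE': float(np.mean(np.abs(errors))),
        'RMSE': float(np.sqrt(np.mean(errors ** 2))),
        'R2': float(1.0 - ss_res / ss_tot),
    }

def build_summary(y_true, predictions):
    """predictions: dict mapping model name -> sequence of predicted values."""
    rows = {name: regression_metrics(y_true, y_pred)
            for name, y_pred in predictions.items()}
    return pd.DataFrame.from_dict(rows, orient='index')

# Toy values for illustration only; these are not results from the real models.
y_true_demo = [10, 12, 8, 15]
demo_predictions = {
    'Linear Regression': [11, 11, 9, 13],
    'Random Forest': [10, 12, 9, 14],
}
demo_summary = build_summary(y_true_demo, demo_predictions)
print(demo_summary)
```

Calling `demo_summary.to_csv('../models/model_performance.csv')` on the real summary writes the model names as the first column, which is what `index_col=0` in the loading cell expects.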

## Visualize Model Metrics
Bar plots compare MAE, RMSE, and R² for each model. Lower MAE/RMSE and higher R² indicate better performance.

In [None]:
# 3. Bar plots for MAE, RMSE, R2
fig, axes = plt.subplots(1, 3, figsize=(18, 5))
metrics = ['MAE', 'RMSE', 'R2']
for i, metric in enumerate(metrics):
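    # Pass hue explicitly: recent seaborn versions warn when palette is set without hue
    sns.barplot(x=summary.index, y=summary[metric], hue=summary.index,
                dodge=False, palette='viridis', ax=axes[i])
    # Older seaborn versions draw a redundant legend for hue; remove it if present
    if axes[i].get_legend() is not None:
        axes[i].get_legend().remove()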
    axes[i].set_title(f'{metric} by Model')
    axes[i].set_ylabel(metric)
    axes[i].set_xlabel('Model')
plt.tight_layout()
plt.show()

## Load Predictions and True Values
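We assume `y_test` is saved as `y_test.csv` in the `data/processed/` directory and that each model's predictions are saved in the `models/` directory as `<model_name>_preds.csv`, where the model name is lowercased and spaces are replaced with underscores (for example, `random_forest_preds.csv`). If not, you can save them from the previous notebook. Models without a predictions file are skipped.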

In [None]:
# 4. Load y_test and predictions
y_test = pd.read_csv('../data/processed/y_test.csv').squeeze()
preds = {}
for model in summary.index:
    # e.g. 'Random Forest' -> '../models/random_forest_preds.csv'
    pred_path = f'../models/{model.replace(" ", "_").lower()}_preds.csv'
    if os.path.exists(pred_path):
        preds[model] = pd.read_csv(pred_path).squeeze()
    else:
        # Skip models without saved predictions instead of storing a placeholder
        print(f'No predictions found for {model} at {pred_path}; skipping.')
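
The loading cell above only finds a predictions file if its name follows the same rule it uses to build `pred_path`. A possible way to save predictions from the previous notebook so the names match is sketched below; `preds_filename` and `save_predictions` are hypothetical helpers, the column name `prediction` is an arbitrary choice, and the values written here are toy data in a temporary directory.

```python
import os
import tempfile
import pandas as pd

def preds_filename(model_name):
    """Mirror the naming rule used by the loading cell."""
    return f'{model_name.replace(" ", "_").lower()}_preds.csv'

def save_predictions(y_pred, model_name, out_dir):
    """Write one model's predictions as a single-column CSV without the index."""
    path = os.path.join(out_dir, preds_filename(model_name))
    pd.Series(y_pred, name='prediction').to_csv(path, index=False)
    return path

print(preds_filename('Linear Regression'))  # linear_regression_preds.csv
print(preds_filename('XGBoost'))            # xgboost_preds.csv

# Round trip with toy values: save, then reload the same way the loading cell does.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = save_predictions([11.0, 12.5, 9.0], 'Random Forest', tmp_dir)
    reloaded = pd.read_csv(path).squeeze()
print(reloaded.tolist())
```

Writing with `index=False` keeps the file to a single column, so `pd.read_csv(...).squeeze()` returns a Series rather than a two-column DataFrame.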

## True vs Predicted Scatter Plots
These plots show how closely each model's predictions match the actual values. Points close to the diagonal indicate better predictions.

In [None]:
# 5. Scatter plots: True vs Predicted
plt.figure(figsize=(18, 5))
for i, (model, y_pred) in enumerate(preds.items(), 1):
    plt.subplot(1, len(preds), i)
    plt.scatter(y_test, y_pred, alpha=0.6, color='teal')
    plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
    plt.xlabel('True G3')
    plt.ylabel('Predicted G3')
    plt.title(f'{model}: True vs Predicted')
plt.tight_layout()
plt.show()

## Residual Plots
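Residuals (errors) are the differences between true and predicted values (true minus predicted). The histograms below show the distribution of residuals for each model. Ideally, the distribution is centered on zero and roughly symmetric, with no long tail in either direction.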

In [None]:
# 6. Residual plots
plt.figure(figsize=(18, 5))
for i, (model, y_pred) in enumerate(preds.items(), 1):
    plt.subplot(1, len(preds), i)
    residuals = y_test - y_pred
    sns.histplot(residuals, bins=20, kde=True, color='coral')
    plt.title(f'{model}: Residuals')
    plt.xlabel('Residual (True - Predicted)')
plt.tight_layout()
plt.show()
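
A histogram shows the shape of the residuals, but a few summary numbers can make the comparison between models easier to state. The sketch below is one possible complement, not part of the original workflow: `residual_summary` is a hypothetical helper and the inputs are toy values. It uses the same sign convention as the plots (true minus predicted).

```python
import pandas as pd

def residual_summary(y_true, y_pred):
    """Summarize residuals (true - predicted) for one model."""
    # Reset indexes so the subtraction pairs values by position, not by label
    y_true = pd.Series(y_true, dtype=float).reset_index(drop=True)
    y_pred = pd.Series(y_pred, dtype=float).reset_index(drop=True)
    residuals = y_true - y_pred
    return pd.Series({
        'mean': residuals.mean(),          # near zero suggests little systematic bias
        'std': residuals.std(),            # spread of the errors (sample std, ddof=1)
        'max_abs': residuals.abs().max(),  # size of the worst single error
    })

# Toy values for illustration only
demo_stats = residual_summary([10, 12, 8, 15], [11, 11, 9, 13])
print(demo_stats)
```

With the notebook's own data, something like `pd.DataFrame({m: residual_summary(y_test, p) for m, p in preds.items()}).T` would give one row per model.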

## Interpretation
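- **MAE/RMSE**: Lower values indicate smaller prediction errors. Both are in the same units as the target (G3); RMSE penalizes large errors more heavily than MAE.
- **R²**: Closer to 1 means the model explains more of the variance in the target.
- **Scatter plots**: Points close to the diagonal show good predictions.
- **Residual plots**: A distribution centered on zero suggests the errors are unbiased; a shifted or skewed histogram points to systematic over- or under-prediction.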

Based on these visuals, select the model that best balances low error and high R².
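
When no model is best on every metric, ranking them can make the trade-off explicit. The sketch below is one simple heuristic, not a prescribed selection rule: `rank_models` is a hypothetical helper that gives the three metrics equal weight, which is an assumption you may want to change. The numbers are the hard-coded fallback values from the loading cell, used here only as placeholders.

```python
import pandas as pd

def rank_models(summary):
    """Rank models per metric (1 = best) and sort by the average rank."""
    ranks = pd.DataFrame({
        'MAE': summary['MAE'].rank(ascending=True),    # lower is better
        'RMSE': summary['RMSE'].rank(ascending=True),  # lower is better
        'R2': summary['R2'].rank(ascending=False),     # higher is better
    })
    ranks['mean_rank'] = ranks.mean(axis=1)  # equal weighting is an assumption
    return ranks.sort_values('mean_rank')

# Placeholder metrics copied from the notebook's fallback, not measured results
demo_summary = pd.DataFrame({
    'MAE': {'Linear Regression': 2.1, 'Random Forest': 1.8, 'XGBoost': 1.7},
    'RMSE': {'Linear Regression': 2.7, 'Random Forest': 2.2, 'XGBoost': 2.1},
    'R2': {'Linear Regression': 0.65, 'Random Forest': 0.72, 'XGBoost': 0.74},
})
ranking = rank_models(demo_summary)
print(ranking)
```

Small differences in rank do not necessarily reflect meaningful differences in performance, so treat the ordering as a starting point alongside the scatter and residual plots.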