**CELL 1: Imports**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error, mean_squared_error


**CELL 2: Load CSVs (LOCAL)**

In [7]:
y_true_xgb = pd.read_csv("outputs/y_test_xgb.csv").iloc[:, 0].values
xgb_pred   = pd.read_csv("outputs/xgb_predictions.csv").iloc[:, 0].values

y_true_lstm = pd.read_csv("outputs/y_test_lstm.csv").iloc[:, 0].values
lstm_pred   = pd.read_csv("outputs/lstm_predictions.csv").iloc[:, 0].values
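
The cell above takes the first column of each file with `.iloc[:, 0]`. If any of these CSVs was written with `to_csv(index=True)`, the first column would be the saved index rather than the values. The sketch below shows one way to guard against that. The helper name `load_values` and the demo data are hypothetical and not part of the original notebook.

```python
import io
import pandas as pd

def load_values(source):
    """Read a single-series CSV, skipping a saved index column if present."""
    df = pd.read_csv(source)
    # When a CSV written with to_csv(index=True) is read back, pandas names
    # the headerless index column "Unnamed: 0".
    value_cols = [c for c in df.columns if not str(c).startswith("Unnamed:")]
    if len(value_cols) != 1:
        raise ValueError(f"expected exactly one value column, found {value_cols}")
    return df[value_cols[0]].to_numpy()

# Demo with in-memory CSV text (made-up numbers, not the project's data).
demo_csv = io.StringIO(",y\n0,1.5\n1,2.5\n2,4.0\n")
demo_values = load_values(demo_csv)
print(demo_values)
```

With this input, `.iloc[:, 0]` would have returned the index `0, 1, 2`, while `load_values` returns the `y` column.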

**CELL 3: Align Lengths**

In [8]:
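# Truncate every array to the shortest length so they can be compared element-wise.
# NOTE: both models are scored against the XGBoost targets (y_true_xgb); this
# assumes the two test sets cover the same samples in the same order.
min_len = min(len(y_true_xgb), len(y_true_lstm), len(xgb_pred), len(lstm_pred))

y_true = y_true_xgb[:min_len]
xgb_pred = xgb_pred[:min_len]
lstm_pred = lstm_pred[:min_len]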
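
Truncating from the start only lines the arrays up if both test sets begin at the same sample. If, for instance, the LSTM test set was built from sliding windows and therefore starts a few rows later (an assumption; the notebook does not say), the targets would be shifted against each other. A minimal sketch of a check that could be run on `y_true_xgb` and `y_true_lstm` before slicing follows. The function `check_alignment` and the demo arrays are hypothetical.

```python
import numpy as np

def check_alignment(a, b, atol=1e-8):
    """Report whether two target arrays agree when aligned at the head or at the tail."""
    n = min(len(a), len(b))
    head_match = np.allclose(a[:n], b[:n], atol=atol)    # same starting sample
    tail_match = np.allclose(a[-n:], b[-n:], atol=atol)  # same final sample
    return head_match, tail_match

# Made-up example: the second array drops the first two rows of the first.
full = np.array([10.0, 11.0, 12.0, 13.0, 14.0])
windowed = full[2:]
head_ok, tail_ok = check_alignment(full, windowed)
print(head_ok, tail_ok)  # False True
```

A tail match without a head match would suggest slicing from the end (`[-min_len:]`) rather than from the start.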


**CELL 4: Evaluation Function**

In [10]:

def evaluate(y_true, y_pred):
    """Return (MAE, RMSE) for a pair of equal-length arrays."""
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # RMSE is the square root of MSE
    return mae, rmse
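
To illustrate why both metrics are reported, the sketch below writes the same two formulas out with plain NumPy and applies them to made-up errors. Two error patterns with the same MAE can have very different RMSE, because squaring weights large misses more heavily. The helper `mae_rmse` and the arrays are illustrative only.

```python
import numpy as np

def mae_rmse(y_true, y_pred):
    """MAE = mean(|error|), RMSE = sqrt(mean(error^2))."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(err))), float(np.sqrt(np.mean(err ** 2)))

y_demo = np.zeros(4)
even_errors = np.array([2.0, 2.0, 2.0, 2.0])  # every prediction off by 2
one_outlier = np.array([0.0, 0.0, 0.0, 8.0])  # three exact, one off by 8

mae_even, rmse_even = mae_rmse(y_demo, even_errors)
mae_out, rmse_out = mae_rmse(y_demo, one_outlier)
print(mae_even, rmse_even)  # 2.0 2.0
print(mae_out, rmse_out)    # 2.0 4.0
```

An RMSE well above the MAE, as in the second case, points to a few large errors rather than uniformly moderate ones.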


**CELL 5: Evaluate Models**

In [12]:
xgb_mae, xgb_rmse = evaluate(y_true, xgb_pred)
lstm_mae, lstm_rmse = evaluate(y_true, lstm_pred)

print(f"XGBoost → MAE: {xgb_mae:.3f}, RMSE: {xgb_rmse:.3f}")
print(f"LSTM     → MAE: {lstm_mae:.3f}, RMSE: {lstm_rmse:.3f}")


XGBoost → MAE: 0.877, RMSE: 2.417
LSTM     → MAE: 23.363, RMSE: 55.561


**CELL 6: Comparison Table**

In [14]:
results = pd.DataFrame({
    "Model": ["XGBoost", "LSTM"],
    "MAE": [xgb_mae, lstm_mae],
    "RMSE": [xgb_rmse, lstm_rmse]
})

results


|   | Model   | MAE       | RMSE      |
|---|---------|-----------|-----------|
| 0 | XGBoost | 0.87729   | 2.4165    |
| 1 | LSTM    | 23.362868 | 55.561087 |


### Model Evaluation Summary

- XGBoost achieved far lower error than the LSTM on the test set: MAE 0.877 vs. 23.363 and RMSE 2.417 vs. 55.561, roughly 27 times and 23 times lower, respectively.
- The tree-based model appears to have captured nonlinear relationships effectively through the engineered features.
- The LSTM's weaker result is likely due to its reliance on raw sequential data without exogenous inputs.
- Based on its accuracy and stability, XGBoost was selected as the production model.
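
The size of the gap between the two models can be expressed as a ratio of the metrics reported in the comparison table. The short sketch below does that arithmetic. The values are copied from the table, and the variable names are illustrative.

```python
# Metrics copied from the comparison table above.
reported = {
    "XGBoost": {"MAE": 0.87729, "RMSE": 2.4165},
    "LSTM": {"MAE": 23.362868, "RMSE": 55.561087},
}

# How many times larger the LSTM error is than the XGBoost error, per metric.
ratios = {
    metric: reported["LSTM"][metric] / reported["XGBoost"][metric]
    for metric in ("MAE", "RMSE")
}

for metric, ratio in ratios.items():
    print(f"LSTM {metric} is {ratio:.1f}x the XGBoost {metric}")
```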
