# Phase 3 — Baseline Modeling

Before building complex models, we establish simple baselines.

A baseline answers a critical question:

> Is our advanced model truly learning patterns, or is it just slightly better than a naive guess?

Without baselines, improvement claims are meaningless.

---

## Why Baselines Matter

Baselines provide:

- A performance reference point
- A sanity check against overfitting
- A minimum benchmark future models must beat
- Context for interpreting RMSE values

In time series forecasting, even simple heuristics can be surprisingly strong.
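
To make that claim concrete, here is a minimal sketch on a synthetic hourly series (not the project's dataset). It compares the mean predictor with a persistence heuristic that repeats the value observed one hour earlier. Persistence is not one of the baselines used in this phase, and the `rmse` helper and the synthetic series are assumptions made purely for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic hourly series (illustrative only): a daily cycle plus noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
series = 100 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, hours.size)

# Hold out the final week as the "future".
horizon = 24 * 7
train, future = series[:-horizon], series[-horizon:]

# Heuristic 1: predict the training mean everywhere.
mean_pred = np.full(future.shape, train.mean())

# Heuristic 2 (persistence): predict each hour with the previous hour's value.
persistence_pred = series[-horizon - 1:-1]

mean_rmse = rmse(future, mean_pred)
persistence_rmse = rmse(future, persistence_pred)
print(f"mean: {mean_rmse:.2f}  persistence: {persistence_rmse:.2f}")
```

On a series with a strong daily cycle like this one, persistence gives a much lower error than the mean, which is why a new model should be compared against simple heuristics before its score is trusted.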

In [1]:
import sys
from pathlib import Path

ROOT = Path.cwd().parent
SRC = ROOT / "src"
if str(SRC) not in sys.path:
    sys.path.append(str(SRC))



**Note on feature selection:**  
The `End_Hour` feature is intentionally excluded at this stage because it is highly correlated with `Start_Hour` (typically representing the next hour). Including both would add redundancy without providing meaningful additional signal for baseline models. A minimal and interpretable feature set is preferred in this phase. The inclusion of `End_Hour` is revisited in later phases when using more flexible models and feature importance analysis.
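
One way to check this kind of redundancy is to compute the correlation between the two columns before dropping one. The sketch below uses a toy frame, not the real dataset. It assumes `End_Hour` is simply the hour after `Start_Hour` and wraps around at midnight, which may not match the actual encoding.

```python
import numpy as np
import pandas as pd

# Toy frame (illustrative): End_Hour is assumed to be the hour after Start_Hour,
# wrapping from 23 back to 0. The real dataset may encode this differently.
start = np.tile(np.arange(24), 10)
toy = pd.DataFrame({"Start_Hour": start, "End_Hour": (start + 1) % 24})

# Pearson correlation between the two hour columns.
corr = toy["Start_Hour"].corr(toy["End_Hour"])
print(f"Pearson correlation: {corr:.3f}")

# Dropping the redundant column keeps the baseline feature set minimal.
features = toy.drop(columns=["End_Hour"])
print(list(features.columns))
```

Pearson correlation understates the dependence here because of the wraparound at midnight; the two columns are in fact a deterministic function of each other under this assumption.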



In [2]:
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Ridge

from energy_forecast.io import load_data
from energy_forecast.split import time_split
from energy_forecast.evaluate import root_mean_squared_error


In [3]:
df = load_data("../data/Energy Production Dataset.csv", date_col="Date")
df.shape

(51864, 9)

In [4]:
train_df, val_df, test_df = time_split(df, time_col="Date")  # defaults to 70/15/15
print("train:", len(train_df), "val:", len(val_df), "test:", len(test_df))

train: 36304 val: 7780 test: 7780
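
For readers without access to `energy_forecast.split`, the following is a rough sketch of how a chronological 70/15/15 split could work. The function `chronological_split` and its parameters are hypothetical and are not the project's `time_split` implementation. The key property is that rows are sorted by time and cut into contiguous blocks without shuffling, so validation and test data always lie after the training data.

```python
import numpy as np
import pandas as pd

def chronological_split(frame, time_col, train_frac=0.70, val_frac=0.15):
    """Sort by time, then cut into contiguous train/val/test blocks (no shuffling)."""
    ordered = frame.sort_values(time_col).reset_index(drop=True)
    n = len(ordered)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return (
        ordered.iloc[:train_end],
        ordered.iloc[train_end:val_end],
        ordered.iloc[val_end:],
    )

# Toy hourly frame (illustrative) with the same row count as the dataset above.
n_rows = 51864
toy = pd.DataFrame({
    "Date": pd.Timestamp("2020-01-01") + pd.to_timedelta(np.arange(n_rows), unit="h"),
    "Production": np.arange(n_rows, dtype=float),
})

tr, va, te = chronological_split(toy, time_col="Date")
print("train:", len(tr), "val:", len(va), "test:", len(te))
```

With 51864 rows this sketch yields the same block sizes as the output above, though that alone does not confirm how `time_split` is actually implemented.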


---

## Baseline 1 — Mean Predictor

Strategy: predict the **mean of the training target** for every future observation.

Why this matters:
- It is extremely simple and has no tunable parameters
- It ignores the features and the time structure entirely
- It sets the minimum acceptable benchmark

If a model cannot beat the mean predictor on held-out data, it has learned nothing useful from the features.
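
As a small illustration on toy values (not the project data), scikit-learn's `DummyRegressor` with `strategy="mean"` returns the training mean regardless of the features it is given. The `*_toy_*` names are placeholders chosen for this sketch.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Toy target and placeholder features; DummyRegressor ignores the feature values.
y_toy_train = np.array([10.0, 20.0, 30.0, 40.0])
X_toy_train = np.zeros((4, 1))
X_toy_new = np.ones((3, 1))

dummy = DummyRegressor(strategy="mean").fit(X_toy_train, y_toy_train)
toy_pred = dummy.predict(X_toy_new)
print(toy_pred)  # every prediction equals the training mean, 25.0

# On the training data, the RMSE of the mean predictor equals the
# (population) standard deviation of the target.
train_rmse = np.sqrt(np.mean((y_toy_train - dummy.predict(X_toy_train)) ** 2))
print(train_rmse, np.std(y_toy_train))
```

This is why the mean baseline's RMSE is a useful yardstick: it is roughly the spread of the target itself.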



In [5]:
TARGET = "Production"

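def numeric_X(d):
    """Return the numeric feature columns of d, with the target removed."""
    # Keeps every numeric column except the target; non-numeric columns are dropped.
    return d.drop(columns=[TARGET]).select_dtypes(include=["number"])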

X_train, y_train = numeric_X(train_df), train_df[TARGET]
X_val, y_val = numeric_X(val_df), val_df[TARGET]
X_test, y_test = numeric_X(test_df), test_df[TARGET]



---

## Baseline Model 2 — Ridge Regression (Linear Baseline)

After establishing the mean predictor as a minimal benchmark, we introduce a simple linear model:

**Ridge Regression**

Why Ridge?

- Captures linear relationships between the features and production
- Includes L2 regularization, which shrinks coefficients and stabilizes the fit when features are correlated
- Provides a structured yet interpretable baseline
- Is often competitive on tabular forecasting problems despite its simplicity
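
The effect of the L2 penalty can be seen on synthetic data. scikit-learn's `Ridge` minimizes `||y - Xw||^2 + alpha * ||w||^2`, so a larger `alpha` pulls the coefficient vector toward zero. The data and the `alpha` grid below are made up for illustration and say nothing about the best value for this project.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic regression problem (illustrative only).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 5))
true_coef = np.array([3.0, -2.0, 0.0, 1.5, 0.5])
y_toy = X_toy @ true_coef + rng.normal(scale=0.5, size=200)

# Fit Ridge at increasing regularization strengths and record the coefficient norm.
alphas = (0.01, 1.0, 100.0, 10000.0)
coef_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X_toy, y_toy)
    coef_norms.append(float(np.linalg.norm(model.coef_)))

for alpha, norm in zip(alphas, coef_norms):
    print(f"alpha={alpha:>8}: ||coef|| = {norm:.3f}")
```

The printed norms decrease as `alpha` grows. With the `alpha=1.0` used in this notebook the penalty is mild, so the fit stays close to ordinary least squares unless the features are strongly collinear or on very small scales.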



In [6]:
# Mean baseline
mean_model = DummyRegressor(strategy="mean")
mean_model.fit(X_train, y_train)
mean_val_pred = mean_model.predict(X_val)
mean_test_pred = mean_model.predict(X_test)

# Ridge baseline (random_state only affects the stochastic "sag"/"saga" solvers)
ridge = Ridge(alpha=1.0, random_state=42)
ridge.fit(X_train, y_train)
ridge_val_pred = ridge.predict(X_val)
ridge_test_pred = ridge.predict(X_test)

print("RMSE (root_mean_squared_error)")
print("Mean  - Val :", root_mean_squared_error(y_val, mean_val_pred))
print("Mean  - Test:", root_mean_squared_error(y_test, mean_test_pred))
print("Ridge - Val :", root_mean_squared_error(y_val, ridge_val_pred))
print("Ridge - Test:", root_mean_squared_error(y_test, ridge_test_pred))


RMSE (root_mean_squared_error)
Mean  - Val : 4474.629669570113
Mean  - Test: 4213.391316725995
Ridge - Val : 4434.160660379344
Ridge - Test: 4192.933525782805




### Observations
| Model | Validation RMSE | Test RMSE |
|-------|------------------|------------|
| Mean Baseline | 4474.63 | 4213.39 |
| Ridge Regression | 4434.16 | 4192.93 |

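- Ridge improves on the mean baseline on both validation and test.
- The improvement is small: RMSE drops by about 0.9% on validation and about 0.5% on test.
- This suggests the current numeric features carry some linear signal, but only a weak one.

Test RMSE is lower than validation RMSE for both models, and by a similar margin (about 261 for the mean baseline, about 241 for Ridge). The gap therefore looks like a property of the two time periods rather than of either model, and there is no obvious sign of overfitting.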
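
To put the table in relative terms, the short calculation below computes the percentage reduction in RMSE that Ridge achieves over the mean baseline, using the rounded values reported above. The helper `relative_improvement` is a hypothetical name introduced for this sketch.

```python
# RMSE values copied from the table above (rounded to two decimals).
reported_rmse = {
    "val": {"mean": 4474.63, "ridge": 4434.16},
    "test": {"mean": 4213.39, "ridge": 4192.93},
}

def relative_improvement(baseline, candidate):
    """Percent reduction in RMSE relative to the baseline (positive = better)."""
    return 100.0 * (baseline - candidate) / baseline

improvements = {
    split: relative_improvement(scores["mean"], scores["ridge"])
    for split, scores in reported_rmse.items()
}

for split, pct in improvements.items():
    print(f"{split}: {pct:.2f}% lower RMSE than the mean baseline")
# val: 0.90% lower RMSE than the mean baseline
# test: 0.49% lower RMSE than the mean baseline
```

Both reductions are under 1%, so later phases have considerable room to improve on this benchmark.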




### Why This Matters

If Ridge had failed to outperform the mean baseline, it would indicate weak feature signal or ineffective preprocessing.

Since it improves performance, if only slightly, Ridge becomes the benchmark that later phases must beat.


---

## Phase 3 — Summary

We established two structured baselines:

1. Mean predictor (minimum benchmark)
2. Ridge Regression (regularized linear model)

Ridge outperforms the mean baseline by a small margin (under 1% lower RMSE on both validation and test), which indicates some learnable structure in the dataset.

This provides a meaningful benchmark before moving into:
- preprocessing pipelines
- lag/rolling feature engineering
- advanced ensemble models
