# Hyperparameter Tuning
<img src="../images/validation_comic.jpg" width="550"/>

### How to Optimize Model Parameters?
Choosing good hyperparameters is the central task of hyperparameter tuning. Most machine learning models have hyperparameters: settings that must be fixed before training begins. ElasticNet regression, for example, combines $\ell_1$ and $\ell_2$ norm penalties, governed by an overall penalty strength ($\alpha$) and a mixing ratio between the two norms (`l1_ratio` in scikit-learn).
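For reference, the objective minimized by scikit-learn's `ElasticNet` can be written as below. The symbol $\rho$ is shorthand introduced here for the `l1_ratio` parameter, $n$ is the number of training samples, and $w$ is the coefficient vector.

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
$$

Setting $\rho = 1$ leaves only the $\ell_1$ term (lasso), while $\rho = 0$ leaves only the $\ell_2$ term (a ridge-style penalty). Both $\alpha$ and $\rho$ must be chosen before fitting, which is what makes them hyperparameters.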

One approach is to try different values haphazardly and hope to stumble on settings that perform better. A more systematic method is model validation.

### The Significance of Model Validation
In supervised learning, data is typically divided into two disjoint subsets:
- **Training:** The data the model is fit on, from which it learns the underlying patterns.
- **Testing:** Held-out data used to check that the model generalizes to new, unseen examples.

When the dataset is large enough, a third subset is valuable:

- **Validation:** Data used to compare hyperparameter settings and to spot signs of overfitting.

The validation set lets us evaluate trained models under different hyperparameter configurations. It also helps detect overfitting, where a model performs well on the training data but fails to generalize to new data. Because every tuning decision is made on the validation set, the testing set stays independent of those decisions, which guards against data snooping.
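A three-way split can be sketched with two calls to `train_test_split`. This is a minimal illustration under assumptions: the synthetic data and the 60/20/20 proportions are placeholders chosen for the example, not values prescribed here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): 1000 samples, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=1000)

# First carve off the test set (20% of the data) ...
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
# ... then split the remainder into training (60% of the total)
# and validation (20% of the total; 0.25 of the remaining 80%).
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0
)

print(len(X_train), len(X_val), len(X_test))
```

Hyperparameters would then be compared on `X_val`, and `X_test` would be touched only once, for the final evaluation.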

### Cross Validation
<div style="display: flex; align-items: center;">
  <img src="../images/cross_validation_diagram_1.png" height="150"/>
  <div style="width: 50px;"></div>
  <img src="../images/cross_validation_diagram_2.ppm" height="150"/>
</div>
<br>

Working with smaller datasets can be challenging. In such scenarios, we cannot afford to set aside a dedicated validation subset, as every bit of training data is precious.

This is where cross-validation comes to the rescue. Cross-validation involves the following steps:

1. Dividing the data into "folds", which are smaller, non-overlapping subsets.
2. Reserving one fold as the held-out (validation) set while training the model on the remaining folds.
3. Repeating this process so that each fold serves as the held-out set exactly once.
4. Averaging the model's performance over all held-out folds.
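The four steps above can be sketched by hand with scikit-learn's `KFold`. The synthetic data and the `alpha`/`l1_ratio` values below are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic stand-in data (hypothetical): 200 samples, 4 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Step 1: divide the data into 5 folds
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

fold_mse = []
# Step 3: loop so that each fold is held out exactly once
for train_idx, holdout_idx in kfold.split(X):
    # Step 2: train on the other folds, score on the held-out fold
    model = ElasticNet(alpha=0.01, l1_ratio=0.5)  # illustrative values
    model.fit(X[train_idx], y[train_idx])
    y_pred = model.predict(X[holdout_idx])
    fold_mse.append(mean_squared_error(y[holdout_idx], y_pred))

# Step 4: average the performance over all held-out folds
mean_mse = float(np.mean(fold_mse))
print("Per-fold MSE:", np.round(fold_mse, 4))
print("Mean MSE: %.4f" % mean_mse)
```

In practice, `sklearn.model_selection.cross_val_score` wraps this loop in a single call; the explicit version is shown only to make each step visible.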

## Demo

In [1]:
# Imports
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.preprocessing import StandardScaler
import warnings
warnings.filterwarnings("ignore")

Let's start by fitting an ElasticNet regression model with default parameters.

In [3]:
# Load housing data and split it into train/test
np.random.seed(0)
data = fetch_california_housing()
X = data['data']
y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)

# Fit the model on the training data
enet = ElasticNet()
enet.fit(X_train, y_train)

# Evaluate the model on the testing data
print("------ Default Model ------")
y_pred = enet.predict(X_test)
print("R^2: %.2f" % r2_score(y_test, y_pred))
print("MSE: %.2f" % mean_squared_error(y_test, y_pred))

------ Default Model ------
R^2: 0.41
MSE: 0.77


Now, let's use cross-validation to search for better ElasticNet hyperparameters.

In [4]:
# Define search space for hyperparameters
param_grid = {
    'polynomialfeatures__degree': [1],
    'elasticnet__alpha': [0.001, 0.01, 0.1, 1, 10],
    'elasticnet__l1_ratio': [0, 0.25, 0.5, 0.75, 1]
}

# Create model pipeline
model = make_pipeline(
    PolynomialFeatures(),
    StandardScaler(),
    ElasticNet()
)

# Randomized search with 5-fold cross validation
# (by default, n_iter=10 samples 10 of the 25 combinations in param_grid)
search = RandomizedSearchCV(model, param_grid, cv=5, scoring='neg_mean_squared_error', verbose=1)
search.fit(X_train, y_train)
best_model = search.best_estimator_

# Find best parameters
best_params = best_model.named_steps['elasticnet'].get_params()
print('\nBest polynomial degree:', search.best_estimator_.named_steps['polynomialfeatures'].degree)
print('Best alpha:', best_params['alpha'])
print('Best l1 ratio:', best_params['l1_ratio'])

# Evaluate the best model on the test set
print("\n------ Tuned Model ------")
y_pred = best_model.predict(X_test)
print("New R^2: %.2f" % r2_score(y_test, y_pred))
print("New MSE: %.2f" % mean_squared_error(y_test, y_pred))

Fitting 5 folds for each of 10 candidates, totalling 50 fits

Best polynomial degree: 1
Best alpha: 0.001
Best l1 ratio: 1

------ Tuned Model ------
New R^2: 0.59
New MSE: 0.53
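The randomized search above evaluates only 10 sampled candidates. When the grid is small, an exhaustive search with `GridSearchCV` is an alternative. The following is a minimal sketch under assumptions: it uses synthetic data from `make_regression` so that it runs without a download, the `_demo` names are hypothetical, and `l1_ratio=0` is left out of the grid as a choice made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical), not the housing dataset
X_demo, y_demo = make_regression(
    n_samples=300, n_features=8, noise=10.0, random_state=0
)

# Scale the features, then fit ElasticNet
pipeline = make_pipeline(StandardScaler(), ElasticNet(max_iter=10000))

# 5 alphas x 4 l1 ratios = 20 candidates, all of which are evaluated
grid = {
    'elasticnet__alpha': [0.001, 0.01, 0.1, 1, 10],
    'elasticnet__l1_ratio': [0.25, 0.5, 0.75, 1],
}

grid_search = GridSearchCV(
    pipeline, grid, cv=5, scoring='neg_mean_squared_error'
)
grid_search.fit(X_demo, y_demo)

n_candidates = len(grid_search.cv_results_['params'])
print("Candidates evaluated:", n_candidates)
print("Best parameters:", grid_search.best_params_)
print("Best CV MSE: %.2f" % -grid_search.best_score_)
```

Exhaustive search guarantees that every combination in the grid is tried, at the cost of more fits; randomized search trades that guarantee for a fixed budget.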
