In [5]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression, Lasso, LogisticRegression, Ridge, ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, r2_score, roc_curve
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import uniform

In [3]:
# load and preprocess data
data = pd.read_csv('../data/credit-data/prepared_data.csv', index_col=0)

# extract labels
y = data['Credit_Score'].to_numpy()
X = data.drop(columns=['Credit_Score']).to_numpy()
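# merge class 2 into class 1 so that we get a binary classification problem
y[y == 2] = 1

# split data into train- and test-set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# standardize features: fit the scaler on the training set only, then apply it to both sets
x_scaler = StandardScaler()
X_train = x_scaler.fit_transform(X_train)
X_test = x_scaler.transform(X_test)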

# Hyperparameter Optimization
Machine Learning models come with hyperparameters that have to be set before we can train the model and that influence the final model performance. For example, a Random Forest needs to know how many trees to include in the ensemble, a Linear Regression trained with gradient descent (GD) needs a learning rate for its weight updates, and so on. In general, hyperparameters cannot be optimized together with the model-parameters during training. This is because hyperparameters *shape* the basic model and dictate how the optimization algorithm behaves, so optimizing hyperparameters and model-parameters jointly is not an easy task (it is, by the way, part of a large and fairly new research field called AutoML).
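To make the distinction concrete, here is a minimal sketch that separates the two kinds of values for a `LogisticRegression`. The synthetic data from `make_classification` and the names `X_demo`, `y_demo` and `demo_model` are assumptions for illustration only; they are not part of the credit dataset used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption, not the credit data from this notebook).
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# Hyperparameters: chosen by us *before* training and passed to the constructor.
demo_model = LogisticRegression(C=0.5, max_iter=200)
hyperparameters = demo_model.get_params()

# Model-parameters: learned *during* training, only available after fit().
demo_model.fit(X_demo, y_demo)
learned_weights = demo_model.coef_
learned_bias = demo_model.intercept_

print("C =", hyperparameters["C"], "| max_iter =", hyperparameters["max_iter"])
print("weights shape:", learned_weights.shape, "| bias shape:", learned_bias.shape)
```

Changing `C` or `max_iter` means building and fitting a new model, which is exactly why they cannot simply be updated by the optimizer that learns `coef_`.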

In this workshop we will focus on some basic methods of Hyperparameter Optimization (HO) that spare you from manually setting hyperparameters you are not sure about.
Let's get started!

## The Objective
The objective in HO can be formalized as follows:
Given a hyperparameter-search space $\mathcal{H}$ and a task $T$ associated with some evaluation metric $e_T$, we aim to solve
\begin{equation}
 \arg \min_{h \in \mathcal{H}} e_T(h)
\end{equation}
This formalization is very general, but it has to be, and that generality is one of the reasons why HO is such a hard task. For instance, $\mathcal{H}$ is often very heterogeneous: even picking the right model to perform $T$ can be considered a hyperparameter that we can optimize. In the following, we will consider a simpler problem with more assumptions:

We assume that we are given a model $M$ with hyperparameter-space $\mathcal{H}_M$ and a task $\langle \mathbf{X}, \mathbf{y}, l\rangle$, where $\mathbf{X}$ are the features, $\mathbf{y}$ are the labels and $l$ is an evaluation metric (e.g. the loss of our model). This means we only consider supervised learning problems with a given model and optimize w.r.t. a single evaluation metric.
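The simplified objective can be sketched as a plain random search loop: sample a candidate $h \in \mathcal{H}_M$, evaluate $l$ for it, and keep the best one. This is only an illustration under assumptions: the data is synthetic, the metric is the cross-validated log loss, and the helper names `evaluation_metric` and `sample_hyperparameters` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the task <X, y, l> (an assumption for illustration).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)


def evaluation_metric(h):
    """l(h): mean cross-validated log loss of the model built from h (lower is better)."""
    model = LogisticRegression(C=h["C"], max_iter=h["max_iter"])
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="neg_log_loss")
    return -scores.mean()


def sample_hyperparameters():
    """Draw one point h from the search space H_M."""
    return {
        "C": float(10 ** rng.uniform(-3, 2)),          # log-uniform in [1e-3, 1e2]
        "max_iter": int(rng.choice([100, 200, 500])),  # small discrete set
    }


# arg min over H_M, approximated with 10 random draws.
best_h, best_loss = None, np.inf
for _ in range(10):
    h = sample_hyperparameters()
    loss = evaluation_metric(h)
    if loss < best_loss:
        best_h, best_loss = h, loss

print(best_h, best_loss)
```

`RandomizedSearchCV` follows the same idea, but it maximizes a score instead of minimizing a loss and handles the cross-validation bookkeeping for us.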

Specifically, we will look at our Logistic Regression model again and try to find better hyperparameters for it! Our search-space will contain the inverse regularization strength `C`, the type of penalty, and the maximum number of solver iterations `max_iter`.

In [18]:
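model = LogisticRegression()
# NOTE: the default 'lbfgs' solver only supports the 'l2' penalty, so 'l1' and
# 'elasticnet' candidates fail to fit and are scored as NaN by the search
parameter_space = dict(C=uniform(loc=0, scale=4), penalty=['l1', 'l2', 'elasticnet'], max_iter=[50, 100, 200, 500])
clf = RandomizedSearchCV(model, parameter_space, n_iter=30, scoring='f1')
# we pass the full X and y (not passed through x_scaler) because RandomizedSearchCV runs
# cross-validation (5-fold by default) internally; without scaling the solver may hit max_iter
search = clf.fit(X, y)
search.best_params_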

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

{'C': 3.3691314689318728, 'max_iter': 500, 'penalty': 'l2'}

Let's see how it performs on our test set:

In [19]:
lr = LogisticRegression(**search.best_params_)
lr.fit(X_train, y_train)
y_hat = lr.predict(X_test)
acc = accuracy_score(y_test, y_hat)  # scikit-learn expects (y_true, y_pred)
acc

0.7681159420289855

Indeed, we were able to achieve a slightly better accuracy (around +1.5%)!

In [20]:
f1 = f1_score(y_test, y_hat)  # scikit-learn expects (y_true, y_pred)
f1

0.8482758620689655
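Accuracy and F1 can differ quite a bit, as the two numbers above show. The following sketch recomputes both metrics by hand from a confusion matrix; the tiny label arrays are made up for illustration and have nothing to do with the credit data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Hand-made labels (an assumption for illustration): 6 positives, 4 negatives.
y_true_demo = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
y_pred_demo = np.array([1, 1, 1, 1, 1, 0, 1, 1, 0, 0])

# For binary labels {0, 1} the matrix is [[TN, FP], [FN, TP]].
tn, fp, fn, tp = (int(v) for v in confusion_matrix(y_true_demo, y_pred_demo).ravel())

# Accuracy counts all correct predictions, F1 ignores the true negatives.
acc_manual = (tp + tn) / (tp + tn + fp + fn)
f1_manual = 2 * tp / (2 * tp + fp + fn)

print("accuracy:", acc_manual, "vs sklearn:", accuracy_score(y_true_demo, y_pred_demo))
print("f1:", f1_manual, "vs sklearn:", f1_score(y_true_demo, y_pred_demo))
```

Because F1 does not reward true negatives, it is a common choice for `scoring` when the positive class is the one we care about.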

> **Task**
> 
> Try to incorporate more hyperparameters in the Random Search (RS) and play around with the parameters of the RS itself! Can you beat the result above?
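As one possible starting point for the task, here is a hedged sketch of a search in which every candidate can actually be fitted. With scikit-learn's default `lbfgs` solver, `LogisticRegression` only accepts the `l2` penalty (or none), so the sketch switches to the `saga` solver with an `elasticnet` penalty and searches over `l1_ratio` (0 corresponds to pure L2, 1 to pure L1). Scaling is placed inside a `Pipeline` so that it is re-fitted on each cross-validation fold. The synthetic data and the names `pipe`, `param_space` and `search_demo` are assumptions for illustration.

```python
from scipy.stats import loguniform, uniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the credit data from this notebook).
X_demo, y_demo = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo
)

# Scaling lives inside the pipeline, so each CV fold fits its own scaler.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(solver="saga", penalty="elasticnet", max_iter=2000)),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_space = {
    "clf__C": loguniform(1e-3, 1e2),           # sample C on a log scale
    "clf__l1_ratio": uniform(loc=0, scale=1),  # mix between L2 (0) and L1 (1)
}

search_demo = RandomizedSearchCV(
    pipe, param_space, n_iter=15, scoring="f1", cv=5, random_state=0
)
search_demo.fit(X_tr, y_tr)  # the held-out test rows never enter the search

print(search_demo.best_params_)
print("test F1:", search_demo.score(X_te, y_te))
```

Searching on the training split only keeps the test set untouched, so the final score is an honest estimate of how the tuned model generalizes.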

## Conclusion
Hyperparameter Tuning is a very important step in your ML-pipeline, and you should invest a good amount of time in identifying proper hyperparameters! You can do it manually (which can be costly, but is fine if you already know values that will probably work well) or you can employ automatic methods that make life easier for you. Of course, Random Search is one of the most basic techniques you can think of, and it might not work well for high-dimensional hyperparameter-search spaces, because we draw samples from the search-space purely at random. Also, once your model reaches a certain complexity, uninformed methods like Random Search waste a lot of computational resources. In such cases, methods like Bayesian Optimization (BO) might be worth considering: they do not sample completely at random but incorporate the results of previously evaluated hyperparameters and adjust a distribution over the hyperparameter-search space, with the goal of converging faster.
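To give a rough feeling for the idea behind BO, the sketch below tunes a single hyperparameter ($\log_{10} C$) with a Gaussian-process surrogate and the expected-improvement criterion. It is a toy illustration under assumptions (synthetic data, a 1-D search space, hypothetical helper names `objective` and `expected_improvement`), not a replacement for a dedicated BO library.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (an assumption for illustration).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)


def objective(log_c):
    """Cross-validated error (1 - F1) for C = 10**log_c; lower is better."""
    model = LogisticRegression(C=10.0 ** log_c, max_iter=1000)
    return 1.0 - cross_val_score(model, X_demo, y_demo, cv=5, scoring="f1").mean()


def expected_improvement(candidates, gp, best_so_far):
    """Expected improvement over the best observed loss (minimization)."""
    mean, std = gp.predict(candidates.reshape(-1, 1), return_std=True)
    std = np.maximum(std, 1e-9)  # avoid division by zero
    z = (best_so_far - mean) / std
    return (best_so_far - mean) * norm.cdf(z) + std * norm.pdf(z)


low, high = -3.0, 2.0  # search log10(C) in [-3, 2]

# Start with a few random evaluations, exactly like Random Search.
tried = list(rng.uniform(low, high, size=3))
losses = [objective(x) for x in tried]

for _ in range(7):
    # Surrogate model: what do the results so far suggest about the loss surface?
    gp = GaussianProcessRegressor(
        kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True, random_state=0
    )
    gp.fit(np.array(tried).reshape(-1, 1), np.array(losses))

    # Evaluate next where the surrogate promises the largest expected improvement.
    candidates = rng.uniform(low, high, size=200)
    ei = expected_improvement(candidates, gp, min(losses))
    next_x = float(candidates[np.argmax(ei)])
    tried.append(next_x)
    losses.append(objective(next_x))

best_log_c = float(tried[int(np.argmin(losses))])
print("best C:", 10.0 ** best_log_c, "| loss:", min(losses))
```

The only difference to Random Search is how the next candidate is chosen: instead of a blind draw, the surrogate fitted on all earlier results decides where to look.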

However, since HO is not the main topic of this workshop, we won't dive in any deeper. If you are interested in such things, stay tuned: we are planning a workshop on AutoML as well!