# Lesson 4.05 - Hyperparameter Tuning

#### What is the difference between hyperparameters and statistical parameters?
    
- Statistical parameters are quantities that a model can learn or estimate. Examples include $\beta_0$ and $\beta_1$ in a linear model.
- Hyperparameters are quantities our model cannot learn from the data, but that still affect the fit of our model. Examples include $k$ in $k$-nearest neighbors.
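To make the distinction concrete, here is a minimal sketch in plain Python. The toy data (generated from $y = 1 + 2x$) and the variable names are illustrative assumptions, not part of the lesson's dataset. The coefficients $\beta_0$ and $\beta_1$ are estimated from the data with the closed-form least-squares formulas, while $k$ is simply a value we pick.

```python
# Toy data generated from y = 1 + 2x (illustrative values, not the lesson's data)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Statistical parameters: estimated from the data by least squares
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
beta_1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
beta_0 = y_mean - beta_1 * x_mean

# Hyperparameter: nothing in the data determines this value; we choose it
k = 3

print(beta_0, beta_1, k)
```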

In [1]:
#import all required libraries
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

In [2]:
df = pd.read_csv('data/modifiedIris2Classes.csv')

In [3]:
df.shape

(100, 5)

## Splitting Data into Training and Test Sets

In [4]:
X_train, X_test, y_train, y_test = train_test_split(df[['petal length (cm)']], df['target'], random_state=0)

## Standardize the Data
K-nearest neighbors classifies a point by its distance to the training points, so it is impacted by the scale of the features. Scale the features in the data before fitting the model.

Scikit-Learn's `StandardScaler` standardizes each feature by subtracting its mean and dividing by its standard deviation, so that the transformed feature has a mean of zero and a standard deviation of one. More info can be found [here](https://towardsdatascience.com/scale-standardize-or-normalize-with-scikit-learn-6ccc7d176a02).

In [5]:
scaler = StandardScaler()

# Fit on training set only.
scaler.fit(X_train)

# Apply transform to both the training set and the test set.
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
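The arithmetic behind fit-then-transform can be sketched by hand. The numbers below are made up for illustration. The mean and standard deviation come from the training values only and are then reused for the test values; the population standard deviation (dividing by $n$) is used, which is what `StandardScaler` does.

```python
import math

# Illustrative values only (not taken from the lesson's dataset)
train_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test_values = [3.0, 11.0]

# "fit": learn the mean and population standard deviation from the training set
mean = sum(train_values) / len(train_values)
std = math.sqrt(sum((v - mean) ** 2 for v in train_values) / len(train_values))

# "transform": apply z = (x - mean) / std to both sets using the training statistics
train_scaled = [(v - mean) / std for v in train_values]
test_scaled = [(v - mean) / std for v in test_values]

print(mean, std)      # 5.0 2.0
print(test_scaled)    # [-1.0, 3.0]
```

Reusing the training statistics for the test set keeps information about the test data out of the preprocessing step.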

## K-Nearest Neighbour

In [6]:
# Instantiate KNN with k = 3 (scikit-learn's default is n_neighbors=5)
classifier = KNeighborsClassifier(n_neighbors = 3)
classifier.fit(X_train, y_train)
classifier_y_pred = classifier.predict(X_test)

In [7]:
print(confusion_matrix(y_test, classifier_y_pred))
print(classification_report(y_test, classifier_y_pred))

[[ 9  4]
 [ 0 12]]
              precision    recall  f1-score   support

           0       1.00      0.69      0.82        13
           1       0.75      1.00      0.86        12

    accuracy                           0.84        25
   macro avg       0.88      0.85      0.84        25
weighted avg       0.88      0.84      0.84        25
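The figures in the report can be recomputed from the confusion matrix printed above. This is a small sketch in plain Python, with the matrix copied by hand; rows are the true classes and columns are the predicted classes, following scikit-learn's convention.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = [[9, 4],
      [0, 12]]

n_classes = len(cm)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n_classes)) / total

report = {}
for c in range(n_classes):
    true_positives = cm[c][c]
    predicted_as_c = sum(cm[r][c] for r in range(n_classes))  # column sum
    actually_c = sum(cm[c])                                   # row sum
    precision = true_positives / predicted_as_c
    recall = true_positives / actually_c
    f1 = 2 * precision * recall / (precision + recall)
    report[c] = (round(precision, 2), round(recall, 2), round(f1, 2))

print(report)              # {0: (1.0, 0.69, 0.82), 1: (0.75, 1.0, 0.86)}
print(round(accuracy, 2))  # 0.84
```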



## What are "hyperparameters"?

1. Built-in quantities of models that we can use to fine-tune our results. For example, what value of $k$ do we select?

2. These are quantities our model **cannot** learn... **we must decide on these ourselves**!

3. These are different from statistical parameters, which are quantities a model _can_ learn.

4. Different values for hyperparameters can result in substantially different models.
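As a rough illustration of the last point, the sketch below implements a tiny one-dimensional nearest-neighbors vote in plain Python. The function `knn_predict` and the toy data are hypothetical, written only for this example. The same query point gets a different label depending on the $k$ we choose.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query` (1-D)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (illustrative): the point at 2.0 has label 1 but sits close to the class-0 points
train_x = [1.0, 1.3, 1.6, 2.0, 3.0, 3.3, 3.6]
train_y = [0, 0, 0, 1, 1, 1, 1]

query = 1.9
print(knn_predict(train_x, train_y, query, k=1))  # 1: only the closest point (2.0) votes
print(knn_predict(train_x, train_y, query, k=3))  # 0: two of the three closest points are class 0
```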

## GridSearch
**One method of searching for the optimal set of hyperparameters is called GridSearch.**

1. GridSearch gets its name from the fact that we are searching over a "grid" of hyperparameters. 
2. For example, imagine the values of the `n_neighbors` hyperparameter as the columns and the values of the `weights` hyperparameter as the rows. This makes a grid.
3. We check the cross-validated accuracy for all combinations of hyperparameters on the grid.
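The grid itself is just the Cartesian product of the candidate values. A minimal sketch in plain Python follows; the candidate values are chosen for illustration, with `"uniform"` and `"distance"` being the two built-in string options for `weights` in `KNeighborsClassifier`.

```python
from itertools import product

# Candidate values for each hyperparameter (illustrative choices)
param_grid = {
    "n_neighbors": [3, 5, 7],
    "weights": ["uniform", "distance"],
}

# Every cell of the grid is one combination of hyperparameter values
names = list(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

for combo in combinations:
    print(combo)

print(len(combinations))  # 6 combinations: 3 values of n_neighbors x 2 values of weights
```

Each added hyperparameter multiplies the number of combinations, so the cost of an exhaustive search grows quickly with the size of the grid.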

#### More information on GridSearch's functionality and limitations can be found at this [link](https://towardsdatascience.com/gridsearchcv-for-beginners-db48a90114ee)

In [8]:
# Set up the experiment with training and test data for X and y
X_train2, X_test2, y_train2, y_test2 = train_test_split(df[['petal length (cm)']], df['target'], random_state=0)
scaler = StandardScaler()
scaler.fit(X_train2)
X_train2 = scaler.transform(X_train2)
X_test2 = scaler.transform(X_test2)

In [9]:
# Instantiate Basic Knn with initial k value
knn = KNeighborsClassifier(n_neighbors = 3)
knn.fit(X_train2, y_train2)

KNeighborsClassifier(n_neighbors=3)

In [10]:
# Evaluate and display mean accuracy of Basic Knn on test data
knn.score(X_test2, y_test2)

0.84

In [17]:
# Set up GridSearchCV with the n_neighbors values to try and 5-fold cross-validation
params = {"n_neighbors":[3,4,5,6,7,8,9,10]}
model = GridSearchCV(knn, params, cv=5)

In [18]:
model.fit(X_train2, y_train2)
model.best_params_

{'n_neighbors': 9}

In [13]:
# Display the best mean cross-validated accuracy achieved.
# Note: this score is averaged over the 5 validation folds of the training set; the test set is not used.
model.best_score_

0.96
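To see where this number comes from, the sketch below shows how 5-fold cross-validation partitions the training rows. It assumes 75 training rows (the default 75/25 split of the 100-row dataset) and uses a hypothetical helper, `k_fold_indices`, with simple contiguous folds. `GridSearchCV` uses stratified folds for classifiers, so the actual row assignments differ, but the sizes are the same.

```python
def k_fold_indices(n_samples, n_folds):
    """Split row indices 0..n_samples-1 into contiguous (train, validation) folds."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, validation))
        start += size
    return folds

folds = k_fold_indices(n_samples=75, n_folds=5)
for train_idx, val_idx in folds:
    # Each candidate k is fit on 60 rows and scored on the 15 held-out rows
    print(len(train_idx), len(val_idx))
```

For each candidate value of `n_neighbors`, the five validation accuracies are averaged; `best_score_` is the highest of those averages.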

In [14]:
# KNN with the best k found by the grid search (9 neighbors)
gridsearch_knn = KNeighborsClassifier(n_neighbors = 9)
gridsearch_knn.fit(X_train2, y_train2)

KNeighborsClassifier(n_neighbors=9)

In [15]:
gridsearch_y_pred = gridsearch_knn.predict(X_test2)
print(gridsearch_y_pred)

[0 1 1 1 1 1 0 1 1 1 1 1 1 0 0 0 1 0 1 0 0 1 0 1 0]


In [16]:
print(confusion_matrix(y_test2, gridsearch_y_pred))
print(classification_report(y_test2, gridsearch_y_pred))

[[10  3]
 [ 0 12]]
              precision    recall  f1-score   support

           0       1.00      0.77      0.87        13
           1       0.80      1.00      0.89        12

    accuracy                           0.88        25
   macro avg       0.90      0.88      0.88        25
weighted avg       0.90      0.88      0.88        25
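Refitting a new classifier by hand works, but it is not strictly required: with the default `refit=True`, `GridSearchCV` refits the best configuration on the whole training set and exposes it as `best_estimator_`. The sketch below shows that shortcut. Because the lesson's CSV is not available here, it assumes two classes of scikit-learn's built-in iris data as a stand-in, so its numbers will not match the outputs above; all variable names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: two classes of the built-in iris data stand in for the lesson's CSV
iris = load_iris()
two_class = iris.target != 0
X_demo = iris.data[two_class][:, [2]]  # column 2 is 'petal length (cm)'
y_demo = iris.target[two_class]

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)
demo_scaler = StandardScaler().fit(X_tr)  # fit on the training rows only
X_tr, X_te = demo_scaler.transform(X_tr), demo_scaler.transform(X_te)

search = GridSearchCV(
    KNeighborsClassifier(),
    {"n_neighbors": [3, 4, 5, 6, 7, 8, 9, 10]},
    cv=5,
)
search.fit(X_tr, y_tr)

# The refitted best model is available directly; no manual re-instantiation needed
best_knn = search.best_estimator_
demo_pred = search.predict(X_te)          # delegates to best_knn.predict
demo_accuracy = search.score(X_te, y_te)  # test accuracy of the refitted best model
print(search.best_params_, demo_accuracy)
```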

