# Lecture 8 - Hyperparameter tuning


## 1. Introduction
In this notebook, we briefly show how to tune a hyperparameter by brute force. More complex hyperparameter tuning should use scikit-learn's built-in methods (see Discussion).

<i>Note that portions of this Notebook are based on https://scikit-learn.org/0.15/auto_examples/plot_classifier_comparison.html and https://scikit-learn.org/stable/_downloads/plot_randomized_search.ipynb</i>

The classifier we will be using is:
- K-Nearest-Neighbour <i>(yes, neighbour has a 'u'...)</i>

The data we will be using is the "digits" dataset: 8x8 images of hand-written digits (0 to 9), each flattened into 64 pixel-intensity features.


In [0]:
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier
import numpy as np


## 2. The Data

Let's load the data and split it into training, validation, and test sets.


In [26]:
# Create digits dataset:
digits = load_digits()
X, y = digits.data, digits.target

print('Shape of dataset: ', X.shape)
print('Second sample (index 1): ', X[1])

# Split the data into train/validation/test:
# 60% for training, then the remaining 40% is halved into validation and test (20% each).
X_train, X_TV, y_train, y_TV = train_test_split(X, y, test_size=.4)
X_val, X_test, y_val, y_test = train_test_split(X_TV, y_TV, test_size=.5)

print('Size of train/validation/test is: {0}/{1}/{2}'.format(X_train.shape[0], X_val.shape[0], X_test.shape[0]))

Shape of dataset:  (1797, 64)
Second sample (index 1):  [ 0.  0.  0. 12. 13.  5.  0.  0.  0.  0.  0. 11. 16.  9.  0.  0.  0.  0.
  3. 15. 16.  6.  0.  0.  0.  7. 15. 16. 16.  2.  0.  0.  0.  0.  1. 16.
 16.  3.  0.  0.  0.  0.  1. 16. 16.  6.  0.  0.  0.  0.  1. 16. 16.  6.
  0.  0.  0.  0.  0. 11. 16. 10.  0.  0.]
Size of train/validation/test is: 1078/359/360
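
The split above is random, so the scores reported in this notebook will change slightly from run to run. A minimal sketch of a reproducible variant follows, assuming a fixed seed (the value `SEED = 0` is an arbitrary choice, not from the original notebook) and stratified sampling so that each subset keeps roughly the same class proportions:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

SEED = 0  # arbitrary, hypothetical seed; any fixed integer makes the split repeatable

# First split: 60% train, 40% held out (stratified by digit label)
X_train, X_TV, y_train, y_TV = train_test_split(
    X, y, test_size=0.4, random_state=SEED, stratify=y)

# Second split: halve the held-out part into validation and test
X_val, X_test, y_val, y_test = train_test_split(
    X_TV, y_TV, test_size=0.5, random_state=SEED, stratify=y_TV)

print('Size of train/validation/test is: {0}/{1}/{2}'.format(
    X_train.shape[0], X_val.shape[0], X_test.shape[0]))
```

The subset sizes are the same as above; only the assignment of samples to subsets becomes deterministic.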


Loop through the candidate values of K (here 1 to 10) and see which one gives the best performance on the validation set.

In [23]:
best_score = 0.0
best_K = 0
for K in range(1,11):
  clf = KNeighborsClassifier(n_neighbors=K)

  # Train and test our classifier
  clf.fit(X_train, y_train) # create classification model using training data
  score = clf.score(X_val, y_val) # evaluate classifier on the validation set
  print('K={0} --> score = {1:.3f}'.format(K, score))

  # Keep track of the optimal K
  if (score > best_score):
    best_score = score
    best_K = K

print('Best K = {0} --> best_score={1:.3f}'.format(best_K, best_score))

K=1 --> score = 0.981
K=2 --> score = 0.972
K=3 --> score = 0.983
K=4 --> score = 0.969
K=5 --> score = 0.972
K=6 --> score = 0.975
K=7 --> score = 0.972
K=8 --> score = 0.967
K=9 --> score = 0.967
K=10 --> score = 0.961
Best K = 3 --> best_score=0.983
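
Note that the scores for K=1 and K=3 above differ by a single validation sample out of 359, so the ranking from one split is noisy. One way to get a steadier estimate is to average over several folds. The following is a sketch only, assuming 5-fold cross-validation via `cross_val_score` and a hypothetical 80/20 development/test split with a fixed seed; neither choice comes from the original notebook:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Hold out a test set; cross-validation only ever sees the development part
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

cv_means = {}
for K in range(1, 11):
    clf = KNeighborsClassifier(n_neighbors=K)
    scores = cross_val_score(clf, X_dev, y_dev, cv=5)  # one accuracy per fold
    cv_means[K] = scores.mean()
    print('K={0} --> mean CV score = {1:.3f} (+/- {2:.3f})'.format(
        K, scores.mean(), scores.std()))

# Pick the K with the highest mean cross-validated accuracy
best_K_cv = max(cv_means, key=cv_means.get)
print('Best K by cross-validation = {0}'.format(best_K_cv))
```

The per-fold standard deviation gives a rough sense of whether the gap between two values of K is meaningful.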


Now test your final hyperparameter choice on the test set:

In [24]:
clf = KNeighborsClassifier(n_neighbors=best_K)
clf.fit(X_train, y_train) # create classification model using training data
test_score = clf.score(X_test, y_test) # evaluate classifier on the test set
print('Best K = {0} --> test_score={1:.3f}'.format(best_K, test_score))

Best K = 3 --> test_score=0.978


## Discussion

Other classifiers have more hyperparameters, and some of those take continuous values, so the set of possible settings is effectively infinite. The brute-force method above becomes computationally expensive as the number of hyperparameter combinations to evaluate grows.
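
To get a feel for how quickly the number of combinations grows, the sketch below counts the points in a small grid using scikit-learn's `ParameterGrid`. The hyperparameter names and values are hypothetical (they resemble those of a support vector classifier) and are there only to illustrate the arithmetic:

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid: three hyperparameters with 5, 4 and 3 candidate values
param_grid = {
    'C': [0.01, 0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1],
    'kernel': ['rbf', 'poly', 'sigmoid'],
}

# The grid is the Cartesian product of the value lists: 5 * 4 * 3 combinations
n_combinations = len(ParameterGrid(param_grid))
print('Number of combinations to evaluate:', n_combinations)
```

Every extra hyperparameter multiplies the count, and with k-fold cross-validation each combination is fitted k times.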

scikit-learn has built-in classes for performing grid search (`GridSearchCV`) and random search (`RandomizedSearchCV`). See https://scikit-learn.org/stable/modules/grid_search.html for details.
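
As a rough illustration of the built-in approach, here is a minimal sketch that repeats the K search with `GridSearchCV`. The second hyperparameter (`weights`), the 80/20 split, the seed, and the 5-fold setting are assumptions added for the example, not part of the notebook above:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Keep a test set aside; GridSearchCV builds its own validation folds internally
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Candidate values: 10 values of K times 2 weighting schemes = 20 combinations
param_grid = {
    'n_neighbors': list(range(1, 11)),
    'weights': ['uniform', 'distance'],
}

search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X_train, y_train)  # fits every combination on every fold, then refits the best

print('Best parameters:', search.best_params_)
print('Best cross-validated score: {0:.3f}'.format(search.best_score_))
print('Test score: {0:.3f}'.format(search.score(X_test, y_test)))
```

`RandomizedSearchCV` has a very similar interface, but it samples a fixed number of combinations (`n_iter`) from the given lists or distributions instead of trying them all.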