# 6. Hyperparameter Tuning
Machine learning models contain hyperparameters that you can tune (as if they were knobs on a guitar). Changing the hyperparameters configures the model differently. Different datasets require different hyperparameter settings for optimal performance. For instance, a [decision tree classifier][1] has many hyperparameters, some of which are:

* **criterion** : Measures the quality of a split - default is Gini impurity (`'gini'`), can also be entropy (`'entropy'`)
* **max_depth** : The maximum depth of the tree. Default is `None`, which keeps splitting until every leaf is pure or holds fewer than `min_samples_split` samples
* **min_samples_split** : The minimum number of samples required to split an internal node - default is 2
* **min_samples_leaf** : The minimum number of samples required to be at a leaf node. Default is 1

[1]: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier
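
As a minimal sketch of what setting these knobs looks like, the snippet below builds one tree with the defaults and one with custom values, then reads both back with `get_params`. The custom values are arbitrary and chosen only for illustration; they are not recommendations for any dataset.

```python
from sklearn.tree import DecisionTreeClassifier

# Default settings: every hyperparameter takes its documented default
default_tree = DecisionTreeClassifier()

# Custom settings: arbitrary illustrative values, not tuned for any data
custom_tree = DecisionTreeClassifier(
    criterion="entropy",
    max_depth=4,
    min_samples_split=10,
    min_samples_leaf=5,
)

# get_params returns the hyperparameters as a dictionary
names = ["criterion", "max_depth", "min_samples_split", "min_samples_leaf"]
default_params = default_tree.get_params()
custom_params = custom_tree.get_params()

for name in names:
    print(f"{name}: default={default_params[name]!r}, custom={custom_params[name]!r}")
```

Neither model has seen any data yet. The hyperparameters are fully determined at instantiation.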

## Hyperparameters vs Parameters
Hyperparameters are all the knobs that you have control over. You set them during **instantiation**. They do NOT change during training. Many models also have "normal" parameters that are fit during **training**. These parameters are completely different from hyperparameters: you do not set them yourself. Instead, the algorithm finds their values during training by optimizing the criterion that it uses. For instance, in a linear regression, the normal parameters are the slope and intercept.
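
The linear regression case can be sketched with a tiny synthetic dataset. The data below is made up for illustration (points lying exactly on the line y = 3x + 2), and the names `X_demo`, `y_demo`, and `lr` are hypothetical. `fit_intercept` plays the role of a hyperparameter, while `coef_` and `intercept_` are the learned parameters.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data lying exactly on y = 3x + 2 (illustrative values only)
X_demo = np.arange(10, dtype=float).reshape(-1, 1)
y_demo = 3.0 * X_demo.ravel() + 2.0

# Hyperparameter: chosen by us at instantiation
lr = LinearRegression(fit_intercept=True)

# Parameters: learned from the data during fit
lr.fit(X_demo, y_demo)
print("slope:", lr.coef_[0])
print("intercept:", lr.intercept_)

# The hyperparameter is unchanged by training
print("fit_intercept:", lr.get_params()["fit_intercept"])
```

By scikit-learn convention, attributes that end in an underscore (such as `coef_`) are only created by `fit`, which is a quick way to tell learned parameters apart from hyperparameters.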

Hyperparameters never change after you have instantiated your model. In a decision tree, a hyperparameter such as **`max_depth`** dictates the maximum depth of the tree. The algorithm must stop growing the tree when it reaches that depth. Where the splits take place (which feature and which threshold) is learned from the data during training and can be thought of as a normal parameter.
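
The same distinction can be sketched for a decision tree. The example below assumes synthetic data from `make_classification` rather than the dataset used later in this notebook, and the names `X_toy`, `y_toy`, and `capped` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data, for illustration only
X_toy, y_toy = make_classification(n_samples=200, n_features=4, random_state=0)

# max_depth is a hyperparameter: fixed at instantiation
capped = DecisionTreeClassifier(max_depth=2, random_state=0)
capped.fit(X_toy, y_toy)

# The hyperparameter is unchanged by training
print("max_depth hyperparameter:", capped.max_depth)

# The fitted tree never grows deeper than the cap
print("depth of fitted tree:", capped.get_depth())

# The split at the root (feature index and threshold) was learned from the data
print("root split feature:", capped.tree_.feature[0])
print("root split threshold:", capped.tree_.threshold[0])
```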

## Overfitting
Previously, we saw a decision tree severely overfit the data: its test score was significantly worse than its training score. Let's recreate this below:

In [None]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

In [None]:
heart = pd.read_csv('../data/heart.csv')
heart.head()

In [None]:
# import and instantiate
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
dtc = DecisionTreeClassifier()

X = heart[['max_hr', 'rest_bp']].values
y = heart['disease'].values

In [None]:
# fit on all the data, then score on that same data (training accuracy)
dtc.fit(X, y)
dtc.score(X, y)

In [None]:
scores = cross_val_score(dtc, X, y, cv=5)
scores

In [None]:
scores.mean()

## Inspecting the decision tree
After fitting, scikit-learn exposes the underlying tree object as the **`tree_`** attribute. Let's assign it to its own variable and explore it.

In [None]:
tree = dtc.tree_

In [None]:
tree.n_features

In [None]:
tree.max_depth

In [None]:
tree.node_count
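
Beyond these summary attributes, the tree object stores one array entry per node. As a rough sketch on synthetic data (the names `X_syn`, `y_syn`, `demo_model`, and `demo_tree` are hypothetical), the snippet below counts leaves by looking for nodes without children, which scikit-learn marks with `-1` in `children_left`.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only so the sketch is self-contained
X_syn, y_syn = make_classification(n_samples=150, n_features=4, random_state=1)
demo_model = DecisionTreeClassifier(random_state=1).fit(X_syn, y_syn)
demo_tree = demo_model.tree_

# A leaf has no children; children_left holds -1 for such nodes
is_leaf = demo_tree.children_left == -1
n_leaves = int(is_leaf.sum())
n_internal = demo_tree.node_count - n_leaves

print("total nodes:", demo_tree.node_count)
print("leaves:", n_leaves)
print("internal (split) nodes:", n_internal)

# n_node_samples[0] is the number of training samples at the root
print("samples at root:", demo_tree.n_node_samples[0])
```

Because every split produces exactly two children, the number of internal nodes is always one less than the number of leaves.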

## Change hyperparameters to reduce overfitting
We can reduce the amount of overfitting by specifying different hyperparameters during model instantiation. For instance, we specify a `max_depth` of 7 below, which increases our cross-validation score.

In [None]:
dtc = DecisionTreeClassifier(max_depth=7)
scores = cross_val_score(dtc, X, y, cv=5)
scores

In [None]:
scores.mean()

In [None]:
scores.std()
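
One way to see overfitting directly is to compare training and validation scores side by side. The sketch below uses `cross_validate` with `return_train_score=True` on synthetic, deliberately noisy data rather than the heart dataset. The two depth settings are arbitrary illustrative choices, and the names `X_noisy`, `y_noisy`, and `results` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; flip_y assigns a random class to about 20% of the samples
X_noisy, y_noisy = make_classification(
    n_samples=300, n_features=5, flip_y=0.2, random_state=0
)

# Compare an unrestricted tree (None) with a shallow one (3)
results = {}
for depth in (None, 3):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    cv = cross_validate(model, X_noisy, y_noisy, cv=5, return_train_score=True)
    results[depth] = (cv["train_score"].mean(), cv["test_score"].mean())

for depth, (train, valid) in results.items():
    gap = train - valid
    print(f"max_depth={depth}: train={train:.3f}, validation={valid:.3f}, gap={gap:.3f}")
```

A large gap between the training and validation scores suggests overfitting. Whether a shallower tree actually validates better depends on the data, so the numbers here should not be read as a general result.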

# Exercises

### Problem 1
<span  style="color:green; font-size:16px">Continue to tune the decision tree by changing the `max_depth` hyperparameter. Find the value of `max_depth` that achieves the highest cross-validation score.</span>

### Problem 2
<span  style="color:green; font-size:16px">Make a plot of `max_depth` vs mean cross-validation score.</span>