<br>
<p style="text-align: left;"><img src='https://s3.amazonaws.com/weclouddata/images/logos/sunlife_logo.png' width='35%'></p>
<p style="text-align:left;"><font size='10'><b> Machine Learning - Hyperparameter Tuning</b></font> </p><font color='#559E54' size=6>Instructor Copy</font>
<h2 align='left' > Sunlife Data Science Training </h2>

<h4 align='left'>  Prepared by: <img src='https://s3.amazonaws.com/weclouddata/images/logos/wcd_logo.png' width='15%'></h4>

---

## $\Delta$ 1. Introducing K-Fold Cross-Validation

### Steps for cross-validation:
> - Dataset is split into K "folds" of equal size  
> - Each fold acts as the testing set once, and as part of the training set K-1 times 
> - Average testing performance is used as the estimate of out-of-sample performance
> - Also known as cross-validated performance
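
The steps above can be sketched with scikit-learn's `KFold` splitter. This is a minimal illustration on a made-up 10-sample array (`X_toy` is an assumption chosen only to keep the printout short), not the dataset used later in this notebook.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data: 10 samples, one feature (hypothetical, for illustration only)
X_toy = np.arange(10).reshape(-1, 1)

# 5 folds of equal size; without shuffling, each fold is a contiguous block
kf = KFold(n_splits=5)

test_counts = np.zeros(len(X_toy), dtype=int)
train_counts = np.zeros(len(X_toy), dtype=int)

for fold, (train_idx, test_idx) in enumerate(kf.split(X_toy), start=1):
    print(f"Fold {fold}: train={train_idx}, test={test_idx}")
    test_counts[test_idx] += 1
    train_counts[train_idx] += 1

# Each sample is tested once and used for training K-1 = 4 times
print(test_counts)
print(train_counts)
```

In a real run, a model would be fit on `train_idx` and scored on `test_idx` inside the loop, and the K scores would then be averaged.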

### Benefits of cross-validation:
> - More reliable estimate of out-of-sample performance than train/test split
> - Reduces the variance that comes from relying on a single train/test split
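
To see why a single train/test split can be a noisy estimate, the sketch below scores the same KNN model on ten different random splits and compares the results with one 5-fold CV average. The choices of `n_neighbors=5`, `test_size=0.2` and the seeds 0 to 9 are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5)

# Accuracy from single train/test splits; only the random seed changes
split_scores = []
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    split_scores.append(knn.fit(X_train, y_train).score(X_test, y_test))

# Cross-validated estimate: the average accuracy over 5 folds
cv_mean = cross_val_score(knn, X, y, cv=5, scoring='accuracy').mean()

print('single-split accuracies:', np.round(split_scores, 3))
print('spread (max - min):', round(max(split_scores) - min(split_scores), 3))
print('5-fold CV mean accuracy:', round(cv_mean, 3))
```

The single-split accuracies typically move around with the seed, whereas the CV mean averages over several test sets.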

### CV Use Cases
> - Selecting tuning parameters
> - Choosing between models
> - Selecting features

### Drawbacks of cross-validation:
> - Can be computationally expensive
> - Especially when the data set is very large or the model is slow to train

---

## $\Delta$ 2. Hyperparameter tuning using `cross_val_score`

In [1]:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
import matplotlib.pyplot as plt
%matplotlib inline



### Load Dataset

In [2]:
# read in the iris data
iris = load_iris()

# create X (features) and y (response)
X = iris.data
y = iris.target

print('X matrix dimensionality:', X.shape)
print('Y vector dimensionality:', y.shape)

X matrix dimensionality: (150, 4)
Y vector dimensionality: (150,)


In [3]:
X

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       [4.4, 2.9, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.1],
       [5.4, 3.7, 1.5, 0.2],
       [4.8, 3.4, 1.6, 0.2],
       [4.8, 3. , 1.4, 0.1],
       [4.3, 3. , 1.1, 0.1],
       [5.8, 4. , 1.2, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.4, 3.9, 1.3, 0.4],
       [5.1, 3.5, 1.4, 0.3],
       [5.7, 3.8, 1.7, 0.3],
       [5.1, 3.8, 1.5, 0.3],
       [5.4, 3.4, 1.7, 0.2],
       [5.1, 3.7, 1.5, 0.4],
       [4.6, 3.6, 1. , 0.2],
       [5.1, 3.3, 1.7, 0.5],
       [4.8, 3.4, 1.9, 0.2],
       [5. , 3. , 1.6, 0.2],
       [5. , 3.4, 1.6, 0.4],
       [5.2, 3.5, 1.5, 0.2],
       [5.2, 3.4, 1.4, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [5.4, 3.4, 1.5, 0.4],
       [5.2, 4.1, 1.5, 0.1],
       [5.5, 4.2, 1.4, 0.2],
       [4.9, 3

In [4]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

### K-Fold CV with Fixed Parameter 

In [5]:
# Instantiate model
knn = KNeighborsClassifier(n_neighbors=1)

In [6]:
knn

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=None, n_neighbors=1, p=2,
           weights='uniform')

In [7]:
# Run 5-fold CV
scores = cross_val_score(knn, X, y, cv=5, scoring='accuracy')
print(scores)

[0.96666667 0.96666667 0.93333333 0.93333333 1.        ]


**Note:**
> **`sklearn.model_selection.cross_val_score`** takes care of splitting X and y into the `cv` folds (5 in the cell above), which is why we pass X and y in their entirety instead of X_train and y_train
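
What `cross_val_score` does internally can be approximated with an explicit loop. The sketch below assumes the documented scikit-learn behaviour that an integer `cv` combined with a classifier uses stratified folds (`StratifiedKFold`); the variable names are illustrative.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=1)

# Stratified folds keep the class proportions similar in every fold
skf = StratifiedKFold(n_splits=5)

manual_scores = []
for train_idx, test_idx in skf.split(X, y):
    model = clone(knn)                      # fresh, unfitted copy for each fold
    model.fit(X[train_idx], y[train_idx])   # train on the other K-1 folds
    manual_scores.append(model.score(X[test_idx], y[test_idx]))  # accuracy

auto_scores = cross_val_score(knn, X, y, cv=5, scoring='accuracy')

print(np.array(manual_scores))
print(auto_scores)
```

Both printouts should agree fold by fold, which is why the full X and y are handed to `cross_val_score` and the splitting is left to it.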


### Calculate CV Accuracy

In [8]:
# Use Average Accuracy as an estimate of out-of-sample performance
print(scores.mean())
print(scores.std())

0.96
0.024944382578492935


### $\Omega$ Parameter Search (finding optimal K for KNN)

In [9]:
# search for an optimal value of K for KNN

# list of integers 1 to 30
# we will try K from 1 to 30
k_range = range(1, 31)

# list of scores from k_range
k_scores = []

# Loop through a range of k values
for k in k_range:
    # In each iteration, run KNeighborsClassifier with k neighbours
    knn = KNeighborsClassifier(n_neighbors=k)
    
    # Calculate cross_val_score for the model with K
    scores = cross_val_score(knn, X, y, cv=10, scoring='accuracy')
    
    # Save the mean of scores for k neighbors to k_scores list
    k_scores.append(scores.mean())
print(k_scores)

[0.96, 0.9533333333333334, 0.9666666666666666, 0.9666666666666666, 0.9666666666666668, 0.9666666666666668, 0.9666666666666668, 0.9666666666666668, 0.9733333333333334, 0.9666666666666668, 0.9666666666666668, 0.9733333333333334, 0.9800000000000001, 0.9733333333333334, 0.9733333333333334, 0.9733333333333334, 0.9733333333333334, 0.9800000000000001, 0.9733333333333334, 0.9800000000000001, 0.9666666666666666, 0.9666666666666666, 0.9733333333333334, 0.96, 0.9666666666666666, 0.96, 0.9666666666666666, 0.9533333333333334, 0.9533333333333334, 0.9533333333333334]


### $\Omega$ Plot the parameter search results

In [10]:
# plot the value of K for KNN (x-axis) versus 
# the cross-validated accuracy (y-axis)

import bokeh.plotting as bp

bp.output_notebook()

# create a new plot with default (linear) axes
# note: Bokeh 3.x renamed plot_width/plot_height to width/height
p = bp.figure(title = 'CV Parameter Search', 
              plot_width=600, 
              plot_height=300)

p.line(k_range, k_scores, line_width=2)
p.circle(k_range, k_scores, fill_color="white", size=8)
p.title.text_font_size = '16pt'
p.xaxis.axis_label = "Value of K for KNN"
p.yaxis.axis_label = "Cross-validated accuracy"
p.yaxis.axis_label_text_font_size = "14pt"
p.xaxis.axis_label_text_font_size = "14pt"

bp.show(p)

#### NOTE:
> In this case, K = 13 seems to be an optimal parameter (K = 18 and K = 20 reach the same accuracy of 0.98)
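
Reading the best K off a plot can be double-checked numerically. A minimal sketch, using the mean accuracies printed earlier for K = 1 to 30 (rounded here to four decimals to keep the listing short):

```python
import numpy as np

# Mean 10-fold CV accuracy for K = 1..30, rounded from the k_scores output
k_scores = [
    0.9600, 0.9533, 0.9667, 0.9667, 0.9667, 0.9667, 0.9667, 0.9667, 0.9733, 0.9667,
    0.9667, 0.9733, 0.9800, 0.9733, 0.9733, 0.9733, 0.9733, 0.9800, 0.9733, 0.9800,
    0.9667, 0.9667, 0.9733, 0.9600, 0.9667, 0.9600, 0.9667, 0.9533, 0.9533, 0.9533,
]
k_values = np.arange(1, 31)
scores = np.array(k_scores)

best_score = scores.max()
tied_k = k_values[np.isclose(scores, best_score)]   # every K that reaches the maximum
first_best = k_values[scores.argmax()]              # argmax returns the first maximum

print(best_score)   # 0.98
print(tied_k)       # [13 18 20]
print(first_best)   # 13
```

Several values of K tie for the top score here, so the choice of K = 13 is one reasonable pick rather than a unique optimum.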

### $\Omega$ Refit the model with Best Parameter on full dataset

In [11]:
k = 13
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X, y)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=None, n_neighbors=13, p=2,
           weights='uniform')

---

## $\Delta$ 3. More efficient parameter tuning using GridSearchCV

> - Grid Search allows you to define a grid of parameters that will be searched using K-fold cross-validation. 
> - This is like an automated version of the "for loop" above

In [12]:
from sklearn.model_selection import GridSearchCV

### Define the parameter search range

In [13]:
# define the parameter values that should be searched
k_range = list(range(1, 31))
print(k_range)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]


### Create the parameter grid
> - **key**: parameter name  
> - **value**: list of values that should be searched for that parameter

In [14]:
# create a parameter grid: map the parameter names to the values that should be searched
# simply a python dictionary

# single key-value pair for param_grid
param_grid = dict(n_neighbors=k_range)
print(param_grid)

{'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]}


### Instantiate the GridSearch

- After instantiation, the grid object is ready to do 10-fold cross-validation on a KNN model using classification accuracy as the evaluation metric
- In addition, the parameter grid tells it to repeat the 10-fold cross-validation process 30 times
  - Each time, the n_neighbors parameter is given a different value from the list
  - You can set n_jobs = -1 to run computations in parallel (if supported by your computer and OS)
    - This is also called parallel computing

In [15]:
grid = GridSearchCV(estimator=knn, 
                    param_grid=param_grid, 
                    cv=10, 
                    scoring='accuracy',
                    n_jobs=-1,
                    return_train_score=True
                    )

In [16]:
grid

GridSearchCV(cv=10, error_score='raise-deprecating',
       estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=None, n_neighbors=13, p=2,
           weights='uniform'),
       fit_params=None, iid='warn', n_jobs=-1,
       param_grid={'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring='accuracy', verbose=0)

### Fit the grid with data

In [17]:
# fit the grid with data
grid.fit(X, y)

GridSearchCV(cv=10, error_score='raise-deprecating',
       estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=None, n_neighbors=13, p=2,
           weights='uniform'),
       fit_params=None, iid='warn', n_jobs=-1,
       param_grid={'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring='accuracy', verbose=0)

**NOTE:**
> Remember this is running 10-fold cross-validation 30 times
>   - The KNN model is being fit and predictions are being made 30 x 10 = 300 times
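
The fit count can be checked without running the search. A small sketch using scikit-learn's `ParameterGrid` helper, which enumerates the combinations a grid search would try; the variable names are illustrative.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {'n_neighbors': list(range(1, 31))}
n_folds = 10

# Number of distinct parameter combinations in the grid
n_candidates = len(ParameterGrid(param_grid))

# One fit (and one round of predictions) per candidate per fold
n_cv_fits = n_candidates * n_folds

print(n_candidates, n_cv_fits)  # 30 300
```

With `refit=True`, `GridSearchCV` also fits the best candidate once more on the full dataset after the search. When the grid has several keys, `ParameterGrid` counts the product of the list lengths.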

### Explore the grid search result

In [18]:
grid.cv_results_

{'mean_fit_time': array([0.00093858, 0.00135679, 0.00126793, 0.00112655, 0.00105577,
        0.00126953, 0.0018748 , 0.00146697, 0.00164549, 0.00129714,
        0.00118144, 0.00121338, 0.00186205, 0.00053711, 0.00197139,
        0.00129259, 0.0015106 , 0.00111425, 0.00137889, 0.00137594,
        0.00047858, 0.00134265, 0.00070739, 0.00100017, 0.00117671,
        0.00107377, 0.0009311 , 0.00096292, 0.00048521, 0.00109348]),
 'mean_score_time': array([0.00795512, 0.00240386, 0.00368965, 0.00408711, 0.0025862 ,
        0.00339339, 0.00520167, 0.00426409, 0.0037277 , 0.00464356,
        0.00523746, 0.00339489, 0.00331059, 0.00528564, 0.00403969,
        0.0043236 , 0.00306098, 0.00283244, 0.00243537, 0.0023592 ,
        0.0025619 , 0.00298307, 0.00502772, 0.00440855, 0.00390992,
        0.00436766, 0.00421579, 0.00517507, 0.00414014, 0.00342243]),
 'mean_test_score': array([0.96      , 0.95333333, 0.96666667, 0.96666667, 0.96666667,
        0.96666667, 0.96666667, 0.96666667, 0.97333333, 0

In [19]:
grid.cv_results_['mean_test_score']

array([0.96      , 0.95333333, 0.96666667, 0.96666667, 0.96666667,
       0.96666667, 0.96666667, 0.96666667, 0.97333333, 0.96666667,
       0.96666667, 0.97333333, 0.98      , 0.97333333, 0.97333333,
       0.97333333, 0.97333333, 0.98      , 0.97333333, 0.98      ,
       0.96666667, 0.96666667, 0.97333333, 0.96      , 0.96666667,
       0.96      , 0.96666667, 0.95333333, 0.95333333, 0.95333333])

In [20]:
grid.cv_results_['std_test_score']

array([0.05333333, 0.05206833, 0.04472136, 0.04472136, 0.04472136,
       0.04472136, 0.04472136, 0.04472136, 0.03265986, 0.04472136,
       0.04472136, 0.03265986, 0.0305505 , 0.04422166, 0.03265986,
       0.03265986, 0.03265986, 0.0305505 , 0.03265986, 0.0305505 ,
       0.03333333, 0.03333333, 0.03265986, 0.04422166, 0.03333333,
       0.04422166, 0.04472136, 0.04268749, 0.04268749, 0.04268749])

NOTE: 
> - Standard deviation of the accuracy scores for K = 1 (the first entry) = 0.053
>   - If SD is high, the cross-validated estimate of the accuracy might not be as reliable

In [21]:
results = list(zip(grid.cv_results_['mean_test_score'], grid.cv_results_['std_test_score'], grid.cv_results_['params']))

### Print the cross validation result report

In [22]:
["mean test score: {0:.3f}, std: {1:.3f}, params: {2}".format(result[0],result[1],result[2]) for result in results]

["mean test score: 0.960, std: 0.053, params: {'n_neighbors': 1}",
 "mean test score: 0.953, std: 0.052, params: {'n_neighbors': 2}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 3}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 4}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 5}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 6}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 7}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 8}",
 "mean test score: 0.973, std: 0.033, params: {'n_neighbors': 9}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 10}",
 "mean test score: 0.967, std: 0.045, params: {'n_neighbors': 11}",
 "mean test score: 0.973, std: 0.033, params: {'n_neighbors': 12}",
 "mean test score: 0.980, std: 0.031, params: {'n_neighbors': 13}",
 "mean test score: 0.973, std: 0.044, params: {'n_neighbors': 14}",
 "mean test score: 0.973, std: 0.033, params: {'n_neighbo

In [23]:
# get the mean scores only
# they are stored as an array in grid.cv_results_
grid_mean_scores = grid.cv_results_['mean_test_score']
print(grid_mean_scores)

[0.96       0.95333333 0.96666667 0.96666667 0.96666667 0.96666667
 0.96666667 0.96666667 0.97333333 0.96666667 0.96666667 0.97333333
 0.98       0.97333333 0.97333333 0.97333333 0.97333333 0.98
 0.97333333 0.98       0.96666667 0.96666667 0.97333333 0.96
 0.96666667 0.96       0.96666667 0.95333333 0.95333333 0.95333333]


In [24]:
import bokeh.plotting as bp

bp.output_notebook()

# create a new plot with default (linear) axes
# note: Bokeh 3.x renamed plot_width/plot_height to width/height
p = bp.figure(title = 'Grid Search Result', 
              plot_width=600, 
              plot_height=300)

p.line(k_range, grid_mean_scores, line_width=2)
p.circle(k_range, grid_mean_scores, fill_color="white", size=8)
p.title.text_font_size = '16pt'
p.xaxis.axis_label = "Value of K for KNN"
p.yaxis.axis_label = "Mean cross-validated accuracy"
p.yaxis.axis_label_text_font_size = "14pt"
p.xaxis.axis_label_text_font_size = "14pt"

bp.show(p)

### Examine the best model

In [25]:
# Single best score achieved across all params (k)
print(grid.best_score_)

# Dictionary containing the parameters (k) used to generate that score
print(grid.best_params_)

# Actual model object fit with those best parameters
# Shows default parameters that we did not specify
print(grid.best_estimator_)

0.98
{'n_neighbors': 13}
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=None, n_neighbors=13, p=2,
           weights='uniform')
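
Because `refit=True` is the default, the fitted grid object can be used for prediction directly. The sketch below rebuilds a small search so that it runs on its own; the measurements in `new_sample` are made-up values for one hypothetical flower.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

grid = GridSearchCV(KNeighborsClassifier(),
                    param_grid={'n_neighbors': list(range(1, 31))},
                    cv=10, scoring='accuracy')  # refit=True by default
grid.fit(X, y)

# Hypothetical unseen observation:
# sepal length, sepal width, petal length, petal width (cm)
new_sample = [[3, 5, 4, 2]]

# grid.predict delegates to best_estimator_, which was refit on all of X and y
pred_from_grid = grid.predict(new_sample)
pred_from_best = grid.best_estimator_.predict(new_sample)

print(pred_from_grid, pred_from_best)
```

This avoids having to re-instantiate the model by hand with the best parameters.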


## $\Delta$ 4. Searching multiple parameters simultaneously

### Build the parameter lists

In [26]:
# define the parameter values that should be searched
k_range = list(range(1, 31))

# Another parameter besides k that we might vary is the weights parameter
# default option --> uniform (all points in the neighborhood are weighted equally)
# another option --> distance (weights closer neighbors more heavily than farther neighbors)

# create the parameter list
weight_options = ['uniform', 'distance']

### Build the parameter grid (dictionary)

In [27]:
# create a parameter grid: map the parameter names to the values that should be searched

param_grid = dict(n_neighbors=k_range, weights=weight_options)
print(param_grid)


{'weights': ['uniform', 'distance'], 'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]}


### Instantiate the grid search

In [28]:
# exhaustive grid-search because it's trying every combination
# 10-fold cross-validation is being performed 30 x 2 = 60 times

grid = GridSearchCV(estimator=knn, 
                    param_grid=param_grid, 
                    cv=10, 
                    scoring='accuracy',
                    return_train_score=True)

In [51]:
grid

GridSearchCV(cv=10, error_score='raise',
       estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=13, p=2,
           weights='uniform'),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], 'weights': ['uniform', 'distance']},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring='accuracy', verbose=0)

### Fit the grid

In [41]:
grid.fit(X, y)

GridSearchCV(cv=10, error_score='raise',
       estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=13, p=2,
           weights='uniform'),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], 'weights': ['uniform', 'distance']},
       pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
       scoring='accuracy', verbose=0)

### Get the grid search result

In [42]:
# view the complete results
grid.cv_results_

{'mean_fit_time': array([0.00051465, 0.00037193, 0.00036938, 0.00042641, 0.00049453,
        0.00042634, 0.00042639, 0.00045419, 0.00047004, 0.00045664,
        0.00035689, 0.00043957, 0.00045018, 0.00056963, 0.00053213,
        0.00052676, 0.0005502 , 0.00053499, 0.000561  , 0.00053806,
        0.0003916 , 0.00034339, 0.00038791, 0.00033269, 0.00029762,
        0.00029609, 0.00028555, 0.00031729, 0.0003001 , 0.00030963,
        0.00036497, 0.00032265, 0.00031393, 0.00031495, 0.0002753 ,
        0.00026498, 0.0002624 , 0.00028021, 0.00026476, 0.00024431,
        0.00024605, 0.0002691 , 0.0003001 , 0.00042489, 0.00300765,
        0.00068192, 0.00076694, 0.0005295 , 0.0004391 , 0.00037296,
        0.00037537, 0.00033848, 0.00035419, 0.00034513, 0.00035095,
        0.00046573, 0.00048764, 0.00106294, 0.00054934, 0.00097437]),
 'mean_score_time': array([0.0008297 , 0.00059187, 0.00064795, 0.00068326, 0.00079303,
        0.00066805, 0.0007309 , 0.00088549, 0.00074723, 0.00074217,
        0.

### Examine the best model

In [43]:
# examine the best model
print(grid.best_score_)
print(grid.best_params_)


0.98
{'n_neighbors': 13, 'weights': 'uniform'}


**Note**
> Adding the `weights` parameter did not improve the best score: it is still 0.98, reached with `weights='uniform'`

---

## $\Delta$ 5. Reducing computational expense using RandomizedSearchCV

This is a close cousin to GridSearchCV
> Searching many different parameters at once may be computationally infeasible

For example:
- Searching 4 parameters with 10 candidate values each requires 10 x 10 x 10 x 10 = 10,000 trials of CV
  - 100,000 model fits with 10-fold CV
  - 100,000 predictions with 10-fold CV
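
The cost of an exhaustive grid is the product of the number of candidate values per parameter, multiplied by the number of folds. A back-of-the-envelope sketch, assuming a hypothetical grid of 4 hyperparameters with 10 candidate values each:

```python
from math import prod

# Hypothetical grid: 4 hyperparameters, 10 candidate values each
values_per_param = [10, 10, 10, 10]
n_folds = 10

n_combinations = prod(values_per_param)   # every combination is one CV trial
n_model_fits = n_combinations * n_folds   # each trial fits one model per fold

print(n_combinations)  # 10000
print(n_model_fits)    # 100000
```

Adding a fifth parameter with 10 values would multiply both numbers by 10, which is why exhaustive search quickly becomes impractical.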

**RandomizedSearchCV** searches a subset of the parameters, and you control the computational "budget"
- You can decide how long it runs, depending on the computation time you have available

In [44]:
from sklearn.model_selection import RandomizedSearchCV

In [45]:
# specify "parameter distributions" rather than a "parameter grid"

# both parameters are discrete, so param_dist is the same as param_grid
param_dist = dict(n_neighbors=k_range, weights=weight_options)
print(param_dist)

{'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], 'weights': ['uniform', 'distance']}


> Important: Specify a continuous distribution (rather than a list of values) for any continuous parameters
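
As an illustration of distributions versus lists, the sketch below draws candidates with scikit-learn's `ParameterSampler`, which performs the same kind of sampling a randomized search does. The key `hypothetical_rate` is a made-up continuous parameter used only to show the syntax; it is not a `KNeighborsClassifier` argument and would be rejected if passed to a real search over KNN.

```python
from scipy.stats import randint, uniform
from sklearn.model_selection import ParameterSampler

param_dist = {
    'n_neighbors': randint(1, 31),            # integers 1..30 (upper bound excluded)
    'weights': ['uniform', 'distance'],       # a list is sampled uniformly
    'hypothetical_rate': uniform(0.0, 1.0),   # made-up continuous value in [0, 1]
}

# Draw 5 random parameter combinations; random_state makes the draw repeatable
samples = list(ParameterSampler(param_dist, n_iter=5, random_state=5))

for s in samples:
    print(s)
```

A list restricts the search to the listed values, while a distribution lets each draw land anywhere in the range.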

### Instantiate the RandomizedSearch 

In [46]:
# 2 new params
# n_iter --> controls number of random combinations it will try
# random_state for reproducibility 
rand = RandomizedSearchCV(knn, param_dist, cv=10, scoring='accuracy', n_iter=10, random_state=5)


### Fit the models

In [47]:
# fit
rand.fit(X, y)


RandomizedSearchCV(cv=10, error_score='raise',
          estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=13, p=2,
           weights='uniform'),
          fit_params=None, iid=True, n_iter=10, n_jobs=1,
          param_distributions={'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], 'weights': ['uniform', 'distance']},
          pre_dispatch='2*n_jobs', random_state=5, refit=True,
          return_train_score='warn', scoring='accuracy', verbose=0)

### Examine the best model

In [48]:
print(rand.best_score_)
print(rand.best_params_)
print(rand.best_estimator_)

0.98
{'weights': 'uniform', 'n_neighbors': 18}
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=18, p=2,
           weights='uniform')


In [50]:
# run RandomizedSearchCV 20 times (with n_iter=10) and record the best score

best_scores = []

for _ in range(20):
    rand = RandomizedSearchCV(knn, param_dist, cv=10, scoring='accuracy', n_iter=10)
    rand.fit(X, y)
    best_scores.append(rand.best_score_)

print(best_scores)

[0.9733333333333334, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.9733333333333334, 0.98, 0.98, 0.98, 0.98, 0.98, 0.9733333333333334, 0.98, 0.98, 0.9733333333333334, 0.98]
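
The 20 runs can be summarised by counting how often the randomized search reached the grid search's best score of 0.98. A minimal sketch, using the values printed in the output above:

```python
from collections import Counter

# Best scores from the 20 runs, copied from the printed output
best_scores = [0.9733333333333334, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98,
               0.98, 0.9733333333333334, 0.98, 0.98, 0.98, 0.98, 0.98,
               0.9733333333333334, 0.98, 0.98, 0.9733333333333334, 0.98]

# Round so that equal scores fall into the same bucket
counts = Counter(round(s, 4) for s in best_scores)
hit_rate = counts[0.98] / len(best_scores)

print(counts)
print(hit_rate)  # 0.8
```

In this particular set of runs, 16 of the 20 searches matched the exhaustive grid search while trying only 10 of the 60 combinations each time.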


## $\Delta$ 6. Parameter Search Lab

For this lab, please go to the previous ensemble_trees_lab_student notebook. 

In the previous ensemble lab, we fitted RF and GBM models on the credit risk dataset. Let's now try a hyperparameter search on this dataset and see if we can get a performance boost. 

> Note that there's always a tradeoff between the performance of the model and the cost of computation. Sometimes, the extra little performance gain may not be worth the effort.