# Grid Search Walkthrough
We will learn about ```grid search``` and how to use it in scikit-learn.
In machine learning, two tasks are commonly done at the same time in data pipelines: cross validation and (hyper)parameter tuning.
<br>
<br>
**Cross validation** is the process of training a learner on one set of data and testing it on a different set.
<br>
<br>**Parameter tuning** is the process of selecting the values for a model’s parameters that maximize the accuracy of the model.

In this tutorial we work through an example combining cross validation and parameter tuning using scikit-learn.
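
Before combining the two tasks, here is a minimal sketch of cross validation on its own. It assumes a support vector classifier with fixed, untuned parameters (`kernel='linear'`, `C=1`) and uses `cross_val_score`, which is not part of the walkthrough itself; both choices are for illustration only.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

# Load the digits data
digits = datasets.load_digits()

# Assumption: a classifier with fixed, untuned parameters (illustration only)
model = svm.SVC(kernel='linear', C=1)

# Split the data into 5 folds; each fold is held out once for testing
# while the model is trained on the other 4 folds
scores = cross_val_score(model, digits.data, digits.target, cv=5)

print('Accuracy per fold:', scores)
print('Mean accuracy:', scores.mean())
```

Parameter tuning then amounts to repeating this procedure for each candidate parameter setting and keeping the setting with the best mean score.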

In [3]:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn import datasets, svm
import matplotlib.pyplot as plt

### Create Two Datasets
In the code below, we load the ```digits dataset```, which contains 64 feature variables. Each feature denotes the darkness of a pixel in an 8 by 8 image of a handwritten digit. Let's look at these features.

In [4]:
# Load the digits data
digits = datasets.load_digits()

In [5]:
# View the features of the first observation
digits.data[0:1]

array([[ 0.,  0.,  5., 13.,  9.,  1.,  0.,  0.,  0.,  0., 13., 15., 10.,
        15.,  5.,  0.,  0.,  3., 15.,  2.,  0., 11.,  8.,  0.,  0.,  4.,
        12.,  0.,  0.,  8.,  8.,  0.,  0.,  5.,  8.,  0.,  0.,  9.,  8.,
         0.,  0.,  4., 11.,  0.,  1., 12.,  7.,  0.,  0.,  2., 14.,  5.,
        10., 12.,  0.,  0.,  0.,  0.,  6., 13., 10.,  0.,  0.,  0.]])
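
To make the link between the 64 features and the 8 by 8 image concrete, the following sketch reshapes the first observation into a grid and compares it with ```digits.images```, which holds the same values already shaped as images. The variable name `first_image` is introduced here for illustration.

```python
import numpy as np
from sklearn import datasets

# Load the digits data
digits = datasets.load_digits()

# Reshape the 64 features of the first observation into an 8 by 8 grid
first_image = digits.data[0].reshape(8, 8)
print(first_image)

# digits.images stores the same pixel values, already shaped as 8 by 8 arrays
print('Matches digits.images[0]:', np.array_equal(first_image, digits.images[0]))

# The label (the digit that was written) for this observation
print('Label:', digits.target[0])
```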

To demonstrate cross validation and parameter tuning, we first divide the digits data into two datasets called ```data1``` and ```data2```. ```data1``` contains the first 1000 rows of the digits data, while ```data2``` contains the remaining 797 rows.

In [6]:
# Create dataset 1
data1_features = digits.data[:1000]
data1_target = digits.target[:1000]

# Create dataset 2
data2_features = digits.data[1000:]
data2_target = digits.target[1000:]
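
As a quick check on the split, the sketch below prints the shape of each dataset and the number of examples per digit. The use of `np.bincount` is an addition for illustration and is not part of the original walkthrough.

```python
import numpy as np
from sklearn import datasets

# Load the digits data and split it exactly as above
digits = datasets.load_digits()
data1_features = digits.data[:1000]
data1_target = digits.target[:1000]
data2_features = digits.data[1000:]
data2_target = digits.target[1000:]

# Each dataset has one row per image and 64 feature columns
print(data1_features.shape)  # (1000, 64)
print(data2_features.shape)  # (797, 64)

# Count how many examples of each digit (0 to 9) land in each dataset
print('data1 class counts:', np.bincount(data1_target, minlength=10))
print('data2 class counts:', np.bincount(data2_target, minlength=10))
```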

### Create Parameter Candidates
Before looking for the combination of parameter values that produces the most accurate model, we must specify the candidate values we want to try. The code below defines four values for **C (1, 10, 100, 1000)**, two values for **gamma (0.001, 0.0001)**, and two kernels **(linear, rbf)**. The candidates are given as two dictionaries because gamma only applies to the rbf kernel: the first dictionary yields 4 linear candidates and the second yields 8 rbf candidates. The grid search tries every combination within each dictionary (12 in total) and selects the set of parameters that provides the most accurate model.

In [8]:
parameter_candidates = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]
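
To see exactly which combinations the search will evaluate, one option is scikit-learn's `ParameterGrid`, which expands the same list of dictionaries. This is a sketch for inspection only; the walkthrough itself passes `parameter_candidates` straight to the grid search.

```python
from sklearn.model_selection import ParameterGrid

parameter_candidates = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

# Expand the list of dictionaries into individual parameter settings
grid = ParameterGrid(parameter_candidates)

# 4 linear candidates + (4 x 2) rbf candidates
print('Number of candidates:', len(grid))  # 12

for params in grid:
    print(params)
```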

### Conduct Grid Search To Find Parameters Producing Highest Score

Now we are ready to conduct the grid search using scikit-learn’s ```GridSearchCV```, which stands for grid search cross validation. By default, ```GridSearchCV``` uses 5-fold cross validation in scikit-learn 0.22 and later (earlier versions used 3-fold). It uses ```StratifiedKFold``` when the estimator is a classifier and the target is binary or multiclass, and ```KFold``` otherwise.

In [9]:
# Create a classifier object with the classifier and parameter candidates
clf = GridSearchCV(estimator=svm.SVC(), param_grid=parameter_candidates, n_jobs=-1)

# Train the classifier on data1's feature and target data
clf.fit(data1_features, data1_target)   

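First, let's look at the best score the grid search found on ```data1```. Note that ```best_score_``` is the mean cross-validated accuracy of the best parameter combination, not the accuracy of the final model on all of ```data1```.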

In [10]:
# View the accuracy score
print('Best score for data1:', clf.best_score_) 

Best score for data1: 0.966
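
The best score is only one entry in the search results. The sketch below lists the mean cross-validated score of every candidate through the `cv_results_` attribute. It sets `cv=3` explicitly (an assumption made here so the fold count does not depend on the installed scikit-learn version) and leaves out `n_jobs`, so the exact numbers may differ from the output above.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Recreate data1 and the parameter candidates
digits = datasets.load_digits()
data1_features = digits.data[:1000]
data1_target = digits.target[:1000]

parameter_candidates = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

# Assumption: cv=3 is fixed explicitly for this illustration
clf = GridSearchCV(estimator=svm.SVC(), param_grid=parameter_candidates, cv=3)
clf.fit(data1_features, data1_target)

# One row per candidate: mean and standard deviation of the fold scores
results = clf.cv_results_
for mean, std, params in zip(results['mean_test_score'],
                             results['std_test_score'],
                             results['params']):
    print(f'{mean:.3f} (+/- {std:.3f}) for {params}')

# best_score_ is the highest mean cross-validated score among the candidates
print(np.isclose(clf.best_score_, results['mean_test_score'].max()))
```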


Which parameters are the best? We can have scikit-learn tell us.

In [11]:
# View the best parameters for the model found using grid search
print('Best C:', clf.best_estimator_.C)
print('Best Kernel:', clf.best_estimator_.kernel)
print('Best Gamma:', clf.best_estimator_.gamma)

Best C: 10
Best Kernel: rbf
Best Gamma: 0.001
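
The same information is also available as a single dictionary through the `best_params_` attribute, which contains only the parameters that were part of the winning candidate. The sketch below refits the search so that it is self-contained; the result may differ from the output above depending on the scikit-learn version and its default number of folds.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Recreate data1 and the parameter candidates
digits = datasets.load_digits()
data1_features = digits.data[:1000]
data1_target = digits.target[:1000]

parameter_candidates = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

clf = GridSearchCV(estimator=svm.SVC(), param_grid=parameter_candidates)
clf.fit(data1_features, data1_target)

# Dictionary holding only the searched parameters of the best candidate
print('Best parameters:', clf.best_params_)

# The same values are set on the refitted best estimator
print('Best C:', clf.best_estimator_.C)
print('Best Kernel:', clf.best_estimator_.kernel)
```

If a linear kernel had won, `best_params_` would contain no gamma entry, whereas `best_estimator_.gamma` would simply report the `SVC` default.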


### Sanity Check Using Second Dataset
We will now use the second dataset to check that the classifier returned by the grid search really uses these parameters. First, we apply the classifier we just trained to the second dataset. Then, we train a *new* support vector classifier from scratch using the parameters found by the grid search. If both give the same score on ```data2```, the grid search classifier is using the best parameters.

In [12]:
# Apply the classifier trained using data1 to data2, and view the accuracy score
clf.score(data2_features, data2_target)  

0.9698870765370138

In [13]:
# Train a new classifier using the best parameters found by the grid search
new_clf = svm.SVC(C=10, kernel='rbf', gamma=0.001)
new_clf.fit(data1_features, data1_target)
new_clf.score(data2_features, data2_target)

0.9698870765370138
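
The two scores match because ```GridSearchCV``` refits the best parameter combination on all of the training data by default (`refit=True`), and its `score` method delegates to that refitted model. The sketch below makes this explicit; it passes `best_params_` to the new classifier instead of hard-coding the values, which is a small variation on the cell above, and the names `grid_score`, `best_estimator_score`, and `new_score` are introduced for illustration.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

# Recreate both datasets and the parameter candidates
digits = datasets.load_digits()
data1_features, data1_target = digits.data[:1000], digits.target[:1000]
data2_features, data2_target = digits.data[1000:], digits.target[1000:]

parameter_candidates = [
  {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
  {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

# refit=True is the default: the best candidate is retrained on all of data1
clf = GridSearchCV(estimator=svm.SVC(), param_grid=parameter_candidates)
clf.fit(data1_features, data1_target)

# Score of the grid search object and of its refitted best estimator
grid_score = clf.score(data2_features, data2_target)
best_estimator_score = clf.best_estimator_.score(data2_features, data2_target)

# Score of a fresh classifier built from the best parameters
new_clf = svm.SVC(**clf.best_params_).fit(data1_features, data1_target)
new_score = new_clf.score(data2_features, data2_target)

print(grid_score, best_estimator_score, new_score)
```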