
# Grid Search


---

## Learning Objectives
By the end of this lesson students will be able to:

- Explain what grid search is
- Use the `GridSearchCV` class from sklearn to find optimal hyperparameters
- Differentiate `cross_val_score` from `GridSearchCV`

---

## GridSearchCV
GridSearchCV is a nifty sklearn class. 😀 

It runs cross validation for every combination of hyperparameter values in a grid you define.

It replaces the slower, more verbose approach of writing a `for` loop around `cross_val_score`.

Using `GridSearchCV` is generally the best way to optimize hyperparameters when the grid is small enough to search exhaustively.

## Hyperparameters vs. parameters

- __Definition 1 of `parameters`__: a function "defines a parameter, and the calling code passes an argument to that parameter. You can think of the parameter as a parking space and the argument as an automobile." - Quoted from MSDN in [this SO question](https://stackoverflow.com/q/1788923/4590385).

The values you pass to a function are called `arguments`. The terms _argument_ and _parameter_ are often used interchangeably.

- __Definition 2 of `parameters`__: the weights in a model. For example, the $ \beta $ values in a linear regression equation. These are the model's parameters.

- `hyperparameters` are the arguments YOU CHOOSE to pass to a transformer or estimator. You tune these to improve model performance. For example, the most important hyperparameter for a scikit-learn Ridge regression model is `alpha`. 
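To make the distinction concrete, here is a minimal sketch. The synthetic data and the names `X_demo`, `y_demo`, and `ridge` are assumptions for illustration only; they are not part of the lesson's dataset.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (an assumption for illustration): y = 3*x + 2 plus noise
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 5, size=(200, 1))
y_demo = 3 * X_demo[:, 0] + 2 + rng.normal(scale=0.5, size=200)

# alpha is a hyperparameter: YOU choose it before fitting
ridge = Ridge(alpha=10.0)
ridge.fit(X_demo, y_demo)

# coef_ and intercept_ are parameters: the model learns them from the data
print("hyperparameter alpha:", ridge.alpha)
print("learned coefficient:", ridge.coef_)
print("learned intercept:", ridge.intercept_)
```

Changing `alpha` and refitting changes the learned coefficient, which is why tuning hyperparameters matters.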


### Just remember: YOU choose the hyperparameters.

---
## GridSearchCV

#### Imports

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# import sklearn classes and functions
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

#### Read in the data

We'll use the diamonds dataset. We want to predict `price`.


In [4]:
diamonds = pd.read_csv('data/diamonds.csv')
diamonds.head(2)

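|   | carat | cut | color | clarity | depth | table | price | x | y | z |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.23 | Ideal | E | SI2 | 61.5 | 55.0 | 326 | 3.95 | 3.98 | 2.43 |
| 1 | 0.21 | Premium | E | SI1 | 59.8 | 61.0 | 326 | 3.89 | 3.84 | 2.31 |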


#### Inspect 

In [5]:
diamonds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53940 entries, 0 to 53939
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   carat    53940 non-null  float64
 1   cut      53940 non-null  object 
 2   color    53940 non-null  object 
 3   clarity  53940 non-null  object 
 4   depth    53940 non-null  float64
 5   table    53940 non-null  float64
 6   price    53940 non-null  int64  
 7   x        53940 non-null  float64
 8   y        53940 non-null  float64
 9   z        53940 non-null  float64
dtypes: float64(6), int64(1), object(3)
memory usage: 4.1+ MB


`price` is what we want to predict.

#### Break into X and y

Use `carat` to predict `price`

In [6]:
X = diamonds[['carat']]
y = diamonds['price']

### Create holdout/test set and training/validation set with `train_test_split`

In [7]:
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=22)

Does this step shuffle the rows? Yes: `train_test_split` shuffles by default (`shuffle=True`).

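## Add Polynomial Features, Scale Them, Build a Ridge Model

Without a pipeline, each transformer needs three manual steps:

- Instantiate
- Fit and transform `X_train`
- Transform `X_test`

A `Pipeline` chains the transformers and the model so these steps run in order whenever the pipeline is fit or used to predict.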

In [8]:
pipe = Pipeline([('polys', PolynomialFeatures()),
                ('scale', StandardScaler()),
                ('model', Ridge())])

In [9]:
# np.logspace(0, 3) gives 50 log-spaced values from 1 to 1000
params = {'polys__degree': [1, 2, 3],
          'model__alpha': np.logspace(0, 3)}
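The keys in this grid follow scikit-learn's `<step name>__<argument>` convention for pipelines. As a quick sketch, you can list the names a pipeline accepts with `get_params()`. The name `pipe_demo` is only for illustration; it mirrors the pipeline above.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Same three steps as the lesson's pipeline
pipe_demo = Pipeline([('polys', PolynomialFeatures()),
                      ('scale', StandardScaler()),
                      ('model', Ridge())])

# Arguments of each step are exposed as '<step name>__<argument name>'
tunable = sorted(pipe_demo.get_params().keys())
print([name for name in tunable if '__' in name])
```

Any name in that list can be used as a key in the parameter grid.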

In [10]:
grid = GridSearchCV(estimator=pipe, param_grid=params, scoring='neg_mean_squared_error')

In [11]:
grid.fit(X_train, y_train)

In [12]:
#results
pd.DataFrame(grid.cv_results_).head()

|   | `mean_fit_time` | `std_fit_time` | `mean_score_time` | `std_score_time` | `param_model__alpha` | `param_polys__degree` | `params` | `split0_test_score` | `split1_test_score` | `split2_test_score` | `split3_test_score` | `split4_test_score` | `mean_test_score` | `std_test_score` | `rank_test_score` |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.008347 | 0.002756 | 0.003935 | 0.003193 | 1.0 | 1 | `{'model__alpha': 1.0, 'polys__degree': 1}` | -2461911.0 | -2375161.0 | -2351449.0 | -2505760.0 | -2394301.0 | -2417716.0 | 57368.053655 | 101 |
| 1 | 0.006329 | 0.001828 | 0.001258 | 0.000142 | 1.0 | 2 | `{'model__alpha': 1.0, 'polys__degree': 2}` | -2441547.0 | -2348841.0 | -2314993.0 | -2530921.0 | -2379629.0 | -2403186.0 | 76255.1428 | 51 |
| 2 | 0.006545 | 0.000995 | 0.001667 | 0.000506 | 1.0 | 3 | `{'model__alpha': 1.0, 'polys__degree': 3}` | -2204842.0 | -2120799.0 | -2088249.0 | -2356031.0 | -2147074.0 | -2183399.0 | 94396.560963 | 30 |
| 3 | 0.006051 | 0.001144 | 0.001333 | 0.000373 | 1.151395 | 1 | `{'model__alpha': 1.1513953993264474, 'polys__d...` | -2461912.0 | -2375160.0 | -2351451.0 | -2505759.0 | -2394300.0 | -2417716.0 | 57367.628448 | 100 |
| 4 | 0.005954 | 0.000933 | 0.001377 | 0.000435 | 1.151395 | 2 | `{'model__alpha': 1.1513953993264474, 'polys__d...` | -2441547.0 | -2348836.0 | -2314985.0 | -2530934.0 | -2379630.0 | -2403186.0 | 76261.781925 | 52 |


In [13]:
#best_model
grid.best_estimator_

In [23]:
grid.best_params_

{'model__alpha': 25.59547922699536, 'polys__degree': 3}

In [25]:
np.sqrt(-grid.best_score_)

1476.1564662369121

#### Say you wanted to make a Lasso model and you want to search for a good value for the hyperparameter `alpha`. How would you do that with `cross_val_score`?
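One way you might do it is sketched below: loop over candidate alphas, cross-validate each one, and keep the best mean score. The synthetic data, the alpha values, and the names `X_demo`, `y_demo`, and `mean_scores` are assumptions for illustration; swap in `X_train` and `y_train` to try it on the diamonds data.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the training data (an assumption, not the diamonds data)
rng = np.random.default_rng(22)
X_demo = rng.uniform(0.2, 3.0, size=(300, 1))
y_demo = 4000 * X_demo[:, 0] + rng.normal(scale=500, size=300)

alphas = [0.1, 1, 10, 100]
mean_scores = {}
for alpha in alphas:
    # 5-fold cross validation for this single value of alpha
    scores = cross_val_score(Lasso(alpha=alpha), X_demo, y_demo, cv=5,
                             scoring='neg_mean_squared_error')
    mean_scores[alpha] = scores.mean()

# Scores are negative MSE, so the largest (closest to zero) is best
best_alpha = max(mean_scores, key=mean_scores.get)
print(best_alpha, mean_scores[best_alpha])
```

This works, but you have to track the scores yourself and refit the winning model by hand. With two or more hyperparameters you would also need nested loops.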


---
## GridSearchCV 🚀

- `GridSearchCV` cross-validates one model per hyperparameter combination, using the data you pass to `fit`.
- It keeps the best-performing combination and, by default (`refit=True`), refits that model on all the data you passed to `fit`.
- You treat it like an estimator.

`GridSearchCV` accepts a scikit-learn `estimator` object and a **parameter grid**.

- The param grid is a dictionary. 
- The key is the name of the hyperparameter argument in scikit-learn.  
- The value is an iterable to search over (generally a list or a range-style object).
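A small sketch of what a parameter grid expands to. The grid below is a made-up example, and `ParameterGrid` is the scikit-learn helper that enumerates the combinations a grid describes.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

param_grid_demo = {
    'alpha': np.logspace(-2, 2, 5),    # a NumPy array is an iterable
    'fit_intercept': [True, False],    # so is a list
}

# ParameterGrid expands the dictionary into every combination:
# 5 alphas x 2 intercept settings = 10 candidates
candidates = list(ParameterGrid(param_grid_demo))
print(len(candidates))
print(candidates[0])
```

Each extra key multiplies the number of candidates, so grids grow quickly.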

#### Q: What's an iterable?

#### Let's use `GridSearchCV` with a Lasso model and different values for alpha.

Note: You could get the same results with LassoCV, but GridSearchCV can be nicely combined with many algorithms and Pipelines, so I suggest sticking with GridSearchCV. 

#### Set up a parameter grid with several values for alpha

#### Instantiate a GridSearchCV object by passing it an estimator and a param_grid.
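A minimal sketch of these two steps. The alpha range and the names `lasso_params` and `lasso_grid` are illustrative choices, not values from the lesson.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV

# Candidate alphas; this range is an arbitrary choice for illustration
lasso_params = {'alpha': np.logspace(-1, 3, 20)}

# Pass the estimator and the grid; nothing is fit yet
lasso_grid = GridSearchCV(estimator=Lasso(),
                          param_grid=lasso_params,
                          scoring='neg_mean_squared_error',
                          cv=5)
print(lasso_grid)
```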

### We use this `GridSearchCV` object like an estimator: fit it, predict with it, and score it as usual. 🙂

#### Fit it on the training data

#### Score on the training data

#### See all the results of training

#### What were the best params?
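The four steps above might look like the sketch below. It runs on synthetic data so that it is self-contained; `X_demo`, `y_demo`, `X_tr`, `y_tr`, and `lasso_grid` are assumed names, and the alpha range is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for X and y (an assumption, not the diamonds data)
rng = np.random.default_rng(22)
X_demo = pd.DataFrame({'carat': rng.uniform(0.2, 3.0, size=400)})
y_demo = 4000 * X_demo['carat'] + rng.normal(scale=500, size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=22)

lasso_grid = GridSearchCV(Lasso(), {'alpha': np.logspace(-1, 3, 20)},
                          scoring='neg_mean_squared_error')

# Fit: cross-validates every alpha, then refits the best one on all of X_tr
lasso_grid.fit(X_tr, y_tr)

# Score on the training data (uses the `scoring` metric, here negative MSE)
print(lasso_grid.score(X_tr, y_tr))

# All results: one row per candidate alpha
results = pd.DataFrame(lasso_grid.cv_results_)
print(results[['param_alpha', 'mean_test_score', 'rank_test_score']].head())

# Best hyperparameters and their mean cross-validated score
print(lasso_grid.best_params_, lasso_grid.best_score_)
```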

#### Make predictions for the test set

#### Score with the MAE, MSE, and RMSE on test set

#### Score the best model on the test data with the default scoring metric
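A sketch of the test-set steps, again on synthetic data with assumed names (`X_demo`, `y_demo`, `lasso_grid`). No `scoring` argument is passed here, so `score` falls back to the estimator's default metric, which is R² for regressors.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for X and y (an assumption, not the diamonds data)
rng = np.random.default_rng(22)
X_demo = rng.uniform(0.2, 3.0, size=(400, 1))
y_demo = 4000 * X_demo[:, 0] + rng.normal(scale=500, size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=22)

lasso_grid = GridSearchCV(Lasso(), {'alpha': np.logspace(-1, 3, 20)})
lasso_grid.fit(X_tr, y_tr)

# predict() delegates to the refit best estimator
preds = lasso_grid.predict(X_te)

mae = mean_absolute_error(y_te, preds)
mse = mean_squared_error(y_te, preds)
rmse = np.sqrt(mse)  # RMSE is in the same units as y
print(mae, mse, rmse)

# Default scoring metric for a regressor: R^2
r2 = lasso_grid.score(X_te, y_te)
print(r2)
```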

---
# Exercise

With the same X and y, use GridSearchCV with Ridge and several values of alpha. To try to speed things up by using more of your computer's processor cores, pass `n_jobs=-1`.

---
# Summary

You've seen `GridSearchCV` in action. 🚀

It helps you find good hyperparameters for your models. 😎

#### When would you not use GridSearchCV? 

When fitting every combination takes too long. `RandomizedSearchCV` and other scikit-learn search classes can serve you better in those cases.
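As a rough sketch of the alternative, `RandomizedSearchCV` samples a fixed number of candidates instead of trying every one. The synthetic data, the log-uniform range for `alpha`, and `n_iter=10` are assumptions for illustration.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption, not the diamonds data)
rng = np.random.default_rng(22)
X_demo = rng.uniform(0.2, 3.0, size=(300, 1))
y_demo = 4000 * X_demo[:, 0] + rng.normal(scale=500, size=300)

# Sample 10 alphas from a log-uniform distribution instead of a full grid
search = RandomizedSearchCV(Ridge(),
                            param_distributions={'alpha': loguniform(1e-2, 1e3)},
                            n_iter=10,
                            scoring='neg_mean_squared_error',
                            random_state=22)
search.fit(X_demo, y_demo)
print(search.best_params_)
```

The cost is set by `n_iter` rather than by the size of the grid, which helps when there are many hyperparameters.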

## Check for understanding

- Why would you want to use `GridSearchCV`?
- What do you pass `GridSearchCV`?
- How do you specify the parameter grid?
- How do you get the results of fitting the models?

## Challenge questions
- Does `GridSearchCV` randomize the data for cross validation? 
- How do you parallelize the grid search so that multiple models are fit simultaneously on your processor cores? 

`GridSearchCV` is an extremely powerful tool for your toolkit! 🛠
