# Grid Search 

Grid search is the process of testing different combinations of parameter values for a model and selecting the ones that produce the best results.

In this exercise, you will:
- Load data
- Make a parameter dictionary
- Initiate a GridSearch algorithm
- GridSearch your data
- Print the results

## Load Data 

Sklearn has a number of easy-to-use datasets.
- Load the "iris" dataset
- Extract the first two features of the dataset as "X"
- Extract the target of the dataset as "y"

In [2]:
from sklearn import datasets

iris = datasets.load_iris() # Load dataset as "iris"

# Keep only the first 2 features, Sepal Length and Sepal Width

# Load targets

<details>
  <summary>Hint</summary>

You did something very similar yesterday!
</details>
<details>
  <summary>View solution</summary>

```python
from sklearn import datasets

iris = datasets.load_iris() # Load dataset as "iris"

X = iris.data[:, :2] # Keep only the first 2 features, Sepal Length and Sepal Width

y = iris.target # Load targets
```
</details>

## Parameter Dictionary

The parameter dictionary defines which values of the parameter will be tested during the Grid Search.
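
To make the idea of a grid concrete, here is a minimal sketch of how a parameter dictionary expands into individual candidate settings. The grid below (`example_grid`, with `kernel` and `gamma`) is a made-up illustration and not the one used in this exercise. The expansion is done with plain `itertools.product`; sklearn offers a similar helper, `ParameterGrid`, in `sklearn.model_selection`.

```python
from itertools import product

# Hypothetical grid for illustration only: two parameters, two candidate values each
example_grid = {"kernel": ["linear", "rbf"], "gamma": [0.01, 0.1]}

# Build one dictionary per combination of values (the Cartesian product)
names = list(example_grid)
combinations = [dict(zip(names, values)) for values in product(*example_grid.values())]

for combo in combinations:
    print(combo)

print(len(combinations), "candidate settings")
```

Each printed dictionary is one setting that a grid search would train and evaluate, so the cost of the search grows with the product of the list lengths.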

Remember the parameter 'C' of an SVM, which controls how strongly misclassification errors are penalized? Make a dictionary to test 'C': [0.1, 1, 10]

In [3]:
# TODO: your code here !

<details>
  <summary>Hint</summary>

Here, we simply ask you to create a Python dictionary, like `{"some_key_1": 12, "some_key_2": 15.5, ..., "some_key_n": 7}`
</details>
<details>
  <summary>View solution</summary>

```python
param_dic = {'C': [0.1, 1, 10]} # Values of 'C' to test
```
</details>


## Initiate and Fit Grid Search

Sklearn's `GridSearchCV` trains a model for each combination of parameter values, cross-validates each one, and stores the best parameters.

It takes as arguments the Machine Learning algorithm, the parameter dictionary, the number of cross-validation folds, and the scoring metric.
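
As a rough picture of what happens under the hood, the search can be approximated by a manual loop: one cross-validated score per candidate value, then keep the best. This is only a sketch under stated assumptions, not sklearn's actual implementation. To avoid giving away the exercise, it uses a different model and dataset (`KNeighborsClassifier` on the "wine" dataset), and the candidate values for `n_neighbors` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Demo data, unrelated to the exercise
X_demo, y_demo = load_wine(return_X_y=True)

# Arbitrary candidate values for the number of neighbors (illustration only)
candidate_values = [1, 5, 15]

mean_scores = {}
for k in candidate_values:
    model = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation, scored with accuracy
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="accuracy")
    mean_scores[k] = scores.mean()

# Keep the value with the highest mean cross-validated accuracy
best_k = max(mean_scores, key=mean_scores.get)
print("Best n_neighbors:", best_k)
print("Mean accuracy:", mean_scores[best_k])
```

`GridSearchCV` wraps this kind of loop and, by default, also refits the best model on the full dataset.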

Initiate a gridsearch with the following arguments:
- A default classification SVM
- The above created parameter dictionary
- 10-Fold Cross Validation
- "accuracy" scoring metric

Then, fit to data.

In [4]:
# Your code here !


<details>
  <summary>Hint</summary>

You can read about what `GridSearchCV` expects [here](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html)! Do not hesitate to ask a TA for a quick explanation as well. A small illustration might help!
</details>
<details>
  <summary>View solution</summary>

```python
from sklearn.model_selection import GridSearchCV
from sklearn import svm

gridsearch = GridSearchCV(svm.SVC(gamma='auto'), param_dic, cv=10, scoring='accuracy')

gridsearch.fit(X, y)
```
</details>

## Print Results 

The results of the gridsearch can be unpacked.

Unpack and print:
- The best parameter value
- The best classification accuracy

In [6]:
# Your code here !


<details>
  <summary>Hint</summary>

`GridSearchCV` gives you access to the best parameters and the best score through 2 specific attributes. Find them in the docs, in the Attributes section.
</details>
<details>
  <summary>View solution</summary>

```python
print(gridsearch.best_params_)

print(gridsearch.best_score_)
```
</details>

Run the following code to unpack classification accuracy and standard deviation for each tested value of 'C'.

In [None]:
means = gridsearch.cv_results_['mean_test_score']
stds = gridsearch.cv_results_['std_test_score']

# Print the mean accuracy (+/- two standard deviations) for each tested value of 'C'
for mean, std, params in zip(means, stds, gridsearch.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r"
          % (mean, std * 2, params))

Parameter tuning is a key step of model building. Each Machine Learning algorithm has specific parameters that affect its performance.
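
To see which parameters a given algorithm exposes for tuning, every sklearn estimator provides a `get_params()` method. The short sketch below lists a few of them for a classification SVM; the selection of names printed (`C`, `kernel`, `gamma`) is just an illustrative choice.

```python
from sklearn import svm

# Dictionary of every constructor parameter and its current (default) value
default_params = svm.SVC().get_params()

# Print a few parameters that are commonly tuned
for name in ("C", "kernel", "gamma"):
    print(name, "=", default_params[name])
```

Any key of this dictionary can be used as a key in a Grid Search parameter dictionary.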