# Importance of optimizing model parameters

# Grid Search

1. It is a technique for tuning hyperparameters.
2. It builds a model for each combination of the given hyperparameter
values.
3. Each such model is evaluated and ranked.
4. The combination of hyperparameter values that gives the best-performing
model is chosen.
5. For every combination, cross-validation is used and the average score is
calculated.
6. This is an exhaustive sampling of the hyperparameter space and can be quite
inefficient.
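
The steps above can be sketched by hand with a plain loop. This is a minimal illustration, not how GridSearchCV is implemented internally; the small grid (including the `weights` parameter) and `cv=5` are assumptions chosen only to keep the example short.

```python
from itertools import product

from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data and grid; these values are assumptions for the sketch.
X, y = datasets.load_iris(return_X_y=True)
grid = {"n_neighbors": [1, 3, 5], "weights": ["uniform", "distance"]}

results = []
names = list(grid)
# Step 2: visit every combination of the listed hyperparameter values.
for values in product(*(grid[name] for name in names)):
    params = dict(zip(names, values))
    # Step 5: cross-validate this combination and average the fold scores.
    scores = cross_val_score(KNeighborsClassifier(**params), X, y, cv=5)
    results.append((scores.mean(), params))

# Steps 3-4: rank the combinations by mean score and keep the best one.
best_score, best_params = max(results, key=lambda item: item[0])
print(len(results), best_params, round(best_score, 4))
```

The number of models grows as the product of the list lengths (3 x 2 = 6 here), which is why step 6 warns about inefficiency.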

### Importing Libraries

In [3]:
from sklearn import datasets
from sklearn.metrics import accuracy_score

### Loading and splitting dataset

In [4]:
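from sklearn.model_selection import train_test_split, cross_val_score

iris = datasets.load_iris()
X = iris.data[:, 2:]  # keep only the last two features: petal length and petal width
y = iris.target

# Stratified split so each class keeps the same proportion in train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=7)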

### Creating model

In [5]:
from sklearn.neighbors import KNeighborsClassifier

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_train)

KNeighborsClassifier()

### Creating parameter grid

In [10]:
param_grid = {'n_neighbors': list(range(1, 9)),                          # k from 1 to 8
              'algorithm': ('auto', 'ball_tree', 'kd_tree', 'brute')}   # four neighbor-search algorithms
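
To see how large this grid is before fitting, the combinations can be enumerated with `ParameterGrid` from scikit-learn. The sketch below assumes the 10-fold cross-validation used in the next cell; the variable names `combos`, `n_combos` and `n_fits` are illustrative.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {'n_neighbors': list(range(1, 9)),
              'algorithm': ('auto', 'ball_tree', 'kd_tree', 'brute')}

combos = list(ParameterGrid(param_grid))
n_combos = len(combos)    # 8 values of n_neighbors x 4 algorithms
n_fits = n_combos * 10    # one fit per combination per fold, assuming cv=10
print(n_combos, n_fits)   # 32 320

# Keys are iterated in sorted order, so 'algorithm' varies slowest.
print(combos[0], combos[-1])
```

With the default `refit=True`, GridSearchCV also fits the best combination once more on the whole training set after the search.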

In [11]:
from sklearn.model_selection import GridSearchCV

# Evaluate every combination in param_grid with 10-fold cross-validation
gs = GridSearchCV(knn_clf, param_grid, cv=10)

In [12]:
gs.fit(X_train, y_train)

GridSearchCV(cv=10, estimator=KNeighborsClassifier(),
             param_grid={'algorithm': ('auto', 'ball_tree', 'kd_tree', 'brute'),
                         'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8]})

In [13]:
gs.best_params_

{'algorithm': 'brute', 'n_neighbors': 1}

best_params_ returns the best combination of parameters found by the search. Here the best algorithm is "brute" and the best number of neighbors is 1.
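
The cross-validated score only uses the training data, so a natural follow-up is to check the selected model on the held-out test set. The sketch below repeats the setup from the cells above so it runs on its own; the names `best_knn`, `y_pred` and `test_accuracy` are illustrative, and the exact numbers may vary with the scikit-learn version.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X, y = iris.data[:, 2:], iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=7)

param_grid = {'n_neighbors': list(range(1, 9)),
              'algorithm': ('auto', 'ball_tree', 'kd_tree', 'brute')}
gs = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10)
gs.fit(X_train, y_train)

# best_estimator_ is the model refit on all of X_train with best_params_
best_knn = gs.best_estimator_
y_pred = best_knn.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

print(gs.best_params_)   # best combination found by the search
print(gs.best_score_)    # mean cross-validated score of that combination
print(test_accuracy)     # accuracy on data the search never saw
```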

In [14]:
gs.cv_results_['params']

[{'algorithm': 'auto', 'n_neighbors': 1},
 {'algorithm': 'auto', 'n_neighbors': 2},
 {'algorithm': 'auto', 'n_neighbors': 3},
 {'algorithm': 'auto', 'n_neighbors': 4},
 {'algorithm': 'auto', 'n_neighbors': 5},
 {'algorithm': 'auto', 'n_neighbors': 6},
 {'algorithm': 'auto', 'n_neighbors': 7},
 {'algorithm': 'auto', 'n_neighbors': 8},
 {'algorithm': 'ball_tree', 'n_neighbors': 1},
 {'algorithm': 'ball_tree', 'n_neighbors': 2},
 {'algorithm': 'ball_tree', 'n_neighbors': 3},
 {'algorithm': 'ball_tree', 'n_neighbors': 4},
 {'algorithm': 'ball_tree', 'n_neighbors': 5},
 {'algorithm': 'ball_tree', 'n_neighbors': 6},
 {'algorithm': 'ball_tree', 'n_neighbors': 7},
 {'algorithm': 'ball_tree', 'n_neighbors': 8},
 {'algorithm': 'kd_tree', 'n_neighbors': 1},
 {'algorithm': 'kd_tree', 'n_neighbors': 2},
 {'algorithm': 'kd_tree', 'n_neighbors': 3},
 {'algorithm': 'kd_tree', 'n_neighbors': 4},
 {'algorithm': 'kd_tree', 'n_neighbors': 5},
 {'algorithm': 'kd_tree', 'n_neighbors': 6},
 {'algorithm': 'kd

The best accuracy score is obtained at the 25th combination, where algorithm is "brute" and n_neighbors is 1.

### Checking scores

In [15]:
gs.cv_results_['mean_test_score']

array([0.9469697 , 0.93787879, 0.95530303, 0.93787879, 0.95530303,
       0.9469697 , 0.95530303, 0.95530303, 0.9469697 , 0.93787879,
       0.95530303, 0.93787879, 0.95530303, 0.9469697 , 0.95530303,
       0.95530303, 0.9469697 , 0.93787879, 0.95530303, 0.93787879,
       0.95530303, 0.9469697 , 0.95530303, 0.95530303, 0.96515152,
       0.92878788, 0.95530303, 0.93787879, 0.95530303, 0.95530303,
       0.96439394, 0.95530303])

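These are the 32 mean cross-validation scores, one per combination. The highest is about 0.9652, at the 25th combination (algorithm "brute", n_neighbors 1). The 31st combination (algorithm "brute", n_neighbors 7) is a close second at about 0.9644.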
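
Rather than reading the array by eye, the position of the best score can be located with `np.argmax`. The sketch below copies the scores printed above into a literal array so that it runs on its own; on a fitted search object the same index is available as `gs.best_index_`.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Scores copied from the gs.cv_results_['mean_test_score'] output above
mean_test_score = np.array([
    0.9469697, 0.93787879, 0.95530303, 0.93787879, 0.95530303,
    0.9469697, 0.95530303, 0.95530303, 0.9469697, 0.93787879,
    0.95530303, 0.93787879, 0.95530303, 0.9469697, 0.95530303,
    0.95530303, 0.9469697, 0.93787879, 0.95530303, 0.93787879,
    0.95530303, 0.9469697, 0.95530303, 0.95530303, 0.96515152,
    0.92878788, 0.95530303, 0.93787879, 0.95530303, 0.95530303,
    0.96439394, 0.95530303])

# Same grid as above, enumerated in the order GridSearchCV evaluates it
params = list(ParameterGrid({'n_neighbors': list(range(1, 9)),
                             'algorithm': ('auto', 'ball_tree', 'kd_tree', 'brute')}))

best_index = int(np.argmax(mean_test_score))   # zero-based position of the best score
print(best_index + 1)                          # one-based position, as counted in the text
print(params[best_index], mean_test_score[best_index])
```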