# **Hyperparameter Tuning:** Model Tuning
Hyperparameter tuning is the search for the set of hyperparameter values that gives a learning algorithm its best performance on a given data set. Hyperparameters are set before training rather than learned from the data. The best combination is the one that maximizes a chosen score (or, equivalently, minimizes a predefined loss function), which is usually estimated with cross-validation.
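
As a small illustration of what counts as a hyperparameter, the sketch below lists the settings that `KNeighborsClassifier` accepts before it sees any data. The variable names `knn` and `hyperparams` are only illustrative.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hyperparameters are fixed before fitting; get_params() returns them
# together with their current (here: default) values.
knn = KNeighborsClassifier()
hyperparams = knn.get_params()

for name, value in sorted(hyperparams.items()):
    print(f"{name}: {value}")
```

Every key printed here is a candidate for tuning. The rest of this notebook searches over four of them.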

In [24]:
import numpy as np
import pandas as pd
import seaborn as sns

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
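# Required before importing HalvingGridSearchCV / HalvingRandomSearchCV (experimental API)
from sklearn.experimental import enable_halving_search_cv  # noqa: F401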

In [2]:
df = sns.load_dataset("iris")
df.head()

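|  | sepal_length | sepal_width | petal_length | petal_width | species |
| --- | --- | --- | --- | --- | --- |
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |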


In [3]:
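# Features: only the first two columns (sepal_length, sepal_width)
X = df.iloc[:, :2]
# Target: the last column (species)
y = df.iloc[:, -1]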

## Model Evaluation using Cross Validation

In [13]:
from sklearn.model_selection import cross_val_score
KNN = KNeighborsClassifier()
cv = cross_val_score(KNN, X, y, cv=10, scoring="accuracy")
np.mean(cv)

0.76
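
A single mean hides how much the folds disagree, and the baseline above uses only two of the four iris features. The sketch below reports the per-fold spread and compares the two-feature subset with all four features. It assumes scikit-learn's bundled `load_iris` as a stand-in for `sns.load_dataset("iris")` so that it runs without a download; the names `X_two`, `X_all`, `scores_two` and `scores_all` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's copy of iris replaces the seaborn copy used above.
X_all, y = load_iris(return_X_y=True)
X_two = X_all[:, :2]  # sepal length and sepal width only

# 10-fold cross-validated accuracy for both feature sets
scores_two = cross_val_score(KNeighborsClassifier(), X_two, y, cv=10, scoring="accuracy")
scores_all = cross_val_score(KNeighborsClassifier(), X_all, y, cv=10, scoring="accuracy")

print(f"two features: mean={scores_two.mean():.3f}, std={scores_two.std():.3f}")
print(f"all features: mean={scores_all.mean():.3f}, std={scores_all.std():.3f}")
```

The gap between the two rows suggests that the feature subset, not the hyperparameters, is what limits this baseline.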

---
## **Model Hyperparameters**: KNN

In [22]:
# Parameters for KNN Classifier
para_dict = {
    "n_neighbors": [3,5,10,15],
    "weights": ["uniform", "distance"],
    "algorithm": ["ball_tree", "kd_tree", "brute"],
    "leaf_size": [20,30,40,50]
}

print(para_dict)

{'n_neighbors': [3, 5, 10, 15], 'weights': ['uniform', 'distance'], 'algorithm': ['ball_tree', 'kd_tree', 'brute'], 'leaf_size': [20, 30, 40, 50]}
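
Before running a search it helps to know how large the grid is. The sketch below counts the combinations in `para_dict` with `ParameterGrid`; the names `grid`, `n_candidates` and `cv_folds` are illustrative.

```python
from sklearn.model_selection import ParameterGrid

para_dict = {
    "n_neighbors": [3, 5, 10, 15],
    "weights": ["uniform", "distance"],
    "algorithm": ["ball_tree", "kd_tree", "brute"],
    "leaf_size": [20, 30, 40, 50],
}

# ParameterGrid expands the dict into every combination: 4 * 2 * 3 * 4
grid = ParameterGrid(para_dict)
n_candidates = len(grid)

cv_folds = 5  # the value passed as cv= in the searches below
print(f"candidates: {n_candidates}")
print(f"model fits for an exhaustive 5-fold search: {n_candidates * cv_folds}")
```

An exhaustive search also refits the best configuration once on the full data when `refit=True`, which is the default.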


---
## Model Tuning using **GridSearchCV**

In [19]:
from sklearn.model_selection import GridSearchCV

KNN = KNeighborsClassifier()
KNN_grid = GridSearchCV(
    estimator=KNN,
    param_grid=para_dict,
    cv=5,
    # verbose=2,
    n_jobs=-1,
)

KNN_grid.fit(X,y)

print(f"BEST PARAMETERS ARE: \n{KNN_grid.best_params_}")
print(f"\n BEST SCORE IS: {KNN_grid.best_score_}")

BEST PARAMETERS ARE: 
{'algorithm': 'ball_tree', 'leaf_size': 20, 'n_neighbors': 15, 'weights': 'uniform'}

 BEST SCORE IS: 0.7933333333333333
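
`best_params_` shows only the winner. The sketch below ranks every candidate through `cv_results_`, which makes ties and near-ties visible. It assumes `load_iris` as a stand-in for the seaborn data set, and the names `search`, `results` and `top` are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's copy of iris, first two features as in this notebook.
X_all, y = load_iris(return_X_y=True)
X = X_all[:, :2]

para_dict = {
    "n_neighbors": [3, 5, 10, 15],
    "weights": ["uniform", "distance"],
    "algorithm": ["ball_tree", "kd_tree", "brute"],
    "leaf_size": [20, 30, 40, 50],
}

search = GridSearchCV(KNeighborsClassifier(), para_dict, cv=5)
search.fit(X, y)

# One row per candidate, sorted so the best-ranked configurations come first
results = pd.DataFrame(search.cv_results_)
cols = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
top = results.sort_values("rank_test_score")[cols].head(5)
print(top.to_string(index=False))
```

`algorithm` and `leaf_size` control how neighbors are looked up rather than which neighbors are used, so candidates that differ only in those two settings tend to score the same or nearly the same.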


---
## Model Tuning using **HalvingGridSearchCV**

In [30]:
from sklearn.model_selection import HalvingGridSearchCV

KNN = KNeighborsClassifier()
KNN_grid = HalvingGridSearchCV(
    estimator=KNN,
    param_grid=para_dict,
    cv=5,
    # verbose=2,
    n_jobs=-1,
)

KNN_grid.fit(X,y)

print(f"BEST PARAMETERS ARE: \n{KNN_grid.best_params_}")
print(f"\n BEST SCORE IS: {KNN_grid.best_score_}")

BEST PARAMETERS ARE: 
{'algorithm': 'ball_tree', 'leaf_size': 50, 'n_neighbors': 3, 'weights': 'uniform'}

 BEST SCORE IS: 0.8
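
Successive halving evaluates all candidates on a small number of samples, then keeps roughly the best `1/factor` of them and gives the survivors more samples. The sketch below prints that schedule from the fitted search. It assumes `load_iris` in place of the seaborn data set and fixes `random_state=0` for repeatability; `halving` is an illustrative name.

```python
from sklearn.datasets import load_iris
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's copy of iris, first two features as in this notebook.
X_all, y = load_iris(return_X_y=True)
X = X_all[:, :2]

para_dict = {
    "n_neighbors": [3, 5, 10, 15],
    "weights": ["uniform", "distance"],
    "algorithm": ["ball_tree", "kd_tree", "brute"],
    "leaf_size": [20, 30, 40, 50],
}

halving = HalvingGridSearchCV(
    KNeighborsClassifier(), para_dict, cv=5, factor=3, random_state=0
)
halving.fit(X, y)

# n_candidates_ and n_resources_ hold one entry per halving iteration
for i, (n_cand, n_res) in enumerate(zip(halving.n_candidates_, halving.n_resources_)):
    print(f"iteration {i}: {n_cand} candidates, {n_res} samples each")
```

If the last iteration uses fewer than all 150 samples, `best_score_` is computed on that subset, so it is not directly comparable with the `GridSearchCV` score above.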


---
## Model Tuning using **RandomizedSearchCV**

In [21]:
from sklearn.model_selection import RandomizedSearchCV

KNN = KNeighborsClassifier()
# n_iter is left at its default of 10, so only 10 sampled combinations are tried
KNN_grid = RandomizedSearchCV(
    estimator=KNN,
    param_distributions=para_dict,
    cv=5,
    # verbose=2,
    n_jobs=-1,
)

KNN_grid.fit(X,y)

print(f"BEST PARAMETERS ARE: \n{KNN_grid.best_params_}")
print(f"\n BEST SCORE IS: {KNN_grid.best_score_}")

BEST PARAMETERS ARE: 
{'weights': 'uniform', 'n_neighbors': 10, 'leaf_size': 30, 'algorithm': 'brute'}

 BEST SCORE IS: 0.78
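
A randomized search gives different results on each run unless it is seeded, and it can also sample from distributions instead of fixed lists. The sketch below shows both. It assumes `load_iris` in place of the seaborn data set; the range `randint(1, 31)`, `n_iter=20` and `random_state=42` are arbitrary illustrative choices, not values from this notebook.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's copy of iris, first two features as in this notebook.
X_all, y = load_iris(return_X_y=True)
X = X_all[:, :2]

# Lists are sampled uniformly; scipy distributions are sampled via their rvs() method.
param_distributions = {
    "n_neighbors": randint(1, 31),  # integers from 1 to 30 inclusive
    "weights": ["uniform", "distance"],
}

search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions,
    n_iter=20,        # number of sampled configurations
    cv=5,
    random_state=42,  # fixes the sampled configurations across runs
)
search.fit(X, y)

print(search.best_params_)
print(f"{search.best_score_:.3f}")
```

With `random_state` set, rerunning the cell samples the same 20 configurations, which makes the reported best parameters repeatable.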


---
## Model Tuning using **HalvingRandomSearchCV**

In [28]:
from sklearn.model_selection import HalvingRandomSearchCV

KNN = KNeighborsClassifier()
KNN_grid = HalvingRandomSearchCV(
    estimator=KNN,
    param_distributions=para_dict,
    cv=5,
    # verbose=2,
    n_jobs=-1,
)

KNN_grid.fit(X,y)

print(f"BEST PARAMETERS ARE: \n{KNN_grid.best_params_}")
print(f"\n BEST SCORE IS: {KNN_grid.best_score_}")

BEST PARAMETERS ARE: 
{'weights': 'uniform', 'n_neighbors': 10, 'leaf_size': 40, 'algorithm': 'ball_tree'}

 BEST SCORE IS: 0.8222222222222222
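
With the default `n_candidates="exhaust"`, `HalvingRandomSearchCV` derives the number of sampled candidates from the resource budget rather than from the size of the search space. Per the scikit-learn documentation, on a small data set this can mean only a handful of candidates. The sketch below compares the default with an explicit `n_candidates=30`. It assumes `load_iris` in place of the seaborn data set; the value 30 and the names `default_search` and `wider_search` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingRandomSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's copy of iris, first two features as in this notebook.
X_all, y = load_iris(return_X_y=True)
X = X_all[:, :2]

para_dict = {
    "n_neighbors": [3, 5, 10, 15],
    "weights": ["uniform", "distance"],
    "algorithm": ["ball_tree", "kd_tree", "brute"],
    "leaf_size": [20, 30, 40, 50],
}

# Default: the number of candidates follows from the available samples
default_search = HalvingRandomSearchCV(
    KNeighborsClassifier(), para_dict, cv=5, random_state=0
).fit(X, y)

# Explicit: sample 30 of the possible combinations in the first iteration
wider_search = HalvingRandomSearchCV(
    KNeighborsClassifier(), para_dict, cv=5, n_candidates=30, random_state=0
).fit(X, y)

print("default :", default_search.n_candidates_, default_search.n_resources_)
print("explicit:", wider_search.n_candidates_, wider_search.n_resources_)
```

Comparing the two printed schedules shows how much of the search space each run actually explored, which is useful context when reading its `best_score_`.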


| Technique | Pros | Cons |
| --- | --- | --- |
| `GridSearchCV` | Exhaustively evaluates every combination in the specified grid, so it always finds the best-scoring configuration within that grid. Practical when the grid is small. | The number of fits grows multiplicatively with every added hyperparameter value, so large grids are time-consuming. |
| `HalvingGridSearchCV` | Starts with all grid combinations on a small amount of resources (by default, training samples) and keeps only the best fraction at each iteration. Can be faster than `GridSearchCV` for larger grids. | A configuration that needs more data to do well can be eliminated early, so the best grid combination may be missed. Still an experimental API in scikit-learn. |
| `RandomizedSearchCV` | Evaluates a fixed number (`n_iter`) of randomly sampled configurations from the specified lists or distributions. Can be faster than `GridSearchCV` when the search space is large. | May never sample the best combination, and results vary between runs unless `random_state` is set. |
| `HalvingRandomSearchCV` | Applies iterative halving to randomly sampled configurations. Can be faster than `RandomizedSearchCV` for the same number of candidates. | Carries both risks: the best combination may never be sampled, or may be eliminated early. Still an experimental API in scikit-learn. |
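
The `best_score_` values of the four searches are not strictly comparable, because the halving variants may score candidates on subsets of the data. One hedged way to compare them is to score each search's refitted best model on a held-out test set with `accuracy_score`. The sketch below assumes `load_iris` in place of the seaborn data set, a 70/30 stratified split, and `random_state=0` throughout; none of these choices come from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.metrics import accuracy_score
from sklearn.model_selection import (
    GridSearchCV,
    HalvingGridSearchCV,
    HalvingRandomSearchCV,
    RandomizedSearchCV,
    train_test_split,
)
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's copy of iris, first two features as in this notebook.
X_all, y = load_iris(return_X_y=True)
X = X_all[:, :2]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

para_dict = {
    "n_neighbors": [3, 5, 10, 15],
    "weights": ["uniform", "distance"],
    "algorithm": ["ball_tree", "kd_tree", "brute"],
    "leaf_size": [20, 30, 40, 50],
}

searchers = {
    "GridSearchCV": GridSearchCV(KNeighborsClassifier(), para_dict, cv=5),
    "HalvingGridSearchCV": HalvingGridSearchCV(
        KNeighborsClassifier(), para_dict, cv=5, random_state=0
    ),
    "RandomizedSearchCV": RandomizedSearchCV(
        KNeighborsClassifier(), para_dict, cv=5, random_state=0
    ),
    "HalvingRandomSearchCV": HalvingRandomSearchCV(
        KNeighborsClassifier(), para_dict, cv=5, random_state=0
    ),
}

# Tune on the training split only, then score the refitted best model on unseen data
test_accuracy = {}
for name, search in searchers.items():
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    test_accuracy[name] = accuracy_score(y_test, y_pred)
    print(f"{name}: test accuracy = {test_accuracy[name]:.3f}")
```

With only 45 test samples the differences between methods are likely to sit within noise, so this comparison is a sanity check rather than a ranking.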
