# Hyperparameter Tuning Methods:

- Methods for searching for the hyperparameter values that give a model its best cross-validated score.

1. GridSearchCV:
    - `sklearn.model_selection.GridSearchCV`

2. RandomizedSearchCV
    - `sklearn.model_selection.RandomizedSearchCV`

3. Bayesian Optimization
    - `pip install scikit-optimize`
    - `skopt.BayesSearchCV`  

- All of the above share some common parameters, including:
    - `estimator` *- the model to tune*
    - `param_grid` (`GridSearchCV`), `param_distributions` (`RandomizedSearchCV`) or `search_spaces` (`BayesSearchCV`) *- dict mapping each hyperparameter name to its candidate values*
    - `scoring` *- str / callable / list(str) / tuple(str) / dict*, default `None` (falls back to the estimator's own `score` method)
    - `n_jobs` - `None` or `1` -> one job at a time, *int* -> number of parallel jobs, `-1` -> use all available CPU cores
    - `n_iter` - number of parameter combinations to try [Randomized Search CV & Bayesian Optimization only]
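A minimal sketch of how these shared arguments are passed to the two scikit-learn classes. The small dataset, the `LogisticRegression` estimator, the `C` values and the `*_demo` names are illustrative assumptions, not part of the walkthrough in this notebook. The point to notice is that the search space uses a different keyword in each class.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Small illustrative dataset (assumption: any classification data would do).
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

space = {"C": [0.01, 0.1, 1, 10]}  # hyperparameter name -> candidate values
estimator = LogisticRegression(max_iter=1000)

# Grid search: the space is passed as `param_grid`; every value is tried.
grid_demo = GridSearchCV(
    estimator, param_grid=space, scoring="f1_macro", cv=3, n_jobs=1
)
grid_demo.fit(X_demo, y_demo)

# Randomized search: the space is passed as `param_distributions`,
# and `n_iter` caps how many candidates are sampled.
rand_demo = RandomizedSearchCV(
    estimator, param_distributions=space, n_iter=3,
    scoring="f1_macro", cv=3, n_jobs=1, random_state=0,
)
rand_demo.fit(X_demo, y_demo)

print(len(grid_demo.cv_results_["params"]))  # number of grid candidates
print(len(rand_demo.cv_results_["params"]))  # number of sampled candidates
```

`BayesSearchCV` follows the same pattern, with the space passed as `search_spaces`.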

---
### Sample Implementation of Hyperparameter Tuning:

In [56]:
import numpy as np
import pandas as pd 
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV
from skopt import BayesSearchCV
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

In [4]:
# Generating Sample Data for basic implementation of Hyperparameters tuning...
X, y = make_classification(
    n_samples=5000,
    n_features=15,
    n_informative=13,
    n_redundant=2,
    random_state=100
)

X.shape

(5000, 15)

In [7]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

In [8]:
from sklearn.svm import SVC

In [9]:
svm = SVC(class_weight='balanced', random_state=42)

**Grid Search CV:**

In [10]:
param_grid = {
    'kernel' : ['linear', 'rbf'],
    'C': [0.1,1,10,100]
}

In [11]:
grid = GridSearchCV(
    svm, 
    param_grid=param_grid, 
    cv=5, 
    scoring='f1_macro', 
    n_jobs=-1)

In [12]:
grid.fit(X_train, y_train)
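To make the cost of this call concrete, the following pure-Python sketch enumerates the candidates that an exhaustive grid search has to evaluate for the `param_grid` above. It only illustrates the counting and does not reproduce scikit-learn's internals; `cv_folds`, `candidates` and the other names are chosen here for illustration.

```python
from itertools import product

param_grid = {
    "kernel": ["linear", "rbf"],
    "C": [0.1, 1, 10, 100],
}
cv_folds = 5  # matches cv=5 in the GridSearchCV call

# Cartesian product of all value lists: one dict per candidate.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_candidates = len(candidates)
n_cv_fits = n_candidates * cv_folds  # one fit per candidate per fold

print(n_candidates, n_cv_fits)
```

With the default `refit=True`, one further fit of the best candidate on the full training set follows the cross-validation fits.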

In [None]:
# Best Params:
grid.best_params_

{'C': 10, 'kernel': 'rbf'}

In [27]:
# Best CV Score:
print(f'Best CV Score: {round(float(grid.best_score_),2)}')

Best CV Score: 0.96


In [23]:
# Results:
res_df = pd.DataFrame(grid.cv_results_)
res_df

| | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_C | param_kernel | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.406201 | 0.043237 | 0.034086 | 0.010269 | 0.1 | linear | {'C': 0.1, 'kernel': 'linear'} | 0.813331 | 0.79982 | 0.8133 | 0.841666 | 0.824961 | 0.818616 | 0.014007 | 5 |
| 1 | 0.345819 | 0.051331 | 0.18931 | 0.041571 | 0.1 | rbf | {'C': 0.1, 'kernel': 'rbf'} | 0.913286 | 0.894986 | 0.911647 | 0.935 | 0.916666 | 0.914317 | 0.012769 | 4 |
| 2 | 1.399281 | 0.159541 | 0.029996 | 0.007457 | 1.0 | linear | {'C': 1, 'kernel': 'linear'} | 0.813331 | 0.796484 | 0.8133 | 0.84 | 0.826636 | 0.81795 | 0.014597 | 6 |
| 3 | 0.210315 | 0.033519 | 0.084195 | 0.008773 | 1.0 | rbf | {'C': 1, 'kernel': 'rbf'} | 0.939967 | 0.949999 | 0.94998 | 0.958332 | 0.94998 | 0.949652 | 0.005822 | 3 |
| 4 | 6.230211 | 0.178065 | 0.020265 | 0.003329 | 10.0 | linear | {'C': 10, 'kernel': 'linear'} | 0.813331 | 0.794794 | 0.8133 | 0.84 | 0.826636 | 0.817612 | 0.015101 | 8 |
| 5 | 0.143104 | 0.013183 | 0.054214 | 0.003964 | 10.0 | rbf | {'C': 10, 'kernel': 'rbf'} | 0.954979 | 0.958332 | 0.959996 | 0.956662 | 0.964998 | 0.958993 | 0.003436 | 1 |
| 6 | 32.696635 | 1.560132 | 0.012496 | 0.002798 | 100.0 | linear | {'C': 100, 'kernel': 'linear'} | 0.813331 | 0.796484 | 0.813281 | 0.84 | 0.826636 | 0.817946 | 0.014599 | 7 |
| 7 | 0.26371 | 0.040753 | 0.074359 | 0.019231 | 100.0 | rbf | {'C': 100, 'kernel': 'rbf'} | 0.946645 | 0.95165 | 0.951663 | 0.949973 | 0.954994 | 0.950985 | 0.002714 | 2 |


In [54]:
# Best estimator
model = grid.best_estimator_

# Predictions:
y_pred = model.predict(X_test)

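# Note: the "Tuning Time" below is the sum of mean_fit_time over candidates (seconds).
# It is only a rough proxy: it ignores the number of folds, scoring time and parallelism.
print(f"""Grid Search CV: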
      
{classification_report(y_test, y_pred)}
{'-'*75}     
Test Data Accuracy: {accuracy_score(y_test, y_pred)}
{'-'*75}     
Tuning Time: {res_df['mean_fit_time'].sum()}
Best Parameters: {grid.best_params_}
      """)


Grid Search CV:
      
              precision    recall  f1-score   support

           0       0.95      0.96      0.95       997
           1       0.96      0.95      0.95      1003

    accuracy                           0.95      2000
   macro avg       0.95      0.95      0.95      2000
weighted avg       0.95      0.95      0.95      2000

---------------------------------------------------------------------------     
Test Data Accuracy: 0.9545
---------------------------------------------------------------------------     
Tuning Time: 41.69527640342712
Best Parameters: {'C': 10, 'kernel': 'rbf'}
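The "Tuning Time" line in this report is the sum of the `mean_fit_time` column, i.e. the average time of a single fold's fit, added up over the eight candidates. The sketch below recomputes it from the values in the results table and scales it by the number of folds to approximate the total fitting work. The scaled figure is an estimate made here, not a measurement from the original run.

```python
# mean_fit_time values copied from the cv_results_ table above (seconds).
mean_fit_times = [0.406201, 0.345819, 1.399281, 0.210315,
                  6.230211, 0.143104, 32.696635, 0.263710]
cv_folds = 5  # cv=5 in the GridSearchCV call

# What the report prints: per-fold average fit time, summed over candidates.
sum_of_means = sum(mean_fit_times)

# Rough total fitting work across all folds
# (ignores scoring time and the final refit on the full training set).
approx_total_fit_seconds = sum_of_means * cv_folds

print(round(sum_of_means, 2))
print(round(approx_total_fit_seconds, 1))
```

Because `n_jobs=-1` runs fits in parallel, the wall-clock time of `grid.fit` can be shorter than this estimate. Wrapping the `fit` call with `time.perf_counter()` is one way to measure it directly.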
      


**Randomized Search CV:**

In [37]:
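# n_iter=5 samples 5 of the 8 combinations in param_grid; cv defaults to 5 folds.
# No random_state is set, so the sampled combinations can differ between runs.
rsc = RandomizedSearchCV(svm, param_grid, n_iter=5, n_jobs=-1, scoring='f1_macro')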

rsc.fit(X_train, y_train)
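When every hyperparameter is given as a list, as in `param_grid` here, `RandomizedSearchCV` draws its `n_iter` candidates without replacement from the same set of combinations a grid search would enumerate. The pure-Python sketch below mimics that idea with the standard library. The seed and variable names are assumptions, and the draw will not match scikit-learn's own random stream.

```python
import random
from itertools import product

param_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10, 100]}
n_iter = 5

# All 8 combinations an exhaustive grid search would evaluate.
all_candidates = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]

rng = random.Random(42)  # fixed seed (assumption) so the draw is repeatable
sampled = rng.sample(all_candidates, k=n_iter)  # sampling without replacement

# Was the grid-search winner among the sampled candidates?
contains_grid_best = {"kernel": "rbf", "C": 10} in sampled
print(len(sampled), contains_grid_best)
```

A randomized search can only return the best of the candidates it happened to sample, so it may miss the combination an exhaustive grid search would find.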

In [39]:
# Best Params:
rsc.best_params_

{'kernel': 'rbf', 'C': 100}

In [41]:
# Best CV Score:
print(f'Best CV Score: {round(float(rsc.best_score_),2)}')

Best CV Score: 0.95


In [42]:
# Results:
rsc_res_df = pd.DataFrame(rsc.cv_results_)
rsc_res_df

| | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_kernel | param_C | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.933745 | 0.031133 | 0.01907 | 0.003768 | linear | 1.0 | {'kernel': 'linear', 'C': 1} | 0.813331 | 0.796484 | 0.8133 | 0.84 | 0.826636 | 0.81795 | 0.014597 | 4 |
| 1 | 0.130358 | 0.007226 | 0.079717 | 0.01258 | rbf | 1.0 | {'kernel': 'rbf', 'C': 1} | 0.939967 | 0.949999 | 0.94998 | 0.958332 | 0.94998 | 0.949652 | 0.005822 | 2 |
| 2 | 0.249509 | 0.011355 | 0.129222 | 0.006631 | rbf | 0.1 | {'kernel': 'rbf', 'C': 0.1} | 0.913286 | 0.894986 | 0.911647 | 0.935 | 0.916666 | 0.914317 | 0.012769 | 3 |
| 3 | 32.856739 | 2.283921 | 0.024271 | 0.00405 | linear | 100.0 | {'kernel': 'linear', 'C': 100} | 0.813331 | 0.796484 | 0.813281 | 0.84 | 0.826636 | 0.817946 | 0.014599 | 5 |
| 4 | 0.168359 | 0.01117 | 0.039436 | 0.006441 | rbf | 100.0 | {'kernel': 'rbf', 'C': 100} | 0.946645 | 0.95165 | 0.951663 | 0.949973 | 0.954994 | 0.950985 | 0.002714 | 1 |


In [55]:
# Best estimator
model = rsc.best_estimator_

# Predictions:
y_pred = model.predict(X_test)

print(f"""Randomized Search CV:
{classification_report(y_test, y_pred)}
{'-'*75}     
Test Data Accuracy: {accuracy_score(y_test, y_pred)}
{'-'*75}     
Tuning Time: {rsc_res_df['mean_fit_time'].sum()}
Best Parameters: {rsc.best_params_}
      """)


Randomized Search CV:
              precision    recall  f1-score   support

           0       0.94      0.96      0.95       997
           1       0.96      0.94      0.95      1003

    accuracy                           0.95      2000
   macro avg       0.95      0.95      0.95      2000
weighted avg       0.95      0.95      0.95      2000

---------------------------------------------------------------------------     
Test Data Accuracy: 0.948
---------------------------------------------------------------------------     
Tuning Time: 34.33871078491211
Best Parameters: {'kernel': 'rbf', 'C': 100}
      


**Bayesian Optimization:**

In [58]:
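# No `scoring` is passed, so the estimator's default score (accuracy for SVC) is used
# instead of the 'f1_macro' used in the two searches above.
bsc = BayesSearchCV(svm, search_spaces=param_grid, n_iter=5, n_jobs=-1)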
bsc.fit(X_train, y_train)

In [61]:
# Best Params:
bsc.best_params_

OrderedDict([('C', 100), ('kernel', 'rbf')])

In [62]:
# Best CV Score:
print(f'Best CV Score: {round(float(bsc.best_score_),2)}')

Best CV Score: 0.95


In [64]:
# Results:
bsc_res_df = pd.DataFrame(bsc.cv_results_)
bsc_res_df

| | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_C | param_kernel | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.210816 | 0.013795 | 0.014654 | 0.002374 | 0.1 | linear | {'C': 0.1, 'kernel': 'linear'} | 0.813333 | 0.8 | 0.813333 | 0.841667 | 0.825 | 0.818667 | 0.01396 | 3 |
| 1 | 0.071056 | 0.005962 | 0.033318 | 0.006877 | 1.0 | rbf | {'C': 1, 'kernel': 'rbf'} | 0.94 | 0.95 | 0.95 | 0.958333 | 0.95 | 0.949667 | 0.005812 | 2 |
| 2 | 0.098445 | 0.008417 | 0.017477 | 0.007392 | 100.0 | rbf | {'C': 100, 'kernel': 'rbf'} | 0.946667 | 0.951667 | 0.951667 | 0.95 | 0.955 | 0.951 | 0.002708 | 1 |
| 3 | 30.287513 | 1.370207 | 0.010307 | 0.002999 | 100.0 | linear | {'C': 100, 'kernel': 'linear'} | 0.813333 | 0.796667 | 0.813333 | 0.84 | 0.826667 | 0.818 | 0.014545 | 5 |
| 4 | 0.183494 | 0.009661 | 0.016296 | 0.000852 | 0.1 | linear | {'C': 0.1, 'kernel': 'linear'} | 0.813333 | 0.8 | 0.813333 | 0.841667 | 0.825 | 0.818667 | 0.01396 | 3 |
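The results above contain a repeated candidate: rows 0 and 4 evaluate the same parameters. The short sketch below counts how many distinct points of the 8-combination space the five iterations covered. The `params` values are copied from the table, and the variable names are chosen here for illustration.

```python
# `params` column copied from the BayesSearchCV results table above.
evaluated = [
    {"C": 0.1, "kernel": "linear"},
    {"C": 1, "kernel": "rbf"},
    {"C": 100, "kernel": "rbf"},
    {"C": 100, "kernel": "linear"},
    {"C": 0.1, "kernel": "linear"},
]
total_grid_size = 2 * 4  # kernels x C values in param_grid

# Dicts are not hashable, so compare candidates as sorted (name, value) tuples.
unique = {tuple(sorted(p.items())) for p in evaluated}

print(f"{len(unique)} distinct candidates out of {len(evaluated)} iterations")
print(f"coverage of the {total_grid_size}-point space: {len(unique) / total_grid_size:.0%}")
```

With so small a categorical space and only five iterations, this run explored half of the combinations and did not visit the `{'C': 10, 'kernel': 'rbf'}` setting that the grid search selected.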


In [65]:
# Best estimator
model = bsc.best_estimator_

# Predictions:
y_pred = model.predict(X_test)

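# Note: the "Tuning Time" below is the sum of mean_fit_time over iterations (seconds),
# a rough proxy that ignores the number of folds, scoring time and parallelism.
print(f"""Bayesian Search CV: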
{classification_report(y_test, y_pred)}
{'-'*75}     
Test Data Accuracy: {accuracy_score(y_test, y_pred)}
{'-'*75}     
Tuning Time: {bsc_res_df['mean_fit_time'].sum()}
Best Parameters: {bsc.best_params_}
      """)


Bayesian Search CV:
              precision    recall  f1-score   support

           0       0.94      0.96      0.95       997
           1       0.96      0.94      0.95      1003

    accuracy                           0.95      2000
   macro avg       0.95      0.95      0.95      2000
weighted avg       0.95      0.95      0.95      2000

---------------------------------------------------------------------------     
Test Data Accuracy: 0.948
---------------------------------------------------------------------------     
Tuning Time: 30.851324081420902
Best Parameters: OrderedDict([('C', 100), ('kernel', 'rbf')])
      


---

By Kirtan Ghelani $@SculptSoft$