# Task 29: Hyperparameter Tuning Techniques:

**Hyperparameter tuning** lets data scientists adjust the settings that control how a model learns in order to improve its performance. Unlike model parameters, which are learned from the data, hyperparameters are set before training, so choosing appropriate values is an essential part of machine learning.

In machine learning, hyperparameter optimization (or tuning) is the problem of choosing a set of optimal hyperparameters for a learning algorithm.

Models can have many hyperparameters, and finding the best combination can be treated as a search problem. Three widely used strategies for hyperparameter tuning are:

- GridSearchCV
- RandomizedSearchCV
- Bayesian Optimization
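
To make the distinction between hyperparameters and learned parameters concrete, here is a minimal sketch. It assumes a small synthetic dataset from `make_classification` (a stand-in, not the heart dataset used in this notebook), and the names `X_demo`, `y_demo` and `model` are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption; not the heart dataset).
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegression(C=0.5, max_iter=5000)

# Hyperparameters: chosen by us before training, readable via get_params().
hyperparams = model.get_params()
print("C =", hyperparams["C"], "| max_iter =", hyperparams["max_iter"])

# Learned parameters: estimated from the data during fit().
model.fit(X_demo, y_demo)
print("coef_ shape:", model.coef_.shape)
```

The tuning techniques in this notebook search over values such as `C`; they never set `coef_` directly.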

In [1]:
# importing libraries

import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

from sklearn.metrics import accuracy_score

In [2]:
# loading the dataset

# raw string so that backslashes in the Windows path are not treated as escape sequences
data = pd.read_csv(r'E:\Data Science BTF\Task 27/heart.csv')
data.head()

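|   | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|-----|-----|----|----------|------|-----|---------|---------|-------|---------|-------|----|------|--------|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |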


In [3]:
data.shape

(303, 14)

In [4]:
X = data.iloc[:,0:-1]
y = data.iloc[:,-1]

In [5]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)

In [6]:
print(X_train.shape)
print(X_test.shape)

(242, 13)
(61, 13)


In [7]:
rf = RandomForestClassifier()
gb = GradientBoostingClassifier()
svc = SVC()
lo_reg = LogisticRegression(max_iter=5000)

In [8]:
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)
print("Accuracy Score of Random Forest Classifier: ", accuracy_score(y_test,y_pred))

Accuracy Score of Random Forest Classifier:  0.8524590163934426


In [9]:
gb.fit(X_train, y_train)
y_pred = gb.predict(X_test)
print("Accuracy Score of Gradient Boosting Classifier: ", accuracy_score(y_test,y_pred))

Accuracy Score of Gradient Boosting Classifier:  0.7704918032786885


In [10]:
svc.fit(X_train, y_train)
y_pred = svc.predict(X_test)
print("Accuracy Score of Support Vector Classifier: ", accuracy_score(y_test,y_pred))

Accuracy Score of Support Vector Classifier:  0.7049180327868853


In [11]:
lo_reg.fit(X_train, y_train)
y_pred = lo_reg.predict(X_test)
print("Accuracy Score of Logistic Regression: ", accuracy_score(y_test,y_pred))

Accuracy Score of Logistic Regression:  0.8688524590163934


Comparing the **accuracy scores** of the four algorithms, Random Forest with default settings is among the top performers: here it is second only to Logistic Regression. Random Forest often performs well out of the box across many problems.

We can often improve the accuracy of the Random Forest algorithm further by tuning its hyperparameters. That is where **hyperparameter tuning techniques** come into play.
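
The scores above come from a single test split of 61 rows, where each sample moves accuracy by roughly 1.6 points (1/61), so small differences between models are noisy. A cross-validated baseline gives a steadier reference before tuning. The sketch below assumes synthetic data from `make_classification` in place of the heart dataset; `X_demo`, `y_demo` and `baseline` are illustrative names.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in with 13 features (an assumption; not the heart dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=42)

baseline = RandomForestClassifier(n_estimators=50, random_state=42)

# 5-fold cross-validation: five accuracy values instead of a single one.
scores = cross_val_score(baseline, X_demo, y_demo, cv=5, scoring="accuracy")
print("fold accuracies:", scores.round(3))
print("mean:", round(scores.mean(), 3), "std:", round(scores.std(), 3))
```

The spread across folds indicates how much of a difference between two models could be due to the split alone.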

### GridSearchCV:

Grid search can be considered a “brute force” approach to hyperparameter optimization.
- We create a grid of candidate discrete values for each hyperparameter and fit the model for every possible combination.
- We log each combination’s model performance and then choose the combination that produces the best results.

This approach is called GridSearchCV because it searches a grid of hyperparameter values for the best set, scoring each combination with cross-validation (the "CV" in the name).
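
The size of a grid grows multiplicatively with each hyperparameter. The following standard-library sketch enumerates the combinations for the grid used in this section; it illustrates the counting only and is not how scikit-learn is implemented internally. The names `combinations` and `cv_folds` are illustrative.

```python
from itertools import product

# Same candidate values as the grid used in this section.
param_grid = {
    "n_estimators": [20, 60, 100, 120],
    "max_features": [0.2, 0.6, 1.0],
    "max_depth": [2, 8, None],
    "max_samples": [0.5, 0.75, 0.9],
}

# Cartesian product of all value lists: one dict per candidate.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

cv_folds = 5  # each candidate is fitted once per fold
print("candidates:", len(combinations))
print("model fits:", len(combinations) * cv_folds)
print("first candidate:", combinations[0])
```

With 4 × 3 × 3 × 3 = 108 candidates and 5 folds, the search performs 540 fits, plus one final refit of the best candidate when `refit` is left at its default.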

In [12]:
# number of trees in the random forest
n_estimators = [20,60,100,120] # 4 values

# fraction of features to consider at every split
max_features = [0.2,0.6,1.0] # 4 * 3 = 12

# maximum depth of each tree (None = no depth limit)
max_depth = [2,8,None] # 12 * 3 = 36

# fraction of training samples drawn to build each tree
max_samples = [0.5,0.75,0.9] # 36 * 3 = 108

# total: 108 combinations, i.e. 108 different random forest configurations
# with cv = 5, each one is trained 5 times (540 fits in total)

In [13]:
param_grid = {'n_estimators': n_estimators,
               'max_features': max_features,
               'max_depth': max_depth,
               'max_samples': max_samples
             }
print(param_grid)

{'n_estimators': [20, 60, 100, 120], 'max_features': [0.2, 0.6, 1.0], 'max_depth': [2, 8, None], 'max_samples': [0.5, 0.75, 0.9]}


In [14]:
rf = RandomForestClassifier()

In [15]:
from sklearn.model_selection import GridSearchCV

rf_grid = GridSearchCV(estimator = rf,
                       param_grid = param_grid,
                       cv = 5)

In [16]:
rf_grid.fit(X_train,y_train)

GridSearchCV(cv=5, estimator=RandomForestClassifier(),
             param_grid={'max_depth': [2, 8, None],
                         'max_features': [0.2, 0.6, 1.0],
                         'max_samples': [0.5, 0.75, 0.9],
                         'n_estimators': [20, 60, 100, 120]})

In [17]:
# To check for best parameter
print("Best Hyperparameters:")
rf_grid.best_params_

Best Hyperparameters:


{'max_depth': 2, 'max_features': 0.2, 'max_samples': 0.9, 'n_estimators': 120}

In [18]:
print("Best RandomForest Score for GridSearchCV:")
rf_grid.best_score_

Best RandomForest Score for GridSearchCV:


0.8388605442176871
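
Note that `best_score_` (0.8389 here) is the mean cross-validated accuracy on the training data, so it is not directly comparable to the 0.8525 test accuracy of the untuned forest in cell 8. To compare like with like, the refitted `best_estimator_` can be scored on the held-out test set. The sketch below shows the pattern on synthetic data with a deliberately small grid; the data and the names `small_grid`, `search` and `test_acc` are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (an assumption; not the heart dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Small illustrative grid so the example runs quickly.
small_grid = {"n_estimators": [20, 60], "max_depth": [2, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), small_grid, cv=5)
search.fit(X_tr, y_tr)

# Mean cross-validated accuracy of the best candidate (training data only).
print("CV accuracy:", round(search.best_score_, 3))

# Accuracy of the refitted best model on data the search never saw.
best_rf = search.best_estimator_
test_acc = accuracy_score(y_te, best_rf.predict(X_te))
print("Test accuracy:", round(test_acc, 3))
```

In this notebook, the equivalent call would be `accuracy_score(y_test, rf_grid.best_estimator_.predict(X_test))`.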

### RandomizedSearchCV:

**RandomizedSearchCV** (random search) samples hyperparameter combinations at random from the specified values or distributions, instead of trying every combination as grid search does.
- In every iteration, random search tries a different set of hyperparameters and logs the model’s performance.
- After a fixed number of iterations, it returns the combination that produced the best outcome.

This approach reduces unnecessary computation, because the number of model fits is set by the number of iterations rather than by the size of the grid.

**RandomizedSearchCV** addresses the main drawback of GridSearchCV, its cost. In most cases, a random search produces a comparable result faster than a grid search, although it is not guaranteed to find the best combination in the grid.
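
To see how little of the space a random search visits, the standard-library sketch below draws 10 candidates (the default `n_iter` of `RandomizedSearchCV`) from the seven-parameter space used in this section. It mimics the idea of sampling without replacement from lists of values and is not scikit-learn's own sampler; `param_space`, `rng` and `sampled` are illustrative names.

```python
import random
from itertools import product

# Same candidate values as the search space used in this section.
param_space = {
    "n_estimators": [20, 60, 100, 120],
    "max_features": [0.2, 0.6, 1.0],
    "max_depth": [2, 8, None],
    "max_samples": [0.5, 0.75, 0.9],
    "bootstrap": [True, False],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
}

# Every combination a full grid search would have to try.
all_candidates = list(product(*param_space.values()))

# Random search only draws a fixed number of them.
rng = random.Random(0)  # fixed seed so the draw is repeatable
n_iter = 10
sampled = rng.sample(all_candidates, n_iter)

print("full grid size:", len(all_candidates))
print("candidates tried:", len(sampled))
for values in sampled[:3]:
    print(dict(zip(param_space, values)))
```

Ten draws out of 864 combinations is about 1% of the grid, which is why the search is fast and also why its best result can vary from run to run.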

In [19]:
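# number of trees in the random forest
n_estimators = [20,60,100,120]

# fraction of features to consider at every split
max_features = [0.2,0.6,1.0]

# maximum depth of each tree (None = no depth limit)
max_depth = [2,8,None]

# fraction of training samples drawn to build each tree
max_samples = [0.5,0.75,0.9]

# whether each tree is trained on a bootstrap sample
# note: max_samples only applies when bootstrap=True; recent scikit-learn
# versions raise an error for bootstrap=False combined with max_samples,
# so those candidates fail and receive a NaN score
bootstrap = [True,False]

# minimum number of samples required to split a node
min_samples_split = [2, 5]

# minimum number of samples required at each leaf node
min_samples_leaf = [1, 2]

# 4 * 3 * 3 * 3 * 2 * 2 * 2 = 864 possible combinations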

In [20]:
param_grid_2 = {'n_estimators': n_estimators,
                'max_features': max_features,
                'max_depth': max_depth,
                'max_samples': max_samples,
                'bootstrap': bootstrap,
                'min_samples_split': min_samples_split,
                'min_samples_leaf': min_samples_leaf
               }
print(param_grid_2)

{'n_estimators': [20, 60, 100, 120], 'max_features': [0.2, 0.6, 1.0], 'max_depth': [2, 8, None], 'max_samples': [0.5, 0.75, 0.9], 'bootstrap': [True, False], 'min_samples_split': [2, 5], 'min_samples_leaf': [1, 2]}


In [21]:
from sklearn.model_selection import RandomizedSearchCV

# n_iter defaults to 10, so only 10 of the 864 possible combinations are sampled
rf_grid_2 = RandomizedSearchCV(estimator = rf,
                               param_distributions = param_grid_2,
                               cv = 5)

In [22]:
rf_grid_2.fit(X_train, y_train)

RandomizedSearchCV(cv=5, estimator=RandomForestClassifier(),
                   param_distributions={'bootstrap': [True, False],
                                        'max_depth': [2, 8, None],
                                        'max_features': [0.2, 0.6, 1.0],
                                        'max_samples': [0.5, 0.75, 0.9],
                                        'min_samples_leaf': [1, 2],
                                        'min_samples_split': [2, 5],
                                        'n_estimators': [20, 60, 100, 120]})

In [23]:
print("Best Hyperparameters:")
rf_grid_2.best_params_

Best Hyperparameters:


{'n_estimators': 20,
 'min_samples_split': 5,
 'min_samples_leaf': 1,
 'max_samples': 0.9,
 'max_features': 0.6,
 'max_depth': 2,
 'bootstrap': True}

In [24]:
print("Best RandomForest Score for RandomizedSearchCV:")
rf_grid_2.best_score_

Best RandomForest Score for RandomizedSearchCV:


0.8100340136054422
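
The random search scored lower here (0.810) than the grid search (0.839), which is plausible given that it tried only 10 randomly chosen combinations and was not seeded. Two common adjustments are to raise `n_iter` and to sample from continuous distributions rather than fixed lists. The sketch below assumes synthetic data and illustrative ranges; `param_distributions` and `search` are hypothetical names, and the ranges are not tuned for the heart dataset.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption; not the heart dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=42)

param_distributions = {
    "n_estimators": randint(20, 121),   # integers from 20 to 120
    "max_features": uniform(0.2, 0.8),  # floats between 0.2 and 1.0
    "max_samples": uniform(0.5, 0.4),   # floats between 0.5 and 0.9
    "max_depth": [2, 8, None],          # lists are still allowed
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=8,         # number of sampled candidates
    cv=3,
    random_state=42,  # makes the sampled candidates repeatable
)
search.fit(X_demo, y_demo)

print("candidates tried:", len(search.cv_results_["params"]))
print("best CV accuracy:", round(search.best_score_, 3))
print("best params:", search.best_params_)
```

Note that `scipy.stats.uniform(loc, scale)` covers the interval from `loc` to `loc + scale`, which is why the second argument is a width rather than an upper bound.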

**Grid search** and **random search** can be inefficient because they evaluate many unsuitable hyperparameter combinations without taking the results of previous iterations into account. **Bayesian optimization** addresses this by using earlier results to guide the search. Let's explore this!

### Bayesian Optimization:

**Bayesian optimization** treats the search for optimal hyperparameters as an optimization problem.
- It uses the results of previous evaluations when selecting the next hyperparameter combination: a probabilistic (surrogate) model of the score is fitted to the results so far and used to pick the combination most likely to improve on the best result.

This method typically finds a good hyperparameter combination in relatively few iterations.
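
The loop described above can be sketched on a toy problem. The example assumes a single hyperparameter `x` in [0, 1], a made-up `objective` standing in for a cross-validated score, a Gaussian-process surrogate with a fixed RBF kernel, and expected improvement as the selection rule. All of these are illustrative choices, not a description of what any particular library does internally.

```python
import math
import numpy as np

def objective(x):
    # Hypothetical stand-in for "cross-validated score at hyperparameter x".
    return math.exp(-(x - 0.3) ** 2 / 0.02) + 0.5 * math.exp(-(x - 0.8) ** 2 / 0.01)

def rbf_kernel(a, b, length_scale=0.1):
    # Squared-exponential covariance between two 1-D arrays of points.
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_seen, y_seen, x_query, noise=1e-6):
    # Posterior mean and standard deviation of a zero-mean GP surrogate.
    k_seen = rbf_kernel(x_seen, x_seen) + noise * np.eye(len(x_seen))
    k_cross = rbf_kernel(x_seen, x_query)
    solved = np.linalg.solve(k_seen, k_cross)
    mean = solved.T @ y_seen
    var = 1.0 - np.sum(k_cross * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    # Expected amount by which a point beats the best score seen so far.
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

grid = np.linspace(0.0, 1.0, 201)      # candidate values of x
x_seen = np.array([0.0, 0.5, 1.0])     # initial evaluations
y_seen = np.array([objective(x) for x in x_seen])

for step in range(10):
    # 1) fit the surrogate to all results so far
    mean, std = gp_posterior(x_seen, y_seen, grid)
    # 2) pick the candidate with the highest expected improvement
    ei = expected_improvement(mean, std, y_seen.max())
    x_next = grid[int(np.argmax(ei))]
    # 3) evaluate it and add the result to the history
    x_seen = np.append(x_seen, x_next)
    y_seen = np.append(y_seen, objective(x_next))

best_x = x_seen[int(np.argmax(y_seen))]
print("evaluations:", len(x_seen))
print("best x found:", round(float(best_x), 3), "score:", round(float(y_seen.max()), 3))
```

Unlike grid or random search, each new candidate here depends on every earlier result, which is the property that lets the method concentrate its evaluations in promising regions.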

In [50]:
# skopt is provided by the scikit-optimize package
from skopt import BayesSearchCV, space, plots

In [51]:
params = {
    'learning_rate': space.Real(1e-5, 1, prior = 'log-uniform'),
    'n_estimators': space.Integer(20, 1_500),
    'subsample': space.Real(0.05, 1),
    'max_depth': space.Integer(1, 10),
}

In [57]:
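# 30 iterations x 5 folds = 150 fits, each with up to 1,500 trees, so this search can take a long time
# refit = False: the best model is not retrained on the full training set afterwards
bayes_search = BayesSearchCV(gb, params, n_iter = 30, cv = 5, scoring = 'f1',
                             refit = False)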

bayes_search.fit(X_train, y_train)

KeyboardInterrupt: 