# Hyper-parameter Tuning of Machine Learning (ML) Models



### Code for Regression Problems

#### `Dataset Used:`
Boston housing dataset 

#### `Machine Learning Algorithms Used:`
* Random Forest (RF) 
* Support Vector Machine (SVM) 
* K-Nearest Neighbor (KNN) 
* Artificial Neural Network (ANN)

#### `Hyper-parameter Tuning Algorithms Used:`
* Grid Search 
* Random Search
* Bayesian Optimization with Gaussian Processes (BO-GP)
* Bayesian Optimization with Tree-structured Parzen Estimator (BO-TPE)

---

In [1]:
# Importing required libraries
import numpy as np
import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import scipy.stats as stats
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

#### Loading Boston Housing Dataset
The Boston Housing dataset contains information about housing in Boston. It has 506 records with 13 feature columns. The goal is to predict the median house value (the regression target) from these features.

For more details about the dataset, see:
[Details-1](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html) ,
[Details-2](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_boston.html)

In [2]:
# Loading dataset
X, y = datasets.load_boston(return_X_y=True)
datasets.load_boston()

{'DESCR': ".. _boston_dataset:\n\nBoston house prices dataset\n---------------------------\n\n**Data Set Characteristics:**  \n\n    :Number of Instances: 506 \n\n    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.\n\n    :Attribute Information (in order):\n        - CRIM     per capita crime rate by town\n        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.\n        - INDUS    proportion of non-retail business acres per town\n        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)\n        - NOX      nitric oxides concentration (parts per 10 million)\n        - RM       average number of rooms per dwelling\n        - AGE      proportion of owner-occupied units built prior to 1940\n        - DIS      weighted distances to five Boston employment centres\n        - RAD      index of accessibility to radial highways\n        - TAX      full-value property-tax rate p

In [3]:
print(X.shape) #The data matrix

(506, 13)


In [4]:
print(y.shape) #The regression target

(506,)


### Baseline Machine Learning Models: Regressors with Default Hyper-parameters

### `Random Forest` 

In [5]:
# Random Forest (RF) with 3-fold cross validation
RF_clf = RandomForestRegressor()
RF_scores = cross_val_score(RF_clf, X, y, cv = 3, scoring = 'neg_mean_squared_error') 
print("Mean Square Error (RF) :" + str(-RF_scores.mean()))

Mean Square Error (RF) :29.748647641812237
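
The negation in `-RF_scores.mean()` is needed because scikit-learn scorers follow a higher-is-better convention, so `neg_mean_squared_error` returns the MSE multiplied by -1. The sketch below illustrates that convention with the standard library only. The mean predictor, the toy target values, and the helper names `k_fold_indices` and `neg_mse_scores` are assumptions made for illustration; the contiguous, unshuffled folds mirror what an integer `cv` gives for a regressor, but this is not scikit-learn's implementation.

```python
from statistics import mean

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds without shuffling."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def neg_mse_scores(y, k=3):
    """Return one negated MSE per fold for a predictor that outputs the training mean."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_y = [v for i, v in enumerate(y) if i not in held_out]
        prediction = mean(train_y)
        mse = mean([(y[i] - prediction) ** 2 for i in test_idx])
        scores.append(-mse)  # higher is better, so the error is negated
    return scores

toy_y = [24.0, 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 27.1, 16.5]  # made-up targets
scores = neg_mse_scores(toy_y, k=3)
print("Per-fold scores:", scores)
print("Mean Square Error:", -mean(scores))
```

Every per-fold score is zero or negative, so flipping the sign of the mean recovers an ordinary (positive) mean squared error.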


### `Support Vector Machine`

In [6]:
# Support Vector Machine (SVM)
SVM_clf = SVR(gamma ='scale')
SVM_scores = cross_val_score(SVM_clf, X, y, cv = 3, scoring = 'neg_mean_squared_error')
print("Mean Square Error (SVM) :" + str(-SVM_scores.mean()))

Mean Square Error (SVM) :77.42951812579332


### `K-Nearest Neighbor` 

In [7]:
# K-Nearest Neighbor (KNN)
KN_clf = KNeighborsRegressor()
KN_scores = cross_val_score(KN_clf, X, y, cv = 3,scoring = 'neg_mean_squared_error')
print("Mean Square Error (KNN) :" + str(-KN_scores.mean()))

Mean Square Error (KNN) :81.48773186343571


### `Artificial Neural Network`

In [8]:
# Artificial Neural Network (ANN)
from keras.models import Sequential, Model
from keras.layers import Dense, Input
from keras.wrappers.scikit_learn import KerasRegressor
from keras.callbacks import EarlyStopping

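def ann_model(optimizer = 'adam', neurons = 32, batch_size = 32, epochs = 50, activation = 'relu', patience = 5, loss = 'mse'):
    # Two hidden layers of equal width followed by a single linear output unit
    model = Sequential()
    model.add(Dense(neurons, input_shape = (X.shape[1],), activation = activation))
    model.add(Dense(neurons, activation = activation))
    model.add(Dense(1))
    model.compile(optimizer = optimizer, loss = loss)
    # Stop once the training loss has not improved for `patience` epochs
    early_stopping = EarlyStopping(monitor = "loss", patience = patience)
    # Note: this fit uses the full X and y rather than a single training fold;
    # the scikit-learn wrapper calls fit again on each training fold afterwards
    history = model.fit(X, y, batch_size = batch_size, epochs = epochs, callbacks = [early_stopping], verbose = 0)
    return model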

ANN_clf = KerasRegressor(build_fn = ann_model, verbose = 0)
ANN_scores = cross_val_score(ANN_clf, X, y, cv = 3,scoring = 'neg_mean_squared_error')
print("Mean Square Error (ANN):"+ str(-ANN_scores.mean()))

Mean Square Error (ANN):44.56844398014558


### Hyper-parameter Tuning Algorithms

### `1] Grid Search`

In [9]:
from sklearn.model_selection import GridSearchCV

#### `Random Forest`

In [10]:
# Random Forest (RF)
RF_params = {
    'n_estimators': [10, 20, 30],
    'max_depth': [15,20,25,30,50],
}
RF_clf = RandomForestRegressor(random_state = 0)
RF_grid = GridSearchCV(RF_clf, RF_params, cv = 3, scoring = 'neg_mean_squared_error')
RF_grid.fit(X, y)
print(RF_grid.best_params_)
print("Mean Square Error (RF) : "+ str(-RF_grid.best_score_))

{'max_depth': 15, 'n_estimators': 20}
Mean Square Error (RF) : 29.059884504204902
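
Grid search evaluates every combination of the listed values, so its cost grows with the product of the list lengths. The sketch below enumerates the random-forest grid above with the standard library to make that count explicit; `expand_grid` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
from itertools import product

def expand_grid(param_grid):
    """Yield one dict per combination of the listed values (hypothetical helper)."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))

RF_params = {
    'n_estimators': [10, 20, 30],
    'max_depth': [15, 20, 25, 30, 50],
}
candidates = list(expand_grid(RF_params))
n_folds = 3  # matches cv = 3 in the cell above
print(len(candidates), "candidates,", len(candidates) * n_folds, "cross-validation fits")
print(candidates[0])
```

With 3 values for `n_estimators` and 5 for `max_depth` there are 15 candidates, hence 45 cross-validation fits, plus one final refit on the full data when `refit` is left at its default.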


#### `Support Vector Machine`

In [11]:
# Support Vector Machine (SVM)
SVM_params = {
    'C': [1,10, 100,1000],
    'kernel' :['poly','rbf','sigmoid'],
    'epsilon':[0.001, 0.01,0.1,1]
}
SVM_clf = SVR(gamma = 'scale')
SVM_grid = GridSearchCV(SVM_clf, SVM_params, cv = 3, scoring = 'neg_mean_squared_error')
SVM_grid.fit(X, y)
print(SVM_grid.best_params_)
print("Mean Square Error (SVM) :"+ str(-SVM_grid.best_score_))

{'C': 1000, 'epsilon': 1, 'kernel': 'rbf'}
Mean Square Error (SVM) :49.71699970931197


#### `K-Nearest Neighbor`

In [12]:
# K-Nearest Neighbor (KNN)
KNN_params = {
    'n_neighbors': [2,4,6,8]
}
KNN_clf = KNeighborsRegressor()
KNN_grid = GridSearchCV(KNN_clf, KNN_params, cv=3, scoring='neg_mean_squared_error')
KNN_grid.fit(X, y)
print(KNN_grid.best_params_)
print("Mean Square Error (KNN) :"+ str(-KNN_grid.best_score_))

{'n_neighbors': 6}
Mean Square Error (KNN) :80.83005201647829


#### `Artificial Neural Network`

In [13]:
# Artificial Neural Network (ANN)
ANN_params = {
    'optimizer': ['adam','rmsprop'],
    'activation': ['relu','tanh'],
    'loss': ['mse','mae'],
    'batch_size': [16,32],
    'neurons':[16,32],
    'epochs':[20,50],
    'patience':[3,5]
}
ANN_clf = KerasRegressor(build_fn = ann_model, verbose = 0)
ANN_grid = GridSearchCV(ANN_clf, ANN_params, cv=3,scoring = 'neg_mean_squared_error')
ANN_grid.fit(X, y)
print(ANN_grid.best_params_)
print("MSE:"+ str(-ANN_grid.best_score_))

{'activation': 'tanh', 'batch_size': 16, 'epochs': 50, 'loss': 'mse', 'neurons': 32, 'optimizer': 'rmsprop', 'patience': 3}
MSE:37.868857592056166


### `2] Random Search`

In [14]:
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint as sp_randint

#### `Random Forest`

In [15]:
# Random Forest (RF)
RF_params = {
    'n_estimators': sp_randint(10,100),
    'max_depth': sp_randint(5,50),
    "criterion":['mse','mae']
} 
RF_clf = RandomForestRegressor(random_state = 0)
RF_Random = RandomizedSearchCV(RF_clf, param_distributions = RF_params,
                               n_iter = 20 ,cv = 3,scoring = 'neg_mean_squared_error')
RF_Random.fit(X, y)
print(RF_Random.best_params_)
print("Mean Square Error (RF):"+ str(-RF_Random.best_score_))

{'criterion': 'mse', 'max_depth': 15, 'n_estimators': 93}
Mean Square Error (RF):28.819638693418824
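
Random search draws a fixed number of configurations (`n_iter`) from the given distributions instead of enumerating a grid. Note that `scipy.stats.randint(low, high)` excludes `high`, so `sp_randint(10,100)` yields values from 10 to 99. The sketch below imitates the sampling step with the standard library; `sample_rf_config` and the seed are assumptions for illustration and do not reproduce the draws made by `RandomizedSearchCV`.

```python
import random

def sample_rf_config(rng):
    """Draw one random-forest configuration from the search space used above."""
    return {
        'n_estimators': rng.randrange(10, 100),  # 10..99, upper bound excluded
        'max_depth': rng.randrange(5, 50),       # 5..49, upper bound excluded
        'criterion': rng.choice(['mse', 'mae']),
    }

rng = random.Random(0)  # arbitrary seed, only for repeatability
n_iter = 20             # matches n_iter = 20 in the cell above
configs = [sample_rf_config(rng) for _ in range(n_iter)]
for config in configs[:3]:
    print(config)
```

Each parameter is sampled independently, so 20 iterations cost 60 cross-validation fits regardless of how wide the ranges are. Passing `random_state` to `RandomizedSearchCV` makes the real search repeatable in the same way.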


#### `Support Vector Machine`

In [16]:
# Support Vector Machine (SVM)
SVM_params = {
    'C': stats.uniform(0,50),
    "kernel":['poly','rbf'],
    "epsilon":stats.uniform(0,1)
}
SVM_clf = SVR(gamma = 'scale')
SVM_Random = RandomizedSearchCV(SVM_clf, param_distributions = SVM_params,
                            n_iter = 20,cv = 3,scoring = 'neg_mean_squared_error')
SVM_Random.fit(X, y)
print(SVM_Random.best_params_)
print("Mean Square Error (SVM) :"+ str(-SVM_Random.best_score_))

{'C': 27.21076956480878, 'epsilon': 0.46048652191242667, 'kernel': 'poly'}
Mean Square Error (SVM) :59.99237088014893


#### `K-Nearest Neighbor`

In [17]:
# K-Nearest Neighbor (KNN)
KNN_params = {
    'n_neighbors': sp_randint(1,20),
}
KNN_clf = KNeighborsRegressor()
KNN_Random = RandomizedSearchCV(KNN_clf, param_distributions = KNN_params,
                            n_iter = 10,cv = 3,scoring = 'neg_mean_squared_error')
KNN_Random.fit(X, y)
print(KNN_Random.best_params_)
print("Mean Square Error (KNN) :"+ str(-KNN_Random.best_score_))

{'n_neighbors': 13}
Mean Square Error (KNN) :80.74121499347262


#### `Artificial Neural Network`

In [18]:
# Artificial Neural Network (ANN)
ANN_params = {
    'optimizer': ['adam','rmsprop'],
    'activation': ['relu','tanh'],
    'loss': ['mse','mae'],
    'batch_size': [16,32],
    'neurons':sp_randint(10,100),
    'epochs':[20,50],
    'patience':sp_randint(5,20)
}
ANN_clf = KerasRegressor(build_fn = ann_model, verbose = 0)
ANN_Random = RandomizedSearchCV(ANN_clf, param_distributions = ANN_params,
                                n_iter = 10,cv = 3,scoring = 'neg_mean_squared_error')
ANN_Random.fit(X, y)
print(ANN_Random.best_params_)
print("Mean Square Error (ANN):"+ str(-ANN_Random.best_score_))

{'activation': 'relu', 'batch_size': 16, 'epochs': 50, 'loss': 'mse', 'neurons': 71, 'optimizer': 'adam', 'patience': 18}
Mean Square Error (ANN):45.48700310136937


### `3] Bayesian Optimization with Gaussian Processes (BO-GP)`

In [19]:
from skopt import Optimizer
from skopt import BayesSearchCV 
from skopt.space import Real, Categorical, Integer

#### `Random Forest`

In [20]:
# Random Forest (RF)
RF_params = {
    'n_estimators': Integer(10,100),
    'max_depth': Integer(5,50),
    "criterion":['mse','mae']
}
RF_clf = RandomForestRegressor(random_state = 0)
RF_Bayes = BayesSearchCV(RF_clf, RF_params,cv = 3,n_iter = 20, scoring = 'neg_mean_squared_error') 
RF_Bayes.fit(X, y)
print(RF_Bayes.best_params_)
print("Mean Square Error (RF):"+ str(-RF_Bayes.best_score_))

OrderedDict([('criterion', 'mse'), ('max_depth', 31), ('n_estimators', 52)])
Mean Square Error (RF):28.846466514254967


#### `Support Vector Machine`

In [23]:
# Support Vector Machine (SVM)
SVM_params = {
    "kernel":['poly','rbf'],
    'C': Real(1,50),
    'epsilon': Real(0,1)
}
SVM_clf = SVR(gamma='scale')
SVM_Bayes = BayesSearchCV(SVM_clf, SVM_params,cv = 3,n_iter = 20, scoring = 'neg_mean_squared_error')
SVM_Bayes.fit(X, y)
print(SVM_Bayes.best_params_)
print("Mean Square Error (SVM):"+ str(-SVM_Bayes.best_score_))

OrderedDict([('C', 43.7998974955581), ('epsilon', 0.14977167499173435), ('kernel', 'poly')])
Mean Square Error (SVM):59.450729055644864


#### `K-Nearest Neighbor`

In [24]:
# K-Nearest Neighbor (KNN)
KNN_params = {
    'n_neighbors': Integer(1,20),
}
KNN_clf = KNeighborsRegressor()
KNN_Bayes = BayesSearchCV(KNN_clf, KNN_params,cv = 3,n_iter = 10, scoring = 'neg_mean_squared_error')
KNN_Bayes.fit(X, y)
print(KNN_Bayes.best_params_)
print("Mean Square Error (KNN):"+ str(-KNN_Bayes.best_score_))

OrderedDict([('n_neighbors', 12)])
Mean Square Error (KNN):81.38111289525693


#### `Artificial Neural Network (ANN)`

In [25]:
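# Artificial Neural Network (ANN)
# Note: skopt reads a two-element list of integers as an inclusive integer range,
# so 'batch_size' and 'epochs' below are searched over 16..32 and 20..50
# (the reported optimum contains 28 and 44); wrap the values in
# Categorical([16, 32]) to restrict the search to the two listed values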
ANN_params = {
    'optimizer': ['adam','rmsprop'],
    'activation': ['relu','tanh'],
    'loss': ['mse','mae'],
    'batch_size': [16,32],
    'neurons':Integer(10,100),
    'epochs':[20,50],
    'patience':Integer(5,20)
}
ANN_clf = KerasRegressor(build_fn = ann_model, verbose = 0)
ANN_Bayes = BayesSearchCV(ANN_clf, ANN_params,cv = 3,n_iter = 10, scoring = 'neg_mean_squared_error')
ANN_Bayes.fit(X, y)
print(ANN_Bayes.best_params_)
print("Mean Square Error (ANN):"+ str(-ANN_Bayes.best_score_))

OrderedDict([('activation', 'tanh'), ('batch_size', 28), ('epochs', 44), ('loss', 'mse'), ('neurons', 81), ('optimizer', 'adam'), ('patience', 9)])
Mean Square Error (ANN):43.640675892710725


### `4] Bayesian Optimization with Tree-structured Parzen Estimator (BO-TPE)`

In [26]:
from hyperopt import hp, fmin, tpe, STATUS_OK, Trials

#### `Random Forest`

In [27]:
# Random Forest (RF)
def RF_fun(params):
    params = {
        'n_estimators': int(params['n_estimators']), 
        'max_depth': int(params['max_depth']),
        "criterion":str(params['criterion'])
    }
    RF_clf = RandomForestRegressor(**params)
    RF_score = -np.mean(cross_val_score(RF_clf, X, y, cv = 3, n_jobs = -1,scoring = "neg_mean_squared_error"))
    return {'loss':RF_score, 'status': STATUS_OK }

RF_space = {
    'n_estimators': hp.quniform('n_estimators', 10, 100, 1),
    'max_depth': hp.quniform('max_depth', 5, 50, 1),
    "criterion":hp.choice('criterion',['mse','mae'])
}

RF_best = fmin(fn = RF_fun, space = RF_space, algo = tpe.suggest, max_evals = 20)
print("Estimated optimum (RF):" +str (RF_best))

100%|██████████| 20/20 [00:20<00:00,  1.01s/it, best loss: 28.50435224273346]
Estimated optimum (RF):{'criterion': 0, 'max_depth': 38.0, 'n_estimators': 44.0}
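
`fmin` reports raw search-space values: an `hp.choice` parameter comes back as the index of the selected option and an `hp.quniform` parameter as a float. In the result above, `'criterion': 0` therefore refers to the first entry of the list, `'mse'`. hyperopt's `space_eval(space, best)` resolves the choice indices; the standard-library sketch below does that by hand for the printed result and also applies the integer casts from `RF_fun`. `decode_rf_best` is a hypothetical helper for this illustration.

```python
CRITERION_CHOICES = ['mse', 'mae']  # same order as in RF_space

def decode_rf_best(best):
    """Map a raw fmin result back to RandomForestRegressor keyword arguments."""
    return {
        'n_estimators': int(best['n_estimators']),
        'max_depth': int(best['max_depth']),
        'criterion': CRITERION_CHOICES[best['criterion']],
    }

# Values copied from the output printed above
RF_best = {'criterion': 0, 'max_depth': 38.0, 'n_estimators': 44.0}
print(decode_rf_best(RF_best))
```

The decoded dictionary can be passed directly as `RandomForestRegressor(**decoded)` to rebuild the selected model.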


#### `Support Vector Machine`

In [28]:
# Support Vector Machine (SVM)
def SVM_fun(params):
    params = {
        "kernel":str(params['kernel']),
        'C': abs(float(params['C'])), 
        'epsilon': abs(float(params['epsilon'])),
    }
    SVM_clf = SVR(gamma='scale', **params)
    SVM_score = -np.mean(cross_val_score(SVM_clf, X, y, cv = 3, n_jobs = -1, scoring="neg_mean_squared_error"))
    return {'loss':SVM_score, 'status': STATUS_OK }

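# hp.normal is unbounded, so sampled values can be negative; SVM_fun applies abs()
# before passing them to SVR, while fmin reports the raw (possibly negative) samples
SVM_space = {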
    "kernel":hp.choice('kernel',['poly','rbf']),
    'C': hp.normal('C', 0, 50),
    'epsilon': hp.normal('epsilon', 0, 1),
}

SVM_best = fmin(fn = SVM_fun ,space = SVM_space, algo=tpe.suggest, max_evals = 20)
print("Estimated optimum (SVM):" +str(SVM_best))

100%|██████████| 20/20 [00:00<00:00, 22.73it/s, best loss: 60.248025652042806]
Estimated optimum (SVM):{'C': 31.83399735036318, 'epsilon': -0.10130152267175614, 'kernel': 0}


#### `K-Nearest Neighbor`

In [29]:
# K-Nearest Neighbor (KNN)
def KNN_fun(params):
    params = {'n_neighbors': abs(int(params['n_neighbors']))}
    KNN_clf = KNeighborsRegressor(**params)
    KNN_score = -np.mean(cross_val_score(KNN_clf, X, y, cv = 3, n_jobs = -1, scoring = "neg_mean_squared_error"))
    return {'loss':KNN_score, 'status': STATUS_OK }

KNN_space = {'n_neighbors': hp.quniform('n_neighbors', 1, 20, 1),}

KNN_best = fmin(fn = KNN_fun, space = KNN_space,algo = tpe.suggest, max_evals = 10)
print("Estimated optimum (KNN):"+str(KNN_best))

100%|██████████| 10/10 [00:00<00:00, 64.76it/s, best loss: 80.83005201647829]
Estimated optimum (KNN):{'n_neighbors': 6.0}


#### `Artificial Neural Network`

In [30]:
# Artificial Neural Network (ANN)
def ANN_fun(params):
    params = {
        "optimizer":str(params['optimizer']),
        "activation":str(params['activation']),
        "loss":str(params['loss']),
        'batch_size': abs(int(params['batch_size'])),
        'neurons': abs(int(params['neurons'])),
        'epochs': abs(int(params['epochs'])),
        'patience': abs(int(params['patience']))
    }
    ANN_clf = KerasRegressor(build_fn = ann_model,**params, verbose = 0)
    ANN_score = -np.mean(cross_val_score(ANN_clf, X, y, cv = 3, scoring = "neg_mean_squared_error"))
    return {'loss':ANN_score, 'status': STATUS_OK }

ANN_space = {
    "optimizer":hp.choice('optimizer',['adam','rmsprop']),
    "activation":hp.choice('activation',['relu','tanh']),
    "loss":hp.choice('loss',['mse','mae']),
    'batch_size': hp.quniform('batch_size', 16, 32,16),
    'neurons': hp.quniform('neurons', 10, 100,10),
    'epochs': hp.quniform('epochs', 20, 50,20),
    'patience': hp.quniform('patience', 5, 20,5),
}

ANN_best = fmin(fn = ANN_fun, space = ANN_space, algo = tpe.suggest, max_evals = 10)
print("Estimated optimum (ANN): " + str(ANN_best))

100%|██████████| 10/10 [00:55<00:00,  5.56s/it, best loss: 36.59635901614879]
Estimated optimum (ANN): {'activation': 1, 'batch_size': 16.0, 'epochs': 40.0, 'loss': 0, 'neurons': 100.0, 'optimizer': 1, 'patience': 10.0}
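
The step argument of `hp.quniform` decides which values can be reached. hyperopt documents `hp.quniform(label, low, high, q)` as returning a value like `round(uniform(low, high) / q) * q`, so `hp.quniform('epochs', 20, 50, 20)` can produce 20 or 40 but never 50, which is consistent with `'epochs': 40.0` in the result above. The sketch below simulates that formula with the standard library to list the reachable values for each ANN parameter. `quniform_sample` is a stand-in written for this illustration, not hyperopt's implementation, and it assumes half-to-even rounding as in Python's built-in `round`.

```python
import random

def quniform_sample(rng, low, high, q):
    """Stand-in for hp.quniform: round(uniform(low, high) / q) * q."""
    return round(rng.uniform(low, high) / q) * q

rng = random.Random(0)  # arbitrary seed, only for repeatability
spaces = {
    'batch_size': (16, 32, 16),
    'neurons': (10, 100, 10),
    'epochs': (20, 50, 20),
    'patience': (5, 20, 5),
}
# Collect the distinct values seen over many draws for each parameter
reachable = {
    name: sorted({quniform_sample(rng, *bounds) for _ in range(5000)})
    for name, bounds in spaces.items()
}
for name, values in reachable.items():
    print(name, values)
```

If 50 epochs should be a candidate, one option is to list the allowed values explicitly with `hp.choice('epochs', [20, 50])`, at the cost of receiving an index rather than the value in the `fmin` result.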


---