### Hyperparameter tuning for different ML models
Every machine learning algorithm has its own set of hyperparameters, so each one needs its own search grid.

https://machinelearningmastery.com/hyperparameters-for-classification-machine-learning-algorithms/

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import pandas as pd
Input_path = '/content/drive/My Drive/ML and AI/Datasets/Iris dataset/iris.csv'
data =  pd.read_csv(Input_path)
display(data.head())

Unnamed: 0,sepal.length,sepal.width,petal.length,petal.width,variety
0,5.1,3.5,1.4,0.2,Setosa
1,4.9,3.0,1.4,0.2,Setosa
2,4.7,3.2,1.3,0.2,Setosa
3,4.6,3.1,1.5,0.2,Setosa
4,5.0,3.6,1.4,0.2,Setosa


In [None]:
X = data.iloc[:, :-1].values  # X -> feature variables (every column except the last)
y = data.iloc[:, -1].values   # y -> target labels (the last column)

# Splitting the data into Train and Test
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 1)

In [None]:
# Data is ready

The seven classification algorithms we will look at are as follows:

1. Logistic Regression
2. Ridge Classifier
3. K-Nearest Neighbors (KNN)
4. Support Vector Machine (SVM)
5. Bagged Decision Trees (Bagging)
6. Random Forest
7. Stochastic Gradient Boosting
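
Every section below repeats the same three steps: define a grid, run `GridSearchCV` with repeated stratified k-fold cross-validation, and print the best result. As a rough sketch, those steps could be wrapped in one function. The name `run_grid_search` is hypothetical (it is not used elsewhere in this notebook), and scikit-learn's bundled copy of Iris stands in for the CSV on Drive so the cell runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold, train_test_split


def run_grid_search(model, grid, X, y, n_splits=10, n_repeats=3, seed=1):
    """Hypothetical helper: grid-search `model` over `grid` with repeated stratified k-fold CV."""
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=seed)
    search = GridSearchCV(estimator=model, param_grid=grid, cv=cv,
                          scoring='accuracy', error_score=0)
    return search.fit(X, y)


# Assumption: load_iris() replaces the Drive CSV purely to keep the sketch self-contained.
X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3, random_state=1)

# Same call shape for any estimator: pass the model and its grid.
result = run_grid_search(RidgeClassifier(), {'alpha': [0.1, 0.5, 1.0]}, X_tr, y_tr)
print("Best: %f using %s" % (result.best_score_, result.best_params_))
```

Swapping in another estimator only changes the first two arguments, which keeps the cross-validation settings identical across models.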

### Logistic Regression
Logistic regression does not really have any critical hyperparameters to tune.

Sometimes, you can see useful differences in performance or convergence with different solvers (solver).

solver in ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']

Regularization (penalty) can sometimes be helpful.

penalty in ['none', 'l1', 'l2', 'elasticnet']

Note: not all solvers support all regularization terms. Recent scikit-learn versions also spell the first option as None rather than the string 'none'.

The C parameter controls the penalty strength and can also be worth tuning. C is the inverse of the regularization strength, so smaller values mean stronger regularization.

C in [100, 10, 1.0, 0.1, 0.01]

For the full list of hyperparameters, see the sklearn.linear_model.LogisticRegression API.

The example below demonstrates grid searching the key hyperparameters for LogisticRegression on the Iris training split prepared above.

Some combinations were omitted to cut back on the warnings/errors.
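
Because not every solver accepts every penalty, one way to search more combinations without triggering errors is to pass `param_grid` as a list of dictionaries, each pairing solvers with penalties they support. The sketch below is an illustration under assumptions, not the grid used in this notebook: it uses `load_iris()` as stand-in data, a smaller set of C values, a plain 5-fold split, and a raised `max_iter` so that `saga` has room to converge.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Assumption: bundled Iris data instead of the Drive CSV.
X_demo, y_demo = load_iris(return_X_y=True)

c_values = [10, 1.0, 0.1]

# Each dict is expanded on its own, so incompatible pairs are never generated.
param_grid = [
    {'solver': ['lbfgs', 'newton-cg'], 'penalty': ['l2'], 'C': c_values},
    {'solver': ['saga'], 'penalty': ['l1', 'l2'], 'C': c_values},
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
search = GridSearchCV(LogisticRegression(max_iter=5000), param_grid,
                      cv=cv, scoring='accuracy')
search.fit(X_demo, y_demo)

print("Candidates tried: %d" % len(search.cv_results_['params']))
print("Best: %f using %s" % (search.best_score_, search.best_params_))
```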

In [None]:
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression

solvers = ['newton-cg', 'lbfgs', 'liblinear']
penalty = ['l2']
c_values = [100, 10, 1.0, 0.1, 0.01]
model = LogisticRegression()

In [None]:
# define grid search
grid = dict(solver=solvers,penalty=penalty,C=c_values)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.968788 using {'C': 1.0, 'penalty': 'l2', 'solver': 'newton-cg'}
0.958788 (0.053820) with: {'C': 100, 'penalty': 'l2', 'solver': 'newton-cg'}
0.958788 (0.053820) with: {'C': 100, 'penalty': 'l2', 'solver': 'lbfgs'}
0.958485 (0.064397) with: {'C': 100, 'penalty': 'l2', 'solver': 'liblinear'}
0.958788 (0.053820) with: {'C': 10, 'penalty': 'l2', 'solver': 'newton-cg'}
0.958788 (0.053820) with: {'C': 10, 'penalty': 'l2', 'solver': 'lbfgs'}
0.952424 (0.059088) with: {'C': 10, 'penalty': 'l2', 'solver': 'liblinear'}
0.968788 (0.044206) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'newton-cg'}
0.968788 (0.044206) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'lbfgs'}
0.933939 (0.073628) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'liblinear'}
0.914848 (0.073123) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'newton-cg'}
0.914848 (0.073123) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'lbfgs'}
0.755758 (0.080211) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'liblinear'}
0.791212 (0.071523) wit

In [None]:
# Print the tuned parameters and score 
print("Logistic Regression: {}".format(grid_result.best_params_)) 
print("Best score is {}".format(grid_result.best_score_)) 

Logistic Regression: {'C': 1.0, 'penalty': 'l2', 'solver': 'newton-cg'}
Best score is 0.9687878787878789
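
The best score reported above is a cross-validation average computed on the training split only; `X_test` and `y_test` are never touched by the search. A minimal sketch of scoring the tuned model on held-out data follows. It assumes the default `refit=True`, under which `GridSearchCV` retrains the best configuration on the full training data and exposes it as `best_estimator_`. The data, the reduced grid, and the 5-fold setting are stand-ins chosen so the cell runs independently.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: bundled Iris data instead of the Drive CSV.
X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3, random_state=1)

search = GridSearchCV(LogisticRegression(max_iter=1000), {'C': [10, 1.0, 0.1]},
                      cv=5, scoring='accuracy')
search.fit(X_tr, y_tr)

# best_estimator_ was refit on all of X_tr; score it on data the search never saw.
test_accuracy = search.best_estimator_.score(X_te, y_te)
print("CV accuracy:   %f" % search.best_score_)
print("Test accuracy: %f" % test_accuracy)
```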


### Ridge Classifier
Ridge regression is a penalized linear regression model for predicting a numerical value.

Nevertheless, it can be very effective when applied to classification.

Perhaps the most important parameter to tune is the regularization strength (alpha). A good starting point might be values in the range [0.1 to 1.0]

alpha in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

For the full list of hyperparameters, see the sklearn.linear_model.RidgeClassifier API.

The example below demonstrates grid searching the key hyperparameters for RidgeClassifier on the Iris training split prepared above.

In [None]:
# example of grid searching key hyperparameters for ridge classifier
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import RidgeClassifier

model = RidgeClassifier()
alpha = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

In [None]:
# define grid search
grid = dict(alpha=alpha)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.805758 using {'alpha': 0.9}
0.799697 (0.102247) with: {'alpha': 0.1}
0.799697 (0.102247) with: {'alpha': 0.2}
0.796364 (0.106964) with: {'alpha': 0.3}
0.796364 (0.106964) with: {'alpha': 0.4}
0.796364 (0.106964) with: {'alpha': 0.5}
0.796364 (0.106964) with: {'alpha': 0.6}
0.799697 (0.108572) with: {'alpha': 0.7}
0.802727 (0.102534) with: {'alpha': 0.8}
0.805758 (0.098850) with: {'alpha': 0.9}
0.805758 (0.098850) with: {'alpha': 1.0}
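
The scores above barely change across alpha values from 0.1 to 1.0, which suggests the linear grid may be too narrow to show the effect of regularization. One option, offered here as a sketch rather than a recommendation, is to search alpha on a logarithmic scale. The range `1e-3` to `1e2`, the stand-in data from `load_iris()`, and the 5-fold split are all assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Assumption: bundled Iris data instead of the Drive CSV.
X_demo, y_demo = load_iris(return_X_y=True)

# Six values spaced by factors of ten: 0.001, 0.01, 0.1, 1, 10, 100.
alphas = np.logspace(-3, 2, 6)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
ridge_search = GridSearchCV(RidgeClassifier(), {'alpha': alphas}, cv=cv, scoring='accuracy')
ridge_search.fit(X_demo, y_demo)

for mean, param in zip(ridge_search.cv_results_['mean_test_score'],
                       ridge_search.cv_results_['params']):
    print("%f with: %r" % (mean, param))
```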


### K-Nearest Neighbors (KNN)

In [None]:
# define models and parameters
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier()
n_neighbors = range(1, 21, 2)
weights = ['uniform', 'distance']
# note: 'minkowski' with the default p=2 is the same distance as 'euclidean'
metric = ['euclidean', 'manhattan', 'minkowski']

In [None]:
# define grid search
grid = dict(n_neighbors=n_neighbors,weights=weights,metric=metric)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.975556 using {'metric': 'euclidean', 'n_neighbors': 15, 'weights': 'uniform'}
0.960000 (0.044222) with: {'metric': 'euclidean', 'n_neighbors': 1, 'weights': 'uniform'}
0.960000 (0.044222) with: {'metric': 'euclidean', 'n_neighbors': 1, 'weights': 'distance'}
0.960000 (0.040734) with: {'metric': 'euclidean', 'n_neighbors': 3, 'weights': 'uniform'}
0.960000 (0.040734) with: {'metric': 'euclidean', 'n_neighbors': 3, 'weights': 'distance'}
0.964444 (0.037450) with: {'metric': 'euclidean', 'n_neighbors': 5, 'weights': 'uniform'}
0.964444 (0.037450) with: {'metric': 'euclidean', 'n_neighbors': 5, 'weights': 'distance'}
0.968889 (0.037450) with: {'metric': 'euclidean', 'n_neighbors': 7, 'weights': 'uniform'}
0.966667 (0.037515) with: {'metric': 'euclidean', 'n_neighbors': 7, 'weights': 'distance'}
0.966667 (0.047920) with: {'metric': 'euclidean', 'n_neighbors': 9, 'weights': 'uniform'}
0.968889 (0.037450) with: {'metric': 'euclidean', 'n_neighbors': 9, 'weights': 'distance'}
0.971111 

In [None]:
# Print the tuned parameters and score 
print("K-Nearest Neighbors (KNN): {}".format(grid_result.best_params_)) 
print("Best score is {}".format(grid_result.best_score_)) 

K-Nearest Neighbors (KNN): {'metric': 'euclidean', 'n_neighbors': 15, 'weights': 'uniform'}
Best score is 0.9755555555555556
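
KNN classifies by distance, so features measured on larger scales can dominate the neighbor search. A possible variation, sketched below under assumptions, places a `StandardScaler` in a `Pipeline` so scaling is re-fit inside each cross-validation fold, and tunes the KNN step through the `knn__` parameter prefix. The step names `scale` and `knn` are arbitrary labels, the data comes from `load_iris()`, and whether scaling helps on Iris specifically is not something this notebook establishes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: bundled Iris data instead of the Drive CSV.
X_demo, y_demo = load_iris(return_X_y=True)

# The scaler is fit on the training part of each fold only, avoiding leakage.
pipe = Pipeline([('scale', StandardScaler()), ('knn', KNeighborsClassifier())])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
grid = {
    'knn__n_neighbors': list(range(1, 21, 2)),
    'knn__weights': ['uniform', 'distance'],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
knn_search = GridSearchCV(pipe, grid, cv=cv, scoring='accuracy')
knn_search.fit(X_demo, y_demo)
print("Best: %f using %s" % (knn_search.best_score_, knn_search.best_params_))
```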


### Support Vector Machine (SVM)

In [None]:
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

model = SVC()
kernel = ['poly', 'rbf', 'sigmoid']
C = [50, 10, 1.0, 0.1, 0.01]
gamma = ['scale']

In [None]:
# define grid search
grid = dict(kernel=kernel,C=C,gamma=gamma)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.977778 using {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
0.973333 (0.040734) with: {'C': 50, 'gamma': 'scale', 'kernel': 'poly'}
0.964444 (0.041216) with: {'C': 50, 'gamma': 'scale', 'kernel': 'rbf'}
0.028889 (0.044500) with: {'C': 50, 'gamma': 'scale', 'kernel': 'sigmoid'}
0.971111 (0.037251) with: {'C': 10, 'gamma': 'scale', 'kernel': 'poly'}
0.977778 (0.031427) with: {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
0.048889 (0.051448) with: {'C': 10, 'gamma': 'scale', 'kernel': 'sigmoid'}
0.962222 (0.037251) with: {'C': 1.0, 'gamma': 'scale', 'kernel': 'poly'}
0.964444 (0.044666) with: {'C': 1.0, 'gamma': 'scale', 'kernel': 'rbf'}
0.073333 (0.052634) with: {'C': 1.0, 'gamma': 'scale', 'kernel': 'sigmoid'}
0.977778 (0.035832) with: {'C': 0.1, 'gamma': 'scale', 'kernel': 'poly'}
0.933333 (0.059628) with: {'C': 0.1, 'gamma': 'scale', 'kernel': 'rbf'}
0.073333 (0.052634) with: {'C': 0.1, 'gamma': 'scale', 'kernel': 'sigmoid'}
0.913333 (0.064750) with: {'C': 0.01, 'gamma': 'scale', 

In [None]:
# Print the tuned parameters and score 
print("Support Vector Machine (SVM): {}".format(grid_result.best_params_)) 
print("Best score is {}".format(grid_result.best_score_)) 

Support Vector Machine (SVM): {'C': 10, 'gamma': 'scale', 'kernel': 'rbf'}
Best score is 0.9777777777777779
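
The printed summaries get long as grids grow. As an alternative view, `cv_results_` can be loaded into a pandas DataFrame and sorted by `rank_test_score`. The sketch below assumes a reduced SVM grid, 5-fold cross-validation, and `load_iris()` data so that it runs on its own; the column names are the ones `GridSearchCV` generates for parameters called `kernel` and `C`.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Assumption: bundled Iris data instead of the Drive CSV.
X_demo, y_demo = load_iris(return_X_y=True)

svm_search = GridSearchCV(SVC(gamma='scale'),
                          {'kernel': ['poly', 'rbf'], 'C': [10, 1.0, 0.1]},
                          cv=5, scoring='accuracy')
svm_search.fit(X_demo, y_demo)

# Keep the columns of interest and put the best-ranked candidates first.
columns = ['param_kernel', 'param_C', 'mean_test_score', 'std_test_score', 'rank_test_score']
results_table = (pd.DataFrame(svm_search.cv_results_)[columns]
                 .sort_values('rank_test_score')
                 .reset_index(drop=True))
print(results_table.to_string(index=False))
```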


### Bagged Decision Trees (Bagging)

In [None]:
#example of grid searching key hyperparameters for BaggingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import BaggingClassifier

# define models and parameters
model = BaggingClassifier()
n_estimators = [10, 100, 1000]

In [None]:
# define grid search
grid = dict(n_estimators=n_estimators)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.956364 using {'n_estimators': 1000}
0.949394 (0.047452) with: {'n_estimators': 10}
0.953030 (0.052596) with: {'n_estimators': 100}
0.956364 (0.052297) with: {'n_estimators': 1000}


In [None]:
# Print the tuned parameters and score 
print("Bagged Decision Trees (Bagging): {}".format(grid_result.best_params_)) 
print("Best score is {}".format(grid_result.best_score_)) 

Bagged Decision Trees (Bagging): {'n_estimators': 1000}
Best score is 0.9563636363636364


### Random Forest

In [None]:
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
n_estimators = [10, 100, 1000]
max_features = ['sqrt', 'log2']

In [None]:
# define grid search
grid = dict(n_estimators=n_estimators,max_features=max_features)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.953030 using {'max_features': 'sqrt', 'n_estimators': 1000}
0.940303 (0.066294) with: {'max_features': 'sqrt', 'n_estimators': 10}
0.946667 (0.062884) with: {'max_features': 'sqrt', 'n_estimators': 100}
0.953030 (0.052596) with: {'max_features': 'sqrt', 'n_estimators': 1000}
0.946364 (0.068166) with: {'max_features': 'log2', 'n_estimators': 10}
0.950000 (0.057436) with: {'max_features': 'log2', 'n_estimators': 100}
0.953030 (0.052596) with: {'max_features': 'log2', 'n_estimators': 1000}


In [None]:
# Print the tuned parameters and score 
print("Random Forest: {}".format(grid_result.best_params_)) 
print("Best score is {}".format(grid_result.best_score_)) 

Random Forest: {'max_features': 'sqrt', 'n_estimators': 1000}
Best score is 0.953030303030303


### Stochastic Gradient Boosting

In [None]:
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier()
n_estimators = [10, 100, 1000]
learning_rate = [0.001, 0.01, 0.1]
subsample = [0.5, 0.7, 1.0]
max_depth = [3, 7, 9]

In [None]:
# define grid search
grid = dict(learning_rate=learning_rate, n_estimators=n_estimators, subsample=subsample, max_depth=max_depth)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
grid_search = GridSearchCV(estimator=model, param_grid=grid, n_jobs=-1, cv=cv, scoring='accuracy',error_score=0)
grid_result = grid_search.fit(X_train, y_train)

In [None]:
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.959394 using {'learning_rate': 0.01, 'max_depth': 9, 'n_estimators': 1000, 'subsample': 0.7}
0.463030 (0.164106) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 10, 'subsample': 0.5}
0.460000 (0.161699) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 10, 'subsample': 0.7}
0.463030 (0.164106) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 10, 'subsample': 1.0}
0.952727 (0.053956) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 100, 'subsample': 0.5}
0.955455 (0.054259) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 100, 'subsample': 0.7}
0.955758 (0.058857) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 100, 'subsample': 1.0}
0.953030 (0.052596) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 1000, 'subsample': 0.5}
0.956364 (0.057323) with: {'learning_rate': 0.001, 'max_depth': 3, 'n_estimators': 1000, 'subsample': 0.7}
0.953333 (0.057339) with: {'learning_rate': 0.001, '

In [None]:
# Print the tuned parameters and score 
print("Stochastic Gradient Boosting: {}".format(grid_result.best_params_)) 
print("Best score is {}".format(grid_result.best_score_)) 

Stochastic Gradient Boosting: {'learning_rate': 0.01, 'max_depth': 9, 'n_estimators': 1000, 'subsample': 0.7}
Best score is 0.9593939393939392
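
The grid above has 3 x 3 x 3 x 3 = 81 combinations, and with 10 folds repeated 3 times that comes to 2,430 model fits. When that cost is too high, `RandomizedSearchCV` samples a fixed number of combinations instead of trying them all. The sketch below is illustrative only: `n_iter=5`, the 3-fold split, the `load_iris()` data, and the cap of 100 on `n_estimators` are assumptions made to keep the cell quick.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Assumption: bundled Iris data instead of the Drive CSV.
X_demo, y_demo = load_iris(return_X_y=True)

# Same hyperparameters as the grid above, with n_estimators capped for speed.
param_distributions = {
    'n_estimators': [10, 50, 100],
    'learning_rate': [0.001, 0.01, 0.1],
    'subsample': [0.5, 0.7, 1.0],
    'max_depth': [3, 7, 9],
}

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
random_search = RandomizedSearchCV(GradientBoostingClassifier(random_state=1),
                                   param_distributions,
                                   n_iter=5,  # number of sampled combinations
                                   cv=cv, scoring='accuracy', random_state=1)
random_search.fit(X_demo, y_demo)
print("Best: %f using %s" % (random_search.best_score_, random_search.best_params_))
```

A random search gives no guarantee of finding the grid's best combination; it trades coverage for a predictable run time.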
