<a href="https://colab.research.google.com/github/and-is/learning-pytorch/blob/main/hyperparemeterTuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Hyperparameter Tuning
Hyperparameter tuning means searching for the hyperparameter values (learning rate, number of hidden layers, and so on) that give the best model performance.
\
Common approaches:
- Grid Search CV
- Random Search CV
- Bayesian Search
\
This notebook uses Bayesian search through Optuna.

### Content
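- Grid Search CV evaluates the cross-validated accuracy of every combination of the candidate values, i.e. every point on the grid formed by the two or three hyperparameters being tuned. The cost grows multiplicatively with each added hyperparameter.
- Random Search CV evaluates only a fixed number of randomly sampled combinations. It is cheaper, but the obvious downside is that it may miss the best combination.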
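
The difference between the two strategies can be sketched in plain Python. This is only an illustration: the `search_space` values and the toy `score` function are assumptions that stand in for a real cross-validated accuracy.

```python
import itertools
import random

# Hypothetical search space for two hyperparameters.
search_space = {
    "n_estimators": [100, 200, 300],
    "max_depth": [10, 20, 30],
}

def score(params):
    # Toy stand-in for a cross-validated accuracy; it peaks at (200, 20).
    return (
        1.0
        - abs(params["n_estimators"] - 200) / 1000
        - abs(params["max_depth"] - 20) / 100
    )

# Grid search: evaluate every combination (3 x 3 = 9 evaluations).
names = list(search_space)
grid = [dict(zip(names, values)) for values in itertools.product(*search_space.values())]
best_grid = max(grid, key=score)

# Random search: evaluate only a fixed budget of randomly drawn combinations.
rng = random.Random(0)
budget = 4
random_candidates = [
    {name: rng.choice(choices) for name, choices in search_space.items()}
    for _ in range(budget)
]
best_random = max(random_candidates, key=score)

print(len(grid), best_grid)  # 9 {'n_estimators': 200, 'max_depth': 20}
print(len(random_candidates), best_random)
```

Grid search is guaranteed to hit the best grid point but pays for all nine evaluations. Random search pays for four and finds the best point only if it happens to draw it.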


Bayesian Search goes about the search differently. It treats accuracy as an unknown function of the hyperparameters we are tuning, and tries to learn the shape of that function so it can locate the maximum, which is the best accuracy we can reach.
\
accuracy = f(param1, param2)
\
Evaluating a few combinations first gives a rough picture of this function. New combinations are then chosen intelligently, based on the results seen so far, so the picture gets sharper in the regions that look most promising.
\
Optuna helps us do this Bayesian search.

#### Optuna

- A Study is an optimization session made up of multiple trials, i.e. the overall experiment.
- A Trial is a single iteration of the optimization process, in which one specific set of hyperparameters is evaluated. Each trial runs the objective function once with its own set of values.
- Trial parameters are the specific hyperparameter values chosen during a trial.
- The Objective function is our accuracy function, i.e. the relation we want to model and optimize.
- The Sampler is the algorithm that suggests which combination of hyperparameter values to try next, based on the trials seen so far.
- TPE (Tree-structured Parzen Estimator) is the sampler Optuna uses under the hood by default.
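
To make the sampler idea more concrete, here is a minimal one-dimensional sketch of the idea behind TPE: split past trials into "good" and "bad", estimate a density for each group, and try next the candidate that looks most like a good trial and least like a bad one. It is a heavy simplification and not Optuna's implementation. The fixed bandwidth, the 25% "good" split, the candidate count, and the toy `objective` are all assumptions chosen for illustration.

```python
import math
import random

def kde(points, bandwidth):
    """Return a density function: an average of Gaussians centred on `points`."""
    def density(x):
        total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
        return total / (len(points) * bandwidth * math.sqrt(2 * math.pi))
    return density

def objective(x):
    # Toy objective to maximise; its peak is at x = 2.0.
    return -(x - 2.0) ** 2

def tpe_like_search(n_trials=40, n_startup=10, low=-5.0, high=5.0, seed=0):
    rng = random.Random(seed)
    history = []  # (x, score) pairs, one per trial
    for i in range(n_trials):
        if i < n_startup:
            # Startup phase: plain random sampling to gather initial evidence.
            x = rng.uniform(low, high)
        else:
            # Split past trials into the best quarter ("good") and the rest ("bad").
            ranked = sorted(history, key=lambda t: t[1], reverse=True)
            n_good = max(1, len(ranked) // 4)
            good = [t[0] for t in ranked[:n_good]]
            bad = [t[0] for t in ranked[n_good:]]
            good_density = kde(good, 1.0)
            bad_density = kde(bad, 1.0)
            # Draw candidates around good points, clipped to the search range,
            # and keep the one with the highest good/bad density ratio.
            candidates = [
                min(high, max(low, rng.gauss(rng.choice(good), 1.0)))
                for _ in range(24)
            ]
            x = max(candidates, key=lambda c: good_density(c) / (bad_density(c) + 1e-12))
        history.append((x, objective(x)))
    return max(history, key=lambda t: t[1])

best_x, best_score = tpe_like_search()
print(best_x, best_score)
```

The first `n_startup` trials are random because the two densities are meaningless until some evidence exists. After that, each new trial is steered by everything seen so far, which is the "intelligent" part that grid and random search lack.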


In [1]:
!pip install optuna

Collecting optuna
  Downloading optuna-4.2.0-py3-none-any.whl.metadata (17 kB)
Collecting alembic>=1.5.0 (from optuna)
  Downloading alembic-1.14.1-py3-none-any.whl.metadata (7.4 kB)
Collecting colorlog (from optuna)
  Downloading colorlog-6.9.0-py3-none-any.whl.metadata (10 kB)
Collecting Mako (from alembic>=1.5.0->optuna)
  Downloading Mako-1.3.8-py3-none-any.whl.metadata (2.9 kB)
Downloading optuna-4.2.0-py3-none-any.whl (383 kB)
Downloading alembic-1.14.1-py3-none-any.whl (233 kB)
Downloading colorlog-6.9.0-py3-none-any.whl (11 kB)
Downloading Mako-1.3.8-py3-none-any.whl (78 kB)
Installing collected packages: Mak

In [2]:
import optuna
# from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [3]:
import pandas as pd

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
           'DiabetesPedigreeFunction', 'Age', 'Outcome']

df = pd.read_csv(url, names=columns)

df.head()

   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
0            6      148             72             35        0  33.6                     0.627   50        1
1            1       85             66             29        0  26.6                     0.351   31        0
2            8      183             64              0        0  23.3                     0.672   32        1
3            1       89             66             23       94  28.1                     0.167   21        0
4            0      137             40             35      168  43.1                     2.288   33        1


In [4]:
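import numpy as np

# In this dataset a 0 in these columns stands for a missing measurement
cols_with_missing_vals = ['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI']
df[cols_with_missing_vals] = df[cols_with_missing_vals].replace(0, np.nan)
# Fill each missing value with the mean of its column
df.fillna(df.mean(), inplace=True)
print(df.isnull().sum())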

Pregnancies                 0
Glucose                     0
BloodPressure               0
SkinThickness               0
Insulin                     0
BMI                         0
DiabetesPedigreeFunction    0
Age                         0
Outcome                     0
dtype: int64


In [5]:
X = df.drop('Outcome', axis=1)
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(f'Training set shape: {X_train.shape}')
print(f'Test set shape: {X_test.shape}')

Training set shape: (537, 8)
Test set shape: (231, 8)


In [8]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Define the objective function
def objective(trial):
  # Suggest values for the hyperparameters
  n_estimators = trial.suggest_int('n_estimators', 100, 300)
  max_depth = trial.suggest_int('max_depth', 10, 30)

  # Create the model with the suggested hyperparameters
  model = RandomForestClassifier(
      n_estimators=n_estimators,
      max_depth=max_depth,
      random_state=42
  )

  # Run 3-fold cross-validation and take the mean accuracy
  score = cross_val_score(model, X_train, y_train, cv=3, scoring='accuracy').mean()

  return score
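
The `cv=3` argument in the objective means the training set is cut into three folds. Each fold takes one turn as the validation set while the other two are used for fitting, and the three accuracies are averaged. A rough sketch of that splitting logic in plain Python follows. It uses simple contiguous folds, which is an assumption made for brevity: for a classifier, scikit-learn stratifies the folds by class label instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, unstratified folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def cross_val_mean(score_fn, n_samples, k=3):
    """Average a per-fold score, mirroring `.mean()` over the fold scores."""
    scores = [score_fn(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# 537 training rows (the training-set size printed earlier) split into 3 folds.
folds = list(k_fold_indices(537, 3))
print([len(val) for _, val in folds])  # [179, 179, 179]
```

Each trial therefore fits the model three times, which is worth keeping in mind when choosing `n_trials`.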


In [9]:
study = optuna.create_study(direction='maximize', sampler=optuna.samplers.TPESampler())
# optuna.samplers.RandomSampler() can be used instead to sample hyperparameters at random.
study.optimize(objective, n_trials=50)

[I 2025-02-02 07:57:16,703] A new study created in memory with name: no-name-5d6d2cad-39a6-42fb-8508-6bf9f2698b6d
[I 2025-02-02 07:57:18,348] Trial 0 finished with value: 0.7728119180633147 and parameters: {'n_estimators': 237, 'max_depth': 12}. Best is trial 0 with value: 0.7728119180633147.
[I 2025-02-02 07:57:19,294] Trial 1 finished with value: 0.7746741154562384 and parameters: {'n_estimators': 110, 'max_depth': 28}. Best is trial 1 with value: 0.7746741154562384.
[I 2025-02-02 07:57:20,284] Trial 2 finished with value: 0.7653631284916201 and parameters: {'n_estimators': 111, 'max_depth': 11}. Best is trial 1 with value: 0.7746741154562384.
[I 2025-02-02 07:57:22,097] Trial 3 finished with value: 0.7709497206703911 and parameters: {'n_estimators': 279, 'max_depth': 17}. Best is trial 1 with value: 0.7746741154562384.
[I 2025-02-02 07:57:23,323] Trial 4 finished with value: 0.7728119180633147 and parameters: {'n_estimators': 219, 'max_depth': 15}. Best is trial 1 with value: 0.7746

In [10]:
print(f'Best trial accuracy: {study.best_trial.value}')
print(f'Best hyperparameters: {study.best_trial.params}')

Best trial accuracy: 0.7858472998137803
Best hyperparameters: {'n_estimators': 121, 'max_depth': 15}


In [13]:
# For grid search, pass sampler=optuna.samplers.GridSampler(search_space)
# to optuna.create_study(), where search_space lists the values to try:
# search_space = {
#   'param-1': [2,3,4],
#   'param-2': [4,5,7]
# }

#### Optuna Visualizations

In [14]:
from optuna.visualization import plot_optimization_history, plot_parallel_coordinate, plot_slice, plot_contour, plot_param_importances

In [15]:
plot_optimization_history(study).show()

In [16]:
plot_parallel_coordinate(study).show()

In [17]:
plot_slice(study).show()

In [18]:
plot_contour(study).show()

In [19]:
plot_param_importances(study).show()

#### Tuning Multiple Models in a Single Optuna Study

The choice of algorithm can itself be treated as a categorical hyperparameter, so a single study can switch between different models and tune each one's own hyperparameters.

In [20]:
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC

In [21]:
# Define the objective function for Optuna
def objective(trial):
    # Choose the algorithm to tune
    classifier_name = trial.suggest_categorical('classifier', ['SVM', 'RandomForest', 'GradientBoosting'])

    if classifier_name == 'SVM':
        # SVM hyperparameters
        c = trial.suggest_float('C', 0.1, 100, log=True)
        kernel = trial.suggest_categorical('kernel', ['linear', 'rbf', 'poly', 'sigmoid'])
        gamma = trial.suggest_categorical('gamma', ['scale', 'auto'])

        model = SVC(C=c, kernel=kernel, gamma=gamma, random_state=42)

    elif classifier_name == 'RandomForest':
        # Random Forest hyperparameters
        n_estimators = trial.suggest_int('n_estimators', 50, 300)
        max_depth = trial.suggest_int('max_depth', 3, 20)
        min_samples_split = trial.suggest_int('min_samples_split', 2, 10)
        min_samples_leaf = trial.suggest_int('min_samples_leaf', 1, 10)
        bootstrap = trial.suggest_categorical('bootstrap', [True, False])

        model = RandomForestClassifier(
            n_estimators=n_estimators,
            max_depth=max_depth,
            min_samples_split=min_samples_split,
            min_samples_leaf=min_samples_leaf,
            bootstrap=bootstrap,
            random_state=42
        )

    elif classifier_name == 'GradientBoosting':
        # Gradient Boosting hyperparameters
        n_estimators = trial.suggest_int('n_estimators', 50, 300)
        learning_rate = trial.suggest_float('learning_rate', 0.01, 0.3, log=True)
        max_depth = trial.suggest_int('max_depth', 3, 20)
        min_samples_split = trial.suggest_int('min_samples_split', 2, 10)
        min_samples_leaf = trial.suggest_int('min_samples_leaf', 1, 10)

        model = GradientBoostingClassifier(
            n_estimators=n_estimators,
            learning_rate=learning_rate,
            max_depth=max_depth,
            min_samples_split=min_samples_split,
            min_samples_leaf=min_samples_leaf,
            random_state=42
        )

    # Perform cross-validation and return the mean accuracy
    score = cross_val_score(model, X_train, y_train, cv=3, scoring='accuracy').mean()
    return score

In [22]:
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

[I 2025-02-02 08:06:40,437] A new study created in memory with name: no-name-deff10de-f15e-4788-ab55-cc2a583022b6
[I 2025-02-02 08:06:41,214] Trial 0 finished with value: 0.7672253258845437 and parameters: {'classifier': 'RandomForest', 'n_estimators': 65, 'max_depth': 10, 'min_samples_split': 4, 'min_samples_leaf': 10, 'bootstrap': True}. Best is trial 0 with value: 0.7672253258845437.
[I 2025-02-02 08:06:41,456] Trial 1 finished with value: 0.6797020484171323 and parameters: {'classifier': 'SVM', 'C': 76.60742671798609, 'kernel': 'poly', 'gamma': 'scale'}. Best is trial 0 with value: 0.7672253258845437.
[I 2025-02-02 08:06:41,522] Trial 2 finished with value: 0.7169459962756052 and parameters: {'classifier': 'SVM', 'C': 8.131518772182526, 'kernel': 'poly', 'gamma': 'scale'}. Best is trial 0 with value: 0.7672253258845437.
[I 2025-02-02 08:06:44,443] Trial 3 finished with value: 0.7579143389199254 and parameters: {'classifier': 'RandomForest', 'n_estimators': 266, 'max_depth': 7, 'min

In [23]:
best_trial = study.best_trial
print("Best trial parameters:", best_trial.params)
print("Best trial accuracy:", best_trial.value)

Best trial parameters: {'classifier': 'SVM', 'C': 0.11831525923946343, 'kernel': 'linear', 'gamma': 'scale'}
Best trial accuracy: 0.7895716945996275


In [28]:
# trials_dataframe() returns every trial as a pandas DataFrame, so any statistic can be computed from it.
study.trials_dataframe()['params_classifier'].value_counts()

params_classifier   count
SVM                 78
RandomForest        12
GradientBoosting    10
