
"Hyperparameter tuning is the process of finding the best configuration of hyperparameters (the settings you choose before training a model) to maximize performance in machine learning and deep learning."

🔥 What Are Hyperparameters?
- Hyperparameters are external settings that control how a model learns.
They are not learned from data; you pick them manually or let an algorithm search for the best ones.

✅ Examples in Machine Learning:

  - Learning rate (η)

  - Number of trees in Random Forest

  - Maximum depth of a decision tree

  - Number of neighbors (K) in KNN

  - Regularization strength (C) in SVM or Logistic Regression

✅ Examples in Deep Learning:

  - Learning rate

  - Number of layers (depth)

  - Number of neurons per layer

  - Activation functions

  - Batch size

  - Dropout rate

  - Optimizer (Adam, SGD, RMSprop, etc.)
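
To make the distinction concrete, here is a minimal PyTorch sketch. The `hparams` dictionary and its values are illustrative assumptions, not recommended settings. They are fixed before training, while the weights and biases inside the layers are what training updates.

```python
import torch.nn as nn

# Hyperparameters: chosen before training (illustrative values, not recommendations)
hparams = {"hidden_dim": 64, "dropout_rate": 0.2, "learning_rate": 1e-3}

# Parameters: the weights and biases that the optimizer updates during training
model = nn.Sequential(
    nn.Linear(20, hparams["hidden_dim"]),
    nn.ReLU(),
    nn.Dropout(hparams["dropout_rate"]),
    nn.Linear(hparams["hidden_dim"], 2),
)

n_params = sum(p.numel() for p in model.parameters())
print(f"{len(hparams)} hyperparameters, {n_params} learnable parameters")
```

Changing `hidden_dim` changes how many parameters exist, which is one reason it cannot be learned by gradient descent alongside them.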

---

🔧 What Is Hyperparameter Tuning?

- Hyperparameter tuning means trying different combinations of hyperparameters to find the one that gives the best accuracy, loss, or other metric on validation data.

  - It's like adjusting the knobs of the model until it performs best.


🔍 Why Is Hyperparameter Tuning Important?

- Poorly chosen hyperparameters lead to bad results, even if the model architecture is good.

- Good tuning can:

  - Increase accuracy

  - Reduce overfitting

  - Speed up training

  - Improve model stability

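🧪 Common Hyperparameter Tuning Methods

⭐ 1. Grid Search

  - Try every combination from a predefined grid of values.

  - Pro: Guaranteed to find the best of the listed combinations.

  - Con: The number of runs multiplies with each added hyperparameter, so it is very slow for large search spaces.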
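
A minimal sketch of grid search in plain Python follows. The grid values and `toy_score` are assumptions made for illustration. In practice the scoring function would train a model and return its validation metric.

```python
from itertools import product

# Hypothetical search space (values chosen only for illustration)
grid = {
    "learning_rate": [1e-3, 1e-2, 1e-1],
    "hidden_dim": [32, 64],
    "dropout_rate": [0.0, 0.5],
}

def toy_score(config):
    """Stand-in for 'train a model and return validation accuracy'."""
    return (
        1.0
        - abs(config["learning_rate"] - 1e-2)
        - abs(config["hidden_dim"] - 64) / 1000
        - config["dropout_rate"] / 10
    )

keys = list(grid)
# product() enumerates every combination: 3 * 2 * 2 = 12 configurations
configs = [dict(zip(keys, values)) for values in product(*grid.values())]
best = max(configs, key=toy_score)

print(len(configs), "configurations tried")
print("best:", best)
```

Adding a fourth hyperparameter with five candidate values would raise the count from 12 to 60 runs, which is the cost noted above.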

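⭐ 2. Random Search

  - Randomly sample combinations from the search space.

  - Pro: Runs on a fixed trial budget, so it is usually much cheaper than grid search.

  - Con: No guarantee of covering the best region; it might skip good combinations.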
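
Random search can be sketched in a few lines. The ranges and `toy_score` below are illustrative assumptions. The learning rate is drawn on a log scale, a common choice when a value spans several orders of magnitude.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def sample_config():
    """Draw one random configuration (ranges are illustrative assumptions)."""
    return {
        # Log-uniform: sample the exponent uniformly so 1e-5..1e-1 is covered evenly
        "learning_rate": 10 ** rng.uniform(-5, -1),
        "hidden_dim": rng.randint(16, 256),
        "dropout_rate": rng.uniform(0.0, 0.5),
    }

def toy_score(config):
    """Stand-in for 'train a model and return validation accuracy'."""
    return -abs(math.log10(config["learning_rate"]) + 3) - config["dropout_rate"]

n_trials = 20  # the budget is fixed, no matter how many hyperparameters there are
trials = [sample_config() for _ in range(n_trials)]
best = max(trials, key=toy_score)
print("best of", n_trials, "random trials:", best)
```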

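⭐ 3. Bayesian Optimization

  - Uses the results of earlier trials to build a probabilistic model of the objective and pick the most promising hyperparameters to try next.

  - Pro: Usually needs fewer trials than grid or random search.

  - Con: Harder to implement from scratch.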

⭐ 4. Hyperband / ASHA (Deep Learning)

  - Stops poorly performing trials early and spends the saved training time on the promising ones.
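
Both methods build on successive halving: train all candidates briefly, keep the better half, and give the survivors a larger budget. The sketch below shows that core loop in plain Python. The `quality` field and `partial_score` are toy assumptions standing in for real training runs.

```python
import random

rng = random.Random(0)

# Eight hypothetical configurations, each with a hidden "quality" we pretend not to know
configs = [{"id": i, "quality": rng.random()} for i in range(8)]

def partial_score(config, epochs):
    """Stand-in for validation accuracy after `epochs` of training (toy assumption)."""
    return config["quality"] * (1 - 0.5 ** epochs)

survivors = configs
epochs = 1
total_epochs = 0
while len(survivors) > 1:
    total_epochs += epochs * len(survivors)  # budget spent in this round
    ranked = sorted(survivors, key=lambda c: partial_score(c, epochs), reverse=True)
    survivors = ranked[: len(ranked) // 2]   # keep the better half
    epochs *= 2                              # survivors get a larger budget next round

winner = survivors[0]
print("winner:", winner["id"], "epochs spent:", total_epochs)
```

In this toy setup the three rounds cost 8 + 8 + 8 = 24 epochs, whereas training all eight candidates for 4 epochs each would cost 32.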

⭐ 5. Genetic Algorithms / Evolutionary Search

  - Evolve a population of configurations: keep the best performers (selection) and perturb them (mutation) to create the next generation.
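
A minimal evolutionary search with only selection and mutation might look like the sketch below. The population size, mutation ranges, and `toy_score` are illustrative assumptions. Real genetic algorithms often add crossover as well.

```python
import random

rng = random.Random(0)

def toy_score(config):
    """Stand-in for validation accuracy; peaks at hidden_dim=128, dropout_rate=0.2."""
    return -abs(config["hidden_dim"] - 128) / 128 - abs(config["dropout_rate"] - 0.2)

def mutate(config):
    """Return a slightly perturbed copy, clipped to the allowed ranges."""
    return {
        "hidden_dim": min(256, max(16, config["hidden_dim"] + rng.randint(-16, 16))),
        "dropout_rate": min(0.5, max(0.0, config["dropout_rate"] + rng.uniform(-0.05, 0.05))),
    }

population = [
    {"hidden_dim": rng.randint(16, 256), "dropout_rate": rng.uniform(0.0, 0.5)}
    for _ in range(10)
]
initial_best = max(toy_score(c) for c in population)

for generation in range(20):
    # Selection: keep the top half as parents
    parents = sorted(population, key=toy_score, reverse=True)[:5]
    # Mutation: each parent produces one perturbed child
    population = parents + [mutate(p) for p in parents]

best = max(population, key=toy_score)
print("best after 20 generations:", best)
```

Because the parents are carried over unchanged, the best score in the population never gets worse from one generation to the next.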

---
🔥 Hyperparameter Tuning an ANN Using Optuna (PyTorch Example)

- Optuna is an open-source hyperparameter optimization framework.
It automatically searches for a good learning rate, number of hidden units, optimizer, dropout rate, etc.

## ✅ Step-by-Step Code: ANN + Optuna Tuning

In [None]:
#install optuna
!pip install optuna==4.6.0
!pip install sympy==1.12

🧠 1. Import Libraries and Create a Dummy Dataset

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import optuna

In [None]:
# Example dataset (dummy)
x = torch.randn(1000,20) # 1000 samples, 20 features
y = torch.randint(0,2,(1000,)) # 1000 binary labels

In [None]:
x.shape

torch.Size([1000, 20])

In [None]:
x

tensor([[ 0.3852, -0.2024,  0.6418,  ...,  0.5914,  0.9515, -1.0156],
        [ 0.7890, -0.2004, -0.9029,  ...,  1.0663, -0.3850,  0.1282],
        [ 0.4612,  0.0124, -0.2938,  ..., -0.2692, -0.2672,  0.0660],
        ...,
        [-0.1010,  1.3794,  0.9487,  ..., -0.4104,  0.3701,  0.7955],
        [-0.8198, -0.3324,  0.8307,  ..., -0.7330, -0.8682,  1.4792],
        [ 0.9425, -0.6863,  1.8670,  ...,  1.0880,  0.8200,  1.7518]])

In [None]:
y.shape

torch.Size([1000])

In [None]:
y

tensor([1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0,
        1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1,
        1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1,
        1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1,
        0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1,
        1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1,
        1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0,
        0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
        1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0,
        0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1,
        1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1,
        0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1,
        0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, ... (output truncated)

In [None]:
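dataset = TensorDataset(x,y)
train_loader = DataLoader(dataset,batch_size=32,shuffle=True)
# Note: test_loader wraps the same samples as train_loader, so any accuracy
# computed on it is training accuracy, not held-out accuracy.
test_loader = DataLoader(dataset,batch_size=32,shuffle=False)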
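
Since both loaders above read the same 1000 samples, one possible refinement is to hold out part of the data for validation. The sketch below is an assumption about how that could be done, not part of the original notebook. The 80/20 split, the seed, and the variable names are illustrative. Because the dummy labels are random, held-out accuracy on this data should stay near 0.5 whatever the hyperparameters.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Dummy data with the same shape as the example dataset
features = torch.randn(1000, 20)
labels = torch.randint(0, 2, (1000,))
full_set = TensorDataset(features, labels)

# 80/20 split with a seeded generator so the split is repeatable
generator = torch.Generator().manual_seed(42)
train_set, val_set = random_split(full_set, [800, 200], generator=generator)

split_train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32, shuffle=False)

print(len(train_set), "training samples,", len(val_set), "validation samples")
```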

🏗 2. Define the ANN Model

In [None]:
class ANN(nn.Module):
    def __init__(self,input_dim,hidden_dim,output_dim,dropout_rate):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim,hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(hidden_dim,output_dim)
        )
    def forward(self,x):
        return self.net(x)
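
A quick smoke test can confirm that the model produces one logit per class for each sample. The sketch repeats the class definition so that it runs on its own. The hidden size, dropout rate, and batch size are arbitrary illustrative values.

```python
import torch
import torch.nn as nn

class ANN(nn.Module):  # same definition as above, repeated so this cell runs on its own
    def __init__(self, input_dim, hidden_dim, output_dim, dropout_rate):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, x):
        return self.net(x)

demo_model = ANN(input_dim=20, hidden_dim=64, output_dim=2, dropout_rate=0.2)
batch = torch.randn(8, 20)

demo_model.eval()  # disable dropout and use BatchNorm running statistics
with torch.no_grad():
    logits = demo_model(batch)
    logits_again = demo_model(batch)

print(logits.shape)                       # one row per sample, one column per class
print(torch.equal(logits, logits_again))  # eval mode makes the forward pass deterministic
```

In `train()` mode the two forward passes would generally differ, because dropout zeroes a random subset of activations on each call.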


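🎯 3. The Optuna Objective Function

- For each trial, the objective function asks Optuna to:

  - suggest a learning rate

  - suggest the number of hidden units

  - suggest a dropout rate

  - suggest a batch size

  - pick an optimizer

- It then trains the model and returns the accuracy, which Optuna tries to maximize.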

In [None]:
def objective(trial):
    # Hyperparameters to tune
    input_dim = x.shape[1]
    output_dim = 2  # two logits for binary classification with CrossEntropyLoss
    hidden_dim = trial.suggest_int('hidden_dim',16,256)
    dropout_rate = trial.suggest_float('dropout_rate',0.0,0.5)
    learning_rate = trial.suggest_float('learning_rate',1e-5,1e-1,log=True)
    batch_size = trial.suggest_categorical('batch_size',[32,64,128])
    optimizer_name = trial.suggest_categorical('optimizer',['Adam','RMSprop','SGD'])

    # Rebuild the training loader so the suggested batch size is actually used
    loader = DataLoader(dataset,batch_size=batch_size,shuffle=True)

    # Model and loss
    model = ANN(input_dim=input_dim,hidden_dim=hidden_dim,output_dim=output_dim,dropout_rate=dropout_rate)
    criterion = nn.CrossEntropyLoss()

    # Look up the optimizer class by name in torch.optim
    optimizer = getattr(optim,optimizer_name)(model.parameters(),lr=learning_rate)

    # Training loop (10 epochs)
    model.train()
    for epoch in range(10):
        for data,target in loader:
            optimizer.zero_grad()
            preds = model(data)
            loss = criterion(preds,target)
            loss.backward()
            optimizer.step()

    # Evaluate accuracy
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data,target in test_loader:
            preds = model(data)
            predicted = preds.argmax(dim=1)  # class with the highest logit
            correct += (predicted == target).sum().item()
            total += target.size(0)

    accuracy = correct/total
    return accuracy

🚀 4. Run Optuna Study

In [None]:
# Create a study that maximizes the objective's return value (accuracy), then run 20 trials
study = optuna.create_study(direction='maximize')
study.optimize(objective,n_trials=20)


[I 2025-12-13 20:15:20,450] A new study created in memory with name: no-name-ec633a4e-1815-4888-af77-f531edfb0776
[I 2025-12-13 20:15:21,111] Trial 0 finished with value: 0.576 and parameters: {'hidden_dim': 148, 'dropout_rate': 0.4964464565881205, 'learning_rate': 0.00020050410940550896, 'batch_size': 128, 'optimizer': 'Adam'}. Best is trial 0 with value: 0.576.
[I 2025-12-13 20:15:21,752] Trial 1 finished with value: 0.518 and parameters: {'hidden_dim': 151, 'dropout_rate': 0.4928859208300117, 'learning_rate': 1.2561779356883488e-05, 'batch_size': 64, 'optimizer': 'Adam'}. Best is trial 0 with value: 0.576.
[I 2025-12-13 20:15:22,238] Trial 2 finished with value: 0.568 and parameters: {'hidden_dim': 255, 'dropout_rate': 0.2318087242976828, 'learning_rate': 0.012738910186494258, 'batch_size': 32, 'optimizer': 'SGD'}. Best is trial 0 with value: 0.576.
[I 2025-12-13 20:15:22,737] Trial 3 finished with value: 0.506 and parameters: {'hidden_dim': 199, 'dropout_rate': 0.17784263391728494, ... (log truncated)

🏆 5. Print Best Hyperparameters

In [None]:
# Show the best hyperparameters found and the accuracy they achieved
print("Best Hyperparameters:", study.best_params)
for idx,(key, value) in enumerate(study.best_params.items()):
    print(f"\t{idx+1}- {key}: {value}")
print("Best Accuracy:",study.best_value)

Best Hyperparameters: {'hidden_dim': 193, 'dropout_rate': 0.10529330995375707, 'learning_rate': 0.0026656472362355187, 'batch_size': 128, 'optimizer': 'Adam'}
	1- hidden_dim: 193
	2- dropout_rate: 0.10529330995375707
	3- learning_rate: 0.0026656472362355187
	4- batch_size: 128
	5- optimizer: Adam
Best Accuracy: 0.762
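
A natural next step is to train a final model with the best configuration. The sketch below is an assumption about how that might look. The values in `best_params` are copied from the output above so the block runs on its own, whereas in the notebook you would read them from `study.best_params`. The data is freshly generated dummy data of the same shape.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Values copied from the printed output; in the notebook use study.best_params instead
best_params = {
    "hidden_dim": 193,
    "dropout_rate": 0.10529330995375707,
    "learning_rate": 0.0026656472362355187,
    "batch_size": 128,
    "optimizer": "Adam",
}

class ANN(nn.Module):  # same architecture as in step 2
    def __init__(self, input_dim, hidden_dim, output_dim, dropout_rate):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, x):
        return self.net(x)

# Dummy data with the same shape as the example dataset
features = torch.randn(1000, 20)
labels = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(features, labels),
                    batch_size=best_params["batch_size"], shuffle=True)

final_model = ANN(20, best_params["hidden_dim"], 2, best_params["dropout_rate"])
criterion = nn.CrossEntropyLoss()
# Look up the optimizer class by its name, as the objective function does
optimizer = getattr(optim, best_params["optimizer"])(
    final_model.parameters(), lr=best_params["learning_rate"]
)

final_model.train()
for epoch in range(10):
    for data, target in loader:
        optimizer.zero_grad()
        loss = criterion(final_model(data), target)
        loss.backward()
        optimizer.step()

print("loss on the last training batch:", loss.item())
```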
