<a href="https://colab.research.google.com/github/maleehahassan/HIDA_Into_to_DL/blob/main/07_hyperparam_search.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Essential libraries for deep learning and data handling
import time  # For measuring execution time
import random  # For random parameter sampling
import numpy as np  # For numerical operations
import tensorflow as tf  # Main deep learning framework
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split  # For dataset splitting
from itertools import product  # For generating parameter combinations

# Optuna is optional - used for advanced hyperparameter optimization
try:
    import optuna
    OPTUNA_AVAILABLE = True
except ImportError:
    OPTUNA_AVAILABLE = False

# Fix random seeds for reproducibility
# This makes runs repeatable in most cases; some GPU operations can still be nondeterministic
seed = 42
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)


## 2. Data Preparation

Here we prepare the CIFAR-10 dataset:
- Load the full dataset (50,000 training + 10,000 test images)
- Draw a smaller subset (2,000 training + 500 test images) for faster experimentation
- Normalize pixel values to [0, 1] and use stratified splits so every class keeps its share

In [None]:
# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

# Create smaller subset for faster experimentation
train_subset = 2000  # Instead of 50,000
test_subset = 500    # Instead of 10,000
x = np.concatenate([x_train, x_test])
y = np.concatenate([y_train, y_test]).flatten()

# Normalize pixel values to [0,1] range
x = x.astype("float32") / 255.0

# Draw a stratified subset from the pooled data, then split it into train and test sets
x_small, _, y_small, _ = train_test_split(
    x, y,
    train_size=train_subset + test_subset,
    stratify=y,  # Preserve the original class proportions
    random_state=seed
)
x_train_small, x_test_small, y_train_small, y_test_small = train_test_split(
    x_small, y_small,
    train_size=train_subset,
    stratify=y_small,
    random_state=seed
)
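
To make the effect of `stratify` concrete, here is a minimal standard-library sketch of stratified sampling. It is not how scikit-learn implements it; `stratified_sample` is a hypothetical helper, and the labels are synthetic stand-ins for the 60,000 pooled CIFAR-10 labels.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(labels, n_samples, seed=42):
    """Return indices of a sample whose class proportions match `labels`.

    Hypothetical helper for illustration only. Per-class counts are rounded,
    so the total can differ slightly from n_samples when classes are uneven.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label, indices in sorted(by_class.items()):
        # Each class contributes its proportional share of the sample
        k = round(n_samples * len(indices) / len(labels))
        chosen.extend(rng.sample(indices, k))
    return chosen

# Synthetic labels: 10 perfectly balanced classes, 60,000 items in total
labels = [i % 10 for i in range(60000)]
idx = stratified_sample(labels, 2500)
counts = Counter(labels[i] for i in idx)
print(sorted(counts.items()))
```

With balanced classes, every class contributes 250 of the 2,500 sampled items. A plain random sample would only match these proportions approximately.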

## 3. Model Definition

Define a simple CNN architecture that we'll optimize:
- 2 convolutional layers with max pooling
- Flatten layer to connect to dense layers
- Dense layer with dropout for regularization
- Output layer for 10-class classification

The model accepts several hyperparameters that we'll optimize:
- conv_filters: Number of filters in the first conv layer (doubled in the second)
- kernel_size: Size of conv filters
- dense_units: Number of neurons in dense layer
- dropout_rate: Dropout probability for regularization
- learning_rate: Learning rate for Adam optimizer

In [None]:
num_classes = 10  # CIFAR-10 has 10 classes
input_shape = x_train_small.shape[1:]  # (32, 32, 3) for CIFAR-10

def build_model(conv_filters=16, kernel_size=3, dense_units=64, dropout_rate=0.3, learning_rate=1e-3):
    """
    Build and compile a CNN model with given hyperparameters.

    Args:
        conv_filters (int): Number of filters in first conv layer (doubled in second)
        kernel_size (int): Size of convolutional filters
        dense_units (int): Number of neurons in dense layer
        dropout_rate (float): Dropout probability for regularization
        learning_rate (float): Learning rate for Adam optimizer

    Returns:
        Compiled Keras model
    """
    model = keras.Sequential()
    # Input layer
    model.add(layers.Input(shape=input_shape))
    # First conv block
    model.add(layers.Conv2D(conv_filters, kernel_size, activation='relu', padding='same'))
    model.add(layers.MaxPooling2D())
    # Second conv block
    model.add(layers.Conv2D(conv_filters*2, kernel_size, activation='relu', padding='same'))
    model.add(layers.MaxPooling2D())
    # Flatten and dense layers
    model.add(layers.Flatten())
    model.add(layers.Dense(dense_units, activation='relu'))
    model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(num_classes, activation='softmax'))

    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    return model

def evaluate_model(model, x_test, y_test):
    """Evaluate model accuracy on test data"""
    loss, acc = model.evaluate(x_test, y_test, verbose=0)
    return acc
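
To see how the hyperparameters drive model size, the trainable parameter count of this architecture can be worked out by hand. The sketch below assumes the layer layout of `build_model` (two `'same'`-padded conv layers, each followed by 2x2 max pooling, then two dense layers); `count_params` is a hypothetical helper, not part of Keras.

```python
def count_params(conv_filters=16, kernel_size=3, dense_units=64,
                 num_classes=10, input_shape=(32, 32, 3)):
    """Count trainable parameters for the two-conv-block CNN (illustrative)."""
    height, width, channels = input_shape

    # Conv2D: one kernel_size x kernel_size x in_channels kernel per filter, plus a bias
    conv1 = kernel_size * kernel_size * channels * conv_filters + conv_filters
    conv2 = (kernel_size * kernel_size * conv_filters * (conv_filters * 2)
             + conv_filters * 2)

    # 'same' padding keeps the spatial size; each 2x2 max pool halves it
    height, width = height // 2 // 2, width // 2 // 2
    flat = height * width * conv_filters * 2

    # Dense: weights plus one bias per unit (Dropout has no parameters)
    dense = flat * dense_units + dense_units
    output = dense_units * num_classes + num_classes
    return conv1 + conv2 + dense + output

print(count_params())                               # defaults: 136874
print(count_params(conv_filters=8, dense_units=32)) # smallest grid setting: 34522
```

Most of the parameters sit in the first dense layer, so `dense_units` and `conv_filters` (through the flattened size) dominate the model size. Calling `model.summary()` on a built model is a way to cross-check these figures.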


## 4. Grid Search Implementation

Grid Search:
- Most basic approach to hyperparameter optimization
- Tests every possible combination of the given parameter values
- Guaranteed to find the best combination on the grid (as measured on the evaluation data)
- Computationally expensive: the number of runs grows exponentially with the number of parameters
- May miss better values between grid points

In [None]:
# Training settings
DEFAULT_EPOCHS = 3  # Keep small for demonstration
DEFAULT_BATCH = 64

# Define parameter grid
param_grid = {
    'conv_filters': [8, 16],      # Number of conv filters
    'dense_units': [32, 64],      # Neurons in dense layer
    'dropout_rate': [0.2, 0.4],   # Dropout probabilities
}

print('\nStarting simple manual grid search...')
start = time.time()
best_acc = 0.0
best_params = None

# Try every possible combination of parameters
for conv_filters, dense_units, dropout_rate in product(
    param_grid['conv_filters'],
    param_grid['dense_units'],
    param_grid['dropout_rate']
):
    # Create parameter dictionary for current combination
    params = dict(conv_filters=conv_filters, dense_units=dense_units, dropout_rate=dropout_rate)
    print('Testing params:', params)

    # Build and train model with current parameters
    model = build_model(**params)
    model.fit(x_train_small, y_train_small, epochs=DEFAULT_EPOCHS, batch_size=DEFAULT_BATCH, verbose=0)

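    # Evaluate and update best if necessary
    # Note: the test subset doubles as the validation set here, so the best accuracy is an optimistic estimate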
    acc = evaluate_model(model, x_test_small, y_test_small)
    print(' Accuracy:', acc)
    if acc > best_acc:
        best_acc = acc
        best_params = params

print('Grid search done. Best accuracy: {:.4f} with {}'.format(best_acc, best_params))
print('Time:', time.time() - start)
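
The loop above names each parameter explicitly. As a sketch of a more general pattern, the hypothetical helper `iter_grid` below expands any parameter dictionary into its combinations, which also makes the cost of adding a parameter easy to see. The extra `learning_rate` values are an assumption for illustration.

```python
from itertools import product

def iter_grid(param_grid):
    """Yield one dict per combination of values in param_grid (illustrative helper)."""
    keys = list(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))

param_grid = {
    'conv_filters': [8, 16],
    'dense_units': [32, 64],
    'dropout_rate': [0.2, 0.4],
}

combos = list(iter_grid(param_grid))
print(len(combos))   # 2 * 2 * 2 = 8 training runs
print(combos[0])

# Adding one more parameter with three values triples the number of runs
bigger_grid = dict(param_grid, learning_rate=[1e-3, 5e-4, 1e-4])
print(len(list(iter_grid(bigger_grid))))  # 8 * 3 = 24 training runs
```

Each yielded dictionary can be passed straight to `build_model(**params)`, so the search loop does not need to change when the grid does.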


## 5. Random Search Implementation

Random Search:
- Randomly samples parameter combinations
- Often more efficient than grid search in high-dimensional spaces, where only a few parameters matter
- Can explore larger parameter ranges with a fixed budget of trials
- May find better solutions by testing diverse combinations
- Doesn't guarantee finding the best combination in the search space

In [None]:
print('\nStarting random search...')
start = time.time()
best_acc_rs = 0.0
best_params_rs = None

# Define broader parameter distributions
param_distributions = {
    'conv_filters': [8, 16, 24],          # More options than grid search
    'dense_units': [32, 64, 128],         # Wider range
    'dropout_rate': [0.2, 0.3, 0.4],      # More granular
    'learning_rate': [1e-3, 5e-4, 1e-4]   # Additional parameter
}

# Try random combinations
n_iter = 4  # Number of random combinations to try
for i in range(n_iter):
    # Randomly sample one value from each parameter distribution
    params = {k: random.choice(v) for k, v in param_distributions.items()}
    print('Iter', i+1, 'params:', params)

    # Build and train model with sampled parameters
    model = build_model(**params)
    model.fit(x_train_small, y_train_small, epochs=DEFAULT_EPOCHS, batch_size=DEFAULT_BATCH, verbose=0)

    # Evaluate and update best if necessary
    acc = evaluate_model(model, x_test_small, y_test_small)
    print(' Accuracy:', acc)
    if acc > best_acc_rs:
        best_acc_rs = acc
        best_params_rs = params

print('Random search done. Best accuracy: {:.4f} with {}'.format(best_acc_rs, best_params_rs))
print('Time:', time.time() - start)
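
The random search above draws from fixed lists. Random search can also draw from continuous ranges, which a grid cannot cover. The sketch below assumes ranges that match the endpoints of the lists above; `sample_params` is a hypothetical helper, and the learning rate is sampled uniformly in log space so that each order of magnitude is equally likely.

```python
import random

def sample_params(rng):
    """Draw one hyperparameter setting (illustrative helper, assumed ranges)."""
    return {
        'conv_filters': rng.choice([8, 16, 24]),    # discrete choice
        'dense_units': rng.choice([32, 64, 128]),   # discrete choice
        'dropout_rate': rng.uniform(0.2, 0.4),      # continuous, uniform
        'learning_rate': 10 ** rng.uniform(-4, -3), # continuous, log-uniform in [1e-4, 1e-3]
    }

rng = random.Random(42)  # a dedicated generator keeps the draws repeatable
samples = [sample_params(rng) for _ in range(4)]
for s in samples:
    print(s)
```

Each dictionary has the same keys as `param_distributions`, so it can be passed to `build_model(**params)` unchanged.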

## 6. Optuna Implementation

Optuna:
- Advanced hyperparameter optimization framework
- Supports various sampling strategies (TPE by default, plus random and others)
- Can adapt the search based on previous results
- Allows continuous parameter ranges
- Often needs fewer trials than grid or random search once enough trials have been run
- Provides visualization and analysis tools


In [None]:
if OPTUNA_AVAILABLE:
    print('\nStarting Optuna optimization...')
    start = time.time()

    def objective(trial):
        """
        Optuna objective function that:
        1. Suggests parameter values using various sampling strategies
        2. Builds and trains model with these parameters
        3. Returns accuracy for Optuna to optimize
        """
        # Define parameter space with different sampling strategies
        conv = trial.suggest_categorical('conv_filters', [8, 16, 24])
        dense = trial.suggest_categorical('dense_units', [32, 64, 128])
        drop = trial.suggest_float('dropout_rate', 0.1, 0.5)  # Continuous range
        # suggest_loguniform is deprecated; suggest_float with log=True is the replacement
        lr = trial.suggest_float('learning_rate', 1e-4, 1e-2, log=True)  # Log scale

        # Build and train model
        model = build_model(
            conv_filters=conv,
            dense_units=dense,
            dropout_rate=drop,
            learning_rate=lr
        )
        model.fit(x_train_small, y_train_small, epochs=DEFAULT_EPOCHS, batch_size=DEFAULT_BATCH, verbose=0)

        # Return accuracy for optimization
        acc = evaluate_model(model, x_test_small, y_test_small)
        return acc

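    # Create and run Optuna study (seed the sampler so the trials are repeatable)
    sampler = optuna.samplers.TPESampler(seed=seed)
    study = optuna.create_study(direction='maximize', sampler=sampler)  # We want to maximize accuracy
    # Note: TPE samples randomly during its startup trials (10 by default),
    # so with only 4 trials this run behaves like random search
    study.optimize(objective, n_trials=4)  # Run 4 trials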

    # Report results
    print('Optuna best value:', study.best_value)
    print('Optuna best params:', study.best_params)
    print('Time:', time.time() - start)
else:
    print('\nOptuna not installed. Skipping Optuna section. To run Optuna, install it with:\n    pip install optuna')

print('\nAll searches completed.')
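
To try the Optuna API without training a network, the sketch below optimizes a cheap stand-in objective. `toy_objective` and its scoring formula are assumptions made up for illustration (the score peaks at a dropout rate of 0.3 and a learning rate of 1e-3); only the Optuna calls mirror the cell above. It uses 20 trials so that the sampler gets past its random startup phase.

```python
import math

# Optuna is optional here as well; the sketch is skipped if it is missing
try:
    import optuna
    OPTUNA_AVAILABLE = True
except ImportError:
    OPTUNA_AVAILABLE = False

def toy_objective(trial):
    """Hypothetical stand-in for 'train a model and return its accuracy'."""
    drop = trial.suggest_float('dropout_rate', 0.1, 0.5)
    lr = trial.suggest_float('learning_rate', 1e-4, 1e-2, log=True)
    # Made-up score: 1.0 at dropout 0.3 and learning rate 1e-3, lower elsewhere
    return 1.0 - (drop - 0.3) ** 2 - 0.1 * (math.log10(lr) + 3) ** 2

best = None
if OPTUNA_AVAILABLE:
    optuna.logging.set_verbosity(optuna.logging.WARNING)  # silence per-trial logs
    sampler = optuna.samplers.TPESampler(seed=42)
    study = optuna.create_study(direction='maximize', sampler=sampler)
    study.optimize(toy_objective, n_trials=20)
    best = (study.best_value, study.best_params)
    print('Best value:', best[0])
    print('Best params:', best[1])
else:
    print('Optuna not installed. Skipping the toy example.')
```

Because the toy objective runs in microseconds, it is a convenient way to experiment with the number of trials or the sampler before spending compute on real training runs.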
