<a href="https://colab.research.google.com/github/Bosy-Ayman/DSAI-403-Nature-Inspired-Computation/blob/main/Lab2_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 2: Hyperparameter Tuning with Hill Climbing (NLP Project)

---

## Learning Objectives
By the end of this lab, you will be able to:
1. Understand how **Hill Climbing** search can be applied for hyperparameter tuning in NLP models.  
2. Explore the trade-offs between **greedy local optimization** and **global exploration**.  
3. Implement, evaluate, and analyze a **classification model on text data**.  
4. Compare **basic Hill Climbing** with **improved Hill Climbing (First-Ascent with Restarts)**.  

---

## Task Overview
In this lab, you will work with a **text classification dataset** (e.g., IMDB reviews, News classification, or SMS spam dataset).  
Your goal is to tune the hyperparameters of a machine learning model using **Hill Climbing**.

---

## Dataset
- Choose an **NLP dataset** (sentiment analysis or text classification). Examples:  
  - IMDB Movie Reviews (binary sentiment)  
  - SMS Spam Collection (spam vs. ham)  
  - News Category Dataset  

---

## Lab Tasks

### Step 1: Data Preparation
- Load the dataset.  
- Perform preprocessing:  
  - Tokenization  
  - Stopword removal  
  - TF-IDF or CountVectorizer transformation  
- Split the dataset into **training**, **validation**, and **test** sets (the test set is held out until Step 6).  
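
The preprocessing steps above can be sketched in plain Python before switching to a library vectorizer. The toy corpus, the stopword list, and the helper names `tokenize` and `split_train_val` are illustrative assumptions, not part of the lab's required code. In practice, scikit-learn's `TfidfVectorizer` or `CountVectorizer` would handle tokenization, stopword removal, and weighting in one step.

```python
import random
import re

# Hypothetical toy corpus standing in for a real dataset: (text, label) pairs.
corpus = [
    ("Win a free prize now", "spam"),
    ("Are we still meeting for lunch", "ham"),
    ("Free entry in a weekly competition", "spam"),
    ("I will call you after the lecture", "ham"),
    ("Claim your free reward today", "spam"),
    ("See you at the lab tomorrow", "ham"),
]

# Illustrative stopword list; real projects usually use a much larger one.
STOPWORDS = {"a", "the", "in", "for", "we", "you", "i", "at", "are", "your", "will"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def split_train_val(rows, val_fraction=0.2, seed=123):
    """Shuffle a copy of the rows and cut off a validation slice."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = max(1, int(len(rows) * val_fraction))
    return rows[n_val:], rows[:n_val]


processed = [(tokenize(text), label) for text, label in corpus]
train_rows, val_rows = split_train_val(processed)

print(processed[0])
print(len(train_rows), len(val_rows))
```

A fixed `seed` keeps the split reproducible, which matters later because every hyperparameter configuration should be scored on the same validation rows.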

---

### Step 2: Define the Model
- Select a baseline ML model (e.g., **Logistic Regression**, **Naive Bayes**, or **SVM**).  
- Identify hyperparameters to tune. Examples:  
  - Learning rate (applies only to models trained with gradient descent, such as an SGD-based classifier)  
  - Regularization strength (`C` for Logistic Regression, `alpha` for Naive Bayes, etc.)  
  - Maximum features for TF-IDF  
  - n-gram range  
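
One way to make these hyperparameters tunable is to build the whole model from a single parameter dictionary. The sketch below assumes scikit-learn is installed and uses a TF-IDF plus Logistic Regression pipeline; the function name `build_text_classifier`, the dictionary keys, and the toy sentences are hypothetical. It exposes only `C`, `max_features`, and `ngram_range`, since scikit-learn's `LogisticRegression` has no learning-rate parameter.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_text_classifier(params):
    """Build an untrained TF-IDF + Logistic Regression pipeline from a params dict."""
    vectorizer = TfidfVectorizer(
        max_features=params["max_features"],
        ngram_range=params["ngram_range"],
        stop_words="english",
    )
    classifier = LogisticRegression(C=params["C"], max_iter=1000)
    return make_pipeline(vectorizer, classifier)


# Hypothetical toy data; a real run would use the splits from Step 1.
train_texts = [
    "win a free prize now",
    "free entry to win cash",
    "claim your free reward today",
    "are we meeting for lunch",
    "see you at the lab tomorrow",
    "call me after the lecture",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
val_texts = ["free cash prize", "lunch at the lab"]
val_labels = ["spam", "ham"]

params = {"C": 1.0, "max_features": 1000, "ngram_range": (1, 2)}
model = build_text_classifier(params)
model.fit(train_texts, train_labels)
val_accuracy = model.score(val_texts, val_labels)
print(f"Validation accuracy: {val_accuracy:.2f}")
```

Keeping the vectorizer inside the pipeline means the vocabulary is fitted on the training texts only, so the validation score is not inflated by leakage.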

---

### Step 3: Define Search Space
- Clearly specify a **discrete search space** for each hyperparameter. Example:  
  - Learning rate: [0.001, 0.01, 0.1, 1]  
  - Regularization (C): [0.1, 1, 10]  
  - Max features: [1000, 5000, 10000]  
  - N-gram range: [(1,1), (1,2), (1,3)]  
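
As a rough illustration, the example space above can be written as a dictionary of lists, which makes its size easy to check. The dictionary keys and variable names are assumptions chosen for this sketch.

```python
import math
import random

# The example search space from the list above, one list of options per hyperparameter.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1, 1],
    "C": [0.1, 1, 10],
    "max_features": [1000, 5000, 10000],
    "ngram_range": [(1, 1), (1, 2), (1, 3)],
}

# Number of configurations an exhaustive grid search would have to evaluate.
total_configs = math.prod(len(values) for values in search_space.values())

# Number of neighbors of any configuration when exactly one hyperparameter changes.
neighbors_per_config = sum(len(values) - 1 for values in search_space.values())

# A random starting point for a local search.
rng = random.Random(123)
start = {name: rng.choice(values) for name, values in search_space.items()}

print(total_configs)         # 4 * 3 * 3 * 3 = 108
print(neighbors_per_config)  # 3 + 2 + 2 + 2 = 9
print(start)
```

The gap between 108 grid points and 9 neighbors per step is what makes a local search attractive when each evaluation means training a model.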

---

### Step 4: Implement Hill Climbing
- Start with **randomly selected hyperparameters**.  
- Generate **neighbors** by changing one hyperparameter at a time.  
- Evaluate model performance (validation accuracy) for each neighbor.  
- Move to the best-scoring neighbor if it improves on the current configuration.  
- Stop when no neighbor improves performance.  
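
The loop described above can be sketched as a steepest-ascent search. This is a minimal illustration under stated assumptions: `toy_score` is a made-up stand-in for validation accuracy (a real run would train and score a model there), and the names `neighbors` and `hill_climb` are hypothetical.

```python
import random

search_space = {
    "C": [0.1, 1, 10],
    "max_features": [1000, 5000, 10000],
    "ngram_range": [(1, 1), (1, 2), (1, 3)],
}


def toy_score(config):
    """Made-up stand-in for validation accuracy; not a real measurement."""
    score = 0.70
    score += {0.1: 0.00, 1: 0.05, 10: 0.02}[config["C"]]
    score += {1000: 0.00, 5000: 0.04, 10000: 0.03}[config["max_features"]]
    score += {(1, 1): 0.00, (1, 2): 0.03, (1, 3): 0.01}[config["ngram_range"]]
    return score


def neighbors(config, space):
    """Return every configuration that differs from `config` in exactly one hyperparameter."""
    return [
        {**config, name: value}
        for name, values in space.items()
        for value in values
        if value != config[name]
    ]


def hill_climb(space, score_fn, rng):
    """Steepest ascent: evaluate all neighbors, move to the best one while it improves."""
    current = {name: rng.choice(values) for name, values in space.items()}
    current_score = score_fn(current)
    trace = [current_score]
    while True:
        scored = [(score_fn(n), n) for n in neighbors(current, space)]
        best_score, best_neighbor = max(scored, key=lambda pair: pair[0])
        if best_score <= current_score:
            return current, current_score, trace  # no neighbor improves: local optimum
        current, current_score = best_neighbor, best_score
        trace.append(current_score)


best, best_score, trace = hill_climb(search_space, toy_score, random.Random(123))
print(best, round(best_score, 2), len(trace))
```

Because this toy objective adds up independent contributions from each hyperparameter, it has a single optimum and the search always reaches it. Real validation accuracy usually has interactions between hyperparameters, which is where local optima come from.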

---

### Step 5: Improved Hill Climbing (First-Ascent with Restarts)
- Implement **First-Ascent strategy**: accept the first better neighbor you find.  
- Add **random restarts** to escape local optima.  
- Keep track of the **best overall solution** across restarts.  
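
A possible shape for this variant is sketched below. The score table `TOY_SCORES` is invented so that the space contains two local optima (0.80 and 0.88); the function names are hypothetical, and a real run would replace `toy_score` with model training and validation.

```python
import random

search_space = {
    "C": [0.1, 1, 10],
    "ngram_range": [(1, 1), (1, 2), (1, 3)],
}

# Invented validation accuracies with two local optima: 0.80 and 0.88.
TOY_SCORES = {
    (0.1, (1, 1)): 0.80, (0.1, (1, 2)): 0.70, (0.1, (1, 3)): 0.72,
    (1, (1, 1)): 0.71, (1, (1, 2)): 0.74, (1, (1, 3)): 0.78,
    (10, (1, 1)): 0.73, (10, (1, 2)): 0.76, (10, (1, 3)): 0.88,
}


def toy_score(config):
    return TOY_SCORES[(config["C"], config["ngram_range"])]


def neighbors(config, space):
    """Return every configuration that differs from `config` in exactly one hyperparameter."""
    return [
        {**config, name: value}
        for name, values in space.items()
        for value in values
        if value != config[name]
    ]


def first_ascent(space, score_fn, rng):
    """Accept the first improving neighbor (in random order) until none improves."""
    current = {name: rng.choice(values) for name, values in space.items()}
    current_score = score_fn(current)
    improved = True
    while improved:
        improved = False
        candidates = neighbors(current, space)
        rng.shuffle(candidates)
        for candidate in candidates:
            candidate_score = score_fn(candidate)
            if candidate_score > current_score:
                current, current_score = candidate, candidate_score
                improved = True
                break  # first ascent: do not look at the remaining neighbors
    return current, current_score


def first_ascent_with_restarts(space, score_fn, restarts=5, seed=123):
    """Run first-ascent from several random starts and keep the best result overall."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(restarts):
        config, score = first_ascent(space, score_fn, rng)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = first_ascent_with_restarts(search_space, toy_score)
print(best_config, best_score)
```

A single run that starts at `C=0.1` with unigrams stops immediately at 0.80, because neither single-parameter change improves it. Restarts give the search further chances to begin inside the basin of the 0.88 optimum.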

---

### Step 6: Model Training and Evaluation
- Compile and train the **final best model** with tuned hyperparameters.  
- Evaluate on the **test set**.  
- Report:  
  - Best hyperparameters found  
  - Training vs. validation accuracy  
  - Test accuracy  

---

### Step 7: Visualization
- Plot the **accuracy progression over iterations** of Hill Climbing.  
- Show how hyperparameters evolved during search.  
- Compare **baseline vs. tuned model** performance.  
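
For the progression plot, it often helps to separate the score of each evaluated configuration from the best score seen so far. The sketch below uses invented accuracy values purely to show the transformation.

```python
from itertools import accumulate

# Invented validation accuracies, one per evaluated configuration, in evaluation order.
evaluated_scores = [0.71, 0.74, 0.73, 0.78, 0.76, 0.88, 0.80]

# Running maximum: the best accuracy found up to and including each evaluation.
best_so_far = list(accumulate(evaluated_scores, max))

print(best_so_far)  # [0.71, 0.74, 0.74, 0.78, 0.78, 0.88, 0.88]
```

Both lists can be passed to `plt.plot` against the evaluation index: the raw scores show how noisy individual trials are, while the running maximum shows when the search actually made progress.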

---

## Deliverables
At the end of this lab, you must submit:
1. Preprocessed dataset description (with shapes and samples).  
2. Implementation of **basic and improved Hill Climbing**.  
3. Visualization of the tuning process (accuracy vs. iterations).  
4. Final report with:  
   - Best hyperparameters found  
   - Validation and test accuracy  
   - Observations on differences between basic and improved Hill Climbing  

---

## Reflection Questions
- Why does Hill Climbing sometimes get stuck in local optima?  
- How do **restarts** help in escaping local optima?  
- Compare **Hill Climbing** with **Grid Search** and **Random Search** in terms of efficiency and results.  
- What would happen if the search space is **continuous** instead of discrete?  
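
As a starting point for the last question, the sketch below shows one common adaptation to a continuous space: since neighbors can no longer be enumerated, each step samples a random perturbation instead. The objective `toy_score` is an invented smooth function of `log10(C)`, and the step size and iteration count are arbitrary assumptions.

```python
import random


def toy_score(log_c):
    """Invented smooth stand-in for validation accuracy, peaking at log10(C) = 0.5."""
    return 0.9 - 0.05 * (log_c - 0.5) ** 2


def continuous_hill_climb(score_fn, start, step=0.3, iterations=200, seed=123):
    """Propose a Gaussian perturbation each iteration and keep it only if it improves."""
    rng = random.Random(seed)
    current, current_score = start, score_fn(start)
    for _ in range(iterations):
        candidate = current + rng.gauss(0.0, step)
        candidate_score = score_fn(candidate)
        if candidate_score > current_score:
            current, current_score = candidate, candidate_score
    return current, current_score


best_log_c, best_score = continuous_hill_climb(toy_score, start=-2.0)
print(f"best C ~ {10 ** best_log_c:.3f}, score ~ {best_score:.4f}")
```

The step size now plays the role that the neighborhood definition played in the discrete case: too small and progress is slow, too large and most proposals are rejected.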

---


In [None]:
import random
import copy
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization, Activation, Rescaling
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.utils import to_categorical


In [None]:
seed = 123
tf.random.set_seed(seed)
np.random.seed(seed)
random.seed(seed)

gpus = tf.config.experimental.list_physical_devices('GPU')
# Start from the default strategy so `strategy` is always defined,
# even if the GPU configuration below raises an error
strategy = tf.distribute.get_strategy()
if gpus:
    try:
        # Set memory growth for each GPU
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        # Use only the first GPU
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
        strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0")
        print('\nGPU Found! Using GPU...')
    except RuntimeError as e:
        # Device settings cannot be changed once the runtime is initialized
        print(e)
else:
    print('Number of replicas:', strategy.num_replicas_in_sync)



In [None]:


# =======================================================
# Step 1: Load and Prepare CIFAR-10 Dataset
# =======================================================
(X_train_full, y_train_full), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Split training data into training and validation sets
# Use 10% of the training data as validation set
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.1, random_state=seed, stratify=y_train_full
)

# Convert labels to categorical (one-hot encoding)
y_train_cat = to_categorical(y_train, num_classes=10)
y_val_cat = to_categorical(y_val, num_classes=10)
y_test_cat = to_categorical(y_test, num_classes=10)

# Rescale pixel values (0-255 to 0-1)
X_train = X_train.astype('float32') / 255.0
X_val = X_val.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Prepare TensorFlow Datasets (for Keras compatibility, similar to original code structure)
# Note: CIFAR-10 images are 32x32x3
BATCH_SIZE = 32 # Temporary batch size, the optimal one will be tuned later

# Cache before shuffling; caching after shuffle() would freeze the first epoch's order
train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train_cat)).cache().shuffle(10000).batch(BATCH_SIZE).prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val_cat)).batch(BATCH_SIZE).cache().prefetch(buffer_size=tf.data.AUTOTUNE)
test_ds = tf.data.Dataset.from_tensor_slices((X_test, y_test_cat)).batch(BATCH_SIZE).cache().prefetch(buffer_size=tf.data.AUTOTUNE)

# --- Data Augmentation Pipeline (Adapted for CIFAR-10 32x32 size) ---
augmentation = tf.keras.Sequential(
    [
        tf.keras.layers.RandomFlip("horizontal"),
        tf.keras.layers.RandomRotation(
        factor = (-.1, .1), # Smaller rotation for smaller images
        fill_mode = 'reflect',
        interpolation = 'bilinear',
        seed = seed),
        tf.keras.layers.RandomContrast(
        factor = (.2),
        seed = seed)
    ], name="data_augmentation"
)
augmentation.build((None, 32, 32, 3))


In [None]:
search_space = {
    "learning_rate": [1e-4, 1e-3, 5e-3],
    "dropout_rate": [0.3, 0.5],
    "filters": [32, 64],
    "batch_size": [32, 64] # Updated options
}

In [None]:
def build_model(params):
    with strategy.scope():
        model = Sequential()
        model.add(tf.keras.Input(shape=(32, 32, 3))) # Explicit input shape for 32x32x3
        model.add(augmentation)

        # First Conv Block
        model.add(Conv2D(params["filters"], (3,3), padding="same"))
        model.add(Activation('relu'))
        model.add(BatchNormalization())
        model.add(MaxPooling2D(pool_size=(2,2), padding="same"))
        model.add(Dropout(params["dropout_rate"]))

        # Second Conv Block (fixed 64 filters)
        model.add(Conv2D(64, (3,3), padding="same"))
        model.add(Activation('relu'))
        model.add(BatchNormalization())
        model.add(MaxPooling2D(pool_size=(2,2), padding="same"))
        model.add(Dropout(params["dropout_rate"]))

        # Third Conv Block (fixed 128 filters)
        model.add(Conv2D(128, (3,3), padding="same"))
        model.add(Activation('relu'))
        model.add(BatchNormalization())
        model.add(MaxPooling2D(pool_size=(2,2), padding="same"))
        model.add(Dropout(params["dropout_rate"]))


        model.add(Flatten())
        model.add(Dense(512, activation="relu"))
        model.add(Dropout(0.5))
        model.add(Dense(10, activation="softmax")) # 10 classes for CIFAR-10

        optimizer = tf.keras.optimizers.RMSprop(params["learning_rate"])
        model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
    return model

In [None]:

# For hyperparameter tuning, each configuration is scored on the validation split (X_val / y_val_cat)
def evaluate_model(params):
    model = build_model(params)
    # Dynamically change dataset batch size for training based on params
    # Cache before shuffling so the training order is reshuffled every epoch
    train_ds_tuned = tf.data.Dataset.from_tensor_slices((X_train, y_train_cat)).cache().shuffle(10000).batch(params["batch_size"]).prefetch(buffer_size=tf.data.AUTOTUNE)
    val_ds_tuned = tf.data.Dataset.from_tensor_slices((X_val, y_val_cat)).batch(params["batch_size"]).cache().prefetch(buffer_size=tf.data.AUTOTUNE)

    history = model.fit(
        train_ds_tuned,
        validation_data=val_ds_tuned,
        epochs=3,  # train only a few epochs for speed
        verbose=0
    )
    val_acc = max(history.history.get("val_accuracy", [0]))
    # Free up memory
    del model
    tf.keras.backend.clear_session()
    return val_acc


In [None]:

def hill_climb(max_iter=5):
    current = {k: random.choice(v) for k, v in search_space.items()}
    current_score = evaluate_model(current)
    print(f"Initial Params: {current} | Accuracy: {current_score:.4f}")

    history = [(0, current_score)]

    for i in range(1, max_iter + 1):
        # Generate neighbor by changing exactly one random parameter
        neighbor = copy.deepcopy(current)
        key = random.choice(list(search_space.keys()))

        # Ensure the new value is actually different from the current one
        available_values = [v for v in search_space[key] if v != current[key]]
        if not available_values:  # Only one choice for this parameter, so skip this iteration
            history.append((i, current_score))
            continue

        neighbor[key] = random.choice(available_values)
        score = evaluate_model(neighbor)

        print(f"Iteration {i}: Tried {neighbor} -> Acc={score:.4f}")

        if score > current_score:
            print("  Improvement found! Updating current solution.")
            current, current_score = neighbor, score

        history.append((i, current_score))

    return current, current_score, history


In [None]:

def hill_climb_with_restarts(max_iter=5, restarts=3):
    best_overall = None
    best_score = float("-inf")  # ensures the first restart is always recorded
    all_history = []

    for r in range(restarts):
        print(f"\n=== Restart {r+1} ===")
        current, score, history = hill_climb(max_iter)
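        # Each run records max_iter + 1 points (iterations 0..max_iter), so offset by that
        # amount to keep x-values from different restarts from overlapping
        all_history.extend([(r * (max_iter + 1) + i, acc) for i, acc in history])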

        if score > best_score:
            best_overall, best_score = current, score

    return best_overall, best_score, all_history


In [None]:
print("\n" + "="*50)
print("Starting Hill Climbing Hyperparameter Search for CIFAR-10 CNN")
print("="*50)

In [None]:

# Reduced max_iter and restarts for quicker example run; adjust if necessary
best_params, best_score, history = hill_climb_with_restarts(max_iter=4, restarts=2)
# Note: A real search would use more iterations/restarts

print("\nBest Hyperparameters Found:")
print(best_params)
print(f"Best Validation Accuracy (after 3 epochs): {best_score:.4f}")



Number of replicas: 1

Starting Hill Climbing Hyperparameter Search for CIFAR-10 CNN

=== Restart 1 ===


In [None]:
if history:
    iterations, accuracies = zip(*history)
    plt.figure(figsize=(8,5))
    plt.plot(iterations, accuracies, marker='o')
    plt.title("Hill Climbing Accuracy Progression (CIFAR-10)")
    plt.xlabel("Iteration")
    plt.ylabel("Validation Accuracy")
    plt.grid(True)
    plt.show()


In [None]:

print("\n" + "="*50)
print("Training Final Model with Best Parameters")
print("="*50)

# Define Early Stopping and Model Checkpoint callbacks
early_stopping = EarlyStopping(monitor = 'val_accuracy',
                              patience = 5, mode = 'max',
                              restore_best_weights = True)

checkpoint = ModelCheckpoint('best_cifar10_model.h5',
                            monitor = 'val_accuracy',
                            save_best_only = True)

final_model = build_model(best_params)

# Retrain on the training split with the best batch size found by the search
# (cache before shuffling so the data is reshuffled every epoch)
final_train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train_cat)).cache().shuffle(10000).batch(best_params["batch_size"]).prefetch(buffer_size=tf.data.AUTOTUNE)
final_val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val_cat)).batch(best_params["batch_size"]).cache().prefetch(buffer_size=tf.data.AUTOTUNE)

# Use the X_val/y_val for monitoring (Test set used only for final evaluation)
history = final_model.fit(
    final_train_ds,
    validation_data=final_val_ds,
    epochs=50, # Set a high number of epochs, EarlyStopping will stop it
    callbacks=[early_stopping, checkpoint]
)


In [None]:

print("\n" + "="*50)
print("Plotting Training History")
print("="*50)

# Extract metrics from training history
loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs_ran = range(1, len(loss) + 1)

# Create subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# ---- Loss Plot ----
ax1.plot(epochs_ran, loss, 'b-', linewidth=2.5, label='Training Loss')
ax1.plot(epochs_ran, val_loss, 'r-', linewidth=2.5, label='Validation Loss')
ax1.set_title('Loss Over Epochs', fontsize=14, fontweight='bold')
ax1.set_xlabel('Epochs')
ax1.set_ylabel('Loss')
ax1.grid(True, linestyle="--", alpha=0.6)
ax1.legend()

# ---- Accuracy Plot ----
ax2.plot(epochs_ran, acc, 'b-', linewidth=2.5, label='Training Accuracy')
ax2.plot(epochs_ran, val_acc, 'r-', linewidth=2.5, label='Validation Accuracy')
ax2.set_title('Accuracy Over Epochs', fontsize=14, fontweight='bold')
ax2.set_xlabel('Epochs')
ax2.set_ylabel('Accuracy')
ax2.grid(True, linestyle="--", alpha=0.6)
ax2.legend()

# Adjust layout
plt.suptitle('Loss and Accuracy Over Epochs (CIFAR-10)', fontsize=16, fontweight='bold')
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()

In [None]:

print("\n" + "="*50)
print("Final Evaluation on Test Set")
print("="*50)

# Load best weights before final evaluation
final_model.load_weights('best_cifar10_model.h5')

test_loss, test_acc = final_model.evaluate(test_ds)

print('\nTest Loss: ', test_loss)
print(f'Test Accuracy: {np.round(test_acc * 100, 2)}%')

In [None]:

# Select a random sample from the test set for prediction
sample_index = random.randint(0, len(X_test) - 1)
sample_image = X_test[sample_index]
true_label = class_names[np.argmax(y_test_cat[sample_index])]

# Model prediction requires batch dimension
preds = final_model.predict(np.expand_dims(sample_image, axis=0))
preds_class = np.argmax(preds)
preds_label = class_names[preds_class]
confidence_score = preds[0][preds_class]

print(f'\n--- Prediction Example ---')
print(f'True Class: {true_label}')
print(f'Predicted Class: {preds_label}')
print(f'Confidence Score: {confidence_score:.4f}')

# Display the image
plt.figure(figsize=(3,3))
plt.imshow(sample_image)
plt.title(f"True: {true_label} | Predicted: {preds_label} ({confidence_score:.2%})")
plt.axis('off')
plt.show()