# Problem Statement: **Hyperparameter Tuning for AtliQ’s Fashion Item Classifier**

### AtliQ Fashion wants to develop a neural network that classifies fashion items from the FashionMNIST dataset. Your task is to improve the network's performance by tuning its hyperparameters. We will use the **FashionMNIST** dataset, but because it is large, we will work with only a subset to keep the solution computationally feasible.

**References:**

* transforms.Compose (PyTorch): [Link](https://pytorch.org/vision/master/generated/torchvision.transforms.Compose.html)
* Optuna (Hyperparameter Optimization Framework): [Link](https://optuna.readthedocs.io/en/stable/)

In [None]:
!pip install optuna

In [None]:
import torch
import numpy as np
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset, random_split
from torchvision import datasets, transforms
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
import pandas as pd
import optuna
import random

# Check if CUDA (GPU) is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")



---



**Dataset Overview**

* Dataset: FashionMNIST
* Classes: 10 (e.g., T-shirts, trousers, shoes)
* Training Images: Subset of 10,000 (randomly sampled from 60,000)
* Test Images: Subset of 2,000 (randomly sampled from 10,000)



---



**Step1**: Load and Sample the Dataset

* Load the FashionMNIST dataset using torchvision.datasets.
* Sample 10,000 images for training and 2,000 images for testing.
* Normalize the pixel values to the range [-1, 1].
* Create PyTorch DataLoaders for the training and test sets.
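
The normalization step can be checked by hand. The sketch below assumes a random tensor as a stand-in for a `ToTensor()` output (pixel values in [0, 1]) and applies the same arithmetic that `transforms.Normalize((0.5,), (0.5,))` performs, `(x - mean) / std`. The tensor and variable names are illustrative only.

```python
import torch

# Stand-in for a ToTensor() output: one 1x28x28 image with pixels in [0, 1].
torch.manual_seed(0)
image = torch.rand(1, 28, 28)
image[0, 0, 0] = 0.0  # force both extremes so the resulting range is easy to verify
image[0, 0, 1] = 1.0

mean, std = 0.5, 0.5
normalized = (image - mean) / std  # same arithmetic as Normalize((0.5,), (0.5,))

# A pixel of 0 maps to -1 and a pixel of 1 maps to +1.
print(normalized.min().item(), normalized.max().item())
```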

In [None]:
# Transform: convert to tensor, then normalize
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)) # Centers the pixel values around 0 and scales them to [-1, 1]
])

# Load FashionMNIST dataset

dataset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)

test_dataset = # Code Here



# Sample the dataset

train_subset_size = # Code Here
test_subset_size = # Code Here




---



**Step2**: Create Dataloaders

* batch size = 32
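
As a rough illustration of how subsetting and batching fit together, the sketch below uses a small synthetic `TensorDataset` in place of FashionMNIST (an assumption, so that nothing needs to be downloaded). The sizes (100 samples, subset of 70) and the seeded generator are illustrative choices, not values from this assignment.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for FashionMNIST: 100 blank 1x28x28 images with labels 0-9.
fake_images = torch.zeros(100, 1, 28, 28)
fake_labels = torch.arange(100) % 10
full_dataset = TensorDataset(fake_images, fake_labels)

# Keep 70 samples and discard the rest; a seeded generator makes the split repeatable.
subset_size = 70
generator = torch.Generator().manual_seed(42)
subset, _ = random_split(
    full_dataset,
    [subset_size, len(full_dataset) - subset_size],
    generator=generator,
)

# Shuffle the training data each epoch; the last batch may be smaller than 32.
loader = DataLoader(subset, batch_size=32, shuffle=True)
batch_images, batch_labels = next(iter(loader))
print(len(subset), len(loader), tuple(batch_images.shape))
```

With 70 samples and a batch size of 32, the loader yields three batches (32, 32 and 6 samples).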


In [None]:
train_subset, _ = random_split(dataset, [train_subset_size, len(dataset) - train_subset_size])
test_subset, _ = random_split(test_dataset, [test_subset_size, len(test_dataset) - test_subset_size])

In [None]:
batch_size = 32
train_loader = # Code Here
test_loader = # Code Here

In [None]:
print(f"Training data size: {len(train_subset)}")
print(f"Testing data size: {len(test_subset)}")



---



**Step3**: Define the Neural Network

* Create a fully connected feed-forward neural network (no CNN).

Structure:
* Input layer: 784 neurons (28x28 image flattened).
* 1st hidden layer: 128 neurons with ReLU activation.
* 2nd hidden layer: 64 neurons with ReLU activation.
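* Output layer: 10 neurons (one for each class) with Softmax activation. Note that `nn.CrossEntropyLoss` applies log-softmax to its input internally, so when training with that loss the final `nn.Softmax` layer is redundant and the model can output raw logits instead.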

Use `nn.Sequential`
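
One possible shape for this architecture is sketched below. It is not the reference solution: the name `sketch_model` is hypothetical, and the sketch deliberately ends at the raw logits and applies softmax separately, on the assumption that the loss will be `nn.CrossEntropyLoss` (which expects logits).

```python
import torch
import torch.nn as nn

sketch_model = nn.Sequential(
    nn.Flatten(),            # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(28 * 28, 128), # 1st hidden layer
    nn.ReLU(),
    nn.Linear(128, 64),      # 2nd hidden layer
    nn.ReLU(),
    nn.Linear(64, 10),       # output layer: one logit per class
)

dummy_batch = torch.randn(4, 1, 28, 28)  # four random "images"
logits = sketch_model(dummy_batch)
probs = torch.softmax(logits, dim=1)     # probabilities, if they are needed for reporting

n_params = sum(p.numel() for p in sketch_model.parameters())
print(tuple(logits.shape), n_params)
```

The parameter count follows from the layer sizes: (784·128 + 128) + (128·64 + 64) + (64·10 + 10) = 109,386.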

In [None]:
class FashionNN(nn.Module):
    def __init__(self):
        super(FashionNN, self).__init__()
        self.network = nn.Sequential(
            # Flatten the 1x28x28 input tensor into 784 values

            # 1st hidden layer (784 -> 128)

            # ReLU activation

            # 2nd hidden layer (128 -> 64)

            # ReLU activation

            # Output layer (64 -> 10 classes)

            # Softmax for probabilities
        )

    def forward(self, x):
        return self.network(x)

model = # Code Here
print(model)

loss_fn = # Code Here



---



**Step 3**: Train the Base Model

Instructions:

Set the following base hyperparameters:
* Loss function: Cross Entropy Loss
* Learning rate: 0.01
* Batch size: 32
* Optimizer: SGD
* Epochs: 100

Train the model and record the training/validation accuracy and loss.
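
The general pattern of a training epoch (zero the gradients, forward pass, loss, backward pass, optimizer step, accumulate the loss) can be sketched on a toy problem. Everything below is an assumption made for illustration: the synthetic two-class data, the tiny model, the helper name `train_one_epoch`, and the 50 epochs. Only the SGD optimizer, the learning rate of 0.01, the batch size of 32 and the cross-entropy loss come from the instructions above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_one_epoch(model, loader, loss_fn, optimizer):
    """Run one pass over the loader and return the mean batch loss."""
    model.train()
    running_loss = 0.0
    for inputs, targets in loader:
        optimizer.zero_grad()                   # clear gradients from the previous step
        loss = loss_fn(model(inputs), targets)  # forward pass and loss
        loss.backward()                         # backpropagation
        optimizer.step()                        # weight update
        running_loss += loss.item()             # accumulate a plain Python float
    return running_loss / len(loader)

# Toy data: the label depends only on the sign of the first feature.
torch.manual_seed(0)
features = torch.randn(256, 20)
targets = (features[:, 0] > 0).long()
toy_loader = DataLoader(TensorDataset(features, targets), batch_size=32, shuffle=True)

toy_model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.01)

history = [train_one_epoch(toy_model, toy_loader, loss_fn, optimizer) for _ in range(50)]
print(f"first epoch: {history[0]:.4f}, last epoch: {history[-1]:.4f}")
```

Keeping the per-epoch losses in a list such as `history` makes it straightforward to plot the loss curve afterwards.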


In [None]:
# Define loss function and optimizer
loss_function = # Code Here
optimizer = # Code Here

# Training loop
num_epochs = # Code Here
for epoch in range(num_epochs):
    train_loss = 0.0
    model.train()  # Set model to training mode
    for images, labels in train_loader:
        # Zero gradients

        # Forward pass
        predictions = # Code Here
        loss = # Code Here

        # Backward pass
        loss.backward()
        optimizer.step()

        # Accumulate the batch loss into train_loss


    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {train_loss/len(train_loader):.4f}")




---



**Step 4**: Perform Hyperparameter Tuning

Instructions:

**Grid Search:**

Hyperparameters:
* Learning rate: [0.001, 0.01, 0.1]
* Batch size: [32, 64]
* Evaluate all combinations systematically.
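
Evaluating all combinations systematically amounts to iterating over the Cartesian product of the two lists. In the sketch below, `train_and_score` is a hypothetical stand-in with a made-up formula; in the real exercise it would train the network and return its loss.

```python
from itertools import product

learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [32, 64]

def train_and_score(lr, batch_size):
    """Hypothetical stand-in: replace with real training that returns a loss."""
    return abs(lr - 0.01) + batch_size / 1000  # made-up score, lower is better

# 3 learning rates x 2 batch sizes = 6 combinations, each evaluated exactly once.
results = {
    (lr, bs): train_and_score(lr, bs)
    for lr, bs in product(learning_rates, batch_sizes)
}
best_combo = min(results, key=results.get)
print(len(results), best_combo)
```

The cost of grid search grows multiplicatively with every hyperparameter added, which is the main motivation for the random and Bayesian alternatives later in this step.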

In [None]:
# Define grid search parameters
learning_rates = # Code Here
batch_sizes = # Code Here

# Train and evaluate for all combinations
best_loss = float('inf')
best_params = {}

for lr in learning_rates:
    for batch_size in batch_sizes:
        model = FashionNN()  # Re-initialize so every combination starts from fresh weights
        optimizer = # Code Here
        train_loader = # Code Here
        train_loss = 0.0
        for images, labels in train_loader:
            # Code Here




        avg_loss = train_loss / len(train_loader)
        print(f"LR: {lr}, Batch size: {batch_size}, Loss: {avg_loss:.4f}")
        if avg_loss < best_loss:
            best_loss = avg_loss
            best_params = {'lr': lr, 'batch_size': batch_size}

print(f"Best Params (Grid Search): {best_params}")




---



**Random Search:**

Randomly select hyperparameters for 5 trials from:
* Learning rate: [0.0001, 0.001, 0.01, 0.1]
* Batch size: [16, 32, 64, 128]
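
Random search draws each hyperparameter independently for every trial instead of enumerating the full grid. The sketch below only shows the sampling step. The seeded `random.Random(0)` instance and the `coverage` figure are illustrative additions, not part of the assignment.

```python
import random

learning_rates = [0.0001, 0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64, 128]

rng = random.Random(0)  # dedicated, seeded generator so the draws are repeatable

# One (learning rate, batch size) pair per trial; duplicates are possible.
trials = [(rng.choice(learning_rates), rng.choice(batch_sizes)) for _ in range(5)]

# Fraction of the 16-point grid that the 5 trials actually visit.
coverage = len(set(trials)) / (len(learning_rates) * len(batch_sizes))
for lr, batch_size in trials:
    print(f"LR: {lr}, Batch size: {batch_size}")
```

Five trials can visit at most 5 of the 16 grid points, so random search trades completeness for a fixed, predictable budget.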

In [None]:
# Define random search space
learning_rates = # Code Here
batch_sizes = # Code Here

# Reset the trackers so results from grid search do not carry over
best_loss = float('inf')
best_params = {}

# Randomly sample 5 combinations
for _ in range(5):
    lr = random.choice(learning_rates)
    batch_size = random.choice(batch_sizes)
    model = FashionNN()  # Re-initialize so every trial starts from fresh weights
    optimizer = # Code Here
    train_loader = # Code Here
    train_loss = 0.0
    for images, labels in train_loader:
        # Code Here





    avg_loss = train_loss / len(train_loader)
    print(f"LR: {lr}, Batch size: {batch_size}, Loss: {avg_loss:.4f}")
    if avg_loss < best_loss:
        best_loss = avg_loss
        best_params = {'lr': lr, 'batch_size': batch_size}

print(f"Best Params (Random Search): {best_params}")




---



**Bayesian Optimization (Optuna):**

Use optuna.create_study to dynamically suggest:
* Learning rate: Range (0.0001, 0.1)
* Hidden layer neurons: Range (32, 256)

In [None]:
import optuna

def objective(trial):
    # Suggest parameters
    lr = # Code Here
    neurons = # Code Here

    # Modify model
    model = nn.Sequential(
        nn.Linear(28*28, neurons),
        nn.ReLU(),
        nn.Linear(neurons, 10),
        nn.Softmax(dim=1)
    )
    optimizer = # Code Here
    loss_function = # Code Here

    # Train model
    model.train()
    num_epochs = # Code Here
    for epoch in range(num_epochs):
        for images, labels in train_loader:
            # Flatten images
            images = images.view(images.size(0), -1)

            # Forward pass


            # Backward pass


    # Evaluate on the test subset (used here as the validation set)
    model.eval()
    total_loss = 0.0
    with torch.no_grad():
        for images, labels in test_loader:
            # Flatten images
            images = images.view(images.size(0), -1)

            # Forward pass
            predictions = model(images)
            loss = loss_function(predictions, labels)

            # Accumulate the batch loss into total_loss


    avg_loss = total_loss / len(test_loader)  # Average loss over all batches
    return avg_loss  # Return loss for Optuna to minimize

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=10)
print(f"Best Params (Optuna): {study.best_params}")




---



**Step5**: Evaluate and Compare the Model

* Train the model using the best hyperparameters from each method (Grid Search, Random Search, Optuna).
* num_epochs = 50
* Evaluate all models on the test set.
* Plot training/validation accuracy and loss for the best model.
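
Because three models have to be evaluated the same way, it may help to wrap the evaluation loop in a reusable function. The helper name `evaluate` and the toy check are assumptions: the toy "model" is `nn.Identity()`, so the hand-written inputs act directly as logits and the expected accuracy can be worked out by inspection.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, loss_fn):
    """Return (mean batch loss, accuracy in percent) over the loader."""
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, targets in loader:
            outputs = model(inputs)
            total_loss += loss_fn(outputs, targets).item()
            correct += (outputs.argmax(dim=1) == targets).sum().item()
            total += targets.size(0)
    return total_loss / len(loader), 100 * correct / total

# Toy check: four samples whose logits favour classes 0, 1, 2 and 0.
toy_logits = torch.tensor([[5.0, 0.0, 0.0],
                           [0.0, 5.0, 0.0],
                           [0.0, 0.0, 5.0],
                           [5.0, 0.0, 0.0]])
toy_targets = torch.tensor([0, 1, 2, 1])  # the last label disagrees with its logits
toy_loader = DataLoader(TensorDataset(toy_logits, toy_targets), batch_size=2)

mean_loss, accuracy = evaluate(nn.Identity(), toy_loader, nn.CrossEntropyLoss())
print(f"Test Loss: {mean_loss:.4f}, Test Accuracy: {accuracy:.2f}%")
```

Three of the four toy predictions match their labels, so the reported accuracy is 75.00%.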


In [None]:
# Train model with best params and evaluate
model = FashionNN()  # Re-initialize the model
optimizer = # Code here
train_loader = # Code here

# Define loss function
loss_function = # Code here

# Training loop
num_epochs = 50 # Re-train with best parameters
for epoch in range(num_epochs):
    train_loss = 0.0
    model.train()  # Set model to training mode
    for images, labels in train_loader:
        # Clear previous gradients

        predictions =                # Forward pass

        loss =                       # Compute loss
        # Backpropagation

        # Update weights

        train_loss += loss.item()
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {train_loss/len(train_loader):.4f}")

# Evaluate on test set
model.eval()  # Set model to evaluation mode
test_loss = 0.0
correct = 0
total = 0

with torch.no_grad():  # Disable gradient computation for evaluation
    for images, labels in test_loader:
        predictions =            # Forward pass
        loss =                   # Compute loss
        test_loss += loss.item()

        # Calculate accuracy
        _, predicted = torch.max(predictions, 1)  # Get class with highest probability
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

# Print test loss and accuracy
print(f"Test Loss: {test_loss/len(test_loader):.4f}")
print(f"Test Accuracy: {100 * correct / total:.2f}%")





---



**Step6**: Visualize the Model
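
Plot titles are easier to read with class names than with integer labels. The list below follows the standard FashionMNIST label order (the same order exposed by the dataset's `classes` attribute in torchvision). `CLASS_NAMES` and `make_title` are hypothetical helper names introduced for this sketch.

```python
# FashionMNIST label index -> human-readable class name.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def make_title(predicted, actual):
    """Build a subplot title from two label indices (ints or 0-dim tensors)."""
    return f"Pred: {CLASS_NAMES[int(predicted)]}, True: {CLASS_NAMES[int(actual)]}"

print(make_title(9, 7))  # Pred: Ankle boot, True: Sneaker
```

In the plotting loop, `plt.title(make_title(predictions[i], labels[i]))` could then replace the numeric title.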

In [None]:
# Get predictions
model.eval()
images, labels = next(iter(test_loader))
predictions = model(images).argmax(dim=1)

# Plot 9 images
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.imshow(images[i].squeeze(), cmap='gray')
    plt.title(f"Pred: {predictions[i]}, True: {labels[i]}")
    plt.axis('off')
plt.show()
