# Assignment 5: Deep Learning {-}

This assignment familiarizes you with training and testing a Deep Neural Network (DNN) on the CIFAR-10 dataset. You will complete the following:

1.  **(5 points) Coding tasks:** The following questions involve writing code to complete specific tasks.  
    1.1 *(1 point)* Load the CIFAR-10 dataset, visualize sample images, and perform data normalization to improve training performance.  
    1.2 *(1 point)* First network: Build, train, and test a deep neural network with at least three convolutional layers, two fully connected layers, and two pooling layers.  
    1.3 *(1 point)* Second network: Build, train, and test another deep neural network, with an architecture of your choice, but at most 4M (four million) parameters, ensuring the architecture meets this constraint by verifying with model.summary().  
    1.4 *(2 points)* Modify the second network architecture by tuning the layer hyperparameters or adjusting the layer design to improve test accuracy while remaining within the four million parameter limit. Discuss your observations and the trade-offs of the changes you make.  

2.  **(5 points) Open discussion questions:** These discussion questions ask you to analyze and argue your points.  Feel free to include relevant code examples to strengthen your arguments.  
    2.1 *(1 point)* How did hyperparameter tuning (learning rate, dropout, batch size) affect your model’s accuracy? Were there any unexpected results?  
    2.2 *(1 point)* How did the constraint of keeping the model within 4 million parameters impact your design choices? Would a larger model necessarily perform better?  
    2.3 *(1 point)* How can deep learning models trained on datasets like CIFAR-10 be applied in real-world scenarios? Give an example.  
    2.4 *(1 point)* Deep learning models for image recognition can have biases. What ethical concerns should be considered when deploying such models?  
    2.5 *(1 point)* What was the most interesting or challenging part of this assignment? If you had more time, what additional improvements would you make?  

The dataset you will be working on is CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html), which consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images. The ten object classes are:
* airplane
* automobile
* bird
* cat
* deer
* dog
* frog
* horse
* ship
* truck

Here are some samples from the dataset:

![CIFAR-10 sample images](https://docs.pytorch.org/tutorials/_images/cifar10.png)

### Submission {-}
The submission folder should be organized as follows:

- ./\<StudentID>-assignment5-notebook.ipynb: Jupyter notebook containing source code.
- ./\<Test-accuracy>-\<StudentID>.txt: accuracy of the second network on the test set (for extra credit, see the 'Evaluation' section below). For example, if you get an accuracy of 0.8124, name this file 08124-2012345.txt. Leave the file empty.

Name the submission folder ML4DS-\<StudentID>-Assignment5 (e.g., ML4DS-2012345-Assignment5) and compress it into an archive with the same name.
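
As an illustration of the naming rule above, the following minimal sketch builds the accuracy file name from a test accuracy and a student ID. The helper name `accuracy_filename` and the choice of four decimal places are assumptions inferred from the 0.8124 example, not an official specification.

```python
def accuracy_filename(test_accuracy: float, student_id: str) -> str:
    """Build '<Test-accuracy>-<StudentID>.txt' with the decimal point removed."""
    if not 0.0 <= test_accuracy <= 1.0:
        raise ValueError("test_accuracy must be in [0, 1]")
    # Assumption: four decimal places, e.g. 0.8124 -> "0.8124" -> "08124"
    digits = f"{test_accuracy:.4f}".replace(".", "")
    return f"{digits}-{student_id}.txt"


print(accuracy_filename(0.8124, "2012345"))  # 08124-2012345.txt
```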
    
### Evaluation {-}
Your submission will be evaluated on how well it meets the assignment requirements. Modeling steps beyond the basic requirements and excellent model accuracy are a plus. In addition, your code should conform to a Python coding convention such as PEP 8.

EXTRA CREDIT: The top three submissions with the highest test accuracy on the second network (at most 4M parameters) will receive extra credit. **You must verify that the architecture meets this constraint by printing the number of parameters with model.summary(). Please follow the submission format to be eligible for this extra credit.**

### Deadline {-}
Please visit Canvas for details.

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import keras              # Keras is the high-level API of TensorFlow

In [None]:
from keras.models import Sequential
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization

In [None]:
# PLEASE DO NOT CHANGE THIS CODE

# Load the cifar10 dataset and split train/test
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Split train/valid from the training set
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1, random_state=5)

print("Train shape: X_train = " + str(X_train.shape) + ", y_train = " + str(y_train.shape))
print("Validation shape: X_val = " + str(X_val.shape) + ", y_val = " + str(y_val.shape))
print("Test shape: X_test = " + str(X_test.shape) + ", y_test = " + str(y_test.shape))

In [None]:
# ----- TEMPORARILY reduce training set to 200 samples for quick testing -----
X_train = X_train[:200]
y_train = y_train[:200]

print("Reduced Train shape: X_train = " + str(X_train.shape) + ", y_train = " + str(y_train.shape))


In [None]:
# # Show some samples in the dataset
# import matplotlib.pyplot as plt
# imgplot = plt.imshow(X_train[44999])
# plt.show()
# imgplot = plt.imshow(X_test[4999])
# plt.show()

## 1. Coding tasks

1. Load the CIFAR-10 dataset, visualize sample images, and perform data normalization to improve training performance.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# CIFAR-10 class names
class_names = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck"
]

# Count how many samples belong to each class
class_counts = np.bincount(y_train.flatten(), minlength=10)

# For each class, collect indices of all images
class_indices = {i: np.where(y_train.flatten() == i)[0] for i in range(10)}

# Plot setup
fig, axes = plt.subplots(4, 3, figsize=(6, 8))
axes = axes.flatten()

for i in range(10):
    ax = axes[i]

    # Randomly pick 4 images from this class
    idxs = np.random.choice(class_indices[i], 4, replace=False)
    imgs = X_train[idxs]

    # Create a small 2x2 grid inside the subplot
    combined = np.zeros((64, 64, 3), dtype=np.uint8)  # 32*2 = 64 pixels per side
    combined[:32, :32] = imgs[0]
    combined[:32, 32:] = imgs[1]
    combined[32:, :32] = imgs[2]
    combined[32:, 32:] = imgs[3]

    ax.imshow(combined)
    ax.set_title(f"{class_names[i]} ({class_counts[i]})", fontsize=10)
    ax.axis("off")

# Turn off remaining unused subplots
for ax in axes[10:]:
    ax.axis("off")

plt.show()


In [None]:
# Convert pixel values from 0–255 to 0–1
X_train_norm = X_train.astype("float32") / 255.0
X_val_norm = X_val.astype("float32") / 255.0
X_test_norm = X_test.astype("float32") / 255.0

print("Data normalized: pixel values are now between 0 and 1.")
# Use the last sample so the index stays valid even if the training set was reduced above
print("Sample normalized pixel: ", X_train_norm[-1][16][16])
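
Scaling to the range 0 to 1 is one normalization option. A common alternative is per-channel standardization, sketched below under the assumption that the statistics are fitted on the training split only and reused for the other splits. The arrays `fake_train` and `fake_test` are random stand-ins for CIFAR-10 images, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for CIFAR-10 images: uint8 arrays of shape (N, 32, 32, 3)
fake_train = rng.integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)
fake_test = rng.integers(0, 256, size=(20, 32, 32, 3), dtype=np.uint8)


def fit_channel_stats(images):
    """Return per-channel mean and std of images scaled to [0, 1]."""
    scaled = images.astype("float32") / 255.0
    return scaled.mean(axis=(0, 1, 2)), scaled.std(axis=(0, 1, 2))


def standardize(images, mean, std):
    """Scale to [0, 1], then subtract the mean and divide by the std per channel."""
    return (images.astype("float32") / 255.0 - mean) / std


# Fit on the training split only, then apply the same statistics everywhere
mean, std = fit_channel_stats(fake_train)
train_std = standardize(fake_train, mean, std)
test_std = standardize(fake_test, mean, std)

print("Per-channel mean after standardization:", train_std.mean(axis=(0, 1, 2)))
print("Per-channel std after standardization: ", train_std.std(axis=(0, 1, 2)))
```

Fitting the statistics on the training split alone keeps information from the validation and test sets out of the preprocessing step.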

2. First network: Build, train, and test a deep neural network with at least three convolutional layers, two fully connected layers, and two pooling layers.

In [None]:
# from keras.models import Sequential
# from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# model = Sequential()

# # First Convolution + Pooling
# model.add(Conv2D(32, (5,5), activation='relu', input_shape=(X_train_norm.shape[1], X_train_norm.shape[2], X_train_norm.shape[3])))
# model.add(MaxPooling2D((2,2)))

# # Second Convolution
# model.add(Conv2D(64, (5,5), activation='relu'))


# # Third Convolution + Pooling
# model.add(Conv2D(128, (5,5), activation='relu'))
# model.add(MaxPooling2D((2,2)))

# # Flatten before Dense layers
# model.add(Flatten())

# # Fully connected layers
# model.add(Dense(128, activation='relu'))
# model.add(Dropout(0.5))  # Optional: helps prevent overfitting
# model.add(Dense(64, activation='relu'))

# # Output layer (change units based on number of classes)
# model.add(Dense(10, activation='softmax'))

# # Compile the model
# model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])  # labels are integer class ids, not one-hot

# # Summary of the model
# model.summary()
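
The layer shapes and parameter count of the commented-out architecture above can be checked by hand before calling `model.summary()`. The sketch below assumes Keras defaults (stride 1 and `valid` padding for `Conv2D`, pool stride equal to pool size for `MaxPooling2D`); the helper functions are illustrative and not part of Keras.

```python
def conv2d(shape, filters, kernel):
    """Return (output_shape, params) for a stride-1, 'valid'-padded convolution."""
    h, w, c = shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out_shape, params


def max_pool(shape, size=2):
    """Return the output shape of non-overlapping max pooling."""
    h, w, c = shape
    return (h // size, w // size, c)


def dense(n_in, n_out):
    """Return the parameter count of a fully connected layer."""
    return n_in * n_out + n_out


shape = (32, 32, 3)  # CIFAR-10 input
total_params = 0

shape, p = conv2d(shape, 32, 5)
total_params += p
shape = max_pool(shape)
shape, p = conv2d(shape, 64, 5)
total_params += p
shape, p = conv2d(shape, 128, 5)
total_params += p
shape = max_pool(shape)
feature_shape = shape

flat = shape[0] * shape[1] * shape[2]
# Dropout adds no parameters, so only the three Dense layers are counted
for n_in, n_out in [(flat, 128), (128, 64), (64, 10)]:
    total_params += dense(n_in, n_out)

print(feature_shape, flat, total_params)  # (3, 3, 128) 1152 415114
```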

In [None]:
# # Compile the model
# model.compile(loss = tf.keras.losses.sparse_categorical_crossentropy, # Define loss function
#                 optimizer= tf.keras.optimizers.Adam(learning_rate=1e-2), metrics=['accuracy']) # Define initial learning rate and metrics.


# # Train the model. Using Colab for training
# history = model.fit(X_train_norm, y_train, # Data feature and data label
#                     batch_size=1024, # Batch size
#                     epochs=10, # Number of training epochs
#                     validation_data=(X_val_norm, y_val)) # Validation set

In [None]:
# # Visualize training and validation performance
# f,ax=plt.subplots(2,1)

# # Plot training and validation loss
# ax[0].plot(history.history['loss'], color='b',label='Training Loss')
# ax[0].plot(history.history['val_loss'],color='r',label='Validation Loss')

# # Plot training and validation accuracy
# ax[1].plot(history.history['accuracy'],color='b',label='Training Accuracy')
# ax[1].plot(history.history['val_accuracy'],color='r',label='Validation Accuracy')

# # Add a legend to each subplot (plt.legend() alone only labels the last one)
# ax[0].legend()
# ax[1].legend()

In [None]:
# # Show the model performance
# result = model.evaluate(X_test_norm, y_test) # If unspecified, batch_size will default to 32
# print(model.metrics_names) # result[0] is loss, result[1] is accuracy. The metrics are defined in model.compile(...)
# print("Loss and accuracy on the test set: loss = {}, accuracy = {}".format(result[0],result[1]))
# model.summary()

3. Second network: Build, train, and test another deep neural network, with an architecture of your choice, but at most 4M (four million) parameters, ensuring the architecture meets this constraint by verifying with model.summary().

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

import numpy as np
from PIL import Image
from tqdm.auto import tqdm
import matplotlib.pyplot as plt


In [None]:
# Load DINOv2 ViT-S/14
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
dinov2.eval()
for p in dinov2.parameters():
    p.requires_grad = False  # freeze backbone

# Preprocessing for DINOv2
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std =[0.229, 0.224, 0.225],
    )
])


In [None]:
class DinoDataset(Dataset):
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        img = Image.fromarray(self.X[idx])
        x = transform(img).unsqueeze(0)

        with torch.no_grad():
            feat = dinov2(x).squeeze(0)  # shape (384,)

        return feat, torch.tensor(self.y[idx][0], dtype=torch.long)
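
Because the backbone is frozen and the preprocessing is deterministic, the features returned by `DinoDataset` are identical in every epoch, so they could be computed once and cached. The sketch below shows that pattern with NumPy only; `precompute_features` is a hypothetical helper and `mean_colour_extractor` is a stand-in for the real backbone, used here so the example runs without downloading a model.

```python
import numpy as np


def precompute_features(images, extract_fn, batch_size=64):
    """Run extract_fn once over images in batches and stack the results."""
    chunks = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        chunks.append(np.asarray(extract_fn(batch), dtype=np.float32))
    return np.concatenate(chunks, axis=0)


def mean_colour_extractor(batch):
    """Hypothetical stand-in for the frozen backbone: mean RGB value per image."""
    return batch.reshape(len(batch), -1, 3).mean(axis=1)


rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(150, 32, 32, 3), dtype=np.uint8)

features = precompute_features(fake_images, mean_colour_extractor)
print(features.shape)  # (150, 3)
```

In the notebook, `extract_fn` would apply `transform` and call `dinov2` under `torch.no_grad()`, and the cached array could then be wrapped in a `torch.utils.data.TensorDataset` so that each epoch only trains the small classifier.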


In [None]:
train_ds = DinoDataset(X_train, y_train)
val_ds   = DinoDataset(X_val,   y_val)
test_ds  = DinoDataset(X_test,  y_test)

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader   = DataLoader(val_ds,   batch_size=64, shuffle=False)
test_loader  = DataLoader(test_ds,  batch_size=64, shuffle=False)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(device)


In [None]:
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(384, 1024),
            nn.ReLU(),
            nn.Dropout(0.3),

            nn.Linear(1024, 256),
            nn.ReLU(),
            nn.Dropout(0.3),

            nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.net(x)


# Instantiate model
model = Classifier().to(device)

# Count parameters
def count_params(model):
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

total, trainable = count_params(model)
print(f"Classifier total parameters: {total:,}")
print(f"Classifier trainable parameters: {trainable:,}")

# Loss & optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)
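
The number printed by `count_params` can be cross-checked with simple arithmetic: a fully connected layer with `n_in` inputs and `n_out` outputs has `n_in * n_out + n_out` parameters. The sketch below applies this to the layer sizes of `Classifier`; the helper name `mlp_param_count` and the constant `PARAM_LIMIT` are illustrative. It covers only the classifier head, so the frozen DINOv2 backbone's own parameters are outside this count.

```python
PARAM_LIMIT = 4_000_000  # budget stated in the assignment


def mlp_param_count(layer_sizes):
    """Sum weights and biases over consecutive fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Layer sizes of the Classifier head: 384 -> 1024 -> 256 -> 10
head_params = mlp_param_count([384, 1024, 256, 10])
print(f"Head parameters: {head_params:,}")  # Head parameters: 659,210
print("Within budget:", head_params <= PARAM_LIMIT)  # Within budget: True
```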


In [None]:
# train_losses, val_losses = [], []
# train_accs, val_accs = [], []

# for epoch in range(10):
#     model.train()
#     total, correct, running_loss = 0, 0, 0

#     for feats, labels in tqdm(train_loader):
#         feats, labels = feats.to(device), labels.to(device)

#         optimizer.zero_grad()
#         outputs = model(feats)
#         loss = criterion(outputs, labels)
#         loss.backward()
#         optimizer.step()

#         running_loss += loss.item() * feats.size(0)
#         _, predicted = outputs.max(1)
#         correct += predicted.eq(labels).sum().item()
#         total += labels.size(0)

#     train_loss = running_loss / total
#     train_acc  = correct / total
#     train_losses.append(train_loss)
#     train_accs.append(train_acc)

#     # validation
#     model.eval()
#     total, correct, running_loss = 0, 0, 0
#     with torch.no_grad():
#         for feats, labels in val_loader:
#             feats, labels = feats.to(device), labels.to(device)
#             outputs = model(feats)
#             loss = criterion(outputs, labels)

#             running_loss += loss.item() * feats.size(0)
#             _, predicted = outputs.max(1)
#             correct += predicted.eq(labels).sum().item()
#             total += labels.size(0)

#     val_loss = running_loss / total
#     val_acc  = correct / total
#     val_losses.append(val_loss)
#     val_accs.append(val_acc)

#     print(f"Epoch {epoch+1}/10 | Train Acc: {train_acc:.4f} | Val Acc: {val_acc:.4f}")


In [None]:
# --- Train the classifier head on DINOv2 features (computed on the fly by DinoDataset) ---
train_losses, val_losses = [], []
train_accs, val_accs = [], []

for epoch in range(1):  # change to range(10) for full training
    model.train()
    running_loss, correct, total = 0.0, 0, 0

    for feats, labels in tqdm(train_loader):
        feats, labels = feats.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(feats)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * feats.size(0)
        _, preds = outputs.max(1)
        correct += preds.eq(labels).sum().item()
        total += labels.size(0)

    train_loss = running_loss / total
    train_acc = correct / total
    train_losses.append(train_loss)
    train_accs.append(train_acc)

    # --- Validation ---
    model.eval()
    running_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for feats, labels in val_loader:
            feats, labels = feats.to(device), labels.to(device)
            outputs = model(feats)
            loss = criterion(outputs, labels)

            running_loss += loss.item() * feats.size(0)
            _, preds = outputs.max(1)
            correct += preds.eq(labels).sum().item()
            total += labels.size(0)

    val_loss = running_loss / total
    val_acc = correct / total
    val_losses.append(val_loss)
    val_accs.append(val_acc)

    print(f"Epoch {epoch+1} | Train Acc: {train_acc:.4f} | Val Acc: {val_acc:.4f}")



In [None]:
plt.figure(figsize=(14,5))

plt.subplot(1,2,1)
plt.plot(train_losses, label="Train Loss")
plt.plot(val_losses, label="Val Loss")
plt.title("Loss over epochs")
plt.legend()

plt.subplot(1,2,2)
plt.plot(train_accs, label="Train Acc")
plt.plot(val_accs, label="Val Acc")
plt.title("Accuracy over epochs")
plt.legend()

plt.show()


In [None]:
model.eval()
total, correct = 0, 0

with torch.no_grad():
    for feats, labels in test_loader:
        feats, labels = feats.to(device), labels.to(device)
        outputs = model(feats)
        _, predicted = outputs.max(1)
        correct += predicted.eq(labels).sum().item()
        total += labels.size(0)

test_acc = correct / total
print("Final Test Accuracy:", test_acc)
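
Overall accuracy can hide classes that the model handles poorly. The sketch below computes a confusion matrix and per-class accuracy with NumPy; the function name `per_class_accuracy` is hypothetical, and the toy labels are made up for illustration rather than taken from a real run.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, num_classes=10):
    """Return (confusion_matrix, per_class_accuracy) for integer class labels."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    # Rows are true classes, columns are predicted classes
    np.add.at(confusion, (y_true, y_pred), 1)
    support = confusion.sum(axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Classes with no samples get NaN instead of a division error
        acc = np.where(support > 0, np.diag(confusion) / support, np.nan)
    return confusion, acc


# Toy labels for illustration only
toy_true = [0, 0, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 2]
confusion, acc = per_class_accuracy(toy_true, toy_pred, num_classes=3)
print(confusion)
print(acc)  # [0.5 1.  1. ]
```

In the notebook, the predictions and labels collected inside the test loop could be concatenated and passed to this function together with `class_names` for reporting.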


In [None]:
# ====== Save Model with Prefix 'dukemodel' ======
import torch
import datetime
import os

# Create timestamp
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

# File names
weights_path = f"dukemodel_{timestamp}.pth"
fullmodel_path = f"dukemodel_{timestamp}_full.pth"

# Save weights only (recommended for PyTorch models)
torch.save(model.state_dict(), weights_path)

# Save full model (architecture + weights)
torch.save(model, fullmodel_path)

print("Model saved successfully:")
print("  Weights file :", weights_path)
print("  Full model   :", fullmodel_path)
print("\nFiles are in:", os.getcwd())


4. Modify the second network architecture by tuning the layer hyperparameters or adjusting the layer design to improve test accuracy while remaining within the four million parameter limit. Discuss your observations and the trade-offs of the changes you make.

In [None]:
# Your code goes here. Please make sure to explain the reasons behind your data processing and modeling choices.
# 1.4

## 2. Open discussion questions

1. How did hyperparameter tuning (learning rate, dropout, batch size) affect your model’s accuracy? Were there any unexpected results?

In [None]:
# Your argument goes here. Please include data visualization and analysis to back up your argument.
# 2.1
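
One way to organize the tuning experiments for this question is to enumerate the combinations of learning rate, dropout, and batch size up front. The sketch below does this with `itertools.product`; the specific values in `search_space` and the helper `iter_configs` are assumptions chosen for illustration, not recommended settings.

```python
from itertools import product

# Illustrative values only; adjust to the available compute budget
search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dropout": [0.1, 0.3, 0.5],
    "batch_size": [32, 64, 128],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(search_space))
print(len(configs))  # 27
print(configs[0])    # {'learning_rate': 0.01, 'dropout': 0.1, 'batch_size': 32}
```

Each configuration would then be trained and evaluated on the validation set, with the results recorded alongside the configuration for later plotting.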

2. How did the constraint of keeping the model within 4 million parameters impact your design choices? Would a larger model necessarily perform better?

In [None]:
# Your argument goes here. Please include data visualization and analysis to back up your argument.
# 2.2

3. How can deep learning models trained on datasets like CIFAR-10 be applied in real-world scenarios? Give an example.

In [None]:
# Your argument goes here. Please include data visualization and analysis to back up your argument.
# 2.3

4. Deep learning models for image recognition can have biases. What ethical concerns should be considered when deploying such models?

In [None]:
# Your argument goes here. Please include data visualization and analysis to back up your argument.
# 2.4

5. What was the most interesting or challenging part of this assignment? If you had more time, what additional improvements would you make?

In [None]:
# Your argument goes here. Please include data visualization and analysis to back up your argument.
# 2.5