
## Homework

> Note: it's very likely that in this homework your answers won't match the options exactly. That's okay and expected. Select the option that's closest to your solution. If it's exactly in between two options, select the higher value.

### Dataset

In this homework, we'll build a model for classifying various hair types. For this, we will use the Hair Type dataset that was obtained from [Kaggle](https://www.kaggle.com/datasets/kavyasreeb/hair-type-dataset) and slightly rebuilt.

You can download the target dataset for this homework from [here](https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip):

```bash
wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
```

After downloading, unzip `data.zip`.

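
For environments without the `unzip` command, the same step can be sketched in Python. This assumes `data.zip` has already been downloaded into the working directory; the helper name `extract_dataset` and its default arguments are illustrative and not part of the homework.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path="data.zip", dest="."):
    """Extract the archive into dest and return its sorted top-level entry names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Collect the first path component of every archive member.
        return sorted({name.split("/")[0] for name in archive.namelist()})


if __name__ == "__main__":
    # Only runs the extraction if the archive is actually present.
    if Path("data.zip").exists():
        print(extract_dataset())
```
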
In the lectures we saw how to use a pre-trained neural network. In the homework, we'll train a much smaller model from scratch.

We will use PyTorch for that.

You can use Google Colab or your own computer for that.

### Data Preparation

The dataset contains around 1,000 images of hair, in separate folders for the training and test sets.

### Reproducibility

Reproducibility in deep learning is a multifaceted challenge that requires attention to both software and hardware details. In some cases, we can't guarantee exactly the same results during the same experiment runs.

Therefore, in this homework we suggest setting the random seeds as follows:

In [2]:
import numpy as np
import torch

SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)

if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

Also, use PyTorch version 2.8.0 (that's the one in Colab).
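
As a quick sanity check of the seeding above, the sketch below re-seeds PyTorch's CPU generator and confirms that the same random tensor comes out twice. It only illustrates `torch.manual_seed`; it does not demonstrate full run-to-run determinism on a GPU, which also depends on the cuDNN flags set above.

```python
import torch

SEED = 42

# Draw three random numbers, then re-seed and draw again.
torch.manual_seed(SEED)
first_draw = torch.rand(3)

torch.manual_seed(SEED)
second_draw = torch.rand(3)

# Re-seeding resets the generator, so both draws are identical.
draws_match = torch.equal(first_draw, second_draw)
print(draws_match)
```
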

### Model

For this homework we will use a Convolutional Neural Network (CNN). We'll use PyTorch.

You need to develop the model with the following structure:

* The shape for input should be `(3, 200, 200)` (channels first format in PyTorch)
* Next, create a convolutional layer (`nn.Conv2d`):
  * Use 32 filters (output channels)
  * Kernel size should be `(3, 3)` (that's the size of the filter), padding = 0, stride = 1
  * Use `'relu'` as activation
* Reduce the size of the feature map with max pooling (`nn.MaxPool2d`)
  * Set the pooling size to `(2, 2)`
* Turn the multi-dimensional result into vectors using `flatten` or `view`
* Next, add an `nn.Linear` layer with 64 neurons and `'relu'` activation
* Finally, create the `nn.Linear` layer with 1 neuron - this will be the output
  * The output layer should have an activation - use the appropriate activation for the binary classification case
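
The layer sizes in this list determine the input size of the first `nn.Linear` layer. The sketch below works through the arithmetic in plain Python, using the standard output-size formula for convolution and pooling, `floor((n + 2 * padding - kernel) / stride) + 1`. The helper name `out_size` is ours, and the pooling stride of 2 reflects the `nn.MaxPool2d` default of using the kernel size as the stride.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


input_side = 200
conv_side = out_size(input_side, kernel=3, stride=1, padding=0)  # Conv2d: 200 -> 198
pool_side = out_size(conv_side, kernel=2, stride=2)              # MaxPool2d(2): 198 -> 99

flat_features = 32 * pool_side * pool_side  # channels * height * width
print(conv_side, pool_side, flat_features)  # 198 99 313632
```
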

As optimizer use `torch.optim.SGD` with the following parameters:

`torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.8)`

### Question 1

Which loss function will you use?

* `nn.MSELoss()`
* `nn.BCEWithLogitsLoss()`  <--
* `nn.CrossEntropyLoss()`
* `nn.CosineEmbeddingLoss()`

(Multiple answers can be correct, so pick any)
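
To see why `nn.BCEWithLogitsLoss()` fits a single-output binary classifier, here is a plain-Python sketch of the quantity it computes for one sample: binary cross-entropy applied to the sigmoid of the raw output (the logit). The function names are ours, and the stable form shown is the standard identity `max(z, 0) - z * y + log(1 + exp(-|z|))`, not a copy of PyTorch's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_naive(z, y):
    """Sigmoid followed by binary cross-entropy."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_with_logits(z, y):
    """Same loss computed directly from the logit in a numerically stable form."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


# (logit, target) pairs chosen only for illustration.
samples = [(2.0, 1.0), (-1.5, 1.0), (0.3, 0.0)]
for z, y in samples:
    print(round(bce_naive(z, y), 6), round(bce_with_logits(z, y), 6))
```

Both columns agree for moderate logits. For very large logits the naive form can end up evaluating `log(0)`, which is one reason to keep the sigmoid inside the loss rather than in the model.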



In [3]:
import torch.nn as nn
import torch.optim as optim

# 1. Model Structure
class BinaryClassifierCNN(nn.Module):
    def __init__(self):
        super(BinaryClassifierCNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=0, stride=1)
        self.pool = nn.MaxPool2d(kernel_size=2)
        # Flatten size: 32 channels * 99 height * 99 width
        self.fc1 = nn.Linear(32 * 99 * 99, 64)
        self.fc2 = nn.Linear(64, 1)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = torch.flatten(x, start_dim=1)
        x = torch.relu(self.fc1(x))

        # Return the raw output (logits) without a sigmoid;
        # nn.BCEWithLogitsLoss applies the sigmoid internally.
        x = self.fc2(x)
        return x

# 2. Setup
model = BinaryClassifierCNN()

# Optimizer as requested
optimizer = optim.SGD(model.parameters(), lr=0.002, momentum=0.8)

# Loss Function: Handles Sigmoid + BCELoss
criterion = nn.BCEWithLogitsLoss()

# 3. Generate Dummy Data for Testing
# Batch size of 4, 3 channels, 200x200 images
inputs = torch.randn(4, 3, 200, 200)
# Binary targets (0 or 1), shape needs to match output (4, 1)
targets = torch.empty(4, 1).random_(2)

### Question 2

What's the total number of parameters of the model? You can use torchsummary or count manually.

In PyTorch, you can find the total number of parameters using:
```python
# Option 1: Using torchsummary (install with: pip install torchsummary)
from torchsummary import summary
summary(model, input_size=(3, 200, 200))

# Option 2: Manual counting
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params}")
```

* 896
* 11,214,912
* 15,896,912
* 20,073,473  <--

In [4]:
from torchsummary import summary
summary(model, input_size=(3, 200, 200))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 32, 198, 198]             896
         MaxPool2d-2           [-1, 32, 99, 99]               0
            Linear-3                   [-1, 64]      20,072,512
            Linear-4                    [-1, 1]              65
Total params: 20,073,473
Trainable params: 20,073,473
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.46
Forward/backward pass size (MB): 11.96
Params size (MB): 76.57
Estimated Total Size (MB): 89.00
----------------------------------------------------------------
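
The totals reported by `torchsummary` can be cross-checked by hand. A convolutional layer has `(kernel_h * kernel_w * in_channels + 1) * out_channels` parameters and a linear layer has `(inputs + 1) * outputs`, where the `+ 1` accounts for the bias. A sketch in plain Python:

```python
conv_params = (3 * 3 * 3 + 1) * 32      # 896
flat_features = 32 * 99 * 99            # 313,632 inputs to the first linear layer
fc1_params = (flat_features + 1) * 64   # 20,072,512
fc2_params = (64 + 1) * 1               # 65

total_params = conv_params + fc1_params + fc2_params
print(f"{total_params:,}")  # 20,073,473
```
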


### Generators and Training

For the next two questions, use the following transformation for both train and test sets:


In [8]:
from torchvision import transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

# Define the Transformations
transform_rules = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# Load datasets
train_dataset = ImageFolder(root='data/train', transform=transform_rules)
test_dataset = ImageFolder(root='data/test', transform=transform_rules)

# Loaders
train_loader = DataLoader(train_dataset, batch_size=20, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=20, shuffle=False)

print(f"Classes found: {train_dataset.class_to_idx}")
# Expected: {'curly': 0, 'straight': 1}

Classes found: {'curly': 0, 'straight': 1}
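
For intuition about the transformation: `transforms.ToTensor()` scales 8-bit pixel values to the range [0, 1], and `transforms.Normalize` then applies `(x - mean) / std` per channel. The sketch below reproduces that arithmetic for a single RGB pixel in plain Python; the helper name `normalize_pixel` and the example pixels are made up for illustration.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb_uint8):
    """Mimic ToTensor + Normalize for one RGB pixel given as 0-255 integers."""
    scaled = [value / 255 for value in rgb_uint8]  # ToTensor: [0, 255] -> [0, 1]
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]  # Normalize


black = normalize_pixel([0, 0, 0])
white = normalize_pixel([255, 255, 255])
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

After normalization the inputs are roughly centered around zero, with values between about -2.1 and 2.6.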


Now fit the model.

In [12]:
# Check if a GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(f"Using device: {device}")

# Move the model to this device
model = model.to(device)

num_epochs = 10
history = {'acc': [], 'loss': [], 'val_acc': [], 'val_loss': []}

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1) # Ensure labels are float and have shape (batch_size, 1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        # For binary classification with BCEWithLogitsLoss, apply sigmoid to outputs before thresholding for accuracy
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(test_dataset)
    val_epoch_acc = correct_val / total_val
    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

Using device: cpu
Epoch 1/10, Loss: 0.6633, Acc: 0.5950, Val Loss: 0.6228, Val Acc: 0.6219
Epoch 2/10, Loss: 0.5809, Acc: 0.6737, Val Loss: 0.6104, Val Acc: 0.6617
Epoch 3/10, Loss: 0.5369, Acc: 0.7100, Val Loss: 0.5816, Val Acc: 0.6766
Epoch 4/10, Loss: 0.4701, Acc: 0.7588, Val Loss: 0.6133, Val Acc: 0.6219
Epoch 5/10, Loss: 0.4208, Acc: 0.7975, Val Loss: 0.6707, Val Acc: 0.6418
Epoch 6/10, Loss: 0.3394, Acc: 0.8562, Val Loss: 0.6222, Val Acc: 0.7114
Epoch 7/10, Loss: 0.2887, Acc: 0.8912, Val Loss: 0.6983, Val Acc: 0.6965
Epoch 8/10, Loss: 0.1973, Acc: 0.9313, Val Loss: 0.7719, Val Acc: 0.7264
Epoch 9/10, Loss: 0.1925, Acc: 0.9325, Val Loss: 0.7605, Val Acc: 0.6915
Epoch 10/10, Loss: 0.1425, Acc: 0.9513, Val Loss: 0.8277, Val Acc: 0.7313
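
The loop above multiplies each batch loss by the batch size before dividing by the dataset size. This matters when the last batch is smaller than the others, because a plain average of batch losses would over-weight it. The sketch below illustrates the difference with made-up batch losses, assuming 201 test images (which is what the logged accuracies suggest, for example 125/201 ≈ 0.6219) and a batch size of 20.

```python
batch_sizes = [20] * 10 + [1]      # 201 images in batches of 20
batch_losses = [0.6] * 10 + [3.0]  # hypothetical mean loss per batch

# Weighted by batch size, as in the training loop.
weighted_sum = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
per_sample_mean = weighted_sum / sum(batch_sizes)

# Unweighted average over batches, which lets the 1-image batch dominate.
per_batch_mean = sum(batch_losses) / len(batch_losses)

print(round(per_sample_mean, 4), round(per_batch_mean, 4))
```
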


### Question 3

What is the median of training accuracy for all the epochs for this model?

* 0.05
* 0.12
* 0.40
* 0.84  <--

In [13]:
training_acc_list = history['acc']
median_acc = np.median(training_acc_list)
print(f"Median Training Accuracy: {median_acc:.4f}")

Median Training Accuracy: 0.8269
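
With ten epochs there is no single middle value, so the median is the average of the 5th and 6th sorted accuracies. The sketch below uses only the standard library and the rounded values from the training log above, so its result can differ from the notebook's in the last digit.

```python
import statistics

# Training accuracies copied from the 10-epoch log (rounded to 4 decimals).
train_acc = [0.5950, 0.6737, 0.7100, 0.7588, 0.7975,
             0.8562, 0.8912, 0.9313, 0.9325, 0.9513]

ordered = sorted(train_acc)
middle_pair = ordered[4:6]  # 5th and 6th values
median_acc = statistics.median(train_acc)

print(middle_pair, round(median_acc, 3))
```
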


### Question 4

What is the standard deviation of training loss for all the epochs for this model?

* 0.007
* 0.078
* 0.171  <--
* 1.710

In [14]:
training_loss_list = history['loss']
std_loss = np.std(training_loss_list)
print(f"Standard Deviation of Training Loss: {std_loss:.4f}")

Standard Deviation of Training Loss: 0.1702
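
Note that `np.std` computes the population standard deviation by default (`ddof=0`). The sketch below reproduces it from the rounded losses in the training log using the standard library, and shows the sample standard deviation (`ddof=1`) for comparison.

```python
import statistics

# Training losses copied from the 10-epoch log (rounded to 4 decimals).
train_loss = [0.6633, 0.5809, 0.5369, 0.4701, 0.4208,
              0.3394, 0.2887, 0.1973, 0.1925, 0.1425]

population_std = statistics.pstdev(train_loss)  # corresponds to np.std(..., ddof=0)
sample_std = statistics.stdev(train_loss)       # corresponds to np.std(..., ddof=1)

print(round(population_std, 3), round(sample_std, 3))
```

Both values are closest to the 0.171 option, so the choice of `ddof` does not change the answer here.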


### Data Augmentation

For the next two questions, we'll generate more data using data augmentations.

Add the following augmentations to your training data generator:

```python
transforms.RandomRotation(50),
transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
transforms.RandomHorizontalFlip(),
```

In [17]:
new_train_transforms = transforms.Compose([
    transforms.RandomRotation(50),
    transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

train_dataset = ImageFolder(root='data/train', transform=new_train_transforms)

train_loader = DataLoader(
    train_dataset,
    batch_size=20,
    shuffle=True
)

In [19]:
additional_epochs = 10

for epoch in range(additional_epochs):
    model.train() # Set model to training mode
    running_loss = 0.0
    correct_train = 0
    total_train = 0

    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train

    # Append to the EXISTING history dictionary
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    # Validation Phase
    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0

    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(test_dataset)
    val_epoch_acc = correct_val / total_val

    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    # We add 11 to the print statement so it looks like "Epoch 11", "Epoch 12", etc.
    print(f"Epoch {epoch + 11}/{additional_epochs + 10}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

print("Extended training complete.")

Epoch 11/20, Loss: 0.5945, Acc: 0.6587, Val Loss: 0.5920, Val Acc: 0.6915
Epoch 12/20, Loss: 0.5524, Acc: 0.7238, Val Loss: 0.5768, Val Acc: 0.6567
Epoch 13/20, Loss: 0.5372, Acc: 0.7150, Val Loss: 0.5748, Val Acc: 0.7164
Epoch 14/20, Loss: 0.5273, Acc: 0.7200, Val Loss: 0.6611, Val Acc: 0.6468
Epoch 15/20, Loss: 0.5122, Acc: 0.7325, Val Loss: 0.6927, Val Acc: 0.6517
Epoch 16/20, Loss: 0.4967, Acc: 0.7588, Val Loss: 0.5635, Val Acc: 0.6866
Epoch 17/20, Loss: 0.4826, Acc: 0.7588, Val Loss: 0.6114, Val Acc: 0.7065
Epoch 18/20, Loss: 0.4997, Acc: 0.7588, Val Loss: 0.5788, Val Acc: 0.7313
Epoch 19/20, Loss: 0.4998, Acc: 0.7512, Val Loss: 0.5331, Val Acc: 0.7463
Epoch 20/20, Loss: 0.4655, Acc: 0.7712, Val Loss: 0.5722, Val Acc: 0.7065
Extended training complete.


### Question 5

Let's train our model for 10 more epochs using the same code as previously.

> Note: make sure you don't re-create the model. We want to continue training the model we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

* 0.008
* 0.08
* 0.88  <--
* 8.88


In [22]:
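# Keep only the 10 epochs trained with augmentations (epochs 11-20);
# history also holds the first 10 epochs trained without them.
aug_val_loss = history['val_loss'][-10:]
mean_aug_val_loss = np.mean(aug_val_loss)

print(f"Mean Test Loss (All epochs): {mean_aug_val_loss:.4f}")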

Mean Test Loss (All epochs): 0.5956
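
Because `history` keeps growing across both training runs, which epochs enter the mean depends on the slice. The sketch below uses the rounded validation losses from the two logs above to compare the mean over all 20 epochs with the mean over the 10 augmented epochs only.

```python
import statistics

# Validation losses copied from the logs (rounded to 4 decimals).
val_loss_no_aug = [0.6228, 0.6104, 0.5816, 0.6133, 0.6707,
                   0.6222, 0.6983, 0.7719, 0.7605, 0.8277]
val_loss_aug = [0.5920, 0.5768, 0.5748, 0.6611, 0.6927,
                0.5635, 0.6114, 0.5788, 0.5331, 0.5722]

val_loss_history = val_loss_no_aug + val_loss_aug

mean_all = statistics.mean(val_loss_history)        # epochs 1-20
mean_aug = statistics.mean(val_loss_history[-10:])  # epochs 11-20 only

print(round(mean_all, 4), round(mean_aug, 4))
```

The reported value of 0.5956 corresponds to the augmented epochs only.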


### Question 6

What's the average of test accuracy for the last 5 epochs (from 6 to 10) for the model trained with augmentations?

* 0.08
* 0.28
* 0.68 <--
* 0.98


In [23]:
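# The question asks for test accuracy, so average 'val_acc' (not 'val_loss')
# over the last 5 augmented epochs.
aug_val_acc = history['val_acc'][-5:]
mean_aug_val_acc = np.mean(aug_val_acc)

print(f"Mean Test Accuracy (last 5 epochs): {mean_aug_val_acc:.4f}")

Mean Test Accuracy (last 5 epochs): 0.7154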
