### <CENTER>ML ZOOMCAMP 2025 </CENTER>
### <CENTER>08 DEEP LEARNING - Homework</CENTER>
### <CENTER>ANGOLE DANIEL</CENTER>

Note: it's very likely that in this homework your answers won't match the options exactly. That's okay and expected. Select the option that's closest to your solution. If it's exactly in between two options, select the higher value.
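The selection rule above can be written down as a tiny helper. This is only a sketch: `closest_option` is a hypothetical name, not something the course provides, and exact ties are rare with floating-point results.

```python
def closest_option(value, options):
    """Return the option nearest to value; on an exact tie, prefer the higher one."""
    # Sort key: smaller distance first, then larger option first.
    return min(options, key=lambda opt: (abs(opt - value), -opt))


# An exact tie between 1 and 3 resolves to the higher option.
print(closest_option(2, [1, 3]))
# A value that is clearly nearer to one of the options.
print(closest_option(0.79, [0.05, 0.12, 0.40, 0.84]))
```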

#### Dataset

In this homework, we'll build a model for classifying various hair types. For this, we will use the Hair Type dataset that was obtained from Kaggle and slightly rebuilt.

You can download the target dataset for this homework from here:

wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
unzip data.zip



In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import torch
import torch.nn as nn
from torchsummary import summary
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

In [2]:
#  !wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip

In [3]:
# !unzip data.zip

In the lectures we saw how to use a pre-trained neural network. In the homework, we'll train a much smaller model from scratch.

We will use PyTorch for that.

You can use Google Colab or your own computer for that.

#### Data Preparation

The dataset contains around 1000 images of hair, split into separate folders for the training and test sets.

#### Reproducibility

Reproducibility in deep learning is a multifaceted challenge that requires attention to both software and hardware details. In some cases, we can't guarantee exactly the same results during the same experiment runs.

Therefore, in this homework we suggest setting the random seeds as follows:

```python
import numpy as np
import torch

SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)

if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```

Also, use PyTorch version 2.8.0 (that's the one in Colab).
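To see what seeding buys us, here is a small sketch that uses Python's standard-library `random` module as a stand-in for the NumPy and PyTorch generators. The helper `draw` is hypothetical and only illustrates the idea: the same seed gives the same sequence, a different seed gives a different one.

```python
import random


def draw(seed, n=3):
    """Seed a private generator and return n pseudo-random floats."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = draw(42)
same_b = draw(42)
other = draw(43)

print("same seed, same numbers:", same_a == same_b)
print("different seed, same numbers:", same_a == other)
```

Seeding fixes the random number streams only. As noted above, hardware and library details can still cause small differences between runs.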


In [4]:
train_gen = ImageDataGenerator()

train_ds = train_gen.flow_from_directory(
    './data/train',
    target_size=(200, 200),
    batch_size=32
)

test_gen = ImageDataGenerator()

test_ds = test_gen.flow_from_directory(
    './data/test',
    target_size=(200, 200),
    batch_size=32
)

SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)

if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

Found 800 images belonging to 2 classes.
Found 201 images belonging to 2 classes.


#### Model

For this homework we will use a Convolutional Neural Network (CNN). We'll use PyTorch.

You need to develop the model with the following structure:

- The shape for input should be (3, 200, 200) (channels first format in PyTorch)
- Next, create a convolutional layer (nn.Conv2d):
    - Use 32 filters (output channels)
    - Kernel size should be (3, 3) (that's the size of the filter)
    - Use 'relu' as activation
- Reduce the size of the feature map with max pooling (nn.MaxPool2d)
    - Set the pooling size to (2, 2)
- Turn the multi-dimensional result into vectors using flatten or view
- Next, add a nn.Linear layer with 64 neurons and 'relu' activation
- Finally, create the nn.Linear layer with 1 neuron - this will be the output
    - The output layer should have an activation - use the appropriate activation for the binary classification case

As the optimizer, use torch.optim.SGD with the following parameters:

- torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.8)

In [5]:
import torch
import torch.nn as nn

class HairTypeCNN(nn.Module):
    def __init__(self):
        super(HairTypeCNN, self).__init__()

        # Convolutional block
        self.conv = nn.Sequential(
            nn.Conv2d(
                in_channels=3,
                out_channels=32,
                kernel_size=3
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )

        # After conv: input (3,200,200)
        # Conv output: (32,198,198)
        # After pool: (32,99,99)
        flattened_size = 32 * 99 * 99

        # Fully connected layers
        self.fc = nn.Sequential(
            nn.Linear(flattened_size, 64),
            nn.ReLU(),
            nn.Linear(64, 1)   # binary output
        )

    def forward(self, x):
        x = self.conv(x)
        x = x.view(x.size(0), -1)  # flatten
        x = self.fc(x)
        return x  # logits → use BCEWithLogitsLoss


# Instantiate the model
model = HairTypeCNN()

# Print summary
print(model)

HairTypeCNN(
  (conv): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Sequential(
    (0): Linear(in_features=313632, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=1, bias=True)
  )
)


In [6]:
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.8)

#### Question 1

Which loss function will you use?

- nn.MSELoss()
- nn.BCEWithLogitsLoss()
- nn.CrossEntropyLoss()
- nn.CosineEmbeddingLoss()

(Multiple answers can be correct, so pick any.)
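The model in this notebook returns raw logits, which is why the loss is paired with a sigmoid. `nn.BCEWithLogitsLoss` combines the sigmoid and the binary cross-entropy in one step for better numerical stability. The sketch below shows the underlying math in plain Python. The function names are hypothetical, and the stable formula is a standard rearrangement, not necessarily PyTorch's exact implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_on_probability(p, y):
    """Binary cross-entropy on a probability p for target y (0 or 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_with_logits(z, y):
    """Same loss computed directly from the logit z, in a numerically stable form."""
    # max(z, 0) - z*y + log(1 + exp(-|z|))
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


for z, y in [(-2.0, 0.0), (0.5, 1.0), (3.0, 1.0)]:
    naive = bce_on_probability(sigmoid(z), y)
    stable = bce_with_logits(z, y)
    print(f"logit={z:+.1f} target={y:.0f} naive={naive:.6f} stable={stable:.6f}")
```

For moderate logits the two versions agree. For very large logits the sigmoid saturates to exactly 0 or 1 and the naive version hits `log(0)`, while the logit-based form stays finite.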

#### Question 2

What's the total number of parameters of the model? You can use torchsummary or count manually.

In PyTorch, you can find the total number of parameters using:

```python
# Option 1: Using torchsummary (install with: pip install torchsummary)
from torchsummary import summary
summary(model, input_size=(3, 200, 200))

# Option 2: Manual counting
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params}")
```
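If you prefer to count manually, the arithmetic can be sketched in plain Python. It assumes the layer settings listed in the Model section (3x3 kernel, no padding, stride 1, then 2x2 max pooling); `conv_out` is a hypothetical helper for the usual output-size formula.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


in_channels, side = 3, 200      # input is (3, 200, 200), square images
filters, kernel = 32, 3

conv_side = conv_out(side, kernel)            # after nn.Conv2d
pool_side = conv_out(conv_side, 2, stride=2)  # after nn.MaxPool2d(2)
flattened = filters * pool_side * pool_side   # input size of the first nn.Linear

# Each layer has weights plus one bias per output unit.
conv_params = filters * (in_channels * kernel * kernel) + filters
fc1_params = flattened * 64 + 64
fc2_params = 64 * 1 + 1
total = conv_params + fc1_params + fc2_params

print("spatial sizes:", conv_side, pool_side, "flattened:", flattened)
print("params:", conv_params, fc1_params, fc2_params, "total:", total)
```

Almost all parameters sit in the first linear layer, because it connects every one of the flattened feature-map values to 64 neurons.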

In [7]:
summary(model, input_size=(3, 200, 200))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 32, 198, 198]             896
              ReLU-2         [-1, 32, 198, 198]               0
         MaxPool2d-3           [-1, 32, 99, 99]               0
            Linear-4                   [-1, 64]      20,072,512
              ReLU-5                   [-1, 64]               0
            Linear-6                    [-1, 1]              65
Total params: 20,073,473
Trainable params: 20,073,473
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.46
Forward/backward pass size (MB): 21.54
Params size (MB): 76.57
Estimated Total Size (MB): 98.57
----------------------------------------------------------------


In [8]:
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params}")

Total parameters: 20073473


#### Generators and Training

For the next two questions, use the following transformation for both train and test sets:

```python
train_transforms = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ) # ImageNet normalization
])
```

- We don't need to do any additional pre-processing for the images.
- Use batch_size=20
- Use shuffle=True for training, but False for test.
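To make the transformation concrete: `ToTensor` scales 8-bit pixel values to the range [0, 1], and `Normalize` then computes `(x - mean) / std` for each channel. Below is a minimal sketch of that arithmetic for a single RGB pixel, with hypothetical helper names and plain lists instead of tensors.

```python
MEAN = [0.485, 0.456, 0.406]  # per-channel mean (R, G, B)
STD = [0.229, 0.224, 0.225]   # per-channel standard deviation


def to_unit_range(pixel_rgb):
    """Scale 0..255 integer channel values to 0.0..1.0 floats."""
    return [c / 255.0 for c in pixel_rgb]


def normalize(pixel_unit):
    """Apply (x - mean) / std channel by channel."""
    return [(c - m) / s for c, m, s in zip(pixel_unit, MEAN, STD)]


black = normalize(to_unit_range([0, 0, 0]))
white = normalize(to_unit_range([255, 255, 255]))
print("black:", [round(v, 3) for v in black])
print("white:", [round(v, 3) for v in white])
```

After normalization the inputs are no longer confined to [0, 1]: a pixel equal to the channel mean maps to 0, darker pixels become negative and brighter ones positive.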

Now fit the model.

You can use this code:

```python

num_epochs = 10
history = {'acc': [], 'loss': [], 'val_acc': [], 'val_loss': []}

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1) # Ensure labels are float and have shape (batch_size, 1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        # For binary classification with BCEWithLogitsLoss, apply sigmoid to outputs before thresholding for accuracy
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in validation_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(validation_dataset)
    val_epoch_acc = correct_val / total_val
    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")
```
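One detail of the loop above is worth a closer look: `loss.item()` is the mean loss of a batch (with the default `reduction='mean'`), so it is multiplied by the batch size before being summed and finally divided by the dataset size. This matters when the last batch is smaller than the rest. Here is a sketch with made-up per-sample losses; `batches` is a hypothetical helper.

```python
def batches(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up per-sample losses; with batch size 2 the last batch has one sample.
sample_losses = [0.2, 0.4, 0.6, 0.8, 2.0]

running_loss = 0.0
batch_means = []
for batch in batches(sample_losses, 2):
    batch_mean = sum(batch) / len(batch)     # plays the role of loss.item()
    running_loss += batch_mean * len(batch)  # same pattern as loss.item() * images.size(0)
    batch_means.append(batch_mean)

weighted = running_loss / len(sample_losses)      # what the training loop reports
unweighted = sum(batch_means) / len(batch_means)  # naive mean of batch means

print(f"weighted by batch size: {weighted:.4f}")
print(f"mean of batch means:    {unweighted:.4f}")
```

The weighted version equals the true mean over all samples, while the naive one gives the small last batch too much influence.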

In [9]:
train_transforms = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

test_transforms = train_transforms

train_dataset = datasets.ImageFolder("data/train", transform=train_transforms)
test_dataset = datasets.ImageFolder("data/test", transform=test_transforms)

train_loader = DataLoader(train_dataset, batch_size=20, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=20, shuffle=False)

len(train_dataset), len(test_dataset)

(800, 201)

In [10]:
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

num_epochs = 10
history = {'acc': [], 'loss': [], 'val_acc': [], 'val_loss': []}

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1) # Ensure labels are float and have shape (batch_size, 1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        # For binary classification with BCEWithLogitsLoss, apply sigmoid to outputs before thresholding for accuracy
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(test_dataset)
    val_epoch_acc = correct_val / total_val
    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

Epoch 1/10, Loss: 0.6665, Acc: 0.6112, Val Loss: 0.6511, Val Acc: 0.6617
Epoch 2/10, Loss: 0.5702, Acc: 0.6787, Val Loss: 0.6332, Val Acc: 0.6318
Epoch 3/10, Loss: 0.5207, Acc: 0.7350, Val Loss: 0.6143, Val Acc: 0.6766
Epoch 4/10, Loss: 0.4773, Acc: 0.7600, Val Loss: 0.6049, Val Acc: 0.6617
Epoch 5/10, Loss: 0.4606, Acc: 0.7550, Val Loss: 0.7307, Val Acc: 0.5672
Epoch 6/10, Loss: 0.3954, Acc: 0.8275, Val Loss: 0.6412, Val Acc: 0.6866
Epoch 7/10, Loss: 0.2844, Acc: 0.8838, Val Loss: 0.8307, Val Acc: 0.6816
Epoch 8/10, Loss: 0.2885, Acc: 0.8788, Val Loss: 0.7052, Val Acc: 0.7114
Epoch 9/10, Loss: 0.1882, Acc: 0.9313, Val Loss: 0.9275, Val Acc: 0.6866
Epoch 10/10, Loss: 0.2585, Acc: 0.8912, Val Loss: 0.8158, Val Acc: 0.6915


#### Question 3

What is the median of training accuracy for all the epochs for this model?

- 0.05
- 0.12
- 0.40
- 0.84


In [11]:
median_train_acc = np.median(history['acc'])
median_train_acc

np.float64(0.79375)

#### Question 4

What is the standard deviation of training loss for all the epochs for this model?

- 0.007
- 0.078
- 0.171
- 1.710


In [12]:
std_train_loss = np.std(history['loss'])
std_train_loss

np.float64(0.14617721532456407)
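Note that `np.std` computes the population standard deviation by default (`ddof=0`). As a cross-check, the sketch below recomputes it by hand from the training losses printed in the epoch log above (rounded to four decimals, so the result matches only approximately), and shows the sample version (`ddof=1`) for comparison.

```python
import math

# Training losses copied from the epoch log above (rounded to 4 decimals).
losses = [0.6665, 0.5702, 0.5207, 0.4773, 0.4606,
          0.3954, 0.2844, 0.2885, 0.1882, 0.2585]

mean_loss = sum(losses) / len(losses)
squared_dev = sum((x - mean_loss) ** 2 for x in losses)

population_std = math.sqrt(squared_dev / len(losses))    # ddof=0, np.std default
sample_std = math.sqrt(squared_dev / (len(losses) - 1))  # ddof=1

print(f"population std: {population_std:.4f}")
print(f"sample std:     {sample_std:.4f}")
```

With only 10 epochs the two definitions differ noticeably, so it helps to know which one a given library call uses.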

#### Data Augmentation

For the next two questions, we'll generate more data using data augmentations.

Add the following augmentations to your training data generator:

```python
transforms.RandomRotation(50),
transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
transforms.RandomHorizontalFlip(),
```
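To get a feeling for what a random augmentation does, here is a toy version of a horizontal flip on a tiny "image" stored as a nested list. It is a simplified illustration, not torchvision's implementation; the function below only mirrors the idea of flipping with probability `p` (0.5 is the default for `transforms.RandomHorizontalFlip`).

```python
import random


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror each row left-to-right with probability p; otherwise return a copy."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
flipped = [[3, 2, 1],
           [6, 5, 4]]

rng = random.Random(42)
variants = [random_horizontal_flip(image, rng) for _ in range(4)]
for variant in variants:
    print(variant)
```

Because the decision is made again every time an image is loaded, the model sees slightly different versions of the same training image in different epochs.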

In [13]:
augmented_transforms = transforms.Compose([
    transforms.RandomRotation(50),
    transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

train_dataset_aug = datasets.ImageFolder("data/train", transform=augmented_transforms)

train_loader_aug = DataLoader(train_dataset_aug, batch_size=20, shuffle=True)

#### Question 5

Let's train our model for 10 more epochs using the same code as previously.

Note: make sure you don't re-create the model. We want to continue training the model we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

- 0.008
- 0.08
- 0.88
- 8.88

In [14]:
num_epochs_aug = 10

aug_history = {'val_loss': [], 'val_acc': []}

for epoch in range(num_epochs_aug):
    model.train()
    running_loss = 0.0

    for images, labels in train_loader_aug:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

    # Validation after each epoch
    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0

    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            correct_val += (predicted == labels).sum().item()
            total_val += labels.size(0)

    val_epoch_loss = val_running_loss / len(test_dataset)
    val_epoch_acc = correct_val / total_val

    aug_history['val_loss'].append(val_epoch_loss)
    aug_history['val_acc'].append(val_epoch_acc)

    print(f"[AUG] Epoch {epoch+1}/{num_epochs_aug}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

[AUG] Epoch 1/10, Val Loss: 0.5963, Val Acc: 0.7114
[AUG] Epoch 2/10, Val Loss: 0.5968, Val Acc: 0.7214
[AUG] Epoch 3/10, Val Loss: 0.5870, Val Acc: 0.7015
[AUG] Epoch 4/10, Val Loss: 0.5696, Val Acc: 0.7264
[AUG] Epoch 5/10, Val Loss: 0.6706, Val Acc: 0.6766
[AUG] Epoch 6/10, Val Loss: 0.5673, Val Acc: 0.7413
[AUG] Epoch 7/10, Val Loss: 0.5685, Val Acc: 0.6915
[AUG] Epoch 8/10, Val Loss: 0.5941, Val Acc: 0.7015
[AUG] Epoch 9/10, Val Loss: 0.5517, Val Acc: 0.7363
[AUG] Epoch 10/10, Val Loss: 0.5401, Val Acc: 0.7164


In [15]:
mean_test_loss = np.mean(aug_history['val_loss'])
mean_test_loss

np.float64(0.5841957455399025)

#### Question 6

What's the average of test accuracy for the last 5 epochs (from 6 to 10) for the model trained with augmentations?

- 0.08
- 0.28
- 0.68
- 0.98

In [16]:
last5_acc = aug_history['val_acc'][5:]
mean_last5_acc = np.mean(last5_acc)
mean_last5_acc

np.float64(0.7174129353233829)
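As a sanity check on the indexing, note that epochs are numbered from 1 in the log but lists are indexed from 0, so epochs 6 to 10 correspond to the slice `[5:]`. The sketch below recomputes both augmentation answers from the values printed in the `[AUG]` log above (rounded to four decimals, so the results are approximate).

```python
# Test-set metrics copied from the [AUG] log above (rounded to 4 decimals).
val_loss = [0.5963, 0.5968, 0.5870, 0.5696, 0.6706,
            0.5673, 0.5685, 0.5941, 0.5517, 0.5401]
val_acc = [0.7114, 0.7214, 0.7015, 0.7264, 0.6766,
           0.7413, 0.6915, 0.7015, 0.7363, 0.7164]

mean_test_loss = sum(val_loss) / len(val_loss)

last5_acc = val_acc[5:]  # index 5 is epoch 6, so this covers epochs 6..10
mean_last5_acc = sum(last5_acc) / len(last5_acc)

print(f"mean test loss (10 epochs): {mean_test_loss:.4f}")
print(f"mean test accuracy (epochs 6-10): {mean_last5_acc:.4f}")
```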