## Homework

> **Note**: it's very likely that in this homework your answers won't match
> the options exactly. That's okay and expected. Select the option that's
> closest to your solution.
> If it's exactly in between two options, select the higher value.

### Dataset

In this homework, we'll build a model for classifying various hair types.
For this, we will use the Hair Type dataset that was obtained from
[Kaggle](https://www.kaggle.com/datasets/kavyasreeb/hair-type-dataset)
and slightly rebuilt.

You can download the target dataset for this homework from
[here](https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip):

```bash
wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
unzip data.zip
```

In [1]:
#!wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
#!mv data.zip data

In [2]:
#! unzip ./data/data.zip


In the lectures we saw how to use a pre-trained neural network. In the homework, we'll train a much smaller model from scratch.

We will use PyTorch for that.

You can use Google Colab or your own computer for that.

### Data Preparation

The dataset contains around 1000 images of hair, stored in separate folders
for the training and test sets.
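
To check the split sizes after unzipping, the images in each class folder can be counted. This is a minimal sketch that assumes the layout `data/train/<class>/` and `data/test/<class>/`; the helper name `count_images` and the list of file extensions are illustrative choices, not part of the homework.

```python
from pathlib import Path


def count_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_name: number_of_images} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in extensions
            )
    return counts


if __name__ == "__main__":
    # Assumed folder layout: data/train and data/test, one subfolder per class.
    for split in ("train", "test"):
        split_dir = Path("data") / split
        if split_dir.exists():
            print(split, count_images(split_dir))
```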

### Reproducibility

Reproducibility in deep learning is a multifaceted challenge that requires attention
to both software and hardware details. In some cases, we can't guarantee exactly the same results across runs of the same experiment.

Therefore, in this homework we suggest setting the random seeds as follows:

In [9]:
import numpy as np
import torch

SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)

if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False


Also, use PyTorch version 2.8.0 (that's the one in Colab).

In [10]:
torch.__version__

'2.8.0'

In [11]:
import sys

print("=" * 50)
print("Framework GPU Verification")
print("=" * 50)

# System and Python info
print(f"Python version: {sys.version}")
print()

# PyTorch Info
print("\nPYTORCH:")
print(f"  Version: {torch.__version__}")
print(f"  CUDA Available: {torch.cuda.is_available()}")
print(f"  CUDA Version: {torch.version.cuda}")
if torch.cuda.is_available():
    print(f"  GPU Device: {torch.cuda.get_device_name(0)}")
    print(f"  GPU Count: {torch.cuda.device_count()}")

    # Test PyTorch GPU computation
    device = torch.device("cuda")
    x = torch.randn(3, 3).to(device)
    y = torch.randn(3, 3).to(device)
    z = x + y
    print(f"  GPU Test: Computation successful on {z.device}")
else:
    print("  GPU Test: Using CPU")

Framework GPU Verification
Python version: 3.11.13 (main, Jun  5 2025, 08:21:08) [Clang 14.0.6 ]


PYTORCH:
  Version: 2.8.0
  CUDA Available: False
  CUDA Version: None
  GPU Test: Using CPU



### Model

For this homework we will use a Convolutional Neural Network (CNN). We'll use PyTorch.

You need to develop the model with the following structure:

* The shape for input should be `(3, 200, 200)` (channels first format in PyTorch)
* Next, create a convolutional layer (`nn.Conv2d`):
    * Use 32 filters (output channels)
    * Kernel size should be `(3, 3)` (that's the size of the filter)
    * Use `'relu'` as activation
* Reduce the size of the feature map with max pooling (`nn.MaxPool2d`)
    * Set the pooling size to `(2, 2)`
* Turn the multi-dimensional result into vectors using `flatten` or `view`
* Next, add a `nn.Linear` layer with 64 neurons and `'relu'` activation
* Finally, create the `nn.Linear` layer with 1 neuron - this will be the output
    * The output layer should have an activation - use the appropriate activation for the binary classification case

As the optimizer, use `torch.optim.SGD` with the following parameters:

* `torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.8)`
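
The number of features that reach the first `nn.Linear` layer follows from the layer structure listed above. The sketch below works through that arithmetic in plain Python, assuming the PyTorch defaults (stride 1 and no padding for `nn.Conv2d`, stride equal to the kernel size for `nn.MaxPool2d`); the helper names `conv2d_out` and `maxpool2d_out` are made up for illustration.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def maxpool2d_out(size, kernel, stride=None):
    """Spatial output size of max pooling; stride defaults to the kernel size."""
    stride = kernel if stride is None else stride
    return (size - kernel) // stride + 1


side = conv2d_out(200, kernel=3)      # 200 -> 198
side = maxpool2d_out(side, kernel=2)  # 198 -> 99
flat_features = 32 * side * side      # 32 channels * 99 * 99 = 313632

print(side, flat_features)
```

The resulting value is what the first linear layer needs as its `in_features`.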

In [12]:
import torch
import torch.nn as nn

class HomeworkCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # 1) The shape for input should be (3, 200, 200)
            #    -> in_channels=3 means the model expects inputs with 3 channels
            # 2) Next, create a convolutional layer (nn.Conv2d):
            # 3) Use 32 filters (output channels)
            # 4) Kernel size should be (3, 3)
            nn.Conv2d(3, 32, (3, 3)),
            # 5) Use 'relu' as activation
            nn.ReLU(),
            # 6) Reduce the size of the feature map with max pooling (nn.MaxPool2d)
            # 7) Set the pooling size to (2, 2)
            nn.MaxPool2d((2, 2)),
            # 8) Turn the multi-dimensional result into vectors using flatten
            nn.Flatten(),
            # 9) Next, add a nn.Linear layer with 64 neurons and 'relu' activation
            nn.Linear(32 * 99 * 99, 64),
            nn.ReLU(),
            # 10) Finally, create the nn.Linear layer with 1 neuron - this will be the output
            nn.Linear(64, 1),
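            # 11) The output layer should have an activation - appropriate for binary classification
            #     Note: with nn.Sigmoid() here the model returns probabilities, which pairs with
            #     nn.BCELoss(); nn.BCEWithLogitsLoss() expects raw logits and applies its own sigmoid.
            nn.Sigmoid()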
        )

    def forward(self, x):
        return self.net(x)

# 12) As optimizer use torch.optim.SGD with the given parameters
model = HomeworkCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.8)


In [13]:
# take the images and resize
import os
from PIL import Image
from torch.utils.data import Dataset

class ClothingDataset(Dataset):
    def __init__(self, data_dir, transform=None):
        self.data_dir = data_dir
        self.transform = transform
        self.image_paths = []
        self.labels = []
        self.classes = sorted(os.listdir(data_dir))
        self.class_to_idx = {cls: i for i, cls in enumerate(self.classes)}

        for label_name in self.classes:
            label_dir = os.path.join(data_dir, label_name)
            for img_name in os.listdir(label_dir):
                self.image_paths.append(os.path.join(label_dir, img_name))
                self.labels.append(self.class_to_idx[label_name])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        image = Image.open(img_path).convert('RGB')
        label = self.labels[idx]

        if self.transform:
            image = self.transform(image)

        return image, label



### Question 1

Which loss function will you use?

* `nn.MSELoss()`
* `nn.BCEWithLogitsLoss()` <--
* `nn.CrossEntropyLoss()`
* `nn.CosineEmbeddingLoss()`

(Multiple answers can be correct, so pick any)
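
`nn.BCEWithLogitsLoss()` combines a sigmoid with binary cross-entropy, so it expects raw logits rather than probabilities. The plain-Python sketch below (no PyTorch needed; the function names are illustrative) shows that it matches binary cross-entropy on `sigmoid(logit)`, and what changes if a probability is passed in where a logit is expected.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y):
    """Binary cross-entropy on a probability p for a target y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_with_logits(z, y):
    """Binary cross-entropy on a raw logit z (numerically stable form)."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


logit, target = 0.8, 1.0
loss_from_logit = bce_with_logits(logit, target)
loss_from_prob = bce(sigmoid(logit), target)

# Feeding a probability where a logit is expected applies the sigmoid twice.
loss_double_sigmoid = bce_with_logits(sigmoid(logit), target)

print(loss_from_logit, loss_from_prob, loss_double_sigmoid)
```

The first two values agree; the third differs because the sigmoid is applied twice, which squeezes the predicted probability into the range (0.5, 0.73).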


### Question 2

What's the total number of parameters of the model? You can use `torchsummary` or count manually.

In PyTorch, you can find the total number of parameters using:

```python
# Option 1: Using torchsummary

In [15]:
from torchsummary import summary
summary(model, input_size=(3, 200, 200))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 32, 198, 198]             896
              ReLU-2         [-1, 32, 198, 198]               0
         MaxPool2d-3           [-1, 32, 99, 99]               0
           Flatten-4               [-1, 313632]               0
            Linear-5                   [-1, 64]      20,072,512
              ReLU-6                   [-1, 64]               0
            Linear-7                    [-1, 1]              65
           Sigmoid-8                    [-1, 1]               0
Total params: 20,073,473
Trainable params: 20,073,473
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.46
Forward/backward pass size (MB): 23.93
Params size (MB): 76.57
Estimated Total Size (MB): 100.96
----------------------------------------------------------------



# Option 2: Manual counting
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params}")
```

* 896
* 11214912
* 15896912
* 20073473 <-

### Generators and Training

For the next two questions, use the following transformation for both train and test sets:


In [22]:
from torchvision import transforms


test_transforms = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.RandomRotation(50),
    transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
    transforms.RandomHorizontalFlip(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ) # ImageNet normalization
])


In [23]:
from torch.utils.data import DataLoader

train_dataset = ClothingDataset(
    data_dir='./data/train',
    transform=test_transforms
)

test_dataset = ClothingDataset(
    data_dir='./data/test',
    transform=test_transforms
)

train_loader = DataLoader(train_dataset, batch_size=20, shuffle=True)
val_loader = DataLoader(test_dataset, batch_size=20, shuffle=False)



* We don't need to do any additional pre-processing for the images.
* Use `batch_size=20`
* Use `shuffle=True` for training, but `False` for test.

Now fit the model.

You can use this code:

In [24]:
criterion = nn.BCEWithLogitsLoss()
num_epochs = 10
history = {'acc': [], 'loss': [], 'val_acc': [], 'val_loss': []}
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)  # keep the model on the same device as the input batches


for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1) # Ensure labels are float and have shape (batch_size, 1)

        optimizer.zero_grad()
        # forward pass
        outputs = model(images)
        # calculate the loss
        loss = criterion(outputs, labels)
        # backward pass and optimize
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        # For binary classification with BCEWithLogitsLoss, apply sigmoid to outputs before thresholding for accuracy
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(test_dataset)
    val_epoch_acc = correct_val / total_val
    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

Epoch 1/10, Loss: 0.6586, Acc: 0.5169, Val Loss: 0.6476, Val Acc: 0.5124
Epoch 2/10, Loss: 0.6510, Acc: 0.5318, Val Loss: 0.6738, Val Acc: 0.5920
Epoch 3/10, Loss: 0.6534, Acc: 0.5493, Val Loss: 0.6386, Val Acc: 0.5323
Epoch 4/10, Loss: 0.6458, Acc: 0.5843, Val Loss: 0.6440, Val Acc: 0.6020
Epoch 5/10, Loss: 0.6448, Acc: 0.6117, Val Loss: 0.6602, Val Acc: 0.5373
Epoch 6/10, Loss: 0.6414, Acc: 0.5618, Val Loss: 0.6556, Val Acc: 0.6219
Epoch 7/10, Loss: 0.6920, Acc: 0.6355, Val Loss: 0.6928, Val Acc: 0.6368
Epoch 8/10, Loss: 0.6931, Acc: 0.6017, Val Loss: 0.6931, Val Acc: 0.6816
Epoch 9/10, Loss: 0.6928, Acc: 0.6342, Val Loss: 0.6927, Val Acc: 0.6716
Epoch 10/10, Loss: 0.6846, Acc: 0.6816, Val Loss: 0.6622, Val Acc: 0.6219


```
results for questions 3 and 4:
Epoch 1/10, Loss: 0.5982, Acc: 0.4894, Val Loss: 0.6682, Val Acc: 0.4876
Epoch 2/10, Loss: 0.5910, Acc: 0.4881, Val Loss: 0.6507, Val Acc: 0.4876
Epoch 3/10, Loss: 0.5925, Acc: 0.4894, Val Loss: 0.6727, Val Acc: 0.4876
Epoch 4/10, Loss: 0.5866, Acc: 0.4869, Val Loss: 0.6502, Val Acc: 0.4975
Epoch 5/10, Loss: 0.5807, Acc: 0.4894, Val Loss: 0.6456, Val Acc: 0.5025
Epoch 6/10, Loss: 0.6470, Acc: 0.6454, Val Loss: 0.6615, Val Acc: 0.6119
Epoch 7/10, Loss: 0.6231, Acc: 0.5880, Val Loss: 0.6477, Val Acc: 0.5423
Epoch 8/10, Loss: 0.5847, Acc: 0.5094, Val Loss: 0.6442, Val Acc: 0.5025
Epoch 9/10, Loss: 0.5738, Acc: 0.4931, Val Loss: 0.6542, Val Acc: 0.4925
Epoch 10/10, Loss: 0.5685, Acc: 0.4931, Val Loss: 0.6544, Val Acc: 0.4975

results for questions 5 and 6:
Epoch 1/10, Loss: 0.6586, Acc: 0.5169, Val Loss: 0.6476, Val Acc: 0.5124
Epoch 2/10, Loss: 0.6510, Acc: 0.5318, Val Loss: 0.6738, Val Acc: 0.5920
Epoch 3/10, Loss: 0.6534, Acc: 0.5493, Val Loss: 0.6386, Val Acc: 0.5323
Epoch 4/10, Loss: 0.6458, Acc: 0.5843, Val Loss: 0.6440, Val Acc: 0.6020
Epoch 5/10, Loss: 0.6448, Acc: 0.6117, Val Loss: 0.6602, Val Acc: 0.5373
Epoch 6/10, Loss: 0.6414, Acc: 0.5618, Val Loss: 0.6556, Val Acc: 0.6219
Epoch 7/10, Loss: 0.6920, Acc: 0.6355, Val Loss: 0.6928, Val Acc: 0.6368
Epoch 8/10, Loss: 0.6931, Acc: 0.6017, Val Loss: 0.6931, Val Acc: 0.6816
Epoch 9/10, Loss: 0.6928, Acc: 0.6342, Val Loss: 0.6927, Val Acc: 0.6716
Epoch 10/10, Loss: 0.6846, Acc: 0.6816, Val Loss: 0.6622, Val Acc: 0.6219
```




### Question 3

What is the median of training accuracy for all the epochs for this model?

* 0.05
* 0.12
* 0.40 <-
* 0.84

0.4869
0.4881
0.4894
0.4894
0.4894 <- middle
0.4931 <- middle
0.4931
0.5094
0.5880
0.6454

(0.4894+0.4931)/2 = 0.49125
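
The same value can be obtained without sorting by hand. A short check using the standard library, with the training accuracies copied from the log for questions 3 and 4:

```python
from statistics import median

# Training accuracy per epoch, in epoch order.
train_acc = [0.4894, 0.4881, 0.4894, 0.4869, 0.4894,
             0.6454, 0.5880, 0.5094, 0.4931, 0.4931]

# With an even number of values, the median is the mean of the two middle ones.
median_acc = median(train_acc)
print(round(median_acc, 5))  # approximately 0.49125
```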

### Question 4

What is the standard deviation of training loss for all the epochs for this model?

* 0.007 <-
* 0.078
* 0.171
* 1.710

In [25]:

train_losses = np.array([
    0.5982,
    0.5910,
    0.5925,
    0.5866,
    0.5807,
    0.6470,
    0.6231,
    0.5847,
    0.5738,
    0.5685
])

train_losses.std()

np.float64(0.022488683820979835)
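
Note that `ndarray.std()` computes the population standard deviation by default (`ddof=0`). The sketch below compares it with the sample standard deviation using only the standard library; for ten epochs the two differ slightly, but not enough to change which option is closest.

```python
from statistics import pstdev, stdev

# Training loss per epoch, copied from the log for questions 3 and 4.
train_losses = [0.5982, 0.5910, 0.5925, 0.5866, 0.5807,
                0.6470, 0.6231, 0.5847, 0.5738, 0.5685]

population_std = pstdev(train_losses)  # divides by n, like np.std with ddof=0
sample_std = stdev(train_losses)       # divides by n - 1, like np.std with ddof=1

print(f"population: {population_std:.4f}, sample: {sample_std:.4f}")
```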


### Data Augmentation

For the next two questions, we'll generate more data using data augmentations.

Add the following augmentations to your training data generator:

```python
transforms.RandomRotation(50),
transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
transforms.RandomHorizontalFlip(),
```

### Question 5

Let's train our model for 10 more epochs using the same code as previously.

> **Note:** Make sure you don't re-create the model.
> We want to continue training the model we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

* 0.008
* 0.08
* 0.88 <-
* 8.88

0.6476 +
0.6738 +
0.6386 +
0.6440 +
0.6602 +
0.6556 +
0.6928 +
0.6931 +
0.6927 +
0.6622
= 6.6606/10 = 0.66606
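
Adding ten numbers by hand is error-prone, so the sum and the mean are worth checking in code. A minimal check using the validation losses from the log for questions 5 and 6:

```python
from statistics import fmean

# Validation (test) loss per epoch for the run with augmentations.
val_losses = [0.6476, 0.6738, 0.6386, 0.6440, 0.6602,
              0.6556, 0.6928, 0.6931, 0.6927, 0.6622]

total = sum(val_losses)
mean_loss = fmean(val_losses)

print(f"sum = {total:.4f}, mean = {mean_loss:.5f}")
```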


### Question 6

What's the average of test accuracy for the last 5 epochs (from 6 to 10)
for the model trained with augmentations?

* 0.08
* 0.28
* 0.68  <-
* 0.98

0.6219 +
0.6368 +
0.6816 +
0.6716 +
0.6219
= 3.2338/5 = 0.64676
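
When the metrics are stored in the `history` dictionary from the training loop, the last five epochs can be selected with a slice instead of being copied by hand. A sketch, with the validation accuracies from the log for questions 5 and 6 filled in manually:

```python
# Same structure as the history dict in the training loop; values copied from the log.
history = {
    "val_acc": [0.5124, 0.5920, 0.5323, 0.6020, 0.5373,
                0.6219, 0.6368, 0.6816, 0.6716, 0.6219],
}

last_five = history["val_acc"][-5:]  # epochs 6 to 10
avg_last_five = sum(last_five) / len(last_five)

print(round(avg_last_five, 5))  # approximately 0.64676
```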




## Submit the results

* Submit your results here: https://courses.datatalks.club/ml-zoomcamp-2025/homework/hw08
* If your answer doesn't match options exactly, select the closest one. If the answer is exactly in between two options, select the higher value.
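
The selection rule above can be written as a small helper. This is only an illustrative sketch (the function name `closest_option` is made up): it picks the option with the smallest absolute distance and breaks exact ties in favor of the higher value.

```python
def closest_option(value, options):
    """Return the option nearest to value; on an exact tie, prefer the higher one."""
    # Sort key: smaller distance first, then larger option first.
    return min(options, key=lambda opt: (abs(opt - value), -opt))


print(closest_option(0.49125, [0.05, 0.12, 0.40, 0.84]))    # 0.4
print(closest_option(0.0225, [0.007, 0.078, 0.171, 1.710]))  # 0.007
print(closest_option(0.5, [0.0, 1.0]))                       # 1.0 (tie -> higher)
```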