In [2]:
!wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip


--2025-12-02 06:28:23--  https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
Resolving github.com (github.com)... 20.27.177.113
Connecting to github.com (github.com)|20.27.177.113|:443... connected.
HTTP request sent, awaiting response... 302 Found

In [1]:
import numpy as np
import torch

SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)

if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

## CNN Model

In [2]:
import torch
import torch.nn as nn

class HairCNN(nn.Module):
    def __init__(self):
        super(HairCNN, self).__init__()

        # Conv layer: 32 filters, kernel 3x3, padding=0, stride=1
        self.conv = nn.Conv2d(
            in_channels=3,
            out_channels=32,
            kernel_size=3,
            stride=1,
            padding=0
        )

        # MaxPool 2x2
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

        # Fully connected layers
        # Input shape: (3, 200, 200)
        # After conv (no padding): -> (32, 198, 198)
        # After 2×2 maxpool: -> (32, 99, 99)
        flatten_dim = 32 * 99 * 99

        self.fc1 = nn.Linear(flatten_dim, 64)  # 64 neurons
        self.fc2 = nn.Linear(64, 1)             # output layer (1 neuron)
        self.sigmoid = nn.Sigmoid()             # binary classification

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = self.pool(x)
        x = torch.flatten(x, 1)   # or x.view(x.size(0), -1)
        x = torch.relu(self.fc1(x))
        # Return raw logits: nn.BCEWithLogitsLoss applies the sigmoid internally,
        # so self.sigmoid is intentionally not used here.
        x = self.fc2(x)
        return x
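
The shape comments in `__init__` follow from the standard output-size formula for convolution and pooling layers (without dilation). A minimal sketch that reproduces `flatten_dim`; the helper name `conv2d_out` is made up for this illustration and is not part of PyTorch:

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

side = 200                                                      # input is (3, 200, 200)
after_conv = conv2d_out(side, kernel=3, stride=1, padding=0)    # 3x3 conv, no padding
after_pool = conv2d_out(after_conv, kernel=2, stride=2)         # 2x2 max pool
flatten_dim = 32 * after_pool * after_pool                      # 32 output channels

print(after_conv, after_pool, flatten_dim)
```

If the input resolution or kernel size changes, recomputing `flatten_dim` this way avoids a shape mismatch in `fc1`.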


## Optimizer

### Question 1
Which loss function will you use?

- `nn.MSELoss()`
- `nn.BCEWithLogitsLoss()`
- `nn.CrossEntropyLoss()`
- `nn.CosineEmbeddingLoss()`

(Multiple answers can be correct, so pick any.)

In [3]:
model = HairCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.002, momentum=0.8)
criterion = nn.BCEWithLogitsLoss()  # binary classification; the loss applies sigmoid to the raw logits
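
`nn.BCEWithLogitsLoss` fits here because the model returns a single raw logit and the loss combines the sigmoid with binary cross-entropy. The sketch below is a plain-Python illustration of that idea for one sample, using a commonly cited numerically stable formulation; it is not PyTorch's actual implementation, and the function names are made up for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_naive(logit, target):
    """Sigmoid followed by binary cross-entropy; breaks down for large |logit|."""
    p = sigmoid(logit)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def bce_with_logits(logit, target):
    """Algebraically equivalent stable form: max(z, 0) - z*y + log(1 + exp(-|z|))."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

# For moderate logits the two forms agree.
for z, y in [(-2.0, 0.0), (0.5, 1.0), (3.0, 0.0)]:
    print(z, y, bce_naive(z, y), bce_with_logits(z, y))

# For an extreme logit only the stable form stays finite.
print(bce_with_logits(1000.0, 0.0))
```

Working on logits directly is the reason the final `sigmoid` is left out of `forward`.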


### Question 2
What's the total number of parameters of the model? You can use torchsummary or count manually.

In PyTorch, you can find the total number of parameters using:

In [4]:
# Option 1: Using torchsummary (install with: pip install torchsummary)
from torchsummary import summary
summary(model, input_size=(3, 200, 200))

# Option 2: Manual counting
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params}")

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 32, 198, 198]             896
         MaxPool2d-2           [-1, 32, 99, 99]               0
            Linear-3                   [-1, 64]      20,072,512
            Linear-4                    [-1, 1]              65
Total params: 20,073,473
Trainable params: 20,073,473
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.46
Forward/backward pass size (MB): 11.96
Params size (MB): 76.57
Estimated Total Size (MB): 89.00
----------------------------------------------------------------
Total parameters: 20073473
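
The totals in the summary can also be checked by hand: a convolution has one weight per input channel and kernel position plus one bias per filter, and a linear layer has one weight per input plus one bias per output. A small sketch with helper names chosen only for this illustration:

```python
def conv2d_params(in_channels, out_channels, kernel):
    # (weights per filter + 1 bias) * number of filters
    return (in_channels * kernel * kernel + 1) * out_channels

def linear_params(in_features, out_features):
    # (weights per output + 1 bias) * number of outputs
    return (in_features + 1) * out_features

conv = conv2d_params(3, 32, 3)            # Conv2d-1
fc1 = linear_params(32 * 99 * 99, 64)     # Linear-3
fc2 = linear_params(64, 1)                # Linear-4
total = conv + fc1 + fc2                  # MaxPool2d has no parameters

print(conv, fc1, fc2, total)
```

Almost all parameters sit in `fc1`, because the flattened feature map has 313,632 values.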


## Generators and Training
For the next two questions, use the following transformation for both train and test sets:

In [5]:
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ) # ImageNet normalization
])
test_transforms = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ) # ImageNet normalization
])
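
`transforms.Normalize` subtracts the per-channel mean and divides by the per-channel standard deviation after `ToTensor` has scaled pixel values to the range 0 to 1. A plain-Python sketch of that arithmetic for a single RGB pixel (the function name is hypothetical):

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one pixel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))

black = normalize_pixel((0.0, 0.0, 0.0))   # all channels become negative
white = normalize_pixel((1.0, 1.0, 1.0))   # all channels become positive

print(black)
print(white)
```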

We don't need to do any additional pre-processing for the images.

- Use `batch_size=20`.
- Use `shuffle=True` for training, but `shuffle=False` for test.

Now fit the model.

You can use this code:

In [15]:
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

train_dataset = ImageFolder("data/train", transform=train_transforms)
validation_dataset  = ImageFolder("data/test",  transform=test_transforms)

train_loader = DataLoader(train_dataset, batch_size=20, shuffle=True)
validation_loader = DataLoader(validation_dataset, batch_size=20, shuffle=False)  # was test_dataset, which is undefined


In [16]:
# Automatically select the GPU (if available) or fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

# Move the model to the same device
model = model.to(device)

Using device: cpu


In [17]:
num_epochs = 10
history = {'acc': [], 'loss': [], 'val_acc': [], 'val_loss': []}

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1) # Ensure labels are float and have shape (batch_size, 1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        # For binary classification with BCEWithLogitsLoss, apply sigmoid to outputs before thresholding for accuracy
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in validation_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(validation_dataset)
    val_epoch_acc = correct_val / total_val
    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

Epoch 1/10, Loss: 0.5333, Acc: 0.7388, Val Loss: 0.6053, Val Acc: 0.6567
Epoch 2/10, Loss: 0.4731, Acc: 0.7775, Val Loss: 0.6417, Val Acc: 0.6667
Epoch 3/10, Loss: 0.4803, Acc: 0.7588, Val Loss: 0.8099, Val Acc: 0.6119
Epoch 4/10, Loss: 0.3998, Acc: 0.8313, Val Loss: 0.6123, Val Acc: 0.6716
Epoch 5/10, Loss: 0.2956, Acc: 0.8688, Val Loss: 0.7451, Val Acc: 0.7114
Epoch 6/10, Loss: 0.3051, Acc: 0.8638, Val Loss: 0.7088, Val Acc: 0.6866
Epoch 7/10, Loss: 0.2172, Acc: 0.9025, Val Loss: 0.9213, Val Acc: 0.6716
Epoch 8/10, Loss: 0.3192, Acc: 0.8675, Val Loss: 0.7105, Val Acc: 0.7114
Epoch 9/10, Loss: 0.1625, Acc: 0.9413, Val Loss: 0.7524, Val Acc: 0.7363
Epoch 10/10, Loss: 0.1446, Acc: 0.9463, Val Loss: 0.8719, Val Acc: 0.7164
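
The loop multiplies `loss.item()` by `images.size(0)` because the loss is a per-batch mean (the default `reduction='mean'`), and the last batch can be smaller than the others. Weighting by batch size gives a true per-sample average. A small sketch with made-up batch sizes and losses, purely for illustration:

```python
# Hypothetical numbers: two full batches and one leftover sample.
batch_sizes = [20, 20, 1]
batch_mean_losses = [0.50, 0.60, 2.00]

# What the training loop does: weight each batch mean by its size.
running_loss = sum(loss * n for loss, n in zip(batch_mean_losses, batch_sizes))
per_sample_mean = running_loss / sum(batch_sizes)

# What a plain average of batch means would give instead.
mean_of_batch_means = sum(batch_mean_losses) / len(batch_mean_losses)

print(per_sample_mean, mean_of_batch_means)
```

The single leftover sample dominates the unweighted average, which is why the weighted form is used.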


### Question 3
What is the median of training accuracy for all the epochs for this model?
- 0.05
- 0.12
- 0.40
- 0.84


In [18]:
import numpy as np

median_acc = np.median(history['acc'])
print(median_acc)


0.8656250000000001
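
The computed median does not match any option exactly, so the answer is the nearest one. A sketch that recomputes the median from the rounded accuracies printed in the training log above and picks the closest option; because the log is rounded to four decimals, the result differs slightly from the value printed by the notebook:

```python
import statistics

# Training accuracies copied from the epoch log (rounded to 4 decimals).
train_acc = [0.7388, 0.7775, 0.7588, 0.8313, 0.8688,
             0.8638, 0.9025, 0.8675, 0.9413, 0.9463]
options = [0.05, 0.12, 0.40, 0.84]

median_acc = statistics.median(train_acc)
closest = min(options, key=lambda o: abs(o - median_acc))

print(median_acc, closest)
```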


### Question 4
What is the standard deviation of training loss for all the epochs for this model?
- 0.007
- 0.078
- 0.171
- 1.710

In [19]:
std_loss = np.std(history['loss'])
print(std_loss)


0.12893830517029975
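
`np.std` computes the population standard deviation by default (`ddof=0`). The sketch below recomputes it from the rounded losses in the training log using the standard library, and shows the sample standard deviation (`ddof=1`) for comparison:

```python
import statistics

# Training losses copied from the epoch log (rounded to 4 decimals).
train_loss = [0.5333, 0.4731, 0.4803, 0.3998, 0.2956,
              0.3051, 0.2172, 0.3192, 0.1625, 0.1446]

population_std = statistics.pstdev(train_loss)  # divides by n, like np.std(x)
sample_std = statistics.stdev(train_loss)       # divides by n - 1, like np.std(x, ddof=1)

print(population_std, sample_std)
```

With only ten epochs the two estimates differ noticeably, so it matters which one a question expects.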


## Data Augmentation
For the next two questions, we'll generate more data using data augmentations.

Add the following augmentations to your training data generator:

```python
transforms.RandomRotation(50),
transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),
transforms.RandomHorizontalFlip(),
```

In [25]:
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomRotation(50),  # random rotation within ±50 degrees
    transforms.RandomResizedCrop(200, scale=(0.9, 1.0), ratio=(0.9, 1.1)),  # random crop, then resize to 200x200
    transforms.RandomHorizontalFlip(),  # random horizontal flip
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# Test transforms do not need augmentation
test_transforms = transforms.Compose([
    transforms.Resize((200, 200)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])


In [26]:
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

train_dataset = ImageFolder("data/train", transform=train_transforms)
validation_dataset  = ImageFolder("data/test",  transform=test_transforms)

train_loader = DataLoader(train_dataset, batch_size=20, shuffle=True)
validation_loader = DataLoader(validation_dataset, batch_size=20, shuffle=False)  # was test_dataset, which is undefined


In [27]:
num_epochs = 10
history = {'acc': [], 'loss': [], 'val_acc': [], 'val_loss': []}

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        labels = labels.float().unsqueeze(1) # Ensure labels are float and have shape (batch_size, 1)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        # For binary classification with BCEWithLogitsLoss, apply sigmoid to outputs before thresholding for accuracy
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total_train += labels.size(0)
        correct_train += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = correct_train / total_train
    history['loss'].append(epoch_loss)
    history['acc'].append(epoch_acc)

    model.eval()
    val_running_loss = 0.0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in validation_loader:
            images, labels = images.to(device), labels.to(device)
            labels = labels.float().unsqueeze(1)

            outputs = model(images)
            loss = criterion(outputs, labels)

            val_running_loss += loss.item() * images.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            total_val += labels.size(0)
            correct_val += (predicted == labels).sum().item()

    val_epoch_loss = val_running_loss / len(validation_dataset)
    val_epoch_acc = correct_val / total_val
    history['val_loss'].append(val_epoch_loss)
    history['val_acc'].append(val_epoch_acc)

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}, "
          f"Val Loss: {val_epoch_loss:.4f}, Val Acc: {val_epoch_acc:.4f}")

Epoch 1/10, Loss: 0.6396, Acc: 0.6538, Val Loss: 0.7832, Val Acc: 0.6517
Epoch 2/10, Loss: 0.5501, Acc: 0.7250, Val Loss: 0.6757, Val Acc: 0.7363
Epoch 3/10, Loss: 0.5448, Acc: 0.7300, Val Loss: 0.6059, Val Acc: 0.6866
Epoch 4/10, Loss: 0.5009, Acc: 0.7562, Val Loss: 0.5618, Val Acc: 0.7214
Epoch 5/10, Loss: 0.5022, Acc: 0.7388, Val Loss: 0.5541, Val Acc: 0.7264
Epoch 6/10, Loss: 0.4952, Acc: 0.7375, Val Loss: 0.5530, Val Acc: 0.7264
Epoch 7/10, Loss: 0.4512, Acc: 0.7762, Val Loss: 0.5583, Val Acc: 0.7313
Epoch 8/10, Loss: 0.4493, Acc: 0.7800, Val Loss: 0.6339, Val Acc: 0.6517
Epoch 9/10, Loss: 0.4458, Acc: 0.8013, Val Loss: 0.5465, Val Acc: 0.7214
Epoch 10/10, Loss: 0.4610, Acc: 0.7700, Val Loss: 0.5261, Val Acc: 0.7065


### Question 5
Let's train our model for 10 more epochs using the same code as previously.

Note: make sure you don't re-create the model. We want to continue training the model we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

- 0.008
- 0.08
- 0.88
- 8.88

In [28]:
mean_test_loss = np.mean(history['val_loss'])
print("Mean Test Loss (all epochs):", mean_test_loss)


Mean Test Loss (all epochs): 0.5998579801834045



### Question 6
What's the average of test accuracy for the last 5 epochs (from 6 to 10) for the model trained with augmentations?

- 0.08
- 0.28
- 0.68
- 0.98

In [32]:
avg_last5_test_acc = np.mean(history['val_acc'][-5:])
print("Average Test Accuracy (last 5 epochs):", avg_last5_test_acc)


Average Test Accuracy (last 5 epochs): 0.7074626865671643
