# PART I: Theory Questions

---

## 1. Question

Activation functions are mathematical functions that introduce non-linearity into the model. They are applied to the output of each neuron before it is passed to the next layer. Without them, a stack of layers would collapse into a single linear transformation, so the network could only learn linear relationships. They also need to be differentiable (at least almost everywhere), which makes training with backpropagation possible. Finally, some activation functions, such as Softmax, can be used in the output layer to produce class probabilities for classification tasks.
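To make the non-linearity argument concrete, here is a minimal sketch in plain Python. The scalar weights and the helper names (`linear`, `stacked_linear`, `merged_linear`, `stacked_with_sigmoid`, `sigmoid_grad`) are made up for illustration and are not part of the assignment code. It shows that two linear layers without an activation reduce to one linear layer, and that the sigmoid has a closed-form derivative that backpropagation can use.

```python
import math


def linear(x, w, b):
    """Affine map w * x + b for scalar inputs."""
    return w * x + b


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid, the quantity backpropagation needs."""
    s = sigmoid(z)
    return s * (1.0 - s)


# Illustrative weights for two stacked layers (hypothetical values).
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5


def stacked_linear(x):
    """Two linear layers with no activation in between."""
    return linear(linear(x, w1, b1), w2, b2)


def merged_linear(x):
    """The single linear layer that the stack above is equal to."""
    return linear(x, w2 * w1, w2 * b1 + b2)


def stacked_with_sigmoid(x):
    """Same two layers, now with a sigmoid between them."""
    return linear(sigmoid(linear(x, w1, b1)), w2, b2)


for x in (-2.0, 0.0, 1.5):
    print(x, stacked_linear(x), merged_linear(x), stacked_with_sigmoid(x))
```

The first two printed columns agree for every input, while the third does not lie on a straight line in `x`.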

---

## 2. Question

The output volume shape and the number of parameters of each layer are given in the table below:

|   Layer  | Output Volume Shape | Number of Parameters |
|----------|---------------------|----------------------|
|   Input  |     (64, 64, 3)     |          0           |
|  CONV5-8 |     (60, 60, 8)     |         608          |
|  POOL-2  |     (30, 30, 8)     |          0           |
| CONV3-16 |     (28, 28, 16)    |        1168          |
|  POOL-3  |     (13, 13, 16)    |          0           |
|   FC-30  |         (30,)       |        81150         |
|   FC-5   |         (5,)        |         155          |
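The numbers in the table can be reproduced with a few lines of arithmetic. The sketch below assumes stride 1 and no padding for the convolutions, a stride of 2 for both pooling layers (with the output size rounded down), and one bias per filter or output unit. These assumptions are not stated in the answer, but they are consistent with every row of the table. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def fc_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


rows = []
size, channels = 64, 3
rows.append(("Input", (size, size, channels), 0))

size = conv_out(size, kernel=5)                      # CONV5-8, assumed stride 1, no padding
rows.append(("CONV5-8", (size, size, 8), conv_params(5, channels, 8)))
channels = 8

size = conv_out(size, kernel=2, stride=2)            # POOL-2, assumed stride 2
rows.append(("POOL-2", (size, size, channels), 0))

size = conv_out(size, kernel=3)                      # CONV3-16
rows.append(("CONV3-16", (size, size, 16), conv_params(3, channels, 16)))
channels = 16

size = conv_out(size, kernel=3, stride=2)            # POOL-3, assumed stride 2, floor
rows.append(("POOL-3", (size, size, channels), 0))

flat = size * size * channels                        # 13 * 13 * 16 inputs to the first FC layer
rows.append(("FC-30", (30,), fc_params(flat, 30)))
rows.append(("FC-5", (5,), fc_params(30, 5)))

for name, shape, params in rows:
    print(f"{name:>8}  {str(shape):>14}  {params:>6}")
```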

---

# PART II: Classification of Skin Lesion Images using Neural Network

## 0- Imports and Constants

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset

import os
from PIL import Image
from tqdm import tqdm

DATA_DIR = '311PA3_melanoma_dataset'
torch.manual_seed(42)
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

---

## 1- Creating Custom Structures for Backend

### 1.1- Dataset

In [3]:
class MelanomaDataset(Dataset):
    """Loads images from <data_dir>/<train|test>/<label>/<image file>."""

    def __init__(self, data_dir, target_size=(300, 300), augmentations=None, transformations=None, train=True):
        self.data_dir = data_dir
        self.target_size = target_size  # stored, but not used anywhere in this class
        self.augmentations = augmentations
        self.transforms = transformations
        self.train = train
        self.samples = self.load_samples_(self.train)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, label = self.samples[idx]
        img = Image.open(img_path).convert('RGB')

        # Augmentations are applied to the training split only
        if self.train and self.augmentations:
            img = self.augmentations(img)

        if self.transforms:
            img = self.transforms(img)

        # Binary target: 'benign' -> 0, any other label folder -> 1
        return img, 0 if label == 'benign' else 1

    def load_samples_(self, train=True):
        samples = []
        mode = 'train' if train else 'test'
        split_dir = os.path.join(self.data_dir, mode)  # renamed to avoid shadowing the built-in dir()

        # Each sub-folder name is used as the label of the images inside it
        for label in os.listdir(split_dir):
            label_dir = os.path.join(split_dir, label)
            for img in os.listdir(label_dir):
                samples.append((os.path.join(label_dir, img), label))

        return samples
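Since the dataset class walks a `<data_dir>/<train|test>/<label>/` folder layout, a quick way to sanity-check the data before training is to count the files per label folder. The helper below is a hypothetical addition (`count_per_class` is not part of the assignment code) and uses only the standard library, so it can be run without PyTorch.

```python
import os
from collections import Counter


def count_per_class(data_dir, split="train"):
    """Count files in each label folder of <data_dir>/<split>/ (hypothetical helper)."""
    split_dir = os.path.join(data_dir, split)
    counts = Counter()
    for label in sorted(os.listdir(split_dir)):
        label_dir = os.path.join(split_dir, label)
        if os.path.isdir(label_dir):  # ignore stray files next to the label folders
            counts[label] = len(os.listdir(label_dir))
    return counts
```

For example, `count_per_class(DATA_DIR, "train")` would return the number of training images per label, which helps in judging how far an accuracy is from chance level.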

---

### 1.2- Activation Function Map

In [4]:
activation_func_map = {
    'relu': nn.ReLU,
    'sigmoid': nn.Sigmoid
}

---

### 1.3- Models

#### Multi-Layer Neural Network

In [5]:
class MultiLayerNN(nn.Module):
    def __init__(self, input_size, activation_func):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 1)
        self.activation_func = activation_func
        # Build the activation module once instead of on every forward pass
        self.activation = activation_func_map[activation_func]()

    def forward(self, x):
        x = x.flatten(start_dim=1)       # (N, C, H, W) -> (N, C*H*W)
        x = self.activation(self.fc1(x))
        x = self.activation(self.fc2(x))
        x = torch.sigmoid(self.fc3(x))   # probability of the positive class, as nn.BCELoss expects

        return x

---

#### Convolutional Neural Network

In [6]:
class ConvNN(nn.Module):
    def __init__(self, activation_func):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 128, 5, padding=2)    # padding keeps the spatial size
        self.conv2 = nn.Conv2d(128, 256, 3, padding=1)  # padding keeps the spatial size
        self.fc1 = nn.Linear(256 * 75 * 75, 128)
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(128, 1)
        self.activation_func = activation_func
        # Build the activation module once instead of on every forward pass
        self.activation = activation_func_map[activation_func]()

    def forward(self, x):
        x = self.activation(self.conv1(x))
        x = F.adaptive_avg_pool2d(x, (150, 150))

        x = self.activation(self.conv2(x))
        x = F.adaptive_avg_pool2d(x, (75, 75))

        x = x.flatten(start_dim=1)
        x = self.fc1(x)
        x = self.dropout(x)              # note: dropout precedes the activation here
        x = self.activation(x)
        x = torch.sigmoid(self.fc2(x))   # probability of the positive class, as nn.BCELoss expects

        return x
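The `256 * 75 * 75` input size of `fc1` follows from the second adaptive pooling step, which always outputs a 75x75 map for each of the 256 channels. A short back-of-the-envelope calculation, written here as a sketch in plain Python with hypothetical helper names, also shows where almost all of the model's parameters sit.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def fc_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


fc1_in = 256 * 75 * 75  # channels * height * width after the second pooling

params = {
    "conv1": conv_params(5, 3, 128),
    "conv2": conv_params(3, 128, 256),
    "fc1": fc_params(fc1_in, 128),
    "fc2": fc_params(128, 1),
}
total = sum(params.values())

for name, count in params.items():
    print(f"{name:>5}: {count:>12,} ({count / total:.2%})")
print(f"total: {total:>12,}")
```

Under these counts the first fully connected layer accounts for well over 99% of the parameters.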

---

### 1.4- Model Running Code

In [5]:
def run_experiment(
        model,
        device,
        optimizer,
        criterion,
        augmentations=None,
        transformations=None,
        num_epochs=10,
        batch_size=16
    ):

    train_dataset = MelanomaDataset(DATA_DIR, augmentations=augmentations, transformations=transformations, train=True)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    test_dataset = MelanomaDataset(DATA_DIR, augmentations=None, transformations=transformations, train=False)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

    # Training loop
    for epoch in range(num_epochs):
        model.train()
        train_loss = 0
        correct = 0
        total = 0

        for inputs, labels in tqdm(train_loader, desc=f'Training Epoch {epoch+1}/{num_epochs}'):
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels.float().view(-1, 1))
            loss.backward()
            optimizer.step()

            train_loss += loss.item() * inputs.size(0)
            predicted = (outputs > 0.5).float()
            total += labels.size(0)
            correct += (predicted == labels.float().view(-1, 1)).sum().item()

        train_loss /= total
        train_accuracy = correct / total
        print(f'Epoch {epoch + 1}/{num_epochs}, Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.4f}')

    # Testing after training
    model.eval()
    test_loss = 0
    correct = 0
    total = 0

    with torch.no_grad():
        for inputs, labels in tqdm(test_loader, desc='Testing'):
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels.float().view(-1, 1))
            test_loss += loss.item() * inputs.size(0)  # Multiply by batch size
            predicted = (outputs > 0.5).float()
            total += labels.size(0)
            correct += (predicted == labels.float().view(-1, 1)).sum().item()

    test_loss /= total
    test_accuracy = correct / total
    print(f'Test Loss: {test_loss:.4f}, Test Accuracy: {test_accuracy:.4f}')
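The loops above multiply each batch loss by the batch size before dividing by the total number of samples. This matters because `nn.BCELoss` returns the mean over the batch by default, and the last batch is usually smaller than the others. The sketch below uses made-up loss values and a hypothetical helper name to show the difference between this sample-weighted average and a plain average of batch means.

```python
def dataset_mean_loss(batch_means, batch_sizes):
    """Per-sample mean loss reconstructed from per-batch mean losses."""
    total = sum(batch_sizes)
    weighted_sum = sum(mean * n for mean, n in zip(batch_means, batch_sizes))
    return weighted_sum / total


# Illustrative values only: two full batches of 16 and a final batch of 4.
batch_means = [0.50, 0.70, 0.10]
batch_sizes = [16, 16, 4]

weighted = dataset_mean_loss(batch_means, batch_sizes)
naive = sum(batch_means) / len(batch_means)

print(f"sample-weighted mean: {weighted:.4f}")
print(f"plain mean of batches: {naive:.4f}")
```

The plain mean gives the small final batch the same weight as a full batch, so the two numbers differ whenever the dataset size is not a multiple of the batch size.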

---

### 1.5- Transformations

In [8]:
mnn_transforms = transforms.Compose([
    transforms.Resize((50, 50)),
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor()
])

In [6]:
cnn_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
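For reference, `transforms.ToTensor()` scales 8-bit pixel values to the range [0, 1], and `transforms.Normalize` then computes `(value - mean) / std` separately for each channel. The mean and standard deviation used above are the commonly quoted ImageNet channel statistics. The sketch below reproduces that arithmetic for a single RGB pixel in plain Python; the helper names are hypothetical.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def to_unit_range(rgb_uint8):
    """Scale 8-bit channel values to [0, 1], as ToTensor does for PIL images."""
    return [v / 255.0 for v in rgb_uint8]


def normalize_pixel(rgb):
    """Per-channel (value - mean) / std, the formula applied by Normalize."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]


white = normalize_pixel(to_unit_range([255, 255, 255]))
black = normalize_pixel(to_unit_range([0, 0, 0]))
print("white:", [round(v, 3) for v in white])
print("black:", [round(v, 3) for v in black])
```

After this step the inputs are no longer confined to [0, 1]; they are roughly centred around zero for images whose statistics resemble ImageNet.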

---

## 2- Experiments

### 2.1 Multi-Layer Neural Network Experiments

- Learning rate: 0.005
- Input size: 50x50
- Activation function: Sigmoid

In [None]:
input_size = 50 * 50
epochs = 10
model = MultiLayerNN(input_size, 'sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=mnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:19<00:00, 24.98it/s]


Epoch 1/10, Train Loss: 0.6914, Train Accuracy: 0.5211


Training Epoch 2/10: 100%|██████████| 481/481 [00:18<00:00, 25.71it/s]


Epoch 2/10, Train Loss: 0.6883, Train Accuracy: 0.5341


Training Epoch 3/10: 100%|██████████| 481/481 [00:18<00:00, 25.54it/s]


Epoch 3/10, Train Loss: 0.6861, Train Accuracy: 0.5450


Training Epoch 4/10: 100%|██████████| 481/481 [00:18<00:00, 25.81it/s]


Epoch 4/10, Train Loss: 0.6834, Train Accuracy: 0.5622


Training Epoch 5/10: 100%|██████████| 481/481 [00:19<00:00, 25.30it/s]


Epoch 5/10, Train Loss: 0.6811, Train Accuracy: 0.5800


Training Epoch 6/10: 100%|██████████| 481/481 [00:18<00:00, 25.46it/s]


Epoch 6/10, Train Loss: 0.6786, Train Accuracy: 0.5946


Training Epoch 7/10: 100%|██████████| 481/481 [00:19<00:00, 24.97it/s]


Epoch 7/10, Train Loss: 0.6755, Train Accuracy: 0.6053


Training Epoch 8/10: 100%|██████████| 481/481 [00:18<00:00, 25.45it/s]


Epoch 8/10, Train Loss: 0.6723, Train Accuracy: 0.6134


Training Epoch 9/10: 100%|██████████| 481/481 [00:19<00:00, 25.13it/s]


Epoch 9/10, Train Loss: 0.6686, Train Accuracy: 0.6150


Training Epoch 10/10: 100%|██████████| 481/481 [00:18<00:00, 25.68it/s]


Epoch 10/10, Train Loss: 0.6641, Train Accuracy: 0.6334


Testing: 100%|██████████| 121/121 [00:04<00:00, 28.16it/s]

Test Loss: 0.6668, Test Accuracy: 0.5659





---

- Learning rate: 0.005
- Input size: 50x50
- Activation function: ReLU

In [None]:
input_size = 50 * 50
epochs = 10
model = MultiLayerNN(input_size, 'relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=mnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:18<00:00, 25.61it/s]


Epoch 1/10, Train Loss: 0.6579, Train Accuracy: 0.5586


Training Epoch 2/10: 100%|██████████| 481/481 [00:19<00:00, 25.23it/s]


Epoch 2/10, Train Loss: 0.5737, Train Accuracy: 0.7202


Training Epoch 3/10: 100%|██████████| 481/481 [00:18<00:00, 25.45it/s]


Epoch 3/10, Train Loss: 0.4728, Train Accuracy: 0.7689


Training Epoch 4/10: 100%|██████████| 481/481 [00:18<00:00, 25.51it/s]


Epoch 4/10, Train Loss: 0.4340, Train Accuracy: 0.7853


Training Epoch 5/10: 100%|██████████| 481/481 [00:18<00:00, 25.48it/s]


Epoch 5/10, Train Loss: 0.4123, Train Accuracy: 0.7989


Training Epoch 6/10: 100%|██████████| 481/481 [00:19<00:00, 25.25it/s]


Epoch 6/10, Train Loss: 0.4014, Train Accuracy: 0.8041


Training Epoch 7/10: 100%|██████████| 481/481 [00:18<00:00, 25.38it/s]


Epoch 7/10, Train Loss: 0.3914, Train Accuracy: 0.8082


Training Epoch 8/10: 100%|██████████| 481/481 [00:18<00:00, 25.81it/s]


Epoch 8/10, Train Loss: 0.3812, Train Accuracy: 0.8139


Training Epoch 9/10: 100%|██████████| 481/481 [00:18<00:00, 26.02it/s]


Epoch 9/10, Train Loss: 0.3733, Train Accuracy: 0.8222


Training Epoch 10/10: 100%|██████████| 481/481 [00:18<00:00, 26.05it/s]


Epoch 10/10, Train Loss: 0.3694, Train Accuracy: 0.8191


Testing: 100%|██████████| 121/121 [00:04<00:00, 27.98it/s]

Test Loss: 0.3676, Test Accuracy: 0.8329





---

- Learning rate: 0.005
- Input size: 300x300
- Activation function: Sigmoid

In [None]:
input_size = 300 * 300
epochs = 10
model = MultiLayerNN(input_size, 'sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
last_two_transforms = transforms.Compose(mnn_transforms.transforms[1:])
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=last_two_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:19<00:00, 24.80it/s]


Epoch 1/10, Train Loss: 0.6835, Train Accuracy: 0.5768


Training Epoch 2/10: 100%|██████████| 481/481 [00:20<00:00, 22.93it/s]


Epoch 2/10, Train Loss: 0.6747, Train Accuracy: 0.6170


Training Epoch 3/10: 100%|██████████| 481/481 [00:20<00:00, 23.50it/s]


Epoch 3/10, Train Loss: 0.6641, Train Accuracy: 0.6298


Training Epoch 4/10: 100%|██████████| 481/481 [00:19<00:00, 24.86it/s]


Epoch 4/10, Train Loss: 0.6510, Train Accuracy: 0.6387


Training Epoch 5/10: 100%|██████████| 481/481 [00:19<00:00, 25.01it/s]


Epoch 5/10, Train Loss: 0.6375, Train Accuracy: 0.6435


Training Epoch 6/10: 100%|██████████| 481/481 [00:20<00:00, 23.91it/s]


Epoch 6/10, Train Loss: 0.6234, Train Accuracy: 0.6502


Training Epoch 7/10: 100%|██████████| 481/481 [00:20<00:00, 23.92it/s]


Epoch 7/10, Train Loss: 0.6051, Train Accuracy: 0.6581


Training Epoch 8/10: 100%|██████████| 481/481 [00:20<00:00, 23.69it/s]


Epoch 8/10, Train Loss: 0.5780, Train Accuracy: 0.6838


Training Epoch 9/10: 100%|██████████| 481/481 [00:20<00:00, 23.91it/s]


Epoch 9/10, Train Loss: 0.5414, Train Accuracy: 0.7363


Training Epoch 10/10: 100%|██████████| 481/481 [00:20<00:00, 23.76it/s]


Epoch 10/10, Train Loss: 0.5009, Train Accuracy: 0.7728


Testing: 100%|██████████| 121/121 [00:04<00:00, 27.61it/s]

Test Loss: 0.4792, Test Accuracy: 0.8142





---

- Learning rate: 0.005
- Input size: 300x300
- Activation function: ReLU

In [None]:
input_size = 300 * 300
epochs = 10
model = MultiLayerNN(input_size, 'relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=last_two_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:19<00:00, 25.00it/s]


Epoch 1/10, Train Loss: 0.5385, Train Accuracy: 0.7180


Training Epoch 2/10: 100%|██████████| 481/481 [00:20<00:00, 23.80it/s]


Epoch 2/10, Train Loss: 0.4341, Train Accuracy: 0.7815


Training Epoch 3/10: 100%|██████████| 481/481 [00:20<00:00, 23.96it/s]


Epoch 3/10, Train Loss: 0.4125, Train Accuracy: 0.7962


Training Epoch 4/10: 100%|██████████| 481/481 [00:20<00:00, 23.85it/s]


Epoch 4/10, Train Loss: 0.3907, Train Accuracy: 0.8108


Training Epoch 5/10: 100%|██████████| 481/481 [00:20<00:00, 23.95it/s]


Epoch 5/10, Train Loss: 0.3795, Train Accuracy: 0.8186


Training Epoch 6/10: 100%|██████████| 481/481 [00:20<00:00, 23.22it/s]


Epoch 6/10, Train Loss: 0.3734, Train Accuracy: 0.8170


Training Epoch 7/10: 100%|██████████| 481/481 [00:20<00:00, 23.05it/s]


Epoch 7/10, Train Loss: 0.3607, Train Accuracy: 0.8248


Training Epoch 8/10: 100%|██████████| 481/481 [00:20<00:00, 23.16it/s]


Epoch 8/10, Train Loss: 0.3572, Train Accuracy: 0.8293


Training Epoch 9/10: 100%|██████████| 481/481 [00:21<00:00, 22.85it/s]


Epoch 9/10, Train Loss: 0.3528, Train Accuracy: 0.8290


Training Epoch 10/10: 100%|██████████| 481/481 [00:20<00:00, 23.94it/s]


Epoch 10/10, Train Loss: 0.3454, Train Accuracy: 0.8363


Testing: 100%|██████████| 121/121 [00:04<00:00, 26.20it/s]

Test Loss: 0.4092, Test Accuracy: 0.8053





---

- Learning rate: 0.02
- Input size: 50x50
- Activation function: Sigmoid

In [None]:
input_size = 50 * 50
epochs = 10
model = MultiLayerNN(input_size, 'sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=mnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:18<00:00, 25.83it/s]


Epoch 1/10, Train Loss: 0.6902, Train Accuracy: 0.5360


Training Epoch 2/10: 100%|██████████| 481/481 [00:18<00:00, 25.78it/s]


Epoch 2/10, Train Loss: 0.6813, Train Accuracy: 0.5688


Training Epoch 3/10: 100%|██████████| 481/481 [00:18<00:00, 26.19it/s]


Epoch 3/10, Train Loss: 0.6702, Train Accuracy: 0.6175


Training Epoch 4/10: 100%|██████████| 481/481 [00:18<00:00, 26.08it/s]


Epoch 4/10, Train Loss: 0.6509, Train Accuracy: 0.6317


Training Epoch 5/10: 100%|██████████| 481/481 [00:20<00:00, 23.66it/s]


Epoch 5/10, Train Loss: 0.6256, Train Accuracy: 0.6417


Training Epoch 6/10: 100%|██████████| 481/481 [00:18<00:00, 25.53it/s]


Epoch 6/10, Train Loss: 0.5932, Train Accuracy: 0.6635


Training Epoch 7/10: 100%|██████████| 481/481 [00:18<00:00, 26.04it/s]


Epoch 7/10, Train Loss: 0.5507, Train Accuracy: 0.7104


Training Epoch 8/10: 100%|██████████| 481/481 [00:18<00:00, 25.85it/s]


Epoch 8/10, Train Loss: 0.5058, Train Accuracy: 0.7486


Training Epoch 9/10: 100%|██████████| 481/481 [00:18<00:00, 26.59it/s]


Epoch 9/10, Train Loss: 0.4691, Train Accuracy: 0.7668


Training Epoch 10/10: 100%|██████████| 481/481 [00:18<00:00, 26.45it/s]


Epoch 10/10, Train Loss: 0.4429, Train Accuracy: 0.7825


Testing: 100%|██████████| 121/121 [00:04<00:00, 28.10it/s]

Test Loss: 0.4625, Test Accuracy: 0.7616





---

- Learning rate: 0.02
- Input size: 50x50
- Activation function: ReLU

In [None]:
input_size = 50 * 50
epochs = 10
model = MultiLayerNN(input_size, 'relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=mnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:19<00:00, 25.11it/s]


Epoch 1/10, Train Loss: 0.5785, Train Accuracy: 0.6812


Training Epoch 2/10: 100%|██████████| 481/481 [00:18<00:00, 25.78it/s]


Epoch 2/10, Train Loss: 0.4514, Train Accuracy: 0.7700


Training Epoch 3/10: 100%|██████████| 481/481 [00:18<00:00, 25.72it/s]


Epoch 3/10, Train Loss: 0.4133, Train Accuracy: 0.7991


Training Epoch 4/10: 100%|██████████| 481/481 [00:18<00:00, 25.33it/s]


Epoch 4/10, Train Loss: 0.4007, Train Accuracy: 0.8019


Training Epoch 5/10: 100%|██████████| 481/481 [00:18<00:00, 25.56it/s]


Epoch 5/10, Train Loss: 0.3866, Train Accuracy: 0.8104


Training Epoch 6/10: 100%|██████████| 481/481 [00:18<00:00, 25.92it/s]


Epoch 6/10, Train Loss: 0.3799, Train Accuracy: 0.8164


Training Epoch 7/10: 100%|██████████| 481/481 [00:18<00:00, 25.82it/s]


Epoch 7/10, Train Loss: 0.3721, Train Accuracy: 0.8200


Training Epoch 8/10: 100%|██████████| 481/481 [00:18<00:00, 25.81it/s]


Epoch 8/10, Train Loss: 0.3604, Train Accuracy: 0.8247


Training Epoch 9/10: 100%|██████████| 481/481 [00:18<00:00, 25.71it/s]


Epoch 9/10, Train Loss: 0.3565, Train Accuracy: 0.8290


Training Epoch 10/10: 100%|██████████| 481/481 [00:18<00:00, 25.57it/s]


Epoch 10/10, Train Loss: 0.3552, Train Accuracy: 0.8277


Testing: 100%|██████████| 121/121 [00:04<00:00, 28.05it/s]

Test Loss: 0.6637, Test Accuracy: 0.5008





---

- Learning rate: 0.02
- Input size: 300x300
- Activation function: Sigmoid

In [None]:
input_size = 300 * 300
epochs = 10
model = MultiLayerNN(input_size, 'sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=last_two_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:19<00:00, 24.15it/s]


Epoch 1/10, Train Loss: 0.6715, Train Accuracy: 0.5988


Training Epoch 2/10: 100%|██████████| 481/481 [00:20<00:00, 23.67it/s]


Epoch 2/10, Train Loss: 0.6186, Train Accuracy: 0.6469


Training Epoch 3/10: 100%|██████████| 481/481 [00:18<00:00, 25.50it/s]


Epoch 3/10, Train Loss: 0.5133, Train Accuracy: 0.7484


Training Epoch 4/10: 100%|██████████| 481/481 [00:18<00:00, 26.54it/s]


Epoch 4/10, Train Loss: 0.4482, Train Accuracy: 0.7837


Training Epoch 5/10: 100%|██████████| 481/481 [00:20<00:00, 23.75it/s]


Epoch 5/10, Train Loss: 0.4192, Train Accuracy: 0.8011


Training Epoch 6/10: 100%|██████████| 481/481 [00:20<00:00, 23.84it/s]


Epoch 6/10, Train Loss: 0.4067, Train Accuracy: 0.8125


Training Epoch 7/10: 100%|██████████| 481/481 [00:17<00:00, 26.93it/s]


Epoch 7/10, Train Loss: 0.3985, Train Accuracy: 0.8143


Training Epoch 8/10: 100%|██████████| 481/481 [00:19<00:00, 25.11it/s]


Epoch 8/10, Train Loss: 0.3922, Train Accuracy: 0.8149


Training Epoch 9/10: 100%|██████████| 481/481 [00:19<00:00, 25.24it/s]


Epoch 9/10, Train Loss: 0.3818, Train Accuracy: 0.8221


Training Epoch 10/10: 100%|██████████| 481/481 [00:20<00:00, 23.73it/s]


Epoch 10/10, Train Loss: 0.3848, Train Accuracy: 0.8213


Testing: 100%|██████████| 121/121 [00:04<00:00, 27.21it/s]

Test Loss: 0.4040, Test Accuracy: 0.8048





---

- Learning rate: 0.02
- Input size: 300x300
- Activation function: ReLU

In [None]:
input_size = 300 * 300
epochs = 10
model = MultiLayerNN(input_size, 'relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=last_two_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [00:19<00:00, 24.94it/s]


Epoch 1/10, Train Loss: 0.5589, Train Accuracy: 0.6901


Training Epoch 2/10: 100%|██████████| 481/481 [00:20<00:00, 23.77it/s]


Epoch 2/10, Train Loss: 0.4526, Train Accuracy: 0.7768


Training Epoch 3/10: 100%|██████████| 481/481 [00:20<00:00, 23.60it/s]


Epoch 3/10, Train Loss: 0.4392, Train Accuracy: 0.7846


Training Epoch 4/10: 100%|██████████| 481/481 [00:20<00:00, 23.69it/s]


Epoch 4/10, Train Loss: 0.4038, Train Accuracy: 0.8006


Training Epoch 5/10: 100%|██████████| 481/481 [00:20<00:00, 23.95it/s]


Epoch 5/10, Train Loss: 0.3956, Train Accuracy: 0.8053


Training Epoch 6/10: 100%|██████████| 481/481 [00:20<00:00, 23.72it/s]


Epoch 6/10, Train Loss: 0.3841, Train Accuracy: 0.8104


Training Epoch 7/10: 100%|██████████| 481/481 [00:20<00:00, 23.69it/s]


Epoch 7/10, Train Loss: 0.3783, Train Accuracy: 0.8142


Training Epoch 8/10: 100%|██████████| 481/481 [00:20<00:00, 23.74it/s]


Epoch 8/10, Train Loss: 0.3675, Train Accuracy: 0.8225


Training Epoch 9/10: 100%|██████████| 481/481 [00:21<00:00, 22.27it/s]


Epoch 9/10, Train Loss: 0.3627, Train Accuracy: 0.8263


Training Epoch 10/10: 100%|██████████| 481/481 [00:20<00:00, 23.96it/s]


Epoch 10/10, Train Loss: 0.3626, Train Accuracy: 0.8227


Testing: 100%|██████████| 121/121 [00:04<00:00, 26.98it/s]

Test Loss: 0.3787, Test Accuracy: 0.8220





---

### 2.2 Convolutional Neural Network Experiments

- Learning rate: 0.005
- Activation function: Sigmoid
- Batch size: 16

In [None]:
epochs = 10
model = ConvNN('sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [01:20<00:00,  5.99it/s]


Epoch 1/10, Train Loss: 0.6961, Train Accuracy: 0.5099


Training Epoch 2/10: 100%|██████████| 481/481 [01:16<00:00,  6.31it/s]


Epoch 2/10, Train Loss: 0.6945, Train Accuracy: 0.5160


Training Epoch 3/10: 100%|██████████| 481/481 [01:17<00:00,  6.23it/s]


Epoch 3/10, Train Loss: 0.6945, Train Accuracy: 0.5135


Training Epoch 4/10: 100%|██████████| 481/481 [01:18<00:00,  6.15it/s]


Epoch 4/10, Train Loss: 0.6921, Train Accuracy: 0.5221


Training Epoch 5/10: 100%|██████████| 481/481 [01:19<00:00,  6.02it/s]


Epoch 5/10, Train Loss: 0.6904, Train Accuracy: 0.5286


Training Epoch 6/10: 100%|██████████| 481/481 [01:19<00:00,  6.04it/s]


Epoch 6/10, Train Loss: 0.6926, Train Accuracy: 0.5190


Training Epoch 7/10: 100%|██████████| 481/481 [01:20<00:00,  6.01it/s]


Epoch 7/10, Train Loss: 0.6893, Train Accuracy: 0.5397


Training Epoch 8/10: 100%|██████████| 481/481 [01:20<00:00,  5.99it/s]


Epoch 8/10, Train Loss: 0.6888, Train Accuracy: 0.5379


Training Epoch 9/10: 100%|██████████| 481/481 [01:20<00:00,  5.98it/s]


Epoch 9/10, Train Loss: 0.6848, Train Accuracy: 0.5511


Training Epoch 10/10: 100%|██████████| 481/481 [01:21<00:00,  5.94it/s]


Epoch 10/10, Train Loss: 0.6773, Train Accuracy: 0.5765


Testing: 100%|██████████| 121/121 [00:16<00:00,  7.15it/s]

Test Loss: 0.6889, Test Accuracy: 0.4794





---

- Learning rate: 0.005
- Activation function: ReLU
- Batch size: 16

In [None]:
epochs = 10
model = ConvNN('relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [01:15<00:00,  6.36it/s]


Epoch 1/10, Train Loss: 0.3710, Train Accuracy: 0.8321


Training Epoch 2/10: 100%|██████████| 481/481 [01:18<00:00,  6.13it/s]


Epoch 2/10, Train Loss: 0.3107, Train Accuracy: 0.8692


Training Epoch 3/10: 100%|██████████| 481/481 [01:17<00:00,  6.17it/s]


Epoch 3/10, Train Loss: 0.2897, Train Accuracy: 0.8765


Training Epoch 4/10: 100%|██████████| 481/481 [01:19<00:00,  6.08it/s]


Epoch 4/10, Train Loss: 0.2760, Train Accuracy: 0.8822


Training Epoch 5/10: 100%|██████████| 481/481 [01:19<00:00,  6.02it/s]


Epoch 5/10, Train Loss: 0.2593, Train Accuracy: 0.8930


Training Epoch 6/10: 100%|██████████| 481/481 [01:20<00:00,  5.98it/s]


Epoch 6/10, Train Loss: 0.2519, Train Accuracy: 0.8956


Training Epoch 7/10: 100%|██████████| 481/481 [01:20<00:00,  6.00it/s]


Epoch 7/10, Train Loss: 0.2371, Train Accuracy: 0.8995


Training Epoch 8/10: 100%|██████████| 481/481 [01:20<00:00,  5.98it/s]


Epoch 8/10, Train Loss: 0.2353, Train Accuracy: 0.9032


Training Epoch 9/10: 100%|██████████| 481/481 [01:20<00:00,  6.00it/s]


Epoch 9/10, Train Loss: 0.2253, Train Accuracy: 0.9110


Training Epoch 10/10: 100%|██████████| 481/481 [01:20<00:00,  6.00it/s]


Epoch 10/10, Train Loss: 0.2146, Train Accuracy: 0.9162


Testing: 100%|██████████| 121/121 [00:10<00:00, 11.84it/s]

Test Loss: 0.2736, Test Accuracy: 0.8771





---

- Learning rate: 0.005
- Activation function: Sigmoid
- Batch size: 32

In [None]:
epochs = 10
model = ConvNN('sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs, batch_size=32)

Training Epoch 1/10: 100%|██████████| 241/241 [11:01<00:00,  2.74s/it]


Epoch 1/10, Train Loss: 0.6951, Train Accuracy: 0.5069


Training Epoch 2/10: 100%|██████████| 241/241 [10:32<00:00,  2.62s/it]


Epoch 2/10, Train Loss: 0.6948, Train Accuracy: 0.5150


Training Epoch 3/10: 100%|██████████| 241/241 [10:27<00:00,  2.60s/it]


Epoch 3/10, Train Loss: 0.6959, Train Accuracy: 0.5017


Training Epoch 4/10: 100%|██████████| 241/241 [10:27<00:00,  2.60s/it]


Epoch 4/10, Train Loss: 0.6910, Train Accuracy: 0.5332


Training Epoch 5/10: 100%|██████████| 241/241 [10:28<00:00,  2.61s/it]


Epoch 5/10, Train Loss: 0.6898, Train Accuracy: 0.5349


Training Epoch 6/10: 100%|██████████| 241/241 [10:28<00:00,  2.61s/it]


Epoch 6/10, Train Loss: 0.6867, Train Accuracy: 0.5539


Training Epoch 7/10: 100%|██████████| 241/241 [10:26<00:00,  2.60s/it]


Epoch 7/10, Train Loss: 0.6810, Train Accuracy: 0.5740


Training Epoch 8/10: 100%|██████████| 241/241 [10:27<00:00,  2.60s/it]


Epoch 8/10, Train Loss: 0.6711, Train Accuracy: 0.6066


Training Epoch 9/10: 100%|██████████| 241/241 [10:28<00:00,  2.61s/it]


Epoch 9/10, Train Loss: 0.6671, Train Accuracy: 0.6201


Training Epoch 10/10: 100%|██████████| 241/241 [10:30<00:00,  2.62s/it]


Epoch 10/10, Train Loss: 0.6465, Train Accuracy: 0.6577


Testing: 100%|██████████| 61/61 [00:10<00:00,  5.87it/s]

Test Loss: 0.6224, Test Accuracy: 0.7303





---

- Learning rate: 0.005
- Activation function: ReLU
- Batch size: 32

In [11]:
epochs = 10
model = ConvNN('relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.005)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs, batch_size=32)

Training Epoch 1/10: 100%|██████████| 241/241 [24:09<00:00,  6.01s/it]


Epoch 1/10, Train Loss: 0.3693, Train Accuracy: 0.8384


Training Epoch 2/10: 100%|██████████| 241/241 [02:13<00:00,  1.80it/s]


Epoch 2/10, Train Loss: 0.3057, Train Accuracy: 0.8660


Training Epoch 3/10: 100%|██████████| 241/241 [02:14<00:00,  1.80it/s]


Epoch 3/10, Train Loss: 0.2865, Train Accuracy: 0.8792


Training Epoch 4/10: 100%|██████████| 241/241 [02:14<00:00,  1.79it/s]


Epoch 4/10, Train Loss: 0.2736, Train Accuracy: 0.8852


Training Epoch 5/10: 100%|██████████| 241/241 [02:14<00:00,  1.79it/s]


Epoch 5/10, Train Loss: 0.2620, Train Accuracy: 0.8920


Training Epoch 6/10: 100%|██████████| 241/241 [02:13<00:00,  1.80it/s]


Epoch 6/10, Train Loss: 0.2514, Train Accuracy: 0.8955


Training Epoch 7/10: 100%|██████████| 241/241 [02:14<00:00,  1.80it/s]


Epoch 7/10, Train Loss: 0.2433, Train Accuracy: 0.9001


Training Epoch 8/10: 100%|██████████| 241/241 [02:14<00:00,  1.80it/s]


Epoch 8/10, Train Loss: 0.2353, Train Accuracy: 0.9007


Training Epoch 9/10: 100%|██████████| 241/241 [02:13<00:00,  1.80it/s]


Epoch 9/10, Train Loss: 0.2304, Train Accuracy: 0.9011


Training Epoch 10/10: 100%|██████████| 241/241 [02:13<00:00,  1.81it/s]


Epoch 10/10, Train Loss: 0.2276, Train Accuracy: 0.9073


Testing: 100%|██████████| 61/61 [11:54<00:00, 11.71s/it]

Test Loss: 0.2317, Test Accuracy: 0.9073





---

- Learning rate: 0.02
- Activation function: Sigmoid
- Batch size: 16

In [12]:
epochs = 10
model = ConvNN('sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [02:52<00:00,  2.79it/s]


Epoch 1/10, Train Loss: 0.6985, Train Accuracy: 0.5027


Training Epoch 2/10: 100%|██████████| 481/481 [02:12<00:00,  3.62it/s]


Epoch 2/10, Train Loss: 0.6975, Train Accuracy: 0.5081


Training Epoch 3/10: 100%|██████████| 481/481 [02:11<00:00,  3.67it/s]


Epoch 3/10, Train Loss: 0.6978, Train Accuracy: 0.5043


Training Epoch 4/10: 100%|██████████| 481/481 [02:10<00:00,  3.68it/s]


Epoch 4/10, Train Loss: 0.6988, Train Accuracy: 0.4910


Training Epoch 5/10: 100%|██████████| 481/481 [02:10<00:00,  3.68it/s]


Epoch 5/10, Train Loss: 0.6954, Train Accuracy: 0.5168


Training Epoch 6/10: 100%|██████████| 481/481 [02:10<00:00,  3.69it/s]


Epoch 6/10, Train Loss: 0.6967, Train Accuracy: 0.5092


Training Epoch 7/10: 100%|██████████| 481/481 [02:11<00:00,  3.67it/s]


Epoch 7/10, Train Loss: 0.6957, Train Accuracy: 0.5010


Training Epoch 8/10: 100%|██████████| 481/481 [02:10<00:00,  3.68it/s]


Epoch 8/10, Train Loss: 0.6967, Train Accuracy: 0.5068


Training Epoch 9/10: 100%|██████████| 481/481 [02:10<00:00,  3.69it/s]


Epoch 9/10, Train Loss: 0.6963, Train Accuracy: 0.5087


Training Epoch 10/10: 100%|██████████| 481/481 [02:11<00:00,  3.67it/s]


Epoch 10/10, Train Loss: 0.6962, Train Accuracy: 0.5036


Testing: 100%|██████████| 121/121 [00:18<00:00,  6.47it/s]

Test Loss: 0.7115, Test Accuracy: 0.4794





---

- Learning rate: 0.02
- Activation function: ReLU
- Batch size: 16

In [13]:
epochs = 10
model = ConvNN('relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [02:41<00:00,  2.97it/s]


Epoch 1/10, Train Loss: 0.3824, Train Accuracy: 0.8377


Training Epoch 2/10: 100%|██████████| 481/481 [02:07<00:00,  3.76it/s]


Epoch 2/10, Train Loss: 0.3037, Train Accuracy: 0.8753


Training Epoch 3/10: 100%|██████████| 481/481 [02:08<00:00,  3.74it/s]


Epoch 3/10, Train Loss: 0.2776, Train Accuracy: 0.8827


Training Epoch 4/10: 100%|██████████| 481/481 [02:08<00:00,  3.74it/s]


Epoch 4/10, Train Loss: 0.2599, Train Accuracy: 0.8919


Training Epoch 5/10: 100%|██████████| 481/481 [02:08<00:00,  3.73it/s]


Epoch 5/10, Train Loss: 0.2456, Train Accuracy: 0.9007


Training Epoch 6/10: 100%|██████████| 481/481 [02:08<00:00,  3.74it/s]


Epoch 6/10, Train Loss: 0.2343, Train Accuracy: 0.9034


Training Epoch 7/10: 100%|██████████| 481/481 [02:09<00:00,  3.71it/s]


Epoch 7/10, Train Loss: 0.2192, Train Accuracy: 0.9146


Training Epoch 8/10: 100%|██████████| 481/481 [02:08<00:00,  3.73it/s]


Epoch 8/10, Train Loss: 0.2044, Train Accuracy: 0.9181


Training Epoch 9/10: 100%|██████████| 481/481 [02:08<00:00,  3.73it/s]


Epoch 9/10, Train Loss: 0.2011, Train Accuracy: 0.9181


Training Epoch 10/10: 100%|██████████| 481/481 [02:09<00:00,  3.71it/s]


Epoch 10/10, Train Loss: 0.1898, Train Accuracy: 0.9279


Testing: 100%|██████████| 121/121 [00:18<00:00,  6.55it/s]

Test Loss: 0.2291, Test Accuracy: 0.9089





---

- Learning rate: 0.02
- Activation function: Sigmoid
- Batch size: 32

In [14]:
epochs = 10
model = ConvNN('sigmoid').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs, batch_size=32)

Training Epoch 1/10: 100%|██████████| 241/241 [02:45<00:00,  1.45it/s]


Epoch 1/10, Train Loss: 0.7003, Train Accuracy: 0.5100


Training Epoch 2/10: 100%|██████████| 241/241 [02:10<00:00,  1.84it/s]


Epoch 2/10, Train Loss: 0.6962, Train Accuracy: 0.5047


Training Epoch 3/10: 100%|██████████| 241/241 [02:10<00:00,  1.84it/s]


Epoch 3/10, Train Loss: 0.6955, Train Accuracy: 0.5142


Training Epoch 4/10: 100%|██████████| 241/241 [02:11<00:00,  1.83it/s]


Epoch 4/10, Train Loss: 0.6965, Train Accuracy: 0.5020


Training Epoch 5/10: 100%|██████████| 241/241 [02:11<00:00,  1.84it/s]


Epoch 5/10, Train Loss: 0.6954, Train Accuracy: 0.5020


Training Epoch 6/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 6/10, Train Loss: 0.6944, Train Accuracy: 0.5142


Training Epoch 7/10: 100%|██████████| 241/241 [02:10<00:00,  1.84it/s]


Epoch 7/10, Train Loss: 0.6959, Train Accuracy: 0.5035


Training Epoch 8/10: 100%|██████████| 241/241 [02:12<00:00,  1.83it/s]


Epoch 8/10, Train Loss: 0.6953, Train Accuracy: 0.5092


Training Epoch 9/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 9/10, Train Loss: 0.6945, Train Accuracy: 0.5118


Training Epoch 10/10: 100%|██████████| 241/241 [02:12<00:00,  1.82it/s]


Epoch 10/10, Train Loss: 0.6938, Train Accuracy: 0.5172


Testing: 100%|██████████| 61/61 [00:20<00:00,  2.97it/s]

Test Loss: 0.7024, Test Accuracy: 0.5206
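
The loss in the Sigmoid run above stays near 0.69 while accuracy stays near 0.50. That is the value binary cross-entropy takes when a model outputs roughly 0.5 for every image, since -ln(0.5) ≈ 0.6931. Below is a minimal sketch of that arithmetic. The `bce` helper is hypothetical and written by hand for illustration; it is not the notebook's `nn.BCELoss`, and the sketch does not prove that this is what the model is doing.

```python
import math

def bce(p, y):
    """Binary cross-entropy for one prediction p in (0, 1) and label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A model that always predicts 0.5 pays ln(2), whatever the true label is.
chance_loss = bce(0.5, 1)
print(round(chance_loss, 4))  # 0.6931

assert math.isclose(chance_loss, bce(0.5, 0))
assert math.isclose(chance_loss, math.log(2))
```

A reported loss that barely moves from this value suggests the network has not learned to separate the two classes yet.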





---

- Learning rate: 0.02
- Activation function: ReLU
- Batch size: 32

In [15]:
epochs = 10
model = ConvNN('relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs, batch_size=32)

Training Epoch 1/10: 100%|██████████| 241/241 [02:42<00:00,  1.49it/s]


Epoch 1/10, Train Loss: 0.3853, Train Accuracy: 0.8365


Training Epoch 2/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 2/10, Train Loss: 0.2973, Train Accuracy: 0.8790


Training Epoch 3/10: 100%|██████████| 241/241 [02:11<00:00,  1.84it/s]


Epoch 3/10, Train Loss: 0.2760, Train Accuracy: 0.8825


Training Epoch 4/10: 100%|██████████| 241/241 [02:09<00:00,  1.85it/s]


Epoch 4/10, Train Loss: 0.2605, Train Accuracy: 0.8900


Training Epoch 5/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 5/10, Train Loss: 0.2478, Train Accuracy: 0.8988


Training Epoch 6/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 6/10, Train Loss: 0.2299, Train Accuracy: 0.9075


Training Epoch 7/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 7/10, Train Loss: 0.2297, Train Accuracy: 0.9076


Training Epoch 8/10: 100%|██████████| 241/241 [02:11<00:00,  1.84it/s]


Epoch 8/10, Train Loss: 0.2167, Train Accuracy: 0.9115


Training Epoch 9/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 9/10, Train Loss: 0.2050, Train Accuracy: 0.9166


Training Epoch 10/10: 100%|██████████| 241/241 [02:10<00:00,  1.85it/s]


Epoch 10/10, Train Loss: 0.2020, Train Accuracy: 0.9185


Testing: 100%|██████████| 61/61 [00:20<00:00,  3.00it/s]

Test Loss: 0.2481, Test Accuracy: 0.9011





---

### 2.3 My Convolutional Neural Network Experiment

I built a deeper and more complex CNN model based on the observations I made along the way. I wanted to try even deeper, more complex models than this one, but limited time and compute resources did not allow it.

In [17]:
class MyConvNN(nn.Module):
    def __init__(self, activation_func):
        super().__init__()
        # Three convolutional blocks, each followed by batch normalization
        self.conv1 = nn.Conv2d(3, 128, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(128)
        self.conv2 = nn.Conv2d(128, 256, 5, padding=2)
        self.bn2 = nn.BatchNorm2d(256)
        self.conv3 = nn.Conv2d(256, 256, 1)
        self.bn3 = nn.BatchNorm2d(256)
        # Adaptive pooling in forward() fixes the last feature map at 75x75
        self.fc1 = nn.Linear(256 * 75 * 75, 256)
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 1)
        self.activation_func = activation_func
        # Build the activation once instead of on every forward pass
        self.activation = activation_func_map[activation_func]()

    def forward(self, x):
        # Block 1: conv -> batch norm -> activation -> pool to 150x150
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.activation(x)
        x = F.adaptive_avg_pool2d(x, (150, 150))

        # Block 2: conv -> batch norm -> activation -> pool to 75x75
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.activation(x)
        x = F.adaptive_avg_pool2d(x, (75, 75))

        # Block 3: 1x1 conv mixes channels, spatial size stays 75x75
        x = self.conv3(x)
        x = self.bn3(x)
        x = self.activation(x)

        # Classifier head: dropout is applied before each activation
        x = x.flatten(start_dim=1)
        x = self.fc1(x)
        x = self.dropout(x)
        x = self.activation(x)

        x = self.fc2(x)
        x = self.dropout(x)
        x = self.activation(x)

        # Single sigmoid output, matching nn.BCELoss
        x = self.fc3(x)
        x = torch.sigmoid(x)

        return x
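
The `256 * 75 * 75` input size of `fc1` follows from the layer settings above. Below is a rough sketch that traces the spatial size through the network and counts parameters. The helpers `conv2d_out` and `conv2d_params` are hypothetical names introduced here, the 300x300 input is an assumption taken from the results table, and batch-norm parameters are left out for brevity.

```python
def conv2d_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def conv2d_params(c_in, c_out, kernel):
    """Kernel weights plus one bias per output channel."""
    return c_in * c_out * kernel * kernel + c_out

size = 300                              # assumed input resolution
size = conv2d_out(size, 3, padding=1)   # conv1: 3x3, padding 1 keeps 300
size = 150                              # adaptive_avg_pool2d to (150, 150)
size = conv2d_out(size, 5, padding=2)   # conv2: 5x5, padding 2 keeps 150
size = 75                               # adaptive_avg_pool2d to (75, 75)
size = conv2d_out(size, 1)              # conv3: 1x1 keeps 75

flat_features = 256 * size * size       # input features of fc1
fc1_params = flat_features * 256 + 256  # weights + biases of fc1
conv_params = (conv2d_params(3, 128, 3)
               + conv2d_params(128, 256, 5)
               + conv2d_params(256, 256, 1))

print(flat_features)  # 1440000
print(fc1_params)     # 368640256
print(conv_params)    # 888832
```

Because the adaptive pooling layers fix the output size, `fc1` receives 1,440,000 features regardless of the input resolution. Under these assumptions, almost all of the model's parameters sit in `fc1`, not in the convolutional layers.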

In [18]:
epochs = 10
model = MyConvNN('relu').to(DEVICE)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)
run_experiment(model, DEVICE, optimizer, criterion, augmentations=None, transformations=cnn_transforms, num_epochs=epochs)

Training Epoch 1/10: 100%|██████████| 481/481 [04:38<00:00,  1.73it/s]


Epoch 1/10, Train Loss: 0.3631, Train Accuracy: 0.8457


Training Epoch 2/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 2/10, Train Loss: 0.2949, Train Accuracy: 0.8745


Training Epoch 3/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 3/10, Train Loss: 0.2658, Train Accuracy: 0.8898


Training Epoch 4/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 4/10, Train Loss: 0.2458, Train Accuracy: 0.9015


Training Epoch 5/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 5/10, Train Loss: 0.2295, Train Accuracy: 0.9097


Training Epoch 6/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 6/10, Train Loss: 0.2249, Train Accuracy: 0.9109


Training Epoch 7/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 7/10, Train Loss: 0.2085, Train Accuracy: 0.9167


Training Epoch 8/10: 100%|██████████| 481/481 [03:57<00:00,  2.02it/s]


Epoch 8/10, Train Loss: 0.2031, Train Accuracy: 0.9223


Training Epoch 9/10: 100%|██████████| 481/481 [03:57<00:00,  2.03it/s]


Epoch 9/10, Train Loss: 0.1958, Train Accuracy: 0.9226


Training Epoch 10/10: 100%|██████████| 481/481 [03:57<00:00,  2.02it/s]


Epoch 10/10, Train Loss: 0.1843, Train Accuracy: 0.9252


Testing: 100%|██████████| 121/121 [00:21<00:00,  5.67it/s]

Test Loss: 0.2309, Test Accuracy: 0.9089





---

## 3- Results

| Experiment | Input Size | Model | Activation Function | Hidden Layers / Kernel Sizes | Learning Rate | Batch Size | Test Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 50*50 | MLP | Sigmoid | 128, 64 | 0.005 | 16 | 0.5659 |
| 2 | 50*50 | MLP | ReLU | 128, 64 | 0.005 | 16 | 0.8329 |
| 3 | 300*300 | MLP | Sigmoid | 128, 64 | 0.005 | 16 | 0.8142 |
| 4 | 300*300 | MLP | ReLU | 128, 64 | 0.005 | 16 | 0.8053 |
| 5 | 50*50 | MLP | Sigmoid | 128, 64 | 0.02 | 16 | 0.7616 |
| 6 | 50*50 | MLP | ReLU | 128, 64 | 0.02 | 16 | 0.5008 |
| 7 | 300*300 | MLP | Sigmoid | 128, 64 | 0.02 | 16 | 0.8048 |
| 8 | 300*300 | MLP | ReLU | 128, 64 | 0.02 | 16 | 0.8220 |
| 9 | 300*300 | CNN | Sigmoid | 5x5, 3x3 | 0.005 | 16 | 0.4794 |
| 10 | 300*300 | CNN | ReLU | 5x5, 3x3 | 0.005 | 16 | 0.8771 |
| 11 | 300*300 | CNN | Sigmoid | 5x5, 3x3 | 0.005 | 32 | 0.7303 |
| 12 | 300*300 | CNN | ReLU | 5x5, 3x3 | 0.005 | 32 | 0.9073 |
| 13 | 300*300 | CNN | Sigmoid | 5x5, 3x3 | 0.02 | 16 | 0.4794 |
| <span style="color: lightgreen;">14</span> | <span style="color: lightgreen;">300*300</span> | <span style="color: lightgreen;">CNN</span> | <span style="color: lightgreen;">ReLU</span> | <span style="color: lightgreen;">5x5, 3x3</span> | <span style="color: lightgreen;">0.02</span> | <span style="color: lightgreen;">16</span> | <span style="color: lightgreen;">0.9089</span> |
| 15 | 300*300 | CNN | Sigmoid | 5x5, 3x3 | 0.02 | 32 | 0.5206 |
| 16 | 300*300 | CNN | ReLU | 5x5, 3x3 | 0.02 | 32 | 0.9011 |
| <span style="color: lightgreen;">17</span> | <span style="color: lightgreen;">300*300</span> | <span style="color: lightgreen;">CNN</span> | <span style="color: lightgreen;">ReLU</span> | <span style="color: lightgreen;">3x3, 5x5, 1x1</span> | <span style="color: lightgreen;">0.001</span> | <span style="color: lightgreen;">16</span> | <span style="color: lightgreen;">0.9089</span> |

I actually have two winners here, tied at a test accuracy of 0.9089. It is not surprising that both use ReLU as their activation function, since Sigmoid does a poor job when used in the hidden layers. They are also both CNN models, which again is not surprising. I believe I could get a much better result by using more advanced augmentation and transformation techniques, as well as a better model, but due to time and resource limitations I end this assignment here.
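
One common explanation for Sigmoid's weak results in hidden layers is the vanishing gradient: the Sigmoid derivative never exceeds 0.25, so the gradient factor contributed by the activations shrinks with every layer, while ReLU passes a factor of 1 for positive inputs. Below is a minimal sketch of that bound. It ignores the weights entirely, `depth = 4` is an illustrative value and not the depth of any model above, and it is not a diagnosis of these specific runs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

max_sigmoid_grad = sigmoid_grad(0.0)       # the derivative peaks at z = 0
depth = 4                                  # illustrative number of stacked activations
sigmoid_bound = max_sigmoid_grad ** depth  # best case for the activation factors
relu_factor = relu_grad(1.0) ** depth      # active ReLU units do not shrink it

print(max_sigmoid_grad)  # 0.25
print(sigmoid_bound)     # 0.00390625
print(relu_factor)       # 1.0
```

Even in the best case, four stacked Sigmoid activations scale the gradient by at most about 0.4%, which slows down learning in the early layers.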