## LAB 2





# Training SimpleNN on CIFAR-10
In this project, you will use the SimpleNN model to perform image classification on CIFAR-10. CIFAR-10 originally contains 60K images from 10 categories. We split it into 45K/5K/10K images to serve as the train/validation/test sets. We release only the ground-truth labels of the training and validation sets to you.

In [1]:
# import necessary dependencies
import argparse
import os, sys
import time
import datetime
from tqdm.notebook import tqdm

import torch
import torch.nn as nn
import torch.nn.functional as F

In [2]:
# define the SimpleNN model
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, 5)
        self.conv2 = nn.Conv2d(8, 16, 3)
        self.fc1   = nn.Linear(16*6*6, 120)
        self.fc2   = nn.Linear(120, 84)
        self.fc3   = nn.Linear(84, 10)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out


Here is a sanity check to verify the implementation of SimpleNN.


In [3]:
#############################################
# your code here
# sanity check for the correctness of SimpleNN
dummy_input = torch.randn(1, 3, 32, 32)
model = SimpleNN()
output = model(dummy_input)


expected_output_shape = (1, 10)
if output.shape == expected_output_shape:
    print("Output shape check passed!")
else:
    print("Output shape check failed.")

def count_parameters(model):
    return sum(p.numel() for p in model.parameters())

total_parameters = count_parameters(model)

expected_parameters = (3 * 8 * 5 * 5 + 8) + (8 * 16 * 3 * 3 + 16) + (16 * 6 * 6 * 120 + 120) + (120 * 84 + 84) + (84 * 10 + 10)

if total_parameters == expected_parameters:
    print("Parameter count check passed!")
else:
    print("Parameter count check failed.")

#############################################

Output shape check passed!
Parameter count check passed!
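
The `16*6*6` input size of `fc1` follows from the layer arithmetic. As a quick cross-check, here is a minimal sketch in plain Python that traces the spatial size of a 32x32 input through the network and recomputes the parameter total by hand. The helper names `conv_out` and `pool_out` are illustrative and not part of the assignment code; the sketch assumes stride-1, unpadded convolutions and 2x2 max pooling, as in the model above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (floor division, as Conv2d does)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial size after non-overlapping max pooling (stride == kernel)."""
    return size // kernel


size = 32                  # CIFAR-10 images are 32x32
size = conv_out(size, 5)   # conv1, 5x5 kernel -> 28
size = pool_out(size, 2)   # first 2x2 max pool -> 14
size = conv_out(size, 3)   # conv2, 3x3 kernel -> 12
size = pool_out(size, 2)   # second 2x2 max pool -> 6
flat_features = 16 * size * size  # 16 output channels from conv2

# weights + biases for each layer
layer_params = {
    "conv1": 3 * 8 * 5 * 5 + 8,
    "conv2": 8 * 16 * 3 * 3 + 16,
    "fc1": flat_features * 120 + 120,
    "fc2": 120 * 84 + 84,
    "fc3": 84 * 10 + 10,
}
total_params = sum(layer_params.values())
print(size, flat_features, total_params)  # prints: 6 576 82030
```

Most of the parameters sit in `fc1`, which is typical for small convolutional networks with fully connected heads.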


## Step 1: Set up preprocessing functions
Preprocessing is very important, as discussed in the lecture.
In this step, you will write preprocessing functions with the help of *torchvision.transforms*.
A helpful tutorial and API reference is available [here](https://pytorch.org/vision/stable/transforms.html).
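
To make the normalization step concrete: `transforms.ToTensor()` scales pixel values to the range [0, 1], and `transforms.Normalize(mean, std)` then computes `(value - mean) / std` independently for each channel. The following framework-free sketch applies that formula to single RGB pixels. The helper `normalize_pixel` is a hypothetical name used only for illustration, and the statistics are the ones used in this notebook's transform cell.

```python
def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std independently to each channel."""
    return tuple((v - m) / s for v, m, s in zip(pixel, mean, std))


# Per-channel statistics used in this notebook's transform cell.
mean = (0.4914, 0.4822, 0.4465)
std = (0.2023, 0.1994, 0.2010)

# A pixel equal to the channel means maps to zero in every channel.
average_pixel = normalize_pixel(mean, mean, std)

# A pure white pixel (1.0 after ToTensor scaling) lands a few
# standard deviations above zero.
white_pixel = normalize_pixel((1.0, 1.0, 1.0), mean, std)

print(average_pixel)
print(white_pixel)
```

Centering the inputs around zero with roughly unit scale tends to make gradient-based training better conditioned.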

### Question (a)



In [4]:
# useful libraries
import torchvision
import torchvision.transforms as transforms

#############################################
# your code here
mean = (0.4914, 0.4822, 0.4465)
std = (0.2023, 0.1994, 0.2010)
# specify preprocessing function

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

transform_val = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])
#############################################
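
The two augmentation transforms in the training pipeline can be illustrated without torchvision. The sketch below is a simplified, single-channel approximation of what `RandomCrop(32, padding=4)` and `RandomHorizontalFlip()` do: zero-pad the image on every side, cut out a randomly positioned window of the original size, then mirror it with probability 0.5. All function names are hypothetical, and a tiny 4x4 image with `padding=1` stands in for a 32x32 image with `padding=4`.

```python
import random


def pad(image, padding, fill=0):
    """Surround a 2D list with `padding` rows/columns of `fill`."""
    width = len(image[0]) + 2 * padding
    border = [[fill] * width for _ in range(padding)]
    rows = [[fill] * padding + list(row) + [fill] * padding for row in image]
    return border + rows + [list(r) for r in border]


def crop(image, top, left, size):
    """Cut a size x size window whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def random_crop_with_padding(image, size, padding, rng):
    """Pad, then crop a randomly positioned size x size window."""
    padded = pad(image, padding)
    top = rng.randint(0, len(padded) - size)
    left = rng.randint(0, len(padded[0]) - size)
    return crop(padded, top, left, size)


def horizontal_flip(image):
    """Mirror every row left to right."""
    return [row[::-1] for row in image]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
rng = random.Random(0)  # seeded only to make the sketch repeatable

augmented = random_crop_with_padding(image, size=4, padding=1, rng=rng)
if rng.random() < 0.5:  # flip with probability 0.5
    augmented = horizontal_flip(augmented)

for row in augmented:
    print(row)
```

The output keeps the original size, so the network sees shifted and mirrored variants of each training image. The validation pipeline omits both steps so that evaluation is deterministic.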

## Step 2: Set up dataset and dataloader



In [5]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [6]:
import sys
sys.path.append('/content/drive/MyDrive/ECE661_HW2/tools')

In [7]:
# do NOT change these
from tools.dataset import CIFAR10
from torch.utils.data import DataLoader

# a few arguments, do NOT change these
DATA_ROOT = "./data"
TRAIN_BATCH_SIZE = 128
VAL_BATCH_SIZE = 100

#############################################
# your code here
# construct dataset
train_set = CIFAR10(
    root=DATA_ROOT,
    mode='train',
    download=True,
    transform=transform_train    # your code
)
val_set = CIFAR10(
    root=DATA_ROOT,
    mode='val',
    download=True,
    transform=transform_val    # your code
)

# construct dataloader
train_loader = DataLoader(
    train_set,
    batch_size=TRAIN_BATCH_SIZE,  # your code
    shuffle=True,     # your code
    num_workers=4
)
val_loader = DataLoader(
    val_set,
    batch_size=VAL_BATCH_SIZE,  # your code
    shuffle=False,     # your code
    num_workers=4
)
#############################################

Downloading https://www.dropbox.com/s/s8orza214q45b23/cifar10_trainval_F22.zip?dl=1 to ./data/cifar10_trainval_F22.zip


141746176it [00:02, 50724262.59it/s]                               


Extracting ./data/cifar10_trainval_F22.zip to ./data
Files already downloaded and verified
Using downloaded and verified file: ./data/cifar10_trainval_F22.zip
Extracting ./data/cifar10_trainval_F22.zip to ./data
Files already downloaded and verified
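
The training loop later divides the accumulated loss by `len(train_loader)`, so it helps to know how many batches each loader yields. A small sketch, assuming the 45K/5K split stated at the top of this notebook and the default `drop_last=False` behaviour of `DataLoader`:

```python
import math

TRAIN_SIZE, VAL_SIZE = 45_000, 5_000        # split sizes from the introduction
TRAIN_BATCH_SIZE, VAL_BATCH_SIZE = 128, 100

# With drop_last=False the final, smaller batch is kept, hence the ceiling.
train_batches = math.ceil(TRAIN_SIZE / TRAIN_BATCH_SIZE)
val_batches = math.ceil(VAL_SIZE / VAL_BATCH_SIZE)

# Size of the final training batch.
last_train_batch = TRAIN_SIZE - (train_batches - 1) * TRAIN_BATCH_SIZE

print(train_batches, val_batches, last_train_batch)  # prints: 352 50 72
```

Because the last training batch holds only 72 samples, averaging per-batch losses weights those samples slightly more than the rest. The effect is negligible here, but it is the reason the reported loss is a mean of batch means rather than an exact per-sample mean.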




## Step 3: Instantiate your SimpleNN model and deploy it to GPU devices.



In [8]:
# specify the device for computation
#############################################
# your code here
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using GPU")
else:
    device = torch.device("cpu")
    print("Using CPU")

model = SimpleNN()
model.to(device)
print(model)


#############################################

Using GPU
SimpleNN(
  (conv1): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


## Step 4: Set up the loss function and optimizer


In [9]:
import torch.nn as nn
import torch.optim as optim

# hyperparameters, do NOT change right now
# initial learning rate
INITIAL_LR = 0.01

# momentum for optimizer
MOMENTUM = 0.9

# L2 regularization strength
REG = 1e-4

#############################################
# your code here
# create loss function
criterion = nn.CrossEntropyLoss()

# Add optimizer
optimizer = optim.SGD(model.parameters(), lr=INITIAL_LR, momentum=MOMENTUM, weight_decay=REG)
#############################################
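
To show what the three optimizer hyperparameters do, here is a scalar sketch of one SGD update with momentum and L2 weight decay. It follows the update rule documented for `torch.optim.SGD` with default `dampening=0` and `nesterov=False`; the function name `sgd_momentum_step` and the constant gradient value are assumptions made purely for illustration.

```python
def sgd_momentum_step(param, grad, velocity, lr, momentum, weight_decay):
    """One scalar SGD step with momentum and L2 weight decay."""
    # Weight decay adds the gradient of (weight_decay / 2) * param**2.
    grad = grad + weight_decay * param
    # The momentum buffer starts as the first gradient, then accumulates.
    velocity = grad if velocity is None else momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


param, velocity = 1.0, None
for step in range(3):
    grad = 0.5  # hypothetical constant gradient, for illustration only
    param, velocity = sgd_momentum_step(
        param, grad, velocity, lr=0.01, momentum=0.9, weight_decay=1e-4
    )
    print(f"step {step}: param={param:.6f}, velocity={velocity:.6f}")
```

With a steady gradient the velocity grows from step to step, which is how momentum speeds up progress along consistent directions.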

## Step 5: Start the training process.




In [10]:
# some hyperparameters
# total number of training epochs
EPOCHS = 30

# the folder where the trained model is saved
CHECKPOINT_FOLDER = "./saved_model"

# start the training/validation process
# the process should take about 5 minutes on a GTX 1070-Ti
# if the code is written efficiently.
best_val_acc = 0
current_learning_rate = INITIAL_LR

print("==> Training starts!")
print("="*50)
for i in range(0, EPOCHS):
    #######################
    # your code here
    # switch to train mode
    model.train()


    #######################

    print("Epoch %d:" %i)
    # this helps you compute the training accuracy
    total_examples = 0
    correct_examples = 0

    train_loss = 0
    val_loss = 0

    # Train the model for 1 epoch.
    for batch_idx, (inputs, targets) in enumerate(train_loader):
        ####################################
        # your code here
        # copy inputs to device
        inputs, targets = inputs.to(device), targets.to(device)

        # compute the output and loss
        output = model(inputs)
        loss = criterion(output, targets)

        # zero the gradient
        optimizer.zero_grad()

        # backpropagation
        loss.backward()

        # apply gradient and update the weights
        optimizer.step()

        # count the number of correctly predicted samples in the current batch
        _, predicted = output.max(1)
        total_examples += targets.size(0)
        correct_examples += predicted.eq(targets).sum().item()
        train_loss += loss.item()

        ####################################

    avg_loss = train_loss / len(train_loader)
    avg_acc = correct_examples / total_examples
    print("Training loss: %.4f, Training accuracy: %.4f" %(avg_loss, avg_acc))

    # Validate on the validation dataset
    #######################
    # your code here
    # switch to eval mode
    model.eval()


    #######################

    # this helps you compute the validation accuracy
    total_examples = 0
    correct_examples = 0

    val_loss = 0  # again, track the validation loss if you want

    # disable gradient during validation, which can save GPU memory
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(val_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model(inputs)
            loss = criterion(output, targets)


            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            val_loss += loss.item()

            ####################################

    avg_loss = val_loss / len(val_loader)
    avg_acc = correct_examples / total_examples
    print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

    # save the model checkpoint
    if avg_acc > best_val_acc:
        best_val_acc = avg_acc
        #if not os.path.exists(CHECKPOINT_FOLDER):
        #    os.makedirs(CHECKPOINT_FOLDER)
        #print("Saving ...")
        #state = {'state_dict': model.state_dict(),
        #         'epoch': i,
        #         'lr': current_learning_rate}
        #torch.save(state, os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth'))

    print('')

print("="*50)
print(f"==> Optimization finished! Best validation accuracy: {best_val_acc:.4f}")

==> Training starts!
Epoch 0:




Training loss: 1.9280, Training accuracy: 0.2856
Validation loss: 1.6041, Validation accuracy: 0.4048

Epoch 1:
Training loss: 1.5765, Training accuracy: 0.4195
Validation loss: 1.4467, Validation accuracy: 0.4816

Epoch 2:
Training loss: 1.4667, Training accuracy: 0.4658
Validation loss: 1.3632, Validation accuracy: 0.5082

Epoch 3:
Training loss: 1.3723, Training accuracy: 0.5030
Validation loss: 1.2984, Validation accuracy: 0.5390

Epoch 4:
Training loss: 1.3135, Training accuracy: 0.5288
Validation loss: 1.1896, Validation accuracy: 0.5730

Epoch 5:
Training loss: 1.2667, Training accuracy: 0.5464
Validation loss: 1.1532, Validation accuracy: 0.5992

Epoch 6:
Training loss: 1.2133, Training accuracy: 0.5692
Validation loss: 1.1385, Validation accuracy: 0.5998

Epoch 7:
Training loss: 1.1811, Training accuracy: 0.5795
Validation loss: 1.0781, Validation accuracy: 0.6214

Epoch 8:
Training loss: 1.1484, Training accuracy: 0.5907
Validation loss: 1.1199, Validation accuracy: 0.6080

E
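
The two numbers printed per epoch come from simple bookkeeping. The sketch below recomputes them in plain Python for a tiny hand-made batch: mean cross-entropy from raw logits (what `nn.CrossEntropyLoss` computes with its default mean reduction) and accuracy from the arg-max class (what `output.max(1)` followed by `eq` and `sum` computes). The logits and targets are invented for illustration.

```python
import math


def cross_entropy(logits, targets):
    """Mean of -log softmax(logits)[target] over the batch."""
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_sum_exp - row[target]
    return total / len(targets)


def accuracy(logits, targets):
    """Fraction of rows whose largest logit sits at the target index."""
    correct = sum(
        1 for row, target in zip(logits, targets)
        if row.index(max(row)) == target
    )
    return correct / len(targets)


# Hypothetical batch: 2 samples, 3 classes.
logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
targets = [0, 2]

print(f"loss={cross_entropy(logits, targets):.4f}")
print(f"accuracy={accuracy(logits, targets):.2f}")

# A classifier that outputs identical logits for 10 classes has loss ln(10).
print(f"uniform 10-class loss={cross_entropy([[0.0] * 10], [3]):.4f}")
```

The last line gives a useful reference point: a 10-class model that predicts uniformly has a loss of about 2.30, so losses well below that indicate the model has started to learn.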

## Question (b)

**Training the model with Batch Normalisation and Learning rate=0.01**

In [11]:
# define the SimpleNN model with Batch Normalization
class SimpleNNWithBN(nn.Module):
    def __init__(self):
        super(SimpleNNWithBN, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, 5)
        self.bn1 = nn.BatchNorm2d(8)
        self.conv2 = nn.Conv2d(8, 16, 3)
        self.bn2 = nn.BatchNorm2d(16)
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.bn2(self.conv2(out)))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out
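
As an illustration of what each `BatchNorm2d` layer computes in training mode, the following sketch normalizes the activations of a single channel in plain Python. In the real layer the statistics are taken per channel over the batch and both spatial dimensions; here a flat list of four made-up values stands in for those activations, and `gamma` and `beta` play the role of the layer's learnable scale and shift.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    # Training-mode normalization uses the biased (divide-by-n) variance.
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


channel_activations = [2.0, 4.0, 6.0, 8.0]  # hypothetical values
normalized = batch_norm(channel_activations)
print([round(v, 4) for v in normalized])
```

Each BN layer adds one scale and one shift per channel, so `bn1` and `bn2` together contribute 2 * (8 + 16) = 48 learnable parameters on top of the original SimpleNN.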

In [12]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using the GPU")
else:
    device = torch.device("cpu")
    print("Using the CPU")

model_2 = SimpleNNWithBN()
model_2.to(device)
print(model_2)

Using the GPU
SimpleNNWithBN(
  (conv1): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
  (bn1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv2): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
  (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
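
The `momentum=0.1` and `track_running_stats=True` entries in the printout above control the running statistics that BN uses in evaluation mode. A minimal sketch of that exponential moving average, with made-up batch means and a hypothetical helper name `update_running`:

```python
def update_running(running, batch_stat, momentum=0.1):
    """Exponential moving average used for BN running statistics."""
    return (1 - momentum) * running + momentum * batch_stat


running_mean = 0.0  # running means are initialized to zero
for batch_mean in [2.0, 2.0, 2.0]:  # hypothetical per-batch means
    running_mean = update_running(running_mean, batch_mean)
    print(f"{running_mean:.3f}")
```

This is why switching between `model.train()` and `model.eval()` matters for the BN models: in training mode the layer normalizes with the current batch statistics and updates the running averages, while in evaluation mode it normalizes with the stored averages.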


In [13]:
INITIAL_LR = 0.01
MOMENTUM = 0.9
REG = 1e-4
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_2.parameters(), lr=INITIAL_LR, momentum=MOMENTUM, weight_decay=REG)


In [14]:

EPOCHS = 30
CHECKPOINT_FOLDER = "/content/drive/MyDrive/ECE661_HW2/"
best_val_acc = 0
current_learning_rate = INITIAL_LR

print("==> Training starts!")
print("="*50)
for i in range(0, EPOCHS):

    model_2.train()
    print("Epoch %d:" %i)

    total_examples = 0
    correct_examples = 0

    train_loss = 0
    val_loss = 0

    # Train the model for 1 epoch.
    for batch_idx, (inputs, targets) in enumerate(train_loader):
        ####################################
        # your code here
        # copy inputs to device
        inputs, targets = inputs.to(device), targets.to(device)

        # compute the output and loss
        output = model_2(inputs)
        loss = criterion(output, targets)

        # zero the gradient
        optimizer.zero_grad()

        # backpropagation
        loss.backward()

        # apply gradient and update the weights
        optimizer.step()

        # count the number of correctly predicted samples in the current batch
        _, predicted = output.max(1)
        total_examples += targets.size(0)
        correct_examples += predicted.eq(targets).sum().item()
        train_loss += loss.item()

        ####################################

    avg_loss = train_loss / len(train_loader)
    avg_acc = correct_examples / total_examples
    print("Training loss: %.4f, Training accuracy: %.4f" %(avg_loss, avg_acc))

    # Validate on the validation dataset
    #######################
    # your code here
    # switch to eval mode
    model_2.eval()


    #######################

    # this helps you compute the validation accuracy
    total_examples = 0
    correct_examples = 0

    val_loss = 0  # again, track the validation loss if you want

    # disable gradient during validation, which can save GPU memory
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(val_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model_2(inputs)
            loss = criterion(output, targets)


            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            val_loss += loss.item()

            ####################################

    avg_loss = val_loss / len(val_loader)
    avg_acc = correct_examples / total_examples
    print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

    # save the model checkpoint
    if avg_acc > best_val_acc:
        best_val_acc = avg_acc
        print("Saving model checkpoint...")
        checkpoint_path = os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth')
        torch.save({
            'epoch': i,
            'model_state_dict': model_2.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'best_val_acc': best_val_acc,
        }, checkpoint_path)

    print('')

print("="*50)
print(f"==> Optimization finished! Best validation accuracy: {best_val_acc:.4f}")

==> Training starts!
Epoch 0:




Training loss: 1.7301, Training accuracy: 0.3536
Validation loss: 1.4481, Validation accuracy: 0.4652
Saving model checkpoint...

Epoch 1:
Training loss: 1.4391, Training accuracy: 0.4729
Validation loss: 1.2930, Validation accuracy: 0.5262
Saving model checkpoint...

Epoch 2:
Training loss: 1.3313, Training accuracy: 0.5185
Validation loss: 1.2311, Validation accuracy: 0.5584
Saving model checkpoint...

Epoch 3:
Training loss: 1.2522, Training accuracy: 0.5512
Validation loss: 1.1493, Validation accuracy: 0.5956
Saving model checkpoint...

Epoch 4:
Training loss: 1.1964, Training accuracy: 0.5721
Validation loss: 1.1346, Validation accuracy: 0.5938

Epoch 5:
Training loss: 1.1511, Training accuracy: 0.5876
Validation loss: 1.0575, Validation accuracy: 0.6214
Saving model checkpoint...

Epoch 6:
Training loss: 1.1178, Training accuracy: 0.6001
Validation loss: 1.0044, Validation accuracy: 0.6350
Saving model checkpoint...

Epoch 7:
Training loss: 1.0887, Training accuracy: 0.6115
Valid
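
The "Saving model checkpoint..." lines in the log follow directly from the `avg_acc > best_val_acc` test. The sketch below replays that logic on the validation accuracies printed above for epochs 0 to 6; no model or file I/O is involved.

```python
# Validation accuracies copied from the log above (epochs 0 to 6).
val_accuracies = [0.4652, 0.5262, 0.5584, 0.5956, 0.5938, 0.6214, 0.6350]

best_val_acc = 0.0
saved_epochs = []
for epoch, acc in enumerate(val_accuracies):
    # A checkpoint is written only on a strict improvement.
    if acc > best_val_acc:
        best_val_acc = acc
        saved_epochs.append(epoch)

print(saved_epochs, best_val_acc)  # prints: [0, 1, 2, 3, 5, 6] 0.635
```

Epoch 4 is skipped because its accuracy (0.5938) is below the best value seen so far (0.5956), which matches the missing "Saving" line in the log.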

**Training the model without Batch Normalisation and Learning rate=0.1**

In [15]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using the GPU")
else:
    device = torch.device("cpu")
    print("Using the CPU")

model_3 = SimpleNN()
model_3.to(device)
print(model_3)

Using the GPU
SimpleNN(
  (conv1): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


In [16]:
import torch.nn as nn
import torch.optim as optim
INITIAL_LR = 0.1
MOMENTUM = 0.9
REG = 1e-4
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_3.parameters(), lr=INITIAL_LR, momentum=MOMENTUM, weight_decay=REG)


In [17]:
EPOCHS = 30
CHECKPOINT_FOLDER = "/content/drive/MyDrive/ECE661_HW2/"
best_val_acc = 0
current_learning_rate = INITIAL_LR

print("==> Training starts!")
print("="*50)
for i in range(0, EPOCHS):

    model_3.train()
    print("Epoch %d:" %i)

    total_examples = 0
    correct_examples = 0

    train_loss = 0
    val_loss = 0

    # Train the model for 1 epoch.
    for batch_idx, (inputs, targets) in enumerate(train_loader):
        ####################################
        # your code here
        # copy inputs to device
        inputs, targets = inputs.to(device), targets.to(device)

        # compute the output and loss
        output = model_3(inputs)
        loss = criterion(output, targets)

        # zero the gradient
        optimizer.zero_grad()

        # backpropagation
        loss.backward()

        # apply gradient and update the weights
        optimizer.step()

        # count the number of correctly predicted samples in the current batch
        _, predicted = output.max(1)
        total_examples += targets.size(0)
        correct_examples += predicted.eq(targets).sum().item()
        train_loss += loss.item()

        ####################################

    avg_loss = train_loss / len(train_loader)
    avg_acc = correct_examples / total_examples
    print("Training loss: %.4f, Training accuracy: %.4f" %(avg_loss, avg_acc))

    # Validate on the validation dataset
    #######################
    # your code here
    # switch to eval mode
    model_3.eval()


    #######################

    # this helps you compute the validation accuracy
    total_examples = 0
    correct_examples = 0

    val_loss = 0  # again, track the validation loss if you want

    # disable gradient during validation, which can save GPU memory
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(val_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model_3(inputs)
            loss = criterion(output, targets)


            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            val_loss += loss.item()

            ####################################

    avg_loss = val_loss / len(val_loader)
    avg_acc = correct_examples / total_examples
    print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

    # save the model checkpoint
    if avg_acc > best_val_acc:
        best_val_acc = avg_acc
        print("Saving model checkpoint...")
        checkpoint_path = os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth')
        torch.save({
            'epoch': i,
            'model_state_dict': model_3.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'best_val_acc': best_val_acc,
        }, checkpoint_path)

    print('')

print("="*50)
print(f"==> Optimization finished! Best validation accuracy: {best_val_acc:.4f}")

==> Training starts!
Epoch 0:




Training loss: 1.9152, Training accuracy: 0.2870
Validation loss: 1.7129, Validation accuracy: 0.3778
Saving model checkpoint...

Epoch 1:
Training loss: 1.7447, Training accuracy: 0.3626
Validation loss: 1.6381, Validation accuracy: 0.3900
Saving model checkpoint...

Epoch 2:
Training loss: 1.6667, Training accuracy: 0.3894
Validation loss: 1.6208, Validation accuracy: 0.4272
Saving model checkpoint...

Epoch 3:
Training loss: 1.6407, Training accuracy: 0.4082
Validation loss: 1.6214, Validation accuracy: 0.4326
Saving model checkpoint...

Epoch 4:
Training loss: 1.6325, Training accuracy: 0.4114
Validation loss: 1.5524, Validation accuracy: 0.4316

Epoch 5:
Training loss: 1.6442, Training accuracy: 0.4127
Validation loss: 1.6020, Validation accuracy: 0.4362
Saving model checkpoint...

Epoch 6:
Training loss: 1.6187, Training accuracy: 0.4197
Validation loss: 1.5319, Validation accuracy: 0.4576
Saving model checkpoint...

Epoch 7:
Training loss: 1.6099, Training accuracy: 0.4222
Valid

**Training the model with Batch Normalisation and Learning rate=0.1**

In [18]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using the GPU")
else:
    device = torch.device("cpu")
    print("Using the CPU")

model_4 = SimpleNNWithBN()
model_4.to(device)
print(model_4)

Using the GPU
SimpleNNWithBN(
  (conv1): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
  (bn1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv2): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
  (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


In [19]:
import torch.nn as nn
import torch.optim as optim
INITIAL_LR = 0.1
MOMENTUM = 0.9
REG = 1e-4
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_4.parameters(), lr=INITIAL_LR, momentum=MOMENTUM, weight_decay=REG)

In [20]:
EPOCHS = 30
CHECKPOINT_FOLDER = "/content/drive/MyDrive/ECE661_HW2/"
best_val_acc = 0
current_learning_rate = INITIAL_LR

print("==> Training starts!")
print("="*50)
for i in range(0, EPOCHS):

    model_4.train()
    print("Epoch %d:" %i)

    total_examples = 0
    correct_examples = 0

    train_loss = 0
    val_loss = 0

    # Train the model for 1 epoch.
    for batch_idx, (inputs, targets) in enumerate(train_loader):
        ####################################
        # your code here
        # copy inputs to device
        inputs, targets = inputs.to(device), targets.to(device)

        # compute the output and loss
        output = model_4(inputs)
        loss = criterion(output, targets)

        # zero the gradient
        optimizer.zero_grad()

        # backpropagation
        loss.backward()

        # apply gradient and update the weights
        optimizer.step()

        # count the number of correctly predicted samples in the current batch
        _, predicted = output.max(1)
        total_examples += targets.size(0)
        correct_examples += predicted.eq(targets).sum().item()
        train_loss += loss.item()

        ####################################

    avg_loss = train_loss / len(train_loader)
    avg_acc = correct_examples / total_examples
    print("Training loss: %.4f, Training accuracy: %.4f" %(avg_loss, avg_acc))

    # Validate on the validation dataset
    #######################
    # your code here
    # switch to eval mode
    model_4.eval()


    #######################

    # this helps you compute the validation accuracy
    total_examples = 0
    correct_examples = 0

    val_loss = 0  # again, track the validation loss if you want

    # disable gradient during validation, which can save GPU memory
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(val_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model_4(inputs)
            loss = criterion(output, targets)


            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            val_loss += loss.item()

            ####################################

    avg_loss = val_loss / len(val_loader)
    avg_acc = correct_examples / total_examples
    print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

    # save the model checkpoint
    if avg_acc > best_val_acc:
        best_val_acc = avg_acc
        print("Saving model checkpoint...")
        checkpoint_path = os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth')
        torch.save({
            'epoch': i,
            'model_state_dict': model_4.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'best_val_acc': best_val_acc,
        }, checkpoint_path)

    print('')

print("="*50)
print(f"==> Optimization finished! Best validation accuracy: {best_val_acc:.4f}")

==> Training starts!
Epoch 0:




Training loss: 1.7229, Training accuracy: 0.3630
Validation loss: 1.6008, Validation accuracy: 0.4332
Saving model checkpoint...

Epoch 1:
Training loss: 1.5067, Training accuracy: 0.4529
Validation loss: 1.3677, Validation accuracy: 0.5164
Saving model checkpoint...

Epoch 2:
Training loss: 1.4087, Training accuracy: 0.4907
Validation loss: 1.2777, Validation accuracy: 0.5428
Saving model checkpoint...

Epoch 3:
Training loss: 1.3379, Training accuracy: 0.5214
Validation loss: 1.2672, Validation accuracy: 0.5504
Saving model checkpoint...

Epoch 4:
Training loss: 1.2980, Training accuracy: 0.5385
Validation loss: 1.2109, Validation accuracy: 0.5656
Saving model checkpoint...

Epoch 5:
Training loss: 1.2485, Training accuracy: 0.5586
Validation loss: 1.1841, Validation accuracy: 0.5850
Saving model checkpoint...

Epoch 6:
Training loss: 1.2243, Training accuracy: 0.5658
Validation loss: 1.1559, Validation accuracy: 0.5990
Saving model checkpoint...

Epoch 7:
Training loss: 1.1941, Trai

**Training the model with Swish**

In [21]:

class Swish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)


class SimpleNNWithBNSwish(nn.Module):
    def __init__(self):
        super(SimpleNNWithBNSwish, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, 5)
        self.bn1 = nn.BatchNorm2d(8)
        self.conv2 = nn.Conv2d(8, 16, 3)
        self.bn2 = nn.BatchNorm2d(16)
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.swish = Swish()

    def forward(self, x):
        out = self.swish(self.bn1(self.conv1(x)))
        out = F.max_pool2d(out, 2)
        out = self.swish(self.bn2(self.conv2(out)))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = self.swish(self.fc1(out))
        out = self.swish(self.fc2(out))
        out = self.fc3(out)
        return out
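
To see how Swish differs from ReLU, the sketch below evaluates both on a few scalar inputs in plain Python. It uses the same definition as the `Swish` module above, `x * sigmoid(x)`, written as `x / (1 + exp(-x))`. PyTorch also ships this activation as `nn.SiLU`, which could replace the custom module.

```python
import math


def swish(x):
    """x * sigmoid(x), written as x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


def relu(x):
    """max(0, x)."""
    return max(0.0, x)


for x in (-5.0, -1.278, 0.0, 1.0, 5.0):
    print(f"x={x:+.3f}  relu={relu(x):.4f}  swish={swish(x):+.4f}")
```

Unlike ReLU, Swish is smooth and lets small negative values through: it dips to roughly -0.28 near x = -1.28 before approaching zero for very negative inputs, and it approaches the identity for large positive inputs.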


In [22]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using the GPU")
else:
    device = torch.device("cpu")
    print("Using the CPU")

model_5 = SimpleNNWithBNSwish()
model_5.to(device)
print(model_5)

Using the GPU
SimpleNNWithBNSwish(
  (conv1): Conv2d(3, 8, kernel_size=(5, 5), stride=(1, 1))
  (bn1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv2): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
  (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
  (swish): Swish()
)


In [23]:
INITIAL_LR = 0.1
MOMENTUM = 0.9
REG = 1e-4
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_5.parameters(), lr=INITIAL_LR, momentum=MOMENTUM, weight_decay=REG)

In [24]:
EPOCHS = 30
CHECKPOINT_FOLDER = "/content/drive/MyDrive/ECE661_HW2/"
best_val_acc = 0
current_learning_rate = INITIAL_LR

print("==> Training starts!")
print("="*50)
for i in range(0, EPOCHS):

    model_5.train()
    print("Epoch %d:" %i)

    total_examples = 0
    correct_examples = 0

    train_loss = 0
    val_loss = 0

    # Train the model for 1 epoch.
    for batch_idx, (inputs, targets) in enumerate(train_loader):
        ####################################
        # your code here
        # copy inputs to device
        inputs, targets = inputs.to(device), targets.to(device)

        # compute the output and loss
        output = model_5(inputs)
        loss = criterion(output, targets)

        # zero the gradient
        optimizer.zero_grad()

        # backpropagation
        loss.backward()

        # apply gradient and update the weights
        optimizer.step()

        # count the number of correctly predicted samples in the current batch
        _, predicted = output.max(1)
        total_examples += targets.size(0)
        correct_examples += predicted.eq(targets).sum().item()
        train_loss += loss.item()

        ####################################

    avg_loss = train_loss / len(train_loader)
    avg_acc = correct_examples / total_examples
    print("Training loss: %.4f, Training accuracy: %.4f" %(avg_loss, avg_acc))

    # Validate on the validation dataset
    #######################
    # your code here
    # switch to eval mode
    model_5.eval()


    #######################

    # this helps you compute the validation accuracy
    total_examples = 0
    correct_examples = 0

    val_loss = 0  # again, track the validation loss if you want

    # disable gradient during validation, which can save GPU memory
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(val_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model_5(inputs)
            loss = criterion(output, targets)


            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            val_loss += loss.item()

            ####################################

    avg_loss = val_loss / len(val_loader)
    avg_acc = correct_examples / total_examples
    print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

    # save the model checkpoint
    if avg_acc > best_val_acc:
        best_val_acc = avg_acc
        print("Saving model checkpoint...")
        checkpoint_path = os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth')
        torch.save({
            'epoch': i,
            'model_state_dict': model_5.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'best_val_acc': best_val_acc,
        }, checkpoint_path)

    print('')

print("="*50)
print(f"==> Optimization finished! Best validation accuracy: {best_val_acc:.4f}")

==> Training starts!
Epoch 0:




Training loss: 1.6315, Training accuracy: 0.3952
Validation loss: 1.3093, Validation accuracy: 0.5262
Saving model checkpoint...

Epoch 1:
Training loss: 1.3303, Training accuracy: 0.5191
Validation loss: 1.3057, Validation accuracy: 0.5422
Saving model checkpoint...

Epoch 2:
Training loss: 1.2057, Training accuracy: 0.5672
Validation loss: 1.1096, Validation accuracy: 0.6092
Saving model checkpoint...

Epoch 3:
Training loss: 1.1270, Training accuracy: 0.6007
Validation loss: 1.0414, Validation accuracy: 0.6286
Saving model checkpoint...

Epoch 4:
Training loss: 1.0763, Training accuracy: 0.6207
Validation loss: 1.0256, Validation accuracy: 0.6414
Saving model checkpoint...

Epoch 5:
Training loss: 1.0407, Training accuracy: 0.6322
Validation loss: 1.0357, Validation accuracy: 0.6392

Epoch 6:
Training loss: 1.0231, Training accuracy: 0.6376
Validation loss: 0.9806, Validation accuracy: 0.6568
Saving model checkpoint...

Epoch 7:
Training loss: 0.9938, Training accuracy: 0.6512
Valid
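
The checkpoint written by the loop above can be read back later for evaluation. The following is a minimal sketch, assuming a tiny stand-in `nn.Linear` model and a temporary folder in place of `CHECKPOINT_FOLDER` so that it runs on its own; the epoch number and accuracy are placeholder values, not results from this run.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

# Tiny stand-in model (an assumption) so the sketch runs without the lab's model or data.
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

with tempfile.TemporaryDirectory() as folder:
    checkpoint_path = os.path.join(folder, 'simplenn.pth')

    # Save with the same dictionary keys as the training loop above.
    torch.save({
        'epoch': 3,            # placeholder value
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'best_val_acc': 0.5,   # placeholder value
    }, checkpoint_path)

    # map_location lets a checkpoint written on a GPU be read on a CPU-only machine.
    checkpoint = torch.load(checkpoint_path, map_location='cpu')

# Restore the weights into a freshly constructed model of the same architecture.
restored = nn.Linear(4, 2)
restored.load_state_dict(checkpoint['model_state_dict'])
restored.eval()  # switch to eval mode before running validation or test data

print("Restored checkpoint from epoch", checkpoint['epoch'])
```

In the lab itself, the stand-in model would be replaced by the model class that was trained, and `checkpoint_path` by the path used when saving.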

In [25]:
import torch.nn as nn
import torch.optim as optim

In [26]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using the GPU")
else:
    device = torch.device("cpu")
    print("Using CPU")

Using the GPU


In [27]:
learning_rates = [1.0, 0.1, 0.05, 0.01, 0.005, 0.001]
MOMENTUM = 0.9
REG = 1e-4
criterion = nn.CrossEntropyLoss()

In [28]:
EPOCHS = 30
CHECKPOINT_FOLDER = "/content/drive/MyDrive/ECE661_HW2/"
best_val_acc = 0


print("==> Training starts!")
print("="*50)

for lr in learning_rates:
    print("\n" * 2)
    print(f"Training with learning rate: {lr}")
    # start from a freshly initialized model for every learning rate
    model_6 = SimpleNNWithBNSwish()
    model_6.to(device)

    optimizer = optim.SGD(model_6.parameters(), lr=lr, momentum=MOMENTUM, weight_decay=REG)
    for i in range(0, EPOCHS):
        model_6.train()
        print("Epoch %d:" % i)

        total_examples = 0
        correct_examples = 0

        train_loss = 0

        # Train the model for 1 epoch.
        for batch_idx, (inputs, targets) in enumerate(train_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model_6(inputs)
            loss = criterion(output, targets)

            # zero the gradient
            optimizer.zero_grad()

            # backpropagation
            loss.backward()

            # apply gradient and update the weights
            optimizer.step()

            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            train_loss += loss.item()
            ####################################

        avg_loss = train_loss / len(train_loader)
        avg_acc = correct_examples / total_examples
        print("Training loss: %.4f, Training accuracy: %.4f" % (avg_loss, avg_acc))

        # Validate on the validation dataset
        #######################
        # your code here
        # switch to eval mode
        model_6.eval()
        #######################

        # these counters help you compute the validation accuracy
        total_examples = 0
        correct_examples = 0

        val_loss = 0  # again, track the validation loss if you want

        # disable gradient during validation, which can save GPU memory
        with torch.no_grad():
            for batch_idx, (inputs, targets) in enumerate(val_loader):
                ####################################
                # your code here
                # copy inputs to device
                inputs, targets = inputs.to(device), targets.to(device)

                # compute the output and loss
                output = model_6(inputs)
                loss = criterion(output, targets)

                # count the number of correctly predicted samples in the current batch
                _, predicted = output.max(1)
                total_examples += targets.size(0)
                correct_examples += predicted.eq(targets).sum().item()
                val_loss += loss.item()
                ####################################

        avg_loss = val_loss / len(val_loader)
        avg_acc = correct_examples / total_examples
        print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

        # save the model checkpoint
        # note: best_val_acc is shared across all learning rates, so the file
        # holds the best model seen so far in the whole sweep
        if avg_acc > best_val_acc:
            best_val_acc = avg_acc
            print("Saving model checkpoint...")
            checkpoint_path = os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth')
            torch.save({
                'epoch': i,
                'model_state_dict': model_6.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'best_val_acc': best_val_acc,
            }, checkpoint_path)

        print('')

    print("="*50)
    print(f"==> Finished learning rate {lr}. Best validation accuracy so far: {best_val_acc:.4f}")

==> Training starts!



Training with learning rate: 1.0
Epoch 0:




Training loss: nan, Training accuracy: 0.1014
Validation loss: nan, Validation accuracy: 0.1028
Saving model checkpoint...

Epoch 1:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 2:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 3:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 4:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 5:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 6:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 7:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 8:
Training loss: nan, Training accuracy: 0.0999
Validation loss: nan, Validation accuracy: 0.1028

Epoch 9:
Training loss: nan,
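
The `nan` losses and the roughly 10% accuracy at learning rate 1.0 indicate that this run diverged: the model does no better than chance on ten classes, yet the loop still spends every remaining epoch on it. A guard along the following lines could stop such a run early. This is only a sketch; the helper name `first_divergent_step` is hypothetical and the loss values are placeholders, not numbers from the log above.

```python
import math


def first_divergent_step(loss_values):
    """Return the index of the first non-finite loss, or None if all are finite."""
    for step, value in enumerate(loss_values):
        # math.isfinite is False for nan, +inf and -inf
        if not math.isfinite(value):
            return step
    return None


# Placeholder per-batch losses for illustration only.
losses = [2.30, 5.71, float("inf"), float("nan")]
step = first_divergent_step(losses)
if step is not None:
    print(f"Loss became non-finite at step {step}; stopping this run early.")
```

Inside the training loop, the same test could be applied to `loss.item()` right after the loss is computed, followed by a `break` out of the epoch loop for that learning rate.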

In [30]:
import torch.optim as optim
import os

MOMENTUM = 0.9
criterion = nn.CrossEntropyLoss()
l2_strengths = [1e-2, 1e-3, 1e-4, 1e-5, 0.0]
LEARNING_RATE = 0.01
EPOCHS = 30
CHECKPOINT_FOLDER = "/content/drive/MyDrive/ECE661_HW2/"
best_val_acc = 0

for l2_strength in l2_strengths:
    print("\n" * 2)
    print(f"Training with L2 regularization strength: {l2_strength}")

    # start from a freshly initialized model for every regularization strength
    model_7 = SimpleNNWithBNSwish()
    model_7.to(device)

    # Define the optimizer with L2 regularization
    optimizer = optim.SGD(model_7.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM, weight_decay=l2_strength)
    for i in range(0, EPOCHS):
        model_7.train()
        print("Epoch %d:" % i)

        total_examples = 0
        correct_examples = 0

        train_loss = 0

        # Train the model for 1 epoch.
        for batch_idx, (inputs, targets) in enumerate(train_loader):
            ####################################
            # your code here
            # copy inputs to device
            inputs, targets = inputs.to(device), targets.to(device)

            # compute the output and loss
            output = model_7(inputs)
            loss = criterion(output, targets)

            # zero the gradient
            optimizer.zero_grad()

            # backpropagation
            loss.backward()

            # apply gradient and update the weights
            optimizer.step()

            # count the number of correctly predicted samples in the current batch
            _, predicted = output.max(1)
            total_examples += targets.size(0)
            correct_examples += predicted.eq(targets).sum().item()
            train_loss += loss.item()
            ####################################

        avg_loss = train_loss / len(train_loader)
        avg_acc = correct_examples / total_examples
        print("Training loss: %.4f, Training accuracy: %.4f" % (avg_loss, avg_acc))

        # Validate on the validation dataset
        #######################
        # your code here
        # switch to eval mode
        model_7.eval()
        #######################

        # these counters help you compute the validation accuracy
        total_examples = 0
        correct_examples = 0

        val_loss = 0  # again, track the validation loss if you want

        # disable gradient during validation, which can save GPU memory
        with torch.no_grad():
            for batch_idx, (inputs, targets) in enumerate(val_loader):
                ####################################
                # your code here
                # copy inputs to device
                inputs, targets = inputs.to(device), targets.to(device)

                # compute the output and loss
                output = model_7(inputs)
                loss = criterion(output, targets)

                # count the number of correctly predicted samples in the current batch
                _, predicted = output.max(1)
                total_examples += targets.size(0)
                correct_examples += predicted.eq(targets).sum().item()
                val_loss += loss.item()
                ####################################

        avg_loss = val_loss / len(val_loader)
        avg_acc = correct_examples / total_examples
        print("Validation loss: %.4f, Validation accuracy: %.4f" % (avg_loss, avg_acc))

        # save the model checkpoint
        # note: best_val_acc is shared across all regularization strengths, so the
        # file holds the best model seen so far in the whole sweep
        if avg_acc > best_val_acc:
            best_val_acc = avg_acc
            print("Saving model checkpoint...")
            checkpoint_path = os.path.join(CHECKPOINT_FOLDER, 'simplenn.pth')
            torch.save({
                'epoch': i,
                'model_state_dict': model_7.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'best_val_acc': best_val_acc,
            }, checkpoint_path)

        print('')

    print("="*50)
    print(f"==> Finished L2 strength {l2_strength}. Best validation accuracy so far: {best_val_acc:.4f}")







Training with L2 regularization strength: 0.01
Epoch 0:
Training loss: 1.8947, Training accuracy: 0.2927
Validation loss: 1.5811, Validation accuracy: 0.4178
Saving model checkpoint...

Epoch 1:
Training loss: 1.5528, Training accuracy: 0.4332
Validation loss: 1.4287, Validation accuracy: 0.4732
Saving model checkpoint...

Epoch 2:
Training loss: 1.4544, Training accuracy: 0.4718
Validation loss: 1.3275, Validation accuracy: 0.5238
Saving model checkpoint...

Epoch 3:
Training loss: 1.3977, Training accuracy: 0.4972
Validation loss: 1.3435, Validation accuracy: 0.5144

Epoch 4:
Training loss: 1.3701, Training accuracy: 0.5088
Validation loss: 1.3144, Validation accuracy: 0.5344
Saving model checkpoint...

Epoch 5:
Training loss: 1.3514, Training accuracy: 0.5178
Validation loss: 1.2641, Validation accuracy: 0.5464
Saving model checkpoint...

Epoch 6:
Training loss: 1.3321, Training accuracy: 0.5249
Validation loss: 1.2715, Validation accuracy: 0.5486
Saving model checkpoint...

Epoc
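
Because `best_val_acc` is shared across all regularization strengths in the loop above, the log alone does not show which setting performed best. One option is to record the validation accuracy of every epoch per setting and summarize afterwards. The sketch below assumes a hypothetical helper `summarize_sweep` and uses placeholder accuracies, not results from the runs in this notebook.

```python
def summarize_sweep(val_acc_history):
    """Map each setting to its best validation accuracy and pick the best setting.

    val_acc_history: dict mapping a hyperparameter value to a list of
    per-epoch validation accuracies collected during training.
    """
    best_per_setting = {setting: max(accs) for setting, accs in val_acc_history.items()}
    best_setting = max(best_per_setting, key=best_per_setting.get)
    return best_per_setting, best_setting


# Placeholder accuracies for illustration only.
val_acc_history = {
    1e-2: [0.40, 0.50, 0.55],
    1e-4: [0.45, 0.60, 0.58],
    0.0: [0.44, 0.57, 0.59],
}

best_per_setting, best_setting = summarize_sweep(val_acc_history)
for setting, acc in best_per_setting.items():
    print(f"weight_decay={setting:g}: best validation accuracy {acc:.4f}")
print(f"Best setting: weight_decay={best_setting:g}")
```

In the sweep itself, the dictionary could be filled by appending `avg_acc` to `val_acc_history[l2_strength]` after each validation pass.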

# Bonus: with learning rate decay

The following code can help you adjust the learning rate during training. You need to figure out how to incorporate this code into your training loop.
```python
    if i % DECAY_EPOCHS == 0 and i != 0:
        current_learning_rate = current_learning_rate * DECAY
        for param_group in optimizer.param_groups:
            param_group['lr'] = current_learning_rate
        print("Current learning rate has decayed to %f" %current_learning_rate)
```
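
One possible way to wire the snippet into a training loop is sketched below: the decay check runs at the top of each epoch, before any batch is processed. The values of `INITIAL_LR`, `DECAY_EPOCHS`, `DECAY` and `EPOCHS` are assumptions chosen for illustration, and a tiny `nn.Linear` stands in for the real model so the sketch runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical values chosen for illustration only.
INITIAL_LR = 0.01
DECAY_EPOCHS = 2   # decay every 2 epochs
DECAY = 0.5        # multiply the learning rate by 0.5 at each decay
EPOCHS = 6

# Tiny stand-in model (an assumption) so the sketch runs without the lab's model or data.
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=INITIAL_LR, momentum=0.9)

current_learning_rate = INITIAL_LR
lr_history = []  # learning rate actually used in each epoch

for i in range(EPOCHS):
    # Decay at the start of every DECAY_EPOCHS-th epoch, except epoch 0.
    if i % DECAY_EPOCHS == 0 and i != 0:
        current_learning_rate = current_learning_rate * DECAY
        for param_group in optimizer.param_groups:
            param_group['lr'] = current_learning_rate
        print("Current learning rate has decayed to %f" % current_learning_rate)

    lr_history.append(optimizer.param_groups[0]['lr'])

    # ... train for one epoch and validate here, as in the loops above ...

print(lr_history)
```

PyTorch also provides `torch.optim.lr_scheduler.StepLR`, which applies the same kind of step decay when `scheduler.step()` is called once per epoch; whether to use it or the manual snippet is largely a matter of preference.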