# Part 1: Basic CNN

# Loading data
We begin by loading our data from the following places
- For the fashion.zip data set containing our images, we first upload the .zip file to Google Drive, and then mount in Google Colab in order to unzip it.

- For the .csv files containing the train and test values, we simply import from our computer.

In [5]:
# mount google drive
from google.colab import drive
drive.mount('/content/drive')

# load fashion image data from personal google drive
! unzip /content/drive/MyDrive/fashion.zip

# import test and train data from local system
from google.colab import files
upload_train = files.upload()
upload_test = files.upload()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Archive:  /content/drive/MyDrive/fashion.zip
replace images/10000.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename: N


Saving train.csv to train.csv


Saving test.csv to test.csv
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Archive:  /content/drive/MyDrive/fashion.zip
replace images/10000.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename: N


# Load training and test data
We then load the .csv files as data frame objects, noting that there are 40,441 rows in the training data, and 4,000 in the test data. To create a validation set, we randomly sample 4,000 images, or 10%, of the training data. This gives us a relatively even split and allows us to get a more accurate idea of how the neural network reacts to unseen data. In this step, we also separate the individual labels and convert them to integers for use in our CNN.

In [23]:
# load csvs as data frames
import pandas as pd
import csv

train_df = pd.read_csv('train.csv', delimiter="\t", quoting=csv.QUOTE_NONE, encoding='utf-8')
test_df = pd.read_csv('test.csv', delimiter="\t", quoting=csv.QUOTE_NONE, encoding='utf-8')
valid_df = train_df.sample(n = 4000, random_state = 999) # sample 4000 rows from test set for validation - with reproducible seed

# get label names for future use and convert to integer level
labels = train_df.sort_values('label')
class_names = list(labels.label.unique())

train_df['label_num'] = train_df['label'].apply(class_names.index)
test_df['label_num'] = test_df['label'].apply(class_names.index)
valid_df['label_num'] = valid_df['label'].apply(class_names.index)

# remove validation set from test dataframe
train_df = pd.concat([train_df, valid_df, valid_df]).drop_duplicates(keep=False)

# create total dataset out of others - extract labels for CNN
total_data = pd.concat([train_df, test_df, valid_df])
print(total_data.shape)

(44441, 4)


# Create train, test, and validation directories
Here, we simply divide the images into train, test, and valid folders, and move them to a separate 'Fashion' directory for convenience

In [15]:
import shutil
import os
import numpy as np

os.mkdir('Fashion')

# create directories for train, test and valid
os.mkdir('train')
os.mkdir('test')
os.mkdir('valid')

# copy images to training data set
for c in [str(x) for x in list(train_df['imageid'])]:
  get_image = os.path.join('/content/images', c + '.jpg')
  if not os.path.exists('train/'+c):
    move_image = shutil.copy(get_image,'/content/train/')

# do same for test data and validation data 
for c in [str(x) for x in list(test_df['imageid'])]:
  get_image = os.path.join('/content/images', c + '.jpg')
  if not os.path.exists('test/'+c):
    move_image = shutil.copy(get_image,'/content/test/')

for c in [str(x) for x in list(valid_df['imageid'])]:
  get_image = os.path.join('/content/images', c + '.jpg')
  if not os.path.exists('valid/'+c):
    move_image = shutil.copy(get_image,'/content/valid/')

# move data folders to Fashion directory
shutil.move('/content/train/', '/content/Fashion')
shutil.move('/content/test/', '/content/Fashion')
shutil.move('/content/valid/', '/content/Fashion')

'/content/Fashion/valid'

# Data exploration
Next, we perform some data exploration on our image data, where we confirm that we have 44,441 images present. We also take time to take a cursory look at some random images from the data set, as well as view the distribution of the labels and the sizes of each image.

Looking at the image sizes, we see that the majority of the images (around 99%, or 43,987/44,441) are 80 pixels high, 60 pixels wide, with three channels (RGB). However, we note that 431 images are 80x60, but only have one channel (grayscale), and that a handful of others are slightly smaller than 80x60, by only a few pixels. 

To remedy this, we resize all images to the standard size of 80x60 - any loss in quality is likely minimal, as the difference is only by a few pixels in each case. Additionally, since all images need to have the same number of input channels for a CNN, and as not to lose out on potentially valuable data, we will also transform all images such that they are grayscale - which has the added bonus of possibly speeding up computation time.



In [None]:
import imageio
import os
import glob
from collections import Counter
import random 
from google.colab.patches import cv2_imshow

# view random image from full data set
random_pic_file = random.choice(os.listdir('./images/'))
pic = imageio.imread('./images/'+ random_pic_file)
cv2_imshow(pic)
height, width, channels = pic.shape
print(f'height, width, and channel: {height} {width} {channels}')

# label distribution check
folder_path_options = ["./Fashion/train/", "./Fashion/test/", "./Fashion/valid/"]
for path in folder_path_options:
  # check how many training, testing, and validation images we have
  files = glob.glob(path+"*")
  file_count = len(files)
  print(f'There are {file_count} files in the folder at {path}')

# view distribution of training set
labels = train_df['label'].apply(class_names.index)
counts = Counter(labels)
total_count = len(labels)
print('-----TRAINING COUNT-----')
for value, count in sorted(counts.items()):
  distribution = count / total_count
  print(f'{value}: {count} ({distribution:.3%})')

# view distribution of validation set
labels = valid_df['label'].apply(class_names.index)
counts = Counter(labels)
total_count = len(labels)
print('-----VALID COUNT-----')
for value, count in sorted(counts.items()):
  distribution = count / total_count
  print(f'{value}: {count} ({distribution:.3%})')

# view how many different sizes of image are in the data set
image_sizes = []
for i in os.listdir('./images/'):
  pic = imageio.imread('./images/' + i)
  image_sizes.append(tuple(pic.shape)) # track each image's size in a list
  
print('-----FILE COUNTS-----')
print(Counter(image_sizes))

# Import necessary packages

In [8]:
import numpy as np
import pandas as pd
import torch
import os
import torch.nn as nn
import torchvision.transforms as transforms
from PIL import Image
import datetime

from torch.utils.data import ConcatDataset, DataLoader, Subset, Dataset
from torchvision.datasets import DatasetFolder, VisionDataset

from tqdm.auto import tqdm
import random

# Set up PyTorch

In [9]:
myseed = 999
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
np.random.seed(myseed)
torch.manual_seed(myseed)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(myseed)

# Transform Data
Here, we simply define a function to carry out the transforms previously discussed in the data exploration section, which also transforms images into tensor objects.

In [10]:
train_tfm = transforms.Compose([
    transforms.Resize((80, 60)),
    transforms.Grayscale(),
    transforms.ToTensor()
])

test_tfm = transforms.Compose([
    transforms.Resize((80, 60)),
    transforms.Grayscale(),
    transforms.ToTensor()
])

# Dataset class

In [11]:
class FashionData(Dataset):
    def __init__(self, path, dat=total_data, tfm=test_tfm, files = None):
        super(FashionData).__init__()
        self.path = path
        self.files = sorted([os.path.join(path,x) for x in os.listdir(path) if x.endswith(".jpg")])
        if files != None:
            self.files = files
        print(f"One {path} sample",self.files[0])
        self.transform = tfm
        self.dat = dat
  
    def __len__(self):
        return len(self.files)
  
    def __getitem__(self,idx):
        fname = self.files[idx]
        im = Image.open(fname)
        im = self.transform(im)
        data = self.dat
        im_id = int(fname.split('/')[3].split('.jpg')[0])

        try:
            label = list(data[data['imageid'] == im_id]['label_num'])[0]
        except:
            label = -1
        return im,label

# CNN Model Class and Baseline Structure

Our motivation behind the structure of our CNN is fairly simple - how do we fit a given input image of size [1,80,60] into a fully connected feed forward model, while also making sure it doesn't take too long to train, given the number of images we have (36,441)? 

We do this by expanding the number of channels, doubling each time, using convolutional layers with a kernel size of 3 and padding and stride equal to 1, normalizing it using BatchNorm, and then using max pooling (with stride 2 and padding 2, this halves the resolution each time) to reduce the resolution until no longer possible (in this case, when the dimension of the image became (320,5,3)).

We then finally use a hidden layer with our input size of 320 neurons to reduce our final product once more, and classify it into one of our 13 classes.


In [12]:
class BaseCNN(nn.Module):
    def __init__(self,l1,l2):
        super(BaseCNN, self).__init__()
        self.cnn = nn.Sequential(
            
            # Layer 1
            nn.Conv2d(1, 80, 3, 1, 1), # [80, 80, 60]
            nn.BatchNorm2d(80),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 1), # [80, 40, 30]

            # Layer 2
            nn.Conv2d(80, 160, 3, 1, 1), # [160, 40, 30]
            nn.BatchNorm2d(160),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 1), # [160, 20, 15]

            # Layer 3
            nn.Conv2d(160, 320, 3, 1, 1), # [320, 20, 15]
            nn.BatchNorm2d(320),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 1), # [320, 10, 7]
            
            # Layer 4
            nn.Conv2d(320, 320, 3, 1, 1), # [320, 10, 7]
            nn.BatchNorm2d(320),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 1), # [320, 5, 3] 
        )
        self.fc = nn.Sequential(
            nn.Linear(320*30, l1),
            nn.ReLU(),
            nn.Linear(l1, l2),
            nn.ReLU(),
            nn.Linear(l2, 13)
        )
       
    def forward(self, x):
        out = self.cnn(x)
        out = out.view(out.size()[0], -1)
        return self.fc(out)


In [17]:
_exp_name = "sample"
batch_size = 32
data_dir = './Fashion/'

train_set = FashionData(os.path.join(data_dir,"train"), dat = total_data, tfm=train_tfm)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)
valid_set = FashionData(os.path.join(data_dir,"valid"), dat = total_data, tfm=test_tfm)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)

One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg


# Running model
For our model, we use the ADAM optimizer with default settings and learning rate (lr) set to $0.01$, iterated over 25 epochs, and with a batch size of 32. We will keep these numbers constant across all experiments unless otherwise stated or necessary, as not to cause confusion.

In [18]:
# "cuda" only when GPUs are available.
device = "cuda" if torch.cuda.is_available() else "cpu"

n_epochs = 25
patience = 300

# declare model
model = BaseCNN(640,320).to(device)

# criterion and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=1e-5) 

stale = 0
best_acc = 0

In [29]:
for epoch in range(n_epochs):

    model.train()
    train_loss = []
    train_accs = []

    for batch in tqdm(train_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # Forward the data. (Make sure data and model are on the same device.)
        logits = model(imgs.to(device))

        # Calculate the cross-entropy loss.
        # We don't need to apply softmax before computing cross-entropy as it is done automatically.
        loss = criterion(logits, labels.to(device))

        # Gradients stored in the parameters in the previous step should be cleared out first.
        optimizer.zero_grad()

        # Compute the gradients for parameters.
        loss.backward()

        # Clip the gradient norms for stable training.
        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

        # Update the parameters with computed gradients.
        optimizer.step()

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        train_loss.append(loss.item())
        train_accs.append(acc)
        
    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)

    # Print the information.
    print(f"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")

    # ---------- Validation ----------
    # Make sure the model is in eval mode so that some modules like dropout are disabled and work normally.
    model.eval()

    # These are used to record information in validation.
    valid_loss = []
    valid_accs = []

    # Iterate the validation set by batches.
    for batch in tqdm(valid_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch
        #imgs = imgs.half()

        # We don't need gradient in validation.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(imgs.to(device))

        # We can still compute the loss (but not the gradient).
        loss = criterion(logits, labels.to(device))

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        valid_loss.append(loss.item())
        valid_accs.append(acc)
        #break

    # The average loss and accuracy for entire validation set is the average of the recorded values.
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)

    # Print the information.
    print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")


    # update logs
    if valid_acc > best_acc:
        with open(f"./{_exp_name}_log.txt","a"):
            print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f} -> best")
    else:
        with open(f"./{_exp_name}_log.txt","a"):
            print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")


    # save models
    if valid_acc > best_acc:
        print(f"Best model found at epoch {epoch}, saving model")
        torch.save(model.state_dict(), f"{_exp_name}_best.ckpt") # only save best to prevent output memory exceed error
        best_acc = valid_acc
        stale = 0
    else:
        stale += 1
        if stale > patience:
            print(f"No improvment {patience} consecutive epochs, early stopping")
            break

  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 001/025 ] loss = 1.40574, acc = 0.78268


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 001/025 ] loss = 0.53494, acc = 0.83775
[ Valid | 001/025 ] loss = 0.53494, acc = 0.83775 -> best
Best model found at epoch 0, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 002/025 ] loss = 0.36647, acc = 0.89145


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 002/025 ] loss = 0.36331, acc = 0.89050
[ Valid | 002/025 ] loss = 0.36331, acc = 0.89050 -> best
Best model found at epoch 1, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 003/025 ] loss = 0.30763, acc = 0.90934


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 003/025 ] loss = 0.32832, acc = 0.90275
[ Valid | 003/025 ] loss = 0.32832, acc = 0.90275 -> best
Best model found at epoch 2, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 004/025 ] loss = 0.27069, acc = 0.91790


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 004/025 ] loss = 0.30024, acc = 0.91275
[ Valid | 004/025 ] loss = 0.30024, acc = 0.91275 -> best
Best model found at epoch 3, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 005/025 ] loss = 0.24625, acc = 0.92515


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 005/025 ] loss = 0.43339, acc = 0.85750
[ Valid | 005/025 ] loss = 0.43339, acc = 0.85750


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 006/025 ] loss = 0.24253, acc = 0.92686


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 006/025 ] loss = 0.30952, acc = 0.90350
[ Valid | 006/025 ] loss = 0.30952, acc = 0.90350


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 007/025 ] loss = 0.22762, acc = 0.93124


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 007/025 ] loss = 0.26856, acc = 0.92350
[ Valid | 007/025 ] loss = 0.26856, acc = 0.92350 -> best
Best model found at epoch 6, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 008/025 ] loss = 0.21649, acc = 0.93365


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 008/025 ] loss = 0.26805, acc = 0.92625
[ Valid | 008/025 ] loss = 0.26805, acc = 0.92625 -> best
Best model found at epoch 7, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 009/025 ] loss = 0.21740, acc = 0.93519


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 009/025 ] loss = 0.23134, acc = 0.93200
[ Valid | 009/025 ] loss = 0.23134, acc = 0.93200 -> best
Best model found at epoch 8, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 010/025 ] loss = 0.21369, acc = 0.93641


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 010/025 ] loss = 0.35400, acc = 0.89325
[ Valid | 010/025 ] loss = 0.35400, acc = 0.89325


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 011/025 ] loss = 0.20674, acc = 0.93780


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 011/025 ] loss = 0.24406, acc = 0.92925
[ Valid | 011/025 ] loss = 0.24406, acc = 0.92925


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 012/025 ] loss = 0.19282, acc = 0.94214


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 012/025 ] loss = 0.24861, acc = 0.92975
[ Valid | 012/025 ] loss = 0.24861, acc = 0.92975


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 013/025 ] loss = 0.18814, acc = 0.94354


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 013/025 ] loss = 0.32409, acc = 0.92100
[ Valid | 013/025 ] loss = 0.32409, acc = 0.92100


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 014/025 ] loss = 0.19103, acc = 0.94289


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 014/025 ] loss = 0.47709, acc = 0.84875
[ Valid | 014/025 ] loss = 0.47709, acc = 0.84875


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 015/025 ] loss = 0.18468, acc = 0.94444


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 015/025 ] loss = 0.27769, acc = 0.92150
[ Valid | 015/025 ] loss = 0.27769, acc = 0.92150


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 016/025 ] loss = 0.18704, acc = 0.94476


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 016/025 ] loss = 0.25052, acc = 0.93850
[ Valid | 016/025 ] loss = 0.25052, acc = 0.93850 -> best
Best model found at epoch 15, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 017/025 ] loss = 0.18092, acc = 0.94607


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 017/025 ] loss = 0.25278, acc = 0.92100
[ Valid | 017/025 ] loss = 0.25278, acc = 0.92100


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 018/025 ] loss = 0.17551, acc = 0.94616


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 018/025 ] loss = 0.24759, acc = 0.93025
[ Valid | 018/025 ] loss = 0.24759, acc = 0.93025


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 019/025 ] loss = 0.17927, acc = 0.94590


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 019/025 ] loss = 0.25965, acc = 0.93525
[ Valid | 019/025 ] loss = 0.25965, acc = 0.93525


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 020/025 ] loss = 0.17396, acc = 0.94825


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 020/025 ] loss = 0.36535, acc = 0.92200
[ Valid | 020/025 ] loss = 0.36535, acc = 0.92200


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 021/025 ] loss = 0.16530, acc = 0.94970


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 021/025 ] loss = 0.26743, acc = 0.92900
[ Valid | 021/025 ] loss = 0.26743, acc = 0.92900


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 022/025 ] loss = 0.15861, acc = 0.95182


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 022/025 ] loss = 0.24377, acc = 0.93000
[ Valid | 022/025 ] loss = 0.24377, acc = 0.93000


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 023/025 ] loss = 0.16729, acc = 0.95150


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 023/025 ] loss = 0.28517, acc = 0.91450
[ Valid | 023/025 ] loss = 0.28517, acc = 0.91450


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 024/025 ] loss = 0.16309, acc = 0.95057


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 024/025 ] loss = 0.24373, acc = 0.93800
[ Valid | 024/025 ] loss = 0.24373, acc = 0.93800


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 025/025 ] loss = 0.15859, acc = 0.95297


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 025/025 ] loss = 0.22118, acc = 0.93650
[ Valid | 025/025 ] loss = 0.22118, acc = 0.93650


# Evaluate accuracy on best model

In [30]:
# set up test data loader
test_set = FashionData(os.path.join(data_dir,"test"), tfm=test_tfm)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)

model_best = BaseCNN(640,320).to(device)
model_best.load_state_dict(torch.load(f"{_exp_name}_best.ckpt"))
model_best.eval()
prediction = []

# predict based on test images
with torch.no_grad():
    for data,_ in test_loader:
        test_pred = model_best(data.to(device))
        test_label = np.argmax(test_pred.cpu().data.numpy(), axis=1)
        prediction += test_label.squeeze().tolist()

One ./Fashion/test sample ./Fashion/test/10002.jpg


In [None]:
from sklearn.metrics import confusion_matrix

# make sure the dataframe with the actual values for the testing data is sorted the same way as the predictions
y = sorted([os.path.join('./Fashion/test',x) for x in os.listdir('./Fashion/test') if x.endswith(".jpg")])
for i in range(len(y)):
  y[i]=int(y[i].split('/')[3].split('.jpg')[0])

test_df = test_df.set_index('imageid')
test_df = test_df.reindex(y)

# display confusion matrix and print total accuracy
print(confusion_matrix(y_true = list(test_df['label_num']), y_pred = list(prediction), labels = None))

acc = 0
for i in range(len(list(prediction))):
  if list(test_df['label_num'])[i] == list(prediction)[i]:
    acc += 1
    
print('The final test accuracy is %.3f' % (acc/len(list(prediction))*100),'%')

# Baseline Conclusion
Looking at the performance of our best model, selected at epoch 15, we see that the model had a training accuracy of 94.476%, a validation accuracy of 93.850%, and a final test accuracy of 93.475%. For a completely unoptimized, basic CNN, these results appear to be satisfactory, albiet with room for improvement through methods such as data augmentation or hyperparameter tuning. We also notice some slight overfitting, where the training accuracy is higher than the validation and testing accuracy, but nothing too major.

# Part 2: Enhanced Models

# Hyperparameter tuning: Learning rate

The hyperparameter we are choosing to fine tune for model improvement is the learning rate. This is an important parameter because it, in essence, controls the speed at which the model updates and can have a large impact on results. For example, if the learning rate is too high, the optimizer may converge too quickly and we don't reach the optimal point, while if the learning rate is too low, the model may take too long to converge.

For our initial base model, we used a fairly standard learning rate of $lr = 0.01$, which yielded good results on its own. However, we want to determine if another learning rate would be better by using random search, over the interval $[0.000001, 0.1]$.



In [11]:
# install and initialize wandb
!pip install wandb
import wandb
wandb.login(relogin=True)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting wandb
  Downloading wandb-0.14.0-py3-none-any.whl (2.0 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.0/2.0 MB[0m [31m27.3 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting GitPython!=3.1.29,>=1.0.0
  Downloading GitPython-3.1.31-py3-none-any.whl (184 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m184.3/184.3 KB[0m [31m21.8 MB/s[0m eta [36m0:00:00[0m
Collecting docker-pycreds>=0.4.0
  Downloading docker_pycreds-0.4.0-py2.py3-none-any.whl (9.0 kB)
Collecting setproctitle
  Downloading setproctitle-1.3.2-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (30 kB)
Collecting pathtools
  Downloading pathtools-0.1.2.tar.gz (11 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting sentry-sdk>=1.0.0
  Downloading sentry_sdk-1.17.0-py2.py3-none-any.whl (189 kB)
[2K     [90m━━━━━━━━

<IPython.core.display.Javascript object>

[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

# Initial parameters and config





In [12]:
from argparse import Namespace

# wandb config
sweep_config = {
    'method': 'random'
    }
metric = {
    'name': 'val_acc',
    'goal': 'maximize'   
    }
sweep_config['metric'] = metric
sweep_config['parameters'] = {}

sweep_config['parameters'].update({
    'project_name':{'value':'fashion'},
    'ckpt_path': {'value':'checkpoint.pt'}})

sweep_config['parameters'].update({
    'optim_type': {
        'values': ['Adam']
        }
    })

# values grid for learning rate
sweep_config['parameters'].update({
    'lr': {
        'distribution': 'log_uniform_values',
        'min': 1e-6,
        'max': 0.01
      },
})

# initial parameters
config = Namespace(
    project_name = 'fashion',
    batch_size = 32,
    lr = 0.01,
    epoch_num = 20,
    optim_type = 'Adam',
    ckpt_path = 'checkpoint.pt'
)

In [13]:
torch.backends.cudnn.deterministic = True
random.seed(hash("setting random seeds") % 2**32 - 1)
np.random.seed(hash("improves reproducibility") % 2**32 - 1)
torch.manual_seed(hash("by removing stochasticity") % 2**32 - 1)
torch.cuda.manual_seed_all(hash("so runs are repeatable") % 2**32 - 1)

In [14]:
def train(config = config):
    train_set = FashionData(os.path.join(data_dir,"train"), tfm=train_tfm)
    dl_train = DataLoader(train_set, batch_size=config.batch_size, shuffle=True, num_workers=0, pin_memory=True)
    valid_set = FashionData(os.path.join(data_dir,"valid"), tfm=test_tfm)
    dl_val = DataLoader(valid_set, batch_size=config.batch_size, shuffle=True, num_workers=0, pin_memory=True)
    model = BaseCNN(640,320).to(device)
    optimizer = torch.optim.__dict__[config.optim_type](params=model.parameters(), lr=config.lr)
    #======================================================================
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    wandb.init(project=config.project_name, config = config.__dict__, name = nowtime, save_code=True)
    model.run_id = wandb.run.id
    #======================================================================
    model.best_metric = -1.0
    criterion = criterion = nn.CrossEntropyLoss()

    for epoch in range(config.epoch_num):
        model = train_epoch(model,dl_train,optimizer,criterion)
        val_acc = eval_epoch(model,dl_val,criterion)
        if val_acc>model.best_metric:
            model.best_metric = val_acc
            torch.save(model.state_dict(),config.ckpt_path)   
        nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        print(f"epoch[{epoch}]@{nowtime} --> val_acc= {100 * val_acc:.2f}%")
        #======================================================================
        wandb.log({'epoch':epoch, 'val_acc': val_acc, 'best_val_acc':model.best_metric})
        #======================================================================        
    #======================================================================
    wandb.finish()
    #======================================================================
    return model
  
def train_epoch(model,dl_train,optimizer,criterion):
    model.train()
    for batch in tqdm(dl_train):
        # A batch consists of image data and corresponding labels.
        imgs, labels = batch
        # Forward the data. (Make sure data and model are on the same device.)
        logits = model(imgs.to(device))

        # Calculate the cross-entropy loss.
        # We don't need to apply softmax before computing cross-entropy as it is done automatically.
        loss = criterion(logits, labels.to(device))

        # Gradients stored in the parameters in the previous step should be cleared out first.
        optimizer.zero_grad()

        # Compute the gradients for parameters.
        loss.backward()

        # Clip the gradient norms for stable training.
        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

        # Update the parameters with computed gradients.
        optimizer.step()        
    return model
  
def eval_epoch(model,val_loader,criterion):

    # ---------- Validation ----------
    # Make sure the model is in eval mode so that some modules like dropout are disabled and work normally.
    model.eval()

       # These are used to record information in validation.
    valid_loss = []
    valid_accs = []

    # Iterate the validation set by batches.
    for batch in tqdm(val_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # We don't need gradient in validation.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(imgs.to(device))

        # We can still compute the loss (but not the gradient).
        loss = criterion(logits, labels.to(device))

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        valid_loss.append(loss.item())
        valid_accs.append(acc)
        #break

    # The average loss and accuracy for entire validation set is the average of the recorded values.
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)
    return valid_acc


# Run tuning model

Due to time/processing constraints, we only use 20 epochs for sweep tuning instead of 25.

In [15]:
sweep_id = wandb.sweep(sweep_config, project=config.project_name)
wandb.agent(sweep_id, train, count = 5) # run 5 sweep trials on random lr value search

Create sweep with ID: nmqz8wd7
Sweep URL: https://wandb.ai/crns/fashion/sweeps/nmqz8wd7


[34m[1mwandb[0m: Agent Starting Run: 2z58d5en with config:
[34m[1mwandb[0m: 	ckpt_path: checkpoint.pt
[34m[1mwandb[0m: 	lr: 0.0002775896574420936
[34m[1mwandb[0m: 	optim_type: Adam
[34m[1mwandb[0m: 	project_name: fashion


One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg


[34m[1mwandb[0m: Currently logged in as: [33mcrns2023[0m ([33mcrns[0m). Use [1m`wandb login --relogin`[0m to force relogin


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[0]@2023-03-25 01:10:29 --> val_acc= 90.20%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[1]@2023-03-25 01:11:18 --> val_acc= 91.08%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[2]@2023-03-25 01:12:08 --> val_acc= 91.23%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[3]@2023-03-25 01:12:58 --> val_acc= 91.40%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[4]@2023-03-25 01:13:47 --> val_acc= 92.88%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[5]@2023-03-25 01:14:36 --> val_acc= 93.23%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[6]@2023-03-25 01:15:25 --> val_acc= 93.70%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[7]@2023-03-25 01:16:14 --> val_acc= 87.25%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[8]@2023-03-25 01:17:03 --> val_acc= 92.88%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[9]@2023-03-25 01:17:52 --> val_acc= 92.98%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[10]@2023-03-25 01:18:40 --> val_acc= 93.68%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[11]@2023-03-25 01:19:31 --> val_acc= 94.70%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[12]@2023-03-25 01:20:22 --> val_acc= 93.68%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[13]@2023-03-25 01:21:13 --> val_acc= 93.95%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[14]@2023-03-25 01:22:03 --> val_acc= 93.43%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[15]@2023-03-25 01:22:53 --> val_acc= 93.12%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[16]@2023-03-25 01:23:42 --> val_acc= 92.93%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[17]@2023-03-25 01:24:32 --> val_acc= 93.25%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[18]@2023-03-25 01:25:21 --> val_acc= 93.58%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[19]@2023-03-25 01:26:10 --> val_acc= 93.65%


VBox(children=(Label(value='1.058 MB of 1.058 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
best_val_acc,▁▂▃▃▅▆▆▆▆▆▆█████████
epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██
val_acc,▄▅▅▅▆▇▇▁▆▆▇█▇▇▇▇▆▇▇▇

0,1
best_val_acc,0.947
epoch,19.0
val_acc,0.9365


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: yms2tf6j with config:
[34m[1mwandb[0m: 	ckpt_path: checkpoint.pt
[34m[1mwandb[0m: 	lr: 3.8082280058437664e-06
[34m[1mwandb[0m: 	optim_type: Adam
[34m[1mwandb[0m: 	project_name: fashion


One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg




  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[0]@2023-03-25 01:27:20 --> val_acc= 84.85%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[1]@2023-03-25 01:28:09 --> val_acc= 89.15%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[2]@2023-03-25 01:28:58 --> val_acc= 91.20%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[3]@2023-03-25 01:29:47 --> val_acc= 92.90%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[4]@2023-03-25 01:30:36 --> val_acc= 88.75%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[5]@2023-03-25 01:31:24 --> val_acc= 93.30%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[6]@2023-03-25 01:32:13 --> val_acc= 92.40%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[7]@2023-03-25 01:33:02 --> val_acc= 92.40%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[8]@2023-03-25 01:33:51 --> val_acc= 92.55%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[9]@2023-03-25 01:34:40 --> val_acc= 93.55%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[10]@2023-03-25 01:35:31 --> val_acc= 92.88%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[11]@2023-03-25 01:36:21 --> val_acc= 92.93%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[12]@2023-03-25 01:37:12 --> val_acc= 93.70%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[13]@2023-03-25 01:38:02 --> val_acc= 92.20%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[14]@2023-03-25 01:38:51 --> val_acc= 93.75%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[15]@2023-03-25 01:39:41 --> val_acc= 93.20%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[16]@2023-03-25 01:40:31 --> val_acc= 94.08%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[17]@2023-03-25 01:41:21 --> val_acc= 94.15%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[18]@2023-03-25 01:42:10 --> val_acc= 94.05%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[19]@2023-03-25 01:42:59 --> val_acc= 94.35%


VBox(children=(Label(value='1.416 MB of 1.416 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
best_val_acc,▁▄▆▇▇▇▇▇▇▇▇▇████████
epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██
val_acc,▁▄▆▇▄▇▇▇▇▇▇▇█▆█▇████

0,1
best_val_acc,0.9435
epoch,19.0
val_acc,0.9435


[34m[1mwandb[0m: Agent Starting Run: 1843ix9g with config:
[34m[1mwandb[0m: 	ckpt_path: checkpoint.pt
[34m[1mwandb[0m: 	lr: 5.123645501221035e-05
[34m[1mwandb[0m: 	optim_type: Adam
[34m[1mwandb[0m: 	project_name: fashion


One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg




  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[0]@2023-03-25 01:44:01 --> val_acc= 87.38%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[1]@2023-03-25 01:44:50 --> val_acc= 90.40%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[2]@2023-03-25 01:45:39 --> val_acc= 87.95%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[3]@2023-03-25 01:46:28 --> val_acc= 85.15%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[4]@2023-03-25 01:47:18 --> val_acc= 92.53%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[5]@2023-03-25 01:48:07 --> val_acc= 91.73%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[6]@2023-03-25 01:48:56 --> val_acc= 92.63%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[7]@2023-03-25 01:49:45 --> val_acc= 92.80%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[8]@2023-03-25 01:50:34 --> val_acc= 93.10%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[9]@2023-03-25 01:51:23 --> val_acc= 93.33%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[10]@2023-03-25 01:52:12 --> val_acc= 92.85%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[11]@2023-03-25 01:53:02 --> val_acc= 93.23%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[12]@2023-03-25 01:53:51 --> val_acc= 93.53%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[13]@2023-03-25 01:54:41 --> val_acc= 92.63%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[14]@2023-03-25 01:55:30 --> val_acc= 93.25%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[15]@2023-03-25 01:56:19 --> val_acc= 93.80%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[16]@2023-03-25 01:57:09 --> val_acc= 93.45%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[17]@2023-03-25 01:57:58 --> val_acc= 93.72%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[18]@2023-03-25 01:58:47 --> val_acc= 93.75%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[19]@2023-03-25 01:59:36 --> val_acc= 94.08%


VBox(children=(Label(value='1.775 MB of 1.775 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
best_val_acc,▁▄▄▄▆▆▆▇▇▇▇▇▇▇▇█████
epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██
val_acc,▃▅▃▁▇▆▇▇▇▇▇▇█▇▇█████

0,1
best_val_acc,0.94075
epoch,19.0
val_acc,0.94075


[34m[1mwandb[0m: Agent Starting Run: q0mo0q40 with config:
[34m[1mwandb[0m: 	ckpt_path: checkpoint.pt
[34m[1mwandb[0m: 	lr: 2.2128264521222125e-05
[34m[1mwandb[0m: 	optim_type: Adam
[34m[1mwandb[0m: 	project_name: fashion


One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg




  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[0]@2023-03-25 02:00:40 --> val_acc= 85.88%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[1]@2023-03-25 02:01:30 --> val_acc= 90.30%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[2]@2023-03-25 02:02:19 --> val_acc= 91.25%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[3]@2023-03-25 02:03:08 --> val_acc= 91.60%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[4]@2023-03-25 02:03:58 --> val_acc= 88.00%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[5]@2023-03-25 02:04:49 --> val_acc= 90.38%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[6]@2023-03-25 02:05:40 --> val_acc= 92.25%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[7]@2023-03-25 02:06:31 --> val_acc= 92.28%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[8]@2023-03-25 02:07:20 --> val_acc= 92.80%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[9]@2023-03-25 02:08:10 --> val_acc= 91.28%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[10]@2023-03-25 02:08:59 --> val_acc= 94.40%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[11]@2023-03-25 02:09:48 --> val_acc= 94.28%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[12]@2023-03-25 02:10:37 --> val_acc= 93.30%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[13]@2023-03-25 02:11:27 --> val_acc= 92.98%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[14]@2023-03-25 02:12:17 --> val_acc= 94.65%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[15]@2023-03-25 02:13:09 --> val_acc= 93.65%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[16]@2023-03-25 02:14:00 --> val_acc= 94.30%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[17]@2023-03-25 02:14:52 --> val_acc= 94.38%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[18]@2023-03-25 02:15:44 --> val_acc= 94.28%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[19]@2023-03-25 02:16:35 --> val_acc= 93.10%


VBox(children=(Label(value='2.134 MB of 2.134 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
best_val_acc,▁▅▅▆▆▆▆▆▇▇██████████
epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██
val_acc,▁▅▅▆▃▅▆▆▇▅██▇▇█▇███▇

0,1
best_val_acc,0.9465
epoch,19.0
val_acc,0.931


[34m[1mwandb[0m: Agent Starting Run: qs0v4ahq with config:
[34m[1mwandb[0m: 	ckpt_path: checkpoint.pt
[34m[1mwandb[0m: 	lr: 0.0072272100194875006
[34m[1mwandb[0m: 	optim_type: Adam
[34m[1mwandb[0m: 	project_name: fashion


One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg




  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[0]@2023-03-25 02:17:41 --> val_acc= 82.53%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[1]@2023-03-25 02:18:32 --> val_acc= 90.08%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[2]@2023-03-25 02:19:23 --> val_acc= 86.40%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[3]@2023-03-25 02:20:14 --> val_acc= 91.20%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[4]@2023-03-25 02:21:05 --> val_acc= 91.68%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[5]@2023-03-25 02:21:56 --> val_acc= 91.78%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[6]@2023-03-25 02:22:46 --> val_acc= 92.60%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[7]@2023-03-25 02:23:37 --> val_acc= 92.83%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[8]@2023-03-25 02:24:28 --> val_acc= 92.98%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[9]@2023-03-25 02:25:19 --> val_acc= 93.28%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[10]@2023-03-25 02:26:08 --> val_acc= 92.25%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[11]@2023-03-25 02:26:57 --> val_acc= 93.15%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[12]@2023-03-25 02:27:47 --> val_acc= 93.30%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[13]@2023-03-25 02:28:36 --> val_acc= 93.72%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[14]@2023-03-25 02:29:26 --> val_acc= 94.12%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[15]@2023-03-25 02:30:15 --> val_acc= 93.85%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[16]@2023-03-25 02:31:04 --> val_acc= 93.15%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[17]@2023-03-25 02:31:53 --> val_acc= 93.95%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[18]@2023-03-25 02:32:42 --> val_acc= 94.38%


  0%|          | 0/1139 [00:00<?, ?it/s]

  0%|          | 0/125 [00:00<?, ?it/s]

epoch[19]@2023-03-25 02:33:31 --> val_acc= 94.68%


VBox(children=(Label(value='2.493 MB of 2.493 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
best_val_acc,▁▅▅▆▆▆▇▇▇▇▇▇▇▇██████
epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██
val_acc,▁▅▃▆▆▆▇▇▇▇▇▇▇▇██▇███

0,1
best_val_acc,0.94675
epoch,19.0
val_acc,0.94675


From our weights and biases test experiment, we find a highest test accuracy, of 94.7%, at a learning rate of 0.0002776. We next run our previous CNN with this learning rate instead, and see how it performs on our test data.

# Check accuracy of model with tuned parameters

In [18]:
# "cuda" only when GPUs are available.
device = "cuda" if torch.cuda.is_available() else "cpu"

n_epochs = 25
patience = 300

# declare model
model = BaseCNN(640,320).to(device)

# criterion and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=1e-5) 

stale = 0
best_acc = 0

for epoch in range(n_epochs):

    model.train()
    train_loss = []
    train_accs = []

    for batch in tqdm(train_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # Forward the data. (Make sure data and model are on the same device.)
        logits = model(imgs.to(device))

        # Calculate the cross-entropy loss.
        # We don't need to apply softmax before computing cross-entropy as it is done automatically.
        loss = criterion(logits, labels.to(device))

        # Gradients stored in the parameters in the previous step should be cleared out first.
        optimizer.zero_grad()

        # Compute the gradients for parameters.
        loss.backward()

        # Clip the gradient norms for stable training.
        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

        # Update the parameters with computed gradients.
        optimizer.step()

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        train_loss.append(loss.item())
        train_accs.append(acc)
        
    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)

    # Print the information.
    print(f"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")

    # ---------- Validation ----------
    # Make sure the model is in eval mode so that some modules like dropout are disabled and work normally.
    model.eval()

    # These are used to record information in validation.
    valid_loss = []
    valid_accs = []

    # Iterate the validation set by batches.
    for batch in tqdm(valid_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch
        #imgs = imgs.half()

        # We don't need gradient in validation.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(imgs.to(device))

        # We can still compute the loss (but not the gradient).
        loss = criterion(logits, labels.to(device))

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        valid_loss.append(loss.item())
        valid_accs.append(acc)
        #break

    # The average loss and accuracy for entire validation set is the average of the recorded values.
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)

    # Print the information.
    print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")


    # update logs
    if valid_acc > best_acc:
        with open(f"./{_exp_name}_log.txt","a"):
            print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f} -> best")
    else:
        with open(f"./{_exp_name}_log.txt","a"):
            print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")


    # save models
    if valid_acc > best_acc:
        print(f"Best model found at epoch {epoch}, saving model")
        torch.save(model.state_dict(), f"{_exp_name}_best.ckpt") # only save best to prevent output memory exceed error
        best_acc = valid_acc
        stale = 0
    else:
        stale += 1
        if stale > patience:
            print(f"No improvment {patience} consecutive epochs, early stopping")
            break

  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 001/025 ] loss = 0.36645, acc = 0.88578


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 001/025 ] loss = 0.30789, acc = 0.91600
[ Valid | 001/025 ] loss = 0.30789, acc = 0.91600 -> best
Best model found at epoch 0, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 002/025 ] loss = 0.20492, acc = 0.93577


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 002/025 ] loss = 0.20171, acc = 0.93625
[ Valid | 002/025 ] loss = 0.20171, acc = 0.93625 -> best
Best model found at epoch 1, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 003/025 ] loss = 0.16084, acc = 0.94824


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 003/025 ] loss = 0.18313, acc = 0.94050
[ Valid | 003/025 ] loss = 0.18313, acc = 0.94050 -> best
Best model found at epoch 2, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 004/025 ] loss = 0.12965, acc = 0.95821


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 004/025 ] loss = 0.16377, acc = 0.95075
[ Valid | 004/025 ] loss = 0.16377, acc = 0.95075 -> best
Best model found at epoch 3, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 005/025 ] loss = 0.11192, acc = 0.96472


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 005/025 ] loss = 0.17581, acc = 0.94625
[ Valid | 005/025 ] loss = 0.17581, acc = 0.94625


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 006/025 ] loss = 0.08845, acc = 0.97089


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 006/025 ] loss = 0.20889, acc = 0.94750
[ Valid | 006/025 ] loss = 0.20889, acc = 0.94750


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 007/025 ] loss = 0.07423, acc = 0.97593


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 007/025 ] loss = 0.15753, acc = 0.95525
[ Valid | 007/025 ] loss = 0.15753, acc = 0.95525 -> best
Best model found at epoch 6, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 008/025 ] loss = 0.06824, acc = 0.97733


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 008/025 ] loss = 0.14970, acc = 0.95600
[ Valid | 008/025 ] loss = 0.14970, acc = 0.95600 -> best
Best model found at epoch 7, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 009/025 ] loss = 0.05164, acc = 0.98329


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 009/025 ] loss = 0.19524, acc = 0.95775
[ Valid | 009/025 ] loss = 0.19524, acc = 0.95775 -> best
Best model found at epoch 8, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 010/025 ] loss = 0.04667, acc = 0.98486


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 010/025 ] loss = 0.17578, acc = 0.96050
[ Valid | 010/025 ] loss = 0.17578, acc = 0.96050 -> best
Best model found at epoch 9, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 011/025 ] loss = 0.04105, acc = 0.98651


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 011/025 ] loss = 0.20091, acc = 0.95125
[ Valid | 011/025 ] loss = 0.20091, acc = 0.95125


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 012/025 ] loss = 0.03440, acc = 0.98863


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 012/025 ] loss = 0.23394, acc = 0.95075
[ Valid | 012/025 ] loss = 0.23394, acc = 0.95075


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 013/025 ] loss = 0.02821, acc = 0.99077


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 013/025 ] loss = 0.18149, acc = 0.96000
[ Valid | 013/025 ] loss = 0.18149, acc = 0.96000


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 014/025 ] loss = 0.03071, acc = 0.99037


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 014/025 ] loss = 0.19343, acc = 0.95700
[ Valid | 014/025 ] loss = 0.19343, acc = 0.95700


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 015/025 ] loss = 0.02381, acc = 0.99207


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 015/025 ] loss = 0.20752, acc = 0.96075
[ Valid | 015/025 ] loss = 0.20752, acc = 0.96075 -> best
Best model found at epoch 14, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 016/025 ] loss = 0.02279, acc = 0.99267


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 016/025 ] loss = 0.20473, acc = 0.96300
[ Valid | 016/025 ] loss = 0.20473, acc = 0.96300 -> best
Best model found at epoch 15, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 017/025 ] loss = 0.02161, acc = 0.99284


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 017/025 ] loss = 0.18989, acc = 0.96525
[ Valid | 017/025 ] loss = 0.18989, acc = 0.96525 -> best
Best model found at epoch 16, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 018/025 ] loss = 0.01920, acc = 0.99421


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 018/025 ] loss = 0.23422, acc = 0.95575
[ Valid | 018/025 ] loss = 0.23422, acc = 0.95575


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 019/025 ] loss = 0.02075, acc = 0.99333


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 019/025 ] loss = 0.22805, acc = 0.96100
[ Valid | 019/025 ] loss = 0.22805, acc = 0.96100


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 020/025 ] loss = 0.01529, acc = 0.99542


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 020/025 ] loss = 0.28353, acc = 0.95750
[ Valid | 020/025 ] loss = 0.28353, acc = 0.95750


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 021/025 ] loss = 0.02000, acc = 0.99399


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 021/025 ] loss = 0.19999, acc = 0.95550
[ Valid | 021/025 ] loss = 0.19999, acc = 0.95550


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 022/025 ] loss = 0.01424, acc = 0.99542


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 022/025 ] loss = 0.25829, acc = 0.95400
[ Valid | 022/025 ] loss = 0.25829, acc = 0.95400


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 023/025 ] loss = 0.01656, acc = 0.99481


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 023/025 ] loss = 0.22141, acc = 0.95625
[ Valid | 023/025 ] loss = 0.22141, acc = 0.95625


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 024/025 ] loss = 0.01348, acc = 0.99586


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 024/025 ] loss = 0.23076, acc = 0.95825
[ Valid | 024/025 ] loss = 0.23076, acc = 0.95825


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 025/025 ] loss = 0.01446, acc = 0.99539


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 025/025 ] loss = 0.27823, acc = 0.96175
[ Valid | 025/025 ] loss = 0.27823, acc = 0.96175


In [26]:
# set up test data loader
test_set = FashionData(os.path.join(data_dir,"test"), tfm=test_tfm)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)

model_best = BaseCNN(640,320).to(device)
model_best.load_state_dict(torch.load(f"{_exp_name}_best.ckpt"))
model_best.eval()
prediction = []

# predict based on test images
with torch.no_grad():
    for data,_ in test_loader:
        test_pred = model_best(data.to(device))
        test_label = np.argmax(test_pred.cpu().data.numpy(), axis=1)
        prediction += test_label.squeeze().tolist()

One ./Fashion/test sample ./Fashion/test/10002.jpg


In [27]:
from sklearn.metrics import confusion_matrix

# make sure the dataframe with the actual values for the testing data is sorted the same way as the predictions
y = sorted([os.path.join('./Fashion/test',x) for x in os.listdir('./Fashion/test') if x.endswith(".jpg")])
for i in range(len(y)):
  y[i]=int(y[i].split('/')[3].split('.jpg')[0])

test_df = test_df.reindex(y)

# display confusion matrix and print total accuracy
print(confusion_matrix(y_true = list(test_df['label_num']), y_pred = list(prediction), labels = None))

acc = 0
for i in range(len(list(prediction))):
  if list(test_df['label_num'])[i] == list(prediction)[i]:
    acc += 1
    
print('The final test accuracy is %.3f' % (acc/len(list(prediction))*100),'%')

[[ 259    0    0    0    0    0    0    2    0    0    0    5    0]
 [   0  219    0    0    0    0    0    4    0    0    2    0    0]
 [   0    0  114    0    0    0    0    0    0    0    0    0    0]
 [   0    0    0   96    0    0    1    2    0    0    0    0    0]
 [   0    0    0    0  147    0    0    0    0    1   11    0    0]
 [   0    0    0    0    0  100    0    4    0    2    0    0    0]
 [   0    1    0    0    0    0   28    8    0    0    0    0    0]
 [   3    5    1    4    0    1    0  508    4    4   15    1    3]
 [   0    0    0    0    0    0    0   12   54   13    0    0    0]
 [   1    0    0    0    0    0    0    5    2  660    0    0    0]
 [   0    2    0    0    3    1    0   19    0    4 1357    0    0]
 [   2    0    0    0    0    0    0    0    0    0    0   68    0]
 [   2    0    0    0    0    0    0    1    0    0    0    0  239]]
The final test accuracy is 96.225 %


For the tuned model, our best model, at epoch 15, had a training accuracy of 99.284%, a validation accuracy of 96.525%, and a testing accuracy of 96.225%, which is a noticeable improvement over the previous base model by around 2% in the validation and testing accuracies. However, we note that there is potentially some slight overfitting going on with regards to the training accuracy, which is about 3% higher than both the testing and validation accuracy. We may want to take future steps to remedy this, and achieve even further improvement

# Data Augmentation

Next, we look at ways that we can create new images for training, by augmenting random training images using a series of data transforms. This may prevent overfitting of the model.

When considering potential transforms to apply to the images, we note that most of our images are fairly straightforward and distinct - images of fashion products, generally on white backgrounds, taken from store listings, at a relatively small image size (80x60). Therefore, for our purposes, we may want to avoid any sort of cropping transform that probably wouldn't give any valuable data to the model, and might actually harm performance. Instead, we look to implement a translation of the image - potentially a horizontal or vertical flip, or a rotation a certain number of degrees. 

After some experimentation to see what works well, we settled on a random rotation anywhere from 30 degrees counter-clockwise to 30 degrees clockwise, which doesn't drastically alter the data but offers enough variation in the transformed images to prevent overfitting.

# New transformation function


In [19]:
new_train_tfm = transforms.Compose([
    transforms.Resize((80, 60)),
    transforms.Grayscale(),
    transforms.RandomRotation(30), # rotate random images 30 degrees in either direction
    transforms.ToTensor(),
])

In [20]:
batch_size = 32

train_set = FashionData(os.path.join(data_dir,"train"), dat = total_data, tfm=new_train_tfm)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)
valid_set = FashionData(os.path.join(data_dir,"valid"), dat = total_data, tfm=test_tfm)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)

One ./Fashion/train sample ./Fashion/train/10000.jpg
One ./Fashion/valid sample ./Fashion/valid/10006.jpg


# Run augmented model

In [21]:
# "cuda" only when GPUs are available.
device = "cuda" if torch.cuda.is_available() else "cpu"

n_epochs = 25
patience = 300

# declare model
model = BaseCNN(640,320).to(device)

# criterion and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=1e-5) 

stale = 0
best_acc = 0
for epoch in range(n_epochs):

    model.train()
    train_loss = []
    train_accs = []

    for batch in tqdm(train_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # Forward the data. (Make sure data and model are on the same device.)
        logits = model(imgs.to(device))

        # Calculate the cross-entropy loss.
        # We don't need to apply softmax before computing cross-entropy as it is done automatically.
        loss = criterion(logits, labels.to(device))

        # Gradients stored in the parameters in the previous step should be cleared out first.
        optimizer.zero_grad()

        # Compute the gradients for parameters.
        loss.backward()

        # Clip the gradient norms for stable training.
        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

        # Update the parameters with computed gradients.
        optimizer.step()

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        train_loss.append(loss.item())
        train_accs.append(acc)
        
    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)

    # Print the information.
    print(f"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")

    # ---------- Validation ----------
    # Make sure the model is in eval mode so that some modules like dropout are disabled and work normally.
    model.eval()

    # These are used to record information in validation.
    valid_loss = []
    valid_accs = []

    # Iterate the validation set by batches.
    for batch in tqdm(valid_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch
        #imgs = imgs.half()

        # We don't need gradient in validation.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(imgs.to(device))

        # We can still compute the loss (but not the gradient).
        loss = criterion(logits, labels.to(device))

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        valid_loss.append(loss.item())
        valid_accs.append(acc)
        #break

    # The average loss and accuracy for entire validation set is the average of the recorded values.
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)

    # Print the information.
    print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")

    # update logs
    if valid_acc > best_acc:
        with open(f"./{_exp_name}_log.txt","a"):
            print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f} -> best")
    else:
        with open(f"./{_exp_name}_log.txt","a"):
            print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")

    # save models
    if valid_acc > best_acc:
        print(f"Best model found at epoch {epoch}, saving model")
        torch.save(model.state_dict(), f"{_exp_name}_best.ckpt") # only save best to prevent output memory exceed error
        best_acc = valid_acc
        stale = 0
    else:
        stale += 1
        if stale > patience:
            print(f"No improvment {patience} consecutive epochs, early stopping")
            break

  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 001/025 ] loss = 1.86835, acc = 0.64893


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 001/025 ] loss = 0.72407, acc = 0.79525
[ Valid | 001/025 ] loss = 0.72407, acc = 0.79525 -> best
Best model found at epoch 0, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 002/025 ] loss = 0.57390, acc = 0.82527


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 002/025 ] loss = 0.48279, acc = 0.85675
[ Valid | 002/025 ] loss = 0.48279, acc = 0.85675 -> best
Best model found at epoch 1, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 003/025 ] loss = 0.44438, acc = 0.86826


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 003/025 ] loss = 0.35758, acc = 0.89100
[ Valid | 003/025 ] loss = 0.35758, acc = 0.89100 -> best
Best model found at epoch 2, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 004/025 ] loss = 0.37158, acc = 0.88846


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 004/025 ] loss = 0.41072, acc = 0.87125
[ Valid | 004/025 ] loss = 0.41072, acc = 0.87125


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 005/025 ] loss = 0.34489, acc = 0.89791


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 005/025 ] loss = 0.37021, acc = 0.89650
[ Valid | 005/025 ] loss = 0.37021, acc = 0.89650 -> best
Best model found at epoch 4, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 006/025 ] loss = 0.33601, acc = 0.90001


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 006/025 ] loss = 0.49590, acc = 0.85800
[ Valid | 006/025 ] loss = 0.49590, acc = 0.85800


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 007/025 ] loss = 0.32942, acc = 0.90194


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 007/025 ] loss = 0.29806, acc = 0.90675
[ Valid | 007/025 ] loss = 0.29806, acc = 0.90675 -> best
Best model found at epoch 6, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 008/025 ] loss = 0.31370, acc = 0.90547


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 008/025 ] loss = 0.30077, acc = 0.90950
[ Valid | 008/025 ] loss = 0.30077, acc = 0.90950 -> best
Best model found at epoch 7, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 009/025 ] loss = 0.30383, acc = 0.90729


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 009/025 ] loss = 0.33358, acc = 0.89725
[ Valid | 009/025 ] loss = 0.33358, acc = 0.89725


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 010/025 ] loss = 0.30426, acc = 0.90856


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 010/025 ] loss = 0.33829, acc = 0.90175
[ Valid | 010/025 ] loss = 0.33829, acc = 0.90175


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 011/025 ] loss = 0.30053, acc = 0.90852


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 011/025 ] loss = 0.30950, acc = 0.91250
[ Valid | 011/025 ] loss = 0.30950, acc = 0.91250 -> best
Best model found at epoch 10, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 012/025 ] loss = 0.29589, acc = 0.90888


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 012/025 ] loss = 0.39758, acc = 0.89050
[ Valid | 012/025 ] loss = 0.39758, acc = 0.89050


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 013/025 ] loss = 0.28770, acc = 0.91332


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 013/025 ] loss = 0.32593, acc = 0.90750
[ Valid | 013/025 ] loss = 0.32593, acc = 0.90750


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 014/025 ] loss = 0.28228, acc = 0.91314


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 014/025 ] loss = 0.30838, acc = 0.91675
[ Valid | 014/025 ] loss = 0.30838, acc = 0.91675 -> best
Best model found at epoch 13, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 015/025 ] loss = 0.28238, acc = 0.91389


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 015/025 ] loss = 0.36380, acc = 0.89975
[ Valid | 015/025 ] loss = 0.36380, acc = 0.89975


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 016/025 ] loss = 0.27786, acc = 0.91568


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 016/025 ] loss = 0.38243, acc = 0.88475
[ Valid | 016/025 ] loss = 0.38243, acc = 0.88475


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 017/025 ] loss = 0.27204, acc = 0.91607


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 017/025 ] loss = 0.30122, acc = 0.90800
[ Valid | 017/025 ] loss = 0.30122, acc = 0.90800


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 018/025 ] loss = 0.27637, acc = 0.91678


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 018/025 ] loss = 0.30907, acc = 0.90125
[ Valid | 018/025 ] loss = 0.30907, acc = 0.90125


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 019/025 ] loss = 0.27086, acc = 0.91746


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 019/025 ] loss = 0.26438, acc = 0.92000
[ Valid | 019/025 ] loss = 0.26438, acc = 0.92000 -> best
Best model found at epoch 18, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 020/025 ] loss = 0.26976, acc = 0.91755


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 020/025 ] loss = 0.28824, acc = 0.91450
[ Valid | 020/025 ] loss = 0.28824, acc = 0.91450


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 021/025 ] loss = 0.26803, acc = 0.91588


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 021/025 ] loss = 146.08657, acc = 0.33675
[ Valid | 021/025 ] loss = 146.08657, acc = 0.33675


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 022/025 ] loss = 0.26535, acc = 0.92040


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 022/025 ] loss = 0.28367, acc = 0.91350
[ Valid | 022/025 ] loss = 0.28367, acc = 0.91350


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 023/025 ] loss = 0.26687, acc = 0.91879


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 023/025 ] loss = 0.27322, acc = 0.91600
[ Valid | 023/025 ] loss = 0.27322, acc = 0.91600


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 024/025 ] loss = 0.26227, acc = 0.92027


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 024/025 ] loss = 0.24771, acc = 0.92475
[ Valid | 024/025 ] loss = 0.24771, acc = 0.92475 -> best
Best model found at epoch 23, saving model


  0%|          | 0/1139 [00:00<?, ?it/s]

[ Train | 025/025 ] loss = 0.25685, acc = 0.92284


  0%|          | 0/125 [00:00<?, ?it/s]

[ Valid | 025/025 ] loss = 0.33771, acc = 0.90750
[ Valid | 025/025 ] loss = 0.33771, acc = 0.90750


In [25]:
# set up test data loader
test_set = FashionData(os.path.join(data_dir,"test"), tfm=test_tfm)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)

model_best = BaseCNN(640,320).to(device)
model_best.load_state_dict(torch.load(f"{_exp_name}_best.ckpt"))
model_best.eval()
prediction = []

# predict based on test images
with torch.no_grad():
    for data,_ in test_loader:
        test_pred = model_best(data.to(device))
        test_label = np.argmax(test_pred.cpu().data.numpy(), axis=1)
        prediction += test_label.squeeze().tolist()

from sklearn.metrics import confusion_matrix

# make sure the dataframe with the actual values for the testing data is sorted the same way as the predictions
y = sorted([os.path.join('./Fashion/test',x) for x in os.listdir('./Fashion/test') if x.endswith(".jpg")])
for i in range(len(y)):
  y[i]=int(y[i].split('/')[3].split('.jpg')[0])

test_df = test_df.reindex(y)

# display confusion matrix and print total accuracy
print(confusion_matrix(y_true = list(test_df['label_num']), y_pred = list(prediction), labels = None))

acc = 0
for i in range(len(list(prediction))):
  if list(test_df['label_num'])[i] == list(prediction)[i]:
    acc += 1
    
print('The final test accuracy is %.3f' % (acc/len(list(prediction))*100),'%')

One ./Fashion/test sample ./Fashion/test/10002.jpg
[[ 245    0    0    1    0    2    0    4    0    1    1   12    0]
 [   1  207    0    0    1    5    0    6    0    0    2    3    0]
 [   0    0  114    0    0    0    0    0    0    0    0    0    0]
 [   1    0    0   73    0    4    0   21    0    0    0    0    0]
 [   0    2    0    0  131    2    0    1    0    0   22    1    0]
 [   0    0    0    0    0  101    0    3    0    1    1    0    0]
 [   2    1    0    0    0    0    6   27    0    1    0    0    0]
 [  11   15    2    7    2    7    1  449    3    2   44    2    4]
 [   0    0    0    0    0    0    0   14   54   11    0    0    0]
 [   0    0    0    0    0    0    0   16    4  648    0    0    0]
 [   4    5    0    0   11    4    0   15    0    2 1345    0    0]
 [   3    1    0    0    0    0    0    0    0    0    0   66    0]
 [   2    0    0    0    0    0    0    6    0    0    2    0  232]]
The final test accuracy is 91.775 %


# Improvement Conclusion

From our augmented experiment, we see that our best model at epoch 23 gives us a training accuracy of 92.027%, a validation accuracy of 92.475%, and a testing accuracy of 91.775%; slightly worse than our initial model. However, we note that our validation accuracy is actually higher than our training accuracy - indicating that the added images helped with overfitting, and that perhaps if given more training epochs (to help the training set accumulate more augmented image data), our accuracy could surpass the base CNN.

In conclusion, comparing our two improved models, they both improved on something with the baseline, while also falling further short in another thing. For the tuned learning rate model, we increased accuracy significantly but made the slight overfitting worse, while for the augmented data model, we did the opposite - accuracy got worse, but we fixed the overfitting. Perhaps for a future model, we could combine these two in order to see if these effects overlap for a truly superior prediction.