# Image Classification with Convolutional Neural Network (CNN) Assignment

1. Construct Kaggle Dataset 
2. Construct a simple CNN
3. Set hyperparameters (optimizer, criterion, num epochs)
4. Write train / validate code

In [4]:
# Import libraries to use for Deep Learning 

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split, Subset
from torchvision import datasets, transforms
from torchvision.io import read_image
from torchsummary import summary
import pandas as pd
from PIL import Image
import os 

import cv2

In [27]:
!pip install gdown && gdown 'https://drive.google.com/uc?id=1rctM1HDoc24XOcRzsYyTSavaFrvuoKZc' && unzip ./archive.zip -d ./sports
from google.colab import drive
drive.mount('/content/drive')

#!unzip "/content/drive/MyDrive/archive.zip" -d "/content/drive/MyDrive/sports"

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Access denied with the following error:

 	Too many users have viewed or downloaded this file recently. Please
	try accessing the file again later. If the file you are trying to
	access is particularly large or is shared with many people, it may
	take up to 24 hours to be able to view or download the file. If you
	still can't access a file after 24 hours, contact your domain
	administrator. 

You may still be able to access the file from the browser:

	 https://drive.google.com/uc?id=1rctM1HDoc24XOcRzsYyTSavaFrvuoKZc 

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## 1. (Assignment) Construct Kaggle Dataset

- Please construct the custom dataset covered in class.
- Do not use `torchvision.datasets.ImageFolder`.
- Tip: use the `sports.csv` file to get the data. It contains the file path, the label, and the split each sample belongs to.
- Tip: use `class_dict.csv` to get the index of each class as a numeric value, not a string.
- Tip: some images are grayscale (1-channel). I recommend using `cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)` to convert them to 3-channel images.
- The custom dataset should follow this structure:

```
class CustomDataset(torch.utils.data.Dataset):
    # Inherit torch.utils.data.Dataset class

    def __init__(self):
        # Initialize the dataset (handle data paths, check input and target data, set up data augmentation, etc.)
        raise NotImplementedError

    def __len__(self):
        # Return the number of samples in the dataset
        raise NotImplementedError

    def __getitem__(self, index):
        # Return the input and target at the given index
        raise NotImplementedError
```
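
The two CSV tips can be combined into a lookup from class name to numeric index. The sketch below uses small in-memory stand-ins for both files. The column names (`filepaths`, `labels`, `data set`, `class_index`, `class`) are assumptions based on the tips and on the solution in this notebook, and the rows are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-ins for class_dict.csv and sports.csv (made-up rows).
class_csv = io.StringIO("class_index,class\n0,archery\n1,bowling\n")
sports_csv = io.StringIO(
    "filepaths,labels,data set\n"
    "train/archery/001.jpg,archery,train\n"
    "train/bowling/001.jpg,bowling,train\n"
    "valid/bowling/001.jpg,bowling,valid\n"
)

classes = pd.read_csv(class_csv)
sports = pd.read_csv(sports_csv)

# Build the name -> index mapping once instead of searching per sample.
class_to_idx = dict(zip(classes["class"], classes["class_index"]))

# Keep only the rows of one split and attach the numeric label.
train_rows = sports[sports["data set"] == "train"].reset_index(drop=True)
train_labels = [int(class_to_idx[name]) for name in train_rows["labels"]]
```

Building the dictionary once in `__init__` avoids filtering the class table on every `__getitem__` call, which may matter when the loader iterates over many samples.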


In [12]:
### PLEASE WRITE YOUR CODE BELOW.

class CustomDataset(Dataset):

    def __init__(self, root_dir, img_file, class_file, data, transform=None):
        # root_dir: folder that the file paths in img_file are relative to
        # img_file: CSV listing file path, label, and split for every image
        # class_file: CSV mapping each class name to its numeric index
        # data: which split to load ("train", "valid", or "test")
        self.root_dir = root_dir
        self.classes = pd.read_csv(class_file)
        self.sports_file = pd.read_csv(img_file)
        # Keep only the rows that belong to the requested split
        self.samples = self.sports_file[self.sports_file["data set"] == data]
        self.transform = transform

    def __len__(self):
        # Return the number of samples in the selected split
        return len(self.samples)

    def __getitem__(self, index):
        # Return the input image and its target class index
        img_path = os.path.join(self.root_dir, self.samples.iloc[index, 0])
        img_label = self.samples.iloc[index, 1]
        # Look up the numeric class index for the string label
        row = self.classes.loc[self.classes['class'] == img_label]
        label = torch.tensor(row["class_index"].values[0])
        # convert('RGB') also expands grayscale images to three channels
        image = Image.open(img_path).convert('RGB')

        if self.transform:
            image = self.transform(image)

        return image, label
        
### END OF THE CODE.
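
The solution above handles grayscale files with PIL's `convert('RGB')`, where the tip suggests `cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)`. For an 8-bit grayscale image, both copy the single channel into three identical channels. A minimal NumPy sketch of that effect on a made-up 2x2 image (the helper name `gray_to_rgb` is hypothetical, and this is a stand-in for the library calls, not a replacement for them):

```python
import numpy as np


def gray_to_rgb(img: np.ndarray) -> np.ndarray:
    """Return an (H, W, 3) array; 2-D grayscale input is replicated per channel."""
    if img.ndim == 2:
        return np.stack([img, img, img], axis=-1)
    return img  # already has a channel axis


gray = np.array([[0, 128], [255, 64]], dtype=np.uint8)  # made-up pixel values
rgb = gray_to_rgb(gray)
```

After this step every image has three channels, so the first convolution can always be built with `in_channels=3`.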

In [13]:
### PLEASE WRITE YOUR CODE BELOW.

train_dataset = CustomDataset(
    root_dir = '/content/drive/MyDrive/sports/',
    img_file='/content/drive/MyDrive/sports/sports.csv',
    class_file='/content/drive/MyDrive/sports/class_dict.csv',
    data="train",
    transform=transforms.ToTensor(),
    
)
valid_dataset = CustomDataset(
    root_dir = '/content/drive/MyDrive/sports/',
    img_file='/content/drive/MyDrive/sports/sports.csv',
    class_file='/content/drive/MyDrive/sports/class_dict.csv',
    data="valid",
    transform=transforms.ToTensor(),
    
)
test_dataset = CustomDataset(
    root_dir = '/content/drive/MyDrive/sports/',
    img_file='/content/drive/MyDrive/sports/sports.csv',
    class_file='/content/drive/MyDrive/sports/class_dict.csv',
    data="test",
    transform=transforms.ToTensor(),
    
)
### YOU CAN USE ANY TRANSFORMS YOU WANT. MAKE IT RUNNABLE!

train_loader = DataLoader(dataset=train_dataset, batch_size=128, shuffle=True)
# Shuffling is only needed for training; evaluation order does not affect the metrics.
valid_loader = DataLoader(dataset=valid_dataset, batch_size=128, shuffle=False)
test_loader = DataLoader(dataset=test_dataset, batch_size=128, shuffle=False)

### END OF THE CODE.

## 2. (Assignment) Construct a network - Simple CNN

- Please construct 4 convolution blocks with the following sequences.


```
first layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 16, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

second layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 32, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

third layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 64, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

fourth layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 128, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

classifier = [Linear -> ReLU -> Linear]

- flatten the output tensor.
- first linear layer returns output dimension of 5012
- second linear layer returns output dimension of number of classes

```
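
The input size of the first linear layer follows from the specification above. The sketch below traces the spatial size through the four blocks, assuming 224x224 inputs (the size used elsewhere in this notebook). The helper name `conv_out` is hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224  # assumed input height and width
for _ in range(4):
    size = conv_out(size, kernel=3, stride=1, padding=1)  # 3x3 conv keeps the size
    size = conv_out(size, kernel=2, stride=2, padding=0)  # 2x2 pooling halves it

# Channels of the fourth block times height times width.
flat_features = 128 * size * size
```

With these assumptions the size goes 224, 112, 56, 28, 14, so the flattened tensor has 128 * 14 * 14 = 25088 features. A different input resolution would change this number and the first linear layer with it.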


In [14]:
### PLEASE WRITE YOUR CODE BELOW.

class SimpleCNN(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channels, out_channels=16, kernel_size=3,
                               stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3,
                               stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3,
                               stride=1, padding=1)
        self.conv4 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3,
                               stride=1, padding=1)
        
        self.avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)
        

        self.bn1 = nn.BatchNorm2d(num_features=16)
        self.bn2 = nn.BatchNorm2d(num_features=32)
        self.bn3 = nn.BatchNorm2d(num_features=64)
        self.bn4 = nn.BatchNorm2d(num_features=128)

        self.dropout = nn.Dropout(p=0.5)

        self.activation = nn.ReLU()

        self.linear = nn.Linear(in_features=128*14*14, out_features=5012)
        self.classifier = nn.Linear(in_features=5012, out_features=num_classes)

    def forward(self, x):
        out = self.avg_pool(self.dropout(self.activation(self.bn1(self.conv1(x)))))
        out = self.avg_pool(self.dropout(self.activation(self.bn2(self.conv2(out)))))
        out = self.avg_pool(self.dropout(self.activation(self.bn3(self.conv3(out)))))
        out = self.avg_pool(self.dropout(self.activation(self.bn4(self.conv4(out)))))

        out = out.view(out.size(0), -1)        

        out = self.linear(out)
        out = self.activation(out)
        out = self.classifier(out)

        return out

### END OF THE CODE.

In [15]:
model = SimpleCNN(in_channels=3, num_classes=100).cuda()
summary(model, (3, 224, 224), device='cuda')

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 16, 224, 224]             448
       BatchNorm2d-2         [-1, 16, 224, 224]              32
              ReLU-3         [-1, 16, 224, 224]               0
           Dropout-4         [-1, 16, 224, 224]               0
         AvgPool2d-5         [-1, 16, 112, 112]               0
            Conv2d-6         [-1, 32, 112, 112]           4,640
       BatchNorm2d-7         [-1, 32, 112, 112]              64
              ReLU-8         [-1, 32, 112, 112]               0
           Dropout-9         [-1, 32, 112, 112]               0
        AvgPool2d-10           [-1, 32, 56, 56]               0
           Conv2d-11           [-1, 64, 56, 56]          18,496
      BatchNorm2d-12           [-1, 64, 56, 56]             128
             ReLU-13           [-1, 64, 56, 56]               0
          Dropout-14           [-1, 64,
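
The parameter counts in the summary can be checked by hand. This is a small sketch with hypothetical helper names; it assumes each convolution has a bias term and each batch-norm layer has a learnable scale and shift, which are the PyTorch defaults.

```python
def conv2d_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weights (in_ch * out_ch * k * k) plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch


def batchnorm_params(channels: int) -> int:
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


counts = {
    "Conv2d-1": conv2d_params(3, 16, 3),
    "BatchNorm2d-2": batchnorm_params(16),
    "Conv2d-6": conv2d_params(16, 32, 3),
    "Conv2d-11": conv2d_params(32, 64, 3),
}
```

These values match the corresponding rows of the summary (448, 32, 4,640, and 18,496).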

## 3. (Assignment) Set hyperparameters

- Set the total number of epochs to be 50 and the learning rate to be 0.001.

- Use any optimizer you want. Please refer [here](https://pytorch.org/docs/stable/optim.html) for further details.
    - Remember different optimizers have different hyperparameters.
- Set the loss function to be cross entropy loss.
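
To illustrate the point about optimizer-specific hyperparameters, the sketch below builds two optimizers for a tiny stand-in model. The model and the SGD momentum value of 0.9 are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # tiny stand-in model, only here to supply parameters

# Adam is configured with `betas`; it has no `momentum` argument.
adam = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))

# SGD is configured with `momentum`; it has no `betas` argument.
sgd = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Cross entropy loss, as required by the assignment.
criterion = nn.CrossEntropyLoss()
```

Passing a hyperparameter that an optimizer does not accept raises a `TypeError`, so the set of variables defined in the next cell should match the optimizer that is actually used.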

In [16]:
### PLEASE FILL OUT THE HYPERPARAMETERS
### NOTE THAT YOU SHOULD SET DIFFERENT PARAMETERS FOR DIFFERENT OPTIMIZERS.

lr = 0.001
epochs = 50

## OPTIMIZER HYPERPARAMETERS - PLEASE ADD/REMOVE DEPENDING ON THE OPTIMIZER.
betas = (0.9, 0.999)
momentum = None

## WHEN USING GPU, PUT `.cuda()` on model and criterion.

model = SimpleCNN(in_channels=3, num_classes=100).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=betas)
criterion = nn.CrossEntropyLoss().cuda()
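
`nn.CrossEntropyLoss` takes raw logits and applies log-softmax internally. As a rough reference point for reading the loss values during training, the sketch below computes the same quantity by hand for one sample. The function name and the choice of target index are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    peak = max(logits)
    shifted = [z - peak for z in logits]  # subtract the max for numerical stability
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    return log_sum_exp - shifted[target]


# A model with no preference among 100 classes outputs equal logits.
uniform_loss = cross_entropy([0.0] * 100, target=7)
```

For 100 classes this uniform-guess loss equals ln(100), roughly 4.6, so a loss well below that value suggests the model has started to learn.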

## 4. (Assignment) Write train / validation code

- For each epoch, we train and validate the model.
- Note that the validation dataset is not included in the test set.
- Please refer to the following procedure:


    for each epoch:
        model.train()
        get input and target data from train loader
        
        optimizer.zero_grad()            # reset the gradient 
        pred = model(input)

        loss = criterion(pred, target)   # compute the loss
        loss.backward()                  # backprop
        optimizer.step()                 # update the model weights

        model.eval()                     # set evaluation mode (batchnorm uses running statistics, dropout is disabled)
        with torch.no_grad():
            get the input and target data from validation loader

            pred = model(input)
            compute the validation loss  # Optional 
            calculate the validation accuracy
            save the model w.r.t. validation accuracy
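
The last step of the procedure, saving the model with respect to validation accuracy, can be sketched as follows. The stand-in model, the per-epoch accuracies, and the checkpoint path are all made up for illustration. In the real loop the accuracy would come from the validation pass.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                     # tiny stand-in for the CNN
val_accuracies = [0.03, 0.08, 0.12, 0.10]   # made-up per-epoch accuracies
ckpt_path = os.path.join(tempfile.mkdtemp(), "best_model.pt")  # hypothetical path

best_acc = 0.0
best_epoch = -1
for epoch, val_acc in enumerate(val_accuracies):
    # Save only when the validation accuracy improves on the best seen so far.
    if val_acc > best_acc:
        best_acc = val_acc
        best_epoch = epoch
        torch.save(model.state_dict(), ckpt_path)

# Restore the best weights, e.g. before evaluating on the test set.
model.load_state_dict(torch.load(ckpt_path))
```

Saving the `state_dict` rather than the whole module keeps the checkpoint independent of how the model class is defined in the notebook.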



In [17]:
def train(model, optimizer, criterion, data_loader, epoch):
    model.train()
    total_loss = 0.0
    for idx, batch in enumerate(data_loader):
        img, target = batch[0].cuda(), batch[1].cuda()

        ### PLEASE WRITE YOUR CODE BELOW.
        # Reset the gradients accumulated in the previous step
        optimizer.zero_grad()

        # Make a prediction
        output = model(img)

        # Calculate loss with prediction and target
        loss = criterion(output, target)

        # Compute the gradient
        loss.backward()

        # Update Parameters
        optimizer.step()

        ### END OF THE CODE.

        total_loss += loss.item() 

        if idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch + 1, idx * img.size(0), len(data_loader.dataset),
                100. * idx * img.size(0) / len(data_loader.dataset), 
                loss.item()))

    return total_loss / len(data_loader)

def validate(model, criterion, data_loader):
    model.eval()
    val_loss = 0.0
    val_acc = 0.0

    with torch.no_grad():
        for idx, batch in enumerate(data_loader):
            img, target = batch[0].cuda(), batch[1].cuda()

            ### PLEASE WRITE YOUR CODE BELOW.

            # Make a prediction
            output = model(img)

            # Calculate validation loss (although it is optional)
            loss = criterion(output, target)

            # Get the right prediction - make sure naming the prediction as 'predicted'
            _, predicted = torch.max(output.data, 1) 

            ### END OF THE CODE.

            val_loss += loss.item()
            val_acc += (predicted == target).sum().item()

        total_val_acc = val_acc / len(data_loader.dataset)
        print('\nValidation set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            val_loss / len(data_loader), val_acc, len(data_loader.dataset),
            100. * total_val_acc))
    
    return total_val_acc
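
The accuracy bookkeeping in `validate` rests on taking the index of the largest logit in each row. A small sketch with made-up logits for three samples and three classes:

```python
import torch

# Made-up logits: one row per sample, one column per class.
logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.3, 1.5],
                       [0.9, 1.1, 0.4]])
target = torch.tensor([0, 2, 0])

# torch.max over dim 1 returns (max values, indices); the indices are the predictions.
max_values, predicted = torch.max(logits, 1)

# Count how many predictions agree with the targets.
correct = (predicted == target).sum().item()
```

Only the indices are needed here, so `logits.argmax(dim=1)` gives the same predictions. Dividing the running count by `len(data_loader.dataset)` then yields the accuracy over the whole split.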

In [28]:
def test(model, data_loader):
    model.eval()
    test_acc = 0.0

    with torch.no_grad():
        for idx, batch in enumerate(data_loader):
            img, target = batch[0].cuda(), batch[1].cuda()

            ### PLEASE WRITE YOUR CODE BELOW.

            # Make a prediction
            output = model(img)

            # No loss is computed here; only accuracy is reported for the test set.

            # Get the right prediction - make sure naming the prediction as 'predicted'
            _, predicted = torch.max(output.data, 1) 
            test_acc += (predicted == target).sum().item() 

            ### END OF THE CODE.

        print('\n Test set:  Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_acc, len(data_loader.dataset),
            100. * test_acc / len(data_loader.dataset)))

In [20]:
for epoch in range(epochs):

    ### PLEASE WRITE YOUR CODE BELOW.
    
    # Train your model with train dataloader

    train_loss = train(model, optimizer, criterion, train_loader, epoch)

    # Validate your model with validation dataloader

    validation_accuracy = validate(model, criterion, valid_loader)

    ### END OF THE CODE.


Validation set: Average loss: 4.3078, Accuracy: 16.0/500 (3%)


Validation set: Average loss: 3.8555, Accuracy: 39.0/500 (8%)


Validation set: Average loss: 3.5665, Accuracy: 62.0/500 (12%)


Validation set: Average loss: 3.3565, Accuracy: 74.0/500 (15%)


Validation set: Average loss: 3.1399, Accuracy: 105.0/500 (21%)


Validation set: Average loss: 3.3707, Accuracy: 105.0/500 (21%)


Validation set: Average loss: 2.9310, Accuracy: 127.0/500 (25%)


Validation set: Average loss: 2.7146, Accuracy: 155.0/500 (31%)


Validation set: Average loss: 2.6912, Accuracy: 156.0/500 (31%)


Validation set: Average loss: 2.5712, Accuracy: 168.0/500 (34%)


Validation set: Average loss: 2.5948, Accuracy: 184.0/500 (37%)


Validation set: Average loss: 2.6672, Accuracy: 171.0/500 (34%)


Validation set: Average loss: 2.8663, Accuracy: 160.0/500 (32%)


Validation set: Average loss: 2.9929, Accuracy: 158.0/500 (32%)


Validation set: Average loss: 2.5064, Accuracy: 195.0/500 (39%)


Validation set:

In [29]:
# Test your model with test dataloader
None

In [30]:
### PLEASE EXECUTE THE FOLLOWING CELL BEFORE SUBMITTING YOUR CODE

### DO NOT MODIFY THIS CELL
print(train_dataset[0][0].size())
print(model(torch.rand(1, 3, 224, 224, device='cuda')).size())
test(model, test_loader)
### DO NOT MODIFY THIS CELL

torch.Size([3, 224, 224])
torch.Size([1, 100])

 Test set:  Accuracy: 218.0/500 (44%)

