<a href="https://colab.research.google.com/github/UNIST-LIM-Lab-course/cnn-assignment-JJukE/blob/main/CNN_assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Image Classification with Convolutional Neural Network (CNN) Assignment

1. Construct Kaggle Dataset 
2. Construct a simple CNN
3. Set hyperparameters (optimizer, criterion, num epochs)
4. Write train / validate code

In [1]:
# Import libraries to use for Deep Learning 

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split, Subset
from torchvision import datasets, transforms
from torchvision.io import read_image
from torchsummary import summary
import pandas as pd
from PIL import Image
import os 

import cv2

In [4]:
!pip install gdown && gdown 'https://drive.google.com/uc?id=1rctM1HDoc24XOcRzsYyTSavaFrvuoKZc' && unzip ./archive.zip -d ./sports

Streaming output truncated to the last 5000 lines.
  inflating: ./sports/train/rings/058.jpg  
  inflating: ./sports/train/rings/059.jpg  
  inflating: ./sports/train/rings/060.jpg  
  inflating: ./sports/train/rings/061.jpg  
  inflating: ./sports/train/rings/062.jpg  
  inflating: ./sports/train/rings/063.jpg  
  inflating: ./sports/train/rings/064.jpg  
  inflating: ./sports/train/rings/065.jpg  
  inflating: ./sports/train/rings/066.jpg  
  inflating: ./sports/train/rings/067.jpg  
  inflating: ./sports/train/rings/068.jpg  
  inflating: ./sports/train/rings/069.jpg  
  inflating: ./sports/train/rings/070.jpg  
  inflating: ./sports/train/rings/071.jpg  
  inflating: ./sports/train/rings/072.jpg  
  inflating: ./sports/train/rings/073.jpg  
  inflating: ./sports/train/rings/074.jpg  
  inflating: ./sports/train/rings/075.jpg  
  inflating: ./sports/train/rings/076.jpg  
  inflating: ./sports/train/rings/077.jpg  
  inflating: ./sports/train/rings/078.jpg  
  inflating: ./sports/trai

## 1. (Assignment) Construct Kaggle Dataset

- Please construct the custom dataset covered in class.
- Do not use `torchvision.datasets.ImageFolder`.
- The custom dataset should follow the skeleton given after these tips.
- Tip: use the `sports.csv` file to get the data. (It contains the file path, the label, and the dataset split each sample belongs to.)
- Tip: use `class_dict.csv` to get the index of each class as a numeric value, not a string.
- Tip: some images are grayscale (1-channel). I recommend using `cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)` to convert them to 3-channel images.

```
class CustomDataset(torch.utils.data.Dataset):
    # Inherit from the torch.utils.data.Dataset class

    def __init__(self):
        # Initialize the dataset (handle data paths, check input and target data, set up data augmentation, etc.)
        ...

    def __len__(self):
        # Return the number of samples in the dataset
        ...

    def __getitem__(self, index):
        # Return the input and target at the given index
        ...
```
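
As a rough illustration of the two CSV tips above, the sketch below builds a class-name-to-index lookup and selects the rows of one split. The rows are made-up in-memory stand-ins, not the real files. The column names `filepaths`, `labels`, `class`, and `class_index` are assumed to match the real CSV files, and the helper names `load_class_to_index` and `select_split` are hypothetical.

```python
import csv
import io

# Made-up stand-ins for class_dict.csv and sports.csv (assumed column names).
CLASS_DICT_CSV = "class_index,class\n0,hypothetical_sport\n1,rings\n"
SPORTS_CSV = (
    "filepaths,labels\n"
    "train/hypothetical_sport/001.jpg,hypothetical_sport\n"
    "train/rings/058.jpg,rings\n"
    "valid/rings/1.jpg,rings\n"
)

def load_class_to_index(text):
    """Map each class name (string) to its numeric index (int)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["class"]: int(row["class_index"]) for row in reader}

def select_split(text, split, class_to_index):
    """Return (filepath, class index) pairs whose path starts with the split name."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["filepaths"], class_to_index[row["labels"]])
        for row in reader
        if row["filepaths"].startswith(split + "/")
    ]

class_to_index = load_class_to_index(CLASS_DICT_CSV)
train_samples = select_split(SPORTS_CSV, "train", class_to_index)
print(train_samples)
```

Building the lookup once keeps the per-sample work to a single dictionary access, and the targets come out as plain integers that a loss function can consume.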


In [2]:
### PLEASE WRITE YOUR CODE BELOW.

class CustomDataset(Dataset):

    def __init__(self, root_dir='./sports/', data_type='train'):
        # Initialize the dataset (handling data paths, check input and target data, data augmentation, etc.)
        data_df = pd.read_csv(os.path.join(root_dir, 'sports.csv'))
        label_df = pd.read_csv(os.path.join(root_dir, 'class_dict.csv'))

        # Build the class-name -> numeric-index lookup once instead of filtering label_df per sample
        class_to_index = dict(zip(label_df['class'], label_df['class_index']))

        transform = transforms.Compose([
            transforms.ToTensor()  # HWC uint8 array -> CHW float tensor scaled to [0, 1]
            # transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])

        images = []
        class_indices = []

        for filepath, label in zip(data_df['filepaths'], data_df['labels']):
            # Keep only the rows of the requested split (its name appears in the file path)
            if data_type not in filepath:
                continue

            image = cv2.imread(os.path.join(root_dir, filepath))
            if image is None:
                raise FileNotFoundError(f'Could not read image: {filepath}')

            # cv2.imread returns BGR by default; the ndim check guards against 2-D grayscale arrays
            if image.ndim == 2:
                image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
            else:
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

            images.append(transform(image))
            class_indices.append(int(class_to_index[label]))

        self.data = images
        self.classes = class_indices

    def __len__(self):
        # Return the number of samples in the dataset
        return len(self.data)

    def __getitem__(self, index):
        # Return the input and target by index
        return self.data[index], self.classes[index]
### END OF THE CODE.

In [5]:
### PLEASE WRITE YOUR CODE BELOW.

train_dataset = CustomDataset(root_dir='./sports/', data_type='train')
valid_dataset = CustomDataset(root_dir='./sports/', data_type='valid')
test_dataset = CustomDataset(root_dir='./sports/', data_type='test')

print(len(train_dataset), len(valid_dataset), len(test_dataset))

### YOU CAN USE ANY TRANSFORMS YOU WANT. MAKE IT RUNNABLE!

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
valid_loader = DataLoader(valid_dataset, batch_size=32, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

### END OF THE CODE.

13572 500 500


## 2. (Assignment) Construct a network - Simple CNN

- Please construct 4 convolution blocks with the following sequences.


```
first layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 16, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

second layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 32, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

third layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 64, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

fourth layer = [2D Conv -> BatchNorm -> ReLU -> Dropout -> Pooling]

- 2D Convolution with 3x3 kernel size, returns output dimension of 128, use stride and padding = 1.
- use whatever pooling you want with 2x2 kernel size and stride of 2.

classifier = [Linear -> ReLU -> Linear]

- flatten the output tensor.
- first linear layer returns output dimension of 5012
- second linear layer returns output dimension of number of classes

```
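
To size the first linear layer, it helps to track the spatial resolution through the four blocks. The sketch below applies the standard output-size formula `floor((n + 2p - k) / s) + 1` to each convolution and pooling step. The 224x224 input size is an assumption taken from the shape check at the end of this notebook, and the helper name `out_size` is hypothetical.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

size = 224  # assumed input height and width
for channels in (16, 32, 64, 128):
    size = out_size(size, kernel=3, stride=1, padding=1)  # 3x3 conv, padding 1: size unchanged
    size = out_size(size, kernel=2, stride=2)             # 2x2 pooling, stride 2: size halved
    print(channels, size)

# Number of features after flattening the last block's output.
flattened = channels * size * size
print(flattened)  # 128 * 14 * 14 = 25088
```

Under these assumptions the first linear layer needs 25088 input features; a different input resolution would change this number.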


In [12]:
### PLEASE WRITE YOUR CODE BELOW.

class SimpleCNN(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()

        self.layer1 = nn.Sequential(
          nn.Conv2d(in_channels=in_channels, out_channels=16, kernel_size=3, stride=1, padding=1),
          nn.BatchNorm2d(num_features=16),
          nn.ReLU(),
          nn.Dropout(p=0.8),
          nn.AvgPool2d(kernel_size=2, stride=2)
        )

        self.layer2 = nn.Sequential(
          nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=1, padding=1),
          nn.BatchNorm2d(num_features=32),
          nn.ReLU(),
          nn.Dropout(p=0.8),
          nn.AvgPool2d(kernel_size=2, stride=2)
        )

        self.layer3 = nn.Sequential(
          nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1),
          nn.BatchNorm2d(num_features=64),
          nn.ReLU(),
          nn.Dropout(p=0.8),
          nn.AvgPool2d(kernel_size=2, stride=2)
        )

        self.layer4 = nn.Sequential(
          nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1),
          nn.BatchNorm2d(num_features=128),
          nn.ReLU(),
          nn.Dropout(p=0.8),
          nn.AvgPool2d(kernel_size=2, stride=2)
        )

        self.classifier = nn.Sequential(
          nn.Linear(in_features=14*14*128, out_features=5012),
          nn.ReLU(),
          nn.Linear(in_features=5012, out_features=num_classes)
        )

    def forward(self, x):
        
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)

        out = out.view(out.size(0), -1)
        out = self.classifier(out)

        return out

### END OF THE CODE.

In [13]:
model = SimpleCNN(in_channels=3, num_classes=100).cuda()
summary(model, (3, 224, 224), device='cuda')

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 16, 224, 224]             448
       BatchNorm2d-2         [-1, 16, 224, 224]              32
              ReLU-3         [-1, 16, 224, 224]               0
           Dropout-4         [-1, 16, 224, 224]               0
         AvgPool2d-5         [-1, 16, 112, 112]               0
            Conv2d-6         [-1, 32, 112, 112]           4,640
       BatchNorm2d-7         [-1, 32, 112, 112]              64
              ReLU-8         [-1, 32, 112, 112]               0
           Dropout-9         [-1, 32, 112, 112]               0
        AvgPool2d-10           [-1, 32, 56, 56]               0
           Conv2d-11           [-1, 64, 56, 56]          18,496
      BatchNorm2d-12           [-1, 64, 56, 56]             128
             ReLU-13           [-1, 64, 56, 56]               0
          Dropout-14           [-1, 64,

## 3. (Assignment) Set hyperparameters

- Set the total number of epochs to 50 and the learning rate to 0.001.

- Use any optimizer you want. Please refer to the [PyTorch optimizer documentation](https://pytorch.org/docs/stable/optim.html) for further details.
    - Remember that different optimizers have different hyperparameters.
- Set the loss function to cross-entropy loss.
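
As a hedged illustration of the points above, the sketch below builds two common optimizers with their own hyperparameters and checks that `nn.CrossEntropyLoss` takes raw logits and integer class indices. The tiny linear model, the random inputs, and the momentum value are arbitrary choices for the example, not part of the assignment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
toy_model = nn.Linear(8, 4)  # hypothetical stand-in for the CNN

# Adam is configured with betas; SGD is configured with momentum instead.
adam = torch.optim.Adam(toy_model.parameters(), lr=0.001, betas=(0.9, 0.999))
sgd = torch.optim.SGD(toy_model.parameters(), lr=0.001, momentum=0.9)

criterion = nn.CrossEntropyLoss()
logits = toy_model(torch.randn(5, 8))    # raw scores, no softmax applied
targets = torch.tensor([0, 3, 1, 2, 0])  # class indices, dtype int64

loss = criterion(logits, targets)
# Cross entropy equals the mean negative log-softmax score of the true class.
manual = -F.log_softmax(logits, dim=1)[torch.arange(5), targets].mean()
print(loss.item(), manual.item())
```

Because the loss applies log-softmax internally, the network's last layer should output unnormalized scores.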

In [14]:
### PLEASE FILL OUT THE HYPERPARAMETERS
### NOTE THAT YOU SHOULD SET DIFFERENT PARAMETERS FOR DIFFERENT OPTIMIZERS.

lr = 0.001
epochs = 50

## OPTIMIZER HYPERPARAMETERS - PLEASE ADD/REMOVE DEPENDING ON THE OPTIMIZER.
betas = (0.9, 0.999)

## WHEN USING GPU, PUT `.cuda()` on model and criterion.

model = SimpleCNN(in_channels=3, num_classes=100).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=betas)
criterion = nn.CrossEntropyLoss().cuda()

## 4. (Assignment) Write train / validation code

- For each epoch, we train and validate the model.
- Note that the validation set is separate from the test set.
- Please refer to the following procedure:


    for each epoch:
        model.train()
        get input and target data from train loader
        
        optimizer.zero_grad()            # reset the gradients
        pred = model(input)

        loss = criterion(pred, target)   # compute the loss
        loss.backward()                  # backprop
        optimizer.step()                 # update the model weights

        model.eval()                     # set evaluation mode (dropout off, batchnorm uses running statistics)
        with torch.no_grad():
            get the input and target data from validation loader

            pred = model(input)
            compute the validation loss  # Optional 
            calculate the validation accuracy
            save the model w.r.t. validation accuracy
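
The last step of the procedure, saving the model with respect to validation accuracy, could be sketched as follows. The accuracy values are made up for illustration, and the names `best_acc`, `best_epoch`, and `best_state` are hypothetical. In a real loop the accuracy would come from the validation pass, and `torch.save(best_state, path)` could be used to write the weights to disk.

```python
import copy
import torch.nn as nn

toy_model = nn.Linear(4, 2)  # hypothetical stand-in for the CNN
fake_val_accuracies = [0.07, 0.11, 0.10, 0.15]  # made-up values, one per epoch

best_acc = 0.0
best_epoch = -1
best_state = None
for epoch, val_acc in enumerate(fake_val_accuracies):
    # ... train for one epoch, then validate to obtain val_acc ...
    if val_acc > best_acc:
        best_acc = val_acc
        best_epoch = epoch
        # Deep-copy so later training steps do not overwrite the snapshot.
        best_state = copy.deepcopy(toy_model.state_dict())

# Restore the best weights before evaluating on the test set.
toy_model.load_state_dict(best_state)
print(best_epoch, best_acc)
```

Keeping the best snapshot guards against a late epoch with worse validation accuracy being the one that gets tested.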



In [15]:
def train(model, optimizer, criterion, data_loader, epoch):
    model.train()
    total_loss = 0.0
    for idx, batch in enumerate(data_loader):
        img, target = batch[0].cuda(), batch[1].cuda()

        ### PLEASE WRITE YOUR CODE BELOW.
        # Reset the gradients accumulated in the previous step
        optimizer.zero_grad()

        # Make a prediction
        pred = model(img)

        # Calculate loss with prediction and target
        loss = criterion(pred, target)

        # Compute the gradient
        loss.backward()

        # Update Parameters
        optimizer.step()

        ### END OF THE CODE.

        total_loss += loss.item() 

        if idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch + 1, idx * img.size(0), len(data_loader.dataset),
                100. * idx * img.size(0) / len(data_loader.dataset), 
                loss.item()))

    return total_loss / len(data_loader)

def validate(model, criterion, data_loader):
    model.eval()
    val_loss = 0.0
    val_acc = 0.0

    with torch.no_grad():
        for idx, batch in enumerate(data_loader):
            img, target = batch[0].cuda(), batch[1].cuda()

            ### PLEASE WRITE YOUR CODE BELOW.

            # Make a prediction
            pred = model(img)

            # Calculate validation loss (although it is optional)
            loss = criterion(pred, target)

            # Get the predicted class - make sure to name the prediction 'predicted'
            predicted = pred.argmax(dim=1)

            ### END OF THE CODE.

            val_loss += loss.item()
            val_acc += (predicted == target).sum().item()

        total_val_acc = val_acc / len(data_loader.dataset)
        print('\nValidation set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            val_loss / len(data_loader), val_acc, len(data_loader.dataset),
            100. * total_val_acc))
    
    return total_val_acc
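
To make the accuracy bookkeeping concrete, here is a small sketch with made-up logits: the predicted class is the index of the largest score in each row, and the number of matches with the targets is what gets accumulated per batch.

```python
import torch

# Made-up logits for 4 samples and 3 classes.
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.3, 0.2, 0.9],
                       [1.1, 0.4, 0.2]])
targets = torch.tensor([0, 1, 1, 0])

# torch.max along dim=1 returns (values, indices); the indices are the
# predicted classes, the same result as logits.argmax(dim=1).
_, predicted = torch.max(logits, 1)
correct = (predicted == targets).sum().item()
accuracy = correct / targets.size(0)
print(predicted.tolist(), correct, accuracy)
```

Dividing the summed matches by the dataset size, rather than averaging per-batch accuracies, keeps the result correct when the last batch is smaller than the others.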

In [16]:
def test(model, data_loader):
    model.eval()
    test_loss = 0.0
    test_acc = 0.0

    with torch.no_grad():
        for idx, batch in enumerate(data_loader):
            img, target = batch[0].cuda(), batch[1].cuda()

            ### PLEASE WRITE YOUR CODE BELOW.

            # Make a prediction
            output = model(img)

            # Calculate the test loss (optional); `criterion` is the global defined in the hyperparameter cell
            loss = criterion(output, target)

            # Get the predicted class - make sure to name the prediction 'predicted'
            predicted = output.argmax(dim=1)
            test_loss += loss.item()
            test_acc += (predicted == target).sum().item()

            ### END OF THE CODE.

        print('\n Test set:  Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_acc, len(data_loader.dataset),
            100. * test_acc / len(data_loader.dataset)))

In [17]:
for epoch in range(epochs):

    ### PLEASE WRITE YOUR CODE BELOW.
    
    # Train your model with train dataloader
    train_loss = train(model, optimizer, criterion, train_loader, epoch)

    # Validate your model with validation dataloader
    validation_accuracy = validate(model, criterion, valid_loader)

    ### END OF THE CODE.


Validation set: Average loss: 4.0055, Accuracy: 33.0/500 (7%)


Validation set: Average loss: 3.7822, Accuracy: 57.0/500 (11%)


Validation set: Average loss: 3.5503, Accuracy: 76.0/500 (15%)


Validation set: Average loss: 3.2536, Accuracy: 89.0/500 (18%)


Validation set: Average loss: 2.9339, Accuracy: 132.0/500 (26%)


Validation set: Average loss: 2.9522, Accuracy: 125.0/500 (25%)


Validation set: Average loss: 2.8970, Accuracy: 125.0/500 (25%)


Validation set: Average loss: 2.7737, Accuracy: 144.0/500 (29%)


Validation set: Average loss: 2.6698, Accuracy: 144.0/500 (29%)


Validation set: Average loss: 2.6081, Accuracy: 166.0/500 (33%)


Validation set: Average loss: 2.6345, Accuracy: 175.0/500 (35%)


Validation set: Average loss: 2.4463, Accuracy: 179.0/500 (36%)


Validation set: Average loss: 2.6471, Accuracy: 165.0/500 (33%)


Validation set: Average loss: 2.8153, Accuracy: 156.0/500 (31%)


Validation set: Average loss: 2.3745, Accuracy: 195.0/500 (39%)


Validation set

In [18]:
# Test your model with test dataloader
test(model, test_loader)


 Test set:  Accuracy: 268.0/500 (54%)



In [19]:
### PLEASE EXECUTE THE FOLLOWING CELL BEFORE SUBMITTING YOUR CODE

### DO NOT MODIFY THIS CELL
print(train_dataset[0][0].size())
print(model(torch.rand(1, 3, 224, 224, device='cuda')).size())
test(model, test_loader)
### DO NOT MODIFY THIS CELL

torch.Size([3, 224, 224])
torch.Size([1, 100])

 Test set:  Accuracy: 268.0/500 (54%)

