<a href="https://colab.research.google.com/github/hugoalfedoputra-ub/ml/blob/main/nn_course/T2/Tugas_2_Studi_Kasus_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CNN Case Study
Deep Learning B

Hugo Alfedo Putra\
225150201111013

4 October 2024

This Colab notebook is based on the PyTorch tutorial (https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html) and on yunjey's tutorial (https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/01-basics/feedforward_neural_network/main.py).

In [None]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import torch.optim as optim

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

The parameters for the models trained on MNIST and CIFAR-10 are initially kept close to those of the PyTorch tutorial. The architectural difference for the MNIST model is the convolution kernel size: CIFAR-10 uses 5, whereas MNIST uses 4 because MNIST images (28x28) are smaller than CIFAR-10 images (32x32). The number of input channels, the batch size, and the number of epochs also differ between the two configurations.

In [None]:
# Parameters for MNIST
mnist_input_size = 28
mnist_hidden_size = [120, 84]
mnist_num_classes = 10
mnist_num_epochs = 5
mnist_batch_size = 100
mnist_learning_rate = 0.001
mnist_in_channels = 1 # depth is 1 (grayscale)
mnist_conv1_out_channels = 6
mnist_conv2_out_channels = 16
mnist_conv_kernel_size = 4
mnist_pool_kernel_size = 2
mnist_stride = 2

In [None]:
# Parameters for CIFAR-10
cifar_input_size = 32
cifar_hidden_size = [120, 84]
cifar_num_classes = 10
cifar_num_epochs = 3
cifar_batch_size = 4
cifar_learning_rate = 0.001
cifar_in_channels = 3 # depth is 3 (RGB)
cifar_conv1_out_channels = 6
cifar_conv2_out_channels = 16
cifar_conv_kernel_size = 5
cifar_pool_kernel_size = 2
cifar_stride = 2

# Download dataset

In [None]:
# MNIST dataset
# Normalize((0,), (1,)) leaves the values unchanged, so inputs stay in [0, 1]
mnist_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0,), (1,))
])

mnist_train_set = torchvision.datasets.MNIST(root='../../data',
                                           train=True,
                                           transform=mnist_transform,
                                           download=True)

mnist_test_set = torchvision.datasets.MNIST(root='../../data',
                                          train=False,
                                          transform=mnist_transform)

# Data loader
mnist_train_loader = torch.utils.data.DataLoader(dataset=mnist_train_set,
                                           batch_size=mnist_batch_size,
                                           shuffle=True)

mnist_test_loader = torch.utils.data.DataLoader(dataset=mnist_test_set,
                                          batch_size=mnist_batch_size,
                                          shuffle=False)

In [None]:
# CIFAR-10 dataset
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

cifar_train_set = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
cifar_test_set = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)

# Data loader
cifar_train_loader = torch.utils.data.DataLoader(cifar_train_set, batch_size=cifar_batch_size,
                                          shuffle=True, num_workers=2)
cifar_test_loader = torch.utils.data.DataLoader(cifar_test_set, batch_size=cifar_batch_size,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified


# Net class definition

The network from the PyTorch tutorial is modified so that it can be configured with dataset-specific parameters. As the declaration of `self.net` shows, the architecture is as follows:

1. Input layer
2. First convolutional layer, activated with ReLU
3. First pooling layer
4. Second convolutional layer, activated with ReLU
5. Second pooling layer
6. First FC (dense) layer, fed by the flattened output of the second pooling layer and activated with ReLU
7. Second FC layer, activated with ReLU
8. Third FC layer as the output with 10 classes (no activation during training, because the raw outputs are passed directly to `nn.CrossEntropyLoss`, which applies log-softmax internally)

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
  def __init__(
      self,
      input_size,
      hidden_size,
      in_channels,
      conv1_out_channels,
      conv2_out_channels,
      conv_kernel_size,
      pool_kernel_size,
      num_classes,
      stride=2
      ):
    super().__init__()
    self.conv1 = nn.Conv2d(in_channels, conv1_out_channels, conv_kernel_size)
    self.pool = nn.MaxPool2d(pool_kernel_size, stride)
    self.conv2 = nn.Conv2d(conv1_out_channels, conv2_out_channels, conv_kernel_size)
    self.fc1 = nn.Linear(
        self._get_flattened_size(input_size, conv1_out_channels, conv2_out_channels, conv_kernel_size, pool_kernel_size, stride),
        hidden_size[0])
    self.fc2 = nn.Linear(hidden_size[0], hidden_size[1])
    self.fc3 = nn.Linear(hidden_size[1], num_classes)
    self.net = nn.Sequential(
        self.conv1, nn.ReLU(),
        self.pool,
        self.conv2, nn.ReLU(),
        self.pool,
        nn.Flatten(),
        self.fc1, nn.ReLU(),
        self.fc2, nn.ReLU(),
        self.fc3
    )

  def _get_flattened_size(self, input_size, conv1_out, conv2_out, kernel_size, pool_size, stride):
    conv1_out_size = (input_size - kernel_size) + 1
    pool1_out_size = (conv1_out_size - pool_size) // stride + 1
    conv2_out_size = (pool1_out_size - kernel_size) + 1
    pool2_out_size = (conv2_out_size - pool_size) // stride + 1
    return conv2_out * pool2_out_size * pool2_out_size

  def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    x = torch.flatten(x, 1)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x

  # Based on https://www.d2l.ai/chapter_convolutional-neural-networks/lenet.html
  # Prints every layer in self.net together with its output shape
  def layer_summary(self, input_shape):
    X = torch.randn(*input_shape)
    for layer in self.net:
      X = layer(X)
      print(layer.__class__.__name__, 'output shape:\t', X.shape)

# Model declaration for the MNIST dataset

In [None]:
mnist_model = Net(
    mnist_input_size,
    mnist_hidden_size,
    mnist_in_channels,
    mnist_conv1_out_channels,
    mnist_conv2_out_channels,
    mnist_conv_kernel_size,
    mnist_pool_kernel_size,
    mnist_num_classes,
    mnist_stride).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(mnist_model.parameters(), lr=mnist_learning_rate)

# Model declaration for the CIFAR-10 dataset

In [None]:
cifar_model = Net(
    cifar_input_size,
    cifar_hidden_size,
    cifar_in_channels,
    cifar_conv1_out_channels,
    cifar_conv2_out_channels,
    cifar_conv_kernel_size,
    cifar_pool_kernel_size,
    cifar_num_classes,
    cifar_stride).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(cifar_model.parameters(), lr=cifar_learning_rate, momentum=0.9)

# Train and test function definition

In [None]:
def train_and_test_model(model, train_loader, num_epochs, input_size, test_loader):
  # NOTE: this function uses the module-level `criterion` and `optimizer`.
  # They must have been declared last for `model`, otherwise its weights are not updated.
  # `input_size` is unused and kept only so that existing calls keep working.

  # Train the model
  model.train()
  total_step = len(train_loader)
  losses = []
  for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
      # Move tensors to the configured device
      images = images.to(device)
      labels = labels.to(device)

      # Forward pass
      outputs = model(images)
      loss = criterion(outputs, labels)

      # Backward and optimize
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()
      losses.append(loss.item())

      if (i+1) % 100 == 0:
        print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

  # Test the model
  # In test phase, we don't need to compute gradients (for memory efficiency)
  model.eval()
  with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
      images = images.to(device)
      labels = labels.to(device)
      outputs = model(images)
      _, predicted = torch.max(outputs, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the {} test images: {} %'.format(total, 100 * correct / total))
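Because the function above reads `criterion` and `optimizer` from the global scope, whichever declaration cell ran last decides which model's weights get updated. A minimal sketch of an alternative that takes both explicitly is shown here; the names `train_one_epoch` and `evaluate` are hypothetical and not part of the original notebook.

```python
import torch
import torch.nn as nn


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass over `loader` and return the mean batch loss."""
    model.train()
    running_loss = 0.0
    for inputs, labels in loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()  # updates only the parameters this optimizer was built with
        running_loss += loss.item()
    return running_loss / len(loader)


def evaluate(model, loader, device):
    """Return the accuracy (in percent) of `model` on `loader`."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            predicted = model(inputs).argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100.0 * correct / total
```

With this signature each model would be paired with its own optimizer at the call site, so rerunning another model's declaration cell cannot silently redirect the updates.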

# Results

## MNIST network architecture

In [None]:
mnist_model.layer_summary((mnist_batch_size, mnist_in_channels, mnist_input_size, mnist_input_size))

Conv2d output shape:	 torch.Size([100, 6, 25, 25])
ReLU output shape:	 torch.Size([100, 6, 25, 25])
MaxPool2d output shape:	 torch.Size([100, 6, 12, 12])
Conv2d output shape:	 torch.Size([100, 16, 9, 9])
ReLU output shape:	 torch.Size([100, 16, 9, 9])
MaxPool2d output shape:	 torch.Size([100, 16, 4, 4])
Flatten output shape:	 torch.Size([100, 256])
Linear output shape:	 torch.Size([100, 120])
ReLU output shape:	 torch.Size([100, 120])
Linear output shape:	 torch.Size([100, 84])
ReLU output shape:	 torch.Size([100, 84])
Linear output shape:	 torch.Size([100, 10])


When an output shape has four entries, the format is:\
`[batch size, number of channels, height, width]`

When it has two entries, they are the batch size and the number of neurons.
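The spatial sizes printed above can be checked by hand with the standard output-size formula for a convolution or pooling window, floor((n + 2p - k) / s) + 1. The sketch below applies it to the MNIST configuration (28x28 input, kernel size 4, 2x2 pooling with stride 2); the helper names `conv_out` and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size, conv_kernel, conv2_channels=16,
                 pool_kernel=2, pool_stride=2):
    """Return sizes after conv1, pool1, conv2, pool2 and the flattened length."""
    c1 = conv_out(input_size, conv_kernel)
    p1 = conv_out(c1, pool_kernel, pool_stride)
    c2 = conv_out(p1, conv_kernel)
    p2 = conv_out(c2, pool_kernel, pool_stride)
    return [c1, p1, c2, p2], conv2_channels * p2 * p2


print(trace_shapes(28, 4))  # ([25, 12, 9, 4], 256)
```

The flattened length 256 = 16 x 4 x 4 is the input size of the first FC layer, matching the `Flatten` row of the summary.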

## MNIST training and test results

In [None]:
train_and_test_model(mnist_model, mnist_train_loader, mnist_num_epochs, mnist_input_size, mnist_test_loader)

Epoch [1/5], Step [100/600], Loss: 2.3087
Epoch [1/5], Step [200/600], Loss: 2.3003
Epoch [1/5], Step [300/600], Loss: 2.3053
Epoch [1/5], Step [400/600], Loss: 2.3077
Epoch [1/5], Step [500/600], Loss: 2.2956
Epoch [1/5], Step [600/600], Loss: 2.3121
Epoch [2/5], Step [100/600], Loss: 2.3264
Epoch [2/5], Step [200/600], Loss: 2.3056
Epoch [2/5], Step [300/600], Loss: 2.3125
Epoch [2/5], Step [400/600], Loss: 2.3025
Epoch [2/5], Step [500/600], Loss: 2.2955
Epoch [2/5], Step [600/600], Loss: 2.3079
Epoch [3/5], Step [100/600], Loss: 2.2898
Epoch [3/5], Step [200/600], Loss: 2.3178
Epoch [3/5], Step [300/600], Loss: 2.3082
Epoch [3/5], Step [400/600], Loss: 2.3005
Epoch [3/5], Step [500/600], Loss: 2.3071
Epoch [3/5], Step [600/600], Loss: 2.3148
Epoch [4/5], Step [100/600], Loss: 2.2954
Epoch [4/5], Step [200/600], Loss: 2.3064
Epoch [4/5], Step [300/600], Loss: 2.2986
Epoch [4/5], Step [400/600], Loss: 2.3127
Epoch [4/5], Step [500/600], Loss: 2.2955
Epoch [4/5], Step [600/600], Loss:

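The result shows an accuracy of 16.13%, not much better than guessing at random (about 10% accuracy, assuming a uniform distribution over the 10 classes). The loss also stays near 2.30 for the whole run, which indicates that the weights barely change. This may be caused by hidden layers with too few neurons (for this example they were set to the CIFAR-10 values, i.e. 120, whereas yunjey's tutorial uses 500) and by a kernel size that is too large for 28x28 inputs. Another possible contributor is that `train_and_test_model` uses the global `optimizer`: if the CIFAR-10 declaration cell was run after the MNIST one, the optimizer is bound to `cifar_model` and the MNIST weights receive no updates. The model was therefore changed and retrained, with the following results: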
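As a reference point for the losses logged above, a classifier that assigns equal probability to each of the 10 classes has a cross-entropy of ln 10. The short sketch below computes that baseline; `uniform_cross_entropy` is a hypothetical helper name.

```python
import math


def uniform_cross_entropy(num_classes):
    """Cross-entropy loss when every class gets probability 1/num_classes."""
    return -math.log(1.0 / num_classes)


print(round(uniform_cross_entropy(10), 4))  # 2.3026
```

A training loss that hovers around 2.30 therefore corresponds to chance-level predictions.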

In [None]:
mnist_model = Net(
    mnist_input_size,
    [600, 500],
    mnist_in_channels,
    mnist_conv1_out_channels,
    mnist_conv2_out_channels,
    3,
    mnist_pool_kernel_size,
    mnist_num_classes,
    mnist_stride).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(mnist_model.parameters(), lr=mnist_learning_rate)

In [None]:
mnist_model.layer_summary((mnist_batch_size, mnist_in_channels, mnist_input_size, mnist_input_size))

Conv2d output shape:	 torch.Size([100, 6, 26, 26])
ReLU output shape:	 torch.Size([100, 6, 26, 26])
MaxPool2d output shape:	 torch.Size([100, 6, 13, 13])
Conv2d output shape:	 torch.Size([100, 16, 11, 11])
ReLU output shape:	 torch.Size([100, 16, 11, 11])
MaxPool2d output shape:	 torch.Size([100, 16, 5, 5])
Flatten output shape:	 torch.Size([100, 400])
Linear output shape:	 torch.Size([100, 600])
ReLU output shape:	 torch.Size([100, 600])
Linear output shape:	 torch.Size([100, 500])
ReLU output shape:	 torch.Size([100, 500])
Linear output shape:	 torch.Size([100, 10])


In [None]:
train_and_test_model(mnist_model, mnist_train_loader, mnist_num_epochs, mnist_input_size, mnist_test_loader)

Epoch [1/5], Step [100/600], Loss: 0.2715
Epoch [1/5], Step [200/600], Loss: 0.1280
Epoch [1/5], Step [300/600], Loss: 0.2722
Epoch [1/5], Step [400/600], Loss: 0.1339
Epoch [1/5], Step [500/600], Loss: 0.2270
Epoch [1/5], Step [600/600], Loss: 0.1560
Epoch [2/5], Step [100/600], Loss: 0.0332
Epoch [2/5], Step [200/600], Loss: 0.1687
Epoch [2/5], Step [300/600], Loss: 0.0489
Epoch [2/5], Step [400/600], Loss: 0.0673
Epoch [2/5], Step [500/600], Loss: 0.0286
Epoch [2/5], Step [600/600], Loss: 0.0202
Epoch [3/5], Step [100/600], Loss: 0.0269
Epoch [3/5], Step [200/600], Loss: 0.0673
Epoch [3/5], Step [300/600], Loss: 0.0242
Epoch [3/5], Step [400/600], Loss: 0.1110
Epoch [3/5], Step [500/600], Loss: 0.2090
Epoch [3/5], Step [600/600], Loss: 0.0101
Epoch [4/5], Step [100/600], Loss: 0.0072
Epoch [4/5], Step [200/600], Loss: 0.0243
Epoch [4/5], Step [300/600], Loss: 0.0221
Epoch [4/5], Step [400/600], Loss: 0.0307
Epoch [4/5], Step [500/600], Loss: 0.0806
Epoch [4/5], Step [600/600], Loss:

After switching to hidden-layer sizes similar to yunjey's example, the accuracy is 98.89%, even better than yunjey's model. This suggests that convolution is effective in helping a dense network with classification; Assignment 1 showed that adding hidden layers alone had a relatively negative effect on model accuracy.
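To put the change in hidden-layer sizes in perspective, the number of trainable parameters of both MNIST configurations can be counted by hand. This is a rough sketch with hypothetical helper names; the flattened sizes 256 and 400 are taken from the two layer summaries, and the count by itself does not explain the accuracy difference.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return out_channels * (in_channels * kernel * kernel) + out_channels


def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features


def net_params(in_channels, conv1_out, conv2_out, kernel,
               flattened, hidden, num_classes):
    """Total parameters of the conv-conv-fc-fc-fc network."""
    return (conv_params(in_channels, conv1_out, kernel)
            + conv_params(conv1_out, conv2_out, kernel)
            + linear_params(flattened, hidden[0])
            + linear_params(hidden[0], hidden[1])
            + linear_params(hidden[1], num_classes))


first = net_params(1, 6, 16, 4, 256, [120, 84], 10)    # kernel 4, hidden [120, 84]
second = net_params(1, 6, 16, 3, 400, [600, 500], 10)  # kernel 3, hidden [600, 500]
print(first, second)  # 43508 547050
```

Almost all of the additional parameters sit in the two wider FC layers, while the convolutional layers actually shrink slightly with the smaller kernel.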

## CIFAR-10 network architecture

In [None]:
cifar_model.layer_summary((cifar_batch_size, cifar_in_channels, cifar_input_size, cifar_input_size))

Conv2d output shape:	 torch.Size([4, 6, 28, 28])
ReLU output shape:	 torch.Size([4, 6, 28, 28])
MaxPool2d output shape:	 torch.Size([4, 6, 14, 14])
Conv2d output shape:	 torch.Size([4, 16, 10, 10])
ReLU output shape:	 torch.Size([4, 16, 10, 10])
MaxPool2d output shape:	 torch.Size([4, 16, 5, 5])
Flatten output shape:	 torch.Size([4, 400])
Linear output shape:	 torch.Size([4, 120])
ReLU output shape:	 torch.Size([4, 120])
Linear output shape:	 torch.Size([4, 84])
ReLU output shape:	 torch.Size([4, 84])
Linear output shape:	 torch.Size([4, 10])


When an output shape has four entries, the format is:\
`[batch size, number of channels, height, width]`

When it has two entries, they are the batch size and the number of neurons.

## CIFAR-10 training and test results

In [None]:
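optimizer = optim.SGD(cifar_model.parameters(), lr=cifar_learning_rate, momentum=0.9)  # rebind: the MNIST cells above replaced the global optimizer
train_and_test_model(cifar_model, cifar_train_loader, cifar_num_epochs, cifar_input_size, cifar_test_loader)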

Epoch [1/3], Step [100/12500], Loss: 2.3240
Epoch [1/3], Step [200/12500], Loss: 2.3064
Epoch [1/3], Step [300/12500], Loss: 2.2545
Epoch [1/3], Step [400/12500], Loss: 2.3129
Epoch [1/3], Step [500/12500], Loss: 2.3256
Epoch [1/3], Step [600/12500], Loss: 2.3452
Epoch [1/3], Step [700/12500], Loss: 2.3155
Epoch [1/3], Step [800/12500], Loss: 2.3149
Epoch [1/3], Step [900/12500], Loss: 2.2596
Epoch [1/3], Step [1000/12500], Loss: 2.3020
Epoch [1/3], Step [1100/12500], Loss: 2.1961
Epoch [1/3], Step [1200/12500], Loss: 2.1139
Epoch [1/3], Step [1300/12500], Loss: 2.2170
Epoch [1/3], Step [1400/12500], Loss: 2.1694
Epoch [1/3], Step [1500/12500], Loss: 1.9377
Epoch [1/3], Step [1600/12500], Loss: 1.8786
Epoch [1/3], Step [1700/12500], Loss: 1.7861
Epoch [1/3], Step [1800/12500], Loss: 1.9958
Epoch [1/3], Step [1900/12500], Loss: 1.6890
Epoch [1/3], Step [2000/12500], Loss: 2.3929
Epoch [1/3], Step [2100/12500], Loss: 1.7779
Epoch [1/3], Step [2200/12500], Loss: 2.7478
Epoch [1/3], Step [

The accuracy of the model based on the PyTorch tutorial is 58.28%, which is better than the tutorial's own result, most likely because the number of epochs was increased from 2 to 3.

# Save the models

Following the PyTorch and yunjey tutorials.

In [None]:
MNIST_PATH = './mnist_net.pth'
torch.save(mnist_model.state_dict(), MNIST_PATH)

In [None]:
CIFAR_PATH = './cifar_net.pth'
torch.save(cifar_model.state_dict(), CIFAR_PATH)
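Since only the `state_dict` is saved, restoring a model later means rebuilding the same architecture first and then loading the weights into it. The sketch below shows that round trip with a small stand-in `nn.Linear` model and a temporary file; both are assumptions for illustration, and in this notebook the stand-in would be a `Net` created with the same constructor arguments as the saved model.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in model, used only to keep the example small.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_net.pth')  # hypothetical file name
    torch.save(model.state_dict(), path)

    # Rebuild the same architecture, then load the saved weights into it.
    restored = nn.Linear(4, 2)
    state_dict = torch.load(path, map_location='cpu')
    restored.load_state_dict(state_dict)
    restored.eval()  # switch to inference mode before evaluating
```

`load_state_dict` raises an error when the parameter names or shapes do not match, so a model saved with hidden sizes `[600, 500]` cannot be loaded into a `Net` built with `[120, 84]`.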