# 06. PyTorch Transfer Learning

Transfer learning is a technique for speeding up the training of a deep learning model. Instead of starting from random weights, we take a model that has already been trained on one task and reuse it for a different task. Reusing a pretrained model saves the time and compute that training from scratch would require.

- pretrained model = foundation model
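A minimal sketch of the idea, using a tiny stand-in network rather than a real pretrained model: the `backbone` plays the role of the pretrained feature extractor and is frozen, and only a new `head` for the target task is trained. The layer sizes, the 3-class target, and the names `backbone` and `head` are illustrative assumptions.

```python
import torch
from torch import nn

# Stand-in for a pretrained feature extractor (in practice it arrives with learned weights).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone so its weights are not updated during training.
for param in backbone.parameters():
    param.requires_grad = False

# New head for the target task, here an example with 3 classes.
head = nn.Linear(in_features=8, out_features=3)
model = nn.Sequential(backbone, head)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} of {total}")

# Only the head's parameters need to be handed to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# Forward pass on a random batch of 4 RGB images.
logits = model(torch.randn(4, 3, 64, 64))
print(logits.shape)
```

Because the frozen layers keep their weights, each training step only has to update the small head, which is where the time savings come from.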

In [1]:
import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)

2.5.1+cu124
0.20.1+cu124


In [2]:
# import the helper modules from the going_modular package

import matplotlib.pyplot as plt
from going_modular import data_setup, engine, model_builder, utils

In [3]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [4]:
!nvidia-smi

Wed Nov 27 10:32:26 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.22                 Driver Version: 552.22         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1050      WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   46C    P0             N/A / ERR!  |       0MiB /   4096MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

## 1. Get data

We need the pizza, steak, sushi dataset to build the transfer learning model.

In [5]:
import requests
import zipfile
from pathlib import Path

# setup path to a data folder
data_path = Path("data/")
image_path = data_path / 'pizza_steak_sushi'

# if the image folder doesn't exist, download and prepare it
if image_path.is_dir():
    print(f'{image_path} directory already exists, skipping download')
else:
    print(f'{image_path} does not exist, downloading the dataset')
    image_path.mkdir(parents=True, exist_ok=True)

    # download the dataset from the mrdbourke GitHub repository
    with open(data_path / 'pizza_steak_sushi.zip', mode="wb") as f:
        # use a separate name for the response so the `requests` module is not overwritten
        response = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
        print('Downloading dataset')
        f.write(response.content)

    # unzip the dataset
    with zipfile.ZipFile(data_path / 'pizza_steak_sushi.zip', 'r') as zip_ref:
        print('Unzipping dataset')
        zip_ref.extractall(image_path)

data\pizza_steak_sushi directory already exists, skipping download


In [6]:
# setup directory path
train_dir = image_path / 'train'
test_dir = image_path / 'test'

train_dir, test_dir

(WindowsPath('data/pizza_steak_sushi/train'),
 WindowsPath('data/pizza_steak_sushi/test'))
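Before building dataloaders it can help to confirm what was extracted. The sketch below counts the files in each class folder; the helper name `count_images` is hypothetical, and it assumes the usual `split/class_name/image` folder layout.

```python
from pathlib import Path


def count_images(root):
    """Return {split: {class_name: file_count}} for a split/class/image folder layout."""
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts
    for split_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[split_dir.name] = {
            class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir())
        }
    return counts


# Prints an empty dict if the folder has not been downloaded yet.
print(count_images("data/pizza_steak_sushi"))
```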

## 2. Create datasets and dataloaders

Now that we have the data, we can create the datasets and dataloaders.

We will use the existing `data_setup.py` script.

In [7]:
from going_modular import data_setup

### 2.1 Create a transform for `torchvision.models` (manual creation)


In [8]:
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

manual_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])

In [9]:
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir, 
                                                                               test_dir=test_dir, 
                                                                               transform=manual_transform,
                                                                               batch_size=32)

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x2474b57a8c0>,
 <torch.utils.data.dataloader.DataLoader at 0x24770b9fe50>,
 ['pizza', 'steak', 'sushi'])

### 2.2 Create a transform for `torchvision.models` (auto creation)

From torchvision 0.13 onward, the transforms that belong to a set of pretrained weights can be created automatically, so we do not have to write them by hand.

In [10]:
# get a set of pretrained model weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # DEFAULT means the best available weights
weights

EfficientNet_B0_Weights.IMAGENET1K_V1

In [11]:
# get the preprocessing transforms that go with these pretrained weights
auto_transform = weights.transforms()
auto_transform

ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)

In [12]:
# create dataloader with the auto_transform
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir, 
                                                                               test_dir=test_dir, 
                                                                               transform=auto_transform,
                                                                               batch_size=32)

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x2474b579d50>,
 <torch.utils.data.dataloader.DataLoader at 0x24770bcee00>,
 ['pizza', 'steak', 'sushi'])

## 3. Getting a pretrained model

There are many places to get a pretrained model, including:
1. PyTorch domain libraries (such as `torchvision.models`)
2. Libraries like `timm` (PyTorch Image Models)
3. Hugging Face Hub (plenty of models)
4. Papers with Code (to find the best model for a specific task)

### 3.1 Which pretrained model should you use?

Experiment, experiment, experiment.

There are three criteria to weigh when choosing a model:
1. speed - how fast the model is
2. size - how big the model is
3. performance - how well the model performs on the task

Where will the model live?
- in the cloud?
- or on the device?

Fit the model to the problem you are trying to solve.

For this problem we will use the `EfficientNet B0` model, which offers a good trade-off between speed, size, and performance.

### 3.2 Setting up a pretrained model

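We want to create an instance of a pretrained EfficientNet model. Note that the cell below loads the `EfficientNet_B3` weights, while section 3.1 and the transforms in section 2.2 refer to `EfficientNet_B0`; the printed architecture and the summary in section 3.3 are therefore those of B3.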


In [13]:
# the new way to create a pretrained model instance
weights = torchvision.models.EfficientNet_B3_Weights.IMAGENET1K_V1
model = torchvision.models.efficientnet_b3(weights=weights)
model

Downloading: "https://download.pytorch.org/models/efficientnet_b3_rwightman-b3899882.pth" to C:\Users\ibnuk/.cache\torch\hub\checkpoints\efficientnet_b3_rwightman-b3899882.pth
100%|██████████| 47.2M/47.2M [00:13<00:00, 3.57MB/s]


EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 40, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(40, 40, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=40, bias=False)
            (1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(40, 10, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(10, 40, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActiv

### 3.3 Getting a summary of the model using `torchinfo`

In [22]:
# print the model summary with torchinfo
from torchinfo import summary

summary(model=model,
        input_size=(1,3,224,224), # batch size, channels, height, width
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=['var_names'])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 1000]            --                   True
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1536, 7, 7]      --                   True
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 40, 112, 112]    --                   True
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 40, 112, 112]    1,080                True
│    │    └─BatchNorm2d (1)                                  [1, 40, 112, 112]    [1, 40, 112, 112]    80                   True
│    │    └─SiLU (2)                                         [1, 40, 112, 112]    [1, 40, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 40, 112, 112]    [1, 24, 112,