# 06. PyTorch Transfer Learning

What is transfer learning?

Transfer learning means taking the parameters a model has learned on another dataset and applying them to our own problem.

- Pretrained models are also known as foundation models
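
As a minimal sketch of the idea, the toy layers below stand in for a real network (the names `source_backbone` and `target_model` are made up for illustration). Parameters learned by one model are copied into a second model, which then gets a new output layer for the new problem:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical "pretrained" model: a backbone plus a 10-class head
source_backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
source_model = nn.Sequential(source_backbone, nn.Linear(16, 10))

# New model for our own 3-class problem: same backbone shape, new head
target_backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
target_model = nn.Sequential(target_backbone, nn.Linear(16, 3))

# Transfer the learned backbone parameters into the new model
target_backbone.load_state_dict(source_backbone.state_dict())

# The backbone weights now match, while the head is freshly initialised
weights_match = torch.equal(target_backbone[0].weight, source_backbone[0].weight)
output_shape = tuple(target_model(torch.randn(4, 8)).shape)
print(weights_match, output_shape)
```

In the rest of this notebook the same pattern is applied to a real pretrained network from `torchvision.models` instead of toy layers.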


In [1]:
# For this notebook to run with updated APIs, we need torch 1.12+ and torchvision 0.13+
try:
    import torch
    import torchvision
    # Compare (major, minor) tuples so 2.x releases also pass (a minor-only check rejects 2.6)
    torch_version = tuple(int(part) for part in torch.__version__.split(".")[:2])
    torchvision_version = tuple(int(part) for part in torchvision.__version__.split(".")[:2])
    assert torch_version >= (1, 12), "torch version should be 1.12+"
    assert torchvision_version >= (0, 13), "torchvision version should be 0.13+"
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
except (ImportError, AssertionError):
    print("[INFO] torch/torchvision versions not as required, installing newer versions.")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")

torch version: 2.6.0+cu118
torchvision version: 0.21.0+cu118


We now have the versions of torch and torchvision that we need.
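
A version check that looks only at the minor number is fragile: for torch 2.6 the minor part is 6, which is smaller than 12 even though 2.6 is newer than 1.12. A minimal sketch of a more robust comparison, using a hypothetical helper called `parse_major_minor` (not part of torch), could look like this:

```python
def parse_major_minor(version: str) -> tuple:
    """Return (major, minor) from a version string such as '2.6.0+cu118'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)

# Tuples compare element by element, so (2, 6) counts as newer than (1, 12)
torch_ok = parse_major_minor("2.6.0+cu118") >= (1, 12)
torchvision_ok = parse_major_minor("0.21.0+cu118") >= (0, 13)
too_old = parse_major_minor("1.11.0") >= (1, 12)
print(torch_ok, torchvision_ok, too_old)
```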


In [2]:
# Continue with regular imports
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
    from torchinfo import summary
except ImportError:
    print("[INFO] Couldn't find torchinfo... installing it.")
    %pip install -q torchinfo
    from torchinfo import summary

# # Try to import the going_modular directory, download it from GitHub if it doesn't work
# try:
#     from going_modular.going_modular import data_setup, engine
# except:
#     # Get the going_modular scripts
#     print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
#     !git clone https://github.com/mrdbourke/pytorch-deep-learning
#     !mv pytorch-deep-learning/going_modular .
#     !rm -rf pytorch-deep-learning
#     from going_modular.going_modular import data_setup, engine

In [3]:
from going_modular.going_modular import data_setup, engine



In [4]:
# Set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"

## 1. Get Data

Let's get our pizza, steak, sushi data.


In [5]:
import os
import zipfile

from pathlib import Path

import requests

# Setup path to data folder
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it... 
if image_path.is_dir():
    print(f"{image_path} directory exists.")
else:
    print(f"Did not find {image_path} directory, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)
    
    # Download pizza, steak, sushi data
    with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
        request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
        print("Downloading pizza, steak, sushi data...")
        f.write(request.content)

    # Unzip pizza, steak, sushi data
    with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...") 
        zip_ref.extractall(image_path)

    # Remove .zip file
    os.remove(data_path / "pizza_steak_sushi.zip")

data\pizza_steak_sushi directory exists.


In [6]:
# Set up the train and test directories
train_dir = image_path / "train"
test_dir = image_path / "test"

## 2. Create Datasets and DataLoaders

Now that we have the data, we'll turn it into PyTorch DataLoaders.

To do so, we can use `data_setup.py`.

When loading data, we need to think about how to transform it.

With `torchvision` there are two ways to do this:

1. Manually created transforms - we define the transforms ourselves
2. Automatically created transforms - the transforms are defined by the model we'd like to use

When using a pretrained model, it's important that the data passed through it is transformed in the same way as the data the model was trained on.


In [7]:
from going_modular.going_modular import data_setup

### 2.1 Creating a transform for `torchvision.models` (manual creation)

All pretrained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224.

The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
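
To make the arithmetic concrete, here is a small plain-Python sketch of what normalization does to a single pixel: each channel has its mean subtracted and is then divided by its standard deviation. The helper `normalize_pixel` and the example pixel values are made up for illustration.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Normalize one RGB pixel whose values are already scaled to [0, 1]."""
    return [(value - mean) / std
            for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]

# An 8-bit pixel is first scaled from [0, 255] to [0, 1], then normalized
pixel_uint8 = [124, 116, 104]
pixel_scaled = [value / 255 for value in pixel_uint8]
pixel_normalized = normalize_pixel(pixel_scaled)
print(pixel_normalized)
```

A pixel close to the dataset mean ends up close to zero in every channel, which is the point of the operation: inputs are centred and scaled the same way as the data the pretrained weights were fitted to.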


In [8]:
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

manual_transforms = transforms.Compose([
  transforms.Resize((224,224)),
  transforms.ToTensor(),
  normalize
])

In [9]:
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=manual_transforms,
                                                                               batch_size=32)

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x2b9c8af6c90>,
 <torch.utils.data.dataloader.DataLoader at 0x2b9c220edd0>,
 ['pizza', 'steak', 'sushi'])

### 2.2 Creating a transform for `torchvision.models` (automatic creation)

Note: As of torchvision v0.13+, there is an update to how data transforms can be created using `torchvision.models`. I call the previous method "manual creation" and the new method "automatic creation".


In [12]:
import torchvision
torchvision.__version__

'0.21.0+cu118'

In [13]:
# Get the pretrained model weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # "DEFAULT" = best available weights
weights

EfficientNet_B0_Weights.IMAGENET1K_V1

In [14]:
# Get the transforms used to create the pretrained weights
auto_transforms = weights.transforms()
auto_transforms

ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)

In [16]:
# Create DataLoaders using the automatic transforms
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=auto_transforms,
                                                                               batch_size=32)
train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x2b9c2440490>,
 <torch.utils.data.dataloader.DataLoader at 0x2b9c8e871d0>,
 ['pizza', 'steak', 'sushi'])

Experiment, experiment, experiment!

The main idea of transfer learning is to take a model that already performs well in a problem space similar to yours and then adapt it to your own problem.

There are three things to consider:

- Speed - How fast does the model need to run?
- Size - How big is the model?
- Performance - How well does the model work on the chosen problem? (For example, how well does it classify food images for FoodVision Mini?)

Where will the model live?

- On device: such as in a self-driving car, where real-time processing is critical.
- On a server: for scenarios where the model can be hosted on a server and accessed remotely.

Experimenting with these factors will help you find the best balance for your use case.
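
Size and speed can both be measured directly. The sketch below uses a tiny toy network so it runs instantly; the helpers `count_parameters` and `time_forward_pass` are hypothetical names, and the same functions could be pointed at any `nn.Module`.

```python
import time

import torch
from torch import nn

def count_parameters(model: nn.Module) -> int:
    """Total number of parameters, a rough proxy for model size."""
    return sum(p.numel() for p in model.parameters())

def time_forward_pass(model: nn.Module, example_input: torch.Tensor, repeats: int = 10) -> float:
    """Average wall-clock seconds per forward pass on the current device."""
    model.eval()
    with torch.inference_mode():
        start = time.perf_counter()
        for _ in range(repeats):
            model(example_input)
    return (time.perf_counter() - start) / repeats

toy_model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 3))
num_params = count_parameters(toy_model)
seconds_per_pass = time_forward_pass(toy_model, torch.randn(32, 10))
print(f"{num_params} parameters, {seconds_per_pass:.6f} s per forward pass")
```

Timings depend heavily on hardware and batch size, so they are best compared between candidate models on the machine where the model will actually run.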


### 3.2 Setting up a pretrained model


In [50]:
# Old method of creating a pretrained model
# model = torchvision.models.efficientnet_b0(pretrained=True)

# New method of setting up a pretrained model
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
model = torchvision.models.efficientnet_b0(weights=weights).to(device=device)
model

EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActivat

In [51]:
model.features

Sequential(
  (0): Conv2dNormActivation(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): SiLU(inplace=True)
  )
  (1): Sequential(
    (0): MBConv(
      (block): Sequential(
        (0): Conv2dNormActivation(
          (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
          (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): SiLU(inplace=True)
        )
        (1): SqueezeExcitation(
          (avgpool): AdaptiveAvgPool2d(output_size=1)
          (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
          (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
          (activation): SiLU(inplace=True)
          (scale_activation): Sigmoid()
        )
        (2): Conv2dNormActivation(
          (0): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), 

In [52]:
model.classifier

# out_features=1000 because the model was trained on ImageNet, which has 1000 classes

Sequential(
  (0): Dropout(p=0.2, inplace=True)
  (1): Linear(in_features=1280, out_features=1000, bias=True)
)

### 3.3 Getting a summary of our model with `torchinfo.summary()`


In [53]:
from torchinfo import summary

summary(model=model,
        input_size=(1,3,224,224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 1000]            --                   True
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1280, 7, 7]      --                   True
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 32, 112, 112]    --                   True
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 32, 112, 112]    864                  True
│    │    └─BatchNorm2d (1)                                  [1, 32, 112, 112]    [1, 32, 112, 112]    64                   True
│    │    └─SiLU (2)                                         [1, 32, 112, 112]    [1, 32, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 32, 112, 112]    [1, 16, 112,

### 3.4 Freezing the base model and changing the output layer to suit our needs

With feature extraction, we freeze the base layers and replace the output layer.


In [54]:
# Freeze all base layers of EffNetB0
for param in model.features.parameters():
  param.requires_grad = False

In [55]:
# Update the classifier head to match our number of classes
from torch import nn

torch.manual_seed(42)
torch.cuda.manual_seed(42)

model.classifier = nn.Sequential(
  nn.Dropout(p=0.2, inplace=True),
  nn.Linear(in_features=1280, out_features=len(class_names), bias=True)
).to(device)  # new layers are created on the CPU, so move them to the target device

model.classifier

Sequential(
  (0): Dropout(p=0.2, inplace=True)
  (1): Linear(in_features=1280, out_features=3, bias=True)
)
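
With the base frozen, the only trainable parameters are those of the new linear layer: 1280 x 3 weights plus 3 biases, which is 3,843 in total (dropout has no parameters). The sketch below checks that arithmetic on a toy stand-in, where a single `nn.Linear` plays the role of the frozen base; `count_trainable` is a hypothetical helper.

```python
import torch
from torch import nn

def count_trainable(model: nn.Module) -> int:
    """Number of parameters that will still receive gradient updates."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Toy stand-in: "features" plays the frozen base, "classifier" the new head
toy = nn.ModuleDict({
    "features": nn.Linear(32, 1280),
    "classifier": nn.Linear(in_features=1280, out_features=3),
})
for param in toy["features"].parameters():
    param.requires_grad = False

trainable = count_trainable(toy)
print(trainable)  # 1280 * 3 weights + 3 biases
```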

In [56]:
summary(model=model,
        input_size=(1,3,224,224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 3]               --                   Partial
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1280, 7, 7]      --                   False
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 32, 112, 112]    --                   False
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 32, 112, 112]    (864)                False
│    │    └─BatchNorm2d (1)                                  [1, 32, 112, 112]    [1, 32, 112, 112]    (64)                 False
│    │    └─SiLU (2)                                         [1, 32, 112, 112]    [1, 32, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 32, 112, 112]    [1, 1

## 4. Train model

In [59]:
# Define the loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(),
                             lr=0.001)
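
Passing `model.parameters()` works here because frozen parameters never receive gradients, so the optimizer skips them. Some people prefer to hand the optimizer only the trainable parameters, which makes the intent explicit. A minimal sketch on a toy model (all names below are illustrative, not from the notebook) that also confirms the frozen layer stays untouched after one training step:

```python
import torch
from torch import nn

torch.manual_seed(42)

base = nn.Linear(4, 8)   # stands in for the frozen feature extractor
head = nn.Linear(8, 3)   # stands in for the new classifier
toy_model = nn.Sequential(base, nn.ReLU(), head)
for param in base.parameters():
    param.requires_grad = False

# Give the optimizer only the parameters that still require gradients
trainable_params = [p for p in toy_model.parameters() if p.requires_grad]
toy_optimizer = torch.optim.Adam(trainable_params, lr=0.001)
toy_loss_fn = nn.CrossEntropyLoss()

base_before = base.weight.detach().clone()
head_before = head.weight.detach().clone()

# One training step on random data
inputs, labels = torch.randn(16, 4), torch.randint(0, 3, (16,))
loss = toy_loss_fn(toy_model(inputs), labels)
toy_optimizer.zero_grad()
loss.backward()
toy_optimizer.step()

base_unchanged = torch.equal(base.weight, base_before)
head_changed = not torch.equal(head.weight, head_before)
print(base_unchanged, head_changed)
```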

In [61]:
# Import the training functions
from going_modular.going_modular import engine

# Set the random seeds
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Train the model and save the results
results = engine.train(model=model,
                       train_dataloader=train_dataloader,
                       test_dataloader=test_dataloader,
                       optimizer=optimizer,
                       loss_fn=loss_fn,
                       epochs=20,
                       device=device)

 20%|██        | 1/5 [00:23<01:34, 23.67s/it]

Epoch: 1 | train_loss: 0.6375 | train_acc: 0.8828 | test_loss: 0.6660 | test_acc: 0.8759


 20%|██        | 1/5 [00:32<02:10, 32.72s/it]


KeyboardInterrupt: 