# 07 PyTorch Experiment Tracking

Machine learning is highly experimental.

To figure out which experiments are worth pursuing, we need **experiment tracking**.


In [58]:
import torch
from torch import nn
import torchvision
from torchvision import transforms
from torchinfo import summary
from going_modular import data_setup, engine

import matplotlib.pyplot as plt

print(torch.__version__)
print(torchvision.__version__)

2.6.0+cu118
0.21.0+cu118


### Helper Function


In [11]:
# Set seeds
def set_seeds(seed: int=42):
    """Sets random seeds for torch operations.

    Args:
        seed (int, optional): Random seed to set. Defaults to 42.
    """
    # Set the seed for general torch operations
    torch.manual_seed(seed)
    # Set the seed for CUDA torch operations (ones that happen on the GPU)
    torch.cuda.manual_seed(seed)

In [13]:
set_seeds()
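
As a quick sanity check, the sketch below illustrates what seeding provides: two tensors drawn after the same seed come out identical, while a draw without re-seeding continues the random stream. It calls `torch.manual_seed` directly, and the tensor shape is an arbitrary choice for illustration.

```python
import torch

# Re-seed before each draw so both draws start from the same generator state
torch.manual_seed(42)
first = torch.rand(3, 4)

torch.manual_seed(42)
second = torch.rand(3, 4)

# Without re-seeding, the next draw continues the random stream and differs
third = torch.rand(3, 4)

same_after_reseed = torch.equal(first, second)
same_without_reseed = torch.equal(second, third)
print(same_after_reseed, same_without_reseed)
```

This is why `set_seeds()` is called right before creating or training a model whose results should be comparable across runs.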

In [12]:
# Set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

## 1. Get data

We want the pizza, steak, and sushi images.

We'll run FoodVision Mini experiments and see which model performs best.


In [33]:
import os
import zipfile

from pathlib import Path

import requests

def download_data(source: str,
                  destination: str,
                  remove_source: bool = True) -> Path:
  """Downloads a zipped dataset from source and unzips it to destination."""

  # Set up the path to the data folder
  data_path = Path("data/")
  image_path = data_path / destination

  # If the image folder doesn't exist, create it and download the data
  if image_path.is_dir():
    print(f"{image_path} already exists")
  else:
    print(f"Creating {image_path}")
    image_path.mkdir(parents=True, exist_ok=True)

    # Download the target file and stop early if the request failed
    target_file = Path(source).name
    print(f"Downloading {target_file} from {source}...")
    response = requests.get(source)
    response.raise_for_status()
    with open(data_path / target_file, "wb") as f:
      f.write(response.content)

    # Unzip the target file
    with zipfile.ZipFile(data_path / target_file, "r") as zip_ref:
      print(f"Unzipping {target_file} data...")
      zip_ref.extractall(image_path)

    # Remove the zip file if requested
    if remove_source:
      os.remove(data_path / target_file)

  return image_path

In [35]:
image_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/refs/heads/main/data/pizza_steak_sushi.zip",
                           destination="pizza_steak_sushi")
image_path

Creating data\pizza_steak_sushi
Downloading pizza_steak_sushi.zip from https://github.com/mrdbourke/pytorch-deep-learning/raw/refs/heads/main/data/pizza_steak_sushi.zip...
Unzipping pizza_steak_sushi.zip data...


WindowsPath('data/pizza_steak_sushi')
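
The file-name and unzip steps can be tried offline. The sketch below derives the archive name from the URL the same way `download_data()` does, then extracts a tiny stand-in archive into a temporary folder. The archive contents (`example.txt` placeholders) are hypothetical and are not the real dataset.

```python
import tempfile
import zipfile
from pathlib import Path

# The archive name is the last component of the URL
source = "https://github.com/mrdbourke/pytorch-deep-learning/raw/refs/heads/main/data/pizza_steak_sushi.zip"
target_file = Path(source).name

with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    archive_path = tmp_path / target_file

    # Build a tiny stand-in archive (hypothetical contents, not the real dataset)
    with zipfile.ZipFile(archive_path, "w") as zip_ref:
        zip_ref.writestr("train/pizza/example.txt", "placeholder")
        zip_ref.writestr("test/sushi/example.txt", "placeholder")

    # Extract it into the destination folder, as download_data() does
    image_path = tmp_path / "pizza_steak_sushi"
    with zipfile.ZipFile(archive_path, "r") as zip_ref:
        zip_ref.extractall(image_path)

    # Collect the extracted files relative to the destination folder
    extracted = sorted(p.relative_to(image_path).as_posix()
                       for p in image_path.rglob("*") if p.is_file())

print(target_file)
print(extracted)
```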

## 2. Create Datasets and DataLoaders


### 2.1 Create DataLoaders with manual transforms

The goal of the transforms is to make sure our custom data is formatted the way our pretrained model expects.


In [37]:
# Set up directories

train_dir = image_path / "train"
test_dir = image_path / "test"

train_dir, test_dir

(WindowsPath('data/pizza_steak_sushi/train'),
 WindowsPath('data/pizza_steak_sushi/test'))

In [38]:
# Create ImageNet normalization levels (per-channel mean and standard deviation)
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# Create the transform pipeline manually
manual_transforms = transforms.Compose([
  transforms.Resize((224, 224)),
  transforms.ToTensor(),
  normalize
])
print(f"Manually created transforms: {manual_transforms}")

Manually created transforms: Compose(
    Resize(size=(224, 224), interpolation=bilinear, max_size=None, antialias=True)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
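
To make the `Normalize` step concrete, here is a minimal sketch of the arithmetic it applies: each channel has its mean subtracted and is divided by its standard deviation. It uses plain tensor operations instead of torchvision, and the constant-valued stand-in image is an assumption chosen so the result is easy to read.

```python
import torch

# ImageNet channel statistics, as used in the Normalize transform above
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

# Stand-in image: every pixel is 0.5 in all three channels (shape [C, H, W])
image = torch.full((3, 224, 224), 0.5)

# Normalize applies (value - mean) / std independently per channel
normalized = (image - mean[:, None, None]) / std[:, None, None]

# Each channel is constant, so one pixel per channel summarizes the result
per_channel = normalized[:, 0, 0]
print(normalized.shape)
print(per_channel)
```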


In [39]:
# Create DataLoaders

train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=manual_transforms,
                                                                               batch_size=32)

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x227a2267c10>,
 <torch.utils.data.dataloader.DataLoader at 0x227a234da50>,
 ['pizza', 'steak', 'sushi'])
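
It can help to check what a single batch looks like before training. The sketch below does this with a stand-in `TensorDataset` of random tensors, because `data_setup.create_dataloaders` lives in the local `going_modular` package. The sample count of 75 is an arbitrary assumption, not the size of the real dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 75 random "images" with labels drawn from 3 classes
images = torch.randn(75, 3, 224, 224)
labels = torch.randint(low=0, high=3, size=(75,))
dataset = TensorDataset(images, labels)

dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# Pull one batch to check what the model will receive
image_batch, label_batch = next(iter(dataloader))
print(image_batch.shape)  # [batch_size, color_channels, height, width]
print(label_batch.shape)
print(len(dataloader))    # number of batches per epoch
```

The same `next(iter(...))` check should work on `train_dataloader` from the cell above.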

### 2.2 Create DataLoaders with automatic transforms

The same principle applies to automatic transforms: we want our custom data in the same format that our pretrained model requires.


In [42]:
# Set up directories
train_dir = image_path / "train"
test_dir = image_path / "test"

# Set up pretrained weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT

# Get the transforms from the weights
automatic_transforms = weights.transforms()
print(f"Automatically created transforms: {automatic_transforms}")

# Create DataLoaders
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=automatic_transforms,
                                                                               batch_size=32)

train_dataloader, test_dataloader, class_names

Automatically created transforms: ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)


(<torch.utils.data.dataloader.DataLoader at 0x2279add5b50>,
 <torch.utils.data.dataloader.DataLoader at 0x2279ad60b50>,
 ['pizza', 'steak', 'sushi'])

## 3. Get a pretrained model, freeze the base layers, and change the classifier head


In [55]:
# Note: this is how a pretrained model was created up to torchvision v0.12 (the `pretrained` argument is deprecated since v0.13)

# model = torchvision.models.efficientnet_b0(pretrained=True).to(device) # Old way

model = torchvision.models.efficientnet_b0(weights=weights).to(device)
model

EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActivat

In [56]:
# Freeze all base layers by setting requires_grad to False
for param in model.features.parameters():
  param.requires_grad = False
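
To confirm that freezing behaves as intended, one option is to count trainable and frozen parameters. The sketch below does this on a hypothetical toy model with a `features` part and a `classifier` part, so it runs without downloading EfficientNet weights. The layer sizes are arbitrary.

```python
import torch
from torch import nn

# Hypothetical toy model with the same two-part layout: features + classifier
toy_model = nn.ModuleDict({
    "features": nn.Sequential(nn.Linear(8, 16), nn.ReLU()),
    "classifier": nn.Linear(16, 3),
})

# Freeze the feature extractor, as done for model.features above
for param in toy_model["features"].parameters():
    param.requires_grad = False

# Count parameters by whether gradients will be computed for them
trainable = sum(p.numel() for p in toy_model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in toy_model.parameters() if not p.requires_grad)
print(f"trainable: {trainable} | frozen: {frozen}")
```

Applied to the real `model`, the same two sums show how much of the network is left for the optimizer to update.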

In [59]:
# Replace the classifier head with one sized for our classes
model.classifier = nn.Sequential(
  nn.Dropout(p=0.2, inplace=True),
  nn.Linear(in_features=1280,
            out_features=len(class_names),
            bias=True)
).to(device)
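
As a small check of the new head in isolation, the sketch below passes a stand-in batch of 1280-dimensional feature vectors through an identical `nn.Sequential` and inspects the output shape. The random features are an assumption standing in for what the frozen base would produce after pooling and flattening.

```python
import torch
from torch import nn

class_names = ["pizza", "steak", "sushi"]

# Same head as above: dropout followed by one linear layer
classifier = nn.Sequential(
    nn.Dropout(p=0.2, inplace=True),
    nn.Linear(in_features=1280, out_features=len(class_names), bias=True),
)

# Stand-in for the pooled, flattened feature vectors of a batch of 32 images
features = torch.randn(32, 1280)

classifier.eval()  # disable dropout for a deterministic check
with torch.inference_mode():
    logits = classifier(features)

print(logits.shape)  # one logit per class for each image
```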

In [60]:
from torchinfo import summary

summary(model, 
        input_size=(32, 3, 224, 224), # make sure this is "input_size", not "input_shape" (batch_size, color_channels, height, width)
        verbose=0,
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"]
)

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [32, 3, 224, 224]    [32, 3]              --                   Partial
├─Sequential (features)                                      [32, 3, 224, 224]    [32, 1280, 7, 7]     --                   False
│    └─Conv2dNormActivation (0)                              [32, 3, 224, 224]    [32, 32, 112, 112]   --                   False
│    │    └─Conv2d (0)                                       [32, 3, 224, 224]    [32, 32, 112, 112]   (864)                False
│    │    └─BatchNorm2d (1)                                  [32, 32, 112, 112]   [32, 32, 112, 112]   (64)                 False
│    │    └─SiLU (2)                                         [32, 32, 112, 112]   [32, 32, 112, 112]   --                   --
│    └─Sequential (1)                                        [32, 32, 112, 112]   [32, 

## 4. Train a single model and study the results


In [63]:
# Define the loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(),
                             lr=0.001)
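
A useful reference point for reading the loss values later: when a classifier assigns equal scores to all three classes, `nn.CrossEntropyLoss` returns ln(3), roughly 1.0986. The sketch below checks this with hand-made logits. The four samples and their targets are arbitrary assumptions.

```python
import math
import torch
from torch import nn

loss_fn = nn.CrossEntropyLoss()

# Equal logits for every class: the model has no preference
logits = torch.zeros(4, 3)            # 4 samples, 3 classes
targets = torch.tensor([0, 1, 2, 0])

loss = loss_fn(logits, targets).item()
chance_level = math.log(3)
print(f"loss: {loss:.4f} | ln(3): {chance_level:.4f}")
```

A training or test loss clearly below this value suggests the model has learned more than guessing uniformly.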

### Set up TensorBoard to track the model


In [65]:
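try:
    from torch.utils.tensorboard import SummaryWriter
except ImportError:
    # Install tensorboard if it is missing, then retry the import
    print("[INFO] Couldn't find tensorboard... installing it.")
    %pip install -q tensorboard
    from torch.utils.tensorboard import SummaryWriter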

[INFO] Couldn't find tensorboard... installing it.
Note: you may need to restart the kernel to use updated packages.


In [68]:
# Set up SummaryWriter
from torch.utils.tensorboard import SummaryWriter

# Create a writer with all default settings
writer = SummaryWriter()
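
With default settings, `SummaryWriter()` writes to `runs/CURRENT_DATETIME_HOSTNAME`. When several experiments are compared, a more descriptive directory can be easier to navigate. The helper below is a hypothetical sketch that only builds such a path string. The function name `build_log_dir` and the example experiment names are assumptions, not part of the notebook.

```python
import os
from datetime import datetime
from typing import Optional

def build_log_dir(experiment_name: str,
                  model_name: str,
                  extra: Optional[str] = None) -> str:
    """Builds runs/YYYY-MM-DD/experiment_name/model_name[/extra]."""
    timestamp = datetime.now().strftime("%Y-%m-%d")
    parts = ["runs", timestamp, experiment_name, model_name]
    if extra:
        parts.append(extra)
    return os.path.join(*parts)

# Hypothetical experiment names, for illustration only
log_dir = build_log_dir(experiment_name="data_10_percent",
                        model_name="effnetb0",
                        extra="5_epochs")
print(log_dir)
```

The resulting string could then be passed to the writer as `SummaryWriter(log_dir=log_dir)`.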

In [67]:
from typing import Dict, List
from tqdm.auto import tqdm

from going_modular.engine import train_step, test_step

In [76]:
def train(model: torch.nn.Module, 
          train_dataloader: torch.utils.data.DataLoader, 
          test_dataloader: torch.utils.data.DataLoader, 
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device) -> Dict[str, List]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch models through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Args:
    model: A PyTorch model to be trained and tested.
    train_dataloader: A DataLoader instance for the model to be trained on.
    test_dataloader: A DataLoader instance for the model to be tested on.
    optimizer: A PyTorch optimizer to help minimize the loss function.
    loss_fn: A PyTorch loss function to calculate loss on both datasets.
    epochs: An integer indicating how many epochs to train for.
    device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
    A dictionary of training and testing loss as well as training and
    testing accuracy metrics. Each metric has a value in a list for 
    each epoch.
    In the form: {train_loss: [...],
              train_acc: [...],
              test_loss: [...],
              test_acc: [...]} 
    For example if training for epochs=2: 
             {train_loss: [2.0616, 1.0537],
              train_acc: [0.3945, 0.3945],
              test_loss: [1.2641, 1.5706],
              test_acc: [0.3400, 0.2973]} 
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
    }
    
    # Make sure model on target device
    model.to(device)

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                          dataloader=train_dataloader,
                                          loss_fn=loss_fn,
                                          optimizer=optimizer,
                                          device=device)
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)

        # Print out what's happening
        print(
          f"Epoch: {epoch+1} | "
          f"train_loss: {train_loss:.4f} | "
          f"train_acc: {train_acc:.4f} | "
          f"test_loss: {test_loss:.4f} | "
          f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

        ### New: experiment tracking ###
        # Note: `writer` is the SummaryWriter created in an earlier cell
        writer.add_scalar(tag="Train Loss",
                          scalar_value=train_loss,
                          global_step=epoch)
        writer.add_scalar(tag="Test Loss",
                          scalar_value=test_loss,
                          global_step=epoch)

        writer.add_scalar(tag="Train Accuracy",
                          scalar_value=train_acc,
                          global_step=epoch)
        writer.add_scalar(tag="Test Accuracy",
                          scalar_value=test_acc,
                          global_step=epoch)

        # Log the model graph using a dummy batch with the expected input shape
        writer.add_graph(model=model,
                         input_to_model=torch.randn(32, 3, 224, 224).to(device))

    # Close the writer
    writer.close()

    # Return the filled results at the end of the epochs
    return results
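
The `train()` function above reads `writer` from the notebook's global scope. One possible alternative is to pass the writer in as an argument, which makes tracking optional and easier to test. The sketch below shows that idea with a hypothetical `log_epoch` helper and a hypothetical `RecordingWriter` stand-in that mimics only `add_scalar`, so it runs without TensorBoard installed.

```python
from typing import Dict, List, Tuple

class RecordingWriter:
    """Hypothetical stand-in that mimics SummaryWriter.add_scalar()."""
    def __init__(self) -> None:
        self.scalars: List[Tuple[str, float, int]] = []

    def add_scalar(self, tag: str, scalar_value: float, global_step: int) -> None:
        self.scalars.append((tag, scalar_value, global_step))

def log_epoch(writer, epoch: int, metrics: Dict[str, float]) -> None:
    """Logs every metric for one epoch, or does nothing if writer is None."""
    if writer is None:
        return
    for tag, value in metrics.items():
        writer.add_scalar(tag=tag, scalar_value=value, global_step=epoch)

recorder = RecordingWriter()
log_epoch(recorder, epoch=0, metrics={"Train Loss": 0.5614, "Test Loss": 0.6622})
log_epoch(None, epoch=1, metrics={"Train Loss": 0.6204})  # tracking disabled
print(recorder.scalars)
```

Because a real `SummaryWriter` exposes `add_scalar` with these keyword arguments, it could be passed in place of the stand-in.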


In [77]:
# Train model
# Note: not using engine.train() because we updated train() with the function above

set_seeds()
results = train(model=model,
                train_dataloader=train_dataloader,
                test_dataloader=test_dataloader,
                optimizer=optimizer,
                loss_fn=loss_fn,
                epochs=5,
                device=device)

  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.5614 | train_acc: 0.9258 | test_loss: 0.6622 | test_acc: 0.8153


 20%|██        | 1/5 [00:04<00:19,  4.82s/it]

Epoch: 2 | train_loss: 0.6204 | train_acc: 0.7969 | test_loss: 0.6000 | test_acc: 0.8258


 40%|████      | 2/5 [00:09<00:13,  4.46s/it]

Epoch: 3 | train_loss: 0.5663 | train_acc: 0.7891 | test_loss: 0.5310 | test_acc: 0.8561


 60%|██████    | 3/5 [00:13<00:08,  4.34s/it]

Epoch: 4 | train_loss: 0.5098 | train_acc: 0.8008 | test_loss: 0.5462 | test_acc: 0.8769


 80%|████████  | 4/5 [00:17<00:04,  4.34s/it]

Epoch: 5 | train_loss: 0.5549 | train_acc: 0.7695 | test_loss: 0.5827 | test_acc: 0.8144


100%|██████████| 5/5 [00:21<00:00,  4.37s/it]
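
The returned `results` dictionary can be inspected directly, without opening TensorBoard. The sketch below picks out the best epoch by test accuracy and by test loss, using the test metrics copied from the output above.

```python
# Test metrics copied from the training output above (one entry per epoch)
results = {
    "test_loss": [0.6622, 0.6000, 0.5310, 0.5462, 0.5827],
    "test_acc": [0.8153, 0.8258, 0.8561, 0.8769, 0.8144],
}

# Index of the highest test accuracy and of the lowest test loss
best_acc_idx = max(range(len(results["test_acc"])),
                   key=results["test_acc"].__getitem__)
best_loss_idx = min(range(len(results["test_loss"])),
                    key=results["test_loss"].__getitem__)

print(f"Best test_acc: {results['test_acc'][best_acc_idx]:.4f} at epoch {best_acc_idx + 1}")
print(f"Best test_loss: {results['test_loss'][best_loss_idx]:.4f} at epoch {best_loss_idx + 1}")
```

In this run the two criteria point to different epochs, which is one reason to log both loss and accuracy.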
