# Deep Learning Practicum: Transfer Learning (Introduction & Concept Demo)

_This notebook walks through a short practicum on the core concepts of transfer learning, ahead of the medical case study._

> Use **View → Table of Contents** in JupyterLab (or a similar extension) to have the headings arranged automatically as a table of contents.

## Table of Contents
- A. Title & Learning Objectives
- B. Recall & Icebreaker
- C. Theory Summary
- D. Environment Setup
- E. Configuration (Code)
- F. DataModule Template (Code)
- G. Build the Pretrained Model (Code)
- H. Generic Training Loop (Code)
- I. Plot & Logging (Code)
- J. Summary & Discussion
- K. Practicum Checklist
- L. References


## A. Title & Learning Objectives

**Learning Objectives**
- Understand the concept of transfer learning and its motivation (saving data, time, and resources).
- Understand its main benefits (efficiency, lower data requirements, potentially better performance, less overfitting).
- Recognize the types of transfer learning (Inductive, Transductive, Unsupervised).
- Distinguish feature extraction (freeze) from fine-tuning, and know when to use each.
- Prepare a reusable experiment template (without a dataset for now).


## B. Recall & Icebreaker

- What are the challenges of training a CNN from scratch on a small dataset?
- Why does reusing knowledge from a large model make sense?
- When is freezing enough? When do the last few layers need fine-tuning?


## C. Theory Summary

Transfer learning reuses a trained model as the starting point for a new task, saving compute, data, and time; training does not always have to start from scratch.

**Benefits:** more efficient training, lower data requirements, potentially better performance, and a reduced risk of overfitting on small datasets.

**Types of Transfer Learning:**
1. *Inductive*: the target data is labeled and the target task differs from the source task; the source model supplies prior knowledge (example: ImageNet → medical classification).
2. *Transductive*: the task is the same but the data distribution differs; this commonly arises under domain shift.
3. *Unsupervised*: representations learned without labels are carried over to a new domain that is also unlabeled.

**Fine-tuning vs. Freeze:** Feature extraction (freeze) is fast because only the new head is trained; it suits cases where the dataset is small or similar to the source domain. Fine-tuning unfreezes part or all of the backbone so the representations can adapt, which gives more flexibility but needs more data and careful regularization to avoid overfitting.
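
The difference between the two modes can be seen in the number of trainable parameters. The sketch below uses a tiny stand-in network rather than a real pretrained model; the names `backbone`, `head`, and `count_trainable` are illustrative assumptions, not part of the practicum template.

```python
import torch
from torch import nn

# Toy stand-in for a pretrained network (hypothetical; not a real pretrained model).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Linear(8, 2)  # new task-specific classifier
model = nn.Sequential(backbone, head)


def count_trainable(module: nn.Module) -> int:
    """Count parameters that will receive gradient updates."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Feature extraction: freeze the backbone, train only the head.
for p in backbone.parameters():
    p.requires_grad = False
frozen_count = count_trainable(model)

# Full fine-tuning: unfreeze the backbone again.
for p in backbone.parameters():
    p.requires_grad = True
full_count = count_trainable(model)

print(f"feature extraction: {frozen_count} trainable parameters")
print(f"full fine-tuning:   {full_count} trainable parameters")
```

With the backbone frozen only the 18 head parameters are trained; unfreezing adds the 224 convolution parameters, for 242 in total.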

**Popular use cases:** Computer Vision (ImageNet → specialized domains such as medical or industrial imaging), NLP (BERT/GPT for a range of downstream tasks), Speech (pretrained ASR → new dialects).

> Note: This material is aligned with the slides **“Deep Learning 06 - Transfer Learning”** for the face-to-face session (TM); use those slides as the reference for the full narrative.


## D. Environment Setup

Initialize dependencies, check the Python/GPU versions, set a deterministic seed, and make sure the output directories exist before running experiments.


In [1]:
import json
import os
import platform
import random
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset
from torchvision import models, transforms
from torchvision.transforms import functional as F_transforms
import yaml

# Utility function so experiment results are reproducible for the practicum.
def set_seed(seed: int = 42) -> None:
    """Seed random, numpy, and torch so that results are reproducible."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    # Ask cuDNN for deterministic kernels; this may slow training slightly.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


def get_device(preference: str = "cuda_if_available") -> torch.device:
    """Pick a device based on the preference and CUDA availability."""
    if preference == "cuda_if_available" and torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

# Detect the project root (the notebook lives in the notebooks/ folder).
project_root = Path.cwd().resolve()
if project_root.name == "notebooks":
    project_root = project_root.parent

paths_to_create = [
    project_root / "outputs" / "figures",
    project_root / "outputs" / "reports",
    project_root / "models",
]
for path in paths_to_create:
    path.mkdir(parents=True, exist_ok=True)

# Store the torchvision weight cache under models/ to keep things tidy (and reusable when available).
os.environ.setdefault("TORCH_HOME", str(project_root / "models"))

print(f"Project root: {project_root}")
print(f"Python version: {platform.python_version()}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")

seed_value = 42
set_seed(seed_value)

device = get_device()
print(f"Seed set to: {seed_value}")
print(f"Selected device: {device}")


## E. Configuration (Code)

Load the experiment's YAML configuration file, display it as a table, and adjust the seed and device to the practicum preferences.
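
The contents of `configs/training.yaml` are not shown in this notebook. As a rough guide, the sketch below lists the keys that the later cells read, expressed as a Python dict. Every value is an illustrative assumption rather than the practicum's actual setting, and `find_missing_keys` is a hypothetical helper for catching typos before training starts.

```python
from typing import Any, Dict, List

# Illustrative values only (assumptions); the keys mirror the ones this notebook reads.
EXAMPLE_CONFIG: Dict[str, Any] = {
    "seed": 42,
    "device": "cuda_if_available",
    "num_workers": 0,
    "batch_size": 8,
    "pretrained_backbone": "resnet18",
    "freeze_until": "layer4",  # one of: all | layer4 | none
    "lr_feature_extraction": 1e-3,
    "lr_fine_tuning": 1e-4,
    "weight_decay": 1e-4,
    "num_epochs_feature_extraction": 2,
    "num_epochs_fine_tuning": 1,
    "fig_dir": "outputs/figures",
    "log_dir": "outputs/reports",
    "save_dir": "models",
}

# Keys that later cells access as config["..."] without a default value.
# "lr_fine_tuning" is only read when num_epochs_fine_tuning > 0.
REQUIRED_KEYS: List[str] = [
    "batch_size",
    "lr_feature_extraction",
    "lr_fine_tuning",
    "weight_decay",
    "num_epochs_feature_extraction",
    "fig_dir",
    "log_dir",
    "save_dir",
]


def find_missing_keys(config: Dict[str, Any], required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent from the config."""
    return [key for key in required if key not in config]


missing = find_missing_keys(EXAMPLE_CONFIG)
print("Missing keys:", missing)
```

Running the same check on the dict returned by `yaml.safe_load` would surface a missing key as a clear message instead of a `KeyError` partway through training.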


In [None]:
# Read the template configuration so the experiment is easy to replicate.
config_path = project_root / "configs" / "training.yaml"
with config_path.open("r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

# Normalize num_workers so the loader runs safely across OSes and notebooks.
num_workers_cfg = max(0, int(config.get("num_workers", 0)))
if os.name == "nt":
    num_workers_cfg = 0  # Windows notebooks are more stable with a single-process loader.
config["num_workers"] = num_workers_cfg

config_df = pd.DataFrame(list(config.items()), columns=["parameter", "value"]).set_index("parameter")
display(config_df)

seed_value = config.get("seed", seed_value)
set_seed(seed_value)
device = get_device(config.get("device", "cuda_if_available"))
print(f"Configuration loaded from: {config_path}")
print(f"Active seed: {seed_value}")
print(f"Active device: {device}")
print(f"Effective num_workers: {config['num_workers']}")

## F. DataModule Template (Code)

A simple DataModule template built on dummy tensors to demonstrate the flow. **TODO:** replace it with `torchvision.datasets.ImageFolder` or a medical dataset in the upcoming case-study session. The transforms use the standard ImageNet statistics so they stay consistent with the pretrained backbone.


In [None]:
# Dummy dataset so the training loop can run without an external dataset.
class DummyRandomDataset(Dataset):
    """Generates random RGB samples and dummy labels to simulate the pipeline."""
    def __init__(self, num_samples: int, num_classes: int, transform=None, seed: int = 42) -> None:
        self.num_samples = num_samples
        self.num_classes = num_classes
        self.transform = transform
        gen = torch.Generator().manual_seed(seed)
        self.images = torch.rand(num_samples, 3, 256, 256, generator=gen)
        self.labels = torch.randint(0, num_classes, (num_samples,), generator=gen)

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, idx: int):
        image = self.images[idx]
        label = self.labels[idx]
        pil_image = F_transforms.to_pil_image(image)
        if self.transform is not None:
            image = self.transform(pil_image)
        else:
            image = transforms.ToTensor()(pil_image)
        return image, label


class SimpleImageDataModule:
    """Minimal DataModule skeleton for the transfer learning practicum."""
    def __init__(self, batch_size: int, num_workers: int, num_classes: int = 2, seed: int = 42) -> None:
        self.batch_size = batch_size
        self.num_workers = max(0, int(num_workers))
        if os.name == "nt":
            self.num_workers = 0  # Avoid multiprocessing issues in Windows notebooks.
        self.num_classes = num_classes
        self.seed = seed
        self.train_dataset = None
        self.val_dataset = None
        self.train_transforms = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ])
        self.val_transforms = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ])

    def setup(self) -> None:
        """Create dummy train/val datasets so the model loop can be executed."""
        self.train_dataset = DummyRandomDataset(
            num_samples=32,
            num_classes=self.num_classes,
            transform=self.train_transforms,
            seed=self.seed,
        )
        self.val_dataset = DummyRandomDataset(
            num_samples=16,
            num_classes=self.num_classes,
            transform=self.val_transforms,
            seed=self.seed + 1,
        )

    def train_dataloader(self) -> DataLoader:
        return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True, num_workers=self.num_workers)

    def val_dataloader(self) -> DataLoader:
        return DataLoader(self.val_dataset, batch_size=self.batch_size, shuffle=False, num_workers=self.num_workers)


NUM_CLASSES = 2  # Change this once the real dataset is available.
datamodule = SimpleImageDataModule(
    batch_size=config["batch_size"],
    num_workers=config["num_workers"],
    num_classes=NUM_CLASSES,
    seed=seed_value,
)
datamodule.setup()
train_batch = next(iter(datamodule.train_dataloader()))
print(f"Sample train batch: images {train_batch[0].shape}, labels {train_batch[1].shape}")

## G. Build the Pretrained Model (Code)

Load a pretrained backbone (`resnet18`), split it into a feature extractor and a classifier, and provide freeze/unfreeze helpers for the feature-extraction and fine-tuning modes.


In [None]:
# Modeling utilities that separate the backbone from the classifier.
def build_backbone(name: str = "resnet18", pretrained: bool = True, num_classes: int = NUM_CLASSES):
    """Build a torchvision backbone and split off the classifier head."""
    if name != "resnet18":
        raise ValueError("This demo currently supports only resnet18 as a lightweight baseline.")

    base_model = None
    weights_info = "random-init"
    if pretrained:
        try:
            if hasattr(models, "ResNet18_Weights"):
                base_model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
                weights_info = "ResNet18_Weights.DEFAULT"
            else:
                base_model = models.resnet18(pretrained=True)
                weights_info = "pretrained=True"
        except Exception as exc:
            print(f"Failed to load pretrained weights (offline?): {exc}")
            weights_info = "random-init (fallback)"

    if base_model is None:
        if hasattr(models, "ResNet18_Weights"):
            base_model = models.resnet18(weights=None)
        else:
            base_model = models.resnet18(pretrained=False)

    feature_extractor = nn.Sequential(*list(base_model.children())[:-1])
    in_features = base_model.fc.in_features
    classifier = nn.Linear(in_features, num_classes)
    return feature_extractor, classifier, weights_info


class TransferLearner(nn.Module):
    """Wrapper model that keeps the feature extractor and the classifier separate."""
    def __init__(self, feature_extractor: nn.Module, classifier: nn.Module) -> None:
        super().__init__()
        self.feature_extractor = feature_extractor
        self.classifier = classifier

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.feature_extractor(x)
        features = torch.flatten(features, 1)
        return self.classifier(features)


def set_feature_extractor_grad(feature_extractor: nn.Module, freeze_until: str = "all") -> None:
    """Set which backbone parameters are trainable according to the freeze policy."""
    freeze_until = freeze_until.lower()
    if freeze_until not in {"all", "layer4", "none"}:
        raise ValueError("freeze_until must be one of: all | layer4 | none")

    for param in feature_extractor.parameters():
        param.requires_grad = False

    if freeze_until == "layer4":
        # Child "7" of the Sequential is the layer4 block of ResNet18.
        for name, module in feature_extractor.named_children():
            if name == "7":
                for param in module.parameters():
                    param.requires_grad = True
    elif freeze_until == "none":
        for param in feature_extractor.parameters():
            param.requires_grad = True


def count_trainable_parameters(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


feature_extractor, classifier, weights_info = build_backbone(
    name=config.get("pretrained_backbone", "resnet18"),
    pretrained=True,
    num_classes=NUM_CLASSES,
)
print(f"Backbone weight source: {weights_info}")
model = TransferLearner(feature_extractor, classifier).to(device)

# Feature-extraction mode: the entire backbone is frozen.
set_feature_extractor_grad(model.feature_extractor, freeze_until="all")
fe_params = count_trainable_parameters(model)
print(f"Trainable params (feature extraction): {fe_params}")

# Fine-tuning mode: unfreeze according to the freeze_until setting.
set_feature_extractor_grad(model.feature_extractor, freeze_until=config.get("freeze_until", "all"))
ft_params = count_trainable_parameters(model)
print(f"Trainable params (fine-tuning policy '{config.get('freeze_until', 'all')}'): {ft_params}")

## H. Generic Training Loop (Code)

A minimal training loop that demonstrates two stages: feature extraction (training only the classifier) and fine-tuning (optionally unfreezing part of the backbone). Logging is kept simple because the dataset is still a dummy.


In [None]:
# Simple training functions so the pipeline can run end to end.
def train_one_epoch(model: nn.Module, dataloader: DataLoader, criterion, optimizer, device: torch.device):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in dataloader:
        images = images.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        preds = outputs.argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)

    epoch_loss = running_loss / max(total, 1)
    epoch_acc = correct / max(total, 1)
    return epoch_loss, epoch_acc


def evaluate(model: nn.Module, dataloader: DataLoader, criterion, device: torch.device):
    model.eval()
    running_loss = 0.0
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in dataloader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            running_loss += loss.item() * images.size(0)
            preds = outputs.argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    epoch_loss = running_loss / max(total, 1)
    epoch_acc = correct / max(total, 1)
    return epoch_loss, epoch_acc


def run_training_cycles(model: nn.Module, datamodule: SimpleImageDataModule, config: dict, device: torch.device):
    history = []
    criterion = nn.CrossEntropyLoss()
    train_loader = datamodule.train_dataloader()
    val_loader = datamodule.val_dataloader()

    # Stage 1: feature extraction (backbone frozen).
    set_feature_extractor_grad(model.feature_extractor, freeze_until="all")
    optimizer_fe = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=config["lr_feature_extraction"], weight_decay=config["weight_decay"])
    for epoch in range(1, config["num_epochs_feature_extraction"] + 1):
        train_loss, train_acc = train_one_epoch(model, train_loader, criterion, optimizer_fe, device)
        val_loss, val_acc = evaluate(model, val_loader, criterion, device)
        history.append({
            "stage": "feature_extraction",
            "epoch": epoch,
            "train_loss": float(train_loss),
            "train_acc": float(train_acc),
            "val_loss": float(val_loss),
            "val_acc": float(val_acc),
        })

    # Stage 2: fine-tuning (optionally unfreezes backbone layers).
    fine_tune_epochs = config.get("num_epochs_fine_tuning", 0)
    if fine_tune_epochs > 0:
        set_feature_extractor_grad(model.feature_extractor, freeze_until=config.get("freeze_until", "all"))
        optimizer_ft = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=config["lr_fine_tuning"], weight_decay=config["weight_decay"])
        for epoch in range(1, fine_tune_epochs + 1):
            train_loss, train_acc = train_one_epoch(model, train_loader, criterion, optimizer_ft, device)
            val_loss, val_acc = evaluate(model, val_loader, criterion, device)
            history.append({
                "stage": "fine_tuning",
                "epoch": epoch,
                "train_loss": float(train_loss),
                "train_acc": float(train_acc),
                "val_loss": float(val_loss),
                "val_acc": float(val_acc),
            })

    return history


model = model.to(device)
history = run_training_cycles(model, datamodule, config, device)
print(f"Demo training finished with {len(history)} history entries.")
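
One subtlety of the feature-extraction stage: setting `requires_grad = False` stops gradient updates, but BatchNorm layers still update their running mean and variance whenever the module is in training mode. The sketch below demonstrates this on a toy backbone and shows one possible workaround. `freeze_batchnorm_stats` is a hypothetical helper, not part of the template, and whether frozen statistics are desirable depends on the dataset.

```python
import torch
from torch import nn


def freeze_batchnorm_stats(module: nn.Module) -> None:
    """Switch BatchNorm layers whose parameters are all frozen to eval mode."""
    for layer in module.modules():
        if isinstance(layer, nn.modules.batchnorm._BatchNorm):
            if all(not p.requires_grad for p in layer.parameters()):
                layer.eval()


torch.manual_seed(0)
# Toy backbone (an assumption for illustration; not the notebook's ResNet18).
backbone = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.BatchNorm2d(4))
for p in backbone.parameters():
    p.requires_grad = False

bn = backbone[1]
x = torch.rand(2, 3, 8, 8)

# Frozen parameters, but training mode: running statistics still move.
backbone.train()
before = bn.running_mean.clone()
backbone(x)
stats_changed_in_train = not torch.equal(before, bn.running_mean)

# Same setup with the frozen BatchNorm layers kept in eval mode.
backbone.train()
freeze_batchnorm_stats(backbone)
before = bn.running_mean.clone()
backbone(x)
stats_changed_after_fix = not torch.equal(before, bn.running_mean)

print(f"running stats changed in train mode: {stats_changed_in_train}")
print(f"running stats changed after helper:  {stats_changed_after_fix}")
```

Because `train_one_epoch` calls `model.train()` at the start of every epoch, a helper like this would need to be applied right after that call to take effect.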


## I. Plot & Logging (Code)

Visualize the dummy metrics and save the artifacts (the plot and a JSON summary) under `outputs/` so they are easy to inspect after the practicum.


In [None]:
# Collect the history into a DataFrame for quick analysis.
history_df = pd.DataFrame(history)
if not history_df.empty:
    display(history_df)

    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    for stage in history_df["stage"].unique():
        subset = history_df[history_df["stage"] == stage]
        axes[0].plot(subset["epoch"], subset["train_loss"], marker="o", label=f"{stage} train")
        axes[0].plot(subset["epoch"], subset["val_loss"], marker="s", label=f"{stage} val")
        axes[1].plot(subset["epoch"], subset["train_acc"], marker="o", label=f"{stage} train")
        axes[1].plot(subset["epoch"], subset["val_acc"], marker="s", label=f"{stage} val")
    axes[0].set_title("Loss per Stage")
    axes[0].set_xlabel("Epoch")
    axes[0].set_ylabel("Loss")
    axes[0].legend()
    axes[1].set_title("Accuracy per Stage")
    axes[1].set_xlabel("Epoch")
    axes[1].set_ylabel("Accuracy")
    axes[1].legend()
    plt.tight_layout()

    fig_path = project_root / config["fig_dir"] / "loss_accuracy_demo.png"
    fig_path.parent.mkdir(parents=True, exist_ok=True)  # fig_dir may differ from the folders created during setup.
    fig.savefig(fig_path, bbox_inches="tight")
    plt.close(fig)
    print(f"Plot saved to: {fig_path}")
else:
    print("History is empty: nothing to visualize.")

summary = {
    "device": str(device),
    "config": config,
    "history": history,
}
summary_path = project_root / config["log_dir"] / "run_summary.json"
summary_path.parent.mkdir(parents=True, exist_ok=True)
with summary_path.open("w", encoding="utf-8") as f:
    json.dump(summary, f, indent=2)
print(f"Run summary saved to: {summary_path}")

model_path = project_root / config["save_dir"] / "demo_model_state_dict.pt"
model_path.parent.mkdir(parents=True, exist_ok=True)
torch.save(model.state_dict(), model_path)
print(f"Dummy model checkpoint saved to: {model_path}")
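
Once `run_summary.json` has been written, it can be read back to pick the best epoch of each stage. The sketch below assumes the history entries have the fields recorded by the training loop (`stage`, `epoch`, `val_loss`, and so on). The helper names and the numbers in `example_history` are made up purely to illustrate the data shape; they are not results from this notebook.

```python
import json
from pathlib import Path
from typing import Dict, List, Union


def load_history(summary_path: Union[str, Path]) -> List[dict]:
    """Read the 'history' list back from a saved run summary."""
    with Path(summary_path).open("r", encoding="utf-8") as f:
        return json.load(f)["history"]


def best_epoch_per_stage(history: List[dict]) -> Dict[str, dict]:
    """For each stage, keep the entry with the lowest validation loss."""
    best: Dict[str, dict] = {}
    for entry in history:
        stage = entry["stage"]
        if stage not in best or entry["val_loss"] < best[stage]["val_loss"]:
            best[stage] = entry
    return best


# Made-up numbers for illustration only.
example_history = [
    {"stage": "feature_extraction", "epoch": 1, "train_loss": 0.72, "train_acc": 0.50, "val_loss": 0.71, "val_acc": 0.50},
    {"stage": "feature_extraction", "epoch": 2, "train_loss": 0.70, "train_acc": 0.53, "val_loss": 0.69, "val_acc": 0.56},
    {"stage": "fine_tuning", "epoch": 1, "train_loss": 0.66, "train_acc": 0.59, "val_loss": 0.68, "val_acc": 0.56},
    {"stage": "fine_tuning", "epoch": 2, "train_loss": 0.61, "train_acc": 0.66, "val_loss": 0.70, "val_acc": 0.50},
]

for stage, entry in best_epoch_per_stage(example_history).items():
    print(f"{stage}: best epoch {entry['epoch']} (val_loss={entry['val_loss']})")
```

In the notebook itself, `best_epoch_per_stage(load_history(summary_path))` would apply the same selection to the saved run.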


## J. Summary & Discussion

| Mode | Speed | Data Requirement | Overfitting Risk | When to Choose |
| --- | --- | --- | --- | --- |
| Feature Extraction (Freeze) | Fast (trains the head only) | Low | Low | Small dataset, similar domain |
| Partial Fine-Tuning | Moderate (some layers unfrozen) | Medium | Medium | Moderate adaptation needed; the last layers are unfrozen |
| Full Fine-Tuning | Slowest | High | High | Very different domain, sufficiently large dataset |

Reflection questions:
- If the target domain is very different, is freezing still effective?
- Which part most likely needs to be unfrozen first?


## K. Practicum Checklist

- [ ] Understand the definition and benefits of transfer learning.
- [ ] Explain the difference between freezing and fine-tuning.
- [ ] Change the `freeze_until` setting and explain its effect on the number of trained parameters.
- [ ] Show where the output files are located (`outputs/figures` & `outputs/reports`).


## L. References

- Slides: **Deep Learning 06 - Transfer Learning (Tatap Muka)**: a summary of the definitions, benefits, types, and the fine-tuning vs. freeze strategies used in this practicum.
