# 📦 PREPROCESSING FOR RESNET18

---

## ⚠️ IMPORTANT NOTE: ResNet18 in TensorFlow

**ResNet18 is NOT available in `tensorflow.keras.applications`!**

Among the ResNet family, Keras only ships ResNet50, ResNet101, and ResNet152 (plus their V2 variants).

### Solution used:
We use **ResNet18 from TensorFlow Hub**, a pre-trained model from the TensorFlow Model Garden.

### Parameter comparison:

| Model | Parameters | Suitable for small datasets? |
|-------|------------|------------------------------|
| ResNet18 | ~11.7 million | ✅ Yes - smaller |
| ResNet50 | ~25.6 million | ⚠️ Moderate |
| EfficientNet-B0 | ~5.3 million | ✅ Yes - smallest |
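
As a rough check on the ~11.7 million figure, the count can be reproduced by summing the layers of the standard ImageNet ResNet18 layout (7x7 stem, four stages of two basic blocks each, 1000-class head). This is a minimal sketch under stated assumptions: bias-free convolutions, two trainable parameters per batch-norm channel, and hypothetical helper names. A TF Hub export may report a slightly different number, for example if non-trainable batch-norm statistics are included.

```python
def conv_params(kernel, c_in, c_out):
    """Weights of a bias-free kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out


def bn_params(channels):
    """Trainable batch-norm parameters: one scale and one shift per channel."""
    return 2 * channels


def basic_block_params(c_in, c_out, downsample):
    """Two 3x3 conv + BN pairs, plus a 1x1 projection when the width changes."""
    total = conv_params(3, c_in, c_out) + bn_params(c_out)
    total += conv_params(3, c_out, c_out) + bn_params(c_out)
    if downsample:
        total += conv_params(1, c_in, c_out) + bn_params(c_out)
    return total


def resnet18_params(num_classes=1000):
    """Trainable parameter count of the assumed standard ResNet18 layout."""
    total = conv_params(7, 3, 64) + bn_params(64)  # stem
    c_in = 64
    for c_out in (64, 128, 256, 512):  # four stages, two blocks each
        total += basic_block_params(c_in, c_out, downsample=(c_in != c_out))
        total += basic_block_params(c_out, c_out, downsample=False)
        c_in = c_out
    total += 512 * num_classes + num_classes  # fully connected head
    return total


print(f"{resnet18_params():,}")   # 11,689,512
print(f"{resnet18_params(4):,}")  # 11,178,564
```

With a 4-class head, as used for this dataset, the same layout comes to roughly 11.2 million parameters, because the 1000-class head accounts for about half a million of the total.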

### ResNet18 preprocessing:
Keras ResNet50 uses "caffe"-mode preprocessing, whereas this notebook feeds ResNet18 from TF Hub with inputs scaled to [0, 1] (see Cell 5):
- "caffe" mode (ResNet50): RGB → BGR, subtract the ImageNet mean per channel
- ResNet18 here: manual preprocessing (pixel / 255.0) for finer control
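
To make the difference between "caffe"-mode preprocessing and plain [0, 1] scaling concrete, the sketch below applies both to a single RGB pixel. The function names are hypothetical. The BGR mean values are the ImageNet channel means that Keras uses for "caffe" mode, and the code works on plain tuples rather than tensors.

```python
# ImageNet channel means in BGR order, as used by Keras "caffe" mode.
CAFFE_BGR_MEAN = (103.939, 116.779, 123.68)


def caffe_preprocess(rgb_pixel):
    """RGB -> BGR, then subtract the per-channel ImageNet mean (no scaling)."""
    r, g, b = rgb_pixel
    bgr = (b, g, r)
    return tuple(value - mean for value, mean in zip(bgr, CAFFE_BGR_MEAN))


def unit_scale(rgb_pixel):
    """Keep RGB order and scale 0-255 values into [0, 1]."""
    return tuple(value / 255.0 for value in rgb_pixel)


pure_red = (255, 0, 0)
print(tuple(round(v, 3) for v in caffe_preprocess(pure_red)))  # (-103.939, -116.779, 131.32)
print(unit_scale(pure_red))                                    # (1.0, 0.0, 0.0)
```

The two schemes produce very different value ranges, so a model trained with one should not be fed inputs prepared with the other.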

---

In [9]:
# =============================================================================
# CELL 1: IMPORT LIBRARIES
# =============================================================================

import tensorflow as tf
import tensorflow_hub as hub
import pathlib
import numpy as np

print(f"TensorFlow version: {tf.__version__}")
print(f"TensorFlow Hub version: {hub.__version__}")
print(f"GPU Available: {tf.config.list_physical_devices('GPU')}")

TensorFlow version: 2.20.0
TensorFlow Hub version: 0.16.1
GPU Available: []


In [10]:
# =============================================================================
# CELL 2: PATH AND PARAMETER CONFIGURATION
# =============================================================================

base_dir = pathlib.Path(".")
data_split_dir = base_dir / 'dataset_final'

# Image parameters
# ResNet18 uses 224x224 input (same as ResNet50)
IMG_HEIGHT = 224
IMG_WIDTH = 224
BATCH_SIZE = 32
SEED = 42
AUTOTUNE = tf.data.AUTOTUNE
NUM_CLASSES = 4

print("Configuration:")
print(f"  - Image Size: {IMG_HEIGHT}x{IMG_WIDTH}")
print(f"  - Batch Size: {BATCH_SIZE}")
print(f"  - Num Classes: {NUM_CLASSES}")
print(f"  - Data Path: {data_split_dir}")

Configuration:
  - Image Size: 224x224
  - Batch Size: 32
  - Num Classes: 4
  - Data Path: dataset_final


In [11]:
# =============================================================================
# CELL 3: LOAD THE DATASET
# =============================================================================

print("="*60)
print("DATASET LOADING")
print("="*60)

# 1. Load Training Dataset
print(f"\n[1/3] Loading {data_split_dir / 'train'}")
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_split_dir / 'train',
    labels="inferred",
    label_mode="int",
    color_mode="rgb",
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE,
    shuffle=True,
    seed=SEED
)

# 2. Load Validation Dataset
print(f"\n[2/3] Loading {data_split_dir / 'val'}")
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_split_dir / 'val',
    labels="inferred",
    label_mode="int",
    color_mode="rgb",
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE,
    shuffle=False,
    seed=SEED
)

# 3. Load Test Dataset
print(f"\n[3/3] Loading {data_split_dir / 'test'}")
test_ds = tf.keras.utils.image_dataset_from_directory(
    data_split_dir / 'test',
    labels="inferred",
    label_mode="int",
    color_mode="rgb",
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE,
    shuffle=False,
    seed=SEED
)

# Store the class names (must be read before any .map() is applied)
class_names = train_ds.class_names

print("\n" + "="*60)
print("CLASS INFORMATION & MAPPING")
print("="*60)
print(f"Classes found: {class_names}")
print(f"\n{'Index':<10} | {'Folder':<10} | {'Aflatoxin Content'}")
print("-" * 50)
for idx, name in enumerate(class_names):
    print(f"{idx:<10} | {name:<10} | {name} PPB")

DATASET LOADING

[1/3] Loading dataset_final\train
Found 1200 files belonging to 4 classes.

[2/3] Loading dataset_final\val
Found 131 files belonging to 4 classes.

[3/3] Loading dataset_final\test
Found 132 files belonging to 4 classes.

CLASS INFORMATION & MAPPING
Classes found: ['1', '2', '3', '4']

Index      | Folder     | Aflatoxin Content
--------------------------------------------------
0          | 1          | 1 PPB
1          | 2          | 2 PPB
2          | 3          | 3 PPB
3          | 4          | 4 PPB
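
The index-to-folder mapping printed above follows from the order in which class folders are sorted. The sketch below mimics that inference in plain Python, assuming the loader orders folder names the same way Python sorts strings. The helper name is hypothetical.

```python
def infer_class_indices(folder_names):
    """Map each class folder to an integer label by sorted string order."""
    return {name: idx for idx, name in enumerate(sorted(folder_names))}


print(infer_class_indices(["3", "1", "4", "2"]))  # {'1': 0, '2': 1, '3': 2, '4': 3}

# String sorting is not numeric sorting: a folder named '10' lands before '2'.
print(sorted(["1", "2", "10"]))  # ['1', '10', '2']
```

For the four single-digit folders used here the string order and the numeric order coincide. That would stop being true if a two-digit folder name were ever added.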


In [12]:
# =============================================================================
# CELL 4: DATA AUGMENTATION
# =============================================================================
#
# NOTE: Only augmentations that are SAFE for the aflatoxin dataset are used
#
# ✅ SAFE:
#   - RandomFlip: orientation does not change the PPB value
#   - RandomRotation: angle does not change the PPB value
#
# ❌ NOT ALLOWED:
#   - RandomBrightness: changes intensity = changes the PPB value!
#   - RandomContrast: changes contrast = changes the PPB value!
# =============================================================================

print("Setting up data augmentation...")

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    # factor is a fraction of a full turn (2*pi), so ±1.6° is 1.6 / 360
    # (a factor of 0.028 would rotate by up to ±10.08°)
    tf.keras.layers.RandomRotation(1.6 / 360),
], name="data_augmentation")

print("✓ RandomFlip (horizontal)")
print("✓ RandomRotation (±1.6°)")
print("\n❌ RandomBrightness - NOT USED (changes the PPB value)")
print("❌ RandomContrast - NOT USED (changes the PPB value)")

Setting up data augmentation...
✓ RandomFlip (horizontal)
✓ RandomRotation (±1.6°)

❌ RandomBrightness - NOT USED (changes the PPB value)
❌ RandomContrast - NOT USED (changes the PPB value)
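
Keras documents the `RandomRotation` factor as a fraction of a full turn (2π), not as radians or degrees, which is easy to mix up. The sketch below shows the conversion in both directions. The helper names are hypothetical.

```python
import math


def factor_to_degrees(factor):
    """Maximum rotation in degrees for a Keras RandomRotation factor."""
    return factor * 360.0


def degrees_to_factor(degrees):
    """RandomRotation factor that yields the given maximum rotation."""
    return degrees / 360.0


print(round(factor_to_degrees(0.028), 2))  # 10.08
print(round(degrees_to_factor(1.6), 5))    # 0.00444
print(round(math.degrees(0.028), 2))       # 1.6
```

A value of 0.028 corresponds to about 1.6° only when read as radians. Read as a fraction of a turn, it allows rotations of up to roughly ±10°.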


In [13]:
# =============================================================================
# CELL 5: PREPROCESSING FUNCTION FOR RESNET18
# =============================================================================
#
# ResNet18 from TensorFlow Hub expects input in the range [0, 1]
# We apply a simple normalization: pixel / 255.0
#
# Alternative: ImageNet normalization (subtract mean, divide by std)
# For simplicity, we use [0, 1] normalization, which is also common
# =============================================================================

# ImageNet mean and std for [0, 1]-scaled RGB (optional, for standard normalization)
IMAGENET_MEAN = tf.constant([0.485, 0.456, 0.406])
IMAGENET_STD = tf.constant([0.229, 0.224, 0.225])

def preprocess_for_resnet18(images, labels, training=False):
    """
    Preprocessing for ResNet18.
    
    ResNet18 from TF Hub expects input:
    - In the range [0, 1], or normalized with ImageNet stats
    - In RGB format
    
    Args:
        images: Batch of images with pixel values 0-255
        labels: Integer class labels (label_mode="int")
        training: Boolean, True for training (applies augmentation)
    
    Returns:
        Tuple (preprocessed_images, labels)
    """
    # Cast to float32
    images = tf.cast(images, tf.float32)
    
    # Apply augmentation ONLY during training
    if training:
        images = data_augmentation(images, training=True)
    
    # Normalize to the range [0, 1]
    images = images / 255.0
    
    # Optional: ImageNet normalization (uncomment to try it)
    # images = (images - IMAGENET_MEAN) / IMAGENET_STD
    
    return images, labels

print("Preprocessing function for ResNet18 has been created.")
print("\nNormalization used:")
print("  - Input: [0, 255]")
print("  - Output: [0, 1]")

Preprocessing function for ResNet18 has been created.

Normalization used:
  - Input: [0, 255]
  - Output: [0, 1]
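
The optional ImageNet normalization mentioned in Cell 5 changes the output range considerably. The sketch below works it through for the two extreme pixels, using the same mean and std values as the cell but on plain tuples. The function name is hypothetical.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def imagenet_standardize(rgb_pixel):
    """Scale a 0-255 RGB pixel to [0, 1], then subtract mean and divide by std."""
    scaled = [value / 255.0 for value in rgb_pixel]
    return tuple(
        (s - mean) / std
        for s, mean, std in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)
    )


black = imagenet_standardize((0, 0, 0))
white = imagenet_standardize((255, 255, 255))
print(tuple(round(v, 3) for v in black))  # (-2.118, -2.036, -1.804)
print(tuple(round(v, 3) for v in white))  # (2.249, 2.429, 2.64)
```

With this variant enabled, values span roughly [-2.12, 2.64], so a check that expects everything inside [0, 1] would no longer pass and would need adjusting.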


In [14]:
# =============================================================================
# CELL 6: APPLY PREPROCESSING & OPTIMIZE THE PIPELINE
# =============================================================================

print("Applying preprocessing to the datasets...")

# Training Dataset
# cache() comes BEFORE the augmenting map so that random augmentation is
# re-applied every epoch; caching after it would freeze the augmented
# images produced during the first epoch.
train_ds = train_ds.cache().map(
    lambda x, y: preprocess_for_resnet18(x, y, training=True),
    num_parallel_calls=AUTOTUNE
).prefetch(buffer_size=AUTOTUNE)

print("✓ Training dataset: augmentation ON, cache ON, prefetch ON")

# Validation Dataset
val_ds = val_ds.map(
    lambda x, y: preprocess_for_resnet18(x, y, training=False),
    num_parallel_calls=AUTOTUNE
).cache().prefetch(buffer_size=AUTOTUNE)

print("✓ Validation dataset: augmentation OFF, cache ON, prefetch ON")

# Test Dataset
test_ds = test_ds.map(
    lambda x, y: preprocess_for_resnet18(x, y, training=False),
    num_parallel_calls=AUTOTUNE
).cache().prefetch(buffer_size=AUTOTUNE)

print("✓ Test dataset: augmentation OFF, cache ON, prefetch ON")

print("\n" + "="*60)
print("PREPROCESSING COMPLETE")
print("="*60)
print("\nDataset is ready for training ResNet18!")

Applying preprocessing to the datasets...
✓ Training dataset: augmentation ON, cache ON, prefetch ON
✓ Validation dataset: augmentation OFF, cache ON, prefetch ON
✓ Test dataset: augmentation OFF, cache ON, prefetch ON

PREPROCESSING COMPLETE

Dataset is ready for training ResNet18!
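
Where `cache()` sits relative to a random `map` determines whether augmentation stays random across epochs. The sketch below is a pure-Python stand-in, not `tf.data` itself: `Cached` is a hypothetical class that materializes its input on the first pass and replays it afterwards, and `augment` is a placeholder for any random transform.

```python
import random


def augment(value, rng):
    """Stand-in for a random augmentation: adds random jitter."""
    return value + rng.random()


class Cached:
    """Minimal cache: materialize on the first pass, replay on later passes."""

    def __init__(self, make_iter):
        self._make_iter = make_iter
        self._store = None

    def __iter__(self):
        if self._store is None:
            self._store = list(self._make_iter())
        return iter(self._store)


source = [0.0, 10.0, 20.0, 30.0]
rng = random.Random(0)

# Order A: augment -> cache. The augmented values are frozen after epoch 1.
augment_then_cache = Cached(lambda: (augment(x, rng) for x in source))
a_epoch1, a_epoch2 = list(augment_then_cache), list(augment_then_cache)

# Order B: cache -> augment. Raw values are cached, jitter is drawn every epoch.
cached_source = Cached(lambda: iter(source))
b_epoch1 = [augment(x, rng) for x in cached_source]
b_epoch2 = [augment(x, rng) for x in cached_source]

print(a_epoch1 == a_epoch2)  # True
print(b_epoch1 == b_epoch2)  # False
```

For validation and test data there is no randomness in the map, so caching after preprocessing is harmless there and saves recomputing the normalization.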


In [15]:
# =============================================================================
# CELL 7: VERIFY PREPROCESSING
# =============================================================================

print("Verifying preprocessing...")

for images, labels in train_ds.take(1):
    print(f"\nBatch shape: {images.shape}")
    print(f"Labels shape: {labels.shape}")
    print("\nPixel statistics after preprocessing:")
    print(f"  - Min: {tf.reduce_min(images).numpy():.4f}")
    print(f"  - Max: {tf.reduce_max(images).numpy():.4f}")
    print(f"  - Mean: {tf.reduce_mean(images).numpy():.4f}")
    print(f"  - Std: {tf.math.reduce_std(images).numpy():.4f}")
    
    min_val = tf.reduce_min(images).numpy()
    max_val = tf.reduce_max(images).numpy()
    
    # With [0, 1] normalization, every pixel value should fall within [0, 1]
    if 0 <= min_val and max_val <= 1:
        print("\n✅ Value range matches the expected ResNet18 range [0, 1]")
    else:
        print("\n⚠️ Value range does not match the expected range [0, 1]")

Verifying preprocessing...

Batch shape: (32, 224, 224, 3)
Labels shape: (32,)

Pixel statistics after preprocessing:
  - Min: 0.0000
  - Max: 0.8009
  - Mean: 0.0006
  - Std: 0.0103

✅ Value range matches the expected ResNet18 range [0, 1]
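
The batch shape above reflects `BATCH_SIZE = 32`, but not every batch is full. As a small worked example, the sketch below derives the number of batches and the size of the last one from the file counts reported during loading. It assumes the loader keeps the final partial batch, and the helper name is hypothetical.

```python
import math


def batch_layout(num_files, batch_size):
    """Return (number of batches, size of the last batch), keeping the remainder."""
    if num_files == 0:
        return 0, 0
    num_batches = math.ceil(num_files / batch_size)
    last_batch = num_files - (num_batches - 1) * batch_size
    return num_batches, last_batch


for split, count in (("train", 1200), ("val", 131), ("test", 132)):
    print(split, batch_layout(count, 32))
# train (38, 16)
# val (5, 3)
# test (5, 4)
```

Code that assumes a leading dimension of exactly 32 would therefore break on the final batch of each split. Reading the batch size from the tensor shape avoids that.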
