# Week 14 Lab: Deep Learning for Big Data

**Course:** Big Data Analytics  
**Topics:** MLP, CNN, RNN/LSTM, Transfer Learning, Regularization  

---
### Description
This lab implements deep learning models with TensorFlow/Keras:
- A simple neural network (MLP) on the MNIST dataset
- A Convolutional Neural Network (CNN) for image classification
- An RNN/LSTM for sequential data (time series)
- Transfer learning with MobileNetV2
- Regularization techniques (Dropout, Batch Normalization)

**Note**: TensorFlow/Keras is preinstalled on Google Colab.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import warnings
warnings.filterwarnings('ignore')

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns

print(f'TensorFlow version: {tf.__version__}')
print(f'Keras version: {keras.__version__}')
print(f'GPU available: {len(tf.config.list_physical_devices("GPU")) > 0}')

# Reproducibility
np.random.seed(42)
tf.random.set_seed(42)

## 1. Simple Neural Network (MLP)

A Multilayer Perceptron (MLP) is the most basic form of neural network: a stack of fully connected layers.
We will train it on the MNIST dataset (28×28-pixel images of handwritten digits, 10 classes).
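
To get a feel for the size of such a network, the parameter count of a stack of fully connected layers can be worked out by hand: each layer has `inputs × outputs` weights plus one bias per output. The sketch below does this for the layer widths used in this section (784 → 256 → 128 → 64 → 10). The helper name `dense_param_count` is illustrative and not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's output width.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


# Widths of the MLP in this section: flattened 28x28 input, three hidden layers, 10 outputs
mlp_sizes = [784, 256, 128, 64, 10]
mlp_total = dense_param_count(mlp_sizes)
print(f"MLP parameters: {mlp_total:,}")
```

The printed total should agree with the "Total params" line reported by `mlp_model.summary()`.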

In [None]:
# Load and preprocess MNIST
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize pixel values to [0, 1]
X_train_flat = X_train.reshape(-1, 784).astype('float32') / 255.0
X_test_flat  = X_test.reshape(-1, 784).astype('float32') / 255.0

print(f'Train shape: {X_train_flat.shape} | Test shape: {X_test_flat.shape}')
print(f'Classes: {np.unique(y_train)}')

# Visualize sample images
fig, axes = plt.subplots(2, 10, figsize=(15, 3))
for i in range(20):
    axes[i//10, i%10].imshow(X_train[i], cmap='gray')
    axes[i//10, i%10].set_title(str(y_train[i]))
    axes[i//10, i%10].axis('off')
plt.suptitle('MNIST Sample Images')
plt.tight_layout()
plt.show()

# Build MLP model
mlp_model = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(256, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
], name='MLP')

mlp_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print('\n=== MLP Model Summary ===')
mlp_model.summary()

# Train for 5 epochs
print('\n=== Training MLP ===')
mlp_history = mlp_model.fit(
    X_train_flat, y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.1,
    verbose=1
)

# Evaluate
test_loss, test_acc = mlp_model.evaluate(X_test_flat, y_test, verbose=0)
print(f'\nMLP Test Accuracy: {test_acc:.4f} | Test Loss: {test_loss:.4f}')

## 2. CNN for Image Classification (MNIST)

A CNN uses convolution operations to extract spatial features from images.
CNNs are usually more accurate than MLPs on computer vision tasks, although on a simple dataset such as MNIST the gap tends to be small.
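
Before building the model, it can help to trace how the feature-map shape changes from layer to layer. The sketch below is a hand calculation, not a Keras API. It assumes the architecture used in this section (two 3×3 convolutions with `padding='same'`, each followed by 2×2 max pooling) and ignores Batch Normalization, which does not change shapes. The helper names `conv2d_same` and `max_pool` are illustrative.

```python
def conv2d_same(shape, filters, kernel=3):
    """Return (output shape, parameter count) for a stride-1 Conv2D with padding='same'."""
    h, w, c_in = shape
    params = kernel * kernel * c_in * filters + filters  # shared kernels plus one bias per filter
    return (h, w, filters), params


def max_pool(shape, pool=2):
    """Return the output shape of max pooling with stride equal to the pool size."""
    h, w, c = shape
    return (h // pool, w // pool, c)


shape = (28, 28, 1)                      # MNIST image: height, width, channels
shape, conv1_params = conv2d_same(shape, 32)
shape = max_pool(shape)
shape, conv2_params = conv2d_same(shape, 64)
shape = max_pool(shape)

flat_features = shape[0] * shape[1] * shape[2]
dense_params = flat_features * 128 + 128  # first Dense layer of the classifier head

print(f"feature map before Flatten: {shape} -> {flat_features} features")
print(f"conv params: {conv1_params} + {conv2_params}, first dense params: {dense_params}")
```

Because convolution kernels are shared across all image positions, the two convolution layers together hold far fewer parameters than the first dense layer that follows `Flatten`.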

In [None]:
# Reshape for CNN: (samples, height, width, channels)
X_train_cnn = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
X_test_cnn  = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Build CNN model
cnn_model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    # Block 1
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    # Block 2
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    # Classifier head
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.4),
    layers.Dense(10, activation='softmax')
], name='CNN')

cnn_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print('=== CNN Model Summary ===')
cnn_model.summary()

cnn_history = cnn_model.fit(
    X_train_cnn, y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.1,
    verbose=1
)

test_loss_cnn, test_acc_cnn = cnn_model.evaluate(X_test_cnn, y_test, verbose=0)
print(f'\nCNN Test Accuracy: {test_acc_cnn:.4f} | Test Loss: {test_loss_cnn:.4f}')
print(f'Improvement over MLP: {(test_acc_cnn - test_acc)*100:+.2f} percentage points')

## 3. Visualizing Training Results

We plot the training and validation loss/accuracy curves and a confusion matrix to evaluate the models.
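
As a reading aid for the confusion matrix, the sketch below derives per-class precision and recall from a small matrix. The 3-class matrix `cm_demo` is made up for illustration and is not output from the models in this lab. It follows the scikit-learn convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows are true classes, columns are predictions
cm_demo = np.array([
    [9, 1, 0],
    [2, 7, 1],
    [0, 3, 7],
])

correct = np.diag(cm_demo)                 # correctly classified samples per class
recall = correct / cm_demo.sum(axis=1)     # fraction of each true class that was found
precision = correct / cm_demo.sum(axis=0)  # fraction of each predicted class that was right
accuracy = correct.sum() / cm_demo.sum()

for k in range(len(cm_demo)):
    print(f"class {k}: precision={precision[k]:.3f}  recall={recall[k]:.3f}")
print(f"accuracy={accuracy:.3f}")
```

Off-diagonal cells show which classes are confused with each other; in this toy matrix, class 2 is most often mistaken for class 1.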

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# MLP history
ax = axes[0, 0]
ax.plot(mlp_history.history['loss'], label='Train Loss')
ax.plot(mlp_history.history['val_loss'], label='Val Loss')
ax.set_title('MLP: Training & Validation Loss'); ax.set_xlabel('Epoch')
ax.legend(); ax.grid(alpha=0.3)

ax = axes[0, 1]
ax.plot(mlp_history.history['accuracy'], label='Train Acc')
ax.plot(mlp_history.history['val_accuracy'], label='Val Acc')
ax.set_title('MLP: Training & Validation Accuracy'); ax.set_xlabel('Epoch')
ax.legend(); ax.grid(alpha=0.3)

# CNN history
ax = axes[1, 0]
ax.plot(cnn_history.history['loss'], label='Train Loss')
ax.plot(cnn_history.history['val_loss'], label='Val Loss')
ax.set_title('CNN: Training & Validation Loss'); ax.set_xlabel('Epoch')
ax.legend(); ax.grid(alpha=0.3)

ax = axes[1, 1]
ax.plot(cnn_history.history['accuracy'], label='Train Acc')
ax.plot(cnn_history.history['val_accuracy'], label='Val Acc')
ax.set_title('CNN: Training & Validation Accuracy'); ax.set_xlabel('Epoch')
ax.legend(); ax.grid(alpha=0.3)

plt.tight_layout()
plt.show()

# Confusion matrix for CNN
y_pred_cnn = np.argmax(cnn_model.predict(X_test_cnn, verbose=0), axis=1)
cm = confusion_matrix(y_test, y_pred_cnn)

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=range(10), yticklabels=range(10))
plt.title(f'CNN Confusion Matrix (Test Acc: {test_acc_cnn:.4f})')
plt.xlabel('Predicted'); plt.ylabel('True')
plt.tight_layout()
plt.show()

print('=== CNN Classification Report ===')
print(classification_report(y_test, y_pred_cnn))

## 4. RNN/LSTM for Sequential Data

LSTMs can learn long-term patterns in sequential data.
We use a sinusoidal time series and train the model to predict the next value.
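
Next-value prediction is framed as supervised learning by cutting the series into overlapping windows: each window of past values is an input and the value right after it is the target. The sketch below shows this on a six-point toy series, using NumPy's `sliding_window_view` as a vectorized alternative to an explicit loop. The names `toy_series`, `toy_len`, `X_toy`, and `y_toy` are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

toy_series = np.arange(6, dtype=float)   # [0, 1, 2, 3, 4, 5]
toy_len = 3                              # number of past steps in each input window

# Every window of length toy_len; the last one is dropped because it has no following value
X_toy = sliding_window_view(toy_series, toy_len)[:-1]
y_toy = toy_series[toy_len:]             # the value that follows each window

# Keras recurrent layers expect input shaped (samples, timesteps, features)
X_toy = X_toy.reshape(-1, toy_len, 1)
print(X_toy.shape, y_toy)
```

A series of length `n` yields `n - toy_len` training pairs, which is why the number of sequences is always smaller than the number of raw time steps.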

In [None]:
# Generate synthetic time series: sum of two sine waves + Gaussian noise
t = np.linspace(0, 8 * np.pi, 1000)
signal = np.sin(t) + 0.5 * np.sin(3 * t) + 0.15 * np.random.randn(len(t))
signal = (signal - signal.min()) / (signal.max() - signal.min())  # normalize [0,1]

# Create sequences: use last SEQ_LEN steps to predict next value
SEQ_LEN = 30

def create_sequences(data, seq_len):
    X, y = [], []
    for i in range(len(data) - seq_len):
        X.append(data[i:i + seq_len])
        y.append(data[i + seq_len])
    return np.array(X), np.array(y)

X_seq, y_seq = create_sequences(signal, SEQ_LEN)
X_seq = X_seq.reshape((-1, SEQ_LEN, 1))  # (samples, timesteps, features)

split = int(0.8 * len(X_seq))
X_tr, X_te = X_seq[:split], X_seq[split:]
y_tr, y_te = y_seq[:split], y_seq[split:]

print(f'Training sequences: {X_tr.shape} | Test sequences: {X_te.shape}')

# Build LSTM model
lstm_model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, 1)),
    layers.LSTM(64, return_sequences=True),
    layers.Dropout(0.2),
    layers.LSTM(32),
    layers.Dense(16, activation='relu'),
    layers.Dense(1)
], name='LSTM_TimeSeries')

lstm_model.compile(optimizer='adam', loss='mse', metrics=['mae'])
print('\n=== LSTM Model Summary ===')
lstm_model.summary()

# Train
early_stop = EarlyStopping(patience=5, restore_best_weights=True)
lstm_history = lstm_model.fit(
    X_tr, y_tr, epochs=30, batch_size=32,
    validation_split=0.1, callbacks=[early_stop], verbose=1
)

# Predict and plot
y_pred_lstm = lstm_model.predict(X_te, verbose=0).flatten()
test_mse = np.mean((y_te - y_pred_lstm) ** 2)
print(f'\nTest MSE: {test_mse:.5f} | Test MAE: {np.mean(np.abs(y_te - y_pred_lstm)):.4f}')

plt.figure(figsize=(14, 5))
x_axis = range(len(y_te))
plt.plot(x_axis, y_te, label='Actual', alpha=0.7)
plt.plot(x_axis, y_pred_lstm, label='LSTM Prediction', alpha=0.8, linestyle='--')
plt.title(f'LSTM Time Series Prediction (MSE={test_mse:.5f})')
plt.xlabel('Time Step'); plt.ylabel('Value')
plt.legend(); plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

## 5. Transfer Learning (Concept & Simulation)

Transfer learning reuses a model that has already been trained on a large dataset.
We use MobileNetV2 pre-trained on ImageNet and add a custom classification head.
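
With the base frozen, only the custom head is trained, and its size can be estimated by hand. The sketch below assumes that MobileNetV2 at its default width (without the top classifier) emits a feature map with 1280 channels, and that the head consists of global average pooling, a 128-unit dense layer, and a 5-class output layer. The constant names are illustrative.

```python
FEATURE_CHANNELS = 1280  # assumed width of MobileNetV2's last feature map (default alpha=1.0)
HEAD_UNITS = 128         # hidden units in the custom head
N_CLASSES = 5            # output classes of the simulated problem

# Global average pooling collapses each channel's spatial map to one number,
# so it adds no parameters; only the two dense layers contribute.
dense_hidden = FEATURE_CHANNELS * HEAD_UNITS + HEAD_UNITS
dense_out = HEAD_UNITS * N_CLASSES + N_CLASSES
head_params = dense_hidden + dense_out
print(f"Trainable head parameters: {head_params:,}")
```

This is a small fraction of the roughly two million parameters in the frozen base, which is why feature extraction trains quickly even on modest hardware.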

In [None]:
from tensorflow.keras.applications import MobileNetV2

print('=== Transfer Learning dengan MobileNetV2 ===')
print('Loading pre-trained MobileNetV2 (ImageNet weights)...')

# Load base model (without top classification layers)
base_model = MobileNetV2(
    weights='imagenet',
    include_top=False,
    input_shape=(96, 96, 3)  # smaller input for demo
)

# Freeze all base layers
base_model.trainable = False
frozen_count = sum(1 for l in base_model.layers if not l.trainable)
print(f'Frozen layers: {frozen_count}/{len(base_model.layers)}')

# Add custom classification head (simulating 5-class problem)
NUM_CLASSES = 5
inputs = keras.Input(shape=(96, 96, 3))
x = base_model(inputs, training=False)       # inference mode: BatchNorm statistics stay fixed
x = layers.GlobalAveragePooling2D()(x)        # reduce spatial dimensions
x = layers.Dense(128, activation='relu')(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(NUM_CLASSES, activation='softmax')(x)

tl_model = keras.Model(inputs, outputs, name='TransferLearning_MobileNetV2')
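# categorical_crossentropy expects one-hot labels; use sparse_categorical_crossentropy for integer labels
tl_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])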

print('\n=== Transfer Learning Model Summary (top layers) ===')
# Report parameter counts instead of the full layer-by-layer summary
print(f'Total params: {tl_model.count_params():,}')
trainable_params = sum(np.prod(v.shape) for v in tl_model.trainable_variables)
non_trainable   = sum(np.prod(v.shape) for v in tl_model.non_trainable_variables)
print(f'Trainable params: {trainable_params:,}')
print(f'Non-trainable params (frozen base): {non_trainable:,}')
print(f'Training efficiency: only {trainable_params/tl_model.count_params()*100:.1f}% of params trained!')

# Demonstrate fine-tuning concept
print('\n=== Fine-tuning Strategy ===')
print('Phase 1 (Feature Extraction): Freeze entire base, train only head')
print('Phase 2 (Fine-tuning): Unfreeze top N layers of base, train with low LR')
print('\nUnfreezing last 20 layers for fine-tuning demonstration:')
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False
trainable_after = sum(np.prod(v.shape) for v in tl_model.trainable_variables)
print(f'Trainable params after unfreezing top 20 layers: {trainable_after:,}')

# Re-freeze for demonstration (don't train)
base_model.trainable = False
print('\n[Demo only — not running full training to save time]')
print('In practice, you would call: tl_model.fit(train_ds, epochs=10, ...)')

## 6. Regularization Techniques

We compare models with and without regularization to see the effect on overfitting.
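
To make the two techniques concrete, the sketch below reimplements their training-time behavior in NumPy. It is a simplified illustration rather than the Keras implementation: `inverted_dropout` zeroes activations at random and rescales the survivors so the expected value is unchanged, and `batch_norm_train` standardizes each feature over the batch while omitting the learned scale and shift and the running statistics. Both function names are hypothetical.

```python
import numpy as np


def inverted_dropout(x, rate, rng):
    """Zero each activation with probability `rate` and rescale the survivors."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)


def batch_norm_train(x, eps=1e-3):
    """Standardize each feature over the batch (learned scale and shift omitted)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)

# Dropout: about 60% of the activations survive, and the mean stays close to 1
activations = np.ones(100_000)
dropped = inverted_dropout(activations, rate=0.4, rng=rng)
print(f"kept fraction: {(dropped > 0).mean():.3f}, mean after dropout: {dropped.mean():.3f}")

# Batch normalization: each column ends up with mean near 0 and std near 1
batch = rng.normal(loc=5.0, scale=3.0, size=(256, 4))
normed = batch_norm_train(batch)
print(normed.mean(axis=0).round(3), normed.std(axis=0).round(3))
```

At inference time dropout is disabled and batch normalization uses running averages collected during training, which is why Keras layers behave differently depending on the `training` flag.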

In [None]:
# Use a small subset of MNIST for faster overfitting demonstration
X_small = X_train_flat[:3000]
y_small = y_train[:3000]

def build_model(use_dropout=False, use_batchnorm=False, name='model'):
    layers_list = [layers.Input(shape=(784,))]
    for units in [512, 256, 128]:
        layers_list.append(layers.Dense(units, activation='relu'))
        if use_batchnorm:
            layers_list.append(layers.BatchNormalization())
        if use_dropout:
            layers_list.append(layers.Dropout(0.4))
    layers_list.append(layers.Dense(10, activation='softmax'))
    model = models.Sequential(layers_list, name=name)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

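# Validate on held-out training samples so the test set stays unseen until the summary table
X_val, y_val = X_train_flat[50000:], y_train[50000:]
fit_kwargs = dict(epochs=20, batch_size=64, validation_data=(X_val, y_val), verbose=0)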

# Model 1: No regularization (prone to overfitting)
m_plain = build_model(name='No_Regularization')
h_plain = m_plain.fit(X_small, y_small, **fit_kwargs)

# Model 2: Dropout only
m_drop = build_model(use_dropout=True, name='Dropout_0.4')
h_drop = m_drop.fit(X_small, y_small, **fit_kwargs)

# Model 3: Dropout + BatchNorm
m_full = build_model(use_dropout=True, use_batchnorm=True, name='Dropout+BatchNorm')
h_full = m_full.fit(X_small, y_small, **fit_kwargs)

# Compare training curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
models_info = [
    (h_plain, 'No Regularization', 'tomato'),
    (h_drop,  'Dropout(0.4)', 'steelblue'),
    (h_full,  'Dropout+BatchNorm', 'seagreen')
]
for hist, label, color in models_info:
    axes[0].plot(hist.history['accuracy'],     color=color, linestyle='-',  label=f'{label} (train)')
    axes[0].plot(hist.history['val_accuracy'], color=color, linestyle='--', label=f'{label} (val)')
    axes[1].plot(hist.history['loss'],     color=color, linestyle='-',  label=f'{label} (train)')
    axes[1].plot(hist.history['val_loss'], color=color, linestyle='--', label=f'{label} (val)')

axes[0].set_title('Accuracy Comparison'); axes[0].set_xlabel('Epoch'); axes[0].legend(fontsize=7)
axes[1].set_title('Loss Comparison');     axes[1].set_xlabel('Epoch'); axes[1].legend(fontsize=7)
plt.suptitle('Effect of Regularization Techniques on Training vs Validation')
plt.tight_layout()
plt.show()

# Summary table
summary_rows = []
for (h, label, _), model in zip(models_info, [m_plain, m_drop, m_full]):
    train_acc = h.history['accuracy'][-1]
    val_acc   = h.history['val_accuracy'][-1]
    _, test_acc_val = model.evaluate(X_test_flat, y_test, verbose=0)
    overfit_gap = train_acc - val_acc
    summary_rows.append({'Model': label, 'Train Acc': round(train_acc, 4),
                          'Val Acc': round(val_acc, 4), 'Test Acc': round(test_acc_val, 4),
                          'Overfit Gap': round(overfit_gap, 4)})

df_summary = pd.DataFrame(summary_rows)
print('\n=== Regularization Comparison Summary ===')
print(df_summary.to_string(index=False))

## Lab Assignments

Complete the following tasks:

1. **Task 1: MLP Architecture.** Experiment with three different MLP configurations on MNIST:
   - Shallow: 1 hidden layer (128 neurons)
   - Medium: 3 hidden layers (256, 128, 64)
   - Deep: 5 hidden layers (512, 256, 128, 64, 32)
   
   Compare test accuracy, training time, and degree of overfitting.
   Plot the learning curves in a single figure.

2. **Task 2: CNN Architecture Experiment.** Modify the CNN by adding
   one extra Conv block (Conv2D + MaxPooling). Compare the parameter count,
   training time, and accuracy against the original CNN.
   Visualize the feature maps (output of the first Conv2D layer) for 5 test images.

3. **Task 3: Multivariate LSTM.** Extend the time series prediction to use
   two input features (a sine wave and a cosine wave as multivariate input).
   Reshape the input to `(samples, SEQ_LEN, 2)`. Compare the MSE with
   the univariate model.

4. **Task 4: Learning Rate Scheduling.** Implement the `ReduceLROnPlateau`
   callback in CNN training. Compare the training curves with and without
   learning rate scheduling. Document when the LR changes.

5. **Task 5: Custom Dataset.** Find a freely available image dataset on Kaggle (at least 3 classes,
   at least 500 images in total). Apply transfer learning (MobileNetV2 or EfficientNetB0)
   with two training phases (feature extraction → fine-tuning). Report accuracy,
   the confusion matrix, and examples of correct and incorrect predictions.
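
As background for Task 4, the plateau rule behind `ReduceLROnPlateau` can be simulated without training a model. The sketch below is a simplified pure-Python model of that rule, not the Keras implementation: it omits the cooldown option, the function `simulate_reduce_lr` is hypothetical, its default values are chosen for illustration, and the validation losses are made up. The parameter names `factor`, `patience`, `min_delta`, and `min_lr` mirror the callback's arguments.

```python
def simulate_reduce_lr(val_losses, lr=1e-3, factor=0.5, patience=2,
                       min_delta=1e-4, min_lr=1e-6):
    """Return the learning rate used in each epoch under a plateau rule."""
    best = float("inf")
    wait = 0            # epochs since the last meaningful improvement
    schedule = []
    for loss in val_losses:
        schedule.append(lr)               # LR in effect during this epoch
        if loss < best - min_delta:       # improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:          # plateau: shrink the LR, but not below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
    return schedule


# Made-up validation losses: a stall after epoch 1 and another after epoch 4
demo_losses = [0.50, 0.40, 0.41, 0.42, 0.39, 0.40, 0.41]
demo_schedule = simulate_reduce_lr(demo_losses)
for epoch, (loss, lr) in enumerate(zip(demo_losses, demo_schedule)):
    print(f"epoch {epoch}: val_loss={loss:.2f}  lr={lr:.6f}")
```

With the real callback, the learning rate in effect at each epoch can be logged in a similar way to document when it changes.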