<a href="https://colab.research.google.com/github/KrituneX/Hands-on-Machine-Learning-with-Scikit-Learn-Keras-TensorFlow/blob/main/Chapter_10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chapter 10: Introduction to Artificial Neural Networks with Keras
## Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow

## 1. Fundamental Concepts of Neural Networks

### 1.1 Biological Inspiration
Artificial neural networks are loosely inspired by the structure of biological brains:
- **Neurons**: the basic processing units (the human brain has on the order of $10^{11}$ of them)
- **Dendrites**: receive incoming signals
- **Axons**: carry the output signal to other neurons
- **Synapses**: connections between neurons, which can be strengthened or weakened

### 1.2 Artificial Neuron (Perceptron)
A mathematical model of a neuron, defined by:
- Input: $x = [x_1, x_2, \dots, x_n]$
- Weights: $w = [w_1, w_2, \dots, w_n]$
- Bias: $b$
- Activation function: $\phi$

**Output Calculation**:
$$ z = w^T x + b $$
$$ a = \phi(z) $$
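
As a minimal sketch of the two equations above, the snippet below evaluates a single neuron with NumPy. The input, weight, and bias values are made up for illustration, and the sigmoid is just one possible choice for $\phi$.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, used here as the activation phi (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values only; none of these numbers come from the text.
x = np.array([1.0, 2.0, 3.0])    # inputs x_1..x_n
w = np.array([0.5, -0.25, 0.1])  # weights w_1..w_n
b = 0.2                          # bias

z = w @ x + b    # pre-activation: z = w^T x + b
a = sigmoid(z)   # activation:     a = phi(z)

print(f"z = {z:.2f}, a = {a:.4f}")
```

Here $z = 0.5 - 0.5 + 0.3 + 0.2 = 0.5$, so the neuron outputs $\sigma(0.5) \approx 0.6225$.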

### 1.3 Activation Functions
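Non-linear functions that map a neuron's pre-activation $z$ to its output:

| Function | Formula | Range | Advantage |
|----------|---------|-------|-----------|
| Sigmoid | $\sigma(z) = \frac{1}{1+e^{-z}}$ | $(0, 1)$ | Output can be read as a probability |
| Tanh | $\tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}$ | $(-1, 1)$ | Zero-centered output |
| ReLU | $\mathrm{ReLU}(z) = \max(0, z)$ | $[0, \infty)$ | Cheap to compute; does not saturate for $z > 0$, which mitigates vanishing gradients |
| Leaky ReLU | $\mathrm{LReLU}(z) = \max(\alpha z, z)$, with $0 < \alpha < 1$ | $(-\infty, \infty)$ | Keeps a small gradient for $z < 0$, which addresses the dying-ReLU problem |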
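
The four functions in the table can be written in a few lines of NumPy. This is an illustrative sketch rather than how Keras implements them internally; the default `alpha=0.01` for Leaky ReLU is an assumption, since the table does not fix a value.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # alpha is an assumed slope for z < 0; the table leaves it unspecified
    return np.maximum(alpha * z, z)

z = np.array([-2.0, 0.0, 2.0])
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh),
                 ("relu", relu), ("leaky_relu", leaky_relu)]:
    print(f"{name:>10}: {fn(z)}")
```

Evaluating each function on a negative, zero, and positive input makes the differences in range visible: only tanh and Leaky ReLU return negative values, and only ReLU maps every negative input to exactly zero.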

## 2. Multi-Layer Perceptron (MLP) Architecture

### 2.1 Layer Organization
- **Input layer**: receives the raw data (number of neurons = number of features)
- **Hidden layers**: one or more intermediate layers (typically 1-5)
- **Output layer**: produces the prediction (the number of neurons depends on the task)

**Example architecture** for 3-class classification:
- Input: 4 features → 4 neurons
- Hidden 1: 10 neurons (ReLU)
- Hidden 2: 8 neurons (ReLU)
- Output: 3 neurons (Softmax)
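
To get a feel for the size of this example network, the short sketch below counts its trainable parameters, assuming fully connected (dense) layers with one bias per neuron.

```python
# Layer sizes from the example: 4 inputs -> 10 -> 8 -> 3 outputs
layer_sizes = [4, 10, 8, 3]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_params = n_in * n_out + n_out  # weight matrix plus bias vector
    print(f"{n_in:>2} -> {n_out:>2}: {n_params} parameters")
    total += n_params

print("Total trainable parameters:", total)
```

The three dense layers contribute 50, 88, and 27 parameters, for 165 in total.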

### 2.2 Forward Propagation
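The computation that carries an input through the network to the output, one layer at a time, starting from $a^{[0]} = x$:
$$ z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]} $$
$$ a^{[l]} = \phi(z^{[l]}) $$
Here $W^{[l]}$ has shape $n_l \times n_{l-1}$, which is the convention used by the backpropagation equations in Section 2.3.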
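
A forward pass through the 4-10-8-3 example architecture might look like the NumPy sketch below. It assumes each weight matrix has shape `(n_out, n_in)`, so a layer computes `W @ a + b`. The weights are random placeholders and the input sample is made up.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

layer_sizes = [4, 10, 8, 3]
# Random placeholder parameters; a trained network would have learned these.
weights = [rng.normal(scale=0.5, size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]
activations = [relu, relu, softmax]

def forward(x):
    """Apply z = W a + b followed by a = phi(z), layer by layer."""
    a = x
    for W, b, phi in zip(weights, biases, activations):
        z = W @ a + b
        a = phi(z)
    return a

x = np.array([5.1, 3.5, 1.4, 0.2])  # made-up 4-feature sample
probs = forward(x)
print("class probabilities:", probs)
print("sum:", probs.sum())
```

Because the last layer uses softmax, the three outputs are non-negative and sum to 1, so they can be read as class probabilities.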

### 2.3 Backpropagation Algorithm
Learning by gradient descent, shown here for a single training example with cost $J$ and an element-wise activation $\phi$:
1. Compute the error at the output layer $L$:
$$ \delta^{[L]} = \nabla_a J \odot \phi'(z^{[L]}) $$
2. Propagate the error backward through layers $l = L-1, \dots, 1$:
$$ \delta^{[l]} = (W^{[l+1]T} \delta^{[l+1]}) \odot \phi'(z^{[l]}) $$
3. Update the weights and biases with learning rate $\eta$:
$$ W^{[l]} := W^{[l]} - \eta \delta^{[l]} a^{[l-1]T} $$
$$ b^{[l]} := b^{[l]} - \eta \delta^{[l]} $$
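
The three steps can be sketched in NumPy for a tiny 2-3-1 network. This is an illustration under stated assumptions, not the book's code: the activation is sigmoid in both layers, the cost is $J = \frac{1}{2}(a - y)^2$, weight matrices have shape `(n_l, n_{l-1})`, and all numeric values are made up. A finite-difference check confirms one gradient entry.

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Tiny 2-3-1 network with made-up data.
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
x, y = np.array([0.5, -1.0]), np.array([1.0])
eta = 0.1  # learning rate

def cost(W1, b1, W2, b2):
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    return 0.5 * np.sum((a2 - y) ** 2)

# Forward pass, caching z and a for every layer
z1 = W1 @ x + b1
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2
a2 = sigmoid(z2)

# Step 1: output error (for this cost, grad_a J = a2 - y)
delta2 = (a2 - y) * sigmoid_prime(z2)
# Step 2: propagate the error back to the hidden layer
delta1 = (W2.T @ delta2) * sigmoid_prime(z1)

# Gradients are outer products delta^[l] a^[l-1]T
grad_W2 = np.outer(delta2, a1)
grad_W1 = np.outer(delta1, x)

# Central finite-difference check of one weight gradient
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (cost(W1_plus, b1, W2, b2) - cost(W1_minus, b1, W2, b2)) / (2 * eps)
print("analytic:", grad_W1[0, 0], "numeric:", numeric)

# Step 3: gradient-descent update
cost_before = cost(W1, b1, W2, b2)
W2, b2 = W2 - eta * grad_W2, b2 - eta * delta2
W1, b1 = W1 - eta * grad_W1, b1 - eta * delta1
cost_after = cost(W1, b1, W2, b2)
print("cost before:", cost_before, "after:", cost_after)
```

The analytic and numeric gradients should agree to several decimal places, and one small update step should lower the cost on this sample.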

## 3. Implementing Neural Networks with Keras

### 3.1 Building an MLP for MNIST Classification

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

# Preprocessing
X_train = X_train.reshape(-1, 28*28) / 255.0
X_test = X_test.reshape(-1, 28*28) / 255.0

# Build model
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

# Compile model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train,
                    epochs=15,
                    batch_size=32,
                    validation_split=0.2)

# Evaluate
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")

### 3.2 Advanced Model Visualization

In [None]:
# Plot training history
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

## 4. Hyperparameter Tuning

### 4.1 Optimizers Comparison
- **SGD**: $\theta := \theta - \eta \nabla_\theta J(\theta)$
- **Momentum**: $v := \gamma v + \eta \nabla_\theta J(\theta)$, followed by $\theta := \theta - v$
- **Adam**: combines Momentum with RMSProp-style per-parameter adaptive learning rates
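
To make the first two update rules concrete, the sketch below applies them to a toy quadratic cost $J(\theta) = \frac{1}{2}(\theta_1^2 + 10\,\theta_2^2)$. The cost function, starting point, and the values of $\eta$, $\gamma$, and the step count are all assumptions chosen for illustration.

```python
import numpy as np

# Toy cost: J(theta) = 0.5 * sum(curvature * theta**2)
curvature = np.array([1.0, 10.0])

def cost(theta):
    return 0.5 * np.sum(curvature * theta ** 2)

def grad(theta):
    return curvature * theta

eta, gamma, n_steps = 0.01, 0.9, 100  # assumed hyperparameters

# Plain SGD: theta := theta - eta * grad
theta_sgd = np.array([1.0, 1.0])
for _ in range(n_steps):
    theta_sgd = theta_sgd - eta * grad(theta_sgd)

# Momentum: v := gamma * v + eta * grad, then theta := theta - v
theta_mom = np.array([1.0, 1.0])
v = np.zeros_like(theta_mom)
for _ in range(n_steps):
    v = gamma * v + eta * grad(theta_mom)
    theta_mom = theta_mom - v

print("SGD cost:     ", cost(theta_sgd))
print("Momentum cost:", cost(theta_mom))
```

On this particular problem momentum ends with a much lower cost after the same number of steps, because the velocity term accumulates progress along the shallow $\theta_1$ direction. That outcome depends on the chosen hyperparameters and does not hold for every problem.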

### 4.2 Regularization Techniques
- **L2 regularization**: $J(\theta) = \text{Loss} + \frac{\lambda}{2}\|\theta\|^2$
- **Dropout**: randomly deactivates a fraction of neurons during training
- **Early stopping**: halts training once the validation error starts to rise
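
The three techniques can each be sketched in a few lines of NumPy and plain Python. The function names here are hypothetical helpers, not Keras APIs. The dropout variant shown is "inverted" dropout, which rescales the surviving activations by `1 / (1 - rate)`, and the early-stopping rule uses a patience counter on a made-up list of validation losses.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_penalty(theta, lam):
    """Penalty term (lambda / 2) * ||theta||^2 that is added to the loss."""
    return 0.5 * lam * np.sum(theta ** 2)

def dropout(a, rate, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training:
        return a  # dropout is disabled at inference time
    keep = rng.random(a.shape) >= rate
    return a * keep / (1.0 - rate)

def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a patience-based stopping rule."""
    best, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

print(l2_penalty(np.array([3.0, 4.0]), lam=0.1))
print(dropout(np.ones(8), rate=0.3))
# Made-up validation losses: the best value occurs at epoch 2
print(early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.7], patience=3))
```

With these losses the rule stops at epoch 5, after three epochs without improvement over epoch 2, and the weights from epoch 2 would be the ones worth restoring.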

## 5. Practical Implementation with Callbacks

In [None]:
# Advanced model with callbacks
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, LearningRateScheduler

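def lr_scheduler(epoch, lr):
    # Hold the learning rate for the first 5 epochs, then decay it by exp(-0.1) per epoch.
    # Return a plain Python float: some Keras versions reject a tf.Tensor here.
    if epoch < 5:
        return lr
    return float(lr * np.exp(-0.1))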

callbacks = [
    EarlyStopping(patience=3, restore_best_weights=True),
    ModelCheckpoint('best_model.h5', save_best_only=True),
    LearningRateScheduler(lr_scheduler)
]

model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', kernel_regularizer=keras.regularizers.l2(0.01)),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(X_train, y_train,
                   epochs=50,
                   batch_size=64,
                   validation_split=0.2,
                   callbacks=callbacks)
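
To see what the schedule does over time, the sketch below replays it in plain Python for the first 10 epochs, starting from the learning rate of 0.001 passed to Adam. It uses `math.exp` instead of TensorFlow so it runs without any dependencies. It also assumes the callback feeds each epoch's returned value back in as `lr` for the next epoch and that training is not stopped early.

```python
import math

def lr_scheduler(epoch, lr):
    # Same rule as the training code: constant for 5 epochs, then exponential decay
    if epoch < 5:
        return lr
    return lr * math.exp(-0.1)

lr = 0.001  # initial learning rate given to Adam
schedule = []
for epoch in range(10):
    lr = lr_scheduler(epoch, lr)
    schedule.append(lr)
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

The rate stays at 0.001 for epochs 0 to 4 and then shrinks by a factor of $e^{-0.1} \approx 0.905$ per epoch, reaching $0.001 \cdot e^{-0.5} \approx 0.000607$ at epoch 9.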

## 6. Model Evaluation and Interpretation

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sns

# Predictions
y_pred = model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred_classes)
plt.figure(figsize=(10,8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()

# Misclassified examples
misclassified_idx = np.where(y_pred_classes != y_test)[0]
plt.figure(figsize=(15,4))
for i, idx in enumerate(misclassified_idx[:5]):
    plt.subplot(1,5,i+1)
    plt.imshow(X_test[idx].reshape(28,28), cmap='gray')
    plt.title(f'True: {y_test[idx]}\nPred: {y_pred_classes[idx]}')
    plt.axis('off')
plt.tight_layout()
plt.show()
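
A confusion matrix also yields per-class metrics directly. The sketch below uses a small made-up 3-class matrix in scikit-learn's layout (rows are true labels, columns are predicted labels). The same lines would work on the `cm` computed above.

```python
import numpy as np

# Made-up 3-class confusion matrix: rows = true labels, columns = predictions
cm = np.array([[50,  2,  3],
               [ 4, 40,  6],
               [ 1,  5, 44]])

true_positives = np.diag(cm)
recall = true_positives / cm.sum(axis=1)     # correct / all samples of that true class
precision = true_positives / cm.sum(axis=0)  # correct / all predictions of that class
accuracy = true_positives.sum() / cm.sum()

for k in range(cm.shape[0]):
    print(f"class {k}: precision = {precision[k]:.3f}, recall = {recall[k]:.3f}")
print(f"accuracy = {accuracy:.3f}")
```

A class with low recall is often missed, while a class with low precision is often predicted when the true label is something else. The off-diagonal cells show which pairs of classes are being confused.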

# **Additional Notes:**

### **1. In-Depth Theory:**
- The biology of neurons and the analogy to ANNs
- Mathematical details of forward propagation and backpropagation
- Comparison of activation functions and optimizers
- Equations for all key components

### **2. Practical Implementation:**
- A complete MNIST classification example
- Visualization of the training process
- Callbacks for more advanced training
- Analysis of results with a confusion matrix

### **3. Best Practices:**
- Regularization techniques (L2, Dropout)
- Hyperparameter tuning
- Early stopping
- Learning rate scheduling

### **4. Model Evaluation:**
- Interpreting the confusion matrix
- Analyzing misclassified examples
- Visualizing predictions

### **5. Clear Structure:**
- Systematic division into sections
- Integrated code and theory
- Supporting diagrams and visualizations