# 🚀 MACHINE LEARNING OPTIMIZER - SUPER COMPLETE TUTORIAL

**Goal**: Become a professional AI engineer by understanding EVERY DETAIL!

---

## 📚 Table of Contents:
1. **NumPy & Pandas Functions Explained**
2. **The Math & Formulas of Linear Regression**
3. **Optimizer Algorithms (GD, Momentum, Adam)**
4. **Training & Evaluation**
5. **Visualization & Analysis**

---
# PART 1: IMPORTS & WHAT EACH LIBRARY DOES
---

In [None]:
# NUMPY - library for math and array operations
import numpy as np

# PANDAS - library for data manipulation (like Excel in Python)
import pandas as pd

# MATPLOTLIB - library for visualization (drawing charts)
import matplotlib.pyplot as plt

# SEABORN - visualization library with nicer default styling
import seaborn as sns

# SKLEARN - machine learning library (ready-made tools)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

import time

# Make plots appear directly in the notebook
%matplotlib inline

# Set the plotting style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✅ All libraries imported successfully!")

---
# PART 2: THE NUMPY FUNCTIONS EXPLAINED
---

## 🔍 NUMPY FUNCTIONS WE WILL USE:

### 1. **np.zeros()**
**Purpose**: Create an array filled entirely with ZEROS

**Usage**:
```python
np.zeros((3, 1))  # Create an array with 3 rows and 1 column, all zeros
```

**Analogy**: Like taking a blank sheet of paper of a given size

**Why use it?**: To initialize the weights at the start of training. We start from 0, then update them step by step.

---

### 2. **.shape**
**Purpose**: Inspect the size/dimensions of an array

**Usage**:
```python
X.shape  # Output: (100, 5) means 100 rows, 5 columns
```

**Analogy**: Like checking the size of a table (length x width)

---

### 3. **np.dot()** or **.dot()**
**Purpose**: Matrix multiplication

**Usage**:
```python
np.dot(X, w)  # or X.dot(w)
```

**The math**:
```
If X = [[1, 2],    w = [[5],
        [3, 4]]         [6]]

Result = [[1*5 + 2*6],  = [[17],
          [3*5 + 4*6]]     [39]]
```

**Analogy**: Like adding up a shopping bill. You buy 2 apples (5k each) and 3 oranges (6k each). Total = 2×5 + 3×6 = 28k
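
The shopping analogy maps directly onto a dot product. Here is a minimal sketch using the numbers from the analogy and the matrix example; the variable names are illustrative only:

```python
import numpy as np

# Quantities bought and unit prices (in thousands), from the shopping analogy
quantities = np.array([2, 3])   # 2 apples, 3 oranges
prices = np.array([5, 6])       # 5k per apple, 6k per orange

# The dot product multiplies element by element, then sums
total = np.dot(quantities, prices)
print(total)  # 2*5 + 3*6 = 28

# The matrix example: each row of X is combined with the column vector w
X = np.array([[1, 2], [3, 4]])
w = np.array([[5], [6]])
result = X @ w  # for 2-D arrays the @ operator gives the same result as np.dot
print(result.shape)  # (2, 1)
```

Each row of `X` plays the role of one shopping basket, and `w` plays the role of the price list.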

---

### 4. **np.mean()**
**Purpose**: Compute the average

**Formula**: mean = (x₁ + x₂ + ... + xₙ) / n

**Analogy**: Your average exam score

---

### 5. **np.sqrt()**
**Purpose**: Square root

**Formula**: √x

**Example**: np.sqrt(16) = 4

---

### 6. **.T** (Transpose)
**Purpose**: Turn rows into columns and columns into rows

**Example**:
```
X = [[1, 2, 3],     X.T = [[1, 4],
     [4, 5, 6]]            [2, 5],
                           [3, 6]]
```

**Analogy**: Like flipping a sheet of paper over its main diagonal (a mirror flip, not a 90-degree rotation)

In [None]:
# DEMO: NUMPY FUNCTIONS
print("=" * 60)
print("DEMO: NumPy functions")
print("=" * 60)

# 1. np.zeros()
print("\n1. np.zeros() - Create an array of ZEROS")
zeros_array = np.zeros((3, 2))
print(f"Shape: {zeros_array.shape}")
print(zeros_array)

# 2. .shape
print("\n2. .shape - Inspect the array size")
X_demo = np.array([[1, 2, 3], [4, 5, 6]])
print(f"X_demo.shape = {X_demo.shape}  # (2 rows, 3 columns)")

# 3. np.dot() - Matrix multiplication
print("\n3. np.dot() - Matrix multiplication")
A = np.array([[1, 2], [3, 4]])
B = np.array([[5], [6]])
result = np.dot(A, B)
print(f"A = \n{A}")
print(f"B = \n{B}")
print(f"A.dot(B) = \n{result}")
print("Calculation: [1*5 + 2*6] = [17], [3*5 + 4*6] = [39]")

# 4. np.mean()
print("\n4. np.mean() - Average")
data = np.array([10, 20, 30, 40, 50])
print(f"Data: {data}")
print(f"Mean: {np.mean(data)}  # (10+20+30+40+50)/5 = 30")

# 5. np.sqrt()
print("\n5. np.sqrt() - Square root")
print(f"√16 = {np.sqrt(16)}")
print(f"√25 = {np.sqrt(25)}")

# 6. .T - Transpose
print("\n6. .T - Transpose (swap rows and columns)")
X = np.array([[1, 2, 3], [4, 5, 6]])
print(f"X = \n{X}")
print(f"X.T = \n{X.T}")

---
# PART 3: LOAD DATA & PREPROCESSING
---

## 📊 PANDAS FUNCTIONS EXPLAINED:

### **pd.read_csv()**
- **Purpose**: Read a CSV (Comma Separated Values) file
- **Analogy**: Like opening an Excel file

### **df.drop()**
- **Purpose**: Remove columns or rows
- **Parameters**:
  - `axis=1`: drop columns
  - `axis=0`: drop rows
  - `inplace=True`: modify the original df directly

### **df.shape**
- **Purpose**: Inspect the dataframe size (rows, columns)

### **pd.get_dummies()**
- **Purpose**: One-hot encoding (turn categories into numbers)
- **Example**:
  ```
  Color: [Red, Blue, Red]
  
  Becomes:
  Color_Blue  Color_Red
      0           1
      1           0
      0           1
  ```
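
The table above can be reproduced with a few lines of pandas. This is a small sketch on a hypothetical `Color` column (the frame `df_demo` is made up for illustration). Passing `dtype=int` requests 0/1 integers, since recent pandas versions return booleans by default:

```python
import pandas as pd

# A tiny hypothetical frame with one categorical column
df_demo = pd.DataFrame({"Color": ["Red", "Blue", "Red"]})

# One new column per category; dtype=int gives 0/1 instead of True/False
encoded = pd.get_dummies(df_demo, dtype=int)
print(encoded)
```

The new column names are built as `<original column>_<category>`, with the categories in sorted order.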

In [None]:
print("=" * 70)
print("LOADING DATA")
print("=" * 70)

# Read the CSV file
df = pd.read_csv("dataset/CarPrice_Assignment.csv")

print(f"\n📊 Dataset shape: {df.shape}")
print(f"   Meaning: {df.shape[0]} cars, {df.shape[1]} columns/features")

# Show the first 5 rows
print("\nFirst 5 rows:")
df.head()

In [None]:
# Dataset info
print("\nDataset info:")
df.info()

In [None]:
# Drop the car_ID column (not relevant for prediction)
df.drop("car_ID", axis=1, inplace=True)
print("✅ Column 'car_ID' dropped")
print(f"Shape now: {df.shape}")

## 🎯 SEPARATE THE TARGET (y) FROM THE FEATURES (X)

**Target (y)**: What we want to predict (the car price)

**Features (X)**: The data we use to make the prediction (brand, engine size, and so on)

In [None]:
# Separate y (target) and X (features)
y = df["price"].values.reshape(-1, 1).astype(np.float64)
X = df.drop("price", axis=1)

print("Target (y) - Car price:")
print(f"  Shape: {y.shape}  # {y.shape[0]} cars")
print(f"  Mean:  ${np.mean(y):,.2f}")
print(f"  Min:   ${np.min(y):,.2f}")
print(f"  Max:   ${np.max(y):,.2f}")

print(f"\nFeatures (X):")
print(f"  Shape: {X.shape}  # {X.shape[1]} features")

## 🔢 ONE-HOT ENCODING

**Problem**: A linear model cannot do arithmetic on text ("Toyota", "Honda")

**Solution**: Convert it into 0s and 1s

**Example**:
```
Brand: [Toyota, Honda, Toyota]

Becomes:
Brand_Honda  Brand_Toyota
    0            1
    1            0
    0            1
```
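
The encoding cell in this notebook passes `drop_first=True`, which removes the first category of each column. One category can always be inferred from the others, so dropping it avoids a redundant column. Here is a sketch on a hypothetical `Brand` column showing the difference:

```python
import pandas as pd

# Hypothetical column matching the example table
brands = pd.DataFrame({"Brand": ["Toyota", "Honda", "Toyota"]})

full = pd.get_dummies(brands, dtype=int)                      # keeps every category
reduced = pd.get_dummies(brands, drop_first=True, dtype=int)  # drops the first one

print(list(full.columns))     # ['Brand_Honda', 'Brand_Toyota']
print(list(reduced.columns))  # ['Brand_Toyota']
```

With only `Brand_Toyota` left, a 0 in that column means "Honda", so no information is lost.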

In [None]:
# One-hot encoding
print("Before encoding:")
print(f"Number of columns: {X.shape[1]}")

# drop_first=True drops the first category of each column to avoid redundant dummies
X = pd.get_dummies(X, drop_first=True)

print(f"\nAfter encoding:")
print(f"Number of columns: {X.shape[1]}")
print(f"\nThe column count grows because each category becomes its own column!")

## 📏 STANDARDIZATION (SCALING)

**Problem**: Features live on very different scales
- Year: 1990-2020 (a spread of tens)
- Price: 5000-50000 (a spread of thousands)

**Solution**: Standardize them to a common scale

**Formula**:
```
z = (x - μ) / σ

Where:
- x = original value
- μ (mu) = mean (average)
- σ (sigma) = standard deviation
```

**Result**: Every feature has mean=0, std=1

**Analogy**: Like re-expressing each exam score as "how many standard deviations above or below the class average"
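
To make the formula concrete, the sketch below standardizes a hypothetical column of years by hand and compares the result with `StandardScaler`. The array `x` is invented for illustration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature column (e.g. model years)
x = np.array([[1990.0], [2000.0], [2010.0], [2020.0]])

# Manual z-score: subtract the mean, divide by the standard deviation
mu = x.mean(axis=0)
sigma = x.std(axis=0)          # NumPy's default (ddof=0) matches StandardScaler
z_manual = (x - mu) / sigma

# The same transformation done by scikit-learn
z_sklearn = StandardScaler().fit_transform(x)

print(np.allclose(z_manual, z_sklearn))
print(z_manual.mean(), z_manual.std())
```

Note that `DataFrame.describe()` reports the sample standard deviation (ddof=1), so after scaling it shows a value slightly above 1 on small datasets.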

In [None]:
# Standardization
num_cols = X.select_dtypes(include=["int64", "float64"]).columns
scaler = StandardScaler()

print("Before standardization:")
print(X[num_cols].describe())

# Note: the scaler is fit on the full dataset here, before the train/test split,
# so test-set statistics leak into preprocessing; fitting on training rows only is stricter.
X[num_cols] = scaler.fit_transform(X[num_cols])

print("\nAfter standardization:")
print(X[num_cols].describe())
print("\n✅ Note: mean ≈ 0, std ≈ 1")

In [None]:
# Convert to a NumPy array
X = X.values.astype(np.float64)

print(f"Final X shape: {X.shape}")
print(f"Final y shape: {y.shape}")

## ✂️ TRAIN/TEST SPLIT

**Why split?**
- **Training set**: For learning (80%)
- **Test set**: For the exam (20%)

**Analogy**: Like studying from a workbook, then sitting an exam with new questions
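
To keep the "exam" honest, a common practice is to fit the scaler only on the training rows and then reuse its statistics on the test rows. The sketch below illustrates that ordering on synthetic data; all names (`X_all`, `scaler_demo`, and so on) are hypothetical and this is not the pipeline used in this notebook:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 100 samples, 3 numeric features
rng = np.random.default_rng(42)
X_all = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
y_all = rng.normal(size=(100, 1))

# Split first...
X_tr, X_te, y_tr, y_te = train_test_split(X_all, y_all, test_size=0.2, random_state=42)

# ...then learn the mean/std from the training rows only
scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)
X_te_scaled = scaler_demo.transform(X_te)   # reuse the training statistics

print(X_tr_scaled.shape, X_te_scaled.shape)  # (80, 3) (20, 3)
```

The test rows end up with a mean close to, but not exactly, zero. That is expected, because they were never seen when the statistics were computed.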

In [None]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("Data split:")
print(f"  Training: {X_train.shape[0]} cars (80%)")
print(f"  Testing:  {X_test.shape[0]} cars (20%)")
print(f"\n✅ Data is ready for training!")

---
# PART 4: THE MATH OF LINEAR REGRESSION
---

## 📐 BASIC LINEAR REGRESSION FORMULA

### **Prediction equation**:
```
ŷ = w₁x₁ + w₂x₂ + ... + wₙxₙ + b

Or in matrix form:
ŷ = Xw + b
```

**Where**:
- `ŷ` (y-hat) = prediction
- `X` = features (input data)
- `w` = weights
- `b` = bias (intercept)

**Analogy**: 
```
House price = (Area × w₁) + (Rooms × w₂) + (Location × w₃) + b
```
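
As a worked example of the matrix form, the sketch below plugs made-up numbers into the house-price analogy. The features, weights, and bias are invented purely for illustration:

```python
import numpy as np

# Two hypothetical houses described by (area, rooms, location score)
X_demo = np.array([[100.0, 3.0, 8.0],
                   [ 60.0, 2.0, 5.0]])
w_demo = np.array([[2.0], [10.0], [5.0]])   # one weight per feature, shape (3, 1)
b_demo = 20.0                                # bias, added to every prediction

y_hat = X_demo @ w_demo + b_demo
print(y_hat)
# House 1: 100*2 + 3*10 + 8*5 + 20 = 290
# House 2:  60*2 + 2*10 + 5*5 + 20 = 185
```

The result has shape `(2, 1)`: one prediction per row of `X_demo`.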

---

## 📊 LOSS FUNCTION (MSE)

**Mean Squared Error**:
```
MSE = (1/n) Σ(ŷᵢ - yᵢ)²
```

**How to read it**:
1. `(ŷᵢ - yᵢ)` = error (difference between the prediction and the actual value)
2. `²` = squared (so negative errors become positive, and large errors are penalized more)
3. `Σ` (sigma) = sum over all samples
4. `(1/n)` = take the average

**Analogy**: Like computing the average of your squared guessing errors
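
The four reading steps can be followed one by one on a tiny example. The values below are made up for illustration:

```python
import numpy as np

y_true = np.array([[10.0], [20.0], [30.0]])
y_pred = np.array([[12.0], [18.0], [33.0]])

errors = y_pred - y_true          # step 1: [2, -2, 3]
squared = errors ** 2             # step 2: [4, 4, 9]
mse_value = squared.mean()        # steps 3 and 4: (4 + 4 + 9) / 3
print(mse_value)
```

The single error of 3 contributes more than the two errors of 2 combined, which is the "large errors are penalized more" effect of squaring.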

---

## 🎯 GRADIENT (DERIVATIVE)

**Gradient with respect to w**:
```
∂L/∂w = (2/n) Xᵀ(ŷ - y)
```

**Gradient with respect to b**:
```
∂L/∂b = (2/n) Σ(ŷ - y)
```

**How to read it**:
- `∂L/∂w` = "the derivative of the loss with respect to w"
- It points in the direction where the loss rises fastest, so moving w the opposite way reduces the error

**Analogy**: Like a compass that points uphill: to get down the mountain, walk the opposite way
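
A quick way to gain confidence in these formulas is to compare them with numerical derivatives. The sketch below does that on random data; the names `X_demo`, `y_demo`, `w_demo`, `b_demo`, and the helper `loss` are hypothetical, and the step size `h` is an arbitrary small value:

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = rng.normal(size=(20, 1))
w_demo = rng.normal(size=(3, 1))
b_demo = 0.5

def loss(w, b):
    """MSE of the linear model on the demo data."""
    return np.mean((X_demo @ w + b - y_demo) ** 2)

# Analytic gradients from the formulas above
n = len(y_demo)
residual = X_demo @ w_demo + b_demo - y_demo
dw_analytic = (2.0 / n) * (X_demo.T @ residual)
db_analytic = (2.0 / n) * residual.sum()

# Numerical gradients via central differences
h = 1e-6
dw_numeric = np.zeros_like(w_demo)
for j in range(w_demo.shape[0]):
    step = np.zeros_like(w_demo)
    step[j, 0] = h
    dw_numeric[j, 0] = (loss(w_demo + step, b_demo) - loss(w_demo - step, b_demo)) / (2 * h)
db_numeric = (loss(w_demo, b_demo + h) - loss(w_demo, b_demo - h)) / (2 * h)

print(np.allclose(dw_analytic, dw_numeric, atol=1e-6))
print(np.isclose(db_analytic, db_numeric, atol=1e-6))
```

If the two versions disagreed, the analytic formula (or its implementation) would be the first place to look for a bug.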

In [None]:
# CORE FUNCTIONS

def predict(X, w, b):
    """
    Prediction: ŷ = Xw + b
    
    Args:
        X: features (n_samples, n_features)
        w: weights (n_features, 1)
        b: bias (scalar)
    
    Returns:
        predictions (n_samples, 1)
    """
    return np.dot(X, w) + b


def mse(y_pred, y):
    """
    Mean Squared Error: MSE = (1/n) Σ(ŷᵢ - yᵢ)²
    
    Args:
        y_pred: predictions
        y: true values
    
    Returns:
        MSE (scalar)
    """
    return np.mean((y_pred - y) ** 2)


def compute_gradients(X, y, y_pred):
    """
    Compute the gradients:
    - dw = (2/n) Xᵀ(ŷ - y)
    - db = (2/n) Σ(ŷ - y)
    
    Args:
        X: features
        y: target
        y_pred: predictions
    
    Returns:
        dw, db (gradients)
    """
    n = len(y)
    dw = (2.0/n) * np.dot(X.T, (y_pred - y))
    db = (2.0/n) * np.sum(y_pred - y)
    return dw, db

print("✅ Core functions defined!")

---
# PART 5: OPTIMIZER ALGORITHMS
---

## 1️⃣ GRADIENT DESCENT (GD)

**Update rule**:
```
w = w - α × ∂L/∂w
b = b - α × ∂L/∂b
```

**Where**:
- `α` (alpha) = learning rate (how big each step is)
- `∂L/∂w` = gradient (it points uphill, so we subtract it to go downhill)

**How it works**:
1. Compute the predictions
2. Compute the error (loss)
3. Compute the gradient (the slope)
4. Update the parameters (take a small step downhill)
5. Repeat!

**Analogy**: Walking down a mountain blindfolded. You feel the slope of the ground, then step toward lower ground.
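
The size of the step matters as much as its direction. The sketch below applies the update rule to a toy one-parameter loss, `L(w) = w**2`, with two learning rates. The function `run_gd` and its defaults are hypothetical and only meant to illustrate the behavior:

```python
# Minimize L(w) = w**2, whose gradient is 2*w and whose minimum is at w = 0
def run_gd(lr, steps=50, w_start=5.0):
    w = w_start
    for _ in range(steps):
        grad = 2.0 * w
        w = w - lr * grad      # the gradient descent update rule
    return w

w_small_lr = run_gd(lr=0.1)    # each step multiplies w by (1 - 0.2) = 0.8
w_large_lr = run_gd(lr=1.1)    # each step multiplies w by (1 - 2.2) = -1.2

print(abs(w_small_lr))   # shrinks toward 0
print(abs(w_large_lr))   # grows with every step
```

With the larger learning rate every step overshoots the minimum by more than it corrects, so the loss grows instead of shrinking.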

In [None]:
def train_gradient_descent(X, y, lr=0.01, epochs=150, verbose=True):
    """
    Gradient Descent Optimizer
    
    Args:
        X: training features
        y: training targets
        lr: learning rate (α)
        epochs: number of iterations
        verbose: whether to print progress
    
    Returns:
        w, b, losses, training_time
    """
    # Initialize parameters
    n_features = X.shape[1]
    w = np.zeros((n_features, 1), dtype=np.float64)  # Start from 0
    b = 0.0
    
    losses = []
    start_time = time.time()
    
    for epoch in range(epochs):
        # 1. Forward pass (prediction)
        y_pred = predict(X, w, b)
        
        # 2. Compute loss
        loss = mse(y_pred, y)
        losses.append(loss)
        
        # 3. Compute gradients
        dw, db = compute_gradients(X, y, y_pred)
        
        # 4. Update parameters
        w = w - lr * dw  # w_new = w_old - α × gradient
        b = b - lr * db
        
        # Print progress
        if verbose and epoch % 30 == 0:
            print(f"Epoch {epoch:3d} | Loss: {loss:,.2f}")
    
    training_time = time.time() - start_time
    
    if verbose:
        print(f"\n✅ Training finished in {training_time:.2f} seconds")
    
    return w, b, losses, training_time

print("✅ Gradient Descent function ready!")

## 2️⃣ MOMENTUM

**Formula**:
```
v = β × v + α × ∂L/∂w    (velocity update)
w = w - v                 (parameter update)
```

**Where**:
- `v` = velocity
- `β` (beta) = momentum coefficient (typically 0.9)

**How it works**:
- Accumulates an exponentially decaying sum of past gradients
- Like a rolling ball: it picks up speed while the slope keeps pointing the same way

**Analogy**: 
- GD = walking down the mountain
- Momentum = cycling down the mountain (you carry momentum!)

**Advantages**:
- Usually converges faster, especially along directions with a small but consistent slope
- On non-convex losses it can roll past shallow local minima (the MSE of linear regression is convex, so here the gain is speed)
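
The speed-up is easiest to see on a loss that is steep in one direction and flat in another. The sketch below compares plain gradient descent (`beta=0`) with momentum (`beta=0.9`) on a toy quadratic. The loss, starting point, and helper names are assumptions chosen for illustration:

```python
import numpy as np

# Ill-conditioned quadratic: L(w) = 0.5 * (1 * w1**2 + 100 * w2**2)
curvature = np.array([1.0, 100.0])

def loss(w):
    return 0.5 * np.sum(curvature * w ** 2)

def grad(w):
    return curvature * w

def run(beta, lr=0.01, steps=100):
    w = np.array([1.0, 1.0])
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + lr * grad(w)   # velocity update
        w = w - v                      # parameter update
    return loss(w)

loss_gd = run(beta=0.0)        # beta = 0 reduces to plain gradient descent
loss_momentum = run(beta=0.9)

print(loss_gd, loss_momentum)
```

Plain gradient descent crawls along the flat direction because the learning rate has to stay small enough for the steep one. Momentum keeps accumulating speed along the flat direction and ends with a much lower loss after the same number of steps.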

In [None]:
def train_momentum(X, y, lr=0.01, epochs=150, beta=0.9, verbose=True):
    """
    Momentum Optimizer
    
    Args:
        beta: momentum coefficient (0.9 recommended)
    """
    n_features = X.shape[1]
    w = np.zeros((n_features, 1), dtype=np.float64)
    b = 0.0
    
    # Initialize velocity
    vw = np.zeros((n_features, 1), dtype=np.float64)
    vb = 0.0
    
    losses = []
    start_time = time.time()
    
    for epoch in range(epochs):
        # Forward pass
        y_pred = predict(X, w, b)
        loss = mse(y_pred, y)
        losses.append(loss)
        
        # Compute gradients
        dw, db = compute_gradients(X, y, y_pred)
        
        # Update velocity: v = β×v + α×gradient
        vw = beta * vw + lr * dw
        vb = beta * vb + lr * db
        
        # Update parameters: w = w - v
        w = w - vw
        b = b - vb
        
        if verbose and epoch % 30 == 0:
            print(f"Epoch {epoch:3d} | Loss: {loss:,.2f}")
    
    training_time = time.time() - start_time
    if verbose:
        print(f"\n✅ Training finished in {training_time:.2f} seconds")
    
    return w, b, losses, training_time

print("✅ Momentum function ready!")

## 3️⃣ ADAM (Adaptive Moment Estimation)

**Full formula**:
```
m = β₁ × m + (1-β₁) × ∂L/∂w      (first moment - mean)
v = β₂ × v + (1-β₂) × (∂L/∂w)²   (second moment - uncentered variance)

m̂ = m / (1 - β₁ᵗ)                (bias correction)
v̂ = v / (1 - β₂ᵗ)                (bias correction)

w = w - α × m̂ / (√v̂ + ε)        (update)
```

**Where**:
- `m` = first moment (running average of the gradient)
- `v` = second moment (running average of the squared gradient)
- `β₁` = 0.9 (decay rate for m)
- `β₂` = 0.999 (decay rate for v)
- `ε` = 1e-8 (for numerical stability)
- `t` = timestep (the update count; here there is one update per epoch)

**How it works**:
1. Track the running average of the gradient (m)
2. Track the running average of the squared gradient (v)
3. Adapt the learning rate for each parameter
4. Apply bias correction early in training

**Analogy**: 
- Like a smart GPS
- It knows when to go fast and when to slow down
- Each parameter gets its own "speed"

**Why is it SO WIDELY USED?**
- Combines Momentum + RMSProp
- Adaptive learning rate
- Works reliably across a wide range of problems
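
One consequence of the formula is that the step size is set by the learning rate rather than by the size of the gradient. The sketch below computes the very first Adam update (t = 1) for a tiny and a huge gradient. The helper `adam_first_step` is hypothetical and written only for this illustration:

```python
import numpy as np

def adam_first_step(grad, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """Size of the very first Adam update for a given gradient (t = 1)."""
    m = (1 - b1) * grad            # m starts at 0
    v = (1 - b2) * grad ** 2       # v starts at 0
    m_hat = m / (1 - b1 ** 1)      # bias correction recovers the raw gradient
    v_hat = v / (1 - b2 ** 1)      # ...and the raw squared gradient
    return lr * m_hat / (np.sqrt(v_hat) + eps)

step_tiny = adam_first_step(grad=1e-3)
step_huge = adam_first_step(grad=1e3)

print(step_tiny, step_huge)   # both are close to lr = 0.01
```

This is why Adam is robust to badly scaled gradients. It also means that when the optimum lies far from the starting point (for example, a target measured in thousands with parameters starting at 0), a small learning rate may need many steps or a rescaled target to get there.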

In [None]:
def train_adam(X, y, lr=0.01, epochs=150, b1=0.9, b2=0.999, eps=1e-8, verbose=True):
    """
    Adam Optimizer
    
    Args:
        b1: beta1 for the first moment (0.9)
        b2: beta2 for the second moment (0.999)
        eps: epsilon for numerical stability (1e-8)
    """
    n_features = X.shape[1]
    w = np.zeros((n_features, 1), dtype=np.float64)
    b = 0.0
    
    # Initialize moments
    m_w = np.zeros((n_features, 1), dtype=np.float64)  # first moment for w
    v_w = np.zeros((n_features, 1), dtype=np.float64)  # second moment for w
    m_b = 0.0  # first moment for b
    v_b = 0.0  # second moment for b
    
    losses = []
    start_time = time.time()
    
    for epoch in range(1, epochs + 1):  # Start from 1 so the bias correction never divides by zero
        # Forward pass
        y_pred = predict(X, w, b)
        loss = mse(y_pred, y)
        losses.append(loss)
        
        # Compute gradients
        dw, db = compute_gradients(X, y, y_pred)
        
        # Update first moment: m = β₁×m + (1-β₁)×gradient
        m_w = b1 * m_w + (1 - b1) * dw
        m_b = b1 * m_b + (1 - b1) * db
        
        # Update second moment: v = β₂×v + (1-β₂)×gradient²
        v_w = b2 * v_w + (1 - b2) * (dw ** 2)
        v_b = b2 * v_b + (1 - b2) * (db ** 2)
        
        # Bias correction
        m_w_hat = m_w / (1 - b1 ** epoch)
        m_b_hat = m_b / (1 - b1 ** epoch)
        v_w_hat = v_w / (1 - b2 ** epoch)
        v_b_hat = v_b / (1 - b2 ** epoch)
        
        # Update parameters: w = w - α × m̂ / (√v̂ + ε)
        w = w - lr * m_w_hat / (np.sqrt(v_w_hat) + eps)
        b = b - lr * m_b_hat / (np.sqrt(v_b_hat) + eps)
        
        if verbose and (epoch - 1) % 30 == 0:
            print(f"Epoch {epoch-1:3d} | Loss: {loss:,.2f}")
    
    training_time = time.time() - start_time
    if verbose:
        print(f"\n✅ Training finished in {training_time:.2f} seconds")
    
    return w, b, losses, training_time

print("✅ Adam function ready!")

---
# PART 6: TRAINING ALL OPTIMIZERS
---

In [None]:
print("=" * 70)
print("TRAINING STARTS!")
print("=" * 70)

optimizers = {}

# 1. Gradient Descent
print("\n1️⃣ GRADIENT DESCENT")
print("-" * 70)
w, b, losses, train_time = train_gradient_descent(X_train, y_train, lr=0.01, epochs=150)
optimizers["GD"] = (w, b, losses, train_time)

In [None]:
# 2. Momentum
print("\n2️⃣ MOMENTUM")
print("-" * 70)
w, b, losses, train_time = train_momentum(X_train, y_train, lr=0.01, epochs=150, beta=0.9)
optimizers["Momentum"] = (w, b, losses, train_time)

In [None]:
# 3. Adam
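print("\n3️⃣ ADAM")
print("-" * 70)
# Note: each Adam update moves a parameter by a step on the order of lr, so with lr=0.01
# and 150 epochs on an unscaled price target the parameters may stay far from the optimum;
# a larger lr or a standardized target is worth trying if the loss barely decreases.
w, b, losses, train_time = train_adam(X_train, y_train, lr=0.01, epochs=150)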
optimizers["Adam"] = (w, b, losses, train_time)

---
# PART 7: EVALUATION
---

## 📊 METRICS USED:

### 1. **MSE (Mean Squared Error)**
```
MSE = (1/n) Σ(ŷᵢ - yᵢ)²
```
- Average of the squared errors
- Lower is better

### 2. **RMSE (Root Mean Squared Error)**
```
RMSE = √MSE
```
- Square root of the MSE
- In the same units as y (dollars)
- Easier to interpret

### 3. **MAE (Mean Absolute Error)**
```
MAE = (1/n) Σ|ŷᵢ - yᵢ|
```
- Average of the absolute errors
- Not squared, so less sensitive to outliers than MSE

### 4. **R² Score (Coefficient of Determination)**
```
R² = 1 - (SS_res / SS_tot)

Where:
SS_res = Σ(yᵢ - ŷᵢ)²  (residual sum of squares)
SS_tot = Σ(yᵢ - ȳ)²   (total sum of squares)
```

**Interpreting R²**:
- R² = 1.0 → Perfect prediction (100%)
- R² = 0.8 → The model explains 80% of the variance
- R² = 0.0 → The model is no better than always predicting the mean
- R² < 0.0 → The model is worse than always predicting the mean
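
To connect the formulas to the scikit-learn functions used in the evaluation cell, the sketch below computes each metric by hand on a small made-up example and shows the "always predict the mean" baseline. The arrays are invented for illustration:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.5])

# Metrics computed directly from the formulas
mse_manual = np.mean((y_pred - y_true) ** 2)
rmse_manual = np.sqrt(mse_manual)
mae_manual = np.mean(np.abs(y_pred - y_true))
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(mse_manual, rmse_manual, mae_manual, r2_manual)

# Always predicting the mean gives R² = 0 by construction
baseline = np.full_like(y_true, y_true.mean())
print(r2_score(y_true, baseline))
```

The hand-computed values should match `mean_squared_error`, `mean_absolute_error`, and `r2_score` on the same inputs.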

In [None]:
print("=" * 70)
print("EVALUATION ON TEST SET")
print("=" * 70)

results = []

for name, (w, b, losses, train_time) in optimizers.items():
    # Predict on the test set
    y_pred_test = predict(X_test, w, b)
    
    # Compute metrics
    test_mse = mean_squared_error(y_test, y_pred_test)
    test_rmse = np.sqrt(test_mse)
    test_mae = mean_absolute_error(y_test, y_pred_test)
    r2 = r2_score(y_test, y_pred_test)
    
    results.append({
        'Optimizer': name,
        'Train Loss': losses[-1],
        'Test RMSE': test_rmse,
        'Test MAE': test_mae,
        'R¬≤ Score': r2,
        'Time (s)': train_time
    })
    
    print(f"\n{name}:")
    print(f"  R² Score:  {r2:.4f}")
    print(f"  RMSE:      ${test_rmse:,.2f}")
    print(f"  MAE:       ${test_mae:,.2f}")
    print(f"  Time:      {train_time:.2f}s")

# DataFrame
results_df = pd.DataFrame(results)
results_df = results_df.sort_values('R¬≤ Score', ascending=False)

print("\n" + "=" * 70)
print("SUMMARY TABLE")
print("=" * 70)
results_df

---
# PART 8: VISUALIZATION
---

In [None]:
# 1. Loss Convergence
plt.figure(figsize=(12, 5))

for name, (_, _, losses, _) in optimizers.items():
    plt.plot(losses, label=name, linewidth=2)

plt.title("Loss Convergence Comparison", fontsize=14, fontweight='bold')
plt.xlabel("Epoch", fontsize=12)
plt.ylabel("Training Loss (MSE)", fontsize=12)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.yscale('log')
plt.tight_layout()
plt.show()

print("📈 The chart shows how quickly the loss drops")
print("   The faster it drops, the better!")

In [None]:
# 2. R² Score Comparison
plt.figure(figsize=(10, 5))

colors = ['#1f77b4', '#ff7f0e', '#2ca02c']
bars = plt.bar(results_df['Optimizer'], results_df['R¬≤ Score'], 
               color=colors, edgecolor='black', linewidth=1.5)

plt.ylabel("R² Score", fontsize=12)
plt.title("Test Set Performance (R² Score)", fontsize=14, fontweight='bold')
plt.ylim([0, 1])
plt.grid(True, alpha=0.3, axis='y')

# Add value labels
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
             f'{height:.4f}', ha='center', va='bottom', fontsize=11)

plt.tight_layout()
plt.show()

print("📊 R² Score: higher is better (max = 1.0)")

In [None]:
# 3. Prediction vs Actual (Best Model)
best_name = results_df.iloc[0]['Optimizer']
w_best, b_best, _, _ = optimizers[best_name]
y_pred_best = predict(X_test, w_best, b_best)

plt.figure(figsize=(8, 8))
plt.scatter(y_test, y_pred_best, alpha=0.6, s=50, edgecolors='black', linewidth=0.5)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 
         'r--', lw=2, label='Perfect Prediction')

plt.xlabel("Actual Price ($)", fontsize=12)
plt.ylabel("Predicted Price ($)", fontsize=12)
plt.title(f"Prediction vs Actual ({best_name})", fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("📍 Points close to the red line = good predictions")
print("   Points far from it = predictions that missed")

In [None]:
# 4. Error Distribution
errors = (y_test - y_pred_best).flatten()

plt.figure(figsize=(10, 5))
plt.hist(errors, bins=30, alpha=0.7, color='skyblue', edgecolor='black')
plt.axvline(x=0, color='red', linestyle='--', linewidth=2, label='Zero Error')

plt.xlabel("Prediction Error ($)", fontsize=12)
plt.ylabel("Frequency", fontsize=12)
plt.title("Error Distribution", fontsize=14, fontweight='bold')
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)

# Statistics
mean_error = np.mean(errors)
std_error = np.std(errors)
plt.text(0.05, 0.95, f'Mean: ${mean_error:,.2f}\nStd: ${std_error:,.2f}',
         transform=plt.gca().transAxes, verticalalignment='top',
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5),
         fontsize=11)

plt.tight_layout()
plt.show()

print("📊 Error distribution: ideally centered at 0")
print(f"   Mean error: ${mean_error:,.2f}")
print(f"   Std error:  ${std_error:,.2f}")

---
# PART 9: CONCLUSION & RECOMMENDATIONS
---

In [None]:
best = results_df.iloc[0]

print("=" * 70)
print("🏆 BEST OPTIMIZER")
print("=" * 70)
print(f"\nOptimizer: {best['Optimizer']}")
print(f"\nPerformance:")
print(f"  • R² Score:  {best['R¬≤ Score']:.4f}")
print(f"  • Test RMSE: ${best['Test RMSE']:,.2f}")
print(f"  • Test MAE:  ${best['Test MAE']:,.2f}")
print(f"  • Time:      {best['Time (s)']:.2f}s")

print(f"\n📊 INTERPRETATION:")
print(f"  R² = {best['R¬≤ Score']:.4f} means the model explains")
print(f"  {best['R¬≤ Score']*100:.2f}% of the variance in car prices.")
print(f"\n  RMSE = ${best['Test RMSE']:,.2f} means a typical prediction")
print(f"  error (root mean square) is about ${best['Test RMSE']:,.2f}.")

print("\n" + "=" * 70)
print("🎓 LEARNING RECOMMENDATIONS")
print("=" * 70)

print("""
1. UNDERSTAND THE MATH:
   ✓ Linear Algebra (matrix, dot product)
   ✓ Calculus (derivatives, gradients)
   ✓ Statistics (mean, variance)
   ✓ Optimization (gradient descent)

2. MASTER THE OPTIMIZERS:
   • Start with Gradient Descent (the simplest)
   • Understand Momentum (faster)
   • Use Adam in production (a reliable default)

3. BEST PRACTICES:
   ✓ Always split the data (train/test)
   ✓ Standardize features
   ✓ Monitor multiple metrics
   ✓ Visualize results

4. NEXT STEPS:
   • Learn Neural Networks
   • Explore PyTorch / TensorFlow
   • Build portfolio projects
   • Kaggle competitions

5. RESOURCES:
   • Andrew Ng - Machine Learning (Coursera)
   • Fast.ai - Practical Deep Learning
   • Papers with Code
   • Kaggle Learn
""")

print("=" * 70)
print("✅ TUTORIAL COMPLETE! HAPPY LEARNING! 🚀")
print("=" * 70)