

# Chapter 4: Training Models  

---

## 1. Introduction  

This chapter covers the basic theory and practice of training Machine Learning models. The main topics are:  
- Linear Regression (Normal Equation & Gradient Descent)  
- Polynomial Regression  
- Regularization (Ridge, Lasso, Elastic Net)  
- Logistic Regression and Softmax  



## 2. Linear Regression  

### Model equation  

$
\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n = \theta^T x
$
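
As a minimal sketch of the equation above, a prediction is a single dot product once a constant feature $x_0 = 1$ is prepended to the instance. The parameter values and the feature values below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical parameters: theta_0 (bias), theta_1, theta_2
theta = np.array([4.0, 3.0, -2.0])

# One instance with two features; x_0 = 1 is prepended so the bias
# term is absorbed into the dot product
x = np.array([1.0, 0.5, 2.0])

# theta^T x = 4 + 3 * 0.5 + (-2) * 2
y_hat = theta @ x
print(y_hat)  # 1.5
```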

### Normal Equation  

Computes the optimal parameters directly, in closed form:  

$
\theta = (X^T X)^{-1} X^T y
$

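Example code:
```python
import numpy as np

# Synthetic data: y = 4 + 3x plus Gaussian noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Prepend x0 = 1 to every instance so theta_0 acts as the bias term
X_b = np.c_[np.ones((100, 1)), X]
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
```
The resulting `theta_best` is close to [4, 3], the parameters used to generate the data; the noise prevents an exact match.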
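
The following sketch shows how the fitted parameters might be used to predict, and compares the explicit inverse with NumPy's pseudoinverse (`np.linalg.pinv`), which gives the same least-squares solution and still works when $X^T X$ is singular. The seed and the two new inputs are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))

X_b = np.c_[np.ones((100, 1)), X]

# Normal Equation with an explicit inverse
theta_inv = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

# Same least-squares solution via the Moore-Penrose pseudoinverse
theta_pinv = np.linalg.pinv(X_b) @ y

# Predict for two new inputs (x = 0 and x = 2), adding x0 = 1 first
X_new = np.array([[0.0], [2.0]])
X_new_b = np.c_[np.ones((2, 1)), X_new]
y_predict = X_new_b @ theta_pinv
print(theta_pinv.ravel(), y_predict.ravel())
```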

### Linear Regression with Scikit-Learn  

```python
from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()
lin_reg.fit(X, y)
```
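
After fitting, the learned bias and weights are exposed as `intercept_` and `coef_`. The sketch below regenerates comparable synthetic data (an assumption, so that the block runs on its own) and inspects them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)  # arbitrary seed
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))

lin_reg = LinearRegression().fit(X, y)

# intercept_ holds theta_0, coef_ holds the feature weights
print(lin_reg.intercept_, lin_reg.coef_)

# Predictions for x = 0 and x = 2
y_predict = lin_reg.predict(np.array([[0.0], [2.0]]))
print(y_predict)
```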



## 3. Gradient Descent  

Goal: minimize the cost function (MSE):  

$
MSE(\theta) = \frac{1}{m} \sum_{i=1}^{m} (\theta^T x^{(i)} - y^{(i)})^2
$

Its gradient with respect to $\theta$:  

$
\frac{\partial MSE}{\partial \theta} = \frac{2}{m} X^T (X \theta - y)
$
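
The gradient formula above translates almost line for line into Batch Gradient Descent. The following is a minimal sketch; the learning rate `eta`, the iteration count, and the synthetic data are assumptions picked for illustration, not tuned values.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
m = 100
X = 2 * rng.random((m, 1))
y = 4 + 3 * X + rng.standard_normal((m, 1))
X_b = np.c_[np.ones((m, 1)), X]  # add x0 = 1 for the bias term

eta = 0.1            # learning rate (assumed value)
n_iterations = 1000  # assumed value
theta = rng.standard_normal((2, 1))  # random initialization

for _ in range(n_iterations):
    # Gradient of the MSE over the full training set
    gradients = (2 / m) * X_b.T @ (X_b @ theta - y)
    theta = theta - eta * gradients

print(theta.ravel())
```

On this data the result should agree closely with the Normal Equation solution, since both minimize the same convex cost.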

### Stochastic Gradient Descent  

```python
from sklearn.linear_model import SGDRegressor

sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, penalty=None, eta0=0.1)
sgd_reg.fit(X, y.ravel())
```
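
To show what happens under the hood, here is a hand-written sketch of Stochastic Gradient Descent: each step uses the gradient of a single randomly chosen instance, and the learning rate shrinks over time. The schedule hyperparameters `t0` and `t1`, the epoch count, and the data are hypothetical choices; `SGDRegressor` uses its own schedule internally.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
m = 100
X = 2 * rng.random((m, 1))
y = 4 + 3 * X + rng.standard_normal((m, 1))
X_b = np.c_[np.ones((m, 1)), X]

n_epochs = 50
t0, t1 = 5, 50  # hypothetical learning-schedule hyperparameters


def learning_schedule(t):
    """Learning rate that decays as the step count t grows."""
    return t0 / (t + t1)


theta = rng.standard_normal((2, 1))
for epoch in range(n_epochs):
    for i in range(m):
        idx = rng.integers(m)          # pick one instance at random
        xi = X_b[idx:idx + 1]
        yi = y[idx:idx + 1]
        # Gradient of the squared error on this single instance
        gradients = 2 * xi.T @ (xi @ theta - yi)
        eta = learning_schedule(epoch * m + i)
        theta = theta - eta * gradients

print(theta.ravel())
```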



## 4. Polynomial Regression  

Polynomial Regression fits a Linear Regression model on polynomial features, which lets it capture non-linear relationships.  

```python
from sklearn.preprocessing import PolynomialFeatures

poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)
```
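
The transformed features are then passed to an ordinary `LinearRegression`. The sketch below assumes quadratic synthetic data ($y = 0.5x^2 + x + 2$ plus noise) so that the recovered coefficients can be compared with known values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)  # arbitrary seed
m = 100
X = 6 * rng.random((m, 1)) - 3
y = 0.5 * X**2 + X + 2 + rng.standard_normal((m, 1))

# degree=2 without bias yields two columns: x and x^2
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)

# A plain linear model on the expanded features
lin_reg = LinearRegression().fit(X_poly, y)
print(lin_reg.intercept_, lin_reg.coef_)
```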



## 5. Regularization  

### Ridge Regression  

$
J(\theta) = MSE(\theta) + \alpha \frac{1}{2} \sum_{i=1}^n \theta_i^2
$

```python
from sklearn.linear_model import Ridge

ridge_reg = Ridge(alpha=1, solver="cholesky")
ridge_reg.fit(X, y)
```
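
Ridge Regression also has a closed-form solution. The sketch below computes it with NumPy and checks it against `Ridge`. Note the assumption: it follows scikit-learn's objective (sum of squared errors plus $\alpha \sum \theta_i^2$, bias not penalized), which is scaled differently from the MSE-based cost written above, so the same `alpha` does not mean the same penalty strength in both. The data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(42)  # arbitrary seed
m = 20
X = 3 * rng.random((m, 1))
y = 1 + 0.5 * X + rng.standard_normal((m, 1)) / 1.5

alpha = 1.0
X_b = np.c_[np.ones((m, 1)), X]

# Identity matrix with a zero in the top-left corner,
# so the bias term theta_0 is not regularized
A = np.eye(2)
A[0, 0] = 0

# Solve (X^T X + alpha * A) theta = X^T y
theta_ridge = np.linalg.solve(X_b.T @ X_b + alpha * A, X_b.T @ y)

ridge_reg = Ridge(alpha=alpha, solver="cholesky").fit(X, y)
print(theta_ridge.ravel(), ridge_reg.intercept_, ridge_reg.coef_)
```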

### Lasso Regression  

$
J(\theta) = MSE(\theta) + \alpha \sum_{i=1}^n |\theta_i|
$

```python
from sklearn.linear_model import Lasso

lasso_reg = Lasso(alpha=0.1)
lasso_reg.fit(X, y)
```
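
A characteristic effect of the $\ell_1$ penalty is that it can drive the weights of unhelpful features to exactly zero. The sketch below illustrates this on hypothetical data with five features, of which only the first two influence the target; the value of `alpha` is an assumption chosen to make the effect visible.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(42)  # arbitrary seed
m = 200
X = rng.standard_normal((m, 5))

# Only the first two features matter; the other three are pure noise
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.standard_normal(m)

lasso_reg = Lasso(alpha=0.5).fit(X, y)

# The weights of the irrelevant features are set to zero,
# and the relevant ones are shrunk toward zero
print(lasso_reg.coef_)
```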

### Elastic Net  

$
J(\theta) = MSE(\theta) + r \alpha \sum_{i=1}^n |\theta_i| + \frac{1 - r}{2} \alpha \sum_{i=1}^n \theta_i^2
$

```python
from sklearn.linear_model import ElasticNet

elastic_net = ElasticNet(alpha=0.1, l1_ratio=0.5)
elastic_net.fit(X, y)
```
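
In scikit-learn, `l1_ratio` plays the role of the mix ratio $r$. As a quick sanity check of the formula, the sketch below confirms that `l1_ratio=1.0` reproduces Lasso, while an intermediate value blends both penalties. The data are synthetic and the `alpha` value is an arbitrary assumption.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)  # arbitrary seed
X = rng.standard_normal((100, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + 0.1 * rng.standard_normal(100)

lasso = Lasso(alpha=0.1).fit(X, y)

# r = 1: pure l1 penalty, equivalent to Lasso
enet_l1 = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)

# r = 0.5: equal mix of l1 and l2 penalties
enet_mix = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print(lasso.coef_, enet_l1.coef_, enet_mix.coef_)
```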



## 6. Logistic Regression  

For binary classification, the model estimates the probability that an instance belongs to the positive class:  

$
\hat{p} = \frac{1}{1 + e^{-\theta^T x}}
$

Cost function:

$
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(\hat{p}^{(i)}) + (1 - y^{(i)}) \log(1 - \hat{p}^{(i)}) \right]
$
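
The two formulas above can be evaluated directly with NumPy. The following is a small sketch; the parameter vector, the three instances, and the helper names `sigmoid` and `log_loss` are hypothetical and serve only to show the computation.

```python
import numpy as np


def sigmoid(t):
    """Logistic function: maps any score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))


def log_loss(y, p_hat):
    """Average log loss over all instances, as in J(theta)."""
    return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))


theta = np.array([-1.0, 2.0])  # hypothetical parameters (bias, weight)
X_b = np.array([[1.0, 0.0],    # each row: x0 = 1, then one feature
                [1.0, 1.0],
                [1.0, 2.0]])
y = np.array([0, 1, 1])

p_hat = sigmoid(X_b @ theta)   # estimated probabilities
cost = log_loss(y, p_hat)
print(p_hat, cost)
```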

```python
from sklearn.linear_model import LogisticRegression
from sklearn import datasets

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]
y = (iris["target"] == 2).astype(int)

log_reg = LogisticRegression()
log_reg.fit(X, y)
```
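
Once fitted, the classifier can return both estimated probabilities and hard class predictions. The sketch below repeats the setup so it runs on its own; the two flowers in `X_new` are made-up measurements (petal length and petal width in cm).

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]            # petal length, petal width
y = (iris["target"] == 2).astype(int)  # 1 if Iris virginica, else 0

log_reg = LogisticRegression().fit(X, y)

# Two hypothetical flowers: one with large petals, one with small petals
X_new = np.array([[5.5, 2.0], [1.4, 0.2]])

proba = log_reg.predict_proba(X_new)  # columns: P(not virginica), P(virginica)
pred = log_reg.predict(X_new)         # class with probability >= 0.5
print(proba, pred)
```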



## 7. Softmax Regression  

For multiclass classification, each class score $s_k(x)$ is turned into a probability with the softmax function:  

$
p_k = \frac{e^{s_k(x)}}{\sum_j e^{s_j(x)}}
$
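
The softmax function itself is only a few lines of NumPy. In this sketch the score matrix is hypothetical, and the row-wise maximum is subtracted before exponentiating, a common trick for numerical stability that leaves the result unchanged.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax: turns class scores into probabilities."""
    # Subtracting the row maximum avoids overflow in exp()
    shifted = scores - scores.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)


# Hypothetical scores for two instances and three classes
scores = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(scores)
print(probs)
```

Each row sums to 1, and equal scores yield equal probabilities.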

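```python
# Reuses X and iris from the Logistic Regression example above. For multiclass targets,
# "lbfgs" fits a multinomial (softmax) model by default; `multi_class` is deprecated since scikit-learn 1.5.
softmax_reg = LogisticRegression(solver="lbfgs", C=10)
softmax_reg.fit(X, iris["target"])
```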
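
A fitted softmax model returns one probability per class. The sketch below is self-contained and relies on the default solver of recent scikit-learn versions; the flower in `X_new` (petal length 5 cm, petal width 2 cm) is a made-up example.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = iris["target"]           # three classes: 0, 1, 2

softmax_reg = LogisticRegression(C=10).fit(X, y)

X_new = np.array([[5.0, 2.0]])  # hypothetical flower
pred = softmax_reg.predict(X_new)         # most probable class
proba = softmax_reg.predict_proba(X_new)  # one probability per class
print(pred, proba)
```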
