## XGBoost (Extreme Gradient Boosting)

- Core idea: minimize the training loss plus a regularization term $\Omega(f_t)$ that penalizes the complexity of the tree $f_t$ added at boosting round $t$:
$$
\mathcal{L} =
\sum_{i=1}^{n} l(y_i, \hat{y}_i)
\;+\;
\Omega(f_t)
$$

- The regularization term is defined as

$$
\Omega(f_t)
=
\gamma T
+
\frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2
$$

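- where $T$ is the number of leaves in the tree, $w_j$ is the weight (score) of leaf $j$, $\gamma$ is the penalty per leaf, and $\lambda$ is the L2 penalty on the leaf weights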
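As a quick check of the formula, the penalty can be computed directly from a list of leaf weights. This is a minimal sketch; `tree_complexity` and the example weights are illustrative names and values, not part of the XGBoost API.

```python
import numpy as np

def tree_complexity(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lambda * sum_j w_j^2."""
    w = np.asarray(leaf_weights, dtype=float)
    # w.size is T, the number of leaves
    return gamma * w.size + 0.5 * lam * np.sum(w ** 2)

# A tree with three leaves: 1.0 * 3 + 0.5 * 1.0 * (0.25 + 0.25 + 1.0)
omega = tree_complexity([0.5, -0.5, 1.0], gamma=1.0, lam=1.0)
print(omega)
```

A larger `gamma` makes every extra leaf more expensive, while a larger `lam` shrinks the leaf weights toward zero.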

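#### Steps:
1. Write the objective for round $t$: training loss plus the regularization term $\Omega(f_t)$.

2. Apply a second-order Taylor expansion of the loss around the previous prediction $\hat{y}_i^{(t-1)}$ to get an approximate objective expressed through each sample's gradient $g_i$ and Hessian $h_i$.

3. Group the samples by the leaf node they fall into, so the sum over samples becomes a sum over leaves.

4. Solve for the optimal weight of each leaf and substitute it back to obtain the scoring function for a tree structure, which is then used to evaluate candidate splits.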
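The steps above can be sketched numerically. The sketch assumes the binary logistic loss, for which $g_i = p_i - y_i$ and $h_i = p_i(1 - p_i)$, and uses the standard closed forms $w_j^* = -G_j / (H_j + \lambda)$ and leaf score $-\tfrac{1}{2} G_j^2 / (H_j + \lambda)$, where $G_j$ and $H_j$ are the sums of $g_i$ and $h_i$ in leaf $j$. The function names and the toy data are illustrative, not XGBoost's own code.

```python
import numpy as np

def grad_hess(y, raw_pred):
    """Step 2: first and second derivatives of the logistic loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + np.exp(-raw_pred))
    return p - y, p * (1.0 - p)

def leaf_weight(g, h, lam):
    """Step 4: optimal weight of one leaf, w* = -G / (H + lambda)."""
    return -g.sum() / (h.sum() + lam)

def leaf_score(g, h, lam):
    """Step 4: one leaf's contribution to the objective, -0.5 * G^2 / (H + lambda)."""
    return -0.5 * g.sum() ** 2 / (h.sum() + lam)

def split_gain(g, h, left_mask, lam, gamma):
    """Objective reduction from splitting one leaf in two, minus the per-leaf penalty."""
    parent = leaf_score(g, h, lam)
    # Step 3: samples are grouped by the child leaf they fall into
    children = (leaf_score(g[left_mask], h[left_mask], lam)
                + leaf_score(g[~left_mask], h[~left_mask], lam))
    return parent - children - gamma

# Toy data: four samples, previous raw prediction 0 (probability 0.5) for all
y = np.array([0.0, 0.0, 1.0, 1.0])
g, h = grad_hess(y, np.zeros(4))
left_mask = np.array([True, True, False, False])  # candidate split: negatives go left

w_left = leaf_weight(g[left_mask], h[left_mask], lam=1.0)
w_right = leaf_weight(g[~left_mask], h[~left_mask], lam=1.0)
gain = split_gain(g, h, left_mask, lam=1.0, gamma=0.0)
print(w_left, w_right, gain)
```

A split is worth keeping only when its gain is positive, which is how `gamma` acts as a minimum required loss reduction.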

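```python
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Assumes X (features) and Y (binary labels) are already loaded.
# Hold out 20% of the data as the test set.
X_train_all, X_test, Y_train_all, Y_test = train_test_split(
    X, Y,
    test_size=0.2,
    random_state=42,
    stratify=Y
)

# Split the remainder into training and validation sets.
X_train, X_valid, Y_train, Y_valid = train_test_split(
    X_train_all, Y_train_all,
    test_size=0.2,
    random_state=42,
    stratify=Y_train_all
)

model = XGBClassifier(
    n_estimators=2000,
    learning_rate=0.03,
    max_depth=6,
    subsample=0.8,
    colsample_bytree=0.8,
    # for binary classification
    objective="binary:logistic",
    eval_metric="logloss",
    # early stopping to prevent overfitting
    # (a constructor argument in xgboost >= 1.6)
    early_stopping_rounds=50,
)

# eval_set belongs to fit(), not to the constructor
model.fit(
    X_train, Y_train,
    eval_set=[(X_valid, Y_valid)],
    verbose=False,
)
```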
---
With stratified K-fold cross-validation:

```python
from xgboost import XGBClassifier
from sklearn.model_selection import StratifiedKFold
import numpy as np

# Assumes X and y are pandas objects (they are indexed with .iloc)
# and X_test is a held-out feature matrix.
model_params = {
    "n_estimators": 2000,
    "learning_rate": 0.03,
    "max_depth": 6,
    "min_child_weight": 1,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "gamma": 0,
    "reg_lambda": 1,
    "eval_metric": "logloss",
    "tree_method": "hist",
    # constructor argument in xgboost >= 1.6; no longer accepted by fit() in 2.x
    "early_stopping_rounds": 100,
}

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

oof = np.zeros(len(X))   # out-of-fold predictions, one per training row
test_preds = []          # one test-set prediction vector per fold

for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    print(f"FOLD {fold}")

    model = XGBClassifier(**model_params)

    # The validation fold doubles as the early-stopping set.
    model.fit(
        X.iloc[train_idx], y.iloc[train_idx],
        eval_set=[(X.iloc[val_idx], y.iloc[val_idx])],
        verbose=False
    )

    oof[val_idx] = model.predict_proba(X.iloc[val_idx])[:, 1]
    test_preds.append(model.predict_proba(X_test)[:, 1])

# Average the per-fold test predictions.
final_pred = np.mean(test_preds, axis=0)
```
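Each row's entry in `oof` comes from a model that did not train on that row, so scoring `oof` against the labels gives a cross-validated estimate of the metric. A minimal NumPy-only sketch follows; `binary_log_loss` and the toy arrays are hypothetical stand-ins for the real `oof` and `y`. Because each fold's validation set is also used for early stopping here, the estimate may be somewhat optimistic.

```python
import numpy as np

def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0) for predictions of exactly 0 or 1
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

# Toy stand-ins for y and oof
y_toy = np.array([0, 1, 1, 0])
oof_toy = np.array([0.1, 0.9, 0.8, 0.3])

cv_logloss = binary_log_loss(y_toy, oof_toy)
print(f"OOF logloss: {cv_logloss:.4f}")
```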