# AutoGluon Ensemble: GBM, LGBM, XGBoost, CatBoost, RF, Voting, Stacking

This notebook uses AutoGluon to train several regression models and evaluates the performance of its ensembles (voting and stacking).
Reference: `Plus_7_ensemble.ipynb`

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from autogluon.tabular import TabularDataset, TabularPredictor

np.set_printoptions(precision=4)


### 1. Data Loading and Preprocessing

In [2]:
data = fetch_california_housing()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['MedHouseVal'] = data.target

print("Data shape:", df.shape)
df.head()

Data shape: (20640, 9)


|  | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude | MedHouseVal |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.02381 | 322.0 | 2.555556 | 37.88 | -122.23 | 4.526 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.97188 | 2401.0 | 2.109842 | 37.86 | -122.22 | 3.585 |
| 2 | 7.2574 | 52.0 | 8.288136 | 1.073446 | 496.0 | 2.80226 | 37.85 | -122.24 | 3.521 |
| 3 | 5.6431 | 52.0 | 5.817352 | 1.073059 | 558.0 | 2.547945 | 37.85 | -122.25 | 3.413 |
| 4 | 3.8462 | 52.0 | 6.281853 | 1.081081 | 565.0 | 2.181467 | 37.85 | -122.25 | 3.422 |


In [3]:
# Split into training and test sets
train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)

print("Train size:", train_data.shape)
print("Test size:", test_data.shape)

Train size: (16512, 9)
Test size: (4128, 9)


### 2. Model Training Configuration (Hyperparameters & Ensemble)

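- **Models used**: GBM (LightGBM), XGB (XGBoost), CAT (CatBoost), RF (Random Forest)
- **Ensemble strategies**:
    - **Voting (Weighted Ensemble)**: By default, AutoGluon builds a weighted average (Weighted Ensemble) of the best-performing models.
    - **Stacking**: Setting `num_bag_folds` enables K-fold bagging, and setting `num_stack_levels` to 1 or more adds stacking layers on top of the bagged models.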
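
The idea behind the weighted ensemble can be sketched in plain Python. This is an illustration only, not AutoGluon's implementation: the targets, predictions, and 0.5/0.5 weights are made-up values, and `mse` and `weighted_average` are hypothetical helper names.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_average(predictions, weights):
    """Blend per-model predictions row by row using the given weights."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*predictions)
    ]


# Toy data (hypothetical): one model overshoots, the other undershoots.
y_true = [1.0, 2.0, 3.0, 4.0]
pred_a = [1.2, 2.2, 3.2, 4.2]
pred_b = [0.8, 1.8, 2.8, 3.8]

blend = weighted_average([pred_a, pred_b], weights=[0.5, 0.5])

print("MSE model A:", mse(y_true, pred_a))
print("MSE model B:", mse(y_true, pred_b))
print("MSE blend:  ", mse(y_true, blend))
```

Because the two models err in opposite directions, the blend scores better than either one alone. Stacking goes a step further: instead of applying fixed weights, a second-layer model is trained with the base models' predictions as input features.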

In [4]:
label = 'MedHouseVal'
eval_metric = 'mean_squared_error'  # evaluation metric for the regression task

# 1. Specify the models to train (GBM, XGB, CAT, RF)
# In AutoGluon, 'GBM' refers to LightGBM.
hyperparameters = {
    'GBM': {}, 
    'XGB': {}, 
    'CAT': {}, 
    'RF': {}
}

# 2. Bagging and stacking settings
# num_bag_folds >= 2 enables K-fold bagging; stacking additionally requires num_stack_levels >= 1.
num_bag_folds = 5
num_stack_levels = 1  # number of stacking layers (0: bagging only, 1 or more: stacking)

save_path = 'ag_models_california_housing'
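
To make the bagging setting concrete, the sketch below shows how K-fold splitting gives every training row exactly one out-of-fold slot. This is what lets a stacking layer train on predictions that the base models did not see during fitting. It is a simplified illustration, not AutoGluon's internal splitter: `kfold_indices` is a hypothetical helper, and the contiguous, unshuffled folds are an assumption made for brevity.

```python
def kfold_indices(n_rows, n_folds):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_rows, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds get one additional row.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, val_idx
        start += size


n_rows, n_folds = 16512, 5  # training rows and num_bag_folds from this notebook
times_validated = [0] * n_rows

for train_idx, val_idx in kfold_indices(n_rows, n_folds):
    # A fold model would be fit on train_idx and predict only on val_idx.
    for i in val_idx:
        times_validated[i] += 1

# Check that every row receives exactly one out-of-fold prediction.
print(all(count == 1 for count in times_validated))
```
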

### 3. Run Training

In [5]:
predictor = TabularPredictor(label=label, eval_metric=eval_metric, path=save_path).fit(
    train_data,
    hyperparameters=hyperparameters,
    num_bag_folds=num_bag_folds,
    num_stack_levels=num_stack_levels,
    fit_weighted_ensemble=True  # enable the weighted ensemble (similar in effect to voting)
)

Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.5.0
Python Version:     3.11.14
Operating System:   Windows
Platform Machine:   AMD64
Platform Version:   10.0.26200
CPU Count:          32
Pytorch Version:    2.9.1+cpu
CUDA Version:       CUDA is not available
Memory Avail:       72.78 GB / 127.76 GB (57.0%)
Disk Space Avail:   850.97 GB / 1861.70 GB (45.7%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme'  : New in v1.5: The state-of-the-art for tabular data. Massively better than 'best' on datasets <100000 samples by using new Tabular Foundation Models (TFMs) meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, TabDPT, and TabM. Requires a GPU and `pip install autogluon.tabular[tabarena]` to install TabPFN, TabICL, and TabDPT.
	presets

### 4. Performance Evaluation and Leaderboard

Compare the performance of the trained models.
- `WeightedEnsemble_L2` (or `L3`): the ensemble model (plays the role of voting)
- `LightGBM_BAG_L1` and similar: individual bagged/stacked models
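
The model names encode the model family, whether the model is bagged, and its stack level. The sketch below splits a name into those parts. The pattern is inferred from the names that appear in this notebook's leaderboard, so treat it as an assumption rather than a documented naming contract; `parse_model_name` is a hypothetical helper.

```python
import re

# Pattern inferred from the names in this notebook's leaderboard (an assumption).
NAME_PATTERN = re.compile(r"^(?P<family>.+?)(?P<bag>_BAG)?_L(?P<level>\d+)$")


def parse_model_name(name):
    """Split a leaderboard model name into family, bagging flag, and stack level."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized model name: {name}")
    return {
        "family": match.group("family"),
        "bagged": match.group("bag") is not None,
        "level": int(match.group("level")),
    }


for name in ["CatBoost_BAG_L1", "XGBoost_BAG_L2", "WeightedEnsemble_L3"]:
    print(name, "->", parse_model_name(name))
```
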

In [6]:
# Print the leaderboard (evaluated on the test data)
leaderboard = predictor.leaderboard(test_data, silent=False)

print("\n--- Top Models ---")
print(leaderboard.head())

                 model  score_test  score_val         eval_metric  pred_time_test  pred_time_val   fit_time  pred_time_test_marginal  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0  WeightedEnsemble_L3   -0.176229  -0.186833  mean_squared_error        2.211723       1.313101  80.844694                 0.013969                0.000505           0.042846            3       True         10
1      CatBoost_BAG_L1   -0.177661  -0.190267  mean_squared_error        0.173399       0.034732  66.866957                 0.173399                0.034732          66.866957            1       True          3
2       XGBoost_BAG_L2   -0.177699  -0.193632  mean_squared_error        1.736716       0.768528  74.545679                 0.203737                0.014827           1.387385            2       True          9
3  WeightedEnsemble_L2   -0.177895  -0.188066  mean_squared_error        1.546361       0.754210  73.179453                 0.013382                0.000510

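### 5. Final Evaluation

Check the final evaluation scores on the full test set. AutoGluon reports metrics in a higher-is-better form, so error metrics such as MSE are shown with a negative sign.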

In [7]:
results = predictor.evaluate(test_data)
print("Final Test Results:", results)

Final Test Results: {'mean_squared_error': -0.1762292582818425, 'root_mean_squared_error': np.float64(-0.41979668684000176), 'mean_absolute_error': -0.26878763767935043, 'r2': 0.8655158342104664, 'pearsonr': 0.930390954011604, 'median_absolute_error': -0.16897896456718448}
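
Since the error metrics are printed with a negative sign, a small helper can restore conventional positive values. In this sketch the `results` dictionary is copied (rounded) from the output above, and `to_positive_errors` is a hypothetical helper, not part of AutoGluon.

```python
import math

# Values copied (rounded) from the evaluate() output above.
results = {
    "mean_squared_error": -0.176229,
    "root_mean_squared_error": -0.419797,
    "mean_absolute_error": -0.268788,
    "r2": 0.865516,
    "pearsonr": 0.930391,
    "median_absolute_error": -0.168979,
}


def to_positive_errors(scores):
    """Flip the sign of metrics whose name ends in 'error'; leave the rest unchanged."""
    return {
        name: -value if name.endswith("error") else value
        for name, value in scores.items()
    }


readable = to_positive_errors(results)
print(readable)

# Sanity check: RMSE should match the square root of MSE (up to rounding).
print(math.isclose(
    math.sqrt(readable["mean_squared_error"]),
    readable["root_mean_squared_error"],
    abs_tol=1e-5,
))
```
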


### 6. Inspecting Predictions

In [8]:
y_pred = predictor.predict(test_data)
print("First 5 predictions:\n", y_pred.head())
print("\nFirst 5 actual values:\n", test_data[label].head())

First 5 predictions:
 20046    0.551353
3024     0.735540
15663    5.034648
20484    2.451155
9814     2.524682
Name: MedHouseVal, dtype: float32

First 5 actual values:
 20046    0.47700
3024     0.45800
15663    5.00001
20484    2.18600
9814     2.78000
Name: MedHouseVal, dtype: float64
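
To compare the two printouts row by row, the sketch below computes the absolute error for each of the five rows. The values are copied from the output above; the target `MedHouseVal` is expressed in units of $100,000. Five rows are far too few to judge the model, so this is only a quick plausibility check.

```python
# Values copied from the output above.
predicted = [0.551353, 0.735540, 5.034648, 2.451155, 2.524682]
actual = [0.47700, 0.45800, 5.00001, 2.18600, 2.78000]

abs_errors = [abs(p - a) for p, a in zip(predicted, actual)]
for p, a, e in zip(predicted, actual, abs_errors):
    print(f"predicted={p:.3f}  actual={a:.3f}  abs_error={e:.3f}")

# Mean absolute error over these five rows only (not the full test set).
mae_5 = sum(abs_errors) / len(abs_errors)
print("MAE (5 rows):", round(mae_5, 4))
```
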
