## Python Machine Learning
# Gradient Boosting

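- Like a random forest, gradient boosting builds many trees. The difference is that it does not build them all at once: it builds one tree, then builds the next tree so as to reduce the **error** of the previous one, and repeats this process **stage by stage**.
- Gradient boosting has won many machine learning competitions. In a sense, it is a method that squeezes out every last bit of possible improvement in order to raise the score.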
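
The stage-by-stage idea above can be sketched by hand. This is a simplified illustration, not how `GradientBoostingClassifier` is implemented: it assumes a regression problem with squared-error loss (where the "error" to correct is simply the residual), and the toy data, `learning_rate`, and `n_trees` values are made up for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (hypothetical, for illustration only)
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1   # how much of each tree's correction is applied
n_trees = 100

# Stage 0: start from a constant prediction (the mean of the targets)
prediction = np.full_like(y, y.mean())
initial_mse = np.mean((y - prediction) ** 2)

trees = []
for _ in range(n_trees):
    # For squared error, the negative gradient is the residual
    residual = y - prediction
    # Fit a shallow tree to the current error of the ensemble
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residual)
    # Add a damped version of the new tree's output to the ensemble
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

final_mse = np.mean((y - prediction) ** 2)
print("MSE before boosting:", initial_mse)
print("MSE after boosting: ", final_mse)
```

Each tree only sees what the previous trees got wrong, which is why the training error keeps shrinking as trees are added.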

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

cancer = load_breast_cancer()

In [7]:
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target)

model = GradientBoostingClassifier()
model.fit(X_train, y_train)

train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
display(train_score, test_score)

1.0

0.972027972027972

In [8]:
model

GradientBoostingClassifier(criterion='friedman_mse', init=None,
              learning_rate=0.1, loss='deviance', max_depth=3,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, n_estimators=100,
              presort='auto', random_state=None, subsample=1.0, verbose=0,
              warm_start=False)

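- The arguments to pay attention to are **n_estimators**, **max_depth**, and **learning_rate**.
- n_estimators is the number of trees.
- A common strategy with gradient boosting is to keep the trees shallow (a small max_depth) and increase the number of trees.
- learning_rate is the **learning rate**: it controls how strongly each new tree corrects the errors of the previous trees, and therefore how quickly the error is reduced.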
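
To get a feel for `learning_rate`, one option is to train the same model with a few different values and compare the scores. The sketch below is only an illustration: the three learning rates and the fixed `random_state=0` are arbitrary choices made here for reproducibility, not values from the notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

results = {}
for lr in (0.01, 0.1, 1.0):
    # Same number of shallow trees each time; only the learning rate changes
    model = GradientBoostingClassifier(
        n_estimators=100, max_depth=1, learning_rate=lr, random_state=0)
    model.fit(X_train, y_train)
    results[lr] = (model.score(X_train, y_train),
                   model.score(X_test, y_test))

for lr, (train_score, test_score) in results.items():
    print(f"learning_rate={lr}: train={train_score:.3f}, test={test_score:.3f}")
```

A smaller learning rate generally needs more trees to reach the same training score, so `learning_rate` and `n_estimators` are usually tuned together.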

In [15]:
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target)

model = GradientBoostingClassifier(n_estimators=1000, max_depth=1)
model.fit(X_train, y_train)

train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
display(train_score, test_score)

1.0

0.965034965034965
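
With many shallow trees, it can be useful to see how the test accuracy changes as trees are added. A minimal sketch using `staged_predict`, which yields the ensemble's predictions after each boosting stage; the choice of 200 trees and `random_state=0` is an assumption made here to keep the example quick and reproducible.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

n_trees = 200
model = GradientBoostingClassifier(
    n_estimators=n_trees, max_depth=1, random_state=0)
model.fit(X_train, y_train)

# One accuracy value per boosting stage (after 1 tree, 2 trees, ...)
staged_acc = [accuracy_score(y_test, y_pred)
              for y_pred in model.staged_predict(X_test)]

for k in (1, 10, 50, 100, 200):
    print(f"{k:4d} trees: test accuracy = {staged_acc[k - 1]:.3f}")
```

If the curve flattens early, the extra trees add training cost without improving the test score.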

In [6]:
model.predict(X_test)[:10]

array([0, 1, 1, 1, 0, 1, 1, 1, 1, 1])

In [8]:
model.predict_proba(X_test)[:10]

array([[7.58956876e-01, 2.41043124e-01],
       [6.13339911e-04, 9.99386660e-01],
       [7.35419302e-02, 9.26458070e-01],
       [9.13115064e-04, 9.99086885e-01],
       [9.99999974e-01, 2.59774434e-08],
       [2.77673252e-01, 7.22326748e-01],
       [8.12746925e-07, 9.99999187e-01],
       [4.18801451e-06, 9.99995812e-01],
       [7.32805121e-04, 9.99267195e-01],
       [1.22129743e-08, 9.99999988e-01]])

In [5]:
model.decision_function(X_test)

array([ -1.14696911,   7.39597774,   2.53351307,   6.99773512,
       -17.46603721,   0.95603252,  14.02284525,  12.38327961,
         7.21789768,  18.22076698, -12.46315014, -14.80908752,
         5.17955542, -11.79545634,   8.23071406,   1.3545579 ,
        11.54967041,  10.21032626,  10.16142301,   6.9859555 ,
        11.86801217,  10.38562486,  13.33057021,   3.28684357,
        16.74808022, -12.16589492,  10.05440505,  20.77551465,
        16.49929188, -11.33291215, -12.3372294 , -16.07217935,
       -22.50221104,  10.64466871,   9.19765515,  -4.91615793,
        15.7068671 ,   7.49983341,  13.38033442,   9.80109084,
        -8.27546836,   4.27489861,   9.76038265,   3.95758818,
         4.87044473,   6.22090072, -13.202101  ,  11.58843377,
       -13.53973336,   8.63773945, -14.5157607 , -13.58571723,
       -19.50132996,  12.02223109,  20.89947984,   7.45520139,
       -18.55870794, -11.91896697, -13.5485563 ,  15.13367513,
        15.67721502,   8.08982557,   3.52131676, -17.85
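
The three outputs above are related. For binary classification with the default loss, `decision_function` returns a raw score, `predict_proba` is the logistic sigmoid of that score, and `predict` picks class 1 when the score is positive. For example, the first score above, -1.147, maps to a class-1 probability of about 0.241, matching the first row of `predict_proba`. The sketch below checks this relationship; it assumes the default loss and uses a fixed `random_state=0`, which is a choice made here rather than something from the notebook.

```python
import numpy as np
from scipy.special import expit  # logistic sigmoid
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

raw = np.ravel(model.decision_function(X_test))  # raw scores (log-odds)
proba = model.predict_proba(X_test)              # columns: P(class 0), P(class 1)
labels = model.predict(X_test)

# Sigmoid of the raw score gives the probability of class 1
proba_matches = np.allclose(expit(raw), proba[:, 1])
# A positive raw score corresponds to predicting class 1
labels_match = np.array_equal(labels, (raw > 0).astype(int))

print("sigmoid(decision_function) == predict_proba[:, 1]:", proba_matches)
print("predict == (decision_function > 0):", labels_match)
```

The magnitude of the raw score indicates how confident the model is: scores far from zero correspond to probabilities very close to 0 or 1.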

In [7]:
help(GradientBoostingClassifier)

Help on class GradientBoostingClassifier in module sklearn.ensemble.gradient_boosting:

class GradientBoostingClassifier(BaseGradientBoosting, sklearn.base.ClassifierMixin)
 |  Gradient Boosting for classification.
 |  
 |  GB builds an additive model in a
 |  forward stage-wise fashion; it allows for the optimization of
 |  arbitrary differentiable loss functions. In each stage ``n_classes_``
 |  regression trees are fit on the negative gradient of the
 |  binomial or multinomial deviance loss function. Binary classification
 |  is a special case where only a single regression tree is induced.
 |  
 |  Read more in the :ref:`User Guide <gradient_boosting>`.
 |  
 |  Parameters
 |  ----------
 |  loss : {'deviance', 'exponential'}, optional (default='deviance')
 |      loss function to be optimized. 'deviance' refers to
 |      deviance (= logistic regression) for classification
 |      with probabilistic outputs. For loss 'exponential' gradient
 |      boosting recovers the AdaBoost a