Callback functions (passed in via the callbacks parameter), with their xgboost equivalents
   * record_evaluation ===> xgboost (evals_result)
   * early_stopping ===> xgboost (early_stopping_rounds)
   * log_evaluation ===> xgboost (verbose_eval)

In [81]:
import lightgbm as lgb
from sklearn import datasets
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

In [82]:
# Fetch the dataset once and keep the first 3000 samples.
X, y = datasets.fetch_covtype(return_X_y=True)
X, y = X[:3000], y[:3000]
X_train, X_test, y_train, y_test = train_test_split(X, y)

print(X_train.shape)
print(y_train.shape)
print(np.unique(y_train))  # 7-class classification task

(2250, 54)
(2250,)
[1 2 3 4 5 6 7]


In [83]:
enc = OrdinalEncoder()
y_train_enc = enc.fit_transform(y_train.reshape(-1, 1)).reshape(-1, )
y_test_enc = enc.transform(y_test.reshape(-1, 1)).reshape(-1, )
print(np.unique(y_train_enc))

[0. 1. 2. 3. 4. 5. 6.]


In [84]:
train_dataset = lgb.Dataset(data=X_train, label=y_train_enc)

In [85]:
evals_result = {}  # stores the evaluation metric results (built-in and custom)
# Create a callback that records the evaluation history into eval_result.
re_func = lgb.record_evaluation(eval_result=evals_result)

val_dataset = lgb.Dataset(data=X_test, label=y_test_enc)
eval_set = [train_dataset, val_dataset]

params = {"objective": "multiclass",
          "num_class": 7,
          "metric": "multi_error",
          "verbosity": -1}
lgb.train(params=params,
          train_set=train_dataset,
          valid_sets=eval_set,
          num_boost_round=10,
          # List of callback functions that are applied at each iteration.
          callbacks=[re_func])
'''
After training finishes, evals_result has the following structure:
{
 'training':
     {
      'multi_error': [0.48253, 0.35953, ...]
     },
 'valid_1':
     {
      'multi_error': [0.480385, 0.357756, ...]
     }
}
'''
evals_result

{'training': OrderedDict([('multi_error',
               [0.4266666666666667,
                0.248,
                0.164,
                0.13466666666666666,
                0.12088888888888889,
                0.10844444444444444,
                0.09688888888888889,
                0.08933333333333333,
                0.08266666666666667,
                0.07866666666666666])]),
 'valid_1': OrderedDict([('multi_error',
               [0.48,
                0.344,
                0.268,
                0.232,
                0.22666666666666666,
                0.21866666666666668,
                0.20933333333333334,
                0.19733333333333333,
                0.19733333333333333,
                0.19466666666666665])])}
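
The recorded history is a plain nested mapping, so the best round can be looked up without LightGBM. The sketch below copies the `valid_1` values printed above; the helper name `best_round` is hypothetical and not part of LightGBM.

```python
# Validation history copied from the evals_result output above.
evals_result = {
    "valid_1": {
        "multi_error": [
            0.48, 0.344, 0.268, 0.232, 0.22666666666666666,
            0.21866666666666668, 0.20933333333333334,
            0.19733333333333333, 0.19733333333333333, 0.19466666666666665,
        ]
    }
}


def best_round(history, dataset, metric):
    """Return (1-based round, score) of the lowest recorded metric value."""
    scores = history[dataset][metric]
    # min() with a key returns the first index holding the smallest score.
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return best_index + 1, scores[best_index]


round_number, score = best_round(evals_result, "valid_1", "multi_error")
print(round_number, score)
```

This assumes a lower-is-better metric such as multi_error; for a higher-is-better metric, `max` would be used instead.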

In [86]:
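# Create a callback that activates early stopping.
# Note: lgb.train defaults to num_boost_round=100, so a patience of 200 rounds
# can never trigger here; the log below confirms it ("Did not meet early stopping").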
es_func = lgb.early_stopping(stopping_rounds=200)

val_dataset = lgb.Dataset(data=X_test, label=y_test_enc)
eval_set = [train_dataset, val_dataset]

params = {"objective": "multiclass",
          "num_class": 7,
          "metric": "multi_error",
          "verbosity": -1}
lgb.train(params=params,
          train_set=train_dataset,
          valid_sets=eval_set,
          callbacks=[es_func])

Training until validation scores don't improve for 200 rounds
Did not meet early stopping. Best iteration is:
[49]	training's multi_error: 0	valid_1's multi_error: 0.148


<lightgbm.basic.Booster at 0x23b3a07c640>
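
To make the meaning of stopping_rounds concrete, here is a simplified pure-Python sketch of the patience rule for a lower-is-better metric. It is an illustration only, not LightGBM's implementation; the function name `simulate_early_stopping` and the score list are hypothetical.

```python
def simulate_early_stopping(scores, stopping_rounds):
    """Return (best_round, stopped_early) for a lower-is-better metric.

    Training stops once `stopping_rounds` rounds have passed since the best
    round without a strict improvement. Rounds are 1-based.
    """
    best_score = float("inf")
    best_round = 0
    for round_number, score in enumerate(scores, start=1):
        if score < best_score:
            # Strict improvement: remember this round and reset the patience.
            best_score, best_round = score, round_number
        elif round_number - best_round >= stopping_rounds:
            return best_round, True
    return best_round, False


# Hypothetical validation errors: the best value occurs in round 2.
hypothetical_scores = [0.5, 0.4, 0.41, 0.42, 0.43]

print(simulate_early_stopping(hypothetical_scores, stopping_rounds=3))
print(simulate_early_stopping(hypothetical_scores, stopping_rounds=200))
```

With a patience of 3 the loop stops at round 5 and reports round 2 as the best; with a patience of 200 it never stops early, which mirrors the "Did not meet early stopping" message in the cell above.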

In [87]:
# Create a callback that logs the evaluation results.
le_func = lgb.log_evaluation(
    # period (int, optional (default=1)) –
    # The period to log the evaluation results.
    # The last boosting stage or the boosting stage found by using early_stopping callback is also logged.
    period=9)

val_dataset = lgb.Dataset(data=X_test, label=y_test_enc)
eval_set = [train_dataset, val_dataset]

params = {"objective": "multiclass",
          "num_class": 7,
          "metric": "multi_error"}

lgb.train(params=params,
          train_set=train_dataset,
          valid_sets=eval_set,
          callbacks=[le_func])

You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 1871
[LightGBM] [Info] Number of data points in the train set: 2250, number of used features: 33
[LightGBM] [Info] Start training from score -1.818788
[LightGBM] [Info] Start training from score -1.206940
[LightGBM] [Info] Start training from score -2.487577
[LightGBM] [Info] Start training from score -3.186086
[LightGBM] [Info] Start training from score -1.320091
[LightGBM] [Info] Start training from score -2.289340
[LightGBM] [Info] Start training from score -3.083957
[9]	training's multi_error: 0.0826667	valid_1's multi_error: 0.197333
[18]	training's multi_error: 0.0417778	valid_1's multi_error: 0.164
[27]	training's multi_error: 0.0164444	valid_1's multi_error: 0.154667
[36]	training's multi_error: 0.00222222	valid_1's multi_error: 0.146667
[45]	training's multi_error: 0.000444444	valid_1's multi_error: 0.149333
[54]	training's multi_error: 0	valid_1's multi_error: 0.141333
[63]	training's multi

<lightgbm.basic.Booster at 0x23b3b3b37f0>