Description
Version
lightgbm==4.5.0
Install
pip install lightgbm
Question
I found that with a custom objective, the shrinkage of the first tree equals the learning rate, whereas with a built-in objective such as MSE, the shrinkage of the first tree equals 1.
The test code is as follows:
import numpy as np
import lightgbm as lgb
X = np.random.randn(10000, 5)
y = np.random.randn(10000)
def custom_mse_objective(preds, train_data):
    labels = train_data.get_label()
    residual = preds - labels
    grad = residual
    hess = np.ones_like(labels)
    return grad, hess
params = {
    'num_leaves': 3,
    'max_depth': 3,
    'learning_rate': 0.15,
    'verbose': -1,
    'seed': 1000,
    'linear_tree': True,
    'deterministic': True,
    'force_row_wise': True,
    'n_estimators': 100,
}
print("Training with default MSE objective...")
train_data = lgb.Dataset(X, label=y)
model_default = lgb.train(params, train_data)
print("\nTraining with custom MSE objective...")
params['objective'] = custom_mse_objective
train_data = lgb.Dataset(X, label=y)
model_custom = lgb.train(params, train_data)
print(model_default.dump_model()['tree_info'][0]['shrinkage'])
print(model_custom.dump_model()['tree_info'][0]['shrinkage'])
Output:
Training with default MSE objective...
Training with custom MSE objective...
1
0.15
The boosting code in bool GBDT::TrainOneIter shows:
if (gradients == nullptr || hessians == nullptr) {
  for (int cur_tree_id = 0; cur_tree_id < num_tree_per_iteration_; ++cur_tree_id) {
    init_scores[cur_tree_id] = BoostFromAverage(cur_tree_id, true);
  }
  ...
} else {
  ...
}
init_scores is set only when gradients or hessians is nullptr, that is, when no custom gradients are passed in and a built-in objective is in use.
The shrinkage code then shows:
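To make the branch above concrete, here is a minimal pure-Python sketch of it. It is a toy stand-in, not LightGBM's code: `initial_score` and `has_custom_gradients` are hypothetical names, the value of `K_EPSILON` is assumed, and using the label mean as the starting score for the built-in L2 objective is my understanding of what BoostFromAverage does for MSE.

```python
from statistics import fmean

K_EPSILON = 1e-15  # assumed value, standing in for LightGBM's kEpsilon


def initial_score(labels, has_custom_gradients):
    """Toy model of the init-score branch in GBDT::TrainOneIter."""
    if has_custom_gradients:
        # Gradients and hessians are supplied by the caller,
        # so no average is computed and the score stays at 0.
        return 0.0
    # Built-in L2 objective (assumption): start from the label mean.
    return fmean(labels)


labels = [0.3, -0.1, 0.4, -0.2]
builtin = initial_score(labels, has_custom_gradients=False)
custom = initial_score(labels, has_custom_gradients=True)

# Only the built-in path yields a score whose magnitude exceeds the threshold.
print(builtin, abs(builtin) > K_EPSILON)
print(custom, abs(custom) > K_EPSILON)
```

Even for labels drawn from a zero-mean distribution, as in the test code, the sample mean is almost never exactly 0, so the built-in path would normally pass the threshold check.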
if (std::fabs(init_scores[cur_tree_id]) > kEpsilon) {
  new_tree->AddBias(init_scores[cur_tree_id]);
}
AddBias is called only when fabs(init_scores[cur_tree_id]) > kEpsilon, and that call sets the shrinkage of new_tree to 1.
Thus, with a custom objective the shrinkage of the first tree is the learning rate, and with a built-in objective such as MSE it is 1. It's not clear to me whether this is deliberate design or a bug. If it is by design, could someone explain the reasoning? Thanks a lot!
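The effect on the reported shrinkage can be reproduced with a small toy model. This is only a sketch of the behavior described here, not LightGBM's implementation: `ToyTree`, `apply_shrinkage`, `add_bias`, and `first_tree_shrinkage` are hypothetical names, the value of `K_EPSILON` is assumed, and the sketch assumes the learning rate is applied to the tree before the bias check.

```python
K_EPSILON = 1e-15  # assumed value, standing in for LightGBM's kEpsilon


class ToyTree:
    """Tracks only the shrinkage field of a tree (hypothetical model)."""

    def __init__(self):
        self.shrinkage = 1.0

    def apply_shrinkage(self, rate):
        # Scaling by the learning rate is recorded in the shrinkage field.
        self.shrinkage *= rate

    def add_bias(self, value):
        # As described in this issue: adding a bias resets shrinkage to 1.
        self.shrinkage = 1.0


def first_tree_shrinkage(learning_rate, init_score):
    tree = ToyTree()
    tree.apply_shrinkage(learning_rate)
    if abs(init_score) > K_EPSILON:
        tree.add_bias(init_score)
    return tree.shrinkage


# Built-in objective: non-zero init score, so the bias is added.
print(first_tree_shrinkage(0.15, init_score=0.1))  # 1.0
# Custom objective: init score stays 0, so the bias is skipped.
print(first_tree_shrinkage(0.15, init_score=0.0))  # 0.15
```

The two printed values match the 1 and 0.15 seen in the output of the test code.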