Path smoothing (#2950)
* Path smoothing

* Try to fix issue with gpu version.

* Fix failing CI for R package.

* Minor fixes.

* Minor refactor.

* Restore old code to get CI working.

* Fix style issues.

* Fix ci for R package.

* Minor fixes for docs and code style.

* Update docs.
btrotta committed May 3, 2020
1 parent 6823af9 commit e50a915
Showing 13 changed files with 314 additions and 141 deletions.
2 changes: 1 addition & 1 deletion .ci/test_r_package.sh
@@ -98,7 +98,7 @@ if grep -q -R "WARNING" "$LOG_FILE_NAME"; then
exit -1
fi

-ALLOWED_CHECK_NOTES=2
+ALLOWED_CHECK_NOTES=3
NUM_CHECK_NOTES=$(
cat ${LOG_FILE_NAME} \
| grep -e '^Status: .* NOTE.*' \
2 changes: 2 additions & 0 deletions docs/Parameters-Tuning.rst
@@ -81,4 +81,6 @@ Deal with Over-fitting

- Try ``extra_trees``

- Try increasing ``path_smooth``

.. _Optuna: https://medium.com/optuna/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization-8b7095e99258
16 changes: 16 additions & 0 deletions docs/Parameters.rst
@@ -522,6 +522,22 @@ Learning Control Parameters

- applied once per forest

- ``path_smooth`` :raw-html:`<a id="path_smooth" title="Permalink to this parameter" href="#path_smooth">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = double, constraints: ``path_smooth >= 0.0``

- controls smoothing applied to tree nodes

- helps prevent overfitting on leaves with few samples

- if set to zero, no smoothing is applied

- if ``path_smooth > 0`` then ``min_data_in_leaf`` must be at least ``2``

- larger values give stronger regularisation

- the weight of each node is ``w * (n / path_smooth) / (n / path_smooth + 1) + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node

- note that the parent output ``w_p`` itself has smoothing applied, unless it is the root node, so that the smoothing effect accumulates with the tree depth

- ``verbosity`` :raw-html:`<a id="verbosity" title="Permalink to this parameter" href="#verbosity">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``verbose``

- controls the level of LightGBM's verbosity
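
The ``path_smooth`` description in this hunk can be read as a weighted average of a node's own optimal weight ``w`` and its parent's output ``w_p``, with weights ``n / path_smooth`` and ``1``. The following is a minimal sketch under that reading, not LightGBM's actual implementation; the names `SmoothedWeight`, `PathNode`, and `SmoothAlongPath` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LightGBM's API): smoothed weight of one node.
//   w           - unsmoothed optimal weight of the node
//   parent_out  - already-smoothed weight of the parent (w_p)
//   n           - number of samples in the node
//   path_smooth - smoothing strength; 0 disables smoothing
double SmoothedWeight(double w, double parent_out, double n, double path_smooth) {
  assert(path_smooth >= 0.0 && n > 0.0);
  if (path_smooth == 0.0) return w;  // no smoothing applied
  const double r = n / path_smooth;
  // Weighted average: weight r on the node's own value, weight 1 on the parent.
  return (r * w + parent_out) / (r + 1.0);
}

// Hypothetical record for one node on a root-to-leaf path.
struct PathNode {
  double w;  // unsmoothed optimal weight
  double n;  // number of samples in the node
};

// Walks a root-to-leaf path. The root keeps its raw weight; every later node
// is pulled towards its parent's smoothed output, so the effect accumulates
// with depth.
double SmoothAlongPath(const std::vector<PathNode>& path, double path_smooth) {
  assert(!path.empty());
  double out = path.front().w;
  for (std::size_t i = 1; i < path.size(); ++i) {
    out = SmoothedWeight(path[i].w, out, path[i].n, path_smooth);
  }
  return out;
}
```

For example, on a path with raw weights 0, 1, 3 and sample counts 100, 10, 10, calling `SmoothAlongPath` with `path_smooth = 10` gives 1.75 for the leaf instead of the raw 3: the middle node is smoothed to 0.5 first, and the leaf is then averaged with that value.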
10 changes: 10 additions & 0 deletions include/LightGBM/config.h
@@ -495,6 +495,16 @@ struct Config {
// desc = applied once per forest
std::vector<double> cegb_penalty_feature_coupled;

// check = >= 0.0
// desc = controls smoothing applied to tree nodes
// desc = helps prevent overfitting on leaves with few samples
// desc = if set to zero, no smoothing is applied
// desc = if ``path_smooth > 0`` then ``min_data_in_leaf`` must be at least ``2``
// desc = larger values give stronger regularisation
// descl2 = the weight of each node is ``w * (n / path_smooth) / (n / path_smooth + 1) + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node
// descl2 = note that the parent output ``w_p`` itself has smoothing applied, unless it is the root node, so that the smoothing effect accumulates with the tree depth
double path_smooth = 0;

// alias = verbose
// desc = controls the level of LightGBM's verbosity
// desc = ``< 0``: Fatal, ``= 0``: Error (Warning), ``= 1``: Info, ``> 1``: Debug
5 changes: 3 additions & 2 deletions include/LightGBM/tree.h
@@ -142,6 +142,9 @@ class Tree {
/*! \brief Get depth of specific leaf*/
inline int leaf_depth(int leaf_idx) const { return leaf_depth_[leaf_idx]; }

/*! \brief Get parent of specific leaf*/
inline int leaf_parent(int leaf_idx) const {return leaf_parent_[leaf_idx]; }

/*! \brief Get feature of specific split*/
inline int split_feature(int split_idx) const { return split_feature_[split_idx]; }

@@ -163,8 +166,6 @@ class Tree {
return split_feature_inner_[node_idx];
}

-inline int leaf_parent(int leaf_idx) const { return leaf_parent_[leaf_idx]; }

inline uint32_t threshold_in_bin(int node_idx) const {
return threshold_in_bin_[node_idx];
}
8 changes: 8 additions & 0 deletions src/io/config.cpp
@@ -314,6 +314,14 @@ void Config::CheckParamConflict() {
force_col_wise = true;
force_row_wise = false;
}
// min_data_in_leaf must be at least 2 if path smoothing is active. This is because when the split is calculated
// the count is calculated using the proportion of hessian in the leaf which is rounded up to nearest int, so it can
// be 1 when there is actually no data in the leaf. In rare cases this can cause a bug because with path smoothing the
// calculated split gain can be positive even with zero gradient and hessian.
if (path_smooth > kEpsilon && min_data_in_leaf < 2) {
min_data_in_leaf = 2;
Log::Warning("min_data_in_leaf has been increased to 2 because this is required when path smoothing is active.");
}
if (is_parallel && monotone_constraints_method == std::string("intermediate")) {
// In distributed mode, local node doesn't have histograms on all features, cannot perform "intermediate" monotone constraints.
Log::Warning("Cannot use \"intermediate\" monotone constraints in parallel learning, auto set to \"basic\" method.");
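
The new check in this hunk can be exercised in isolation. The sketch below is a hedged, self-contained illustration of the same rule; `MiniConfig`, `kTinyEpsilon`, and `EnforcePathSmoothMinData` are hypothetical stand-ins, and the tolerance value is an assumption rather than LightGBM's actual `kEpsilon`.

```cpp
#include <cassert>

// Hypothetical stand-in for the two Config fields involved; not the real struct.
struct MiniConfig {
  double path_smooth = 0.0;
  int min_data_in_leaf = 20;
};

// Assumed tolerance playing the role of kEpsilon in the hunk above.
constexpr double kTinyEpsilon = 1e-15;

// Raises min_data_in_leaf to 2 when path smoothing is active.
// Returns true if the value had to be changed.
bool EnforcePathSmoothMinData(MiniConfig* config) {
  assert(config != nullptr);
  if (config->path_smooth > kTinyEpsilon && config->min_data_in_leaf < 2) {
    config->min_data_in_leaf = 2;
    return true;
  }
  return false;
}
```

With `path_smooth = 1.0` and `min_data_in_leaf = 1`, the function raises `min_data_in_leaf` to 2 and returns `true`; with `path_smooth` left at 0 the configuration is not touched.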
5 changes: 5 additions & 0 deletions src/io/config_auto.cpp
@@ -229,6 +229,7 @@ const std::unordered_set<std::string>& Config::parameter_set() {
"cegb_penalty_split",
"cegb_penalty_feature_lazy",
"cegb_penalty_feature_coupled",
"path_smooth",
"verbosity",
"input_model",
"output_model",
@@ -450,6 +451,9 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
cegb_penalty_feature_coupled = Common::StringToArray<double>(tmp_str, ',');
}

GetDouble(params, "path_smooth", &path_smooth);
CHECK_GE(path_smooth, 0.0);

GetInt(params, "verbosity", &verbosity);

GetString(params, "input_model", &input_model);
@@ -654,6 +658,7 @@ std::string Config::SaveMembersToString() const {
str_buf << "[cegb_penalty_split: " << cegb_penalty_split << "]\n";
str_buf << "[cegb_penalty_feature_lazy: " << Common::Join(cegb_penalty_feature_lazy, ",") << "]\n";
str_buf << "[cegb_penalty_feature_coupled: " << Common::Join(cegb_penalty_feature_coupled, ",") << "]\n";
str_buf << "[path_smooth: " << path_smooth << "]\n";
str_buf << "[verbosity: " << verbosity << "]\n";
str_buf << "[max_bin: " << max_bin << "]\n";
str_buf << "[max_bin_by_feature: " << Common::Join(max_bin_by_feature, ",") << "]\n";