
[ML] Stochastic derivatives for boosted tree training #811

Merged
merged 13 commits into from
Nov 14, 2019

Conversation

@tveasey tveasey (Contributor) commented Nov 11, 2019

The primary change is to use a random downsample (bag) of the data for each new tree we train. I extended initialisation to perform a line search over the downsample ratio to find a good range to search during fine tuning. This is primarily intended to improve runtime on larger data sets, but it incidentally improved QoR for many data sets since it tends to decorrelate the errors between trees.

This also tweaks the initial range used for line search to base its start and end points on percentile values of the gain and total curvature of a tree. Finally, since I needed to update progress monitoring to account for the extra line searches, I switched to updating progress after each tree is trained. This gives much finer-grained progress monitoring.
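To illustrate the bagging step described above, here is a minimal hypothetical sketch (names and structure are illustrative, not the actual ml-cpp implementation): each tree trains on a random bag of rows, where every row is kept independently with probability equal to the downsample factor.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: choose a random bag of row indices for training
// one tree. Each row is kept independently with probability equal to
// downsampleFactor, so the expected bag size is numberRows * downsampleFactor.
std::vector<std::size_t> sampleBag(std::size_t numberRows,
                                   double downsampleFactor,
                                   std::mt19937& rng) {
    std::bernoulli_distribution keep{downsampleFactor};
    std::vector<std::size_t> bag;
    for (std::size_t i = 0; i < numberRows; ++i) {
        if (keep(rng)) {
            bag.push_back(i);
        }
    }
    return bag;
}
```

Because each tree sees a different random subset, the trees' errors are less correlated, which is the mechanism behind the QoR improvement mentioned above.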

@valeriy42 valeriy42 (Contributor) left a comment

Looks good altogether. Unless I missed it, you still need to add code for persisting/restoring the new hyperparameters and bump the schema version. Otherwise, just a few minor comments to improve readability.


// We need to scale the regularisation terms to account for the difference
// in the down sample factor compared to the value used in the line search.
auto scaleRegularizers = [&](CBoostedTreeImpl& tree, double downsampleFactor) {
Contributor
Good thinking on the need to scale the regularizers! ➕
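The intuition behind the scaling in the quoted fragment can be sketched as follows (a hypothetical illustration, assuming the simplified `Regularizers` struct and field names below; the actual ml-cpp types differ): if the training loss is a sum over the bagged rows, changing the downsample factor rescales the data term, so penalties tuned at the line-search factor must be rescaled proportionally to keep the relative strength of regularisation unchanged.

```cpp
// Hypothetical illustration of why regularisers are rescaled when the
// downsample factor changes. Halving the downsample factor roughly halves
// the data term of the loss, so penalties tuned at searchDownsampleFactor
// are multiplied by newDownsampleFactor / searchDownsampleFactor.
struct Regularizers {
    double treeSizePenalty;
    double leafWeightPenalty;
};

Regularizers scaleRegularizers(Regularizers regularizers,
                               double searchDownsampleFactor,
                               double newDownsampleFactor) {
    double scale{newDownsampleFactor / searchDownsampleFactor};
    regularizers.treeSizePenalty *= scale;
    regularizers.leafWeightPenalty *= scale;
    return regularizers;
}
```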

Comment on lines +692 to +693
nextTreeCountToRefreshSplits +=
static_cast<std::size_t>(std::max(0.5 / eta, 2.0));
Contributor

It would be nice to have a comment explaining what is going on here.

Contributor Author

I added a catch-all comment about amortising the cost of computing splits. See 56f32ea. I didn't explicitly mention why I tied the interval to eta, i.e. that it captures something about how different we expect the splits chosen from round to round to be. Does this seem sufficient to you?
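The interval logic in the quoted snippet can be isolated as a small sketch (the function name here is illustrative, not from ml-cpp): candidate splits are refreshed roughly every 0.5 / eta trees, with a floor of 2. A small eta means each tree changes the predictions, and hence the gradients that drive split selection, only a little, so recomputing the splits can be amortised over more trees.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch of the quoted interval computation: refresh the
// candidate splits every max(0.5 / eta, 2) trees. Smaller learning rates
// change the gradients less per tree, so splits stay valid for longer.
std::size_t treesPerSplitRefresh(double eta) {
    return static_cast<std::size_t>(std::max(0.5 / eta, 2.0));
}
```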

@tveasey tveasey (Contributor Author) commented Nov 14, 2019

Thanks for the review, @valeriy42, and good catch on the missing persistence changes! I think I've now addressed all your comments. Can you take another look?

@valeriy42 valeriy42 (Contributor) left a comment

LGTM. Thank you for addressing the issues.

@tveasey tveasey merged commit fb6ea40 into elastic:master Nov 14, 2019
@tveasey tveasey deleted the stochastic-derivatives branch November 14, 2019 20:55
@tveasey tveasey changed the title [ML] Stochastic derivatives for tree training [ML] Stochastic derivatives for boosted tree training Nov 14, 2019
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Nov 28, 2019
tveasey added a commit that referenced this pull request Nov 28, 2019