
[ML] Stochastic derivatives for boosted tree training #811

Merged
merged 13 commits into from
Nov 14, 2019

Conversation

@tveasey tveasey (Contributor) commented Nov 11, 2019

The primary change is to use a random downsample (bag) of the data for each new tree we train. I extended initialisation to perform a line search over the downsample ratio to find a good range to search during fine tuning. This is primarily intended to improve runtime on larger data sets, but it incidentally improved QoR for many data sets since it tends to decorrelate the errors between trees.

This also tweaks the initial range used for line search to base its start and end points on percentile values of the gain and total curvature of a tree. Finally, since I needed to update progress monitoring to account for the extra line searches, I switched to updating progress after each tree is trained. This gives much finer-grained progress monitoring.
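To illustrate the bagging step described above, here is a minimal hypothetical sketch (names and structure are illustrative, not the actual ml-cpp implementation): each tree trains on a random bag of rows, where every row is kept independently with probability equal to the downsample factor.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: choose a random bag of row indices for training
// one tree. Each row is kept independently with probability equal to
// downsampleFactor, so the expected bag size is numberRows * downsampleFactor.
std::vector<std::size_t> sampleBag(std::size_t numberRows,
                                   double downsampleFactor,
                                   std::mt19937& rng) {
    std::bernoulli_distribution keep{downsampleFactor};
    std::vector<std::size_t> bag;
    for (std::size_t i = 0; i < numberRows; ++i) {
        if (keep(rng)) {
            bag.push_back(i);
        }
    }
    return bag;
}
```

Because each tree sees a different random subset, the trees' errors are less correlated, which is the mechanism behind the QoR improvement mentioned above.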

@valeriy42 valeriy42 (Contributor) left a comment

Looks good altogether. Unless I missed it, you still need to add code for persisting/restoring the new hyperparameters and bump the schema version. Otherwise, just a few minor comments to improve readability.


// We need to scale the regularisation terms to account for the difference
// in the down sample factor compared to the value used in the line search.
auto scaleRegularizers = [&](CBoostedTreeImpl& tree, double downsampleFactor) {
Contributor
Good thinking on the need to scale the regularizers! ➕
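The intuition behind the scaling in the quoted fragment can be sketched as follows (a hypothetical illustration, assuming the simplified `Regularizers` struct and field names below; the actual ml-cpp types differ): if the training loss is a sum over the bagged rows, changing the downsample factor rescales the data term, so penalties tuned at the line-search factor must be rescaled proportionally to keep the relative strength of regularisation unchanged.

```cpp
// Hypothetical illustration of why regularisers are rescaled when the
// downsample factor changes. Halving the downsample factor roughly halves
// the data term of the loss, so penalties tuned at searchDownsampleFactor
// are multiplied by newDownsampleFactor / searchDownsampleFactor.
struct Regularizers {
    double treeSizePenalty;
    double leafWeightPenalty;
};

Regularizers scaleRegularizers(Regularizers regularizers,
                               double searchDownsampleFactor,
                               double newDownsampleFactor) {
    double scale{newDownsampleFactor / searchDownsampleFactor};
    regularizers.treeSizePenalty *= scale;
    regularizers.leafWeightPenalty *= scale;
    return regularizers;
}
```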

Comment on lines +692 to +693
nextTreeCountToRefreshSplits +=
static_cast<std::size_t>(std::max(0.5 / eta, 2.0));
Contributor

It would be nice to have a comment explaining what is going on here.

Contributor Author

I added a catch-all comment about amortising the cost of computing splits. See 56f32ea. I didn't explicitly mention why I tied the interval to eta, i.e. that it captures something about how different we expect the splits chosen from round to round to be. Does this seem sufficient to you?
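The interval logic in the quoted snippet can be isolated as a small sketch (the function name here is illustrative, not from ml-cpp): candidate splits are refreshed roughly every 0.5 / eta trees, with a floor of 2. A small eta means each tree changes the predictions, and hence the gradients that drive split selection, only a little, so recomputing the splits can be amortised over more trees.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch of the quoted interval computation: refresh the
// candidate splits every max(0.5 / eta, 2) trees. Smaller learning rates
// change the gradients less per tree, so splits stay valid for longer.
std::size_t treesPerSplitRefresh(double eta) {
    return static_cast<std::size_t>(std::max(0.5 / eta, 2.0));
}
```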

@tveasey tveasey (Contributor Author) commented Nov 14, 2019

Thanks for the review, @valeriy42, and good catch on the missing persistence changes! I think I've now addressed all your comments. Can you take another look?

@valeriy42 valeriy42 (Contributor) left a comment

LGTM. Thank you for addressing the issues.

@tveasey tveasey merged commit fb6ea40 into elastic:master Nov 14, 2019
@tveasey tveasey deleted the stochastic-derivatives branch November 14, 2019 20:55
@tveasey tveasey changed the title [ML] Stochastic derivatives for tree training [ML] Stochastic derivatives for boosted tree training Nov 14, 2019
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Nov 28, 2019
tveasey added a commit that referenced this pull request Nov 28, 2019