[ML] Apply consistent size penalty selecting best classification and regression model #2291

tveasey · 2022-05-31T10:18:18Z

We currently apply a small penalty so that we prefer smaller models when validation losses are similar. The penalty is based on the model size, but is also parameterised by the mean size of all models trained up to the point where we test. Because the mean size changes through the optimisation loop, we don't apply a completely consistent penalty when comparing the candidate model with the current best model, whose penalty was calculated using earlier parameters. This change also stores the best model's size, so we compute both penalties using the same parameters.
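To illustrate the idea, here is a minimal sketch and not the actual code in `CBoostedTreeHyperparameters.cc`. The penalty formula, the `0.01` scale, and the names `sizePenalty` and `BestModelTracker` are all assumptions made for illustration. The point it shows is that storing the best model's size lets both penalties be recomputed with the current mean size at comparison time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical penalty: grows with the model size relative to the mean size
// of all models trained so far. The real formula in ml-cpp may differ.
double sizePenalty(double modelSize, double meanModelSize) {
    assert(meanModelSize > 0.0);
    return 0.01 * modelSize / meanModelSize;
}

// Hypothetical tracker for the best model seen in an optimisation loop.
struct BestModelTracker {
    bool hasBest = false;
    double bestLoss = 0.0;  // raw validation loss of the best model
    double bestSize = 0.0;  // stored so its penalty can be recomputed later
    double meanSize = 0.0;  // running mean size of all models offered
    std::size_t count = 0;

    // Returns true if the candidate replaces the current best model.
    bool offer(double loss, double size) {
        ++count;
        meanSize += (size - meanSize) / static_cast<double>(count);

        // Both penalties use the same, current mean size, so the comparison
        // does not depend on when the best model was first recorded.
        double candidate = loss + sizePenalty(size, meanSize);
        if (!hasBest || candidate < bestLoss + sizePenalty(bestSize, meanSize)) {
            hasBest = true;
            bestLoss = loss;
            bestSize = size;
            return true;
        }
        return false;
    }
};
```

In this sketch, caching the best model's penalised loss instead of its raw loss and size would reproduce the inconsistency described above, because the cached value would reflect an older mean size.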

valeriy42

Good catch. I have a single comment. If this was a copy/paste error, you could just fix it. If not, would you add a quick explanation, please? No need for me to review it again.

lib/maths/analytics/CBoostedTreeHyperparameters.cc

Consistent model size penalty selecting best model

beb5034

tveasey added >enhancement review :ml/DataFrameAnalysis v8.4.0 labels May 31, 2022

tveasey requested a review from valeriy42 May 31, 2022 10:18

tveasey added 2 commits May 31, 2022 15:44

Test fix

ed04d6a

Docs

35e5ad3

valeriy42 approved these changes Jun 1, 2022


lib/maths/analytics/CBoostedTreeHyperparameters.cc (outdated, resolved)

Bug fix

c9b524d

tveasey merged commit 8825530 into elastic:main Jun 2, 2022

tveasey deleted the best-model-selection branch June 2, 2022 16:49
