Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Line search for best learn rate #948

Merged
merged 8 commits into from
Jan 22, 2020
Merged

Conversation

tveasey
Copy link
Contributor

@tveasey tveasey commented Jan 17, 2020

This switches to performing an initial line search over the learn rate and reducing the range we search in the main optimisation loop. This does cost us some run time, but we generally get small wins in both the QoR and the stability of QoR as borne out by the sample results below.

Because we can't estimate the runtime upfront I've extended the progress monitor to allow us to reset the expected total number of steps. This will advance progress if it needs to; we have no mechanism for reducing progress if the expected number steps increase. In this case, we simply get stuck at the current progress. I use a margin in the initial estimates to reduce the chance of this so we get smoother progress monitoring.

Sample regression results:

data set master mean R^2 master sd R^2 patch mean R^2 patch var R^2
boston 0.85 0.011 0.92 0.006
elevators 0.88 0.028 0.89 0.002
stock 0.98 0.004 0.99 0.003
houses 0.68 0.026 0.66 0.016

Sample classification results:

data set master mean acc master sd acc patch mean acc patch var acc
bank 0.87 0.003 0.88 0.006
census 0.84 0.004 0.84 0.003
clicks 0.66 0.016 0.67 0.009
qsar 0.84 0.007 0.85 0.008

Copy link
Contributor

@valeriy42 valeriy42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work altogether! I am wondering if adding mainLoop as prefix or suffix actually improves readability since very often there is no alternative variable, like mainLoopMaximumNumberTrees and initializationLoopMaximumNumberTrees. The exception beeing, totalNumberSteps and mainLoopNumberSteps.

lib/core/CLoopProgress.cc Show resolved Hide resolved
lib/core/unittest/CLoopProgressTest.cc Show resolved Hide resolved
lib/maths/CBoostedTreeFactory.cc Outdated Show resolved Hide resolved
lib/maths/CBoostedTreeFactory.cc Outdated Show resolved Hide resolved
@tveasey
Copy link
Contributor Author

tveasey commented Jan 21, 2020

Thanks for the review @valeriy42. I've now addressed all your comments can you take another look?

Copy link
Contributor

@valeriy42 valeriy42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Good work on providing explanations!

@tveasey tveasey merged commit 48c1fd0 into elastic:master Jan 22, 2020
@tveasey tveasey deleted the line-search-eta branch January 22, 2020 09:37
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Jan 22, 2020
tveasey added a commit that referenced this pull request Jan 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants