
[ML] Early stopping in the line searches to compute initial regulariser values #903

Merged · 12 commits · Dec 17, 2019

Conversation

@tveasey (Contributor) commented Dec 13, 2019:

This migrates to using Bayesian Optimisation (BO) to choose the points at which to evaluate the regularisers in the line searches we perform to find their initial values. Importantly, it also thresholds the expected improvement required to continue the line search at 1% of the current test loss. This saves us up to 3 iterations in the loop (since we lower bound the minimum number of iterations we'll use), or around 40% of the cost of the line searches.
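For illustration, the stopping rule amounts to something like the following sketch. The names (shouldContinueLineSearch, minimumIterations) are illustrative, not the actual ml-cpp API:

#include <cstddef>

// Continue the line search only while Bayesian Optimisation can still offer
// a meaningful expected improvement relative to the best test loss seen so far.
bool shouldContinueLineSearch(double expectedImprovement,
                              double minimumTestLoss,
                              std::size_t iteration,
                              std::size_t minimumIterations) {
    // The number of iterations is lower bounded so the surrogate model has
    // enough evaluated points to fit.
    if (iteration < minimumIterations) {
        return true;
    }
    // Threshold: the expected improvement must be at least 1% of the
    // current minimum test loss to be worth another iteration.
    return expectedImprovement >= 0.01 * minimumTestLoss;
}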

@valeriy42 (Contributor) left a comment:

Looks good altogether. Just a few minor comments.

docs/CHANGELOG.asciidoc (resolved)
Comment on lines +233 to +236
BOOST_REQUIRE_CLOSE(c1, prediction, 3.0);
BOOST_REQUIRE_SMALL(c2, 0.25);
BOOST_REQUIRE_SMALL(c3, 0.25);
BOOST_REQUIRE_SMALL(c4, 0.25);
@valeriy42 (Contributor) commented:

Thank you for catching the error in the BOOST_REQUIRE_CLOSE call. Why did the values c2, ..., c4 become so much larger?

@tveasey (Author) commented Dec 16, 2019:

These numbers are slightly unstable. The reason they were 0 before is that none of these features were selected for training at all, because of the threshold on MIC. Once they are selected there is scope for them to be non-zero. Since they are still so much smaller than c1 (0.25 vs ~100), I figured it was ok to allow some leeway.

@valeriy42 (Contributor): 👍

#include <maths/CSampling.h>
#include <maths/CTools.h>

#include <boost/math/distributions/normal.hpp>
#include <boost/optional/optional_io.hpp>
@valeriy42 (Contributor) commented:

Do you need this include?

@tveasey (Author) commented Dec 16, 2019:

This is needed to print out boost::optional on this line.


@valeriy42 (Contributor) commented:

The header boost/optional.hpp is already included in the header file. Is it still required here? Line 177 doesn't use optional, as far as I can see.

@tveasey (Author) commented Dec 16, 2019:

On this line I've added expectedImprovement, which has type boost::optional<double>. This print statement doesn't compile if I don't include optional_io.hpp:

LOG_TRACE(<< "best = " << xmax.cwiseProduct(m_MaxBoundary - m_MinBoundary).transpose()
          << " EI(best) = " << expectedImprovement);

Unfortunately, the printing functionality is not provided by optional.hpp.
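As a standalone illustration of the point (a hypothetical minimal example, not code from the PR): boost/optional.hpp alone does not define operator<< for boost::optional; optional_io.hpp supplies it.

#include <boost/optional.hpp>
#include <boost/optional/optional_io.hpp> // provides operator<< for boost::optional
#include <iostream>

int main() {
    boost::optional<double> expectedImprovement{0.05};
    // Without optional_io.hpp this line fails to compile; with it, an
    // engaged optional prints its value and an empty one prints "--".
    std::cout << "EI(best) = " << expectedImprovement << '\n';
}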

lib/maths/CBoostedTreeFactory.cc (resolved)
minTestLoss.add(testLoss);
testLosses.emplace_back(regularizer, testLoss);
}
while (testLosses.size() > 0 && testLosses.size() < MAX_LINE_SEARCH_ITERATIONS) {
@valeriy42 (Contributor) commented:

Do you need to check for the first condition?

@tveasey (Author) commented:

Yes. This can happen in the loop which searches for the best downsample factor, i.e. applyRegularizer can return false for all values in the loop:

for (auto regularizer :
         {intervalLeftEnd, (2.0 * intervalLeftEnd + intervalRightEnd) / 3.0,
          (intervalLeftEnd + 2.0 * intervalRightEnd) / 3.0, intervalRightEnd}) {
  ...

In that case we'll just use a downsample factor of 1, and without this condition computing the maximum expected loss actually SIGSEGVs.
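To make the failure mode concrete, here is a minimal sketch (illustrative names, not the actual CBoostedTreeFactory code) of the guard and the computation it protects:

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

std::size_t const MAX_LINE_SEARCH_ITERATIONS{10};

// Fine tuning may only run if the coarse search accepted at least one
// (regularizer, test loss) pair; otherwise minimumTestLoss below would
// dereference end() of an empty collection (the SIGSEGV mentioned above).
bool canFineTune(const std::vector<std::pair<double, double>>& testLosses) {
    return testLosses.size() > 0 && testLosses.size() < MAX_LINE_SEARCH_ITERATIONS;
}

double minimumTestLoss(const std::vector<std::pair<double, double>>& testLosses) {
    // Precondition: testLosses is non-empty, enforced by canFineTune.
    return std::min_element(testLosses.begin(), testLosses.end(),
                            [](const auto& lhs, const auto& rhs) {
                                return lhs.second < rhs.second;
                            })->second;
}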

@valeriy42 (Contributor) left a comment:

LGTM

Comment on lines +710 to +722
// This has the following steps:
// 1. Coarse search the interval [intervalLeftEnd, intervalRightEnd] using
// fixed steps,
// 2. Fine tune, via Bayesian Optimisation targeting expected improvement,
// and stop if the expected improvement is small compared to the current
// minimum test loss,
// 3. Calculate the parameter interval which gives the lowest test losses,
// 4. Fit an OLS quadratic approximation to the test losses in the interval
// from step 3 and use it to estimate the best parameter value,
// 5. Compare the size of the residual errors w.r.t. the OLS curve from
// step 4 with its variation over the interval from step 3 and truncate
// the returned interval if we can determine there is a low chance of
// missing the best solution by doing so.
@valeriy42 (Contributor): Nice! 👍
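For illustration, step 4 of the comment above could look like the following minimal sketch: an ordinary least squares quadratic fit to the (parameter, test loss) pairs, using the vertex of the fitted parabola as the estimate. This is a hypothetical standalone version, not the actual CBoostedTreeFactory implementation:

#include <algorithm>
#include <array>
#include <utility>
#include <vector>

// Fit y = a + b*x + c*x^2 by ordinary least squares and return the
// estimated minimiser, clamped to the search interval. Assumes at least
// three distinct parameter values so the normal equations are non-singular.
double estimateBestParameter(const std::vector<std::pair<double, double>>& testLosses,
                             double intervalLeftEnd,
                             double intervalRightEnd) {
    // Accumulate the moments appearing in the normal equations.
    double s0{0}, s1{0}, s2{0}, s3{0}, s4{0}, t0{0}, t1{0}, t2{0};
    for (const auto& [x, y] : testLosses) {
        double x2{x * x};
        s0 += 1.0;    s1 += x;     s2 += x2;
        s3 += x2 * x; s4 += x2 * x2;
        t0 += y;      t1 += x * y; t2 += x2 * y;
    }
    // Solve [s0 s1 s2; s1 s2 s3; s2 s3 s4] * [a b c]^T = [t0 t1 t2]^T
    // for b and c by Cramer's rule (a is not needed for the argmin).
    auto det3 = [](const std::array<double, 9>& m) {
        return m[0] * (m[4] * m[8] - m[5] * m[7]) -
               m[1] * (m[3] * m[8] - m[5] * m[6]) +
               m[2] * (m[3] * m[7] - m[4] * m[6]);
    };
    double d{det3({s0, s1, s2, s1, s2, s3, s2, s3, s4})};
    double b{det3({s0, t0, s2, s1, t1, s3, s2, t2, s4}) / d};
    double c{det3({s0, s1, t0, s1, s2, t1, s2, s3, t2}) / d};
    // A convex quadratic (c > 0) is minimised at its vertex -b / (2c);
    // otherwise fall back to the interval midpoint.
    double best{c > 0.0 ? -b / (2.0 * c)
                        : 0.5 * (intervalLeftEnd + intervalRightEnd)};
    return std::clamp(best, intervalLeftEnd, intervalRightEnd);
}

Per step 5 of the quoted comment, the PR then compares the residuals about this fitted curve with its variation over the interval to decide whether the returned interval can be safely truncated.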
