
baseline reduction: separate learning of additive regression baseline #1336

Merged
merged 9 commits into from
Nov 12, 2017


albietz
Contributor

albietz commented Oct 5, 2017

This reduction allows a regression learner to learn two things separately: an additive baseline prediction that uses only "constant" features (taken from the constant_namespace), and the residual on top of that baseline. In practice, this seems to make it faster to learn a possibly large constant offset.
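To illustrate the idea of a separately learned additive baseline, here is a minimal standalone sketch. It is not the code in this PR: the `baseline_residual_model` struct, its two learning rates, and the plain squared-loss gradient steps are all assumptions chosen for illustration. The baseline is updated from the label alone, and the remaining weights are then fit to what the baseline does not explain.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; none of these names come from VW.
struct baseline_residual_model
{
  float bias = 0.f;      // additive baseline, learned from a constant feature only
  std::vector<float> w;  // weights for the residual on top of the baseline
  float bias_lr;         // learning rate for the baseline (assumed separate)
  float resid_lr;        // learning rate for the residual weights

  baseline_residual_model(std::size_t num_features, float bias_lr_, float resid_lr_)
      : w(num_features, 0.f), bias_lr(bias_lr_), resid_lr(resid_lr_)
  {
  }

  // Linear prediction of the residual part alone.
  float residual(const std::vector<float>& x) const
  {
    float r = 0.f;
    for (std::size_t i = 0; i < w.size(); ++i) r += w[i] * x[i];
    return r;
  }

  // Full prediction = baseline + residual.
  float predict(const std::vector<float>& x) const { return bias + residual(x); }

  void learn(const std::vector<float>& x, float y)
  {
    // Step 1: move the baseline toward the label, ignoring all other features.
    bias += bias_lr * (y - bias);
    // Step 2: fit the residual weights to what the baseline leaves unexplained.
    const float err = (y - bias) - residual(x);
    for (std::size_t i = 0; i < w.size(); ++i) w[i] += resid_lr * err * x[i];
  }
};
```

With a large constant label and a larger baseline learning rate, the offset is absorbed by `bias` within a few updates, so the residual weights do not have to climb toward it slowly.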

cc @JohnLangford

@JohnLangford
Member

The Windows build failure here: https://ci.appveyor.com/project/JohnLangford/vowpal-wabbit/build/1.0.2255#L2307 is presumably because the Windows build doesn't include the new file.

void predict_or_learn(baseline& data, base_learner& base, example& ec)
{ if (is_learn)
  { // do a full prediction, for safety in accurate predictive validation
    base.predict(ec);
Member


You can factor base.predict() out of the if/else for simplicity.
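A rough sketch of what that refactoring could look like follows. The `example`, `base_learner`, and `baseline` structs here are hypothetical stand-ins with call counters, not the real VW types, and the `base.learn(ec)` body of the learn branch is an assumption, since the quoted hunk is cut off before it.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real VW types, for illustration only.
struct example
{
  float pred = 0.f;
};

struct base_learner
{
  int predict_calls = 0;
  int learn_calls = 0;
  void predict(example& ec)
  {
    ++predict_calls;
    ec.pred = 1.f;  // placeholder prediction
  }
  void learn(example&) { ++learn_calls; }
};

struct baseline
{
};

template <bool is_learn>
void predict_or_learn(baseline& /*data*/, base_learner& base, example& ec)
{
  // Hoisted out of the if/else: the prediction happens once on every path.
  base.predict(ec);

  if (is_learn)
  {
    // Assumed learn-only work; the original hunk does not show this part.
    base.learn(ec);
  }
}
```

Both instantiations then share a single `base.predict(ec)` call site, and the `is_learn` branch holds only the learn-specific work.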

@JohnLangford
Member

This looks good to go other than the minor refactoring and fixing the Windows build, about here: https://github.com/JohnLangford/vowpal_wabbit/blob/master/vowpalwabbit/vw_dynamic.vcxproj#L436. Can you tweak?

@albietz
Contributor Author

albietz commented Oct 20, 2017

Some comments:

  • I added a learning rate multiplier based on the largest label magnitude seen so far. Occasionally these magnitudes get really large (e.g. I was getting extremely large values at some point when using doubly robust estimates, even though the labels were smaller than 10), hence the cap at 1000. It might be useful to allow setting the multiplier explicitly through a flag instead.
  • I added an option for using a separate example with a single global feature for the baseline (assuming the examples don't have that feature), which seems easier than fiddling with feature values when an example has constant features other than the global one. Perhaps I can use a separate namespace instead to avoid conflicts?
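The capped multiplier from the first point can be sketched as below. This is an assumption-laden illustration: the `lr_multiplier_tracker` name and the exact update rule are hypothetical, and only the "largest label magnitude seen so far" and "cap at 1000" come from the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper; the name and structure are not from the PR.
struct lr_multiplier_tracker
{
  float max_abs_label = 0.f;  // largest |label| observed so far
  float cap = 1000.f;         // cap mentioned in the comment above

  // Record a new label and return the multiplier to apply to the
  // baseline learning rate.
  float update(float label)
  {
    max_abs_label = std::max(max_abs_label, std::fabs(label));
    return std::min(max_abs_label, cap);
  }
};
```

The multiplier never decreases as more labels arrive, and one extreme value (such as an outlying doubly robust estimate) cannot push it past the cap.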

@JohnLangford JohnLangford merged commit fd259cd into VowpalWabbit:master Nov 12, 2017
@JohnLangford
Member

Merged in, thanks.
