LARS fixes: intercept, normalization, bugfix #3493
Conversation
```diff
   gamma = val1;
-  if ((val2 > 0.0) && (val2 < gamma))
+  if ((val2 >= 0.0) && (val2 < gamma))
```
This is the tiny relaxation that is the bugfix part of this PR.
```c++
  gamma = val1;
  if ((val2 >= 0.0) && (val2 < gamma))
    gamma = val2;
}
```
It turns out the "tiny relaxation" is not so tiny: on LARS iterations where we are using the LASSO modification, we sometimes remove a feature because its sign changes (this is Eq. 3.6). However, we need to avoid considering that feature again on the next iteration, so we can't use the relaxation that allows gamma (the step size) to become 0, since the removed feature's correlation should be exactly zero.

I also had one more thing to fix here:
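A minimal sketch of the condition being discussed (the names `UpdateGamma`, `val1`, `val2`, and `justDropped` are hypothetical, not mlpack's actual code): `val1` and `val2` are step-size candidates for one feature, and the relaxed `>= 0.0` test must be suppressed on the iteration right after a feature is dropped, because the dropped feature's candidate is exactly zero.

```cpp
// Sketch, under the assumptions above: update the running minimum step size
// gamma given two candidates for one feature. When a feature was just
// dropped by the LASSO modification (Eq. 3.6), a zero candidate must be
// rejected, or the dropped feature would rejoin with a zero step.
double UpdateGamma(double gamma, double val1, double val2, bool justDropped)
{
  if ((val1 > 0.0) && (val1 < gamma))
    gamma = val1;

  // The relaxation from this PR: allow val2 == 0 so a tied feature can
  // enter with a zero step, but not right after a drop.
  const bool valid = justDropped ? (val2 > 0.0) : (val2 >= 0.0);
  if (valid && (val2 < gamma))
    gamma = val2;

  return gamma;
}
```

With `justDropped == false` a zero candidate wins (zero step, tied feature enters immediately); with `justDropped == true` it is skipped and the next smallest positive candidate is used.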
Second approval provided automatically after 24 hours. 👍
This is the end of a very deep rabbit hole. I thought that #3442 was a result of LARS not supporting fitting an intercept or normalizing data to unit variance. That's actually not the case, but I only found that out after writing the support, so here it is.
Deep inside the changes is a minor bugfix that takes a little while to explain:
LARS is an iterative algorithm that adds one feature to the model in each iteration. It selects the best feature to add based on the residuals, and then selects a step size for how much of that feature to add (so to speak). The paper's rule for choosing the step size, Equation 2.13, forces a strictly positive (nonzero) step size. The step size is chosen such that we step exactly to the point where the next feature should be added.
However, it can sometimes happen that two different features have the exact same residual value, and could equally be chosen as the next feature to add to the model. In this case, the step size should technically be 0: we should add both features at once. I say "technically" because the paper does not actually clarify this, so my claim there is somewhat rooted in opinion, and I think it's possible to disagree on reasonable grounds.
Anyway, the fix implied by my opinion is to relax Equation 2.13 to allow a zero step size, and thus what effectively happens is that the model can add two features at once.
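To illustrate why the relaxation matters, here is a hedged sketch of one of the two Equation 2.13 candidate families, (C − c_j) / (A − a_j), where `maxCorr` is the current maximum absolute correlation, `corrs[j]` is an inactive feature's correlation, and `dirCorr`/`dirCorrs[j]` are the corresponding equiangular-direction values. The function name and signature are illustrative, not mlpack's API. A feature tied with the maximum correlation yields a candidate of exactly 0, which the strict "min over positive candidates" rule skips but the relaxed rule accepts.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative sketch: compute the smallest valid Eq. 2.13 step-size
// candidate over the inactive features. With allowZero == false this is
// the paper's strict min+ rule; with allowZero == true it is the relaxed
// rule from this PR, which lets a tied feature force a zero step.
double Gamma(double maxCorr,
             double dirCorr,
             const std::vector<double>& corrs,
             const std::vector<double>& dirCorrs,
             bool allowZero)
{
  double gamma = std::numeric_limits<double>::max();
  for (size_t j = 0; j < corrs.size(); ++j)
  {
    const double cand = (maxCorr - corrs[j]) / (dirCorr - dirCorrs[j]);
    // A feature with corrs[j] == maxCorr gives cand == 0 exactly.
    if (((cand > 0.0) || (allowZero && cand == 0.0)) && (cand < gamma))
      gamma = cand;
  }
  return gamma;
}
```

For example, with a tied feature (candidate 0) and another feature at candidate 0.5, the strict rule returns 0.5 and steps past the tie, while the relaxed rule returns 0 so both tied features enter at once.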
This fixes the newly-added KKT checks from @gscarella, which I adapted to add to the LARS test suite.
About the intercept and normalization support: I enable them by default since the paper's theory assumes that. There's a little bit of extra accounting to make sure predictions are still right when this happens.
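The "extra accounting" can be sketched as follows (all names here are hypothetical, not the PR's actual code): coefficients learned on centered, scaled features have to be mapped back to the original data scale, and the intercept recovered from the means, so that predictions on unstandardized inputs are unchanged.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sketch, under the assumptions above: given coefficients fit on data where
// each feature was centered by colMeans[i] and divided by colScales[i], and
// the responses were centered by yMean, recover original-scale coefficients
// and the intercept.
std::pair<std::vector<double>, double> Unstandardize(
    const std::vector<double>& betaStd,   // coefficients on scaled data
    const std::vector<double>& colMeans,  // per-feature means
    const std::vector<double>& colScales, // per-feature scale factors
    double yMean)                         // mean of the responses
{
  std::vector<double> beta(betaStd.size());
  double intercept = yMean;
  for (size_t i = 0; i < beta.size(); ++i)
  {
    beta[i] = betaStd[i] / colScales[i]; // undo the feature scaling
    intercept -= colMeans[i] * beta[i];  // undo the centering
  }
  return { beta, intercept };
}
```

The recovered model predicts `intercept + beta . x` on raw inputs, matching `yMean + betaStd . xStandardized` on the transformed ones.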