
correcting information criterion calculation in least_angle.py #6080

Closed
wants to merge 1 commit

Conversation

mehmetbasbug

The information criterion calculation is not compatible with the original paper:
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the 'degrees of freedom' of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
@amueller
Member

amueller commented Oct 7, 2016

@mehmetbasbug sorry for the slow reply. @agramfort @GaelVaroquaux do you have any input on this?

@agramfort
Member

sorry @mehmetbasbug for the slow reaction

can you point me to the equation you refer to?

@mehmetbasbug
Author

Eq. 2.15 and Eq. 2.16
Basically, the error comes from taking the log of the MSE.
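To make the discrepancy concrete, here is a minimal sketch (toy numbers and an assumed-known noise variance, for illustration only) comparing the expression the code computed with the form in Eq. 2.16 of the paper:

```python
import numpy as np

# Toy residual setup (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
n_samples, df = 50, 3
residuals = rng.normal(scale=2.0, size=n_samples)
mse = np.mean(residuals ** 2)
sigma2 = 4.0   # noise variance, assumed known here
K = 2.0        # AIC; use np.log(n_samples) for BIC

# What least_angle.py computed: n * log(MSE)
criterion_old = n_samples * np.log(mse) + K * df

# Eq. 2.16 form: MSE scaled by the noise variance sigma^2
criterion_new = n_samples * mse / sigma2 + K * df
```

The two expressions generally disagree, and the log form does not match the paper's derivation, which assumes Gaussian noise with known variance.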

@@ -1424,6 +1424,7 @@ def fit(self, X, y, copy_X=True):

         R = y[:, np.newaxis] - np.dot(X, coef_path_)  # residuals
         mean_squared_error = np.mean(R ** 2, axis=0)
+        sigma = np.var(y)
Member
call it sigma2

@@ -1437,7 +1438,7 @@ def fit(self, X, y, copy_X=True):

         self.alphas_ = alphas_
         with np.errstate(divide='ignore'):
-            self.criterion_ = n_samples * np.log(mean_squared_error) + K * df
+            self.criterion_ = n_samples * mean_squared_error / sigma + K * df
Member

to me it should be

mean_squared_error / sigma2 + K * df

given that K is 2 or log(n)

ideally everything should be divided by n.

Did you check how it affects the example using BIC and AIC?

Author

if you want to divide everything by n then we should have

mean_squared_error / sigma2 + K * df / n_samples

It was a long while ago but I remember getting the same results as the R package from the authors when I used the corrected version.
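The fully normalized criterion discussed above can be sketched as follows (hypothetical helper name and toy numbers; this is not the scikit-learn API, just the formula `mse / sigma2 + K * df / n`):

```python
import numpy as np

def criterion_per_sample(mse, sigma2, df, n_samples, K):
    """Information criterion with everything divided by n.

    K = 2 gives AIC, K = np.log(n_samples) gives BIC
    (hypothetical helper, following the formula discussed above)."""
    return mse / sigma2 + K * df / n_samples

# Selecting a model along a (hypothetical) regularization path:
mses = np.array([1.9, 1.2, 1.05, 1.02])  # MSE of each candidate model
dfs = np.array([1, 2, 4, 7])             # degrees of freedom of each model
aic = criterion_per_sample(mses, sigma2=1.0, df=dfs, n_samples=100, K=2.0)
best = int(np.argmin(aic))               # index of the selected model
```

Note the penalty `K * df / n_samples` trades off against the fit term, so models with more degrees of freedom must lower the MSE enough to win.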

Member

ok agreed. can you see why travis is not happy and add a test that compares with the R solution? thx

@yuachen commented Oct 18, 2016

Thanks for your reply to #7692. In general, it is hard to estimate sigma2 with a single formula for the lasso. I looked at the R implementation in the selectiveInference package (page 12 of its documentation):

Estimate of error standard deviation. If NULL (default), this is estimated using
the mean squared residual of the full least squares fit when n >= 2p, and using
the standard deviation of y when n < 2p.

So if n >= 2p, it should be mean_squared_error * n_samples / (n_samples - df - 1).
If n < 2p, it should be var(y).

For the diabetes data in travis test, n >= 2p, the first formula should be used. This is probably why travis is not happy.

Also, sigma2 should be moved to the second term as a multiplicative factor; dividing by sigma2 is numerically unstable.
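The selectiveInference heuristic quoted above can be sketched like this (a hypothetical helper, not part of scikit-learn; the n >= 2p branch uses df = p and assumes the intercept accounts for the extra degree of freedom in the correction):

```python
import numpy as np

def estimate_sigma2(X, y):
    """Noise-variance estimate following the heuristic quoted above:
    mean squared residual of the full least-squares fit when n >= 2p,
    variance of y otherwise. (Sketch with a hypothetical name.)"""
    n, p = X.shape
    if n >= 2 * p:
        # Full least-squares fit; residual sum of squares over
        # residual degrees of freedom, i.e. n * mse / (n - df - 1)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        return np.sum(resid ** 2) / (n - p - 1)
    return np.var(y)
```

Multiplying the penalty term by this estimate, rather than dividing the fit term by it, avoids blow-ups when the estimate is near zero.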

@agramfort
Member

also it breaks the tests. Please see the Travis errors

@agramfort
Member

@mehmetbasbug @yuachen follow up on #9022

please review if you have time

thanks

@agramfort agramfort closed this Jun 6, 2017