[MRG+1] Fix BIC/AIC for Lasso #9022

Merged
merged 8 commits into from Jun 11, 2017

Conversation

Projects
None yet
4 participants
@agramfort
Member

agramfort commented Jun 6, 2017

Reference Issue

taking over #6080

What does this implement/fix? Explain your changes.

AIC/BIC criterion in LassoLarsIC was buggy ...

Any other comments?

Nope

- with np.errstate(divide='ignore'):
- self.criterion_ = n_samples * np.log(mean_squared_error) + K * df
+ eps64 = np.finfo('float64').eps
+ self.criterion_ = (n_samples * mean_squared_error / (sigma2 + eps64) +

This comment has been minimized.

@agramfort

agramfort Jun 6, 2017

Member

I added the eps64 to avoid the division by zero

@agramfort

agramfort Jun 6, 2017

Member

I added the eps64 to avoid the division by zero

This comment has been minimized.

@GaelVaroquaux

GaelVaroquaux Jun 10, 2017

Member

OK, it took me a while to understand that the reason that we don't need the log is that the MSE is actually already the log-likelihood.

@GaelVaroquaux

GaelVaroquaux Jun 10, 2017

Member

OK, it took me a while to understand that the reason that we don't need the log is that the MSE is actually already the log-likelihood.

This comment has been minimized.

@vene

vene Jun 10, 2017

Member

Maybe add a comment pointing any future reader to eqn 53/54 of https://web.stanford.edu/~hastie/Papers/dflasso.pdf? It's true the paper is already in the docstring though, so feel free to just ignore this comment.

@vene

vene Jun 10, 2017

Member

Maybe add a comment pointing any future reader to eqn 53/54 of https://web.stanford.edu/~hastie/Papers/dflasso.pdf? It's true the paper is already in the docstring though, so feel free to just ignore this comment.

This comment has been minimized.

@vene

vene Jun 10, 2017

Member

also maybe document somewhere that the criterion is off by a constant and scaled by n, compared to the actual value, but that this doesn't affect relative comparisons? The wikipedia page for AIC has a section calling this "software unreliability"...

@vene

vene Jun 10, 2017

Member

also maybe document somewhere that the criterion is off by a constant and scaled by n, compared to the actual value, but that this doesn't affect relative comparisons? The wikipedia page for AIC has a section calling this "software unreliability"...

@@ -414,7 +414,7 @@ def test_lasso_lars_ic():
rng = np.random.RandomState(42)
X = diabetes.data
y = diabetes.target
- X = np.c_[X, rng.randn(X.shape[0], 4)] # add 4 bad features
+ X = np.c_[X, rng.randn(X.shape[0], 5)] # add 4 bad features

This comment has been minimized.

@agramfort

agramfort Jun 6, 2017

Member

I need this to have enough alpha on the grid so alpha_bic > alpha_aic (otherwise it was equal)

@agramfort

agramfort Jun 6, 2017

Member

I need this to have enough alpha on the grid so alpha_bic > alpha_aic (otherwise it was equal)

This comment has been minimized.

@vene

vene Jun 8, 2017

Member

comment is now out of sync with code

@vene

vene Jun 8, 2017

Member

comment is now out of sync with code

This comment has been minimized.

@vene

vene Jun 8, 2017

Member

comment is out of sync with code

@vene

vene Jun 8, 2017

Member

comment is out of sync with code

- X = y.reshape(-1, 1)
- lars = linear_model.LassoLarsIC(normalize=False)
- assert_no_warnings(lars.fit, X, y)
- assert_true(np.any(np.isinf(lars.criterion_)))

This comment has been minimized.

@agramfort

agramfort Jun 6, 2017

Member

not needed anymore. There is no log anymore

@agramfort

agramfort Jun 6, 2017

Member

not needed anymore. There is no log anymore

@agramfort agramfort changed the title from Fix BIC/AIC for Lasso to [MRG] Fix BIC/AIC for Lasso Jun 6, 2017

mehmetbasbug and others added some commits Dec 23, 2015

correcting information criterion calculation in least_angle.py
The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA
@agramfort

This comment has been minimized.

Show comment
Hide comment
@agramfort

agramfort Jun 8, 2017

Member

good to go on my end. Travis is happy

Member

agramfort commented Jun 8, 2017

good to go on my end. Travis is happy

doc/whats_new.rst
@@ -317,6 +317,8 @@ Bug fixes
- Fixed a memory leak in our LibLinear implementation. :issue:`9024` by
:user:`Sergei Lebedev <superbobry>`
+ - Fix AIC/BIC criterion computation in LassoLarsIC by `Alexandre Gramfort`_

This comment has been minimized.

@vene

vene Jun 8, 2017

Member

space before?

@vene

vene Jun 8, 2017

Member

space before?

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Jun 10, 2017

Member

LGTM.

+1 for merge

Member

GaelVaroquaux commented Jun 10, 2017

LGTM.

+1 for merge

@GaelVaroquaux GaelVaroquaux changed the title from [MRG] Fix BIC/AIC for Lasso to [MRG+1] Fix BIC/AIC for Lasso Jun 10, 2017

@agramfort

This comment has been minimized.

Show comment
Hide comment
@agramfort

agramfort Jun 10, 2017

Member

@vene you give me the other +1 plz?

Member

agramfort commented Jun 10, 2017

@vene you give me the other +1 plz?

@@ -1456,6 +1456,7 @@ def fit(self, X, y, copy_X=True):
R = y[:, np.newaxis] - np.dot(X, coef_path_) # residuals
mean_squared_error = np.mean(R ** 2, axis=0)

This comment has been minimized.

@vene

vene Jun 10, 2017

Member

here we take the mean of sum of squares, only to multiply by n_samples afterwards. It seems mean_squared_error is not used elsewhere in the local scope, we could save some cycles.

@vene

vene Jun 10, 2017

Member

here we take the mean of sum of squares, only to multiply by n_samples afterwards. It seems mean_squared_error is not used elsewhere in the local scope, we could save some cycles.

@vene

This comment has been minimized.

Show comment
Hide comment
@vene

vene Jun 10, 2017

Member

Hmm I'd like a stronger test but I don't have any good ideas of how to get one... this was subtle until I found the equations in the paper, the wikipedia page was not super helpful...

Member

vene commented Jun 10, 2017

Hmm I'd like a stronger test but I don't have any good ideas of how to get one... this was subtle until I found the equations in the paper, the wikipedia page was not super helpful...

@vene

This comment has been minimized.

Show comment
Hide comment
@vene

vene Jun 10, 2017

Member

LGTM apart from the minor comments. @agramfort , if you're busy but agree with my comments I'll be happy to make the changes myself and merge.

Member

vene commented Jun 10, 2017

LGTM apart from the minor comments. @agramfort , if you're busy but agree with my comments I'll be happy to make the changes myself and merge.

@agramfort

This comment has been minimized.

Show comment
Hide comment
@agramfort

agramfort Jun 11, 2017

Member
Member

agramfort commented Jun 11, 2017

vene and others added some commits Jun 11, 2017

Merge pull request #12 from vene/fixaic
DOC comments and docstring on criterion computation
@vene

This comment has been minimized.

Show comment
Hide comment
@vene

vene Jun 11, 2017

Member

My comment have been addressed and @GaelVaroquaux 's +1 should still stand. Travis passes; Appveyor failures seem irrelevant and also present on master. (AFAIK appveyor is just back from an outage.)

Merging. Thanks @agramfort and @mehmetbasbug !

Member

vene commented Jun 11, 2017

My comment have been addressed and @GaelVaroquaux 's +1 should still stand. Travis passes; Appveyor failures seem irrelevant and also present on master. (AFAIK appveyor is just back from an outage.)

Merging. Thanks @agramfort and @mehmetbasbug !

@vene vene merged commit 6a35622 into scikit-learn:master Jun 11, 2017

2 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

AishwaryaRK added a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[FIX] BIC/AIC for Lasso (#9022)
* correcting information criterion calculation in least_angle.py

The information criterion calculation is not compatible with the original paper 
Zou, Hui, Trevor Hastie, and Robert Tibshirani. "On the “degrees of freedom” of the lasso." The Annals of Statistics 35.5 (2007): 2173-2192.
APA

* FIX : fix AIC/BIC computation in LassoLarsIC

* update what's new

* fix test

* fix test

* address comments

* DOC comments and docstring on criterion computation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment