[MRG + 2] Allow f_regression to accept a sparse matrix with centering #8065

Merged
merged 3 commits into from Dec 20, 2016

Conversation

@acadiansith
Contributor

@acadiansith acadiansith commented Dec 16, 2016

Reference Issue

N/A

What does this implement/fix? Explain your changes.

f_regression currently doesn't accept sparse matrices when center=True to avoid allocating a dense matrix of the centered X values, but the computation can be completed without this dense matrix. The numerators can take advantage of the observation that E[(X - E[X])(Y - E[Y])] = E[X(Y - E[Y])], and the denominator can use E[(X - E[X])^2] = E[X^2] - E[X]^2.
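The two identities above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: the helper name `sparse_centered_corr` is hypothetical, and it assumes NumPy and SciPy are available. It computes the per-feature Pearson correlation with `y` while keeping `X` sparse throughout.

```python
import numpy as np
import scipy.sparse as sp


def sparse_centered_corr(X, y):
    """Hypothetical helper: Pearson correlation of each column of a
    sparse X with y, computed without building a dense centered X."""
    X = sp.csr_matrix(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    n_samples = X.shape[0]

    # Centering y is cheap because y is a dense 1-D vector.
    y_centered = y - y.mean()

    # Numerator: sum_i x_ij * (y_i - mean(y)).
    # By E[(X - E[X])(Y - E[Y])] = E[X(Y - E[Y])], X needs no centering.
    numerator = np.asarray(X.T @ y_centered).ravel()

    # Denominator for X: sum_i (x_ij - mean_j)^2 = sum_i x_ij^2 - n * mean_j^2.
    x_means = np.asarray(X.mean(axis=0)).ravel()
    x_sq_sums = np.asarray(X.multiply(X).sum(axis=0)).ravel()
    x_norms = np.sqrt(x_sq_sums - n_samples * x_means ** 2)

    y_norm = np.linalg.norm(y_centered)
    return numerator / (x_norms * y_norm)
```

Only the element-wise square of `X` and a sparse matrix-vector product are needed, so sparsity is preserved. A constant column would give a zero denominator; this sketch does not handle that case.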

I've also included a unit test to verify that the output is the same for sparse and dense versions of a matrix.
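A check of this kind might look roughly like the sketch below. It is not the test added in this PR; the random data, shapes, and sparsity level are assumptions chosen for illustration, and it requires a scikit-learn version that includes this change.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.feature_selection import f_regression

# Assumed toy data: 50 samples, 6 features, roughly 60% zeros.
rng = np.random.RandomState(0)
X_dense = rng.rand(50, 6) * (rng.rand(50, 6) > 0.6)
y = rng.rand(50)

# Same matrix, once dense and once in CSR format, both with centering.
F_dense, p_dense = f_regression(X_dense, y, center=True)
F_sparse, p_sparse = f_regression(sp.csr_matrix(X_dense), y, center=True)

# The F statistics and p-values should agree up to floating-point error.
np.testing.assert_allclose(F_sparse, F_dense)
np.testing.assert_allclose(p_sparse, p_dense)
```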

Any other comments?

The output is the same as before (I've checked by hand), but I don't have any tests included to confirm this.

acadiansith added 2 commits Dec 16, 2016
Allows f_regression to accept a sparse matrix when centering=True.
@amueller
Member

@amueller amueller commented Dec 16, 2016

Thanks, this looks nice. We have a bit of a backlog on reviews, though.

@agramfort
Member

@agramfort agramfort commented Dec 18, 2016

LGTM

+1 for merge after a what's new update.

thx @acadiansith

@agramfort agramfort changed the title [MRG] Allow f_regression to accept a sparse matrix with centering [MRG+1] Allow f_regression to accept a sparse matrix with centering Dec 18, 2016
Member

@raghavrv raghavrv left a comment

Pending a minor clarification (for my understanding), this LGTM. There is no algorithmic change and we now support sparse...

Thanks!!

n_samples = X.shape[0]

# compute centered values
# note that E[(x - mean(x))*(y - mean(y))] = E[x*(y - mean(y))], so we

@raghavrv

raghavrv Dec 19, 2016
Member

This comment is applicable only when y is mean-centered, correct? In which case it would be E[x*y]? (Sorry if I'm misunderstanding.)

@TomDLT

TomDLT Dec 19, 2016
Member

The comment is applicable even when y is not centered.
Yet you are right: here we compute E[x*(y - mean(y))] by first centering y and then computing E[x*y].

@raghavrv

raghavrv Dec 19, 2016
Member

Ah yes sorry my math is a bit rusty... Thanks heaps for the clarification (online and offline)!
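The point clarified in this thread can be checked numerically with a small sketch. The data values below are made up for illustration; only the Python standard library is used.

```python
from statistics import fmean

# Made-up example data; x has zeros to mimic a sparse feature.
x = [0.0, 2.0, 0.0, 5.0, 1.0]
y = [3.0, 1.0, 4.0, 1.0, 5.0]

x_mean = fmean(x)
y_mean = fmean(y)
y_centered = [yi - y_mean for yi in y]

# E[(x - mean(x)) * (y - mean(y))]: both variables centered.
both_centered = fmean([(xi - x_mean) * yc for xi, yc in zip(x, y_centered)])

# E[x * (y - mean(y))]: only y centered, x left untouched.
only_y_centered = fmean([xi * yc for xi, yc in zip(x, y_centered)])

# E[x * y]: neither centered. This differs unless y already has zero mean.
raw_product = fmean([xi * yi for xi, yi in zip(x, y)])

print(both_centered, only_y_centered, raw_product)
```

The first two quantities agree because the extra term mean(x) * E[y - mean(y)] is zero for any y. The third matches them only when y has been centered beforehand, which is what the code under review does before taking the product.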

@raghavrv raghavrv changed the title [MRG+1] Allow f_regression to accept a sparse matrix with centering [MRG + 2] Allow f_regression to accept a sparse matrix with centering Dec 19, 2016
@raghavrv
Member

@raghavrv raghavrv commented Dec 19, 2016

This needs a whatsnew entry as observed by @agramfort ...

@TomDLT
TomDLT approved these changes Dec 19, 2016
@TomDLT TomDLT merged commit 456fb56 into scikit-learn:master Dec 20, 2016
3 checks passed
ci/circleci Your tests passed on CircleCI!
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@TomDLT
Member

@TomDLT TomDLT commented Dec 20, 2016

Thanks @acadiansith

sergeyf added a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
…scikit-learn#8065)

* Updated centering for f_regression

Allows f_regression to accept a sparse matrix when centering=True.

* Fixed E226 spacing issue.

* Added f_regression sparse update to whats_new.rst
@Przemo10 Przemo10 mentioned this pull request Mar 17, 2017
Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
…scikit-learn#8065)

* Updated centering for f_regression

Allows f_regression to accept a sparse matrix when centering=True.

* Fixed E226 spacing issue.

* Added f_regression sparse update to whats_new.rst
NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
…scikit-learn#8065)

* Updated centering for f_regression

Allows f_regression to accept a sparse matrix when centering=True.

* Fixed E226 spacing issue.

* Added f_regression sparse update to whats_new.rst
paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
…scikit-learn#8065)

* Updated centering for f_regression

Allows f_regression to accept a sparse matrix when centering=True.

* Fixed E226 spacing issue.

* Added f_regression sparse update to whats_new.rst
maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
…scikit-learn#8065)

* Updated centering for f_regression

Allows f_regression to accept a sparse matrix when centering=True.

* Fixed E226 spacing issue.

* Added f_regression sparse update to whats_new.rst

5 participants