[MRG + 2] Allow f_regression to accept a sparse matrix with centering #8065
Conversation
Allows f_regression to accept a sparse matrix when centering=True.
thanks this looks nice. We have a bit of a backlog on reviews though |
LGTM +1 for merge after a what's new update. thx @acadiansith |
Thanks!! |
n_samples = X.shape[0] | ||
|
||
# compute centered values | ||
# note that E[(x - mean(x))*(y - mean(y))] = E[x*(y - mean(y))], so we |
raghavrv
Dec 19, 2016
Member
This comment is applicable only when y is mean centered correct? In which case it would be E[x*y]
? (Sorry if I'm misunderstanding)
This comment is applicable only when y is mean centered correct? In which case it would be E[x*y]
? (Sorry if I'm misunderstanding)
TomDLT
Dec 19, 2016
Member
The comment is applicable even when y
is not centered.
Yet you are right, here we compute E[x*(y - mean(y))
by first centering y
and then computing E[x*y]
.
The comment is applicable even when y
is not centered.
Yet you are right, here we compute E[x*(y - mean(y))
by first centering y
and then computing E[x*y]
.
raghavrv
Dec 19, 2016
Member
Ah yes sorry my math is a bit rusty... Thanks heaps for the clarification (online and offline)!
Ah yes sorry my math is a bit rusty... Thanks heaps for the clarification (online and offline)!
This needs a whatsnew entry as observed by @agramfort ... |
Thanks @acadiansith |
sergeyf
added a commit
to sergeyf/scikit-learn
that referenced
this pull request
Feb 28, 2017
…scikit-learn#8065) * Updated centering for f_regression Allows f_regression to accept a sparse matrix when centering=True. * Fixed E226 spacing issue. * Added f_regression sparse update to whats_new.rst
Closed
Sundrique
added a commit
to Sundrique/scikit-learn
that referenced
this pull request
Jun 14, 2017
…scikit-learn#8065) * Updated centering for f_regression Allows f_regression to accept a sparse matrix when centering=True. * Fixed E226 spacing issue. * Added f_regression sparse update to whats_new.rst
NelleV
added a commit
to NelleV/scikit-learn
that referenced
this pull request
Aug 11, 2017
…scikit-learn#8065) * Updated centering for f_regression Allows f_regression to accept a sparse matrix when centering=True. * Fixed E226 spacing issue. * Added f_regression sparse update to whats_new.rst
paulha
added a commit
to paulha/scikit-learn
that referenced
this pull request
Aug 19, 2017
…scikit-learn#8065) * Updated centering for f_regression Allows f_regression to accept a sparse matrix when centering=True. * Fixed E226 spacing issue. * Added f_regression sparse update to whats_new.rst
maskani-moh
added a commit
to maskani-moh/scikit-learn
that referenced
this pull request
Nov 15, 2017
…scikit-learn#8065) * Updated centering for f_regression Allows f_regression to accept a sparse matrix when centering=True. * Fixed E226 spacing issue. * Added f_regression sparse update to whats_new.rst
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Reference Issue
N/A
What does this implement/fix? Explain your changes.
f_regression
currently doesn't accept sparse matrices whencenter=True
to avoid allocating a dense matrix of the centeredX
values, but the computation can be completed without this dense matrix. The numerators can take advantage of the observation thatE[(X - E[X])(Y - E[Y])] = E[X(Y - E[Y])]
, and the denominator can useE[(X - E[X])^2] = E[X^2] - E[X]^2
.I've also included a unit test to verify that the output is the same for sparse and dense versions of a matrix.
Any other comments?
The output is the same as before (I've checked by hand), but I don't have any tests included to confirm this.