New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Issue #8173 - Passing n_neighbors to compute MI #8181

Merged
merged 1 commit into from Jan 19, 2017

Conversation

Projects
None yet
4 participants
@glemaitre
Copy link
Contributor

glemaitre commented Jan 10, 2017

Reference Issue

Fixes #8173

What does this implement/fix? Explain your changes.

The parameters n_neighbors is passed to the function compute_mi in _estimate_mi.
Previously this parameter was not given to compute_mi. The user could not set this
parameter as indicated in the documentation.

Any other comments?

A single test has been added in a classification case.
The computation of the mutual information was in fact correct since it was already tested
for different neighbours

@glemaitre glemaitre changed the title pass n_neighbors in _estimate_mi [MRG] Issue #8173 - Passing n_neighbors to compute MI Jan 10, 2017

@glemaitre

This comment has been minimized.

Copy link
Contributor Author

glemaitre commented Jan 10, 2017

@jnothman Do you see additional unit test(s) or this is enough.

@jnothman
Copy link
Member

jnothman left a comment

Just to be sure, can you please remove the default n_neighbors from _compute_mi if this harms nothing?

@@ -5,7 +5,8 @@
from scipy.sparse import csr_matrix

from sklearn.utils.testing import (assert_array_equal, assert_almost_equal,
assert_false, assert_raises, assert_equal)
assert_false, assert_raises, assert_equal,
assert_array_almost_equal)

This comment has been minimized.

@jnothman

jnothman Jan 11, 2017

Member

I think the preference is for assert_allclose

@glemaitre

This comment has been minimized.

Copy link
Contributor Author

glemaitre commented Jan 11, 2017

Just to be sure, can you please remove the default n_neighbors from _compute_mi if this harms nothing?

I did it and then I had a second thought. _compute_mi is a wrapper to choose either the discrete or continuous version of the MI. There is no need for n_neighbors for the discrete. So it seems odd to required absolutely n_neighbors when calling the function for discrete case. I would assume that this is the reason of having a default value.

assert_allclose(mi, [0.06987399, 0.03197151, 0.21946924], rtol=1e-6)
mi_7 = mutual_info_classif(X, y, discrete_features=[2], n_neighbors=7,
random_state=0)
assert_allclose(mi_7, [0.0735522, 0.0343685, 0.2194692], rtol=1e-5)

This comment has been minimized.

@ogrisel

ogrisel Jan 17, 2017

Member

I don't really like tests that hardcode numerical values that are true only for randomly generated but with fixed see data. Is there a better way to test the impact of n_neighbors without hardcoding the values?

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Jan 17, 2017

for n_neighbors in [5, 7, 9]:
mi_nn = mutual_info_classif(X, y, discrete_features=[2],
n_neighbors=n_neighbors, random_state=0)
# Check that the continuous values have an higher MI

This comment has been minimized.

@jnothman

jnothman Jan 18, 2017

Member

I think it would help to say "with greater n_neighbors"

@glemaitre glemaitre force-pushed the glemaitre:is/8173 branch 2 times, most recently from ce35545 to 05d0585 Jan 18, 2017

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Jan 18, 2017

LGTM

@jnothman jnothman changed the title [MRG] Issue #8173 - Passing n_neighbors to compute MI [MRG+1] Issue #8173 - Passing n_neighbors to compute MI Jan 18, 2017

@glemaitre

This comment has been minimized.

Copy link
Contributor Author

glemaitre commented Jan 18, 2017

@ogrisel for a second review

@glemaitre glemaitre force-pushed the glemaitre:is/8173 branch from 05d0585 to 1cf33ec Jan 19, 2017

@raghavrv

This comment has been minimized.

Copy link
Member

raghavrv commented Jan 19, 2017

LGTM thanks @glemaitre

@raghavrv raghavrv merged commit aaebee1 into scikit-learn:master Jan 19, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

sergeyf added a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017

@Przemo10 Przemo10 referenced this pull request Mar 17, 2017

Closed

update fork (#1) #8606

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment