New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Fixes #10284 Added store_cv_values to RidgeClassifierCV and a test. #10297

Merged
merged 12 commits into from Mar 15, 2018

Conversation

Projects
None yet
4 participants
@mabelvj
Contributor

mabelvj commented Dec 12, 2017

Added the parameter and implemented a test.

Please let me know if it's ok!

Fixes #10284
closes #6769

@jnothman

This comment has been minimized.

Member

jnothman commented Dec 13, 2017

Please put "Fixes #10284" or similar in the PR description next time. This tells github to close the issue when the PR is merged.

@jnothman jnothman added the Bug label Dec 13, 2017

@jnothman

Apart from a bit of confusion about the shape of y, and a terminology nitpick, looks good to me

r = RidgeClassifierCV(alphas=alphas, store_cv_values=True)
# with len(y.shape) == 1

This comment has been minimized.

@jnothman

jnothman Dec 13, 2017

Member

len(y.shape) == 2.

# with len(y.shape) == 1
r.fit(x, y)
n_responses = y.shape[1]

This comment has been minimized.

@jnothman

jnothman Dec 13, 2017

Member

we would usually call these "outputs" or "targets" rather than "responses".

This comment has been minimized.

@mabelvj

mabelvj Dec 13, 2017

Contributor

Thought the same, but followed nomenclature in the test for RidgeCV in Line 577. I'll change it to target.

This comment has been minimized.

@mabelvj

mabelvj Dec 13, 2017

Contributor

BTW line 579 (I don't know how to reference it from here) seems to have a typo (double assignation). Should I change it?

rng = rng = np.random.RandomState(42)

@jnothman

This comment has been minimized.

Member

jnothman commented Dec 13, 2017

@mabelvj mabelvj changed the title from added store_cv_values to RidgeClassifierCV and a test. FIX issue #10284 linear_model.RidgeClassifierCV's Parameter store_cv_values to Fixes #10284 Added store_cv_values to RidgeClassifierCV and a test. Dec 13, 2017

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Dec 14, 2017

Done!

r = RidgeClassifierCV(alphas=alphas, store_cv_values=True)
# with len(y.shape) == 2

This comment has been minimized.

@jnothman

jnothman Dec 14, 2017

Member

But this is the case with both tests. What do you intend to distinguish between the different ys?

This comment has been minimized.

@mabelvj

mabelvj Dec 14, 2017

Contributor

I was trying to replicate the test for RidgeCV:

def test_ridgecv_store_cv_values():
    # Test _RidgeCV's store_cv_values attribute.
    rng = np.random.RandomState(42)

    n_samples = 8
    n_features = 5
    x = rng.randn(n_samples, n_features)
    alphas = [1e-1, 1e0, 1e1]
    n_alphas = len(alphas)

    r = RidgeCV(alphas=alphas, store_cv_values=True)

    # with len(y.shape) == 1
    y = rng.randn(n_samples)
    r.fit(x, y)
    assert_equal(r.cv_values_.shape, (n_samples, n_alphas))

    # with len(y.shape) == 2
    n_responses = 3
    y = rng.randn(n_samples, n_responses)
    r.fit(x, y)
    assert_equal(r.cv_values_.shape, (n_samples, n_responses, n_alphas))

However, for the first case # with len(y.shape) == 1I found that RidgeClassifierCV always returns cv_values with 3 dimensions. That's part of the confusion.

I guess then two tests are unnecessary.

This comment has been minimized.

@mabelvj

mabelvj Dec 15, 2017

Contributor

Done! Removed it.

@jnothman

Well I don't mind having a test for y.ndim==1, but I think you were passing in a 2d array with a single column before.

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Dec 17, 2017

Ok, I got it. It's the fact that RidgeCV returned cv_values len(r.cv_values_)==2 for y with len(y.shape)==1that confused me, whereas RidgeClassifierCV returns cv_values with len(r.cv_values_)==3.

I added the test for RidgeClassifierCV for the case of y with shape (5,) and (5,3).

Hope it's ok now.

Fixex #10284. Added store_cv_values to RidgeClassifierCV and a test. …
…ISSUE: Linear_model.RidgeClassifierCV's Parameter store_cv_values
@jnothman

This comment has been minimized.

Member

jnothman commented Dec 18, 2017

DENSE_FILTER = lambda X: X
SPARSE_FILTER = lambda X: sp.csr_matrix(X)
def DENSE_FILTER(X): return X

This comment has been minimized.

@jnothman

jnothman Dec 18, 2017

Member

While I said, sure, get rid of the extra rng =, in general, we avoid changing unrelated things in PRs. They make it harder to review and may cause merge conflicts with other work.

This comment has been minimized.

@mabelvj

mabelvj Dec 19, 2017

Contributor

Sorry, pep8 made that. Generally, I reverse those changes but missed this.

each alpha should be stored in the `cv_values_` attribute (see
below). This flag is only compatible with `cv=None` (i.e. using
Generalized Cross-Validation).
Attributes
----------
cv_values_ : array, shape = [n_samples, n_alphas] or \

This comment has been minimized.

@jnothman

jnothman Dec 19, 2017

Member

So we need to fix this documentation. We never seem to get ndim==2

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Dec 23, 2017

Changed the documentation. Now it reflects that it only returns cv_values with 3 dimensions.

@jnothman

LGTM

@jnothman

This comment has been minimized.

Member

jnothman commented Dec 24, 2017

Please add an entry to the change log at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

@jnothman jnothman changed the title from Fixes #10284 Added store_cv_values to RidgeClassifierCV and a test. to [MRG+1] Fixes #10284 Added store_cv_values to RidgeClassifierCV and a test. Dec 24, 2017

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Dec 24, 2017

Great! I didn't know I had to do it. Should I do it for the other PR I worked in? Fixes #10216 export_graphviz should work with a sklearn.tree._tree.Tree #10234

@jnothman

This comment has been minimized.

Member

jnothman commented Dec 24, 2017

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Dec 25, 2017

Ok! It was just to know how to proceed in the future. Now I have it clear. I'm new to all this and learning a lot! Thanks clarifying it 👍

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Dec 25, 2017

When reviewing documentation I have found one format error in the paragraph for cv_values_ in RidgeClassifierCV which makes the line breaks to appear misplaced 3.2.4.1.10. sklearn.linear_model.RidgeClassifierCV
.

I thought we could fix it now we are at it.

mabelvj added some commits Dec 27, 2017

@@ -195,6 +195,12 @@ Classifiers and regressors
callable and b) the input to the NearestNeighbors model is sparse.
:issue:`9579` by :user:`Thomas Kober <tttthomasssss>`.
- Fixed a bug in :class:`linear_model.RidgeClassifierCV` where
the parameter `store_cv_values` was not immplemented though it was documented

This comment has been minimized.

@jnothman

jnothman Dec 31, 2017

Member

Use double backticks for fixed width font

store_cv_values : boolean, default=False
Flag indicating if the cross-validation values corresponding to
each alpha should be stored in the `cv_values_` attribute (see
below). This flag is only compatible with `cv=None` (i.e. using

This comment has been minimized.

@jnothman

jnothman Dec 31, 2017

Member

Double backticks

This comment has been minimized.

@mabelvj

mabelvj Dec 31, 2017

Contributor

ok! Dind't touch this, I copied it from RidgeCV. Should I change documentation for store_cv_values_ and cv_values_ too in other parts of the file to keep consistency??

This comment has been minimized.

@jnothman

jnothman Dec 31, 2017

Member

I suppose so

@mabelvj

This comment has been minimized.

Contributor

mabelvj commented Jan 5, 2018

Done @jnothman!

@jnothman

This comment has been minimized.

Member

jnothman commented Jan 6, 2018

Thanks. It's best we wait for another review before merge.

@glemaitre

Could you solve the issue with the what new and address the comments

def DENSE_FILTER(X): return X
def SPARSE_FILTER(X): return sp.csr_matrix(X)

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

Could you reverse that

DENSE_FILTER = lambda X: X
SPARSE_FILTER = lambda X: sp.csr_matrix(X)
def DENSE_FILTER(X): return X

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

Reverse this

@@ -352,7 +355,7 @@ def _test_ridge_loo(filter_):
assert_equal(ridge_gcv2.alpha_, alpha_)
# check that we get same best alpha with custom score_func
func = lambda x, y: -mean_squared_error(x, y)
def func(x, y): return -mean_squared_error(x, y)

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

reverse this change

n_targets = 3
y = rng.randn(n_samples, n_targets)
r.fit(x, y)
assert_equal(r.cv_values_.shape, (n_samples, n_targets, n_alphas))

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor
assert r.cv_values_.shape == (n_samples, n_targets, n_alphas) 
def test_ridge_classifier_cv_store_cv_values():
# Test RidgeClassifier

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

remove this comment

# with len(y.shape) == 1
n_targets = 1
r.fit(x, y)
assert_equal(r.cv_values_.shape, (n_samples, n_targets, n_alphas))

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

plain assert

r.fit(x, y)
assert_equal(r.cv_values_.shape, (n_samples, n_responses, n_alphas))
assert_equal(r.cv_values_.shape, (n_samples, n_targets, n_alphas))

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

plain assert

@@ -27,7 +27,7 @@ random sampling procedures.
- :class:`metrics.roc_auc_score` (bug fix)
- :class:`metrics.roc_curve` (bug fix)
- :class:`neural_network.BaseMultilayerPerceptron` (bug fix)
- :class:`neural_network.MLPRegressor` (bug fix)
- :class:`neural_network.MLPRidgeClassifierCV_store_cv_values_issue_10284Regressor` (bug fix)

This comment has been minimized.

@glemaitre

glemaitre Feb 8, 2018

Contributor

ugh, it seems that something went wrong there :)

mabelvj and others added some commits Feb 9, 2018

@qinhanmin2014

LGTM, I've reverted some unrelevant changes (some blank lines). Will merge when green.

@qinhanmin2014 qinhanmin2014 merged commit eb1a3c4 into scikit-learn:master Mar 15, 2018

5 of 6 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: python2 Your tests passed on CircleCI!
Details
ci/circleci: python3 Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
lgtm analysis: Python No alert changes
Details
@qinhanmin2014

This comment has been minimized.

Member

qinhanmin2014 commented Mar 15, 2018

Thanks @mabelvj :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment