
[MRG+1] Add common tests for consistent decision_function behavior in binary case #10500

Merged

Conversation

NarineK (Contributor) commented on Jan 18, 2018:

Reference Issues/PRs

Fixes: #10175

As discussed in #10175, I added checks/tests for the binary case. To do so, I created a helper function for check_classifiers_classes so that the existing code can be reused and both binary string and int labels are supported: ['one', 'two'] and [1, -1]. I also added a check for decision_function.
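
For readers unfamiliar with the check, the core of the binary-case consistency test can be sketched roughly as follows (illustrative only; the helper and variable names here are not the exact ones merged into estimator_checks.py):

    import numpy as np
    from numpy.testing import assert_array_equal

    def check_binary_decision_function(classifier, X):
        # Sketch: in the binary case decision_function should return one
        # score per sample, and thresholding the scores at zero should
        # reproduce the labels returned by predict().
        y_pred = classifier.predict(X)
        decision = classifier.decision_function(X)
        assert decision.shape == (X.shape[0],)
        dec_pred = (decision.ravel() > 0).astype(int)
        assert_array_equal(classifier.classes_[dec_pred], y_pred)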

Narine Kokhlikyan added some commits Jan 18, 2018

NarineK (Author, Contributor) commented on Jan 18, 2018

jnothman (Member) left a comment:

Thanks

assert_equal(decision.shape, (n_samples, ))
dec_pred = (decision.ravel() > 0).astype(np.int)
assert_array_equal(classifier.classes_[dec_pred], y_pred)
elif (len(classes) == 3 and

jnothman (Member) commented on Feb 3, 2018:

why not just elif not isinstance(classifier, BaseLibSVM)? Why do we need to be sure there are 3?

NarineK (Author, Contributor) commented on Feb 6, 2018:

Thanks, yes, we don't need len(classes) == 3, especially if we use getattr(classifier, 'decision_function_shape', 'ovr') == 'ovr' instead of isinstance.

dec_pred = (decision.ravel() > 0).astype(np.int)
assert_array_equal(classifier.classes_[dec_pred], y_pred)
elif (len(classes) == 3 and
# 1on1 of LibSVM works differently

jnothman (Member) commented on Feb 3, 2018:

If decision_function_shape='ovr', which is the default, this logic should still work. So instead of the current isinstance, you could use and getattr(classifier, 'decision_function_shape', 'ovr') == 'ovr'...?
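
A rough sketch of how the branching could look with that change (names follow the diff hunks quoted in this thread, so this is illustrative rather than the exact merged code):

    if len(classes) == 2:
        # binary: one score per sample, thresholded at zero
        assert decision.shape == (n_samples,)
        dec_pred = (decision.ravel() > 0).astype(int)
        assert_array_equal(classifier.classes_[dec_pred], y_pred)
    elif getattr(classifier, 'decision_function_shape', 'ovr') == 'ovr':
        # multiclass ovr (including SVC with its default shape): one column
        # per class, and the argmax should agree with predict()
        assert decision.shape == (n_samples, len(classes))
        decision_y = np.argmax(decision, axis=1).astype(int)
        assert_array_equal(classifier.classes_[decision_y], y_pred)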

not isinstance(classifier, BaseLibSVM)):
assert_equal(decision.shape, (n_samples, len(classes)))
decision_y = np.argmax(decision, axis=1).astype(int)
assert_array_equal(classifier.classes_[decision_y], y_pred)

jnothman (Member) commented on Feb 3, 2018:

we try to give clear error messages to estimator developers to tell them why a check failed. Could you please add one here?

if len(classes) == 2:
assert_equal(decision.shape, (n_samples, ))
dec_pred = (decision.ravel() > 0).astype(np.int)
assert_array_equal(classifier.classes_[dec_pred], y_pred)

jnothman (Member) commented on Feb 3, 2018:

we try to give clear error messages to estimator developers to tell them why a check failed. Could you please add one here?

# This is a pathological data set for ComplementNB.
assert_array_equal(np.unique(y), np.unique(y_pred))
if np.any(classifier.classes_ != classes):
print("Unexpected classes_ attribute for %r: "

jnothman (Member) commented on Feb 3, 2018:

This is not covered by tests. It would be a good idea to add a test in test_estimator_checks that covers the different cases in the new checks (i.e. make sure that they fail with appropriate messages when an estimator is bad, since our library plus code coverage naturally exercises the good cases).
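
A hedged sketch of such a negative test; BadDecisionFunctionClassifier is a hypothetical misbehaving estimator written only for the test, not something in the codebase:

    from sklearn.utils.estimator_checks import check_classifiers_classes
    from sklearn.utils.testing import assert_raises_regex

    def test_check_classifiers_classes_fails_for_bad_estimator():
        # A classifier whose decision_function disagrees with predict()
        # should make the common check fail with an informative message.
        assert_raises_regex(AssertionError,
                            "decision_function does not match",
                            check_classifiers_classes,
                            "BadDecisionFunctionClassifier",
                            BadDecisionFunctionClassifier())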

X, y = make_blobs(n_samples=30, random_state=0, cluster_std=0.1)
X, y = shuffle(X, y, random_state=7)
X = StandardScaler().fit_transform(X)
X_m, y_m = make_blobs(n_samples=30, random_state=0, cluster_std=0.1)

jnothman (Member) commented on Feb 3, 2018:

suffixes like _m and _b are hard to understand. Better off just using full words

NarineK (Author, Contributor) commented on Feb 6, 2018:

Thank you for the review comments, @jnothman ! I'll address those.

msg_shape="Predicted shape of `decision_function` mismatches the expected " \
"shape: expected '%s', got '%s'"
if len(classes) == 2:
assert_equal(decision.shape, (n_samples, ), msg_shape % \

jnothman (Member) commented on Feb 7, 2018:

For future reference, we no longer need to use helpers like this, and can use a bare assert instead.
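
For example, under pytest the message helper can be dropped and the check written directly with a bare assert (a sketch using the names from the hunk above):

    assert decision.shape == (n_samples,), (
        "Expected decision_function shape %r, got %r"
        % ((n_samples,), decision.shape))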

jnothman (Member) commented on Feb 7, 2018:

I'm not sure we need to test shape separately from content.

NarineK (Author, Contributor) commented on Feb 8, 2018:

Shape can be seen as a precheck prior to content. I can remove it if necessary.

jnothman (Member) commented on Feb 8, 2018:

assert_array_equal will do this pre-check
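
In other words, a single content assertion suffices (a sketch, assuming the names from the hunks above):

    # If decision had the wrong shape, dec_pred would have the wrong length
    # and assert_array_equal would fail immediately with a "(shapes ...
    # mismatch)" message, so a separate shape assertion is largely redundant.
    assert_array_equal(classifier.classes_[dec_pred], y_pred)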

if len(classes) == 2:
assert_equal(decision.shape, (n_samples, ), msg_shape % \
((n_samples, ), decision.shape))
dec_pred = (decision.ravel() > 0).astype(np.int)

jnothman (Member) commented on Feb 7, 2018:

We should include a row of zeros in X so that we are more likely to test the boundary condition.
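
For instance (a sketch only; the actual test data comes from make_blobs as quoted elsewhere in the diff):

    # A sample at the origin makes it more likely that decision_function
    # returns exactly 0 for some row, exercising the `> 0` threshold.
    X = np.vstack([X, np.zeros((1, X.shape[1]))])
    y = np.append(y, y[0])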

((n_samples, ), decision.shape))
dec_pred = (decision.ravel() > 0).astype(np.int)
dec_exp = classifier.classes_[dec_pred]
msg="Predicted class mismatch for classifier %r: " \

jnothman (Member) commented on Feb 7, 2018:

Need to mention decision_function in this message:
"DF does not match binary classifier prediction" or similar

jnothman (Member) commented on Feb 7, 2018:

Need space around = for assignment

NarineK (Author, Contributor) commented on Feb 8, 2018:

sure

elif getattr(classifier, 'decision_function_shape', 'ovr') == 'ovr':
assert_equal(decision.shape, (n_samples, len(classes)), msg_shape % \
((n_samples, len(classes)), decision.shape))
decision_y = np.argmax(decision, axis=1).astype(int)

jnothman (Member) commented on Feb 7, 2018:

We might want to try to trigger a tie by repeating the test with the following as the only training/test data.

X = [[0], [0], [0]]
y = [0, 1, 2]

Maybe this is too fussy?

NarineK (Author, Contributor) commented on Feb 10, 2018:

This seems to me to be too fussy. Since the features are identical, the predicted labels are identical as well: y_pred is array(['one', 'one', 'one']) while the expected labels are array(['one', 'two', 'three']).

jnothman (Member) commented on Feb 10, 2018:

Yes, I suppose I meant to check that it gave the first class in all cases, but you're right... Don't worry about it I think

((n_samples, len(classes)), decision.shape))
decision_y = np.argmax(decision, axis=1).astype(int)
assert_array_equal(classifier.classes_[decision_y], y_pred, msg_shape % \
((n_samples, len(classes)), decision.shape))

jnothman (Member) commented on Feb 7, 2018:

Why msg_shape when testing predictions?

NarineK (Author, Contributor) commented on Feb 7, 2018:

That's right! I'll fix that. Thank you!

NarineK (Author, Contributor) commented on Feb 7, 2018:

thank you, I'll fix that

y_pred = classifier.predict(X)

if hasattr(classifier, "decision_function"):
decision = classifier.decision_function(X)

jnothman (Member) commented on Feb 7, 2018:

Add an assertion that decision is an ndarray, please.

NarineK (Author, Contributor) commented on Feb 8, 2018:

sure

@@ -1443,41 +1443,87 @@ def check_supervised_y_2d(name, estimator_orig):
assert_allclose(y_pred.ravel(), y_pred_2d.ravel())


@ignore_warnings(category=(DeprecationWarning, FutureWarning))
@ignore_warnings
def check_classifiers_classes_helper(X, y, name, classifier_orig):

jnothman (Member) commented on Feb 7, 2018:

Could you come up with a better name, given its broader scope?

NarineK (Author, Contributor) commented on Feb 7, 2018:

How about check_classifiers_predictions or check_classifiers_predictions_helper?

jnothman (Member) commented on Feb 7, 2018:

The former sounds better to me.

sklearn-lgtm commented on Feb 7, 2018:

This pull request introduces 1 alert - view on lgtm.com

new alerts:

  • 1 for Wrong number of arguments for format

Comment posted by lgtm.com

# This is a pathological data set for ComplementNB.
assert_array_equal(np.unique(y), np.unique(y_pred))
if np.any(classifier.classes_ != classes):
msg="Unexpected classes_ attribute for %r: expected '%s', got '%s'" % \

NarineK (Author, Contributor) commented on Feb 7, 2018:

@jnothman, do you have a suggestion for what I could use instead of the backslash? With the backslash it reports E502: the backslash is redundant between brackets.

jnothman (Member) commented on Feb 7, 2018:

Put the whole expression in parentheses and remove the backslash

NarineK (Author, Contributor) commented on Feb 7, 2018:

Thank you :)

NarineK (Author, Contributor) commented on Feb 8, 2018:

@jnothman, I tried different ways of placing the parentheses, but it is still failing.
assert_array_equal(y_exp, y_pred, err_msg=msg_classes % (classifier, ", ".join(map(str, y_exp)), ", ".join(map(str, y_pred))))
What kind of parentheses do you have in mind? Would you mind showing it on the above example?

jnothman (Member) commented on Feb 9, 2018:

I would probably do the following:

        assert_array_equal(classes, classifier.classes_,
                           err_msg="Unexpected classes_ attribute for %r: "
                                   "expected '%s', got '%s'" %
                                   (classifier,
                                    ", ".join(map(str, classes)),
                                    ", ".join(map(str, classifier.classes_))))

NarineK (Author, Contributor) commented on Feb 10, 2018:

Thank you, it worked :)

Narine Kokhlikyan added some commits Feb 11, 2018

jnothman (Member) left a comment:

Cosmetic nitpicks; otherwise I think this LGTM.

if hasattr(classifier, "decision_function"):
decision = classifier.decision_function(X)
n_samples, n_features = X.shape
assert_true(isinstance(decision, np.ndarray))

jnothman (Member) commented on Feb 11, 2018:

use a bare assert statement instead of assert_true, please. I know old code here prefers assert_true, but we're moving away from assert_{true,false,equal} at least due to the move to pytest.
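
That is, the assertion in the hunk above becomes simply:

    assert isinstance(decision, np.ndarray)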

NarineK (Author, Contributor) commented on Feb 12, 2018:

sure

assert_array_equal(dec_exp, y_pred,
err_msg="decision_function does not match "
"classifier for %r: expected '%s', got '%s'" %
(classifier, ", ".join(map(str, dec_exp)), ", "

jnothman (Member) commented on Feb 11, 2018:

it's clearer with that last "," on the next line, please

NarineK (Author, Contributor) commented on Feb 12, 2018:

I tried in the previous commit and I keep getting the error: 'E128 continuation line under-indented for visual indent'

gxyd (Contributor) commented on Feb 12, 2018:

I guess this should go in-line with "decision_function doesn't match ", i.e

err_msg="decision_function does not match "
        "classifier for %r ...            "

currently "classifier" line starts as:

err_msg="decision_function does not match "
"classifier for %r ...                    "

NarineK (Author, Contributor) commented on Feb 12, 2018:

Thank you, @gxyd! I'll give it a try!

X, y = make_blobs(n_samples=30, random_state=0, cluster_std=0.1)
X, y = shuffle(X, y, random_state=7)
X = StandardScaler().fit_transform(X)
X_multiple, y_multiple = make_blobs(n_samples=30, random_state=0,

jnothman (Member) commented on Feb 11, 2018:

Please call it "multiclass" rather than "multiple"

y_names_multiple = np.array(labels_multiple)[y_multiple]
y_names_binary = np.array(labels_binary)[y_binary]

for (X, y, y_names) in [(X_multiple, y_multiple, y_names_multiple),

jnothman (Member) commented on Feb 11, 2018:

first parentheses here are unnecessary

labels_multiple = ["one", "two", "three"]
labels_binary = ["one", "two"]

y_names_multiple = np.array(labels_multiple)[y_multiple]

jnothman (Member) commented on Feb 11, 2018:

this could be written more explicitly and clearly as np.take(labels_multiple, y_multiple)
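
For example, the two spellings below produce the same label array, but the second states the intent more directly:

    y_names_multiple = np.array(labels_multiple)[y_multiple]  # fancy indexing
    y_names_multiple = np.take(labels_multiple, y_multiple)   # same result, clearer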

if name == 'BernoulliNB':
X = X > X.mean()
set_random_state(classifier)
# fit

jnothman (Member) commented on Feb 11, 2018:

I don't think this comment helps the reader

NarineK (Author, Contributor) commented on Feb 11, 2018:

I don't think so either. I reused the existing code snippet; I'll fix it :)

@jnothman jnothman changed the title Add common tests for consistent decision_function behavior in binary case [MRG+1] Add common tests for consistent decision_function behavior in binary case Feb 11, 2018

Narine Kokhlikyan added some commits Feb 12, 2018

glemaitre (Contributor) left a comment:

Very few nitpicks

if hasattr(classifier, "decision_function"):
decision = classifier.decision_function(X)
n_samples, n_features = X.shape
assert(isinstance(decision, np.ndarray))

glemaitre (Contributor) commented on Feb 13, 2018:

Remove the outer parentheses.

NarineK (Author, Contributor) commented on Feb 15, 2018:

thanks, will do

if name != "ComplementNB":
# This is a pathological data set for ComplementNB.
assert_array_equal(np.unique(y), np.unique(y_pred))
if np.any(classifier.classes_ != classes):

glemaitre (Contributor) commented on Feb 13, 2018:

Why do we have this condition before making the assert?

NarineK (Author, Contributor) commented on Feb 15, 2018:

It seems that we do not need it; it came from the old codebase.


# training set performance
if name != "ComplementNB":
# This is a pathological data set for ComplementNB.

glemaitre (Contributor) commented on Feb 13, 2018:

Could you explain what is happening there?

NarineK (Author, Contributor) commented on Feb 15, 2018:

@glemaitre, this code already existed for the multiclass case. I moved it into a separate method in order to be able to use it for both the multiclass and binary cases. I can try to investigate.

NarineK (Author, Contributor) commented on Feb 15, 2018:

It seems that ComplementNB was added recently. During prediction it doesn't always predict all possible classes: for example, the input contains 3 unique classes but the prediction has only 2 unique classes. I have to read the implementation in order to understand why this happens.

NarineK (Author, Contributor) commented on Feb 15, 2018:

@jnothman, do you know why this happens? Is this a feature of ComplementNB?

jnothman (Member) commented on Feb 16, 2018:

@airalcorn2 might have an idea...

airalcorn2 (Contributor) commented on Feb 16, 2018:

It's really just how the math shakes out for this particular data set. The complement feature counts for the three classes are:

[[ 28.19448326  20.20988056]
 [ 21.56648936  44.53201548]
 [ 45.26121905  30.28029562]]

so the complement feature count totals for each class are:

[[ 48.40436382]
 [ 66.09850484]
 [ 75.54151467]]

As a result, the normalized complement features are:

[[ 0.58247813  0.41752187]
 [ 0.32627802  0.67372198]
 [ 0.59915689  0.40084311]]

As you can see, the normalized complement feature vectors are very similar for the 0 and 2 classes. The final feature_log_prob_ matrix (so, after taking the negative log of the above) looks (when transposed) like:

[[ 0.54046364  1.12000544  0.51223179]
 [ 0.87341835  0.39493775  0.91418518]]

The features for the 0 class in the data matrix are the following:

[[ 1.86111069  2.78745871]
 [ 1.94936262  2.66852288]
 [ 2.0000686   2.57898891]
 [ 1.90144365  2.69194788]
 [ 1.68480541  2.73007368]
 [ 1.87215982  2.62201183]
 [ 1.84494713  2.71265579]
 [ 1.90820282  2.6300342 ]
 [ 1.88070993  2.70714243]
 [ 1.91380191  2.67237895]]

So the second feature is consistently ~1.4 times larger than the first feature for this particular class. Looking back at feature_log_prob_, we can see that class 2 has weights of [0.512, 0.914] while class 0 has weights of [0.540, 0.873], i.e., class 2 has a larger weight on the second feature than class 0, but only a slightly smaller weight on the first feature. As a result, ComplementNB classifies all of the 0 samples as 2.

I think it's just a case of the data not resembling text data, so the assumptions that are baked into ComplementNB end up being detrimental.

Full code to explore is below:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

from sklearn.datasets import make_blobs
from sklearn.naive_bayes import ComplementNB
from sklearn.preprocessing import StandardScaler
from sklearn.utils import shuffle
from sklearn.utils.testing import assert_array_equal

X_multiclass, y_multiclass = make_blobs(n_samples = 30, random_state = 0,
                                        cluster_std = 0.1)
X_multiclass, y_multiclass = shuffle(X_multiclass, y_multiclass,
                                     random_state = 7)
X_multiclass = StandardScaler().fit_transform(X_multiclass)
X_multiclass -= X_multiclass.min() - .1
labels_multiclass = ["one", "two", "three"]
y_names_multiclass = np.take(labels_multiclass, y_multiclass)

(X, y) = (X_multiclass, y_multiclass)

df = pd.DataFrame(dict(x = X[:, 0], y = X[:, 1], color = y))
sns.lmplot("x", "y", data = df, hue = "color", fit_reg = False)
plt.show()

classifier = ComplementNB()
classifier.fit(X, y)
y_pred = classifier.predict(X)
assert_array_equal(np.unique(y), np.unique(y_pred))

print(classifier.feature_count_)
print(classifier.class_count_)
print(classifier.feature_all_)

comp_count = classifier.feature_all_ + classifier.alpha - classifier.feature_count_
print(comp_count)
sums = comp_count.sum(axis = 1, keepdims = True)
print(sums)
normed = comp_count / sums
print(normed)
logged = -np.log(normed)
print(logged.T)

jll = np.dot(X, logged.T)
print(jll)
zero_class = X[np.where(y == 0)]
print(zero_class)
print(zero_class[:, 1] / zero_class[:, 0])
print(jll[np.where(y == 0)])

NarineK (Author, Contributor) commented on Feb 16, 2018:

Thank you for the detailed explanation @airalcorn2 :)



def choose_check_classifiers_labels(name, y, y_names):
if name in ["LabelPropagation", "LabelSpreading"]:

glemaitre (Contributor) commented on Feb 13, 2018:

return y if name in ["LabelPropagation", "LabelSpreading"] else y_names

NarineK (Author, Contributor) commented on Feb 15, 2018:

thanks, good idea


return y_names


def check_classifiers_classes(name, classifier_orig):

glemaitre (Contributor) commented on Feb 13, 2018:

It seems that we could use parametrize with pytest.
Do we avoid doing that in the estimator_checks tests so as not to force pytest as a dependency, @jnothman @lesteve?
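
For reference, a parametrized version could look roughly like the sketch below; it would have to live in a test module rather than in estimator_checks.py itself if pytest must stay optional, and X_binary and the other fixtures are assumed to be built as in the diff above:

    import pytest

    @pytest.mark.parametrize(
        "X, y, y_names",
        [(X_multiclass, y_multiclass, y_names_multiclass),
         (X_binary, y_binary, y_names_binary)])
    def test_classifiers_predictions(X, y, y_names):
        ...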

glemaitre (Contributor) commented on Feb 15, 2018

NarineK (Author, Contributor) commented on Feb 15, 2018:

I'd be interested in knowing as well. Maybe @jnothman knows more about it :)

Narine Kokhlikyan added some commits Feb 15, 2018

glemaitre (Contributor) commented on Feb 16, 2018:

So this is an MRG+2. Thanks, @NarineK.

@glemaitre glemaitre merged commit 01c1472 into scikit-learn:master Feb 16, 2018

8 checks passed

  • ci/circleci: deploy - Your tests passed on CircleCI!
  • ci/circleci: python2 - Your tests passed on CircleCI!
  • ci/circleci: python3 - Your tests passed on CircleCI!
  • codecov/patch - 100% of diff hit (target 96.5%)
  • codecov/project - Absolute coverage decreased by -1.84% but relative coverage increased by +3.49% compared to afe540c
  • continuous-integration/appveyor/pr - AppVeyor build succeeded
  • continuous-integration/travis-ci/pr - The Travis CI build passed
  • lgtm analysis: Python - No alert changes