[MRG+1] Issue#5803 : Regression Test added #8112

amanp10 · 2016-12-24T09:54:44Z

Regression Test for testing the functioning of predict for hard voting.

Tests whether the fix for np.bincount by changing the datatype of predictions array to int works properly.

… Branch1

jnothman · 2016-12-25T13:14:54Z

I don't think this failed before #5802's patch:

$ git fetch https://github.com/scikit-learn/scikit-learn refs/pull/8099/head
From https://github.com/scikit-learn/scikit-learn
 * branch            refs/pull/8099/head -> FETCH_HEAD
$ git checkout FETCH_HEAD
HEAD is now at 0a0492a... Merge branch 'master' of https://github.com/amanp10/scikit-learn into Branch1
$ git revert a161f2c
[detached HEAD 5e96367] Revert "predictions are float but np.bincount can only be used on integers, thus an error comes out when using it"
 1 file changed, 1 insertion(+), 1 deletion(-)
$ nosetests sklearn/ensemble/tests/test_voting_classifier.py
..............
----------------------------------------------------------------------
Ran 14 tests in 1.711s

OK

MechCoder · 2016-12-26T00:44:07Z

Unless I'm reading the old issue wrongly, isn't the output of the method of the _predict output already encoded? That seems to be enforced here (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/voting_classifier.py#L164) no?

… Branch1

amanp10 · 2016-12-28T09:25:07Z

I have used regression algorithms as classifiers for the test because only they return prediction values as float.
There were also some PEP8 errors from before in the test file. Should I remove them?

amanp10 · 2016-12-28T09:27:01Z

@MechCoder sorry I didn't get you.

jnothman · 2017-01-09T22:38:04Z

The point is, @amanp10, that the success of this test was not affected by the patch in a161f2c. We want a test that would have failed at the last release, but succeeds now.

amanp10 · 2017-01-10T04:04:18Z

When I run the test on my machine it does give the following error before the patch
TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'
Also, it seems logical since algorithms returning float values as predictions would produce an error from np.bincount.

jnothman · 2017-01-10T04:22:33Z

Really? Is the bug a function of numpy version?

$ git checkout 0.18.1
HEAD is now at a5ab948... skip tree-test on 32bit
$ ipython
Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:52:12)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: %paste
from sklearn.ensemble.tests.test_voting_classifier import *
from sklearn.linear_model import Ridge, Lasso

def test_predict_for_hard_voting():
  # Test predictions array data type error
  clf1 = Ridge(random_state=123)
  clf2 = Lasso(random_state=123)
  clf3 = SVC(probability=True, random_state=123)
  eclf1 = VotingClassifier(estimators=[
      ('rd', clf1), ('la', clf2), ('svc', clf3)], weights=[1, 2, 3],
      voting='hard')

  eclf1.fit(X, y)
  eclf1.predict(X)
## -- End pasted text --

In [2]:

In [2]:

In [2]:

In [2]: test_predict_for_hard_voting()

In [3]:

amanp10 · 2017-01-10T06:22:42Z

What I have done is manually reverted the changes in the commit a161f2c since it was only a one line change. Then I ran the test and it failed as expected. Here is the error log.

Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/home/amanp10/gsoc/scikit-learn/sklearn/ensemble/tests/test_voting_classifier.py", line 290, in test_predict_for_hard_voting
eclf1.predict(X)
File "/home/amanp10/gsoc/scikit-learn/sklearn/ensemble/voting_classifier.py", line 195, in predict
arr=predictions)
File "/usr/lib/python2.7/dist-packages/numpy/lib/shape_base.py", line 79, in apply_along_axis
res = func1d(arr[tuple(i.tolist())],*args)
File "/home/amanp10/gsoc/scikit-learn/sklearn/ensemble/voting_classifier.py", line 193, in
weights=self.weights)),
TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'

Ran 14 tests in 2.368s

FAILED (errors=1)

amanp10 · 2017-01-11T19:01:30Z

I am waiting for your final response, meanwhile I may take up another issue if its okay?

jnothman · 2017-01-12T03:51:18Z

You're right. I can get that test to fail if I revert the patch.

But now I'm confused as to why other tests aren't failing in the same situation. There are other tests performing hard voting, with weights, where predict is called, such as test_predict_on_toy_problem (although there the weights are all equal) and test_gridsearch.

I'm sorry that I don't know the code well, and this is exacerbated by a very poor description of the issue at #5802 and #5803; there is no changelog entry for #5802 to tell me what's fixed exactly, either.

Could you please explain... and add a changelog entry? Thanks so much.

amanp10 · 2017-01-12T04:15:43Z

I understand your dilemma. The test test_predict_on_toy_problem and test_gridsearch passed because they used classification algorithms for the voting classifier, thus returning INT prediction values (class labels). However, the fix was for FLOAT prediction values which we get only if regression algorithms are used in the voting classifier. Hence, my test failed due to FLOAT prediction values, which is indeed fixed in the commit a161f2c

jnothman · 2017-01-12T04:41:53Z

Oh, of course. And yet, I'm not sure that we should be admitting regressors here. Those regressors are not outputting ints, and an astype(int) transformation isn't going to make sense of a regressor trained on values (0, 1), or (-1, 1). I suspect we should require that VotingClassifier only accept constituent classifiers. Really if a classifier is fit on integer y it should return integer y, just as it is expected to return strings if it is fit on strings. That's what LabelEncoder and classes_'s job is. If all of scikit-learn's classifiers are well-behaved, you're only going to be able to construct a test by building an ill-behaved classifier in the test :\

I'm glad I finally understand what's going on. In the future it would be appreciated if you made sure your PR description is explicit about what you're fixing (particularly if the issue description is lacking).

amanp10 · 2017-01-12T04:50:53Z

How do you suggest I do it? I will need some guidance here on making an ill-behaved classifier.

amanp10 · 2017-01-12T04:53:24Z

@alvarouc Please tell me what exactly failed before a161f2c so I can put it in the regression test.

jnothman · 2017-01-12T05:26:04Z

class DodgySVC(SVC):
    def predict(self, X):
        return super(DodgySVC, self).predict(X).astype(float)

?
:P

amanp10 · 2017-01-12T07:24:32Z

I used this class as it is and it fulfills our need. It fails before the patch and works after it. Do I need to add more classifiers like this or just one is okay?

jnothman · 2017-01-12T08:54:38Z

just one is okay. but l will admit that I'm still not sure about us allowing a classifier to return floats if given ints On 12 Jan 2017 6:24 pm, "Aman Pratik" <notifications@github.com> wrote: I used this class as it is and it fulfills our need. It fails before the patch and works after it. Do I need to add more classifiers like this or just one is okay? — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#8112 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEz69JJGb9OtRjYygpagcSuopdg37oCks5rRdUxgaJpZM4LVMpS> .

amanp10 · 2017-01-12T11:30:43Z

What do you suggest I do? There seems to be no other way to make the test.

jnothman · 2017-01-12T13:26:47Z

I'm okay with using that. I might not call it Dodgy if I were you... but I would leave a comment that it is dubiously conforming to our API.

amanp10 · 2017-01-12T14:00:35Z

I have made some changes, please have a look @jnothman.

amanp10 · 2017-01-15T12:09:06Z

@jnothman I am waiting for the final reviews on this one, so that I can make the corrections soon as I wont be available for a week.

jnothman

Again, I'm not sure we should care about this case, but this LGTM.

… Branch1

agramfort · 2017-06-07T08:22:14Z

LGTM

shall I merge?

glemaitre · 2017-06-08T08:27:36Z

sklearn/ensemble/tests/test_voting_classifier.py

+
+
+def test_predict_for_hard_voting():
+    # Test predictions array data type error


Unnecessary comment

glemaitre · 2017-06-08T08:28:09Z

LGTM +1 after removing the comment

agramfort · 2017-06-08T08:29:39Z

thx @glemaitre

i'll remove the comment in master

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

amanp10 added 2 commits December 24, 2016 15:10

Issue#5803 Regression Test added

21b2fc0

Merge branch 'master' of https://github.com/amanp10/scikit-learn into…

0a0492a

… Branch1

amanp10 added 2 commits December 28, 2016 14:49

Float predictions values used for testing

5dbcffc

Merge branch 'master' of https://github.com/amanp10/scikit-learn into…

c90fbf7

… Branch1

jnothman mentioned this pull request Jan 5, 2017

Warning of utils.deprecated for properties is ugly #8113

Closed

Custom Classifier used for test

4bd7ca9

jnothman reviewed Jan 18, 2017

View reviewed changes

jnothman changed the title ~~[MRG] Issue#5803 : Regression Test added~~ [MRG+1] Issue#5803 : Regression Test added Jan 18, 2017

Merge branch 'master' of https://github.com/amanp10/scikit-learn into…

c5d74d6

… Branch1

Merge branch 'master' of https://github.com/amanp10/scikit-learn into…

37a8475

… Branch1

glemaitre reviewed Jun 8, 2017

View reviewed changes

agramfort merged commit b93be1f into scikit-learn:master Jun 8, 2017

jnothman mentioned this pull request Jun 8, 2017

Error when using VotingClassifier on data with more than 2 classes #5803

Closed

amanp10 deleted the Branch1 branch June 8, 2017 15:58

Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

4e8e2ca

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

ed9c672

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

a06a737

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

44bdb00

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

fcceac9

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

afb2158

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

6ab0073

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG+1] Issue#5803 : Regression Test added (scikit-learn#8112)

44bd7a6

* Issue#5803 Regression Test added * Float predictions values used for testing * Custom Classifier used for test

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MRG+1] Issue#5803 : Regression Test added #8112

[MRG+1] Issue#5803 : Regression Test added #8112

amanp10 commented Dec 24, 2016

jnothman commented Dec 25, 2016

MechCoder commented Dec 26, 2016

amanp10 commented Dec 28, 2016 •

edited

amanp10 commented Dec 28, 2016

jnothman commented Jan 9, 2017

amanp10 commented Jan 10, 2017

jnothman commented Jan 10, 2017

amanp10 commented Jan 10, 2017 •

edited

amanp10 commented Jan 11, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017 via email

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

amanp10 commented Jan 15, 2017

jnothman left a comment

agramfort commented Jun 7, 2017

glemaitre Jun 8, 2017

glemaitre commented Jun 8, 2017

agramfort commented Jun 8, 2017



		def test_predict_for_hard_voting():
		# Test predictions array data type error

[MRG+1] Issue#5803 : Regression Test added #8112

[MRG+1] Issue#5803 : Regression Test added #8112

Conversation

amanp10 commented Dec 24, 2016

jnothman commented Dec 25, 2016

MechCoder commented Dec 26, 2016

amanp10 commented Dec 28, 2016 • edited

amanp10 commented Dec 28, 2016

jnothman commented Jan 9, 2017

amanp10 commented Jan 10, 2017

jnothman commented Jan 10, 2017

amanp10 commented Jan 10, 2017 • edited

amanp10 commented Jan 11, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017 via email

amanp10 commented Jan 12, 2017

jnothman commented Jan 12, 2017

amanp10 commented Jan 12, 2017

amanp10 commented Jan 15, 2017

jnothman left a comment

Choose a reason for hiding this comment

agramfort commented Jun 7, 2017

glemaitre Jun 8, 2017

Choose a reason for hiding this comment

glemaitre commented Jun 8, 2017

agramfort commented Jun 8, 2017

amanp10 commented Dec 28, 2016 •

edited

amanp10 commented Jan 10, 2017 •

edited