AdaBoost ZeroDivisionError #7501

Closed
mfeurer opened this issue Sep 27, 2016 · 9 comments
Contributor

@mfeurer mfeurer commented Sep 27, 2016

Description

AdaBoostClassifier throws a ZeroDivisionError when calling predict_proba if the classifier has only been fit on samples from a single class.

Steps/Code to Reproduce

import sklearn.ensemble
import numpy as np
X = np.random.random((10, 10))
y = np.zeros((10, ))
ada = sklearn.ensemble.AdaBoostClassifier().fit(X, y)
ada.predict(X)
ada.predict_proba(X)

Expected Results

A ValueError when using fit.

Actual Results

In [10]: ada.predict_proba(X)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-10-492c02a5f340> in <module>()
----> 1 ada.predict_proba(X)

/home/feurerm/virtualenvs/2016_epm/lib/python3.4/site-packages/sklearn/ensemble/weight_boosting.py in predict_proba(self, X)
    765 
    766         proba /= self.estimator_weights_.sum()
--> 767         proba = np.exp((1. / (n_classes - 1)) * proba)
    768         normalizer = proba.sum(axis=1)[:, np.newaxis]
    769         normalizer[normalizer == 0.0] = 1.0

ZeroDivisionError: float division by zero
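
For readers following the traceback: the failing expression on line 767 divides by `n_classes - 1`, which is zero when only one class was seen during fit. The snippet below is a minimal illustration of that arithmetic only, assuming `n_classes` is a plain Python integer (the exception type in the traceback suggests so); `vote_scale` is a hypothetical name, not a scikit-learn function.

```python
def vote_scale(n_classes):
    # Same expression as in the traceback: 1. / (n_classes - 1)
    return 1. / (n_classes - 1)


try:
    vote_scale(1)  # a single class makes the denominator zero
except ZeroDivisionError:
    failed = True
else:
    failed = False
```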

Versions

  • Linux-3.13.0-54-generic-x86_64-with-Ubuntu-14.04-trusty
  • Python 3.4.3 (default, Oct 14 2015, 20:28:29)
    [GCC 4.8.4]
  • NumPy 1.11.1
  • SciPy 0.18.0
  • Scikit-Learn 0.17.1
@amueller amueller added the Bug label Sep 29, 2016
@amueller amueller added this to the 0.19 milestone Sep 29, 2016
Member

@amueller amueller commented Sep 29, 2016

Thanks for the report.


@soniampub soniampub commented Sep 29, 2016

@amueller I would like to work on this issue, but I am new to scikit-learn and to open source contribution. Could you please give me some pointers on how to fix this issue and how I can contribute more to this project? I want to be an active contributor here.

Member

@amueller amueller commented Sep 30, 2016

First, check if the error still exists on the current development version.
Then we need to think about a bugfix. If the number of classes is 1, we should probably just return ones as the probabilities.
Then add a test that the behavior is as desired and the error is not raised anymore.
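
A rough sketch of what the suggested guard and test could look like. This is not the merged patch: `predict_proba_from_votes` is a hypothetical stand-alone helper that mirrors only the last few lines of `predict_proba` shown in the traceback, and it assumes the weighted votes have already been computed and passed in as an array.

```python
import numpy as np


def predict_proba_from_votes(votes, n_classes):
    """Hypothetical helper: turn weighted votes into class probabilities."""
    votes = np.asarray(votes, dtype=float)
    if n_classes == 1:
        # Only one class was seen during fit, so every sample
        # gets probability 1 for that class.
        return np.ones((votes.shape[0], 1))
    # Same steps as in the traceback, with the final division added.
    proba = np.exp((1. / (n_classes - 1)) * votes)
    normalizer = proba.sum(axis=1)[:, np.newaxis]
    normalizer[normalizer == 0.0] = 1.0
    return proba / normalizer


def test_single_class_returns_ones():
    # The behavior proposed above: ones, and no ZeroDivisionError.
    proba = predict_proba_from_votes(np.zeros((10, 1)), n_classes=1)
    assert proba.shape == (10, 1)
    assert np.all(proba == 1.0)
```

The early return keeps the multi-class path untouched, so a test for it only needs to cover the single-class case.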

Contributor Author

@mfeurer mfeurer commented Sep 30, 2016

Do you intend for classifiers to work on datasets with a single class? As stated in the issue, I would actually have preferred an exception telling me that I am trying to fit on a dataset with only a single class. This would prevent downstream errors.
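
Until the library behavior is settled, a caller who wants the fail-early behavior described here could check the labels before fitting. A minimal sketch, assuming a user-side helper; `require_two_classes` is a hypothetical name and not part of scikit-learn.

```python
import numpy as np


def require_two_classes(y):
    """Hypothetical pre-fit check: raise if y holds fewer than two classes."""
    classes = np.unique(y)
    if classes.shape[0] < 2:
        raise ValueError(
            "Need samples of at least 2 classes, but the data contains "
            "only %d class(es): %r" % (classes.shape[0], classes.tolist())
        )
    return classes
```

Calling `require_two_classes(y)` right before `fit(X, y)` with the all-zero `y` from the report would raise the `ValueError` at that point instead of failing later in `predict_proba`.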

Member

@amueller amueller commented Sep 30, 2016

IIRC, the behavior is inconsistent. Some classifiers work on a single class, others don't. It's a bit unfortunate. But I don't want to break working behavior for consistency's sake. There is a test that checks that fit either works or throws a sensible error:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/estimator_checks.py#L917

It seems we should add decision_function and predict_proba to this test.

For some models, like tree-based ones or knn, fitting to a single class actually has a legitimate result; for others, like linear models, it does not. We could start deprecating the support for single classes, but a single-class training set is something that can happen within cross-validation. Arguably that will not give you a great result, but that's not really a reason to prohibit the user from doing it.
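
One way the extended check could look, as a sketch only. It is not the code behind the link above; `check_single_class_behaviour` is a hypothetical name, and the rule that an acceptable error message mentions "class" is an assumption made for illustration.

```python
def check_single_class_behaviour(estimator, X, y):
    """Hypothetical check: fit on single-class data, then try every
    prediction method the estimator offers."""
    try:
        estimator.fit(X, y)
    except ValueError as exc:
        # Rejecting single-class data at fit time is acceptable as long
        # as the message explains why (assumed: it mentions "class").
        if "class" not in str(exc):
            raise
        return "rejected at fit"
    # If fit succeeded, none of the prediction methods may raise.
    for name in ("predict", "predict_proba", "decision_function"):
        method = getattr(estimator, name, None)
        if method is not None:
            method(X)
    return "fitted"
```

With the estimator from the report, `check_single_class_behaviour(AdaBoostClassifier(), X, y)` would let the `ZeroDivisionError` from `predict_proba` propagate on an affected version, which is the failure such a test is meant to surface.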


@soniampub soniampub commented Sep 30, 2016

Sure, I will start looking into it.


@floondi floondi commented Jan 23, 2017

Hey, is it okay if I take this one?

Member

@jnothman jnothman commented Jan 23, 2017

Looks like it!

Contributor

@dokato dokato commented Feb 16, 2017

Seemed easy to fix, please check if that's okay.

dokato pushed a commit to dokato/scikit-learn that referenced this issue Feb 16, 2017
dokato pushed a commit to dokato/scikit-learn that referenced this issue Feb 16, 2017
dokato added a commit to dokato/scikit-learn that referenced this issue Feb 16, 2017
dokato added a commit to dokato/scikit-learn that referenced this issue Feb 16, 2017
dokato added a commit to dokato/scikit-learn that referenced this issue Feb 16, 2017
dokato added a commit to dokato/scikit-learn that referenced this issue Feb 16, 2017
dokato added a commit to dokato/scikit-learn that referenced this issue Feb 20, 2017
jnothman added a commit that referenced this issue Feb 20, 2017
* FIX AdaBoost ZeroDivisionError in proba #7501

* FIX AdaBoost ZeroDivisionError in proba #7501 - tests corrected

* FIX AdaBoost ZeroDivisionError in proba #7501 - tests corrected

* FIX #7501 improvements suggested by lesteve introduced

* FIX #7501 whats_new file updated

* Tweak in rst
sergeyf added a commit to sergeyf/scikit-learn that referenced this issue Feb 28, 2017
…ikit-learn#8371)

* FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501

* FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected

* FIX AdaBoost ZeroDivisionError in proba scikit-learn#7501 - tests corrected

* FIX scikit-learn#7501 improvements suggested by lesteve introduced

* FIX scikit-learn#7501 whats_new file updated

* Tweak in rst