CalibratedClassifierCV doesn't work well with CatBoostClassifier. Gives wrong output. #15013

@zoj613

Description

Problem:
When I try to calibrate class probability estimates with scikit-learn's CalibratedClassifierCV, I get all 1's for the negative class and all 0's for the positive class in a binary classification problem. If I use CatBoostClassifier independently, I get normal-looking probabilities. This leads me to believe that this classifier is not compatible with the calibration technique. Is there a way to fix this issue?

Steps/Code to Reproduce

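from catboost import CatBoostClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification

# Synthetic binary problem: 100 samples, 10 features
X, y = make_classification(100, 10)

cat = CatBoostClassifier(verbose=0)
calib = CalibratedClassifierCV(base_estimator=cat, method='sigmoid', cv=2)
# Fit the plain classifier and the calibrated wrapper on the same data
cat.fit(X, y)
calib.fit(X, y)
# Uncalibrated probabilities look normal; calibrated ones are all 1's and 0's
print(cat.predict_proba(X))
print(calib.predict_proba(X))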

Expected Results

The result should be calibrated probabilities, but instead I get all 1's in the first column and all 0's in the second. This is clearly the wrong output.
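To make the symptom easier to check than by reading the printed arrays, a small helper along these lines could summarise the output of predict_proba. This is only a sketch: `describe_proba` and its `tol` parameter are hypothetical names, not part of scikit-learn or CatBoost, and the helper only describes the output without explaining the cause.

```python
def describe_proba(proba, tol=1e-9):
    """Summarise a predict_proba-style matrix.

    Hypothetical helper for illustration; not part of scikit-learn or CatBoost.
    Accepts any iterable of rows (nested lists or a 2-D NumPy array).
    """
    rows = [[float(p) for p in row] for row in proba]
    if not rows:
        raise ValueError("proba must contain at least one row")

    # True when every entry is (numerically) exactly 0 or 1.
    one_hot = all(p <= tol or p >= 1.0 - tol for row in rows for p in row)

    # True when every row equals the first row, i.e. the prediction
    # does not depend on the input sample at all.
    first = rows[0]
    constant_rows = all(
        len(row) == len(first)
        and all(abs(a - b) <= tol for a, b in zip(row, first))
        for row in rows
    )

    # Sanity check that each row is a valid probability distribution.
    rows_sum_to_one = all(abs(sum(row) - 1.0) <= 1e-6 for row in rows)

    return {
        "one_hot": one_hot,
        "constant_rows": constant_rows,
        "rows_sum_to_one": rows_sum_to_one,
    }


if __name__ == "__main__":
    # Stand-in for the reported output: all 1's in column 0, all 0's in column 1.
    print(describe_proba([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]))
```

With the reproduction above, calling `describe_proba(calib.predict_proba(X))` and `describe_proba(cat.predict_proba(X))` side by side would show whether only the calibrated output is both one-hot and identical across rows.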

Actual Results

See above

Versions

System:
    python: 3.7.3 (default, Jul  8 2019, 11:40:34)  [GCC 6.5.0 20181026]
executable: ~/.pyenv/versions/3.7.3/envs/absa-py37/bin/python
   machine: Linux-5.0.0-27-generic-x86_64-with-debian-buster-sid

Python deps:
       pip: 19.0.3
setuptools: 40.8.0
   sklearn: 0.21.3
     numpy: 1.17.0
     scipy: 1.3.1
    Cython: None
    pandas: 0.25.0
