Isotonic calibration changes rank-based test metrics values #16321
[As discussed with @ogrisel]
Describe the bug
Using isotonic calibration changes the values of rank-based metrics. This is because the isotonic mapping is monotonic but not strictly monotonic: distinct scores can be mapped to the same calibrated value, which changes the ranking. Sigmoid calibration, being strictly monotonic, doesn't suffer from this.
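To illustrate the mechanism (a minimal toy example, not the reproduction below): isotonic regression pools adjacent violators, so two distinct input scores can be mapped to the same calibrated value, collapsing their relative rank.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Toy data where the targets violate monotonicity: the pool-adjacent-
# violators algorithm will merge the middle two points into one flat step.
x = np.array([0.1, 0.2, 0.3, 0.4])
y = np.array([0.0, 1.0, 0.0, 1.0])

iso = IsotonicRegression(out_of_bounds="clip")
calibrated = iso.fit_transform(x, y)

print(calibrated)  # the two middle scores are pooled to the same value
# Distinct inputs 0.2 and 0.3 now tie, so rank-based metrics can change:
print(len(np.unique(calibrated)) < len(np.unique(x)))  # True
```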
Steps/Code to Reproduce
Here is a quick example where we split the data into train/calibration/test sets and compare the ROC AUC on the test set before and after calibration, for both isotonic and sigmoid.
```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import roc_auc_score

X, y = datasets.make_classification(n_samples=100000, n_features=20,
                                    n_informative=18, n_redundant=0,
                                    random_state=42)
X_train, X_, y_train, y_ = train_test_split(X, y, test_size=0.5, random_state=42)
X_test, X_calib, y_test, y_calib = train_test_split(X_, y_, test_size=0.5,
                                                    random_state=42)

clf = LogisticRegression(C=1.)
clf.fit(X_train, y_train)
y_pred = clf.predict_proba(X_test)
print(roc_auc_score(y_test, y_pred[:, 1]))
```
The ROC AUC is then: 0.88368
```python
isotonic = CalibratedClassifierCV(clf, method='isotonic', cv='prefit')
isotonic.fit(X_calib, y_calib)
y_pred_calib = isotonic.predict_proba(X_test)
print(roc_auc_score(y_test, y_pred_calib[:, 1]))
```
After isotonic calibration, the ROC AUC becomes: 0.88338
```python
sigmoid = CalibratedClassifierCV(clf, method='sigmoid', cv='prefit')
sigmoid.fit(X_calib, y_calib)
y_pred_calib = sigmoid.predict_proba(X_test)
print(roc_auc_score(y_test, y_pred_calib[:, 1]))
```
As expected for sigmoid calibration, the ROC AUC is unchanged: 0.88368
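This is the expected behaviour in general: rank-based metrics such as ROC AUC are invariant under any strictly increasing transform of the scores, sigmoid included. A quick generic illustration (synthetic scores, not the classifier above):

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from scipy.special import expit

rng = np.random.RandomState(0)
scores = rng.randn(1000)
y_true = rng.randint(0, 2, size=1000)

# A sigmoid (here with arbitrary slope/offset) is strictly increasing,
# so it preserves the ranking of the scores and therefore the ROC AUC.
auc_raw = roc_auc_score(y_true, scores)
auc_sig = roc_auc_score(y_true, expit(2.0 * scores + 0.5))
print(np.isclose(auc_raw, auc_sig))  # True
```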
Thank you very much @dsleo for the detailed report. I started from your code and pushed the analysis further.
If one introspects the calibrator, one can observe the following:
```python
y_pred_calib = isotonic.predict_proba(X_test)
print(roc_auc_score(y_test, y_pred_calib[:, 1]))

calibrator = isotonic.calibrated_classifiers_[0].calibrators_[0]

import matplotlib.pyplot as plt
plt.figure(figsize=(16, 10))
plt.plot(calibrator._necessary_X_, calibrator._necessary_y_)
```
Note that this kind of plot is very interesting, and there is a related PR (#16289) to expose those thresholds for inspection.
It is a bit weird to have a piece-wise constant function for this mapping.
Furthermore, a close look at the (x, y) pairs of thresholds (the first 10 pairs) shows the following:
```python
for x, y in zip(calibrator._necessary_X_[:10], calibrator._necessary_y_[:10]):
    print(x, y)
```
The y values are monotonic, but not strictly so.
I don't know why we want to use such a piece-wise constant mapping. One could instead use a piece-wise linear mapping, which would be trivially strictly monotonic. Here is a quick hack showing that this would fix the issue when using isotonic regression for classifier calibration:
```python
X_thresholds_fixed = np.concatenate((calibrator._necessary_X_[::2],
                                     calibrator._necessary_X_[-1:]))
y_thresholds_fixed = np.concatenate((calibrator._necessary_y_[::2],
                                     calibrator._necessary_y_[-1:]))

plt.figure(figsize=(16, 10))
plt.plot(calibrator._necessary_X_, calibrator._necessary_y_)
plt.plot(X_thresholds_fixed, y_thresholds_fixed)
```
Linearly interpolating using those thresholds makes the mapping strictly monotonic and the ROC-AUC of the original model is recovered:
```python
calibrator._build_f(X_thresholds_fixed, y_thresholds_fixed)
y_pred_calib = isotonic.predict_proba(X_test)
print(roc_auc_score(y_test, y_pred[:, 1]))
print(roc_auc_score(y_test, y_pred_calib[:, 1]))
```
Instead of the max value of each step, one could have used mid-points. This could be explored in a fix.
The above analysis is wrong: my assumptions about the structure of the steps were incorrect. They are not always exact steps; the default prediction function already contains piece-wise linear components.
Still, it would be nice to have an option to enforce strict monotonicity, for instance by adding a small eps to one of the edges whenever y is constant on a segment.
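A minimal sketch of that eps idea, applied as a hypothetical post-processing step on the calibrated threshold values (the helper `break_ties` is not part of scikit-learn, just an illustration under the assumption that we are free to nudge the y thresholds):

```python
import numpy as np

def break_ties(y_thresholds, eps=1e-9):
    """Hypothetical post-processing: nudge equal consecutive calibrated
    values so the sequence becomes strictly increasing.

    This is only a sketch of the eps idea discussed above, not a
    scikit-learn API.
    """
    y = np.asarray(y_thresholds, dtype=float).copy()
    for i in range(1, len(y)):
        # Whenever y is constant (or non-increasing) on a segment,
        # push the right edge up by a tiny eps.
        if y[i] <= y[i - 1]:
            y[i] = y[i - 1] + eps
    return y

print(break_ties([0.1, 0.5, 0.5, 0.9]))  # ties become strictly increasing
```

Interpolating through such strictly increasing y thresholds would make the mapping strictly monotonic, at the cost of perturbing the calibrated probabilities by at most a few eps.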