
[MRG] Enforcing strict monotonicity for IsotonicRegression #17790

Closed · wants to merge 29 commits

Conversation


@dsleo dsleo commented Jun 30, 2020

Reference Issues/PRs

As per our discussion, this fixes issue #16321.

What does this implement/fix? Explain your changes.

Because IsotonicRegression is not strictly monotonic, calibration with IsotonicRegression can change rank-based metrics. This PR adds a strict argument to IsotonicRegression to enforce strict monotonicity by keeping only the unique values of the thresholds y_thresholds_.
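For illustration, a minimal numpy sketch of the deduplication idea (not the PR's actual code, and the threshold values are made up): flat segments in the interpolation thresholds are consecutive x values mapped to the same y, and keeping one x per unique y makes linear interpolation through the remaining points strictly increasing.

```python
import numpy as np

# Hypothetical fitted isotonic thresholds with a flat segment:
# x = 0.3, 0.5, 0.7 all map to y = 0.4.
x_thresholds = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
y_thresholds = np.array([0.2, 0.4, 0.4, 0.4, 0.8])

# Keep only the first x for each unique y value (the idea behind
# strict=True): the flat segment collapses to a single point.
y_unique, first_idx = np.unique(y_thresholds, return_index=True)
x_unique = x_thresholds[first_idx]

# Interpolating through the deduplicated thresholds is strictly
# increasing, unlike the original, which is flat on [0.3, 0.7].
preds = np.interp([0.3, 0.5, 0.7], x_unique, y_unique)
```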

Any other comments?

Not sure whether the strict argument should default to True for CalibratedClassifierCV if the selected method is isotonic?


@NicolasHug NicolasHug left a comment


Thanks @dsleo ,

I understand this is still WIP, but I took a quick look.

We should also add details in the User guides of both calibration and isotonic.

@NicolasHug

> Not sure whether the strict argument should default to True for CalibratedClassifierCV if the selected method is isotonic?

It should be False for CalibratedClassifierCV and IsotonicRegression so that results don't change in the next versions

dsleo and others added 7 commits July 1, 2020 17:03
docstring wording

Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

dsleo commented Jul 1, 2020

Thanks @NicolasHug for the review! I made the modifications following your comments and I've added some description in the calibration user guide. This should be ready for full review now!


@NicolasHug NicolasHug left a comment


Thanks @dsleo

I think we should also detail in the User Guide of isotonic how the strict mode works.

We could also update the example https://scikit-learn.org/dev/auto_examples/miscellaneous/plot_isotonic_regression.html#sphx-glr-auto-examples-miscellaneous-plot-isotonic-regression-py to illustrate the difference between strict and non-strict mode.

(Also, not directly related to this PR, but if you could include a link to that example in the User Guide it would be great!)


dsleo commented Jul 2, 2020

I've made the changes according to your comments and updated the example. A few things to note:

  • To ensure strict monotonicity outside the training domain, this extrapolates unless out_of_bounds is set to clip. This is now made explicit in the docstring; let me know if it's not clear enough.
  • When fit on increasing data, setting increasing=False makes the output function constant, so strict monotonicity cannot be enforced. For now a ValueError is raised, but it's not technically correct; ideally we could have a custom error. What do you think?
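A pure-numpy sketch of the second point, assuming squared loss (under which the best decreasing fit to increasing data is the constant mean): deduplicating the thresholds would leave a single point, from which no strictly monotonic interpolation can be built.

```python
import numpy as np

# Increasing data: the best *decreasing* fit under squared loss is a
# constant (PAVA merges everything into one block, giving the mean).
y = np.array([0.0, 1.0, 2.0, 3.0])
fit_decreasing = np.full_like(y, y.mean())

# After deduplication only one threshold survives, so strict
# monotonicity cannot be enforced.
n_thresholds = np.unique(fit_decreasing).size
```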

I'll try to find the time tomorrow to fill in the details in the UG on the strict mode.

@dsleo dsleo changed the title [WIP] Enforcing strict monotonicity for IsotonicRegression [MRG] Enforcing strict monotonicity for IsotonicRegression Jul 3, 2020

dsleo commented Jul 9, 2020

Little ping @NicolasHug in case this was drowned in all the other issues and PRs :)


@lucyleeow lucyleeow left a comment


A note about out_of_bounds that we should fix.

I think this shows that we should add strict=True/False with pytest.mark.parametrize to more tests, or at least to the OOB tests: test_isotonic_regression_oob_clip, test_isotonic_regression_oob_nan, test_isotonic_regression_oob_raise, etc.

Comment on lines 165 to 166
When set to `True`, points outside the training domain will be
extrapolated, unless `out_of_bounds="clip"`.

@lucyleeow lucyleeow Aug 25, 2020


Maybe we want to add another option to out_of_bounds if we want to extrapolate values outside of the training domain.
It doesn't make sense to give extrapolated values when out_of_bounds is 'nan' or 'raise' (and it isn't consistent with the behaviour when strict=False). If we want to offer extrapolation, we should add an 'extrapolate' option to out_of_bounds. But I am also not sure whether extrapolating is a good idea; @ogrisel and @NicolasHug will know more.
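To make the contrast concrete, here is a numpy sketch (not sklearn's implementation; the threshold values are made up) of what 'clip'-like vs 'nan'-like handling would mean for out-of-domain queries:

```python
import numpy as np

x = np.array([0.2, 0.5, 0.8])   # hypothetical interpolation thresholds
y = np.array([0.1, 0.4, 0.9])
query = np.array([0.0, 1.0])    # both outside the domain [0.2, 0.8]

# 'clip'-like: np.interp clamps to the boundary values by default.
clipped = np.interp(query, x, y)

# 'nan'-like: flag out-of-domain queries instead of clamping.
out_of_domain = (query < x[0]) | (query > x[-1])
as_nan = np.where(out_of_domain, np.nan, np.interp(query, x, y))
```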



We should also add a .. versionadded:: here


lucyleeow commented Aug 25, 2020

I also just wanted to raise that with strict=True we start extrapolating even within the training domain, specifically after the second-largest train value. Not sure how much of a problem this is. Here is an example (the largest value is 0.919):

import numpy as np
from sklearn.isotonic import IsotonicRegression

X = np.array([0.49835541, 0.54572331, 0.91999828, 0.33876463, 0.87974298,
              0.02375396, 0.33838427, 0.43110351, 0.5300294, 0.80951779])
y = np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1])

iso = IsotonicRegression(strict=False)
iso.fit(X, y)
iso.predict(np.array([0.9]))
# array([1.])

iso_strict = IsotonicRegression(strict=True)
iso_strict.fit(X, y)
iso_strict.predict(np.array([0.9]))
# array([1.64477914])

(Of course this is a big problem here, as we are predicting a probability > 1 for an x value within the training domain.)

We can understand why when we plot the thresholds:

[plot: fitted thresholds with strict=False]

[plot: fitted thresholds with strict=True]
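The inflated prediction comes from linearly extending the last retained segment once the flat top is removed. A toy sketch with made-up deduplicated thresholds (not the values from the example above):

```python
import numpy as np

# Hypothetical deduplicated thresholds: after removing the flat top
# segment, the last retained x sits well below the training maximum.
x = np.array([0.02, 0.43, 0.55])
y = np.array([0.0, 0.5, 1.0])

# Linearly extending the last segment out to x = 0.9 overshoots 1.0,
# which is nonsensical for a probability.
slope = (y[-1] - y[-2]) / (x[-1] - x[-2])
pred = y[-1] + slope * (0.9 - x[-1])
```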


dsleo commented Nov 6, 2020

Sorry for the delay in following up on this!

@lucyleeow, I've simplified the code: no more unnecessary extrapolation or issues with out_of_bounds, so the behavior is consistent for the two possible modes of strict.

Regarding the example you propose: when removing flat segments, we can fix both extremities of the interval (see here).

iso_strict = IsotonicRegression(strict=True)
iso_strict.fit(X, y)
iso_strict.predict(np.array([0.9]))
# array([0.98853113])

And this is the corresponding curve with strict=True:
[screenshot: the corresponding curve with strict=True]

This should now be ready for a second review then.

Base automatically changed from master to main January 22, 2021 10:52

ogrisel commented Oct 25, 2021

@dsleo there is an alternative to this PR proposed as #21454. Any comment would be appreciated.


dsleo commented Oct 27, 2021

@ogrisel thanks for the heads up. I am not familiar with centered isotonic regression and its uses, but if we can enforce monotonicity at the edges, that would also handle this. Perhaps through two parameters, centered and strict (where strict=True with centered=False wouldn't be an option)?


ogrisel commented Oct 29, 2021

> @ogrisel thanks for the heads up. I am not familiar with centered isotonic regression and its uses but if we can enforce monotonicity at the edges that would also handle this. Perhaps through two parameters, centered and strict (but where strict=True and centered=False wouldn't be an option)?

Let's centralize the discussion on the centered isotonic PR directly.

@lorentzenchr

See #21454.
