Skip to content

Commit

Permalink
FIX weights computation with ties in IsotonicRegression (scikit-learn…
Browse files Browse the repository at this point in the history
  • Loading branch information
dallascard authored and maskani-moh committed Nov 15, 2017
1 parent 29be5dc commit 960707f
Show file tree
Hide file tree
Showing 3 changed files with 34 additions and 2 deletions.
8 changes: 8 additions & 0 deletions doc/whats_new/v0.20.rst
Expand Up @@ -16,6 +16,7 @@ occurs due to changes in the modelling logic (bug fixes or enhancements), or in
random sampling procedures.

- :class:`decomposition.IncrementalPCA` in Python 2 (bug fix)
- :class:`isotonic.IsotonicRegression` (bug fix)

Details are listed in the changelog below.

Expand Down Expand Up @@ -68,6 +69,13 @@ Linear, kernelized and related models
Bug fixes
.........

Classifiers and regressors

- Fixed a bug in :class:`isotonic.IsotonicRegression` which incorrectly
combined weights when fitting a model to data involving points with
identical X values.
:issue:`9432` by :user:`Dallas Card <dallascard>`

Decomposition, manifold learning and clustering

- Fix for uninformative error in :class:`decomposition.IncrementalPCA`:
Expand Down
4 changes: 2 additions & 2 deletions sklearn/_isotonic.pyx
Expand Up @@ -100,7 +100,7 @@ def _make_unique(np.ndarray[dtype=np.float64_t] X,
if x != current_x:
# next unique value
x_out[i] = current_x
weights_out[i] = current_weight / current_count
weights_out[i] = current_weight
y_out[i] = current_y / current_weight
i += 1
current_x = x
Expand All @@ -113,6 +113,6 @@ def _make_unique(np.ndarray[dtype=np.float64_t] X,
current_count += 1

x_out[i] = current_x
weights_out[i] = current_weight / current_count
weights_out[i] = current_weight
y_out[i] = current_y / current_weight
return x_out, y_out, weights_out
24 changes: 24 additions & 0 deletions sklearn/tests/test_isotonic.py
Expand Up @@ -166,6 +166,30 @@ def test_isotonic_regression_ties_secondary_():
assert_array_almost_equal(ir.fit_transform(x, y), y_true, 4)


def test_isotonic_regression_with_ties_in_differently_sized_groups():
"""
Non-regression test to handle issue 9432:
https://github.com/scikit-learn/scikit-learn/issues/9432
Compare against output in R:
> library("isotone")
> x <- c(0, 1, 1, 2, 3, 4)
> y <- c(0, 0, 1, 0, 0, 1)
> res1 <- gpava(x, y, ties="secondary")
> res1$x
`isotone` version: 1.1-0, 2015-07-24
R version: R version 3.3.2 (2016-10-31)
"""
x = np.array([0, 1, 1, 2, 3, 4])
y = np.array([0, 0, 1, 0, 0, 1])
y_true = np.array([0., 0.25, 0.25, 0.25, 0.25, 1.])
ir = IsotonicRegression()
ir.fit(x, y)
assert_array_almost_equal(ir.transform(x), y_true)
assert_array_almost_equal(ir.fit_transform(x, y), y_true)


def test_isotonic_regression_reversed():
y = np.array([10, 9, 10, 7, 6, 6.1, 5])
y_ = IsotonicRegression(increasing=False).fit_transform(
Expand Down

0 comments on commit 960707f

Please sign in to comment.