Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Allow isotonic reg 2d input with 1 feature (GH15012) #17379

Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
2a6ae7f
allow isotonic reg 2d input with 1 feature (GH15012)
fujiaxiang May 29, 2020
b832a11
added PR number in whatsnew entry
fujiaxiang May 29, 2020
379dd23
corrected tag name in whatsnew entry
fujiaxiang May 29, 2020
90c4da5
added test to expect error when input has more than 1 feature
fujiaxiang May 29, 2020
0424c32
Update sklearn/isotonic.py
fujiaxiang Jun 2, 2020
b3f1dfc
Update sklearn/isotonic.py
fujiaxiang Jun 2, 2020
05eab7e
Update sklearn/tests/test_isotonic.py
fujiaxiang Jun 2, 2020
f562f55
Update sklearn/tests/test_isotonic.py
fujiaxiang Jun 2, 2020
b507b1c
Update sklearn/tests/test_isotonic.py
fujiaxiang Jun 2, 2020
c9b8fd9
Update sklearn/tests/test_isotonic.py
fujiaxiang Jun 2, 2020
1604cc6
Update sklearn/tests/test_isotonic.py
fujiaxiang Jun 2, 2020
1b8f71b
refactored code
fujiaxiang Jun 2, 2020
75728d4
Update doc/whats_new/v0.24.rst
fujiaxiang Jun 2, 2020
8d45ec6
Update sklearn/tests/test_isotonic.py
fujiaxiang Jun 2, 2020
99aa688
Merge branch 'master' into isotonic_reg_allow_2darray_with_1_feature
fujiaxiang Jun 9, 2020
17323ef
Merge remote-tracking branch 'upstream/master' into isotonic_reg_allo…
fujiaxiang Jun 10, 2020
1f5290f
Merge branch 'master' into isotonic_reg_allow_2darray_with_1_feature
fujiaxiang Jun 15, 2020
c8e3670
simplied whatsnew entry
fujiaxiang Jun 16, 2020
066fd9d
Merge remote-tracking branch 'origin/isotonic_reg_allow_2darray_with_…
fujiaxiang Jun 16, 2020
10e4fd2
Apply suggestions from code review
fujiaxiang Jun 16, 2020
d5d27da
Merge remote-tracking branch 'upstream/master' into isotonic_reg_allo…
fujiaxiang Jun 16, 2020
ee3e4f5
minor correction
fujiaxiang Jun 16, 2020
98f8db2
Merge remote-tracking branch 'upstream/master' into isotonic_reg_allo…
fujiaxiang Jun 25, 2020
0a5f65e
added attributes checking in unit test
fujiaxiang Jun 25, 2020
102e587
Merge remote-tracking branch 'upstream/master' into isotonic_reg_allo…
fujiaxiang Jul 6, 2020
ebebdf5
added tests for estimator attributes
fujiaxiang Jul 6, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
7 changes: 7 additions & 0 deletions doc/whats_new/v0.24.rst
Expand Up @@ -61,6 +61,13 @@ Changelog
change since `None` was defaulting to these values already.
:pr:`16493` by :user:`Darshan N <DarshanGowda0>`.

:mod:`sklearn.isotonic`
.......................

- |Enhancement| :meth:`IsotonicRegression.fit`, :meth:`IsotonicRegression.transform`
fujiaxiang marked this conversation as resolved.
Show resolved Hide resolved
and :meth:`IsotonicRegression.predict` now accepts 2darray with 1 feature as
input array. :pr:`17379` by :user:`Jiaxiang <fujiaxiang>`.

:mod:`sklearn.metrics`
......................

Expand Down
23 changes: 13 additions & 10 deletions sklearn/isotonic.py
Expand Up @@ -199,7 +199,7 @@ class IsotonicRegression(RegressorMixin, TransformerMixin, BaseEstimator):
>>> from sklearn.datasets import make_regression
>>> from sklearn.isotonic import IsotonicRegression
>>> X, y = make_regression(n_samples=10, n_features=1, random_state=41)
>>> iso_reg = IsotonicRegression().fit(X.flatten(), y)
>>> iso_reg = IsotonicRegression().fit(X, y)
>>> iso_reg.predict([.1, .2])
array([1.8628..., 3.7256...])
"""
Expand All @@ -211,9 +211,11 @@ def __init__(self, *, y_min=None, y_max=None, increasing=True,
self.increasing = increasing
self.out_of_bounds = out_of_bounds

def _check_fit_data(self, X, y, sample_weight=None):
if len(X.shape) != 1:
raise ValueError("X should be a 1d array")
def _check_input_data_shape(self, X, y=None, sample_weight=None):
fujiaxiang marked this conversation as resolved.
Show resolved Hide resolved
if not (len(X.shape) == 1 or (len(X.shape) == 2 and X.shape[1] == 1)):
fujiaxiang marked this conversation as resolved.
Show resolved Hide resolved
msg = "Isotonic regression input X should be a 1d array or " \
"2d array with 1 feature"
raise ValueError(msg)

def _build_f(self, X, y):
"""Build the f_ interp1d function."""
Expand All @@ -234,7 +236,8 @@ def _build_f(self, X, y):

def _build_y(self, X, y, sample_weight, trim_duplicates=True):
"""Build the y_ IsotonicRegression."""
self._check_fit_data(X, y, sample_weight)
self._check_input_data_shape(X, y, sample_weight)
X = X.reshape(-1)
fujiaxiang marked this conversation as resolved.
Show resolved Hide resolved

# Determine increasing if auto-determination requested
if self.increasing == 'auto':
Expand Down Expand Up @@ -283,7 +286,7 @@ def fit(self, X, y, sample_weight=None):

Parameters
----------
X : array-like of shape (n_samples,)
X : array-like of shape (n_samples,) or (n_samples, 1)
Training data.

y : array-like of shape (n_samples,)
Expand Down Expand Up @@ -327,7 +330,7 @@ def transform(self, T):

Parameters
----------
T : array-like of shape (n_samples,)
T : array-like of shape (n_samples,) or (n_samples, 1)
Data to transform.

Returns
Expand All @@ -343,8 +346,8 @@ def transform(self, T):

T = check_array(T, dtype=dtype, ensure_2d=False)

if len(T.shape) != 1:
raise ValueError("Isotonic regression input should be a 1d array")
self._check_input_data_shape(T)
T = T.reshape(-1)
fujiaxiang marked this conversation as resolved.
Show resolved Hide resolved

# Handle the out_of_bounds argument by clipping if needed
if self.out_of_bounds not in ["raise", "nan", "clip"]:
Expand All @@ -367,7 +370,7 @@ def predict(self, T):

Parameters
----------
T : array-like of shape (n_samples,)
T : array-like of shape (n_samples,) or (n_samples, 1)
Data to transform.

Returns
Expand Down
40 changes: 37 additions & 3 deletions sklearn/tests/test_isotonic.py
Expand Up @@ -9,9 +9,10 @@
IsotonicRegression, _make_unique)

from sklearn.utils.validation import check_array
from sklearn.utils._testing import (assert_raises, assert_array_equal,
assert_array_almost_equal,
assert_warns_message, assert_no_warnings)
from sklearn.utils._testing import (assert_raises, assert_allclose,
assert_array_equal,
assert_array_almost_equal,
assert_warns_message, assert_no_warnings)
from sklearn.utils import shuffle

from scipy.special import expit
Expand Down Expand Up @@ -508,3 +509,36 @@ def test_make_unique_dtype():
w = np.ones_like(x)
x, y, w = _make_unique(x, y, w)
assert_array_equal(x, [2, 3, 5])


def test_input_shape_validation():
# Test from #15012
# Check that IsotonicRegression can handle 2darray with only 1 feature
X = np.arange(10)
X_2d = X.reshape(-1, 1)
y = np.arange(10)

iso_reg = IsotonicRegression().fit(X, y)
iso_reg_2d = IsotonicRegression().fit(X_2d, y)

y_pred1 = iso_reg.predict(X)
y_pred2 = iso_reg_2d.predict(X_2d)
assert_allclose(y_pred1, y_pred2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also check that the estimator attributes match in shape. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jnothman, thanks for review. Are you referring to the _necessary_X_ and _necessary_y_ attributes? I have added checking for these two attributes. If it's something else, can you give a bit more details what you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant the public attributes X_min_, X_max_, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @jnothman, sorry for the late reply. I have added the checks as requested. Could you review again?



def test_isotonic_2darray_more_than_1_feature():
# Ensure IsotonicRegression raises error if input has more than 1 feature
X = np.arange(10)
X_2d = np.c_[X, X]
y = np.arange(10)

msg = "should be a 1d array or 2d array with 1 feature"
with pytest.raises(ValueError, match=msg):
IsotonicRegression().fit(X_2d, y)

iso_reg = IsotonicRegression().fit(X, y)
with pytest.raises(ValueError, match=msg):
iso_reg.predict(X_2d)

with pytest.raises(ValueError, match=msg):
iso_reg.transform(X_2d)