Closed
Description
SimpleImputer throws a ValueError if the array passed in contains np.inf, which was not the case with the old Imputer.
This may be caused by _validate_input being called before the actual fitting or transforming.
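To make the suspected cause concrete, here is a minimal sketch of the kind of finiteness check that input validation performs, loosely modelled on the lines visible in the traceback in this report. The function name assert_all_finite_sketch is hypothetical and the logic is a simplification, not scikit-learn's actual implementation. It only illustrates that allowing NaN does not imply allowing infinity.

```python
import numpy as np


def assert_all_finite_sketch(X, allow_nan=False):
    """Hypothetical, simplified stand-in for a finiteness check.

    Infinity is always rejected; NaN is rejected unless allow_nan is True.
    """
    X = np.asarray(X, dtype=float)
    has_inf = np.isinf(X).any()
    has_nan = np.isnan(X).any()
    if has_inf or (has_nan and not allow_nan):
        type_err = "infinity" if allow_nan else "NaN, infinity"
        raise ValueError(
            "Input contains {} or a value too large for {!r}.".format(
                type_err, X.dtype
            )
        )


s = np.array([-np.inf, 2, 3, 4, 5, 6, np.inf]).reshape(-1, 1)

try:
    # NaN is tolerated here, as an imputer would need, but inf is not.
    assert_all_finite_sketch(s, allow_nan=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Under this sketch, an array that contains only NaN passes with allow_nan=True, while the array from this report fails before any statistic is computed.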
Steps/Code to Reproduce
import numpy as np
from sklearn.impute import SimpleImputer
s = np.array([-np.inf, 2, 3, 4, 5, 6, np.inf]).reshape(-1, 1)
transformer = SimpleImputer(strategy='median')
transformer.fit(s)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-28-ee05450ab4c6> in <module>()
3 transformer = SimpleImputer(strategy='median')
4
----> 5 transformer.fit(s)
~/anaconda3/lib/python3.6/site-packages/sklearn/impute.py in fit(self, X, y)
221 self : SimpleImputer
222 """
--> 223 X = self._validate_input(X)
224
225 # default fill_value is 0 for numerical input and "missing_value"
~/anaconda3/lib/python3.6/site-packages/sklearn/impute.py in _validate_input(self, X)
195 "".format(self.strategy, X.dtype.kind))
196 else:
--> 197 raise ve
198
199 _check_inputs_dtype(X, self.missing_values)
~/anaconda3/lib/python3.6/site-packages/sklearn/impute.py in _validate_input(self, X)
188 try:
189 X = check_array(X, accept_sparse='csc', dtype=dtype,
--> 190 force_all_finite=force_all_finite, copy=self.copy)
191 except ValueError as ve:
192 if "could not convert" in str(ve):
~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
571 if force_all_finite:
572 _assert_all_finite(array,
--> 573 allow_nan=force_all_finite == 'allow-nan')
574
575 shape_repr = _shape_repr(array.shape)
~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan)
54 not allow_nan and not np.isfinite(X).all()):
55 type_err = 'infinity' if allow_nan else 'NaN, infinity'
---> 56 raise ValueError(msg_err.format(type_err, X.dtype))
57
58
ValueError: Input contains infinity or a value too large for dtype('float64').
Expected Results
import numpy as np
from sklearn.preprocessing import Imputer
s = np.array([-np.inf, 2, 3, 4, 5, 6, np.inf]).reshape(-1, 1)
transformer = Imputer(strategy='median')
transformer.fit(s)
/home/jzhou/anaconda3/lib/python3.6/site-packages/sklearn/utils/deprecation.py:58: DeprecationWarning:
Class Imputer is deprecated; Imputer was deprecated in version 0.20 and will be removed in 0.22. Import impute.SimpleImputer from sklearn instead.
Imputer(axis=0, copy=True, missing_values='NaN', strategy='median', verbose=0)
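For reference, the median of this column is well defined even though it contains infinities. The sketch below uses plain NumPy rather than either imputer class, on the assumption that a median strategy amounts to a NaN-ignoring per-column median. The second array, s_with_nan, is a made-up example and is not part of the original report.

```python
import numpy as np

# The array from this report: infinities at both ends, no missing values.
s = np.array([-np.inf, 2, 3, 4, 5, 6, np.inf]).reshape(-1, 1)

# NaN-ignoring median per column; infinities simply sort to the extremes.
column_median = np.nanmedian(s, axis=0)
print(column_median)

# Hypothetical variant with one missing value to be imputed.
s_with_nan = np.array([-np.inf, 2, np.nan, 4, np.inf]).reshape(-1, 1)
fill_value = np.nanmedian(s_with_nan, axis=0)

# Replace only the NaN entries, leaving the infinities untouched.
imputed = np.where(np.isnan(s_with_nan), fill_value, s_with_nan)
print(imputed.ravel())
```

The infinite entries do not prevent a median from being computed, so in principle they need not be rejected by a median strategy.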
Versions
System:
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
executable: /home/jzhou/anaconda3/bin/python
machine: Linux-4.10.0-42-generic-x86_64-with-debian-stretch-sid
BLAS:
macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
lib_dirs: /home/jzhou/anaconda3/lib
cblas_libs: mkl_rt, pthread
Python deps:
pip: 19.0.3
setuptools: 40.8.0
sklearn: 0.20.2
numpy: 1.16.2
scipy: 1.2.1
Cython: 0.28.2
pandas: 0.24.1