Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] NaN handling MinMaxScaler #11005

Merged
merged 13 commits into from Apr 21, 2018

Conversation

Projects
None yet
3 participants
@LucijaGregov
Copy link
Contributor

LucijaGregov commented Apr 21, 2018

Reference Issues/PRs

partially adressed #10404

What does this implement/fix? Explain your changes.

Pass-through NaN value in MinMaxScaler

Any other comments?

@LucijaGregov

This comment has been minimized.

Copy link
Contributor Author

LucijaGregov commented Apr 21, 2018

@glemaitre glemaitre changed the title (WiP) Nan handling minmax scaler [WIP] Nan handling minmax scaler Apr 21, 2018

@glemaitre glemaitre changed the title [WIP] Nan handling minmax scaler [WIP] NaN handling MinMaxScaler Apr 21, 2018

@@ -340,7 +340,8 @@ def partial_fit(self, X, y=None):
"You may consider to use MaxAbsScaler instead.")

X = check_array(X, copy=self.copy, warn_on_dtype=True,
estimator=self, dtype=FLOAT_DTYPES)
estimator=self, dtype=FLOAT_DTYPES,
force_all_finite="allow-nan")

data_min = np.min(X, axis=0)

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

it should be np.nanmin because np.min([1, 2, 3, np.nan]) will return np.nan

@@ -340,7 +340,8 @@ def partial_fit(self, X, y=None):
"You may consider to use MaxAbsScaler instead.")

X = check_array(X, copy=self.copy, warn_on_dtype=True,
estimator=self, dtype=FLOAT_DTYPES)
estimator=self, dtype=FLOAT_DTYPES,
force_all_finite="allow-nan")

data_min = np.min(X, axis=0)
data_max = np.max(X, axis=0)

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

np.max -> np.nanmax

@glemaitre

This comment has been minimized.

Copy link
Contributor

glemaitre commented Apr 21, 2018

@glemaitre

This comment has been minimized.

Copy link
Contributor

glemaitre commented Apr 21, 2018

@glemaitre glemaitre changed the title [WIP] NaN handling MinMaxScaler [MRG] NaN handling MinMaxScaler Apr 21, 2018

@@ -276,6 +276,10 @@ class MinMaxScaler(BaseEstimator, TransformerMixin):
Notes
-----

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

Remove this empty line

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`
by :user: `Lucija Gregov <LucijaGregov>`

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

Remove this empty line

@@ -73,7 +73,7 @@
'RANSACRegressor', 'RadiusNeighborsRegressor',
'RandomForestRegressor', 'Ridge', 'RidgeCV']

ALLOW_NAN = ['QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer']
ALLOW_NAN = ['Imputer', 'SimpleImputer', 'MICEImputer', 'MinMaxScaler', 'QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer']

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

There some repetition in this line.

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

no space after :issue:

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`
by :user: `Lucija Gregov <LucijaGregov>`

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor

no space after :user:

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`

This comment has been minimized.

Copy link
@glemaitre

glemaitre Apr 21, 2018

Contributor
:class:`preprocessing.MinMaxScaler`

LucijaGregov added some commits Apr 21, 2018

@glemaitre glemaitre changed the title [MRG] NaN handling MinMaxScaler [MRG+ 1] NaN handling MinMaxScaler Apr 21, 2018

@glemaitre glemaitre changed the title [MRG+ 1] NaN handling MinMaxScaler [MRG+1] NaN handling MinMaxScaler Apr 21, 2018

@glemaitre

This comment has been minimized.

Copy link
Contributor

glemaitre commented Apr 21, 2018

ping @rth

@rth
Copy link
Member

rth left a comment

See minor comment below

LGTM otherwise.

@@ -94,6 +94,9 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class:`preprocessing.MinMaxScaler` to pass through NaN values. :issue:`10404`
by :user:`Lucija Gregov <LucijaGregov>`

This comment has been minimized.

Copy link
@rth

rth Apr 21, 2018

Member

Dot missing at the end.

LucijaGregov added some commits Apr 21, 2018

@rth rth merged commit f1aedf6 into scikit-learn:master Apr 21, 2018

8 checks passed

ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: python2 Your tests passed on CircleCI!
Details
ci/circleci: python3 Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 95.11%)
Details
codecov/project 95.11% (+<.01%) compared to c3548a8
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
lgtm analysis: Python No alert changes
Details

@jnothman jnothman referenced this pull request Jun 16, 2018

Closed

Disregard NaNs in preprocessing #10404

6 of 7 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.