Normalizer and NaN values. #7

w32zhong · 2021-12-18T22:32:51Z

For StandardScaler, the Normalizer class looks like it is meant to support NaN values:

null_index = np.isnan(X)

However, during preprocessing, _fill_na() fills missing values with na_value for non-string dtypes.
So:

for dtype=str, the X values will be strings
for dtype=float/int, the missing X values will already be na_value

In the first case, np.isnan raises a TypeError because the elements of X are strings.
In the second case, there is little point in normalizing the numbers if na_value placeholders are mixed in with them.
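Both cases can be reproduced with NumPy alone. The snippet below is only a sketch: the helper `isnan_raises`, the sample arrays, and the choice of `na_value = 0.0` are made up for illustration and are not taken from the library.

```python
import numpy as np

def isnan_raises(values):
    """Return True if np.isnan rejects the array's dtype (hypothetical helper)."""
    try:
        np.isnan(values)
    except TypeError:
        return True
    return False

# Case 1: string-typed X. np.isnan is not defined for string dtypes.
str_X = np.array(["a", "b", ""])
str_case_fails = isnan_raises(str_X)

# Case 2: numeric X after missing values were filled (na_value assumed to be 0.0).
na_value = 0.0
raw_X = np.array([1.0, np.nan, 3.0])
filled_X = np.where(np.isnan(raw_X), na_value, raw_X)

# After filling, the mask is all False, so the NaN check has nothing to find.
null_index = np.isnan(filled_X)
```

Here `str_case_fails` is True, and `null_index` contains no True entries because the NaN has already been replaced.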

Is this behavior expected or not?


zhujiem · 2021-12-24T08:44:03Z

The bug has been fixed by removing null_index = np.isnan(X).
NaN values are not allowed for numeric data types; users must set na_value explicitly.
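As a rough illustration of that contract (not the repository's actual code), missing values are filled explicitly first, and the scaling step refuses NaN input. The function names `fill_na` and `standardize` and the chosen `na_value` are hypothetical.

```python
import numpy as np

def fill_na(X, na_value):
    """Replace NaN entries with an explicitly chosen na_value (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    return np.where(np.isnan(X), na_value, X)

def standardize(X):
    """Zero-mean, unit-variance scaling that rejects NaN input (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    if np.isnan(X).any():
        raise ValueError("NaN values are not allowed; set na_value explicitly")
    std = X.std()
    # Guard against a constant column, where the standard deviation is zero.
    return (X - X.mean()) / (std if std > 0 else 1.0)

raw = np.array([1.0, np.nan, 3.0])
scaled = standardize(fill_na(raw, na_value=2.0))
```

Calling `standardize(raw)` directly would raise a `ValueError`, which makes the missing `na_value` visible instead of silently propagating NaN through the scaler.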

zhujiem closed this as completed Dec 24, 2021
