One Hot Encoder errors out if there are missing values even if there are no categorical features #3082

freddyaboulton · 2021-11-18T21:40:57Z

Repro

import pandas as pd
import woodwork as ww
from evalml.pipelines.components import OneHotEncoder
import pytest


df = pd.DataFrame({"a": [1.2, 2.3, 4.5, 6.7],
                   "b": [True, False, True, True],
                   "c": [4.5, 8.3, None, 4.3]})
df.ww.init(logical_types={"a": "Double",
                          "b": "Boolean",
                          "c": "Double"})

with pytest.raises(ValueError, match="Input contains NaN"):
    OneHotEncoder().fit_transform(df)

I would expect this to be a no-op since there are no categorical features in the data.
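
For illustration, here is a minimal sketch of the no-op behavior described above. It is not EvalML's implementation: `one_hot_encode_categoricals` is a hypothetical helper, and it assumes categorical columns are marked with the pandas `category` dtype rather than Woodwork logical types. The idea is to check for missing values only in the columns that will actually be encoded.

```python
import pandas as pd


def one_hot_encode_categoricals(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not part of EvalML): encode only categorical columns."""
    # Assumption: categorical features carry the pandas "category" dtype.
    cat_cols = [c for c in df.columns
                if isinstance(df[c].dtype, pd.CategoricalDtype)]
    if not cat_cols:
        # Nothing to encode, so return the data unchanged (NaNs included).
        return df.copy()
    # Validate only the columns that are about to be encoded.
    if df[cat_cols].isna().any().any():
        raise ValueError("Input contains NaN")
    return pd.get_dummies(df, columns=cat_cols)


# Same data as the repro: no categorical columns, one missing value in "c".
df = pd.DataFrame({"a": [1.2, 2.3, 4.5, 6.7],
                   "b": [True, False, True, True],
                   "c": [4.5, 8.3, None, 4.3]})
result = one_hot_encode_categoricals(df)
```

With no categorical columns present, the frame comes back unchanged and the NaN in `c` is left for a later step (such as an imputer) to handle.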

freddyaboulton · 2021-11-18T23:12:26Z

This is the root cause of #2967

freddyaboulton added the bug (Issues tracking problems with existing features) label Nov 18, 2021

freddyaboulton self-assigned this Nov 18, 2021

freddyaboulton mentioned this issue Nov 19, 2021

Handle boolean and categorical features for time series #3083

Merged

freddyaboulton closed this as completed in #3083 Nov 19, 2021
