Describe the bug
With encode='onehot', get_feature_names_out() returns the feature names as intended:
output:
array(['one_0.0', 'one_1.0', 'one_2.0', 'one_3.0', 'one_4.0'],
dtype=object)
With any other encode value (for example 'ordinal', as in the reproduction steps below), the same call fails:
output:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/smds/miniconda3/envs/ross_docker_py37/lib/python3.7/site-packages/sklearn/preprocessing/_discretization.py", line 396, in get_feature_names_out
    return self._encoder.get_feature_names_out(input_features)
AttributeError: 'KBinsDiscretizer' object has no attribute '_encoder'
I tracked down the cause. The _encoder attribute is only created during fit when encode == 'onehot'; for any other value of encode it is never set:
https://github.com/scikit-learn/scikit-learn/blob/37ac6788c/sklearn/preprocessing/_discretization.py#L240-L248
get_feature_names_out() then accesses self._encoder unconditionally, so it raises AttributeError whenever encode is not 'onehot':
https://github.com/scikit-learn/scikit-learn/blob/37ac6788c/sklearn/preprocessing/_discretization.py#L376-L396
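To make the failure mode concrete, here is a toy model of the pattern described above, together with one possible guard. `ToyDiscretizer` and its methods are hypothetical and written only for illustration; this is not scikit-learn's code, and the guard is not necessarily how the maintainers would choose to fix it.

```python
class ToyDiscretizer:
    """Hypothetical stand-in for the pattern in this issue; not scikit-learn code."""

    def __init__(self, encode="onehot"):
        self.encode = encode

    def fit(self, feature_names, n_bins=5):
        self.feature_names_in_ = list(feature_names)
        if self.encode == "onehot":
            # The helper attribute is only created on this branch.
            self._encoder = [
                f"{name}_{b}" for name in self.feature_names_in_ for b in range(n_bins)
            ]
        return self

    def get_feature_names_out_buggy(self):
        # Unconditional access: fails whenever fit() skipped the branch above.
        return list(self._encoder)

    def get_feature_names_out_guarded(self):
        # Ordinal output keeps one column per input feature,
        # so the input names can be returned unchanged.
        if hasattr(self, "_encoder"):
            return list(self._encoder)
        return list(self.feature_names_in_)


toy = ToyDiscretizer(encode="ordinal").fit(["one"])

error_message = None
try:
    toy.get_feature_names_out_buggy()
except AttributeError as exc:
    error_message = str(exc)

guarded_names = toy.get_feature_names_out_guarded()
```

The guarded variant returns the input names for the ordinal case, which matches the expected result stated in this report.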
Steps/Code to Reproduce
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Any encode value other than 'onehot' triggers the error
kb = KBinsDiscretizer(n_bins=5, encode='ordinal')
kb.fit_transform(pd.DataFrame(range(100), columns=['one']))
kb.get_feature_names_out()  # raises AttributeError
Expected Results
The name of the passed-in feature; in this case, 'one'.
Actual Results
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/smds/miniconda3/envs/ross_docker_py37/lib/python3.7/site-packages/sklearn/preprocessing/_discretization.py", line 396, in get_feature_names_out
    return self._encoder.get_feature_names_out(input_features)
AttributeError: 'KBinsDiscretizer' object has no attribute '_encoder'
Versions