Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove CategoricalEncoder base class and get_feature_names function #1179

Merged
merged 2 commits into from
Sep 16, 2020

Conversation

dsherry
Copy link
Contributor

@dsherry dsherry commented Sep 15, 2020

Problem
The get_feature_names accessor doesn't work, now that our one-hot encoder is more than a wrapper around the sklearn one-hot encoder, and we aren't using the component_obj code path for it.

Repro

import pandas as pd
import numpy as np
import evalml
X, y = evalml.demos.load_breast_cancer()
ohe = evalml.pipelines.components.OneHotEncoder()
ohe.fit(X, y=y)
ohe.get_feature_names()

Results in

AttributeError: 'NoneType' object has no attribute 'feature_names'

Proposal
Let's delete get_feature_names. We have a different API in place for accessing the list of the names of the features provided to each component in a pipeline.

At that point, CategoricalEncoder is just an empty class only referenced by one other piece of code. So we should remove it.

@dsherry dsherry added the bug Issues tracking problems with existing features. label Sep 15, 2020
@codecov
Copy link

codecov bot commented Sep 15, 2020

Codecov Report

Merging #1179 into main will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1179   +/-   ##
=======================================
  Coverage   99.91%   99.92%           
=======================================
  Files         197      196    -1     
  Lines       11705    11700    -5     
=======================================
- Hits        11695    11691    -4     
+ Misses         10        9    -1     
Impacted Files Coverage Δ
evalml/pipelines/__init__.py 100.00% <ø> (ø)
evalml/pipelines/components/__init__.py 100.00% <ø> (ø)
...lines/components/transformers/encoders/__init__.py 100.00% <ø> (ø)
...alml/pipelines/components/transformers/__init__.py 100.00% <100.00%> (ø)
...components/transformers/encoders/onehot_encoder.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a6470d1...5c9a16f. Read the comment docs.

Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this cleanup! LGTM 👍

@dsherry
Copy link
Contributor Author

dsherry commented Sep 16, 2020

Note I filed a separate issue #1183 to track adding this attribute back in, with proper support.

@dsherry dsherry merged commit 35aae78 into main Sep 16, 2020
This was referenced Sep 17, 2020
@freddyaboulton freddyaboulton deleted the ds_remove_encoder_base_class branch May 13, 2022 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issues tracking problems with existing features.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants