Encode features with "unknown" class in categorical #287

merged 6 commits into from Oct 16, 2018



Addresses #280 by checking for "unknown" class in a categorical column to be encoded and adding an unknown_token parameter to specify the string used to stand in for the unknown class. If "unknown" is present in the column and unknown_token="unknown", an AssertionError is raised telling the user to change the unknown_token parameter.
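The collision check described above could look roughly like the following. This is only a sketch using plain pandas: `check_unknown_token` is a hypothetical helper name, not a function in the library, and the error message wording is an assumption.

```python
import pandas as pd


def check_unknown_token(values: pd.Series, unknown_token: str = "unknown") -> None:
    """Fail if the placeholder token already appears as a real category.

    Hypothetical helper for illustration; not the code in this PR.
    """
    # Missing values are ignored; only real category labels can collide.
    existing = set(values.dropna())
    assert unknown_token not in existing, (
        f'"{unknown_token}" is already a value in column "{values.name}"; '
        "pass a different unknown_token"
    )


column = pd.Series(["red", "blue", "unknown"], name="color")
# A token that does not occur in the data passes silently.
check_unknown_token(column, unknown_token="other")
```

Calling `check_unknown_token(column)` with the default token would raise an `AssertionError` here, because the string "unknown" is itself one of the categories.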

@kmax12 kmax12 self-requested a review October 15, 2018 16:48

WillKoehrsen commented Oct 15, 2018

Changed the fix to be much simpler: Unknown classes are now encoded as "{feature_name} is unknown". Thanks to @kmax12 for the suggestion.
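To illustrate why naming the catch-all column "{feature_name} is unknown" avoids the clash, here is a minimal sketch in plain pandas. The helper `encode_with_unknown`, its `top_n` argument, and the `"{name} = {value}"` column naming are assumptions made for the example; this is not the implementation merged in this PR.

```python
import pandas as pd


def encode_with_unknown(values: pd.Series, top_n: int = 2) -> pd.DataFrame:
    """One-hot encode the top_n most frequent values of a categorical column.

    Hypothetical helper for illustration; not the code in this PR.
    """
    name = values.name
    # Keep only the most frequent categories as their own columns.
    top_values = values.value_counts().head(top_n).index
    encoded = pd.DataFrame(index=values.index)
    for value in top_values:
        encoded[f"{name} = {value}"] = (values == value).astype(int)
    # Everything else (including missing values) goes into one catch-all
    # column whose name differs from any "<name> = <value>" column, so a
    # real category called "unknown" cannot collide with it.
    encoded[f"{name} is unknown"] = (~values.isin(top_values)).astype(int)
    return encoded


country = pd.Series(
    ["US", "US", "unknown", "unknown", "unknown", "FR"], name="country"
)
result = encode_with_unknown(country, top_n=2)
print(result.columns.tolist())
```

In this example the literal category "unknown" gets its own column, `country = unknown`, while the rare value "FR" is counted in `country is unknown`, so the two meanings stay separate without any extra parameter.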

defaults to True
to_encode (list[str]): List of feature names to encode.
features not in this list are unencoded in the output matrix
defaults to encode all necessary features.
inplace (bool): Encode feature_matrix in place. Defaults to False.
verbose (str): Print progress info.
unknown_token (str): token used to replace unknown class in categorical column.

i don't think we need this anymore


kmax12 commented Oct 15, 2018

fix linting and address my one comment, then it's good to merge.


kmax12 commented Oct 16, 2018

Looks good. Merging

@kmax12 kmax12 merged commit 4938bb5 into master Oct 16, 2018
@gsheni gsheni deleted the encode_features_unknown branch October 24, 2018 15:37
@rwedge rwedge mentioned this pull request Oct 31, 2018
2 participants