Open
Labels: enhancement (New feature or request)
Description
The hyperactive v4 wrappers of GFO encode categorical features as consecutive integers. This kind of encoding is a desirable feature, potentially as a default.
Related issues:
- There is also a secondary effect: numerical values are encoded as integers as well, which may or may not be desired by the user depending on circumstance.
- As an alternative to consecutive encoding, one could think of one-hot encoding. Note that pure categoricals in general do not have an order, whereas consecutive integers imply one.
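To make the difference between the two encodings concrete, here is a minimal sketch in plain Python. The search space, parameter names, and helper functions are hypothetical illustrations, not the hyperactive or GFO API.

```python
# Hypothetical search space: one categorical and one numerical parameter.
search_space = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1.0, 10.0],
}


def encode_consecutive(values):
    """Map each candidate value to its position in the list: 0, 1, 2, ..."""
    return {value: index for index, value in enumerate(values)}


def encode_one_hot(values):
    """Map each candidate value to a 0/1 indicator vector with a single 1."""
    n = len(values)
    return {
        value: [int(i == index) for i in range(n)]
        for index, value in enumerate(values)
    }


kernel_int = encode_consecutive(search_space["kernel"])
kernel_one_hot = encode_one_hot(search_space["kernel"])

# Applying the same consecutive encoding to a numerical parameter
# replaces the actual values with their positions.
C_int = encode_consecutive(search_space["C"])
```

In this sketch, `kernel_int` places "rbf" between "linear" and "poly", an order the categories do not actually have, while `kernel_one_hot` avoids that at the cost of more dimensions. `C_int` shows the secondary effect on numericals: the values 0.1, 1.0, 10.0 become 0, 1, 2, so their spacing is lost.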
Some designs I can think of:
1. The current `hyperactive v4` design, which applies consecutive integer encoding by default to all categoricals and numericals.
2. Encoding only categoricals, leaving numericals as-is.
3. Having tags for estimators on whether they can handle categoricals, e.g., `capability:categorical`. Estimators that cannot handle categoricals, such as native GFO, return an error if categoricals are passed. They can be wrapped in meta-estimators such as CategoricalEncoder.
4. Similar to 3, except that estimators without the capability encode automatically, like `hyperactive v4`.
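The tag-based designs (3 and 4) could look roughly like the sketch below. Everything here is an assumption for illustration: the class `NumericOnlyOptimizer`, the `tags` attribute, the `validate` method, and the helper functions are hypothetical, and `CategoricalEncoder` is only a stand-in for the meta-estimator named above, not an existing implementation.

```python
def has_categoricals(search_space):
    """Return True if any candidate value is not an int or float."""
    return any(
        not isinstance(value, (int, float))
        for values in search_space.values()
        for value in values
    )


class NumericOnlyOptimizer:
    """Hypothetical stand-in for an optimizer without categorical support."""

    tags = {"capability:categorical": False}

    def validate(self, search_space):
        # Design 3: raise an error if categoricals are passed.
        if has_categoricals(search_space) and not self.tags["capability:categorical"]:
            raise TypeError("categorical parameters are not supported")
        return search_space


class CategoricalEncoder:
    """Hypothetical meta-estimator: encodes categoricals, then delegates."""

    tags = {"capability:categorical": True}

    def __init__(self, optimizer):
        self.optimizer = optimizer

    def validate(self, search_space):
        encoded = {}
        for name, values in search_space.items():
            if has_categoricals({name: values}):
                # Consecutive integer encoding, categoricals only.
                encoded[name] = list(range(len(values)))
            else:
                # Numericals are left as-is.
                encoded[name] = list(values)
        return self.optimizer.validate(encoded)


def validate_with_auto_encoding(optimizer, search_space):
    """Design 4: wrap automatically when the capability tag is False."""
    if has_categoricals(search_space) and not optimizer.tags["capability:categorical"]:
        optimizer = CategoricalEncoder(optimizer)
    return optimizer.validate(search_space)


space = {"kernel": ["linear", "rbf", "poly"], "C": [0.1, 1.0, 10.0]}
explicit = CategoricalEncoder(NumericOnlyOptimizer()).validate(space)
automatic = validate_with_auto_encoding(NumericOnlyOptimizer(), space)
```

Under design 3, the user has to opt in by wrapping the optimizer, so the encoding is always visible in the code. Under design 4, the same wrapping happens implicitly, which is more convenient but hides the encoding step from the user.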