[WIP] Support pandas DataFrames and feature names in ExhaustiveFeatureSelector #380
Description
Adds support for feature names and pandas DataFrames in the ExhaustiveFeatureSelector. In particular, the EFS methods (fit, transform, etc.) now support pandas DataFrames in addition to NumPy(-like) arrays. Also, the feature names are recorded in self.k_feature_names_ as well as in self.subsets_['feature_names']. If a pandas DataFrame is provided as input, these feature names correspond to the column names. Otherwise, the string representations of the column indices are used as placeholders.
Finally, an optional feature_names parameter is added to the ExhaustiveFeatureSelector constructor, which allows users to pass custom feature names corresponding to the column indices to improve the interpretability of the selected feature subsets via self.subsets_['feature_names'] and self.k_feature_names_. Note that user-provided feature names take precedence over feature names based on column indices or pandas DataFrame columns, but they are only used for labeling purposes.
Related issues or pull requests
Pull Request Checklist
Update the ./docs/sources/CHANGELOG.md file (if applicable)
Add unit tests in the ./mlxtend/*/tests directories (if applicable)
Update the documentation under mlxtend/docs/sources/ (if applicable)
Run nosetests ./mlxtend -sv and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., nosetests ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)
Run flake8 ./mlxtend
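Illustration
The naming precedence outlined in the Description (user-provided names first, then DataFrame column names, then column indices as strings) could be sketched roughly as follows. This is only a minimal sketch and not the code in this PR: resolve_feature_names and subset_feature_names are hypothetical helpers that are not part of mlxtend, and the sketch assumes the input is either an object with a columns attribute (such as a pandas DataFrame) or a 2D row-indexable sequence.

```python
def resolve_feature_names(X, feature_names=None):
    """Hypothetical helper (not part of mlxtend): pick labels for the columns of X.

    Precedence: user-provided names > DataFrame column names > str(column index).
    """
    # pandas DataFrames expose their labels via `.columns`; plain arrays do not.
    columns = getattr(X, "columns", None)
    n_features = len(columns) if columns is not None else len(X[0])

    if feature_names is not None:
        if len(feature_names) != n_features:
            raise ValueError(
                "Expected %d feature names, got %d"
                % (n_features, len(feature_names))
            )
        return tuple(str(name) for name in feature_names)

    if columns is not None:
        return tuple(str(col) for col in columns)

    # Placeholder labels: the column indices as strings.
    return tuple(str(idx) for idx in range(n_features))


def subset_feature_names(names, subset_idx):
    """Hypothetical helper: map a tuple of column indices to their labels."""
    return tuple(names[idx] for idx in subset_idx)


X = [[5.1, 3.5, 1.4], [4.9, 3.0, 1.4]]

placeholder = resolve_feature_names(X)
custom = resolve_feature_names(
    X, feature_names=["sepal length", "sepal width", "petal length"]
)

print(subset_feature_names(placeholder, (0, 2)))
print(subset_feature_names(custom, (0, 2)))
```

In this sketch the names only label the selected column indices; they play no role in the subset search itself, which matches the "labeling purposes" note in the Description.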