Store feature names for transformed estimators #22

Merged May 15, 2023 (2 commits)

Commits on May 15, 2023

  1. Store feature names for transformed estimators #20

    Transformed estimators like GNNRegressor run a transformer on X
    before fitting or predicting. When X is a dataframe, transforming
    converts it into an array, preventing sklearn from extracting
    feature names. To fix this, we wrap the transformed array in
    NamedFeatureArray, an ndarray subclass that can store a `columns`
    attribute, before passing it to `fit` or `predict`. This makes
    sklearn treat the array like a dataframe, so feature names can
    be read from it and set on the estimator.
    
    To accomplish this cleanly, we move all the actual transformation
    steps out of the individual estimators and into the
    TransformedKNeighborsMixin methods. If we need to implement
    different `predict` methods for different estimators in the
    future, they can be re-implemented at the estimator level to use
    the _transform method of their superclass.
    
    To prevent regressions, this commit also expands the dataframe
    support test to check that feature names are correctly stored.
    aazuspan committed May 15, 2023
    d518f4d
  2. ab8deb4
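
The approach described in the first commit message can be illustrated with a minimal sketch. This is not the code from the PR: only the name `NamedFeatureArray`, the `columns` attribute, and the `_transform` method come from the commit message. `FrameLike`, `CenterTransformer`, and `TransformedEstimatorSketch` are hypothetical stand-ins, and the line that sets `feature_names_in_` only imitates the `columns` lookup that the commit message attributes to sklearn, so the sketch needs nothing beyond NumPy.

```python
import numpy as np


class NamedFeatureArray(np.ndarray):
    """ndarray subclass that can carry a `columns` attribute (sketch)."""

    def __new__(cls, array, columns=None):
        obj = np.asarray(array).view(cls)
        obj.columns = columns
        return obj

    def __array_finalize__(self, obj):
        # Also runs for views and slices; copy `columns` over when the
        # source array has one, otherwise default to None.
        if obj is None:
            return
        self.columns = getattr(obj, "columns", None)


class FrameLike:
    """Hypothetical stand-in for a dataframe: values plus column labels."""

    def __init__(self, values, columns):
        self.values = np.asarray(values, dtype=float)
        self.columns = list(columns)


class CenterTransformer:
    """Hypothetical transformer that returns a plain array (names are lost)."""

    def fit(self, X):
        self.mean_ = np.asarray(getattr(X, "values", X), dtype=float).mean(axis=0)
        return self

    def transform(self, X):
        return np.asarray(getattr(X, "values", X), dtype=float) - self.mean_


class TransformedEstimatorSketch:
    """Hypothetical estimator that keeps all transformation in `_transform`."""

    def __init__(self, transform):
        self.transform_ = transform

    def _transform(self, X):
        # Remember the names before the transformer discards them, then
        # re-attach them to the transformed array.
        names = getattr(X, "columns", None)
        return NamedFeatureArray(self.transform_.transform(X), columns=names)

    def fit(self, X):
        self.transform_.fit(X)
        X_t = self._transform(X)
        # Assumption: imitates a `columns`-based lookup; a real estimator
        # would pass X_t on to its parent class's `fit` instead.
        if X_t.columns is not None:
            self.feature_names_in_ = np.asarray(X_t.columns, dtype=object)
        self.fitted_X_ = X_t
        return self


frame = FrameLike([[1.0, 10.0], [3.0, 30.0]], columns=["a", "b"])
est = TransformedEstimatorSketch(CenterTransformer()).fit(frame)
print(list(est.feature_names_in_))
```

Because the transformation lives in a single `_transform` method, an estimator that needs a different `predict` can override `predict` alone and still call `_transform` to get an array that carries the feature names.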