Store feature names for transformed estimators #22

Transformed estimators like GNNRegressor run a transformer on X before fitting or predicting. When X is a dataframe, the transform converts it to an array, which prevents sklearn from extracting feature names. To fix this, we wrap the transformed array in an ndarray subclass, NamedFeatureArray, that can store a `columns` attribute, and we do so before passing it to `fit` or `predict`. This tricks sklearn into treating the array as a dataframe, so feature names can be read and set on the estimator.

To accomplish this cleanly, we move all the actual transformation steps out of the individual estimators and into the TransformedKNeighborsMixin methods. If we need different `predict` methods for different estimators in the future, they can be re-implemented at the estimator level using the `_transform` method of their superclass.

To prevent regressions, this commit also expands the dataframe support test to check that feature names are correctly stored.
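For reviewers unfamiliar with ndarray subclassing, the wrapping idea can be sketched roughly as follows. This is a minimal illustration and not necessarily the code in this PR: the constructor signature, the `read_feature_names` helper, and the column names are assumptions made for the example, and the helper only stands in for a duck-typed check on `columns`, which may not match how every sklearn version detects dataframes.

```python
import numpy as np


class NamedFeatureArray(np.ndarray):
    """Illustrative ndarray subclass that carries a `columns` attribute."""

    def __new__(cls, array, columns=None):
        # View the input data as this subclass, then attach the names.
        obj = np.asarray(array).view(cls)
        obj.columns = columns
        return obj

    def __array_finalize__(self, obj):
        # Also runs for views and slices; copy names from the source if present.
        if obj is None:
            return
        self.columns = getattr(obj, "columns", None)


def read_feature_names(X):
    """Hypothetical stand-in for a duck-typed feature-name check."""
    columns = getattr(X, "columns", None)
    return None if columns is None else np.asarray(columns, dtype=object)


# Pretend this array is the output of the estimator's transformer.
transformed = np.array([[0.1, 0.2], [0.3, 0.4]])

# Wrap it so the original (assumed) dataframe column names travel with it.
X_named = NamedFeatureArray(transformed, columns=["elevation", "slope"])

names = read_feature_names(X_named)            # names recovered from the wrapper
plain_names = read_feature_names(transformed)  # a bare ndarray has no names
```

Because the wrapper is still an ndarray, downstream numerical code should be unaffected. One caveat of this sketch is that `__array_finalize__` copies `columns` onto slices unchanged, so the names can go stale if columns are sliced away.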


Commits on May 15, 2023