There has been some demand for Pipeline.get_feature_names (#2007, #5172, #6421) for the case where the last element in the pipeline is a feature extractor. Following on from #6372, we should instead make get_feature_names able to transform a list of input features in the general case. I propose the following behaviour:
Pipeline.get_feature_names may be called with a list input_features as an argument only if all its estimators support get_feature_names with an argument. The initial input_features is transformed iteratively through the estimators.
Pipeline.get_feature_names may be called without an argument only if every estimator in some suffix of the pipeline supports get_feature_names. The first estimator of that suffix may or may not accept input_features, and the remainder must accept input_features; the output of the first get_feature_names call is iteratively modified by the downstream transformers' get_feature_names.
To be cautious until we find a use case that requires otherwise, Pipeline.get_feature_names will not be supported when an estimator earlier in the pipeline, but not adjacent to that suffix, also provides get_feature_names.
Otherwise, a ValueError is raised. Or: should the attribute become invisible, as for predict et al.?
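To make the rules above concrete, here is a rough sketch of one possible reading of them. It is not an implementation against scikit-learn: pipeline_feature_names, accepts_input_features and the toy estimators (Vocabulary, Squarer, Scaler) are hypothetical names invented for illustration, and the signature check assumes that "accepts input_features" means "get_feature_names takes at least one parameter".

```python
import inspect


def accepts_input_features(estimator):
    """True if estimator.get_feature_names takes at least one parameter."""
    method = getattr(estimator, "get_feature_names", None)
    if method is None:
        return False
    # For a bound method, the signature no longer includes `self`.
    return len(inspect.signature(method).parameters) >= 1


def pipeline_feature_names(estimators, input_features=None):
    """Hypothetical helper; `estimators` is a plain list of fitted steps."""
    if input_features is not None:
        # With an argument: every step must accept input_features.
        if not all(accepts_input_features(est) for est in estimators):
            raise ValueError("input_features was passed, but some step's "
                             "get_feature_names does not accept it")
        names = list(input_features)
        for est in estimators:
            names = est.get_feature_names(names)
        return list(names)

    # Without an argument: walk backwards to find the start of the suffix.
    start = len(estimators)
    for i in range(len(estimators) - 1, -1, -1):
        est = estimators[i]
        if not hasattr(est, "get_feature_names"):
            break
        start = i
        if not accepts_input_features(est):
            break  # a step that takes no argument can only begin the suffix
    if start == len(estimators):
        raise ValueError("the last step does not support get_feature_names")
    # Cautious rule: reject get_feature_names before, but not adjacent to,
    # the suffix.
    earlier = estimators[:max(start - 1, 0)]
    if any(hasattr(est, "get_feature_names") for est in earlier):
        raise ValueError("get_feature_names is available before the suffix")

    names = estimators[start].get_feature_names()
    for est in estimators[start + 1:]:
        names = est.get_feature_names(names)
    return list(names)


# Toy estimators, purely illustrative.
class Vocabulary:
    def __init__(self, words):
        self.words = words

    def get_feature_names(self):  # takes no argument
        return list(self.words)


class Squarer:
    def get_feature_names(self, input_features=None):
        if input_features is None:
            input_features = ["x0", "x1"]
        return [name + "^2" for name in input_features]


class Scaler:  # has no get_feature_names at all
    pass
```

Under this reading, pipeline_feature_names([Scaler(), Vocabulary(["a", "b"]), Squarer()]) starts the suffix at Vocabulary and lets Squarer rewrite its output, while [Vocabulary(["a", "b"]), Scaler(), Squarer()] is rejected by the cautious rule.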
Do you mean an AttributeError if the last step has no get_feature_names? The problem with the AttributeError is that the definition currently allows for get_feature_names that does not take an argument. Testing for this when doing the attribute lookup is fairly heavy. (Though I suspect that we will require get_feature_names to take an argument, even if unused, in any estimator where the pipeline functionality is sought.)
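For reference, the "invisible attribute" option could look roughly like the sketch below. TinyPipeline, WithNames and WithoutNames are hypothetical stand-ins, not scikit-learn classes, and the sketch only checks the last step; it deliberately does not inspect whether get_feature_names takes an argument, which is the heavier check discussed above.

```python
class TinyPipeline:
    """Hypothetical stand-in for a pipeline; not scikit-learn's Pipeline."""

    def __init__(self, steps):
        self.steps = list(steps)

    @property
    def get_feature_names(self):
        # Looking the attribute up on the last step raises AttributeError
        # when it is missing, which makes hasattr(pipeline, ...) False.
        return self.steps[-1].get_feature_names


class WithNames:
    def get_feature_names(self):
        return ["a", "b"]


class WithoutNames:
    pass


visible = TinyPipeline([WithoutNames(), WithNames()])
hidden = TinyPipeline([WithNames(), WithoutNames()])
```

With this pattern, hasattr(visible, "get_feature_names") is True and hasattr(hidden, "get_feature_names") is False, so duck-typing checks behave as they do for delegated methods such as predict.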
Ah, I didn't think about that. But these are two different errors, right? One is that there is no suffix with get_feature_names, and the other is that input_features was passed and there is no suffix whose get_feature_names accepts it.