feature_names_in_ in SLEP007 #23316

paulbkoch · 2022-05-09T22:53:10Z

I'm wondering why the spec for feature_names_in_ and get_feature_names_out() in SLEP007 requires a 1D NumPy array of objects instead of a list. It seems to me that the NumPy array would be slightly more verbose to create and use, require slightly more memory, and be slightly slower than the built-in data type. Admittedly, these are all minor points, but I couldn't think of any benefits.

https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep007/proposal.html
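
For readers comparing the two containers, here is a minimal sketch of the practical differences. It does not call scikit-learn; it only builds the kind of container the SLEP describes (a 1D ndarray with dtype=object) next to a plain list. The feature names are made up for illustration.

```python
import numpy as np

# Hypothetical feature names, used only for illustration.
names_list = ["age", "income", "zip_code"]

# The container SLEP007 describes: a 1D ndarray with dtype=object.
names_array = np.asarray(names_list, dtype=object)

# Converting between the two takes one call in each direction.
round_trip = names_array.tolist()

# Equality behaves differently: comparing lists yields one bool,
# comparing ndarrays yields an element-wise boolean array.
list_equal = names_list == ["age", "income", "zip_code"]
array_equal = names_array == np.asarray(names_list, dtype=object)
all_equal = bool(array_equal.all())

# Membership tests work on both containers.
has_income = "income" in names_array

# One thing the ndarray offers that a list does not: indexing by a
# list of positions (or a boolean mask), e.g. after feature selection.
selected = names_array[[0, 2]]
```

Code that only iterates over the names or checks membership works the same with either container. The differences show up mainly in equality checks and in index-based selection.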

thomasjpfan · 2022-05-10T02:55:14Z

Maintainer

I recall pointing out the container type during development here, but nobody voiced an opinion against the ndarray. Personally, I would have preferred a list, but I did not want to block the PR from being merged. (The PR was opened in July 2020 and merged in Aug 2021.)

Moving forward, I can see us being less restrictive, i.e. returning a "Sequence of strings", but we would need to be careful to maintain backward compatibility.
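
As a rough sketch of what the backward-compatibility concern could look like on the consumer side: the helper below accepts any sequence of strings and normalizes it to the 1D object ndarray that existing code may expect. The name `as_feature_name_array` is hypothetical and is not part of scikit-learn or SLEP007.

```python
from collections.abc import Sequence

import numpy as np


def as_feature_name_array(names):
    """Hypothetical helper: normalize feature names to a 1D object ndarray.

    Accepts a list, tuple, or 1D ndarray of strings. A bare string is
    rejected because it is technically a sequence of characters.
    """
    if isinstance(names, str):
        raise TypeError("expected a sequence of strings, got a single string")
    if isinstance(names, np.ndarray):
        if names.ndim != 1:
            raise ValueError("feature names must be one-dimensional")
        items = names.tolist()
    elif isinstance(names, Sequence):
        items = list(names)
    else:
        raise TypeError("expected a sequence of strings")
    if not all(isinstance(name, str) for name in items):
        raise TypeError("all feature names must be strings")
    return np.asarray(items, dtype=object)


# Lists, tuples, and ndarrays all normalize to the same container.
from_list = as_feature_name_array(["a", "b"])
from_tuple = as_feature_name_array(("a", "b"))
from_array = as_feature_name_array(np.array(["a", "b"]))
```

Downstream code that relies on ndarray behavior, such as element-wise comparison, would keep working if it normalized names this way before use. Whether such a shim is the right approach is an open design question.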

paulbkoch · 2022-06-12T02:42:33Z

Author

Thanks @thomasjpfan. Getting it in was undoubtedly the right choice, since this is a much-needed feature.

I was curious about this because I'm one of the developers of a scikit-learn-compatible package (InterpretML) that has been using feature names for several years, since they're critical for interpretability. We've snapped the rest of our interface to the SLEP007 spec, except for this one aspect. I think for now I'll take the lazy route, leave our interface slightly incompatible, and see if anyone raises an issue. Thanks for the insight into how this evolved.
