ENH store per-transformer index into feature space in FeatureUnion #1952

Closed
jnothman wants to merge 1 commit


jnothman
Member

jnothman commented May 8, 2013

FeatureUnion provided no simple way to tell which parts of the stack belonged to which transformer. This provides a feature_ptr_ attribute (of the form of csr_matrix.indptr) to solve that.
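To illustrate the indptr-style layout, here is a minimal sketch in plain Python. The helper name `feature_ptr_from_widths` and the example widths are assumptions made for illustration; they are not taken from the PR's code. In a FeatureUnion the widths would presumably come from the number of columns each sub-transformer returns from fit_transform.

```python
from itertools import accumulate


def feature_ptr_from_widths(widths):
    """Return indptr-style offsets: block i spans ptr[i]:ptr[i + 1]."""
    # Cumulative sum of the widths, with a leading 0 as in csr_matrix.indptr.
    return [0, *accumulate(widths)]


# Hypothetical output widths of three sub-transformers.
widths = [3, 5, 2]
feature_ptr = feature_ptr_from_widths(widths)

# Column range contributed by the second transformer in the stacked output.
start, stop = feature_ptr[1], feature_ptr[2]
```

With this layout, the last entry of the pointer array equals the total width of the stacked feature matrix.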

Caveats:

  • it can only be determined when fit_transform, not fit, is called
  • it will be incorrect if a sub-transformer's parameters change in a way that alters its output size at transform time without a refit.

Both these caveats would be solved if each transformer in sklearn provided a way to get the number of output features. I suggest a transformed_width_ attribute or get_transformed_width() method (better name?), the latter making it more clear that the output can be affected by set_params. I also suggest this be posed as an Easy Issue for someone to tackle.
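A rough sketch of how the suggested method could address both caveats, assuming every transformer exposed it. `get_transformed_width()` is only the name proposed in this comment and is not part of scikit-learn; `ToySelector` and `compute_feature_ptr` are hypothetical stand-ins written for this example.

```python
from itertools import accumulate


class ToySelector:
    """Hypothetical stand-in transformer that keeps the first k columns."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y=None):
        return self

    def get_transformed_width(self):
        # Proposed method (not a scikit-learn API): reports the output width
        # from the current parameters, so no transform call is needed.
        return self.k


def compute_feature_ptr(transformers):
    """Build indptr-style offsets from (name, transformer) pairs."""
    return [0, *accumulate(t.get_transformed_width() for _, t in transformers)]


union = [("a", ToySelector(k=2)), ("b", ToySelector(k=4))]
for _, transformer in union:
    transformer.fit([[0] * 6])  # fit only; fit_transform is never called

before = compute_feature_ptr(union)

# A parameter change that alters the output size is picked up without a refit.
union[0][1].k = 3
after = compute_feature_ptr(union)
```

Because the offsets are recomputed from the transformers on demand, they are available after a plain fit and stay consistent with later parameter changes.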

Finally, feature_ptr_ is compact and versatile, but it might be more usable if I add a method to get this data as a dict from transformer-name to slices.
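The dict form mentioned above could look roughly like the following. The function name `slices_by_name` and the transformer names are assumptions chosen for illustration; the PR does not define such a method.

```python
def slices_by_name(names, feature_ptr):
    """Map each transformer name to the slice of columns it produced."""
    if len(feature_ptr) != len(names) + 1:
        raise ValueError("feature_ptr must have one more entry than names")
    # Pair each name with consecutive offsets: ptr[i] and ptr[i + 1].
    return {
        name: slice(start, stop)
        for name, start, stop in zip(names, feature_ptr[:-1], feature_ptr[1:])
    }


# Hypothetical transformer names and offsets.
names = ["pca", "kbest", "tfidf"]
feature_ptr = [0, 3, 8, 10]
slices = slices_by_name(names, feature_ptr)

# Select the columns produced by "kbest" from one stacked row.
row = list(range(10))
kbest_columns = row[slices["kbest"]]
```

Slices have the advantage that they can index the column axis of a stacked array directly, for example `X[:, slices["kbest"]]` on a NumPy array.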

@jnothman
Member Author

Closing due to lack of interest and underspecification.

jnothman closed this Aug 10, 2014
jnothman reopened this Jun 28, 2017
@jnothman
Member Author

I reckon this kind of thing is still worth having. I intend to resurrect it. I still don't have a certain solution for the fit() case.

@adrinjalali
Member

Since feature_names_out_ is going to fix this, and one solution or the other will hopefully be accepted soon, I'm closing this one.

adrinjalali closed this Aug 7, 2019

2 participants