ENH store per-transformer index into feature space in FeatureUnion #1952

jnothman · 2013-05-08T23:45:04Z

FeatureUnion provided no simple way to tell which parts of the stack belonged to which transformer. This provides a feature_ptr_ attribute (of the form of csr_matrix.indptr) to solve that.
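
To illustrate the indptr-style layout described above, here is a minimal sketch in plain Python. It is not the code from this PR: the helper name `feature_ptr_from_widths` and the example widths are assumptions made up for illustration. The idea is that transformer `i` owns columns `feature_ptr[i]:feature_ptr[i + 1]` of the stacked output.

```python
from itertools import accumulate


def feature_ptr_from_widths(widths):
    """Build indptr-style offsets from per-transformer output widths.

    Hypothetical helper: block i spans columns ptr[i]:ptr[i + 1].
    """
    return [0] + list(accumulate(widths))


# Assumed output widths of three sub-transformers (illustrative values only).
widths = [3, 5, 2]
feature_ptr = feature_ptr_from_widths(widths)

# Column range belonging to the second transformer (index 1).
start, stop = feature_ptr[1], feature_ptr[2]
```

As with `csr_matrix.indptr`, the list has one more entry than there are transformers, and its last entry is the total number of output features.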

Caveats:

It can only be determined when fit_transform, not fit, is called.
It will be incorrect if the parameters of one of the sub-transformers change in a way that affects the output size at transform time without a refit.

Both of these caveats would be solved if each transformer in sklearn provided a way to get its number of output features. I suggest a transformed_width_ attribute or a get_transformed_width() method (better name?), the latter making it clearer that the output can be affected by set_params. I also suggest this be posed as an Easy Issue for someone to tackle.

Finally, feature_ptr_ is compact and versatile, but it might be more usable if I add a method to get this data as a dict from transformer-name to slices.
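
A rough sketch of what such a dict view could look like, assuming a list of transformer names and an indptr-style offsets list are available. The function name `slices_by_name` and the example names are hypothetical and not part of this PR.

```python
def slices_by_name(names, feature_ptr):
    """Map each transformer name to the slice of columns it produced.

    Hypothetical helper: pairs consecutive offsets in feature_ptr.
    """
    return {
        name: slice(start, stop)
        for name, start, stop in zip(names, feature_ptr[:-1], feature_ptr[1:])
    }


# Illustrative names and offsets (assumed, not taken from a real FeatureUnion).
names = ["tfidf", "pca", "length"]
feature_ptr = [0, 3, 8, 10]
slices = slices_by_name(names, feature_ptr)

# Use a slice to pull one transformer's columns out of a stacked row.
row = list(range(10))
pca_part = row[slices["pca"]]
```

The slices work equally on the column axis of a 2-D array, e.g. `Xt[:, slices["pca"]]`.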

jnothman · 2014-08-10T12:17:15Z

Closing due to lack of interest and underspecification.

jnothman · 2017-06-28T06:06:50Z

I reckon this kind of thing is still worth having. I intend to resurrect it. I still don't have a certain solution for the fit() case.

adrinjalali · 2019-08-07T12:17:24Z

Since having feature_names_out_ is going to fix this, and one solution or the other will hopefully be accepted soon, closing this one.

ENH store per-transformer index into feature space in FeatureUnion

025c118

jnothman mentioned this pull request May 27, 2013

get_feature_names support for pipelines #2007

Closed

jnothman mentioned this pull request Jul 4, 2013

[MRG] Non-categorical variables in OneHotEncoder. #2027

Merged

jnothman closed this Aug 10, 2014

jnothman reopened this Jun 28, 2017

jnothman added help wanted Stalled labels Nov 29, 2017

This was referenced Jul 10, 2018

[MRG+2] Merge discrete branch into master #9342

Merged

Support inverse_transform in ColumnTransformer #11463

Open

jnothman mentioned this pull request Nov 21, 2018

RFC Implement Pipeline get feature names #12627

Closed

3 tasks

adrinjalali closed this Aug 7, 2019
