
WIP: extend Pipeline transform and inverse_transform behavior #2561

Closed

Conversation

@schwarty

Pipeline transform and inverse_transform now skip the last step if it lacks the corresponding method.

The extension is useful to:

  • extract the features produced just before the last step of the pipeline, for debugging purposes
  • inverse_transform the coef_ attribute of a linear classifier, for example, to map it back to the input feature space (a sketch follows below)

Design discussion here: Issue #2562
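
A minimal sketch of both use cases, assuming the patched behavior (the estimator names and data below are illustrative, not taken from the PR):

# Illustrative sketch only; assumes the patched Pipeline behavior from this
# PR, with a final step that has neither transform nor inverse_transform
# (true of LogisticRegression in modern scikit-learn).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_features=20, random_state=0)
pipe = Pipeline([('sel', SelectKBest(k=5)),
                 ('clf', LogisticRegression())]).fit(X, y)

# Use case 1: the final step lacks transform, so it is skipped and we get
# the features as seen by the classifier, shape (n_samples, 5).
X_debug = pipe.transform(X)

# Use case 2: the final step lacks inverse_transform, so it is skipped and
# the coefficients are mapped back onto the 20 input features.
coef = pipe.named_steps['clf'].coef_        # shape (1, 5)
coef_full = pipe.inverse_transform(coef)    # shape (1, 20)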

@coveralls

Coverage Status

Coverage remained the same when pulling d9260f0 on schwarty:pipeline_inverse_transform into 03926cc on scikit-learn:master.

# Check whether the last step implements an inverse_transform method;
# if not, leave it out of the (reversed) list of steps to invert.
inverse_steps = self.steps[::-1]
if not hasattr(self.steps[-1][-1], 'inverse_transform'):
    inverse_steps = self.steps[:-1][::-1]
Member

Could you please edit the docstring of the inverse_transform method to explain that it works even if the last step does not support inverse_transform, and give the motivation for this feature, as was done for the forward case?
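
For readers of the diff excerpt above, here is a hedged reconstruction of how inverse_steps might be consumed by the rest of the patched method (a guess for illustration, not the PR's actual code):

def inverse_transform(self, X):
    # Skip the final step if it cannot invert: the behavior this PR adds.
    inverse_steps = self.steps[::-1]
    if not hasattr(self.steps[-1][-1], 'inverse_transform'):
        inverse_steps = self.steps[:-1][::-1]
    # Apply the remaining steps' inverse_transform in reverse order.
    for name, step in inverse_steps:
        X = step.inverse_transform(X)
    return X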

@ogrisel
Member

ogrisel commented Oct 30, 2013

Looks good to me. @schwarty please add a doc/whats_new.rst entry for this.

@schwarty
Author

Done. @ogrisel and @GaelVaroquaux what do you think?

@coveralls

Coverage Status

Coverage remained the same when pulling 1906989 on schwarty:pipeline_inverse_transform into 03926cc on scikit-learn:master.

@@ -60,6 +60,10 @@ Changelog
- Added :class:`linear_model.RANSACRegressor` meta-estimator for the robust
  fitting of regression models. By Johannes Schönberger.

- Extended transform and inverse_transform methods from
Member

Please add double back-ticks around code words like transform and inverse_transform.

@ogrisel
Member

ogrisel commented Oct 31, 2013

Apart from the nitpick comment, LGTM. +1 for merge.

@GaelVaroquaux
Member

As I just told @schwarty orally, I am not in favor of merging this PR, although I had backed the idea during our first discussions. I don't like the automatic change of behavior depending on subtle properties of the final learner. For instance, changing the code to use an LDA instead of a LogisticRegression would have really strange consequences with this PR.

@ogrisel
Member

ogrisel commented Oct 31, 2013

Do you have an alternative API to propose? Maybe a new constructor param?

I don't see your point with LDA. How would this PR change the behavior in this case? It is my understanding that only cases that would have previously raised an exception are now supported.

@GaelVaroquaux
Member

> I don't see your point with LDA. How would this PR change the behavior
> in this case? It is my understanding that only cases that would have
> previously raised an exception are now supported.

Yes: with a LogisticRegression it would have raised an error, but not
with LDA. Now both work, but do something different. The difference is
subtle and can easily fool someone.
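
A sketch of that subtlety, assuming the patched behavior (LDA is spelled LinearDiscriminantAnalysis in modern scikit-learn; the data and step names are illustrative):

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_features=20, n_classes=3, n_informative=6,
                           random_state=0)

for clf in (LogisticRegression(), LinearDiscriminantAnalysis()):
    pipe = Pipeline([('scale', StandardScaler()), ('clf', clf)]).fit(X, y)
    Xt = pipe.transform(X)
    # LogisticRegression defines no transform, so the patched Pipeline
    # skips it: Xt keeps all 20 scaled columns. LDA does define transform,
    # so it is applied: Xt has n_classes - 1 = 2 columns. Swapping the
    # classifier silently changes what transform returns.
    print(type(clf).__name__, Xt.shape)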

@ogrisel
Member

ogrisel commented Oct 31, 2013

Hmm, OK, I see your point. Still, I think @schwarty's use cases are very valid and are not addressed by the current implementation. Maybe @schwarty, you should put this PR back to WIP and raise the point on the mailing list to brainstorm alternative design ideas.

@GaelVaroquaux
Member

> Still, I think @schwarty's use cases are very valid and are not
> addressed by the current implementation.

I agree.

> Maybe @schwarty, you should put this PR back to WIP and raise the point
> on the mailing list to brainstorm alternative design ideas.

He has opened an issue.

@jnothman
Member

jnothman commented Nov 1, 2013

> I am not in favor of merging this PR, although I had backed the idea
> during our first discussions. I don't like the automatic change of
> behavior depending on subtle properties of the final learner.

I agree with @GaelVaroquaux. It's a case of implicit behaviour that could bite you. What I wouldn't mind (though perhaps it's overkill and will create API problems) is for Pipeline to support slicing. I also think it deserves a getter for the final estimator (.steps[-1][1]) just to increase readability:

# Proposed API (sketch): slicing a Pipeline returns a sub-pipeline and
# final_estimator is a getter for steps[-1][1]; neither existed then.
p = Pipeline([('sel', SelectKBest()), ('clf', LogisticRegression())])
p[:-1].inverse_transform(p.final_estimator.coef_)

@jnothman
Member

jnothman commented Nov 1, 2013

Or, just as indexing a list with an integer returns an element while a slice returns a list, you could have integer indexing return just the estimator:

# Under the same proposed API: p[-1] is the final estimator itself.
p[:-1].inverse_transform(p[-1].coef_)
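
A hedged sketch of the indexing @jnothman proposes (this was not the Pipeline API at the time, though scikit-learn later grew similar slicing support):

from sklearn.pipeline import Pipeline

class SliceablePipeline(Pipeline):
    """Pipeline with the list-like indexing proposed above (sketch only)."""

    def __getitem__(self, ind):
        if isinstance(ind, slice):
            # A slice returns a sub-pipeline, like slicing a list.
            return Pipeline(self.steps[ind])
        # An integer returns the estimator itself, like list indexing.
        return self.steps[ind][1]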

@GaelVaroquaux
Member

I think that there is a consensus against merging this PR. @schwarty: is it OK with you if I close it?

@amueller
Member

Closing, as @schwarty didn't object within the last 3 years.

@GaelVaroquaux
Member

GaelVaroquaux commented Oct 25, 2016 via email

@jnothman
Member

I suppose it's good when people making more money still drink beers with you. :)
