[MRG+2] Implemented SelectFromModel meta-transformer #4242

MechCoder · 2015-02-12T10:00:37Z

Continuation of #3011

MechCoder · 2015-02-12T10:03:57Z

@amueller @agramfort Do you agree with the general direction in which this PR is going?

Basically a meta-transformer is coupled with an estimator to provide the transform methods, instead of inheriting from _LearntSelectorMixin.

agramfort · 2015-02-12T22:52:20Z

hum. What's the vision? too fuzzy for me now

ogrisel · 2015-02-13T10:05:01Z

Can you please summarize the discussion of #3011 in the description of this PR?

MechCoder · 2015-02-13T10:07:05Z

Sure, just give me some time.

jnothman · 2015-02-15T09:55:35Z

#3011 was meant to pilot the idea of having selection based on coefficients/importances performed by a metaestimator rather than a mixin to predictors. Part of the point is not to clutter the individual estimators' APIs with parameters only used in rare cases. Instead, a metaestimator should:

make code much more explicit as to the meaning of the transformation (i.e. constructing a metaestimator provides intrinsic documentation, while using a predictor as a transformer without comment is a bit awkward);
provide a focal point for documenting what has become a fairly sophisticated threshold argument, and;
potentially make it easier to play with thresholds over a pre-fitted model (although, without my magic freeze_model wrapper or a non-standard extra parameter to transform, the interface for this is new territory).

By way of Zen tradeoffs, it prefers "explicit is better than implicit" over "flat is better than nested". Is the vision clearer, @agramfort?

Also, it's possible @maheshakya had no intention to continue working on this, but @MechCoder it may be wise to check before simply taking it over, even if it's stale.

jnothman · 2015-02-15T09:57:19Z

The main immediate benefits of this are: there's no need to move the threshold parameter from transform to the class to make it gridsearchable; there's no need to add the mixin to all classes that have feature_importances_ or coef_ for the sake of consistency.
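To illustrate the grid-search point, here is a minimal sketch, assuming SelectFromModel as it exists in current scikit-learn releases (the sklearn.model_selection import path postdates this thread). The dataset, the step names "select" and "clf", and the candidate thresholds are arbitrary choices for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data, chosen only so the example is self-contained.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=4, random_state=0
)

# The threshold is a constructor parameter of the meta-transformer,
# so it is addressable as "select__threshold" like any other parameter.
pipe = Pipeline([
    ("select", SelectFromModel(LogisticRegression(max_iter=1000))),
    ("clf", LogisticRegression(max_iter=1000)),
])
param_grid = {"select__threshold": ["mean", "median", "0.5*mean"]}

search = GridSearchCV(pipe, param_grid, cv=3).fit(X, y)
print(search.best_params_)
```

Because the threshold lives on the wrapper, the wrapped estimator's own signature stays untouched.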

jnothman · 2015-02-15T09:59:06Z

@ogrisel, I think most of the discussion on #3011 was implementation detail. A surprising amount needs to be touched to make this complete as a deprecation, including examples, tests, etc.

maheshakya · 2015-02-15T10:54:48Z

Thank you @MechCoder for bringing this up again. You can continue this :)
I was not able to complete it since we did not have enough opinions on whether to change every example and test accordingly. So, as @jnothman mentioned, the amount of work that needs to be done on those is quite large. I think we should wait to hear what others have to say and for a final resolution about the changes.

agramfort · 2015-02-15T16:38:35Z

@jnothman I think I get the design decision, but my question was more from the end user's point of view. Something along the lines of "Users will be able to do XXX with estimators A, B and C by only adding this piece of code in ...".

MechCoder · 2015-02-16T13:50:33Z

@jnothman @maheshakya Sorry for taking this PR over without asking. I incorrectly assumed from the lack of activity for around 6 months that the pull request was stale.

@agramfort A transform can mean two different things now.
1] Implementing the transform method in a transformer, which can mean, for example, mapping the sample points to a centroid space using Euclidean distances.
2] Inheriting from _LearntSelectorMixin, which means selecting the features of X whose importance is at or above a certain threshold.

Though both are indeed transformations, inheriting from _LearntSelectorMixin classifies all those models as transformers that map X into a space of a different dimension. Using this meta-transformer resolves that ambiguity.

So it goes like "Users will be able to extract the n most important features by wrapping any estimator in this meta-transformer, without the estimator having to subclass _LearntSelectorMixin".
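A minimal sketch of that user-facing story, assuming the SelectFromModel signature found in current scikit-learn releases. The two estimators, the iris data, and the "mean" threshold are illustrative choices; one model exposes coef_ and the other feature_importances_.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

estimators = (
    LogisticRegression(max_iter=1000),                         # exposes coef_
    RandomForestClassifier(n_estimators=50, random_state=0),   # exposes feature_importances_
)

reduced_shapes = {}
for estimator in estimators:
    # The selection logic lives in the wrapper, not in the estimator.
    selector = SelectFromModel(estimator, threshold="mean")
    X_reduced = selector.fit_transform(X, y)
    reduced_shapes[type(estimator).__name__] = X_reduced.shape
    print(type(estimator).__name__, X.shape, "->", X_reduced.shape)
```

The same wrapper call works for both models, which is the explicitness the meta-transformer is meant to provide.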

agramfort · 2015-02-16T17:00:25Z

ok got it. I like the idea. I guess we need to document this new SelectFromModel class and explain its usage. It should also appear in one of the examples.

jnothman · 2015-02-16T21:48:28Z

(It's more that stale doesn't strictly imply ripe for adoption. There may be reasons why something is stale.)


amueller · 2015-02-25T19:47:54Z

I agree with the direction, though I didn't have too close a look. Is the idea to remove transform from the models that already have the mixin?

MechCoder · 2015-07-28T13:19:42Z

Sorry for the huge delay. I just had a look at the pull request today and found that all the hard work has already been done by @maheshakya. I've updated the pull request description with a to-do list of what is left to be done.

I will not be able to work on this in the near future. I've added the Easy label, as this should be a very good issue for a new developer.

amueller · 2015-09-02T19:32:28Z

@MechCoder maybe this would be a good project for you to finish? It would be very helpful.

MechCoder · 2015-09-02T23:56:18Z

OK, if you insist.

MechCoder · 2015-09-03T03:42:24Z

Thinking about it, is it necessary to test each and every model that has a coef_ or feature_importances_ attribute after fitting? I don't think so, because the implementation in SelectFromModel is independent of the implementation of the underlying base estimators.
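The independence argument can be sketched with a stub estimator. FixedImportances below is a hypothetical class written for this illustration, not part of scikit-learn or of this PR; it only sets feature_importances_ in fit, and the selector, as it behaves in current releases, needs nothing more.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.feature_selection import SelectFromModel


class FixedImportances(BaseEstimator):
    """Hypothetical stub: 'fitting' just stores preset importances."""

    def __init__(self, importances=(0.9, 0.0, 0.6, 0.1)):
        self.importances = importances

    def fit(self, X, y=None):
        # The only thing the selector reads from the fitted model.
        self.feature_importances_ = np.asarray(self.importances, dtype=float)
        return self


X = np.arange(12, dtype=float).reshape(3, 4)
y = np.array([0, 1, 0])

# Features whose importance is >= 0.5 are kept (columns 0 and 2 here).
selector = SelectFromModel(FixedImportances(), threshold=0.5).fit(X, y)
X_selected = selector.transform(X)
print(selector.get_support(), X_selected.shape)
```

A test built this way exercises the selection logic without depending on any particular real model.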

@amueller could you verify that the tests in test_from_model are enough?

amueller · 2015-10-09T19:33:31Z

doc/modules/feature_selection.rst

+
+:class:`SelectFromModel` is a meta-transformer that can be used along with any
+estimator that has a ``coef_`` or ``feature_importances_`` attribute after fitting.
+It should be given a ``threshold`` parameter below which the features are considered


maybe remove "below"?

amueller · 2015-10-09T22:08:58Z

The doctests raised some "Function transform is deprecated" warnings.

amueller · 2015-10-09T22:13:44Z

sklearn.utils.tests.test_estimator_checks.test_check_estimator also has one. The rest seems good. [You could raise instead to see if it happens in the tests.]

MechCoder · 2015-10-10T01:57:58Z

@amueller thanks for your comments. The last commit should address all of them.

jnothman · 2015-10-10T10:05:05Z

sklearn/feature_selection/from_model.py

+        This can be both a fitted (if ``prefit`` is set to True)
+        or a non-fitted estimator.
+
+    threshold : string, float, optional default None


I think this is one of those cases where "default None" is not very helpful, but the description should say "By default, [this is its behaviour]..."

MechCoder · 2015-10-10

I have mentioned that below (and it might be too long to write here).

jnothman · 2015-10-11

What I meant is that "default None" could be dropped, but say "By default,..." rather than "When None,..." below. But it's a really minor nitpick.
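For context on what the threshold parameter accepts, here is a minimal sketch based on SelectFromModel in current scikit-learn releases, not on the diff under review. It assumes the documented behaviour that threshold=None falls back to "mean" for models without an L1 penalty; the forest, the data, and the specific thresholds are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

X, y = make_regression(n_samples=100, n_features=8, n_informative=3, random_state=0)
forest = RandomForestRegressor(n_estimators=30, random_state=0)

# Each selector clones and fits the forest; the fixed random_state
# makes the fitted importances identical across the four fits.
default_sel = SelectFromModel(forest).fit(X, y)                        # threshold=None
mean_sel = SelectFromModel(forest, threshold="mean").fit(X, y)         # named statistic
scaled_sel = SelectFromModel(forest, threshold="1.25*mean").fit(X, y)  # scaled statistic
float_sel = SelectFromModel(forest, threshold=0.0).fit(X, y)           # plain float

importances = default_sel.estimator_.feature_importances_
print("default threshold_:", default_sel.threshold_)
print("mean importance:   ", importances.mean())
print("kept with 1.25*mean:", int(scaled_sel.get_support().sum()))
print("kept with 0.0:      ", int(float_sel.get_support().sum()))
```

The resolved value is available as threshold_ after fitting, which is one way to check what a default actually did.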

jnothman · 2015-10-11T01:14:41Z

Once those are fixed up, it's good for merge (LGTM).

MechCoder · 2015-10-11T02:08:40Z

@jnothman I'll merge after Travis passes?

jnothman · 2015-10-11T03:10:49Z

Yes, with two whats_new entries: one under API changes to tell people their transforms will be disappearing, and one under new features. Thank you.
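For readers affected by that API change, a minimal migration sketch, assuming the prefit option as it behaves in current scikit-learn releases. The classifier, the data, and the thresholds are illustrative; the commented-out line stands for the old mixin-provided call and is not meant to run.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_iris(return_X_y=True)
clf = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)

# Old style, provided by the mixin and being phased out:
#   X_new = clf.transform(X)

# New style: wrap the already-fitted model; prefit=True means no refit.
selector = SelectFromModel(clf, prefit=True)
X_new = selector.transform(X)

# The same fitted model can be reused with different thresholds.
kept = {
    t: int(SelectFromModel(clf, prefit=True, threshold=t).get_support().sum())
    for t in ("mean", "median", 0.0)
}
print(X.shape, "->", X_new.shape, kept)
```

Trying several thresholds this way costs no extra training, since only the wrapper changes.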


MechCoder · 2015-10-11T03:15:28Z

Thanks a lot! @maheshakya @glouppe @amueller @jnothman

amueller · 2015-10-12T21:19:09Z

🍻 What takes a long time finally turns out well. Thanks @MechCoder :)

MechCoder mentioned this pull request Feb 12, 2015

[WIP] Implemented SelectFromModel meta-transformer #3011

Closed

MechCoder changed the title ~~Implemented SelectFromModel meta-transformer~~ [WIP] Implemented SelectFromModel meta-transformer Feb 12, 2015

MechCoder mentioned this pull request Feb 12, 2015

[MRG] Inherit LinearModels from _LearntSelectorMixin #4241

Closed

MechCoder force-pushed the select_from_model branch from 61505ee to 360e81a Compare February 16, 2015 13:53

amueller mentioned this pull request Mar 24, 2015

Add mask property to _LearntSelectorMixin #4445

Closed

jnothman mentioned this pull request May 11, 2015

ENH inherit from _LearntSelectorMixin where feature_importances_ is available #2160

Closed

MechCoder added the Easy Well-defined and straightforward way to resolve label Jul 28, 2015

This was referenced Aug 28, 2015

use get_feature_names from last step in pipeline if it is available #5172

Closed

Implemented Supervised PCA #5196

Closed

MechCoder force-pushed the select_from_model branch 4 times, most recently from 5dff9a9 to 22bac70 Compare September 3, 2015 03:39

amueller reviewed Oct 9, 2015

MechCoder force-pushed the select_from_model branch 2 times, most recently from 822560e to 4f09146 Compare October 10, 2015 01:56

jnothman reviewed Oct 10, 2015

Refactor tests

c805fbc

MechCoder force-pushed the select_from_model branch from 4f09146 to c805fbc Compare October 11, 2015 02:06

jnothman changed the title ~~[MRG+1] Implemented SelectFromModel meta-transformer~~ [MRG+2] Implemented SelectFromModel meta-transformer Oct 11, 2015

MechCoder added a commit that referenced this pull request Oct 11, 2015

Merge pull request #4242 from MechCoder/select_from_model

652b950

[MRG+1] Implemented SelectFromModel meta-transformer

MechCoder merged commit 652b950 into scikit-learn:master Oct 11, 2015

MechCoder deleted the select_from_model branch October 11, 2015 03:15
