[WIP] Implemented SelectFromModel meta-transformer #3011

Closed
wants to merge 17 commits into scikit-learn:master from maheshakya:select_from_model

5 participants

@maheshakya

Fix for #2160
Implemented a meta-transformer using _LearntSelectorMixin. Test cases are included.
Documentation (with examples) still needs to be completed.

@jnothman
scikit-learn member

Thanks for tackling this. A number of points:

  • please put this in the same file as _LearntSelectorMixin
  • although I suggested inheriting from _LearntSelectorMixin, it isn't sufficient for correct operation. _LearntSelectorMixin.transform inspects the estimator directly, e.g. to inspect the penalty parameter. You will need to turn _LearntSelectorMixin.transform into one or more functions.
  • You shouldn't set any underscore-suffixed attributes in your estimator's __init__. This should be done in fit. It's fine to postpone validation to fit.
  • You shouldn't need to have estimator_params. clone will keep any settings, and BaseEstimator will handle the delegation of attribute setting so that this can work in grid search.
  • SelectFromModel should accept two parameters: estimator and threshold.
  • I am not sure about duplicating the coefficients here to store them as an attribute. It would be more relevant to store the aggregate feature importances: because _LearntSelectorMixin currently has to sum over the coefficients for multiclass linear models, it might be useful to see the summed importances. But it's not essential to store these locally.
  • It would also be nice to have a `threshold_` attribute, since _LearntSelectorMixin's calculation of the actual threshold can be non-trivial.
  • SelectFromModel should inherit from sklearn.feature_selection.base.SelectorMixin and implement _get_support_mask.
  • This PR should include the deprecation of _LearntSelectorMixin. Use the @deprecated decorator on transform.

I'm changing this PR to a [WIP]. I hope that's alright with you.
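
Taken together, these points describe a class shaped roughly like the following sketch. The _importances_and_threshold helper is hypothetical; it stands in for the logic to be factored out of _LearntSelectorMixin.transform, and none of this should be read as the final design:

    import numpy as np

    from sklearn.base import BaseEstimator, clone
    from sklearn.feature_selection.base import SelectorMixin


    def _importances_and_threshold(estimator, threshold):
        # Hypothetical helper: stands in for the logic to be factored
        # out of _LearntSelectorMixin.transform.
        if hasattr(estimator, "feature_importances_"):
            importances = estimator.feature_importances_
        else:
            coef = estimator.coef_
            # Sum over classes for multiclass linear models.
            importances = (np.abs(coef).sum(axis=0) if coef.ndim > 1
                           else np.abs(coef))
        if threshold is None:
            # ~0 is the natural default when an l1 penalty was used.
            threshold = (1e-5 if getattr(estimator, "penalty", None) == "l1"
                         else "mean")
        if isinstance(threshold, str):
            # Resolve "mean", "median" and scaled forms like "1.25*mean".
            scale, _, reference = threshold.rpartition("*")
            threshold = ((float(scale) if scale else 1.0)
                         * getattr(np, reference)(importances))
        return importances, float(threshold)


    class SelectFromModel(BaseEstimator, SelectorMixin):
        """Meta-transformer selecting features based on importance weights."""

        def __init__(self, estimator, threshold=None):
            # Store constructor parameters only: no underscore-suffixed
            # attributes and no validation until fit.
            self.estimator = estimator
            self.threshold = threshold

        def fit(self, X, y=None, **fit_params):
            # clone() keeps the estimator's settings, so estimator_params
            # is unnecessary; grid search reaches the inner parameters
            # through BaseEstimator's delegation (e.g. estimator__C).
            self.estimator_ = clone(self.estimator)
            self.estimator_.fit(X, y, **fit_params)
            return self

        def _get_support_mask(self):
            importances, self.threshold_ = _importances_and_threshold(
                self.estimator_, self.threshold)
            return importances >= self.threshold_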

@jnothman
scikit-learn member

Also, the failing tests happen because this can't be constructed as it is. I think it should be included here: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/testing.py#L441.

And although it does nothing at the moment, please inherit from sklearn.base.MetaEstimatorMixin.

@jnothman jnothman changed the title from [MRG] Implemented SelectFromModel meta-transformer to [WIP] Implemented SelectFromModel meta-transformer Mar 27, 2014
@maheshakya

I have separated the section of _LearntSelectorMixin where the threshold, importances and penalty are retrieved into two functions. One of them uses the @deprecated decorator.

The other issues are fixed.

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/feature_selection/from_model.py
+ if self.estimator is not None:
+ self.estimator_ = self.estimator
+ else:
+ self.estimator_ = default
+
+ if self.estimator_ is None:
+ raise ValueError("estimator cannot be None")
+
+ def _make_estimator(self):
+ """Make and configure a copy of the `estimator_` attribute."""
+
+ estimator = clone(self.estimator_)
+
+ return estimator
+
+ def __init__(self, estimator=None, threshold=None):
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

Please move this method to right below the class docstring.

estimator=None is not useful. Leave it without a default.

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/utils/testing.py
@@ -439,7 +439,7 @@ def uninstall_mldata_mock():
# Meta estimators need another estimator to be instantiated.
meta_estimators = ["OneVsOneClassifier",
"OutputCodeClassifier", "OneVsRestClassifier", "RFE",
- "RFECV", "BaseEnsemble"]
+ "RFECV", "BaseEnsemble", "SelectFromModel"]
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

The test failure suggests this is not the right solution. Put it in the `other` list instead, I think.

@maheshakya

Since threshold is now a parameter to SelectFromModel, the transform function should use that value, so the threshold argument passed to transform will be useless. Is this the correct way?

When the transform function is called from another estimator, it will act in the usual way.

@coveralls

Coverage Status

Coverage remained the same when pulling 9506120 on maheshakya:select_from_model into eb10c4c on scikit-learn:master.

@jnothman
scikit-learn member

So in that case the value taken from the transform function will be useless.

The correct way is to redefine transform in the metaestimator not to take a threshold parameter.

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/feature_selection/from_model.py
+
+ def __init__(self, estimator, threshold=None):
+ self.estimator = estimator
+ self.threshold = threshold
+
+ def _validate_estimator(self, default=None):
+ """Check the estimator and set the `estimator_` attribute."""
+ if self.estimator is not None:
+ self.estimator_ = self.estimator
+ else:
+ self.estimator_ = default
+
+ if self.estimator_ is None:
+ raise ValueError("estimator cannot be None")
+
+ def _make_estimator(self):
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

Please inline this method.

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/feature_selection/from_model.py
"""Transformer mixin selecting features based on importance weights.
This implementation can be mixin on any estimator that exposes a
``feature_importances_`` or ``coef_`` attribute to evaluate the relative
importance of individual features for feature selection.
+
+ Attributes
+ ----------
+ `threshold_`: float
+ The threshold value used for feature selection.
+
+ `support_mask_`: an estimator
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

You have documented these attributes in the wrong class.

@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

And support_mask_ is not an estimator.

@jnothman
scikit-learn member

I don't think it's a good idea to store support_mask_. It is inconsistent with other feature selectors, duplicates the get_support method provided by SelectorMixin, and means you can't change the threshold and just call get_support to reevaluate it.
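
For illustration, with the mask computed on demand rather than cached, changing the threshold and re-querying works without refitting. This is a hypothetical usage, following the sketch above:

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    iris = load_iris()
    selector = SelectFromModel(LogisticRegression()).fit(iris.data, iris.target)
    mask_a = selector.get_support()             # mask at the default threshold
    selector.set_params(threshold="1.25*mean")
    mask_b = selector.get_support()             # recomputed; no refit needed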

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/feature_selection/from_model.py
+ classification, real numbers in regression).
+
+ **fit_params : Other estimator specific parameters
+
+ Returns
+ -------
+ self : object
+ Returns self.
+ """
+
+ # Validate and make estimator
+ self._validate_estimator()
+ self.estimator = self._make_estimator()
+
+ # Convert data
+ X, y = check_arrays(X, y)
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

I don't think this belongs here. It's the base estimator's business.

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/feature_selection/from_model.py
+ the median (resp. the mean) of the feature importances. A scaling
+ factor (e.g., "1.25*mean") may also be used. If None and if
+ available, the object attribute ``threshold`` is used. Otherwise,
+ "mean" is used by default.
+
+ Attributes
+ ----------
+ `estimator_`: an estimator
+ The base estimator from which the transformer is built.
+ """
+
+ def __init__(self, estimator, threshold=None):
+ self.estimator = estimator
+ self.threshold = threshold
+
+ def _validate_estimator(self, default=None):
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

I don't think this method really needs to exist. There is no public way to set a default; estimator will have to be set explicitly.

@jnothman jnothman commented on an outdated diff Mar 30, 2014
sklearn/feature_selection/from_model.py
+ "estimator's parameter to compute it?")
+ if len(importances) != X.shape[1]:
+ raise ValueError("X has different number of features than"
+ " during model fitting.")
+
+ # Retrieve threshold
+ if threshold is None:
+ if hasattr(self, "penalty") and self.penalty == "l1":
+ # the natural default threshold is 0 when l1 penalty was used
+ threshold = getattr(self, "threshold", 1e-5)
+ else:
+ threshold = getattr(self, "threshold", "mean")
+
+ return importances, threshold
+
+ def _set_parameters_meta_transfomer(self, X, threshold):
@jnothman
scikit-learn member
jnothman added a note Mar 30, 2014

You have invented a lot of code duplication, when you could just allow the estimator to be passed as an argument to a shared staticmethod or module-level function. Please also choose a more descriptive name than _set_parameters. (If you use a module-level function, you don't have a problem of potentially overriding a method in a class using the mixin.)

@jnothman
scikit-learn member

Btw, it may be cleaner to implement this with the metaestimator not inheriting from the mixin. Please feel free to do it that way if you think it improves the code.

@maheshakya

I couldn't find a way to implement the _get_support_mask function without either storing the mask in the transform function or duplicating the code. So I made it a private member.

@coveralls

Coverage Status

Coverage remained the same when pulling be4f070 on maheshakya:select_from_model into fbe974b on scikit-learn:master.

@jnothman
scikit-learn member

This should probably support partial_fit. It should possibly also support warm_start, which would involve not using clone.

@maheshakya

@jnothman, I apologize for the inconvenience. Can you explain what exactly needs to be done in partial_fit? I checked out several estimators that define it, but those seem to be performing different tasks.

@glouppe glouppe commented on an outdated diff Mar 31, 2014
sklearn/feature_selection/from_model.py
from ..externals import six
-from ..utils import safe_mask, atleast2d_or_csc
+from ..utils import safe_mask, atleast2d_or_csc, deprecated
+from .base import SelectorMixin
+
+
+def _get_feature_importances(estimator, X):
+ """Retrieve or aggregate feature importances from estimator"""
+ if hasattr(estimator, "feature_importances_"):
+ importances = estimator.feature_importances_
+ if importances is None:
+ raise ValueError("Importance weights not computed. Please set "
+ "the compute_importances parameter before fit.")
@glouppe
scikit-learn member
glouppe added a note Mar 31, 2014

This is deprecated in forests. You can remove this if-statement.

@jnothman
scikit-learn member

partial_fit allows a model to be trained without keeping all the training data in memory (or providing it) at once. Here, partial_fit should create estimator_ only on the first call, and call estimator_.partial_fit for each call.

The only reason I say it's necessary is because it's a way the current mixin can be used.
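
A minimal sketch of that behaviour, as a method on the class sketched earlier:

        def partial_fit(self, X, y=None, **fit_params):
            """Fit the underlying estimator incrementally."""
            # Create estimator_ only on the first call; subsequent calls
            # keep updating the same fitted estimator.
            if not hasattr(self, "estimator_"):
                self.estimator_ = clone(self.estimator)
            self.estimator_.partial_fit(X, y, **fit_params)
            return self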

@maheshakya

I have added partial_fit and warm_start.

BTW is there something wrong with Travis CI? It doesn't seem to be building my last two commits.

@jnothman
scikit-learn member
jnothman commented Apr 1, 2014

BTW is there something wrong with Travis CI? It doesn't seem to be building my last two commits.

It's highly non-user-friendly, but it means Travis can't automatically merge your work with master, which it does before testing. Could you please rebase on the current master and force-push the rebased branch?
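
Concretely, that is something like the following (assuming the main scikit-learn repository is configured as the upstream remote):

    git fetch upstream
    git rebase upstream/master
    git push --force origin select_from_model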

@coveralls

Coverage Status

Coverage remained the same when pulling 9a41f2f on maheshakya:select_from_model into fec2867 on scikit-learn:master.

@coveralls

Coverage Status

Coverage remained the same when pulling 1469c48 on maheshakya:select_from_model into fec2867 on scikit-learn:master.

@jnothman jnothman and 1 other commented on an outdated diff Apr 1, 2014
sklearn/feature_selection/from_model.py
+ X : array-like of shape = [n_samples, n_features]
+ The training input samples.
+
+ y : array-like, shape = [n_samples]
+ The target values (integers that correspond to classes in
+ classification, real numbers in regression).
+
+ **fit_params : Other estimator specific parameters
+
+ Returns
+ -------
+ self : object
+ Returns self.
+ """
+ if not hasattr(self.estimator, "partial_fit"):
+ raise(AttributeError, "estimator does not have"
@jnothman
scikit-learn member
jnothman added a note Apr 1, 2014

That ( can't be there for this to be correct syntax in Python 3.

But I think it's fine if you don't explicitly check this case. The AttributeError produced by self.estimator_.partial_fit (e.g. 'LinearSVC' object has no attribute 'partial_fit') is clear enough.

@maheshakya
maheshakya added a note Apr 1, 2014

Yes, I agree.
I removed the test case for that as well.

@coveralls

Coverage Status

Coverage remained the same when pulling ccbab10 on maheshakya:select_from_model into fec2867 on scikit-learn:master.

@coveralls

Coverage Status

Coverage remained the same when pulling 77b30e4 on maheshakya:select_from_model into fec2867 on scikit-learn:master.

@jnothman
scikit-learn member
jnothman commented Apr 1, 2014

I'm not sure the tests are quite satisfactory yet (and I might not be around for the next few days to take another look), but I think we need to seek opinions as to whether a meta-estimator improves on the mixin. @glouppe, WDYT, and who else is likely to be opinionated on this?

@maheshakya

What are the tests that need to be improved? I can work on those.

@jnothman
scikit-learn member
@maheshakya

Thanks. I will give it a shot (improved test cases, examples and documentation).

@coveralls

Coverage Status

Coverage remained the same when pulling 7f12af3 on maheshakya:select_from_model into fec2867 on scikit-learn:master.

@coveralls

Coverage Status

Coverage remained the same when pulling 11d1e81 on maheshakya:select_from_model into fec2867 on scikit-learn:master.

@maheshakya

I suppose we need to change every example that uses feature selection based on LearntSelectorMixin, right?

@jnothman
scikit-learn member

Yes, unfortunately.
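
For instance, an example that selects features through the fitted estimator's own transform would change roughly as follows. This is a sketch; it assumes SelectFromModel ends up exported from sklearn.feature_selection:

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectFromModel
    from sklearn.svm import LinearSVC

    iris = load_iris()
    X, y = iris.data, iris.target

    # Before: selection via the fitted estimator itself (_LearntSelectorMixin)
    clf = LinearSVC(C=0.01, penalty="l1", dual=False).fit(X, y)
    X_reduced = clf.transform(X)

    # After: selection via the SelectFromModel meta-transformer
    sfm = SelectFromModel(LinearSVC(C=0.01, penalty="l1", dual=False))
    X_reduced = sfm.fit_transform(X, y)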

@MechCoder
scikit-learn member

It seems that there are unrelated changes. I'm not sure how adding a meta-transformer should affect .travis.yml.

@MechCoder
scikit-learn member

Closed in favor of #4242. I rescued this PR from git hell!

@MechCoder MechCoder closed this Feb 12, 2015