ENH add multioutput support for RFE #16103

divyaprabha123 · 2020-01-12T07:08:25Z

What does this implement/fix? Explain your changes.

Regressors such as random forests natively support multi-output regression, but when they are used with RFE a "bad input shape" error is raised, which makes it impossible to use their multi-output capability through RFE.

This implementation checks the type of the target variable using utils.multiclass.type_of_target; if the type is multioutput, the "multi_output" argument of check_X_y is set to True.

Checked the type of the target variable, if the target is equal to multioutput then multioutput is set to zero before check_X_y
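As a rough illustration of the behaviour described above, the following sketch shows how type_of_target classifies a 2-D regression target and how check_X_y treats it with and without multi_output=True. The data is made up for the example, and the snippet assumes a scikit-learn version in which check_X_y is importable from sklearn.utils.

```python
import numpy as np
from sklearn.utils import check_X_y
from sklearn.utils.multiclass import type_of_target

# Toy data (hypothetical): 20 samples, 4 features, 2 continuous targets.
rng = np.random.RandomState(0)
X = rng.rand(20, 4)
Y = rng.rand(20, 2)

# type_of_target returns a string describing the kind of target.
target_type = type_of_target(Y)

# Default validation expects a 1-D y and rejects the 2-D target.
try:
    check_X_y(X, Y)
    rejected_by_default = False
except ValueError:
    rejected_by_default = True

# With multi_output=True the same 2-D target passes validation.
X_checked, Y_checked = check_X_y(X, Y, multi_output=True)
```

The ValueError in the default case is the kind of error the description refers to; its exact message may differ between scikit-learn versions.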

jnothman · 2020-01-12T09:00:32Z

Thanks for the pull request! Please fix the PEP8 error and add a test.

glemaitre · 2020-01-13T14:33:51Z

You need to add a test. Could you test that multioutput is supported in RFECV as well? Since RFECV inherits from RFE, it should work quite easily.

divyaprabha123 · 2020-01-14T04:22:21Z

Sure @glemaitre I will do that :)

jnothman · 2020-01-14T07:09:15Z

sklearn/feature_selection/_rfe.py

        X, y = check_X_y(X, y, "csc", ensure_min_features=2,
-                         force_all_finite=not tags.get('allow_nan', True))
+                         force_all_finite=not tags.get('allow_nan', True),
+                         multi_output=multioutput)


Why can't we just use multi_output=True in all cases?

Yeah, that's actually possible! I thought we might be doing extra computation by calling the check_array function in validation.py. I will change that now :)

divyaprabha123 · 2020-02-03T17:48:11Z

Sorry, I accidentally closed this PR; reopening it now. @rth, can you please check this and let me know what you think?

rth

Thanks @divyaprabha123! LGTM, aside from a minor comment.

Please add an entry to the change log at doc/whats_new/v0.23.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself with :user:.

rth · 2020-02-04T08:07:13Z

sklearn/feature_selection/tests/test_rfe.py

    rfe.transform(X)
+
+
+def test_multioutput():


I imagine the same should work with RFECV? If so please parametrize this test with,

    @pytest.mark.parametrize('ClsRFE', [RFE, RFECV])
    def test_multioutput(ClsRFE):
        ...
        rfe_test = ClsRFE(clf)

Yeah, this will work with RFECV. I will also add this to the test. Thank you so much @rth :)
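A minimal sketch of the kind of check discussed in this thread is given below. It is not the test that was merged: the helper name fit_multioutput_selector, the toy data, and the choice of RandomForestRegressor are assumptions made for illustration, and a plain loop stands in for the pytest parametrization suggested above so the snippet runs without pytest. It assumes a scikit-learn version that includes this change.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, RFECV


def fit_multioutput_selector(selector_cls):
    """Fit RFE or RFECV on a 2-D target (hypothetical helper)."""
    rng = np.random.RandomState(0)
    X = rng.normal(size=(30, 4))
    Y = rng.normal(size=(30, 2))  # two regression targets per sample
    estimator = RandomForestRegressor(n_estimators=5, random_state=0)
    selector = selector_cls(estimator)
    selector.fit(X, Y)  # should not raise once multi-output y is accepted
    return selector


# Loop over both classes, mirroring the suggested parametrization.
fitted = {cls.__name__: fit_multioutput_selector(cls) for cls in (RFE, RFECV)}
```

With the default settings, RFE keeps half of the features, so two of the four columns remain selected; RFECV chooses the number of features by cross-validation.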

rth · 2020-02-04T08:15:10Z

sklearn/feature_selection/_rfe.py

        X, y = check_X_y(X, y, "csc", ensure_min_features=2,
-                         force_all_finite=not tags.get('allow_nan', True))
+                         force_all_finite=not tags.get('allow_nan', True),
+                         multi_output=True)


As a side comment, I think it is indeed best to assume that multi_output is supported here, pass y to the underlying estimator, and let it raise an error if it doesn't support it, as opposed to relying on the multioutput estimator tag, which can potentially be inaccurate.
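To illustrate the design choice described in this comment, the sketch below passes a 2-D target through RFE to an estimator that does not handle multi-output targets. SVR with a linear kernel is chosen here purely as an example of such an estimator, and the snippet assumes a scikit-learn version that includes this change, so that the error comes from the underlying estimator's own validation rather than from RFE.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVR

# Toy data (hypothetical): 20 samples, 4 features, 2 targets.
rng = np.random.RandomState(0)
X = rng.normal(size=(20, 4))
Y = rng.normal(size=(20, 2))

# SVR expects a 1-D y, so fitting it on a 2-D target fails.
selector = RFE(SVR(kernel="linear"))
try:
    selector.fit(X, Y)
    estimator_raised = False
except ValueError:
    estimator_raised = True
```

In this arrangement RFE does not need to know in advance whether the wrapped estimator supports multi-output targets.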

jnothman

LGTM pending @rth's testing suggestion. Thanks @divyaprabha123

divyaprabha123 · 2020-02-04T12:34:40Z

Hi @rth and @jnothman, I have added the test function as suggested. Please take a look and let me know if I should change or improve anything!

rth

One last comment below. You also need to resolve the conflicts (this can be done in the GitHub UI).

rth · 2020-02-04T18:21:37Z

doc/whats_new/v0.23.rst

-  :class:`ensemble.GradientBoostingClassifier` as well as ``predict`` method of
-  :class:`tree.DecisionTreeRegressor`, :class:`tree.ExtraTreeRegressor`, and
-  :class:`ensemble.GradientBoostingRegressor`.
-  :pr:`16331` by :user:`Alexandre Batisse <batalex>`.


The merge conflict resolution went wrong somehow for the what's new file. There should be no diff here aside from your changes. I will push a commit to fix it.

@rth Thanks for accepting the pull request :)

Mulioutput support for REF

c7e6fed

Checked the type of the target variable, if the target is equal to multioutput then multioutput is set to zero before check_X_y

divyaprabha123 changed the title ~~Mulioutput support for REF~~ Mulioutput support for RFE Jan 12, 2020

Multi-output support for RFE

b0154c4

divyaprabha123 added 3 commits January 12, 2020 16:47

[WIP] Multi-output support for RFE

d91a562

[WIP] Multi-ouput support for RFE

ee73f66

[WIP] Multioutput support for RFE

2319786

glemaitre changed the title ~~Mulioutput support for RFE~~ ENH add multioutput support for RFE Jan 13, 2020

ENH add multioutput support for RFE

797500a

jnothman reviewed Jan 14, 2020

divyaprabha123 added 8 commits January 14, 2020 16:10

ENH add multioutput support for RFE #16103

9d344d2

ENH add multioutput support for RFE #16103

9a9e883

ENH add multioutput support for RFE #16103

7d30314

ENH add multioutput support for RFE

5fc6e86

ENH add multioutput support for RFE

eb432cb

ENH add multioutput support for RFE

d123fd8

ENH add multioutput support for RFE

ec7b632

ENH add multioutput support for RFE

4d7ee07

divyaprabha123 requested review from jnothman and glemaitre January 14, 2020 11:06

divyaprabha123 closed this Feb 3, 2020

divyaprabha123 reopened this Feb 3, 2020

divyaprabha123 requested a review from rth February 4, 2020 06:41

rth approved these changes Feb 4, 2020

rth reviewed Feb 4, 2020

jnothman approved these changes Feb 4, 2020

ENH add multioutput support for RFE (#16103)

8107f8d

ENH add multioutput support for RFE #16103

88446b5

rth reviewed Feb 4, 2020

divyaprabha123 and others added 4 commits February 4, 2020 18:40

ENH add multioutput support for RFE #16103

38b3cbc

Merge branch 'master' into 0.23.x

f77a8a0

ENH add multioutput support for RFE #16103

308dcea

DOC fix merge conflict

1484250

rth reviewed Feb 4, 2020

rth merged commit 63cfc5f into scikit-learn:master Feb 4, 2020

thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 22, 2020

ENH add multioutput support for RFE (scikit-learn#16103)

dd160d2

panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020

ENH add multioutput support for RFE (scikit-learn#16103)

142584e

thomasjpfan mentioned this pull request Apr 9, 2021

[WIP] Allow NaN/Inf in univariate feature selectors #15434

Closed
