FIX/MNT do not support OOB score for multiclass-multioutput and additional refactoring #19162

glemaitre · 2021-01-12T22:22:13Z

Refactor the OOB scoring in forest.

TODO:

- Share as much code as possible between the OOB score in the regressor and the classifier
- Specialize the regressor and the classifier regarding how to get the predictions and compute the score
- Disable the OOB score for multioutput-multiclass since sklearn does not have any metric for this case
- Add additional tests for some untested classification cases (multilabel)
- Add additional tests regarding the state/shape/etc. of the OOB attributes
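
As a rough illustration of the multioutput-multiclass item above, the target check could look something like the sketch below. It relies on `type_of_target` from `sklearn.utils.multiclass`; the helper name `check_oob_target`, the tuple `SUPPORTED_OOB_TARGETS`, and the error message are hypothetical and chosen for this example only, so they are not necessarily what this PR implements.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Assumption for this sketch: target types for which an OOB score is computed.
SUPPORTED_OOB_TARGETS = (
    "continuous",
    "continuous-multioutput",
    "binary",
    "multiclass",
    "multilabel-indicator",
)


def check_oob_target(y):
    """Hypothetical helper: return the target type or raise if unsupported."""
    y_type = type_of_target(y)
    if y_type not in SUPPORTED_OOB_TARGETS:
        raise ValueError(
            f"Cannot compute an OOB score for a target of type {y_type!r}; "
            f"supported types are: {', '.join(SUPPORTED_OOB_TARGETS)}."
        )
    return y_type


# Single-output multiclass and multilabel targets are accepted.
y_multiclass = np.array([0, 1, 2, 1])
y_multilabel = np.array([[0, 1], [1, 1], [1, 0]])

# Two outputs with three classes in the first one: multiclass-multioutput.
y_multiclass_multioutput = np.array([[0, 1], [2, 1], [1, 0]])
```

With these inputs, `check_oob_target(y_multiclass_multioutput)` raises a `ValueError`, while the other two targets pass the check.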

glemaitre · 2021-01-13T21:42:53Z

@ogrisel it should be OK now to have a first review.

glemaitre · 2021-01-19T09:55:03Z

@thomasjpfan @ogrisel @NicolasHug @adrinjalali @amueller Do you want to have a look at this PR? It could be useful to further review the permutation importance using the OOB.

adrinjalali

Thanks @glemaitre

doc/whats_new/v1.0.rst

sklearn/ensemble/_forest.py

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>

ogrisel

I like the general idea (and also the new tests!) but here are some small suggestions to consider before merging. Let me know if you agree or not.

sklearn/ensemble/_forest.py

sklearn/ensemble/tests/test_forest.py

sklearn/ensemble/_forest.py

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

sklearn/ensemble/_forest.py

glemaitre · 2021-01-25T11:22:27Z

ping @adrinjalali for a second approval :)

adrinjalali

Thanks @glemaitre. LGTM, if you're not worried about the comment below, feel free to merge :)

adrinjalali · 2021-01-25T12:10:27Z

sklearn/ensemble/_forest.py

+        else:
+            # for regression, n_classes_ does not exist and we create an empty
+            # axis to be consistent with the classification case and make
+            # the array operations compatible with the 2 settings
+            oob_pred_shape = (n_samples, 1, n_outputs)


Hopefully there is no classifier out there that has an effective n_classes_ > 1 and yet does not set the right flag to say it's a classifier :D

glemaitre · 2021-01-25

It should not pass our common tests then, I think.
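
To illustrate the shape convention discussed in this thread, here is a minimal NumPy sketch (not the code from the PR). The function names `oob_shape` and `average_oob_predictions`, as well as the toy masks and predictions, are assumptions made for this example. The idea is that regression gets a dummy axis of size 1 where classification has the class axis, so the same accumulation code can serve both settings.

```python
import numpy as np


def oob_shape(n_samples, n_outputs, n_classes=None):
    """Hypothetical helper: shape of the array holding the OOB predictions."""
    if n_classes is not None:
        # classification: one probability per class and per output
        return (n_samples, n_classes, n_outputs)
    # regression: empty axis of size 1 in place of the class axis
    return (n_samples, 1, n_outputs)


def average_oob_predictions(tree_preds, oob_masks, shape):
    """Average per-tree predictions over the trees that did not see a sample.

    Each entry of ``tree_preds`` has shape (mask.sum(), shape[1], shape[2])
    for the corresponding boolean mask in ``oob_masks``.
    """
    oob_pred = np.zeros(shape, dtype=np.float64)
    n_oob = np.zeros(shape[0], dtype=np.int64)
    for pred, mask in zip(tree_preds, oob_masks):
        oob_pred[mask] += pred
        n_oob[mask] += 1
    # samples that were never out-of-bag keep a prediction of 0
    seen = n_oob > 0
    oob_pred[seen] /= n_oob[seen].reshape(-1, 1, 1)
    return oob_pred, n_oob


# Toy regression setting: 4 samples, 1 output, 2 trees.
shape = oob_shape(n_samples=4, n_outputs=1)
masks = [
    np.array([True, True, False, False]),
    np.array([True, False, True, False]),
]
preds = [
    np.array([[[1.0]], [[2.0]]]),
    np.array([[[3.0]], [[4.0]]]),
]
oob_pred, n_oob = average_oob_predictions(preds, masks, shape)
```

In this toy run the last sample is never out-of-bag, so its count in `n_oob` stays at 0 and its prediction is left at 0.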

glemaitre added 2 commits January 12, 2021 22:31

MNT refactor OOB score in forest

da306b0

Use scorer API to simplify code

b923c58

glemaitre marked this pull request as draft January 12, 2021 22:22

github-actions bot added module:ensemble module:metrics labels Jan 12, 2021

glemaitre added 12 commits January 13, 2021 16:20

some more work

54c9bc1

iter

efde289

iter

2b416d4

iter

1fa6457

MNT remove temporary test

6f6c0d7

TST add test for all supported classification

02be5fe

simplify regression scoring

4c83bb1

iter

287920f

TST update RandomTreeEmbedding test

97b3546

MNT revert setup cfg

824dfdb

TST remove wrong support for multiclass-multioutput

d0da8ba

DOC add whats new entry

24ea5e9

glemaitre changed the title ~~Refactor oob trees~~ FIX/MNT do not support OOB score for multiclass-multioutput and additional refactoring Jan 13, 2021

glemaitre marked this pull request as ready for review January 13, 2021 21:44

glemaitre mentioned this pull request Jan 13, 2021

ENH: OOB Permutation Importance for Random Forests #18603

Open

glemaitre added 3 commits January 13, 2021 23:31

iter

7d8d765

simplify code

dccfd5b

add small comment

5b36e6a

adrinjalali reviewed Jan 20, 2021

doc/whats_new/v1.0.rst (outdated, resolved)

sklearn/ensemble/_forest.py (outdated, resolved)

sklearn/ensemble/_forest.py (outdated, resolved)

sklearn/ensemble/_forest.py (outdated, resolved)

glemaitre and others added 2 commits January 20, 2021 11:24

Update doc/whats_new/v1.0.rst

9e5d567

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>

add suggestions adrin

255d95f

ogrisel approved these changes Jan 21, 2021

sklearn/ensemble/_forest.py (resolved)

sklearn/ensemble/tests/test_forest.py (outdated, resolved)

sklearn/ensemble/_forest.py (outdated, resolved)

sklearn/ensemble/_forest.py (outdated, resolved)

Update sklearn/ensemble/tests/test_forest.py

a37ebb5

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

glemaitre added 2 commits January 21, 2021 16:03

partially address comment ogrisel

4062d9c

finish review ogrisel

bd808a3

ogrisel reviewed Jan 21, 2021

sklearn/ensemble/_forest.py (outdated, resolved)

add a comment for supporting multioutpu-multiclass scoring strategy

3fbacfb

Base automatically changed from master to main January 22, 2021 10:53

Merge remote-tracking branch 'origin/main' into refactor_oob_trees

50243c0

adrinjalali approved these changes Jan 25, 2021

glemaitre merged commit 35b2bbf into scikit-learn:main Jan 25, 2021

glemaitre mentioned this pull request Apr 22, 2021

Release 0.24.2 #19954

Merged

12 tasks

johannfaouzi mentioned this pull request Oct 28, 2021

[DOC] Outdated description of attributes oob_decision_function_ and oob_prediction_ in bagging estimators #21490

Open
