Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX/MNT do not support OOB score for multiclass-multioutput and additional refactoring #19162

Merged
merged 24 commits into from Jan 25, 2021

Conversation

glemaitre
Copy link
Member

@glemaitre glemaitre commented Jan 12, 2021

Refactor the OOB scoring in forest.

TODO:

  • Share the most code possible between OOB score in regressor and classifier
  • Specialize regressor and classifier regarding how to get the predictions and compute the score
  • Disable OOB score for multioutput-multiclass since sklearn does not have any metric for this case.
  • Add additional tests regarding some untested classification case (multilabel)
  • Add additional tests regarding the state/shape/etc. of the OOB attributes

@glemaitre glemaitre changed the title Refactor oob trees FIX/MNT do not support OOB score for multiclass-multioutput and additional refactoring Jan 13, 2021
@glemaitre
Copy link
Member Author

@ogrisel it should be OK now to have a first review.

@glemaitre glemaitre marked this pull request as ready for review January 13, 2021 21:44
@glemaitre
Copy link
Member Author

@thomasjpfan @ogrisel @NicolasHug @adrinjalali @amueller Do you want to have a look at this PR. This could be useful to further review the permutation importance using the OOB.

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @glemaitre

doc/whats_new/v1.0.rst Outdated Show resolved Hide resolved
sklearn/ensemble/_forest.py Outdated Show resolved Hide resolved
sklearn/ensemble/_forest.py Outdated Show resolved Hide resolved
sklearn/ensemble/_forest.py Outdated Show resolved Hide resolved
glemaitre and others added 2 commits January 20, 2021 11:24
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the general idea (and also the new tests!) but here are some small suggestions to consider before merging. Let me know if you agree or not.

sklearn/ensemble/_forest.py Show resolved Hide resolved
sklearn/ensemble/tests/test_forest.py Outdated Show resolved Hide resolved
sklearn/ensemble/_forest.py Outdated Show resolved Hide resolved
sklearn/ensemble/_forest.py Outdated Show resolved Hide resolved
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Base automatically changed from master to main January 22, 2021 10:53
@glemaitre
Copy link
Member Author

ping @adrinjalali for a second approval :)

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @glemaitre . LGTM, if you're not worried about the comment bellow, feel free to merge :)

Comment on lines +458 to +462
else:
# for regression, n_classes_ does not exist and we create an empty
# axis to be consistent with the classification case and make
# the array operations compatible with the 2 settings
oob_pred_shape = (n_samples, 1, n_outputs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully there is no classifier out there which has an effective n_classes_ > 1 and yet not setting the right flag to say it's a classifier :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not pass our common test then I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants