[MRG] MAINT add base class for voting and stacking #15084

Merged
merged 12 commits into from Oct 5, 2019

Contributor

glemaitre commented Sep 24, 2019

closes #15056

Create a base class for Voting* and Stacking*. Both are ensembles that combine multiple types of learners.
They could share get_params, set_params, and the validation of estimators (and therefore the fitted attributes as well).

This base class contrasts with ensembles built from a single type of learner, such as boosting (AdaBoost, GBDT), RF, and Bagging.
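
To make the idea concrete, here is a minimal sketch of what such a shared base could look like. It is not the code from this PR: `ToyEstimator`, `HeterogeneousEnsembleSketch`, and the `kind` attribute are hypothetical names used only for illustration, and the sketch only assumes that estimators are passed as a list of `(name, estimator)` tuples with nested parameters addressed as `<name>__<param>`.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator (illustration only)."""

    def __init__(self, kind, alpha=1.0):
        self.kind = kind    # "classifier" or "regressor"
        self.alpha = alpha

    def get_params(self):
        return {"kind": self.kind, "alpha": self.alpha}


class HeterogeneousEnsembleSketch:
    """Hypothetical shared base holding named estimators of different types."""

    def __init__(self, estimators):
        self.estimators = estimators  # list of (name, estimator) tuples

    def get_params(self):
        # Expose the list itself, each named estimator, and every nested
        # parameter under the "<name>__<param>" key.
        params = {"estimators": self.estimators}
        for name, est in self.estimators:
            params[name] = est
            for key, value in est.get_params().items():
                params[f"{name}__{key}"] = value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            named = dict(self.estimators)
            if name not in named:
                raise ValueError(f"Unknown parameter {key!r}")
            if sep:
                # "<name>__<param>": update a parameter of one estimator.
                setattr(named[name], sub_key, value)
            else:
                # "<name>": replace the whole estimator.
                self.estimators = [
                    (n, value if n == name else est)
                    for n, est in self.estimators
                ]
        return self

    def _validate_estimators(self, expected_kind):
        # Shared checks: non-empty list, unique names, consistent type.
        if not self.estimators:
            raise ValueError("'estimators' must be a non-empty list.")
        names = [name for name, _ in self.estimators]
        if len(set(names)) != len(names):
            raise ValueError(f"Estimator names must be unique: {names!r}")
        for name, est in self.estimators:
            if est.kind != expected_kind:
                raise ValueError(
                    f"Estimator {name!r} should be a {expected_kind}."
                )
        return names
```

Under these assumptions, Voting* and Stacking* subclasses would only need to implement their own `fit` and prediction logic on top of the shared parameter handling and validation.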

glemaitre added 5 commits Sep 24, 2019

Contributor Author

glemaitre commented Sep 24, 2019

@thomasjpfan @ogrisel @rth @adrinjalali

So the naming of the base class is terrible, but I wanted to have a WIP PR so that we can see what is in common and whether it makes sense to merge the code.

NB: the tests will fail because I did not add support for using None to drop an estimator (only available in voting, not in stacking). This is easily fixed and would ease the deprecation.

WDYT?

Member

thomasjpfan left a comment

I like this refactoring. The _BaseEnsembleHeterogeneousEstimator class has well-defined boundaries.

@glemaitre glemaitre changed the title [WIP] MAINT add base class for voting and stacking [MRG] MAINT add base class for voting and stacking Oct 1, 2019

Contributor Author

glemaitre commented Oct 1, 2019

Good to be reviewed. I will open a PR to deprecate None support and use 'drop' instead.
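
For illustration, the None-to-'drop' transition could be handled along these lines. This is only a sketch under assumptions: `filter_dropped` is a hypothetical helper, and the warning category and messages are placeholders rather than what the follow-up PR will actually use.

```python
import warnings


def filter_dropped(estimators):
    """Return only the (name, estimator) pairs that should be fitted.

    Hypothetical helper: 'drop' silently removes an estimator, while None
    still works but emits a deprecation warning.
    """
    kept = []
    for name, est in estimators:
        if est is None:
            warnings.warn(
                "Using None to drop an estimator is deprecated; "
                "use 'drop' instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            continue
        if isinstance(est, str) and est == "drop":
            continue
        kept.append((name, est))
    if not kept:
        raise ValueError("All estimators are dropped; at least one is required.")
    return kept
```

Handling both spellings in one shared place would let voting and stacking behave the same way during the deprecation period.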

Contributor Author

glemaitre commented Oct 2, 2019

Contributor

NicolasHug left a comment

I guess I'm becoming increasingly skeptical about the relevance of inheritance in some cases (like here where all it does is set a single attribute). Makes the code easy to write, but often harder to understand.

But LGTM anyway.

- |Fix| Stacking and Voting estimators now ensure that their underlying
estimators are either all classifiers or all regressors.
We introduced a new base class
:class:`ensemble.base._BaseHeterogeneousEnsemble` to raise consistent error

thomasjpfan Oct 2, 2019

Member

Should we include a private class in the whats new? This can be something like:

Stacking and Voting estimators now raise consistent error messages.

glemaitre Oct 2, 2019

Author Contributor

I might have misunderstood what @NicolasHug meant by adding a link.
Did you mean mentioning the class or do you expect something else?

thomasjpfan Oct 2, 2019

Member

Since we are not generating a new API doc for _BaseHeterogeneousEnsemble, there is nothing to link to: https://76528-843222-gh.circle-artifacts.com/0/doc/whats_new/v0.22.html#sklearn-ensemble

NicolasHug Oct 2, 2019

Contributor

The links referred to the Stacking and Voting estimators, sorry if that wasn't clear. I agree we shouldn't link a private class. (and I'm also fine not linking the estimators... it's just a nit)

glemaitre Oct 3, 2019

Author Contributor

Oh ok, makes sense.

@glemaitre glemaitre added this to TO BE MERGED in Guillaume's pet Oct 3, 2019
@@ -236,6 +236,15 @@ Changelog
:user:`Matt Hancock <notmatthancock>` and
:pr:`5963` by :user:`Pablo Duboue <DrDub>`.

- |Fix| Stacking and Voting estimators now ensure that their underlying
estimators are either all classifiers or all regressors.
We introduced a new base class

thomasjpfan Oct 4, 2019

Member

We do not need a "new base class" part?

:class:`ensemble.StackingClassifier`, :class:`ensemble.StackingRegressor`, :class:`ensemble.VotingClassifier`, and :class:`ensemble.VotingRegressor` now raise consistent error messages.

glemaitre Oct 4, 2019

Author Contributor

wrong edit

@thomasjpfan thomasjpfan merged commit 7dd03e0 into scikit-learn:master Oct 5, 2019
19 checks passed
LGTM analysis: C/C++ No code changes detected
LGTM analysis: JavaScript No code changes detected
LGTM analysis: Python 2 new alerts
ci/circleci: deploy Your tests passed on CircleCI!
ci/circleci: doc Your tests passed on CircleCI!
ci/circleci: doc artifact Link to 0/doc/_changed.html
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
ci/circleci: lint Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.03%)
codecov/project 96.73% (+0.7%) compared to 5e4b275
scikit-learn.scikit-learn Build #20191004.24 succeeded
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
scikit-learn.scikit-learn (Linux pylatest_conda_mkl) Linux pylatest_conda_mkl succeeded
scikit-learn.scikit-learn (Linux pylatest_pip_openblas_pandas) Linux pylatest_pip_openblas_pandas succeeded
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded

Member

thomasjpfan commented Oct 5, 2019

Thank you @glemaitre !

@glemaitre glemaitre moved this from TO BE MERGED to MERGED in Guillaume's pet Oct 21, 2019
4 participants