
[MRG] FIX solve consistency between predict and predict_proba in AdaBoost #14114

Merged: 10 commits into scikit-learn:master on Jul 16, 2019

Conversation

glemaitre (Contributor) commented Jun 18, 2019

closes #14084
closes #2974

Compute the probabilities in AdaBoostClassifier as specified in "Multi-class AdaBoost".
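As a rough sketch of the approach taken in the PR: the class probabilities are obtained as a softmax of the rescaled decision function, following eq. (4) of "Multi-class AdaBoost" (Zhu et al., 2009). The helper below is illustrative only; its name and exact signature are assumptions modeled on the PR, not the shipped scikit-learn code.

```python
import numpy as np

def compute_proba_from_decision(decision, n_classes):
    """Illustrative sketch: map AdaBoost decision values to probabilities
    via a softmax of the rescaled decision function (Zhu et al., 2009)."""
    if n_classes == 2:
        # The binary decision function is 1-D; rebuild the symmetric
        # two-column form and halve it, since the columns are summed.
        decision = np.vstack([-decision, decision]).T / 2
    else:
        decision = decision / (n_classes - 1)
    # Row-wise softmax with max-subtraction for numerical stability.
    decision = decision - decision.max(axis=1, keepdims=True)
    expd = np.exp(decision)
    return expd / expd.sum(axis=1, keepdims=True)
```

Because the softmax is monotonic in each row, the argmax of these probabilities matches the argmax of the decision function, which is what makes `predict` and `predict_proba` consistent by construction.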

@glemaitre glemaitre changed the title [WIP] FIX solve consistency between predict and predict_proba in AdaBoost [MRG] FIX solve consistency between predict and predict_proba in AdaBoost Jun 19, 2019
glemaitre (Contributor, Author) commented Jun 19, 2019

@NicolasHug @amueller I am playing with something that is not my strong suit. Could you have a look at the PR and check whether the proposed fix seems theoretically reasonable?

NicolasHug (Contributor) left a comment

As far as I can tell the changes are correct.

What is strange to me is that the previous implementation is also a softmax of the decision function; so, other than the new version being much clearer, I don't understand where the fix is.

(to be honest this isn't really my specialty either)

sklearn/ensemble/weight_boosting.py

        if n_classes == 2:
            decision = np.vstack([-decision, decision]).T

NicolasHug (Contributor) commented Jun 19, 2019

Can you explain this? Is it in the paper too?

glemaitre (Author, Contributor) commented Jun 19, 2019

My original thought was that we were keeping one of the two columns when computing the decision function. Actually, we do something a bit different:

        if n_classes == 2:
            pred[:, 0] *= -1
            return pred.sum(axis=1)

There is originally symmetry between the two classes. However, I should divide decision by 2, since we are summing.
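The divide-by-2 remark can be checked directly (an illustrative toy, not code from the PR): stacking the 1-D binary margin `d` into the symmetric columns `[-d, d]` and halving before the softmax reproduces the logistic sigmoid of `d`.

```python
import numpy as np

d = np.array([-2.0, 0.0, 1.5])            # toy 1-D binary decision values
stacked = np.vstack([-d, d]).T / 2        # symmetric two-column form, halved
proba = np.exp(stacked) / np.exp(stacked).sum(axis=1, keepdims=True)
sigmoid = 1.0 / (1.0 + np.exp(-d))        # matches the positive-class column
```

So with the factor of 1/2, the positive-class probability is exactly `sigmoid(d)`, preserving the symmetry between the two classes.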

glemaitre (Contributor, Author) commented Jun 21, 2019

> What is strange to me is that the previous implementation is also a softmax of the decision function, so other than the fact that the new version is much clearer, I don't understand where the fix is.

The previous implementation uses the predict_proba of the underlying classifiers, whereas the decision function uses predict. That is the main difference.
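To see why deriving both outputs from the same decision function fixes the inconsistency (an illustrative toy with assumed shapes, not scikit-learn code): a row-wise softmax is monotonic, so it cannot change the per-sample argmax, and the labels from the probabilities necessarily agree with the labels from the decision values.

```python
import numpy as np

rng = np.random.default_rng(0)
decision = rng.normal(size=(100, 3))      # toy decision values, 3 classes

# Row-wise softmax of the same decision values.
softmaxed = np.exp(decision - decision.max(axis=1, keepdims=True))
softmaxed /= softmaxed.sum(axis=1, keepdims=True)

predict = decision.argmax(axis=1)              # labels from the decision
predict_from_proba = softmaxed.argmax(axis=1)  # labels from the probabilities
```

When the probabilities instead come from a different quantity (here, the estimators' own predict_proba), no such guarantee holds, which was the source of the reported disagreement.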

glemaitre added 5 commits Jun 21, 2019
fix
glemaitre (Contributor, Author) commented Jul 1, 2019

@NicolasHug do you have any other comments?

Any second reviewer? @rth @thomasjpfan

@thomasjpfan thomasjpfan self-assigned this Jul 4, 2019
thomasjpfan (Member) left a comment

Strange how master's implementation used predict_proba instead of predict, which is more in line with the paper's use of the misclassification error rate.

sklearn/ensemble/weight_boosting.py
glemaitre (Contributor, Author) commented Jul 10, 2019

@thomasjpfan I addressed the comments.

thomasjpfan (Member) left a comment

LGTM

@agramfort agramfort merged commit c0c5313 into scikit-learn:master Jul 16, 2019
17 checks passed
agramfort (Member) commented Jul 16, 2019