Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX fix multinomial deviance by taking the weighted average instead of the sum #17694

Merged
merged 46 commits into from Jun 26, 2020
Merged

Conversation

ghost
Copy link

@ghost ghost commented Jun 24, 2020

Reference Issues/PRs

Fixes #10055
Closes #10081

What does this implement/fix? Explain your changes.

Changed MultinomialDeviance from total logloss to average logloss.

Any other comments?

I fixed stalled PR #10081. The main modification is from #10081.

What I did is to merge recently master branch and fix conflicts and flake8 errors.

@cmarmo
Copy link
Contributor

cmarmo commented Jun 25, 2020

Thanks @t-kusanagi2 for taking care of that. @glemaitre you reviewed the original PR, do you mind checking if your comments have been addressed? Thanks!

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need an entry in whats new for 0.24

@glemaitre glemaitre changed the title Multinomial deviance mean FIX fix multinomial deviance by taking the weighted average instead of the sum Jun 25, 2020
t-kusanagi2 and others added 12 commits June 26, 2020 14:33
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@ghost
Copy link
Author

ghost commented Jun 26, 2020

We will need an entry in whats new for 0.24

I added to doc/whats_new/v0.24.rst. I would be happy if you check that.

doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with it. We will need a second reviewer.

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a few comments.

doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever I read this code I am not sure that fitting weighted regression to predict the unweighted deviance gradient + the leaf value fix-up is equivalent to fitting an weighted tree on the weighted deviance.

At some point we should write some tests to check this but this is outside of the scope of this PR.

@glemaitre glemaitre merged commit 068c4e6 into scikit-learn:master Jun 26, 2020
@glemaitre
Copy link
Member

Thanks @t-kusanagi2

@ghost ghost deleted the multinomial-deviance-mean branch June 29, 2020 21:50
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Jul 17, 2020
…f the sum (scikit-learn#17694)

Co-authored-by: Markus Rempfler <markus.rempfler@tum.de>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel ogrisel added this to the 0.23.2 milestone Aug 3, 2020
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Aug 3, 2020
…f the sum (scikit-learn#17694)

Co-authored-by: Markus Rempfler <markus.rempfler@tum.de>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
…f the sum (scikit-learn#17694)

Co-authored-by: Markus Rempfler <markus.rempfler@tum.de>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MultinomialDeviance in GradientBoostingClassifier should use average logloss instead of total logloss.
4 participants