The calculations of the binomial deviance and its gradient in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/gradient_boosting.py are bothering me. I could be missing something, but:
The deviance is calculated as log(1 + exp(-2 * y * pred)) here. This matches equation 10.18 on p. 346 of Elements of Statistical Learning. However, that derivation assumes that y is {-1, 1} valued, whereas in sklearn y is {0, 1} valued. Effectively, the calculation is insensitive to pred whenever y = 0. The fix is to change the return line to np.sum(np.logaddexp(0.0, -2 * (2 * y - 1) * pred)) / y.shape[0].
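A quick numerical sketch of the problem (my own illustration, not the scikit-learn source; the label and score values are arbitrary):

```python
import numpy as np

# {0, 1}-valued labels and raw scores, as in sklearn's gradient boosting.
y = np.array([0, 0, 1, 1])
pred = np.array([-3.0, 3.0, -3.0, 3.0])

# Current formula: log(1 + exp(-2 * y * pred)).
# When y == 0 the exponent is always 0, so every such term is log(2),
# regardless of how wrong pred is.
current = np.logaddexp(0.0, -2.0 * y * pred)
print(current)        # first two entries identical despite opposite preds

# Proposed fix: map y from {0, 1} to {-1, 1} before applying the ESL formula.
fixed = np.logaddexp(0.0, -2.0 * (2 * y - 1) * pred)
print(fixed.mean())   # same as np.sum(...) / y.shape[0]
```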
The calculation of the gradient makes sense to me if the pred values map to class probabilities via P(y=1) = 1 / (1 + exp(-pred)). However, the loss function calculation above seems to follow the convention that P(y=1) = 1 / (1 + exp(-2 * pred)) (again, see the link above). One way to make the two equations consistent with each other is to remove the first 2 in the expression above: np.sum(np.logaddexp(0.0, -(2 * y - 1) * pred)) / y.shape[0]
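To check the consistency claim numerically, here is a small sketch (my own code, with hypothetical names; it assumes the gradient in use is y - 1 / (1 + exp(-pred))): the finite-difference derivative of the unscaled loss matches that gradient, whereas the 2-scaled loss would not.

```python
import numpy as np
from scipy.special import expit  # expit(x) = 1 / (1 + exp(-x))

def proposed_loss(y, pred):
    # Unscaled binomial deviance with y mapped from {0, 1} to {-1, 1}.
    return np.logaddexp(0.0, -(2 * y - 1) * pred)

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=5).astype(float)
pred = rng.normal(size=5)

# Gradient under the convention P(y=1) = 1 / (1 + exp(-pred)).
analytic = y - expit(pred)

# Central finite-difference negative gradient of the proposed loss.
eps = 1e-6
numeric = -(proposed_loss(y, pred + eps) - proposed_loss(y, pred - eps)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True: conventions agree
```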