FIX use float64 in metrics.r2_score() to prevent overflow
Without this, if the input arrays are of type np.float32, their sums
may be computed with a large accumulated error, resulting in the wrong
score for very long arrays (millions of elements).

The "1 - numerator / denominator" calculation at the very end produces
a float64 anyway, so the returned type does not change; only the
accuracy improves.

Fixes scikit-learn#2158.
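
The effect described above can be reproduced outside of r2_score(). The following is a minimal sketch, not part of the commit: the array size and the constant value 0.1 are arbitrary choices made for illustration. It compares a sum accumulated in the input dtype with one accumulated in float64 via the dtype argument of ndarray.sum(). The size of the float32 error depends on the NumPy version and memory layout, so the sketch only checks that the float64 accumulator is at least as accurate.

```python
import numpy as np

# Hypothetical data: 5 million float32 samples, all equal to float32(0.1).
n = 5_000_000
values = np.full(n, 0.1, dtype=np.float32)

# Reference value, computed in float64 from the value actually stored
# in the array (float32(0.1) is not exactly 0.1).
exact = n * float(values[0])

# Default behaviour: the running total is kept in the input dtype.
sum32 = values.sum()

# What the commit does: request a float64 accumulator and result.
sum64 = values.sum(dtype=np.float64)

err32 = abs(float(sum32) - exact)
err64 = abs(float(sum64) - exact)

print("float32 accumulator:", sum32.dtype, "abs error:", err32)
print("float64 accumulator:", sum64.dtype, "abs error:", err64)
```

The input array itself stays float32; only the accumulation is widened, so the extra memory cost is negligible.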
jzwinck authored and larsmans committed Jul 16, 2013
1 parent 22dbecc commit e433d20
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions sklearn/metrics/metrics.py
@@ -2392,8 +2392,8 @@ def r2_score(y_true, y_pred):
     if len(y_true) == 1:
         raise ValueError("r2_score can only be computed given more than one"
                          " sample.")
-    numerator = ((y_true - y_pred) ** 2).sum()
-    denominator = ((y_true - y_true.mean(axis=0)) ** 2).sum()
+    numerator = ((y_true - y_pred) ** 2).sum(dtype=np.float64)
+    denominator = ((y_true - y_true.mean(axis=0)) ** 2).sum(dtype=np.float64)

     if denominator == 0.0:
         if numerator == 0.0: