
r2_score metric incorrect? #5570

Closed
drei34 opened this issue Oct 23, 2015 · 2 comments
drei34 commented Oct 23, 2015

Hello,

The r2_score metric is used by many functions in sklearn, but right now it is generally computed as 1 - RSS/SYY, which would be the right formula if you run a regression with an intercept.

If, however, you ran a regression with no intercept, then the r2_score should instead be equal to (y_pred^2).sum()/(y_true^2).sum() — notice that we do not demean.
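A minimal numpy sketch of this point (no sklearn needed, so the names below are all local to the example): fit y = b*x through the origin, then compare the centered R^2 that sklearn's r2_score would compute against the uncentered, through-origin version. For an OLS fit through the origin the residuals are orthogonal to the predictions, so the uncentered 1 - RSS/sum(y^2) coincides with (y_pred^2).sum()/(y_true^2).sum():

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(50) * 10
y = 3.0 * x + rng.randn(50)

# OLS slope with no intercept: b = <x, y> / <x, x>
b = (x * y).sum() / (x * x).sum()
y_pred = b * x

rss = ((y - y_pred) ** 2).sum()
centered = 1.0 - rss / ((y - y.mean()) ** 2).sum()  # what r2_score computes
uncentered = 1.0 - rss / (y ** 2).sum()             # through-origin version

# For through-origin OLS, uncentered R^2 equals the OP's formula
ratio = (y_pred ** 2).sum() / (y ** 2).sum()

# sum(y**2) >= sum((y - mean)**2), so uncentered >= centered always
print(centered, uncentered, ratio)
```

Since sum(y^2) is never smaller than the demeaned sum of squares, the uncentered score is never below the centered one, so the two conventions really do disagree on no-intercept fits.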

Moreover, for other models, 1 - RSS/SYY can be negative, i.e. not between 0 and 1, which again is bad.

For a standard regression with an intercept term, 1 - RSS/SYY = corr(y_pred, y_true)^2, and that number, no matter what the model is, lies between 0 and 1, with 1 as the goal.
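A quick numpy contrast of the two definitions (all names below are local to this example): a predictor that is perfectly correlated with y_true but badly biased gets corr^2 = 1, while 1 - RSS/SYY goes strongly negative:

```python
import numpy as np

rng = np.random.RandomState(1)
y_true = rng.randn(100)
y_pred = y_true + 5.0  # perfectly correlated, but badly biased

# Squared Pearson correlation: always in [0, 1]
r = np.corrcoef(y_true, y_pred)[0, 1]
corr_sq = r ** 2  # exactly 1 here, since y_pred is a shift of y_true

# 1 - RSS/SYY: unbounded below, penalizes the bias heavily
rss = ((y_true - y_pred) ** 2).sum()
syy = ((y_true - y_true.mean()) ** 2).sum()
r2 = 1.0 - rss / syy

print(corr_sq, r2)
```

This is also why corr^2 alone can be misleading as a scorer: it ignores calibration entirely, which is exactly the bias that 1 - RSS/SYY punishes.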

I think the definition should then either be changed on a per-model basis or be changed to corr(y_pred, y_true)^2. The book "Applied Linear Regression" by S. Weisberg discusses the issue I raise above on page 84 of the third edition: it suggests using corr(y_pred, y_true)^2 for nonlinear models and altering the definition as above for regression through the origin. Finally, with regard to regression, statsmodels does use a different formula for the r2_score depending on whether or not the regression includes an intercept.

Maybe this is known already, but the codebase does not seem to differentiate between these cases at all, so that is why I am raising the issue here.

Thanks!


amueller commented Oct 7, 2016

Sorry for the slow reply. It's true that for non-linear models R^2 need not be > 0, but it is always <= 1.
We're basically always using the Wikipedia definition without taking the model into account:
https://en.wikipedia.org/wiki/Coefficient_of_determination

This is somewhat non-standard but unfortunately it's a bit hard to change. In particular when using a test set, it's a bit unclear to me what the R^2 means.
I'm not super familiar with stats, but (y_pred^2).sum()/(y_true^2).sum() seems really odd to me. When would that make sense?


jnothman commented Jan 28, 2018

I think this can be closed, in part just because it's not going to change (although I suppose the docs could be improved). But I also think the OP neglects the fact that in a machine learning context we are usually estimating generalisation error, not goodness of fit alone; hence negative scores seem appropriate.
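A toy illustration of this point (all values below are made up for the example): on held-out data, a misspecified model can do worse than simply predicting the test-set mean, which is exactly when 1 - RSS/SYY drops below zero:

```python
import numpy as np

y_test = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.full_like(y_test, 10.0)  # a badly misspecified model

rss = ((y_test - y_pred) ** 2).sum()                 # 230.0
syy = ((y_test - y_test.mean()) ** 2).sum()          # 5.0
r2 = 1.0 - rss / syy                                 # -45.0

print(r2)  # far below zero, but R^2 can never exceed 1
```

As a generalisation score this is informative rather than broken: a negative value tells you the model is worse on this test set than the constant baseline.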
