# Definition of R Square

R Square (also called R2) provides a measure of how well future samples are likely to be predicted by a regression model. It differs from the MSE as follows:

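R2 is 1 minus the ratio of ``the squared differences between the true and predicted values`` to ``the squared differences between the true values and their mean``. The numerator is the residual sum of squares, which is the MSE multiplied by the number of samples; dividing both sums by the sample count turns the ratio into the MSE over the variance of the true values. A score of 1 means a perfect fit, and the lower the score, the worse the model.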
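The relation between R2 and the MSE can be checked numerically. The following is a minimal sketch, assuming the same sample values as the sklearn example used later in this note; the variable names `mse`, `variance`, and `r2_from_mse` are only illustrative.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# MSE: mean of the squared residuals.
mse = mean_squared_error(y_true, y_pred)

# Population variance of the true values (ddof=0, NumPy's default).
variance = np.var(y_true)

# R2 expressed through the MSE: 1 - MSE / Var(y_true).
r2_from_mse = 1 - mse / variance

# Should agree with sklearn's r2_score up to floating-point error.
assert np.isclose(r2_from_mse, r2_score(y_true, y_pred))
```

Unlike the MSE, which depends on the scale of the target, R2 is normalized by the spread of the true values, so it can be compared across targets with different units.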



Reference:

- R2 Wiki: https://en.wikipedia.org/wiki/Coefficient_of_determination
- R2 in [sklearn](http://scikit-learn.org/): http://scikit-learn.org/stable/modules/model_evaluation.html#r2-score-the-coefficient-of-determination
- [小象学院 video, starting at the 6-minute mark](https://www.youtube.com/watch?v=yMsnsQ2ZTSQ&index=8&list=PLMRbXzyA0qdWsZXemrdKnqgrPtPFVfA-z)

import numpy as np
from sklearn.metrics import r2_score

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

score = r2_score(y_true, y_pred)

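def cal_r2_score(y_true, y_pred):
    # Convert inputs to arrays so the arithmetic below is element-wise.
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    ss_res = ((y_true - y_pred) ** 2).sum()  # residual sum of squares
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1 - ss_res / ss_tot

# Compare with a tolerance instead of exact float equality.
assert np.isclose(score, cal_r2_score(y_true, y_pred))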
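
To illustrate what "1 is perfect, lower is worse" means in practice, here is a small sketch with three hypothetical sets of predictions for the same targets: a perfect prediction, a constant prediction equal to the mean of the true values, and a deliberately poor one (the true values in reversed order). The names `perfect`, `baseline`, and `worse` are made up for this example.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3, -0.5, 2, 7])

# Predicting the true values exactly leaves no residual, so R2 is 1.
perfect = r2_score(y_true, y_true)

# Always predicting the mean of y_true makes the residual sum of squares
# equal to the total sum of squares, so R2 is 0 (up to rounding).
baseline = r2_score(y_true, np.full(y_true.shape, y_true.mean()))

# Predictions that fit worse than the mean give a negative R2.
worse = r2_score(y_true, y_true[::-1])

print(perfect, baseline, worse)
```

So R2 is not bounded below by 0: a negative score means the model does worse than simply predicting the mean of the true values.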