| layout | mathjax | author | affiliation | e_mail | date | title | chapter | section | topic | theorem | sources | proof_id | shortcut | username |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| proof | true | Joram Soch | BCCN Berlin | joram.soch@bccn-berlin.de | 2021-10-27 08:31:00 -0700 | Relationship between coefficient of determination and correlation coefficient in simple linear regression | Statistical Models | Univariate normal data | Simple linear regression | Coefficient of determination in terms of correlation coefficient | | P280 | slr-rsq | JoramSoch |
Theorem: Assume a simple linear regression model with independent observations
$$ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, n $$
and consider estimation using ordinary least squares. Then, the coefficient of determination is equal to the squared correlation coefficient between $x$ and $y$:
$$ R^2 = r_{xy}^2 \; . $$
Proof: The ordinary least squares estimates for simple linear regression are
$$ \label{eq:slr-ols} \begin{split} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\ \hat{\beta}_1 &= \frac{s_{xy}}{s_x^2} \; . \end{split} $$
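As a numerical illustration of these estimates, the following minimal sketch computes the slope from the sample covariance and sample variance and the intercept from the sample means. The function name `ols_simple` and the toy data are hypothetical and chosen only for this example; `np.polyfit` is used as an independent reference fit.

```python
import numpy as np

def ols_simple(x, y):
    """Return (beta0_hat, beta1_hat) for the model y = beta0 + beta1 * x + noise."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    s_xy = np.cov(x, y, ddof=1)[0, 1]    # sample covariance of x and y
    s_x2 = np.var(x, ddof=1)             # sample variance of x
    beta1 = s_xy / s_x2                  # slope estimate
    beta0 = y.mean() - beta1 * x.mean()  # intercept estimate
    return beta0, beta1

# Hypothetical toy data, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=50)

b0, b1 = ols_simple(x, y)
slope, intercept = np.polyfit(x, y, deg=1)  # coefficients, highest degree first
print(np.allclose([b0, b1], [intercept, slope]))
```

Both routes should agree up to floating-point rounding, since a degree-1 least-squares polynomial fit is the same optimization problem.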
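The coefficient of determination $R^2$ is defined as the ratio of the explained sum of squares to the total sum of squares:
$$ R^2 = \frac{\mathrm{ESS}}{\mathrm{TSS}} \; . $$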
Using the explained and total sum of squares for simple linear regression, we have:
$$ \label{eq:slr-R2-s2} \begin{split} R^2 &= \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \\ &= \frac{\sum_{i=1}^{n} (\hat{\beta}_0 + \hat{\beta}_1 x_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \; . \end{split} $$
By applying \eqref{eq:slr-ols}, we can further develop the coefficient of determination:
$$ \label{eq:slr-R2-s3} \begin{split} R^2 &= \frac{\sum_{i=1}^{n} (\bar{y} - \hat{\beta}_1 \bar{x} + \hat{\beta}_1 x_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \\ &= \frac{\sum_{i=1}^{n} \left( \hat{\beta}_1 (x_i - \bar{x}) \right)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \\ &= \hat{\beta}_1^2 \, \frac{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}{\frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2} \\ &= \hat{\beta}_1^2 \, \frac{s_x^2}{s_y^2} \\ &= \left( \frac{s_x}{s_y} \, \hat{\beta}_1 \right)^2 \; . \end{split} $$
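The intermediate result can be checked numerically. The sketch below compares the ratio of explained to total sum of squares with the slope-based expression $\hat{\beta}_1^2 \, s_x^2 / s_y^2$. The data values and variable names are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical toy data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# OLS estimates from sample covariance, sample variance and sample means.
beta1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
beta0 = y.mean() - beta1 * x.mean()
y_hat = beta0 + beta1 * x  # fitted values

ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
tss = np.sum((y - y.mean()) ** 2)      # total sum of squares

r2_from_sums = ess / tss
r2_from_slope = beta1 ** 2 * np.var(x, ddof=1) / np.var(y, ddof=1)
print(np.isclose(r2_from_sums, r2_from_slope))
```

The factor $\frac{1}{n-1}$ cancels between numerator and denominator, which is why the sample variances (with `ddof=1`) can replace the raw sums of squares.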
Using the relationship between correlation coefficient and slope estimate, $r_{xy} = \frac{s_x}{s_y} \, \hat{\beta}_1$, we conclude:
$$ \label{eq:slr-R2-qed} R^2 = \left( \frac{s_x}{s_y} \, \hat{\beta}_1 \right)^2 = r_{xy}^2 \; . $$
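The theorem itself can be sanity-checked on simulated data. This is a minimal sketch, not part of the proof: the helper `r_squared`, the random seed and the simulated coefficients are assumptions made for illustration, and `np.corrcoef` supplies the sample correlation coefficient.

```python
import numpy as np

def r_squared(x, y):
    """Coefficient of determination (ESS / TSS) of an OLS fit of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    beta1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    beta0 = y.mean() - beta1 * x.mean()
    y_hat = beta0 + beta1 * x
    return np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

# Hypothetical simulated data with a negative slope.
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=200)
y = -0.7 * x + rng.normal(size=200)

r_xy = np.corrcoef(x, y)[0, 1]  # sample correlation coefficient
print(np.isclose(r_squared(x, y), r_xy ** 2))
```

A negative slope is used deliberately: $r_{xy}$ is then negative, while $R^2$ is non-negative, and the two agree only after squaring.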