---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-10-27 01:56:00 -0700

title: "Ordinary least squares for simple linear regression"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Simple linear regression"
theorem: "Ordinary least squares"

sources:
  - authors: "Penny, William"
    year: 2006
    title: "Linear regression"
    in: "Mathematics for Brain Imaging"
    pages: "ch. 1.2.2, pp. 14-16, eqs. 1.24/1.25"
    url: ""
  - authors: "Wikipedia"
    year: 2021
    title: "Proofs involving ordinary least squares"
    in: "Wikipedia, the free encyclopedia"
    pages: "retrieved on 2021-10-27"
    url: ""

proof_id: "P271"
shortcut: "slr-ols"
username: "JoramSoch"
---

**Theorem:** Given a simple linear regression model with independent observations

$$ \label{eq:slr} y = \beta_0 + \beta_1 x + \varepsilon, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n \; , $$

the parameters minimizing the residual sum of squares are given by

$$ \label{eq:slr-ols} \begin{split} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\ \hat{\beta}_1 &= \frac{s_{xy}}{s_x^2} \end{split} $$

where $\bar{x}$ and $\bar{y}$ are the sample means, $s_x^2$ is the sample variance of $x$ and $s_{xy}$ is the sample covariance between $x$ and $y$.
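To make the theorem concrete, here is a minimal numerical sketch (not part of the original proof; it assumes NumPy and simulated toy data with hypothetical true parameters) that computes $\hat{\beta}_0$ and $\hat{\beta}_1$ from the stated formulas and compares them against `numpy.polyfit`:

```python
import numpy as np

# Simulated toy data (assumption: true intercept 2, slope 3, Gaussian noise)
rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)

x_bar, y_bar = x.mean(), y.mean()
s_xy = np.sum((x - x_bar) * (y - y_bar)) / (n - 1)  # sample covariance s_xy
s_xx = np.sum((x - x_bar) ** 2) / (n - 1)           # sample variance s_x^2

beta_1_hat = s_xy / s_xx                 # slope estimate from the theorem
beta_0_hat = y_bar - beta_1_hat * x_bar  # intercept estimate from the theorem

# np.polyfit with degree 1 returns [slope, intercept] of the least-squares line
slope, intercept = np.polyfit(x, y, deg=1)
assert np.allclose([beta_0_hat, beta_1_hat], [intercept, slope])
```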

**Proof:** The residual sum of squares is defined as

$$ \label{eq:rss} \mathrm{RSS}(\beta_0,\beta_1) = \sum_{i=1}^n \varepsilon_i^2 = \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2 \; . $$
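As an aside, a brute-force sketch on toy data (not part of the proof; NumPy assumed) illustrates that $\mathrm{RSS}$ evaluated at the estimates from the theorem is no larger than at randomly perturbed parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=80)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=80)

def rss(b0, b1):
    """Residual sum of squares for a given intercept b0 and slope b1."""
    return np.sum((y - b0 - b1 * x) ** 2)

# Closed-form estimates from the theorem
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

# The RSS at the estimates is never beaten by random perturbations around them
perturbed = [rss(b0_hat + d0, b1_hat + d1)
             for d0, d1 in rng.normal(scale=0.1, size=(1000, 2))]
assert rss(b0_hat, b1_hat) <= min(perturbed)
```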

The derivatives of $\mathrm{RSS}(\beta_0,\beta_1)$ with respect to $\beta_0$ and $\beta_1$ are

$$ \label{eq:rss-der} \begin{split} \frac{\mathrm{d}\mathrm{RSS}(\beta_0,\beta_1)}{\mathrm{d}\beta_0} &= \sum_{i=1}^n 2 (y_i - \beta_0 - \beta_1 x_i) (-1) \\ &= -2 \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i) \\ \frac{\mathrm{d}\mathrm{RSS}(\beta_0,\beta_1)}{\mathrm{d}\beta_1} &= \sum_{i=1}^n 2 (y_i - \beta_0 - \beta_1 x_i) (-x_i) \\ &= -2 \sum_{i=1}^n (x_i y_i - \beta_0 x_i - \beta_1 x_i^2) \end{split} $$

and setting these derivatives to zero

$$ \label{eq:rss-der-zero} \begin{split} 0 &= -2 \sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) \\ 0 &= -2 \sum_{i=1}^n (x_i y_i - \hat{\beta}_0 x_i - \hat{\beta}_1 x_i^2) \end{split} $$

yields the following equations:

$$ \label{eq:slr-norm-eq} \begin{split} \hat{\beta}_1 \sum_{i=1}^n x_i + \hat{\beta}_0 \cdot n &= \sum_{i=1}^n y_i \\ \hat{\beta}_1 \sum_{i=1}^n x_i^2 + \hat{\beta}_0 \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i \; . \end{split} $$
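A short sketch (toy data, NumPy assumed; not part of the proof) writes these two normal equations as a $2 \times 2$ linear system and solves it directly; the solution makes both derivatives in \eqref{eq:rss-der} vanish and matches the closed form stated in the theorem:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)

# Normal equations: [[n, sum(x)], [sum(x), sum(x^2)]] @ [b0, b1] = [sum(y), sum(x*y)]
A = np.array([[len(x), x.sum()],
              [x.sum(), np.sum(x ** 2)]])
b = np.array([y.sum(), np.sum(x * y)])
b0_hat, b1_hat = np.linalg.solve(A, b)

# Both derivatives of the RSS vanish at the solution
grad_b0 = -2 * np.sum(y - b0_hat - b1_hat * x)
grad_b1 = -2 * np.sum(x * y - b0_hat * x - b1_hat * x ** 2)
assert np.allclose([grad_b0, grad_b1], [0.0, 0.0])

# ... and the solution matches the closed form stated in the theorem
b1_ref = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_ref = y.mean() - b1_ref * x.mean()
assert np.allclose([b0_hat, b1_hat], [b0_ref, b1_ref])
```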

From the first equation, we can derive the estimate for the intercept:

$$ \label{eq:slr-ols-int} \begin{split} \hat{\beta}_0 &= \frac{1}{n} \sum_{i=1}^n y_i - \hat{\beta}_1 \cdot \frac{1}{n} \sum_{i=1}^n x_i \\ &= \bar{y} - \hat{\beta}_1 \bar{x} \; . \end{split} $$

From the second equation, we can derive the estimate for the slope:

$$ \label{eq:slr-ols-sl} \begin{split} \hat{\beta}_1 \sum_{i=1}^n x_i^2 + \hat{\beta}_0 \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i \\ \hat{\beta}_1 \sum_{i=1}^n x_i^2 + \left( \bar{y} - \hat{\beta}_1 \bar{x} \right) \sum_{i=1}^n x_i &\overset{\eqref{eq:slr-ols-int}}{=} \sum_{i=1}^n x_i y_i \\ \hat{\beta}_1 \left( \sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i \right) &= \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i \\ \hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i}{\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i} \; . \end{split} $$

Note that the numerator can be rewritten as

$$ \label{eq:slr-ols-sl-num} \begin{split} \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i - n \bar{x} \bar{y} \\ &= \sum_{i=1}^n x_i y_i - n \bar{x} \bar{y} - n \bar{x} \bar{y} + n \bar{x} \bar{y} \\ &= \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i - \bar{x} \sum_{i=1}^n y_i + \sum_{i=1}^n \bar{x} \bar{y} \\ &= \sum_{i=1}^n \left( x_i y_i - x_i \bar{y} - \bar{x} y_i + \bar{x} \bar{y} \right) \\ &= \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y}) \end{split} $$

and that the denominator can be rewritten as

$$ \label{eq:slr-ols-sl-den} \begin{split} \sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i^2 - n \bar{x}^2 \\ &= \sum_{i=1}^n x_i^2 - 2 n \bar{x} \bar{x} + n \bar{x}^2 \\ &= \sum_{i=1}^n x_i^2 - 2 \bar{x} \sum_{i=1}^n x_i + \sum_{i=1}^n \bar{x}^2 \\ &= \sum_{i=1}^n \left( x_i^2 - 2 \bar{x} x_i + \bar{x}^2 \right) \\ &= \sum_{i=1}^n (x_i - \bar{x})^2 \; . \end{split} $$
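These two rewriting steps are pure algebraic identities; a quick numerical sanity check on arbitrary toy vectors (NumPy assumed, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=20)
y = rng.normal(size=20)
x_bar, y_bar = x.mean(), y.mean()

# Numerator identity: sum(x*y) - y_bar*sum(x) == sum((x - x_bar)*(y - y_bar))
num_lhs = np.sum(x * y) - y_bar * x.sum()
num_rhs = np.sum((x - x_bar) * (y - y_bar))

# Denominator identity: sum(x^2) - x_bar*sum(x) == sum((x - x_bar)^2)
den_lhs = np.sum(x ** 2) - x_bar * x.sum()
den_rhs = np.sum((x - x_bar) ** 2)

assert np.allclose([num_lhs, den_lhs], [num_rhs, den_rhs])
```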

With \eqref{eq:slr-ols-sl-num} and \eqref{eq:slr-ols-sl-den}, the estimate from \eqref{eq:slr-ols-sl} can be simplified as follows:

$$ \label{eq:slr-ols-sl-qed} \begin{split} \hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i}{\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i} \\ &= \frac{\sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \\ &= \frac{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})}{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2} \\ &= \frac{s_{xy}}{s_x^2} \; . \end{split} $$

Together, \eqref{eq:slr-ols-int} and \eqref{eq:slr-ols-sl-qed} constitute the ordinary least squares parameter estimates for simple linear regression.
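
Finally, a small sketch (toy data; it assumes that NumPy's `np.cov` and `np.var` with `ddof=1` return the sample covariance $s_{xy}$ and sample variance $s_x^2$ used above) shows that the final form $\hat{\beta}_1 = s_{xy}/s_x^2$ maps directly onto standard sample estimators:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=200)

s_xy = np.cov(x, y, ddof=1)[0, 1]   # sample covariance between x and y
s_xx = np.var(x, ddof=1)            # sample variance of x

beta_1_hat = s_xy / s_xx                       # slope: s_xy / s_x^2
beta_0_hat = y.mean() - beta_1_hat * x.mean()  # intercept: y_bar - beta_1_hat * x_bar

# Cross-check against NumPy's own least-squares line fit
slope, intercept = np.polyfit(x, y, deg=1)
assert np.allclose([beta_0_hat, beta_1_hat], [intercept, slope])
```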