---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-10-27 01:56:00 -0700

title: "Ordinary least squares for simple linear regression"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Simple linear regression"
theorem: "Ordinary least squares"

sources:

proof_id: "P271"
shortcut: "slr-ols"
username: "JoramSoch"
---
Theorem: Given a simple linear regression model with independent observations

$$ \label{eq:slr} y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \; \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \quad i = 1, \ldots, n \; , $$

the parameters minimizing the residual sum of squares are given by
$$ \label{eq:slr-ols} \begin{split} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\ \hat{\beta}_1 &= \frac{s_{xy}}{s_x^2} \end{split} $$
where $\bar{x}$ and $\bar{y}$ are the sample means, $s_x^2$ is the sample variance of $x$ and $s_{xy}$ is the sample covariance between $x$ and $y$.
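As a quick numerical illustration of the theorem (a minimal sketch with simulated data; all variable names are illustrative and not part of the proof), the estimates from the two formulas above can be compared against NumPy's built-in degree-1 polynomial fit:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(5.0, 2.0, size=n)                  # simulated predictor values
y = 1.5 + 0.8 * x + rng.normal(0.0, 1.0, size=n)  # y_i = beta_0 + beta_1*x_i + noise

s_xy = np.cov(x, y, ddof=1)[0, 1]      # sample covariance s_xy
s_xx = np.var(x, ddof=1)               # sample variance  s_x^2

beta_1 = s_xy / s_xx                   # slope:     s_xy / s_x^2
beta_0 = y.mean() - beta_1 * x.mean()  # intercept: ybar - beta_1 * xbar

# np.polyfit returns [slope, intercept] for a degree-1 least-squares fit
assert np.allclose([beta_1, beta_0], np.polyfit(x, y, 1))
print(beta_0, beta_1)
```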
Proof: The residual sum of squares is defined as

$$ \label{eq:rss} \mathrm{RSS}(\hat{\beta}_0, \hat{\beta}_1) = \sum_{i=1}^n \hat{\varepsilon}_i^2 = \sum_{i=1}^n \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2 \; . $$
The derivatives of $\mathrm{RSS}(\hat{\beta}_0, \hat{\beta}_1)$ with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ are

$$ \label{eq:rss-der} \begin{split} \frac{\mathrm{d}\,\mathrm{RSS}}{\mathrm{d}\hat{\beta}_0} &= -2 \sum_{i=1}^n \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right) \\ \frac{\mathrm{d}\,\mathrm{RSS}}{\mathrm{d}\hat{\beta}_1} &= -2 \sum_{i=1}^n \left( x_i y_i - \hat{\beta}_0 x_i - \hat{\beta}_1 x_i^2 \right) \end{split} $$

and setting these derivatives to zero
$$ \label{eq:rss-der-zero} \begin{split} 0 &= -2 \sum_{i=1}^n \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right) \\ 0 &= -2 \sum_{i=1}^n \left( x_i y_i - \hat{\beta}_0 x_i - \hat{\beta}_1 x_i^2 \right) \end{split} $$
yields the following equations:
$$ \label{eq:slr-norm-eq} \begin{split} \hat{\beta}_1 \sum_{i=1}^n x_i + \hat{\beta}_0 \cdot n &= \sum_{i=1}^n y_i \\ \hat{\beta}_1 \sum_{i=1}^n x_i^2 + \hat{\beta}_0 \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i \; . \end{split} $$
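Before proceeding, the normal equations can be checked numerically (a sketch with simulated data, not part of the proof itself): solving them as a 2×2 linear system yields estimates at which both derivatives in \eqref{eq:rss-der-zero} vanish.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 80
x = rng.normal(size=n)
y = 3.0 + 1.2 * x + rng.normal(size=n)

# normal equations as a 2x2 linear system, unknowns ordered (beta_1, beta_0)
A = np.array([[x.sum(),      n      ],
              [(x**2).sum(), x.sum()]])
b = np.array([y.sum(), (x * y).sum()])
beta_1, beta_0 = np.linalg.solve(A, b)

# at this solution both derivatives vanish: the residuals sum to zero
# and are orthogonal to the predictor values
res = y - beta_0 - beta_1 * x
assert np.allclose(res.sum(), 0)
assert np.allclose((x * res).sum(), 0)
```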
From the first equation, we can derive the estimate for the intercept:
$$ \label{eq:slr-ols-int} \begin{split} \hat{\beta}_0 &= \frac{1}{n} \sum_{i=1}^n y_i - \hat{\beta}_1 \cdot \frac{1}{n} \sum_{i=1}^n x_i \\ &= \bar{y} - \hat{\beta}_1 \bar{x} \; . \end{split} $$
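Note that \eqref{eq:slr-ols-int} is equivalent to saying that the fitted line passes through the point of means $(\bar{x}, \bar{y})$, which a short numerical sketch (illustrative data, not part of the proof) makes concrete:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=40)
y = 0.7 + 2.0 * x + rng.normal(size=40)

b1, b0 = np.polyfit(x, y, 1)   # reference OLS fit: slope, intercept

# b0 = ybar - b1*xbar  <=>  the fitted line passes through (xbar, ybar)
assert np.allclose(b0, y.mean() - b1 * x.mean())
assert np.allclose(b0 + b1 * x.mean(), y.mean())
```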
From the second equation, we can derive the estimate for the slope:
$$ \label{eq:slr-ols-sl} \begin{split} \hat{\beta}_1 \sum_{i=1}^n x_i^2 + \hat{\beta}_0 \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i \\ \hat{\beta}_1 \sum_{i=1}^n x_i^2 + \left( \bar{y} - \hat{\beta}_1 \bar{x} \right) \sum_{i=1}^n x_i &\overset{\eqref{eq:slr-ols-int}}{=} \sum_{i=1}^n x_i y_i \\ \hat{\beta}_1 \left( \sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i \right) &= \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i \\ \hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i}{\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i} \; . \end{split} $$
Note that the numerator can be rewritten as

$$ \label{eq:slr-ols-sl-num} \begin{split} \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i - \bar{x} \sum_{i=1}^n y_i + n \bar{x} \bar{y} \\ &= \sum_{i=1}^n \left( x_i y_i - x_i \bar{y} - \bar{x} y_i + \bar{x} \bar{y} \right) \\ &= \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y}) \end{split} $$
and that the denominator can be rewritten as

$$ \label{eq:slr-ols-sl-den} \begin{split} \sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i &= \sum_{i=1}^n x_i^2 - 2 \bar{x} \sum_{i=1}^n x_i + n \bar{x}^2 \\ &= \sum_{i=1}^n \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \\ &= \sum_{i=1}^n (x_i - \bar{x})^2 \; . \end{split} $$
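Both rewritings are easy to confirm numerically; the following sketch (random data, illustrative only) checks the identities \eqref{eq:slr-ols-sl-num} and \eqref{eq:slr-ols-sl-den}:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=60)
y = rng.normal(size=60)
xbar, ybar = x.mean(), y.mean()

# numerator identity: sum(x*y) - ybar*sum(x) == sum((x-xbar)*(y-ybar))
assert np.allclose((x * y).sum() - ybar * x.sum(),
                   ((x - xbar) * (y - ybar)).sum())

# denominator identity: sum(x^2) - xbar*sum(x) == sum((x-xbar)^2)
assert np.allclose((x**2).sum() - xbar * x.sum(),
                   ((x - xbar)**2).sum())
```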
With \eqref{eq:slr-ols-sl-num} and \eqref{eq:slr-ols-sl-den}, the estimate from \eqref{eq:slr-ols-sl} can be simplified as follows:
$$ \label{eq:slr-ols-sl-qed} \begin{split} \hat{\beta}_1 &= \frac{\sum_{i=1}^n x_i y_i - \bar{y} \sum_{i=1}^n x_i}{\sum_{i=1}^n x_i^2 - \bar{x} \sum_{i=1}^n x_i} \\ &= \frac{\sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \\ &= \frac{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y})}{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2} \\ &= \frac{s_{xy}}{s_x^2} \; . \end{split} $$
Together, \eqref{eq:slr-ols-int} and \eqref{eq:slr-ols-sl-qed} constitute the ordinary least squares parameter estimates for simple linear regression.
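For completeness, here is a minimal sketch bundling the two estimates into a helper function; `ols_slr` is a hypothetical name chosen for illustration, not part of any library:

```python
import numpy as np

def ols_slr(x, y):
    """Return (beta_0_hat, beta_1_hat) for simple linear regression,
    computed via the closed-form OLS estimates derived in the proof."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    beta_1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # s_xy / s_x^2
    beta_0 = y.mean() - beta_1 * x.mean()                    # ybar - b1*xbar
    return beta_0, beta_1

# usage example
b0, b1 = ols_slr([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
print(b0, b1)
```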