---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-21 06:15:00 -0800

title: "Maximum likelihood estimator of variance in multiple linear regression is biased"
chapter: "Model Selection"
section: "Goodness-of-fit measures"
topic: "Residual variance"
theorem: "Maximum likelihood estimator is biased (p > 1)"

sources:
  - authors: "ocram"
    year: 2022
    title: "Why is RSS distributed chi square times n-p?"
    in: "StackExchange CrossValidated"
    pages: "retrieved on 2022-12-21"
    url: ""

proof_id: "P398"
shortcut: "resvar-biasp"
username: "JoramSoch"
---

Theorem: Consider a linear regression model with known design matrix $X$, known covariance structure $V$, unknown regression parameters $\beta$ and unknown noise variance $\sigma^2$:

$$ \label{eq:mlr} y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V) \; . $$

Then,

  1. the maximum likelihood estimator of $\sigma^2$ is

$$ \label{eq:sigma-mle} \hat{\sigma}^2 = \frac{1}{n} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) $$

where

$$ \label{eq:beta-mle} \hat{\beta} = (X^\mathrm{T} V^{-1} X)^{-1} X^\mathrm{T} V^{-1} y $$

  2. and $\hat{\sigma}^2$ is a biased estimator of $\sigma^2$

$$ \label{eq:resvar-var} \mathrm{E}\left[ \hat{\sigma}^2 \right] \neq \sigma^2 \; , $$

more precisely:

$$ \label{eq:resvar-biasp} \mathrm{E}\left[ \hat{\sigma}^2 \right] = \frac{n-p}{n} \sigma^2 \; . $$
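
Before turning to the proof, the claimed bias can be illustrated empirically. Below is a minimal simulation sketch, assuming $V = I_n$ and arbitrary illustrative values of $n$, $p$, $\beta$ and $\sigma^2$ (all variable names are placeholders, not part of the theorem):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma2 = 50, 4, 2.5            # illustrative sample size, predictors, noise variance
X = rng.normal(size=(n, p))          # arbitrary fixed design matrix
V_inv = np.eye(n)                    # V = I_n for simplicity; any s.p.d. V would work
beta = rng.normal(size=p)            # arbitrary true regression coefficients

est = np.empty(10_000)
for i in range(est.size):
    y = X @ beta + np.sqrt(sigma2) * rng.normal(size=n)
    # ML estimates from the theorem: beta-hat by weighted least squares, then sigma^2-hat
    beta_hat = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
    r = y - X @ beta_hat
    est[i] = (r @ V_inv @ r) / n

print(est.mean())            # empirical mean of sigma^2-hat, close to ...
print((n - p) / n * sigma2)  # ... the theoretical value (n-p)/n * sigma2 = 2.3
```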

Proof:

  1. This follows from maximum likelihood estimation for multiple linear regression and is a special case of maximum likelihood estimation for the general linear model in which $Y = y$, $B = \beta$ and $\Sigma = \sigma^2$:

$$ \label{eq:sigma-mle-qed} \begin{split} \hat{\sigma}^2 &= \frac{1}{n} (Y-X\hat{B})^\mathrm{T} V^{-1} (Y-X\hat{B}) \\ &= \frac{1}{n} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) \; . \end{split} $$

  2. We know that the residual sum of squares, divided by the true noise variance, follows a chi-squared distribution:

$$ \label{eq:rss-dist} \begin{split} \frac{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}}{\sigma^2} &\sim \chi^2(n-p) \\ \text{where} \quad \hat{\varepsilon}^\mathrm{T} \hat{\varepsilon} &= (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) \; . \end{split} $$

Thus, combining \eqref{eq:rss-dist} and \eqref{eq:sigma-mle-qed}, we have:

$$ \label{eq:resvar-bias-s1} \frac{n \hat{\sigma}^2}{\sigma^2} \sim \chi^2(n-p) \; . $$
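
This distributional statement can itself be checked by simulation; the following is a sketch assuming $V = I_n$ and illustrative values, using scipy only to supply the reference $\chi^2(n-p)$ distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, sigma2 = 30, 3, 1.7                # illustrative values
X = rng.normal(size=(n, p))
P = X @ np.linalg.solve(X.T @ X, X.T)    # projection (hat) matrix for V = I_n
beta = rng.normal(size=p)

t = np.empty(5_000)
for i in range(t.size):
    y = X @ beta + np.sqrt(sigma2) * rng.normal(size=n)
    r = y - P @ y                        # residuals y - X beta-hat
    t[i] = (r @ r) / sigma2              # the statistic n * sigma^2-hat / sigma^2

# Kolmogorov-Smirnov test against chi^2(n-p); a large p-value indicates consistency
print(stats.kstest(t, stats.chi2(df=n - p).cdf))
```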

Using the relationship between the chi-squared distribution and the gamma distribution

$$ \label{eq:chi2-gam} X \sim \chi^2(k) \quad \Rightarrow \quad cX \sim \mathrm{Gam}\left( \frac{k}{2}, \frac{1}{2c} \right) \; , $$

we can deduce from \eqref{eq:resvar-bias-s1} that

$$ \label{eq:resvar-bias-s2} \hat{\sigma}^2 = \frac{\sigma^2}{n} \cdot \frac{n \hat{\sigma}^2}{\sigma^2} \sim \mathrm{Gam}\left( \frac{n-p}{2}, \frac{n}{2\sigma^2} \right) \; . $$
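
As a quick numerical sanity check of this scaling step (a sketch with arbitrary illustrative $k$ and $c$), the first two moments of $cX$ match those of $\mathrm{Gam}(k/2, 1/(2c))$:

```python
import numpy as np

rng = np.random.default_rng(0)
k, c = 7, 0.4                       # illustrative degrees of freedom and scaling constant
x = c * rng.chisquare(k, size=1_000_000)

a, b = k / 2, 1 / (2 * c)           # claimed Gam(a, b) shape and rate parameters
print(x.mean(), a / b)              # gamma mean a/b = c*k = 2.8
print(x.var(),  a / b**2)           # gamma variance a/b^2 = 2*c^2*k = 2.24
```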

Using the expected value of the gamma distribution

$$ \label{eq:gam-mean} X \sim \mathrm{Gam}(a,b) \quad \Rightarrow \quad \mathrm{E}(X) = \frac{a}{b} \; , $$

we can deduce from \eqref{eq:resvar-bias-s2} that

$$ \label{eq:resvar-bias-s3} \mathrm{E}\left[ \hat{\sigma}^2 \right] = \frac{\frac{n-p}{2}}{\frac{n}{2\sigma^2}} = \frac{n-p}{n} \sigma^2 $$

which proves the relationship given by \eqref{eq:resvar-biasp}.
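
For a concrete instance, the values $n = 50$, $p = 4$ and $\sigma^2 = 2.5$ from the simulation sketch above give

$$ \mathrm{E}\left[ \hat{\sigma}^2 \right] = \frac{50-4}{50} \cdot 2.5 = 2.3 \; , $$

an underestimation which vanishes as $n \to \infty$ for fixed $p$.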