---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2022-12-21 06:15:00 -0800

title: "Maximum likelihood estimator of variance in multiple linear regression is biased"
chapter: "Model Selection"
section: "Goodness-of-fit measures"
topic: "Residual variance"
theorem: "Maximum likelihood estimator is biased (p > 1)"

sources:
  - authors: "ocram"
    year: 2022
    title: "Why is RSS distributed chi square times n-p?"
    in: "StackExchange CrossValidated"
    pages: "retrieved on 2022-12-21"
    url: "https://stats.stackexchange.com/a/20230"

proof_id: "P398"
shortcut: "resvar-biasp"
username: "JoramSoch"
---

Theorem: Consider a linear regression model with known $n \times p$ design matrix $X$, known $n \times n$ covariance structure $V$, unknown regression parameters $\beta$ and unknown noise variance $\sigma^2$:

$$ \label{eq:mlr} y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V) \; . $$

Then, the maximum likelihood estimator of $\sigma^2$ is

$$ \label{eq:sigma-mle} \hat{\sigma}^2 = \frac{1}{n} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) $$

where

$$ \label{eq:beta-mle} \hat{\beta} = (X^\mathrm{T} V^{-1} X)^{-1} X^\mathrm{T} V^{-1} y $$

and $\hat{\sigma}^2$ is a biased estimator of $\sigma^2$

$$ \label{eq:resvar-var} \mathrm{E}\left[ \hat{\sigma}^2 \right] \neq \sigma^2 \; , $$

more precisely:

$$ \label{eq:resvar-biasp} \mathrm{E}\left[ \hat{\sigma}^2 \right] = \frac{n-p}{n} \sigma^2 \; . $$

Proof:

The form of the estimator in \eqref{eq:sigma-mle} follows from maximum likelihood estimation for multiple linear regression; it is a special case of maximum likelihood estimation for the general linear model in which $Y = y$, $B = \beta$ and $\Sigma = \sigma^2$:

$$ \label{eq:sigma-mle-qed} \begin{split} \hat{\sigma}^2 &= \frac{1}{n} (Y-X\hat{B})^\mathrm{T} V^{-1} (Y-X\hat{B}) \\ &= \frac{1}{n} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) \; . \end{split} $$

We know that the residual sum of squares, divided by the true noise variance, follows a chi-squared distribution with $n-p$ degrees of freedom:

$$ \label{eq:rss-dist} \begin{split} \frac{\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon}}{\sigma^2} &\sim \chi^2(n-p) \\ \text{where} \quad \hat{\varepsilon}^\mathrm{T} \hat{\varepsilon} &= (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) \; . \end{split} $$

Thus, combining \eqref{eq:rss-dist} and \eqref{eq:sigma-mle-qed}, we have:

$$ \label{eq:resvar-bias-s1} \frac{n \hat{\sigma}^2}{\sigma^2} \sim \chi^2(n-p) \; . $$
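The distributional claim in \eqref{eq:resvar-bias-s1} can be illustrated with a small Monte Carlo simulation. The following is a minimal sketch, not part of the proof, and it rests on assumptions that do not come from the theorem: $V = I_n$, $p = 2$ (intercept and slope), a fixed regressor $x = 0, 1, \ldots, n-1$, and arbitrarily chosen true values $\beta = (1, 0.5)$ and $\sigma^2 = 4$. The function name `simulate_sigma2_mle` is hypothetical. If the claim holds, the scaled estimates should have mean close to $n-p$ and variance close to $2(n-p)$, and the unscaled estimates should average to about $\frac{n-p}{n} \sigma^2$.

```python
import random
import statistics


def simulate_sigma2_mle(n=10, sigma2=4.0, n_sim=20000, seed=1):
    """Simulate the ML variance estimate in simple linear regression.

    Illustrative assumptions (not from the theorem): V = I_n, p = 2,
    regressor x = 0, 1, ..., n-1, and true beta = (1, 0.5).
    """
    rng = random.Random(seed)
    p = 2
    x = [float(i) for i in range(n)]
    x_bar = statistics.fmean(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    sigma = sigma2 ** 0.5

    estimates = []
    for _ in range(n_sim):
        y = [1.0 + 0.5 * xi + rng.gauss(0.0, sigma) for xi in x]
        y_bar = statistics.fmean(y)
        # With V = I_n, the ML estimate of beta is ordinary least squares.
        slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
        intercept = y_bar - slope * x_bar
        rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
        estimates.append(rss / n)  # the ML estimate divides by n, not n - p
    return p, estimates


n, sigma2 = 10, 4.0
p, estimates = simulate_sigma2_mle(n=n, sigma2=sigma2)
mean_mle = statistics.fmean(estimates)
scaled = [n * s / sigma2 for s in estimates]  # n * sigma2_hat / sigma2

print(f"mean of sigma2_hat:          {mean_mle:.3f} (theory: {(n - p) / n * sigma2:.3f})")
print(f"mean of n*sigma2_hat/sigma2: {statistics.fmean(scaled):.3f} (theory: {n - p})")
print(f"var  of n*sigma2_hat/sigma2: {statistics.variance(scaled):.3f} (theory: {2 * (n - p)})")
```

The simulated values fluctuate around the theoretical ones; agreement of the first two moments is consistent with, but does not by itself establish, the chi-squared distribution.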

Using the relationship between the chi-squared distribution and the gamma distribution

$$ \label{eq:chi2-gam} X \sim \chi^2(k) \quad \Rightarrow \quad cX \sim \mathrm{Gam}\left( \frac{k}{2}, \frac{1}{2c} \right) \; , $$

we can deduce from \eqref{eq:resvar-bias-s1} that

$$ \label{eq:resvar-bias-s2} \hat{\sigma}^2 = \frac{\sigma^2}{n} \cdot \frac{n \hat{\sigma}^2}{\sigma^2} \sim \mathrm{Gam}\left( \frac{n-p}{2}, \frac{n}{2\sigma^2} \right) \; . $$
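The scaling relationship \eqref{eq:chi2-gam} uses the shape-rate parametrization of the gamma distribution, which is easy to mix up with the shape-scale form. As a hedged numerical check, the sketch below compares draws of $cX$ with $X \sim \chi^2(k)$ against direct gamma draws with shape $k/2$ and rate $1/(2c)$. The values $k = 8$ and $c = 0.4$ are illustrative assumptions (they correspond to $n = 10$, $p = 2$, $\sigma^2 = 4$), and the helper `scaled_chi2_samples` is hypothetical.

```python
import random
import statistics


def scaled_chi2_samples(k, c, n_samples, rng):
    """Draw c*X with X ~ chi^2(k), built as a sum of k squared standard normals."""
    return [
        c * sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        for _ in range(n_samples)
    ]


rng = random.Random(0)
k, c = 8, 0.4            # illustrative: k = n - p = 8, c = sigma^2 / n = 4 / 10
n_samples = 50000

a, b = k / 2, 1 / (2 * c)  # shape and rate of the gamma distribution
scaled = scaled_chi2_samples(k, c, n_samples, rng)
# random.gammavariate expects shape and SCALE, so the rate b enters as 1 / b.
gamma = [rng.gammavariate(a, 1 / b) for _ in range(n_samples)]

print(f"theory:      mean = {a / b:.3f}, var = {a / b**2:.3f}")
print(f"c * chi2(k): mean = {statistics.fmean(scaled):.3f}, var = {statistics.variance(scaled):.3f}")
print(f"Gam(a, b):   mean = {statistics.fmean(gamma):.3f}, var = {statistics.variance(gamma):.3f}")
```

Both samples should show approximately the same mean $a/b$ and variance $a/b^2$, up to simulation noise.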

Using the expected value of the gamma distribution

$$ \label{eq:gam-mean} X \sim \mathrm{Gam}(a,b) \quad \Rightarrow \quad \mathrm{E}(X) = \frac{a}{b} \; , $$

we can deduce from \eqref{eq:resvar-bias-s2} that

$$ \label{eq:resvar-bias-s3} \mathrm{E}\left[ \hat{\sigma}^2 \right] = \frac{\frac{n-p}{2}}{\frac{n}{2\sigma^2}} = \frac{n-p}{n} \sigma^2 $$

which proves the relationship given by \eqref{eq:resvar-biasp}.
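The last step is plain arithmetic and can be reproduced exactly. The sketch below evaluates the gamma mean $a/b$ with $a = \frac{n-p}{2}$ and $b = \frac{n}{2\sigma^2}$ using rational numbers and compares it with $\frac{n-p}{n} \sigma^2$. The function name `expected_sigma2_mle` and the values $p = 3$, $\sigma^2 = 2$ are illustrative assumptions.

```python
from fractions import Fraction


def expected_sigma2_mle(n, p, sigma2):
    """Mean a/b of Gam(a, b) with a = (n - p)/2 and b = n/(2*sigma2)."""
    a = Fraction(n - p, 2)
    b = Fraction(n) / (2 * Fraction(sigma2))
    return a / b


p, sigma2 = 3, 2  # illustrative values
for n in (5, 10, 100, 1000):
    expected = expected_sigma2_mle(n, p, sigma2)
    # The gamma mean must agree exactly with (n - p)/n * sigma^2.
    assert expected == Fraction(n - p, n) * sigma2
    print(f"n = {n:4d}: E[sigma2_hat] = {expected} = {float(expected):.4f}")
```

With $p$ held fixed, the expected value approaches $\sigma^2$ as $n$ grows, so the bias factor $\frac{n-p}{n}$ matters most in small samples.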
