# Unbiased estimator and consistent estimator

[Back to index](https://shotahorii.github.io/math-for-ds/)

---

## Table of contents
1. **Unbiased estimator**  
1.1. Definition  
1.2. Examples  
1.3. Asymptotically unbiased estimator  
2. **Consistent estimator**  
2.1. Definition  
2.2. Examples 

---

## 1. Unbiased estimator
### 1.1. Definition

An estimator $\hat{\theta}$ is an unbiased estimator of $\theta$ when the following holds.

$E[\hat{\theta}] = \theta$

Where $\theta$ is the true parameter.

---

### 1.2. Examples

Assume: the samples $X_i$ ($i=1,2,...,n$) are i.i.d. with true mean $\mu$ and true variance $\sigma^2$.

**Sample mean** $\bar{X}$ : An example of an unbiased estimator

$\bar{X} = \frac{X_1+X_2+...+X_n}{n} = \frac{1}{n}\sum_{i=1}^nX_i$

$E[\bar{X}] = E[\frac{1}{n}\sum_{i=1}^nX_i] =\frac{1}{n}\sum_{i=1}^nE[X_i] = \frac{1}{n}\sum_{i=1}^n\mu = \frac{1}{n}n\mu = \mu$
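The result above can be checked numerically. The following is a minimal Monte Carlo sketch, assuming normally distributed samples; the values of `mu`, `sigma`, `n`, and `n_trials` are illustrative choices and do not come from the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 5.0, 2.0, 10   # illustrative true mean, std. dev., sample size
n_trials = 100_000            # number of independent repetitions

# Each row is one sample of size n; compute the sample mean of every row.
samples = rng.normal(mu, sigma, size=(n_trials, n))
sample_means = samples.mean(axis=1)

# The average of the sample means approximates E[X-bar], which should be near mu.
mean_of_sample_means = sample_means.mean()
print(mean_of_sample_means)
```

The printed value should be close to `mu`, even though each individual sample mean fluctuates around it.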


**Sample variance** $S^2$ : An example of a **biased** estimator

$S^2 = \frac{1}{n}\sum_{i=1}^n(X_i-\bar{X})^2$

$E[S^2] = E[\frac{1}{n}\sum_{i=1}^n(X_i-\bar{X})^2]=\frac{1}{n}E[\sum_{i=1}^n(X_i-\bar{X})^2]$

$=\frac{1}{n}E[\sum_{i=1}^n\{(X_i-\mu)-(\bar{X}-\mu)\}^2]$

$=\frac{1}{n}E[\sum_{i=1}^n\{(X_i-\mu)^2-2(X_i-\mu)(\bar{X}-\mu)+(\bar{X}-\mu)^2\}]$

$=\frac{1}{n}E[\sum_{i=1}^n(X_i-\mu)^2-2(\bar{X}-\mu)\sum_{i=1}^n(X_i-\mu)+n(\bar{X}-\mu)^2]$

Note: $\sum_{i=1}^n(X_i-\mu) = X_1+X_2+...+X_n - n\mu = n(\bar{X}-\mu)$

$=\frac{1}{n}E[\sum_{i=1}^n(X_i-\mu)^2-2n(\bar{X}-\mu)^2+n(\bar{X}-\mu)^2]
=\frac{1}{n}E[\sum_{i=1}^n(X_i-\mu)^2-n(\bar{X}-\mu)^2]$

$=\frac{1}{n}\sum_{i=1}^nE[(X_i-\mu)^2]-E[(\bar{X}-\mu)^2]$

Note: $E[(X_i-\mu)^2] = E[(X_i-E[X_i])^2] = V[X_i]$

Note: $E[(\bar{X}-\mu)^2] = E[(\bar{X}-E[\bar{X}])^2] = V[\bar{X}]$

$=\frac{1}{n}\sum_{i=1}^nV[X_i]-V[\bar{X}]$

Note: $V[X_i] = \sigma^2$

$=\frac{1}{n}n\sigma^2-V[\bar{X}]=\sigma^2-V[\bar{X}]$

Note: $V[\bar{X}] = V[\frac{1}{n}\sum_{i=1}^nX_i] = \frac{1}{n^2}V[\sum_{i=1}^nX_i]
= \frac{1}{n^2}\sum_{i=1}^nV[X_i] = \frac{1}{n^2}n\sigma^2 = \frac{\sigma^2}{n}$

$=\sigma^2-\frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2$

Hence $E[S^2] = \frac{n-1}{n}\sigma^2 \ne \sigma^2$, so the sample variance $S^2$ is a biased estimator of $\sigma^2$.
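As a rough numerical check of $E[S^2] = \frac{n-1}{n}\sigma^2$, the sketch below averages the divide-by-$n$ variance over many simulated samples. It assumes normal data and illustrative parameter values; `ddof=0` is the NumPy setting that divides by $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 0.0, 2.0, 5    # illustrative values; true variance is sigma**2 = 4.0
n_trials = 200_000

samples = rng.normal(mu, sigma, size=(n_trials, n))

# ddof=0 divides the sum of squared deviations by n (the biased estimator).
biased_vars = samples.var(axis=1, ddof=0)
mean_biased = biased_vars.mean()

# Value predicted by the derivation: (n - 1) / n * sigma^2
predicted = (n - 1) / n * sigma**2
print(mean_biased, predicted, sigma**2)
```

The simulated average should land near `predicted` and visibly below `sigma**2`, which is the underestimation the derivation describes.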

**Unbiased sample variance** $S^2$ (now defined with divisor $n-1$) : An example of an unbiased estimator

$S^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2$

$E[S^2] = \frac{1}{n-1}E[\sum_{i=1}^n(X_i-\bar{X})^2]$

Note: use the calculation above.

$= \frac{1}{n-1}E[\sum_{i=1}^n(X_i-\mu)^2-n(\bar{X}-\mu)^2]$

$=\frac{1}{n-1}\sum_{i=1}^nE[(X_i-\mu)^2]-\frac{n}{n-1}E[(\bar{X}-\mu)^2]$

$=\frac{n}{n-1}\sigma^2-\frac{n}{n-1}\frac{\sigma^2}{n}$

$=\frac{n-1}{n-1}\sigma^2=\sigma^2$
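A short simulation can illustrate that dividing by $n-1$ removes the bias. This is a sketch under the assumption of normal data with illustrative parameters; `ddof=1` is the NumPy setting that divides by $n-1$.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma, n = 0.0, 2.0, 5    # illustrative values; true variance is sigma**2 = 4.0
n_trials = 200_000

samples = rng.normal(mu, sigma, size=(n_trials, n))

# ddof=1 divides the sum of squared deviations by n - 1 (the unbiased estimator).
unbiased_vars = samples.var(axis=1, ddof=1)
mean_unbiased = unbiased_vars.mean()

print(mean_unbiased, sigma**2)
```

The average of the estimates should be close to `sigma**2`, up to Monte Carlo noise.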

---

### 1.3. Asymptotically unbiased estimator

As seen above, an estimator $\hat{\theta}$ is unbiased when $E[\hat{\theta}] = \theta$. In general, the bias of an estimator is defined as follows.

$bias(\hat{\theta}) = E[\hat{\theta}] - \theta$

Where $\theta$ is the true parameter.

An estimator $\hat{\theta}$ is an **asymptotically unbiased estimator** when the following holds.

$\lim\limits_{n \to \infty} bias(\hat{\theta}) = 0$

**Example**  
Assume: $n$ samples $X_i$ ($i=1,2,...,n$) are sampled from a continuous uniform distribution $U(0,\theta)$.

Consider an estimator $\hat{\theta} = \max_i X_i$, then $E[\hat{\theta}] = \frac{n}{n+1}\theta$.
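This expectation can be derived in a few steps from the distribution of the maximum of i.i.d. samples. For $0 \le x \le \theta$, the maximum is at most $x$ exactly when every sample is at most $x$:

$Pr(\hat{\theta} \le x) = \prod_{i=1}^n Pr(X_i \le x) = \left(\frac{x}{\theta}\right)^n$

Differentiating with respect to $x$ gives the density $f(x) = \frac{n x^{n-1}}{\theta^n}$ on $[0, \theta]$, so

$E[\hat{\theta}] = \int_0^\theta x \frac{n x^{n-1}}{\theta^n} dx = \frac{n}{\theta^n} \cdot \frac{\theta^{n+1}}{n+1} = \frac{n}{n+1}\theta$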

The bias of this estimator is then as follows.

$bias(\hat{\theta}) = E[\hat{\theta}] - \theta = \frac{n}{n+1}\theta - \theta = -\frac{\theta}{n+1}$

So, this is **NOT** an unbiased estimator as $bias(\hat{\theta}) \ne 0$. However, this is an asymptotically unbiased estimator as $\lim\limits_{n \to \infty} bias(\hat{\theta}) = 0$.
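The shrinking bias can be observed in simulation. The sketch below compares the empirical bias of the sample maximum with $-\frac{\theta}{n+1}$ for a few sample sizes; `theta`, the sample sizes, and `n_trials` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 3.0          # illustrative true parameter of U(0, theta)
n_trials = 10_000    # repetitions per sample size

empirical_bias = {}
for n in (1, 5, 50, 500):
    # Each row is one sample of size n; the estimator is the row maximum.
    samples = rng.uniform(0.0, theta, size=(n_trials, n))
    theta_hat = samples.max(axis=1)

    empirical_bias[n] = theta_hat.mean() - theta
    theoretical_bias = -theta / (n + 1)
    print(n, empirical_bias[n], theoretical_bias)
```

The empirical bias stays negative for every $n$ but moves toward 0 as $n$ grows, in line with the formula.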

**Reference**: [Distribution of the maximum and minimum (general theory and examples)](https://mathtrain.jp/orderbunpu) (in Japanese)

---

## 2. Consistent estimator
### 2.1. Definition

An estimator $\hat{\theta}_n$ computed from $n$ samples is a consistent estimator when the following holds.

$\forall \varepsilon > 0, \lim\limits_{n \to \infty}Pr(|\hat{\theta}_n - \theta| < \varepsilon) = 1$

Where $\theta$ is the true parameter.

This is convergence in probability, which can also be written as below.

$\mathrm{plim}_{n \to \infty}\hat{\theta}_n = \theta$

---

### 2.2. Examples

Assume: the samples $X_i$ ($i=1,2,...,n$) are i.i.d. with true mean $\mu$ and true variance $\sigma^2$.

**Sample mean** $\bar{X}$ : An example of a consistent estimator

$\bar{X} = \frac{X_1+X_2+...+X_n}{n} = \frac{1}{n}\sum_{i=1}^nX_i$

$\forall \varepsilon > 0, \lim\limits_{n \to \infty}Pr(|\bar{X} - \mu| < \varepsilon) = 1$

This is exactly what the Weak Law of Large Numbers states.
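To see this convergence numerically, the sketch below estimates $Pr(|\bar{X} - \mu| < \varepsilon)$ by simulation for increasing $n$. It assumes normal data; `mu`, `sigma`, `eps`, and the sample sizes are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 5.0, 2.0   # illustrative true mean and std. dev.
eps = 0.5              # illustrative tolerance epsilon
n_trials = 5_000       # repetitions per sample size

coverage = {}
for n in (10, 100, 1000):
    samples = rng.normal(mu, sigma, size=(n_trials, n))
    x_bar = samples.mean(axis=1)

    # Fraction of repetitions in which the sample mean falls within eps of mu.
    coverage[n] = np.mean(np.abs(x_bar - mu) < eps)
    print(n, coverage[n])
```

The estimated probability increases toward 1 as $n$ grows, which is the behaviour the definition of consistency requires.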

**An example of an estimator that is NOT consistent**

$\hat{\mu} = \frac{X_1+X_n}{2}$

First, this estimator is unbiased, as shown below.

$E[\hat{\mu}] = E[\frac{X_1+X_n}{2}] = \frac{1}{2}(E[X_1]+E[X_n]) = \frac{1}{2}2\mu = \mu$

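However, this is not a consistent estimator. Since $X_1$ and $X_n$ are independent (for $n \ge 2$),

$V[\hat{\mu}] = V[\frac{X_1+X_n}{2}] = \frac{1}{4}(V[X_1]+V[X_n])=\frac{1}{4}2\sigma^2 = \frac{\sigma^2}{2}$

Hence, no matter how large $n$ becomes, the variance does not shrink to 0.

$\lim\limits_{n \to \infty} V[\hat{\mu}] = \frac{\sigma^2}{2}$

A non-vanishing variance does not rule out consistency in general, but here the distribution of $\hat{\mu}$ does not depend on $n$ at all, so $Pr(|\hat{\mu} - \mu| < \varepsilon)$ is the same for every $n \ge 2$. Because $V[\hat{\mu}] > 0$ (assuming $\sigma^2 > 0$), $\hat{\mu}$ does not equal $\mu$ with probability 1, so this probability is below 1 for some $\varepsilon$. So,

$\exists \varepsilon > 0, \lim\limits_{n \to \infty}Pr(|\hat{\mu} - \mu| < \varepsilon) \ne 1$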
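The lack of convergence can also be observed in simulation. The sketch below estimates $Pr(|\hat{\mu} - \mu| < \varepsilon)$ for $\hat{\mu} = \frac{X_1+X_n}{2}$ at several sample sizes, assuming normal data and illustrative values for `mu`, `sigma`, and `eps`.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 5.0, 2.0   # illustrative true mean and std. dev.
eps = 0.5              # illustrative tolerance epsilon
n_trials = 5_000       # repetitions per sample size

coverage = {}
for n in (2, 100, 1000):
    samples = rng.normal(mu, sigma, size=(n_trials, n))

    # The estimator uses only the first and the last observation of each sample.
    mu_hat = (samples[:, 0] + samples[:, -1]) / 2

    # Fraction of repetitions in which the estimate falls within eps of mu.
    coverage[n] = np.mean(np.abs(mu_hat - mu) < eps)
    print(n, coverage[n])
```

Unlike the sample mean, the estimated probability stays at roughly the same level, well below 1, however many samples are drawn.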