# Sufficiency and Unbiasedness

## Rao-Blackwell Theorem
- The Rao-Blackwell Theorem is based on the laws of iterated expectation and total variance:

    $E(X)=E(E(X|Y))$

    $Var(X)=Var(E(X|Y))+E(Var(X|Y))$

    - $E(X|Y)$ is a function of $Y$
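
Both identities can be checked exactly on a small example. The sketch below uses a made-up joint pmf for $(X, Y)$ (the numbers are an assumption chosen only for illustration) and exact rational arithmetic, so the two sides can be compared with `==`.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y); the values are illustrative only.
joint = {
    (0, 0): F(1, 8), (1, 0): F(3, 8),
    (0, 1): F(1, 4), (2, 1): F(1, 4),
}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# Marginal pmfs of X and Y
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

# E(X | Y = y) and Var(X | Y = y): both are functions of y
cond_mean, cond_var = {}, {}
for y, py in p_y.items():
    cond = {x: p / py for (x, yy), p in joint.items() if yy == y}
    cond_mean[y] = mean(cond)
    cond_var[y] = var(cond)

e_of_cond_mean = sum(cond_mean[y] * p_y[y] for y in p_y)
var_of_cond_mean = sum((cond_mean[y] - e_of_cond_mean) ** 2 * p_y[y] for y in p_y)
e_of_cond_var = sum(cond_var[y] * p_y[y] for y in p_y)

total_expectation_holds = mean(p_x) == e_of_cond_mean
total_variance_holds = var(p_x) == var_of_cond_mean + e_of_cond_var
```

Because $E(Var(X|Y))\geq 0$, the second identity already shows $Var(E(X|Y))\leq Var(X)$, which is the inequality the theorem relies on.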

**Rao-Blackwell Theorem:**

Let $W$ be any unbiased estimator of $\tau(\theta)$.

Let $T$ be a sufficient statistic for $\theta$.

Let $\phi(T)=E(W|T)$.

Then $E(\phi(T))=\tau(\theta)$ and $Var(\phi(T))\leq Var(W)$ for all $\theta$, 

that is, $\phi(T)$ is a uniformly better unbiased estimator of $\tau(\theta)$.

Note: Rao-Blackwell only produces an unbiased estimator $\phi(T)=E(W|T)$ that is better than $W$ (smaller variance); it does not by itself produce the best one.

**Proof:**

$E(W)=\tau(\theta)$

$E(\phi(T))=E(E(W|T))=E(W)=\tau(\theta)$

So $\phi(T)$ is also an unbiased estimator of $\tau(\theta)$.

$Var(\phi(T))=Var(E(W|T))=Var(W)-E(Var(W|T))$

- $Var(W|T)\geq 0\Rightarrow E(Var(W|T))\geq 0$
    - $Var(\phi(T))\leq Var(W)$ for all $\theta$

**Furthermore**, $T$ is a sufficient statistic, so $f(\mathbf{X}|T)$ does not depend on $\theta$.

$W$ is a function of $\mathbf{X}$, so $f(W|T)$ does not depend on $\theta$ either.

Thus, $E(W|T)$ does not depend on $\theta$.

Therefore, $\phi(T)=E(W|T)$ is a function of $T$ only and does not depend on $\theta$, so it is a legitimate estimator.

So conditioning any unbiased estimator on a sufficient statistic gives an unbiased estimator that is at least as good.
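
A small simulation can make the variance reduction concrete. The sketch below assumes iid $Poisson(\theta)$ data with $W=X_1$ and $T=\sum_{i=1}^n X_i$, in which case $E(X_1|T)=T/n$ by symmetry. The parameter values, the seed, and the helper names `poisson_draw` and `simulate` are illustrative assumptions, not part of the notes.

```python
import math
import random
from statistics import fmean, pvariance

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate(theta=2.0, n=5, reps=20000, seed=0):
    rng = random.Random(seed)
    w_vals, phi_vals = [], []
    for _ in range(reps):
        sample = [poisson_draw(rng, theta) for _ in range(n)]
        w_vals.append(sample[0])          # W = X_1, unbiased for theta
        phi_vals.append(sum(sample) / n)  # phi(T) = E(X_1 | T) = T / n
    return fmean(w_vals), pvariance(w_vals), fmean(phi_vals), pvariance(phi_vals)

mean_w, var_w, mean_phi, var_phi = simulate()
```

Both sample means should land near $\theta$, while the sample variance of $\phi(T)$ should be roughly $\theta/n$ against roughly $\theta$ for $W$.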

## Sufficiency and unbiased estimators

- The Rao-Blackwell Theorem means that in our search for best unbiased estimators we only need to consider functions of sufficient statistics!
    - Narrows the search!

**Theorem 7.3.18**

If $\phi(T)$ is a best unbiased estimator of $\tau(\theta)$, then $\phi(T)$ is unique.
- Any other unbiased estimator will have a larger variance than it.

**Theorem 7.3.20**

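If $E(\phi(T))=\tau(\theta)$, then $\phi(T)$ is a best unbiased estimator of $\tau(\theta)$ *if and only if* $\phi(T)$ is uncorrelated with all unbiased estimators of 0.
- If $T$ is a complete sufficient statistic, the only unbiased estimator of 0 that is a function of $T$ is 0 itself, which is why completeness is the key condition in the Lehmann-Scheffé theorem.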

**Theorem 7.3.23 - Lehmann-Scheffé (LS)**

Let **$T$ be a complete sufficient statistic for a parameter $\theta$**, and let $\phi(T)$ be any estimator based only on $T$.

Then $\phi(T)$ is the unique best unbiased estimator of its expected value $\big(E(\phi(T))=\tau(\theta)\big)$.

Note: $\phi(T)=E(W|T)$ is the $UMVUE$ of $\tau(\theta)$ only once $T$ is complete as well as sufficient.

### Using Rao-Blackwell + LS to find UMVUE

Say:
- want to find a UMVUE of $\tau(\theta)$
- have a complete sufficient statistic $T$

If $E(T)=\tau(\theta)$, then by LS, $T$ is the UMVUE of $\tau(\theta)$.

If $E(T)\neq\tau(\theta)$, then
1. Sometimes there is an obvious transformation $\phi(T)$ that gives $E(\phi(T))=\tau(\theta)$.
2. If not, we can use Rao-Blackwell to find $\phi(T)$: take some simple unbiased estimator $W$ and set $\phi(T)=E(W|T)$.

In either case, by LS, $\phi(T)$ is the UMVUE of $\tau(\theta)$.

### Example:
- Say $X_1,X_2,...,X_n$ are iid $Poisson(\theta)$. Show that $\bar{X}$ is the best unbiased estimator of $\theta$.

    As seen before, $T=\sum_{i=1}^n X_i$ is a complete sufficient statistic for $\theta$.

    $E(T)=\sum_{i=1}^n E(X_i)=n\theta\neq\theta$

    but $E(\frac{T}{n})=E(\bar{X})=\theta$

    $\bar{X}$ is a function of the complete sufficient statistic $T$ only.

    Therefore, $\bar{X}$ is the $UMVUE$ of $\theta$, by LS.
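
The sample variance $S^2$ is also unbiased for $\theta$ here, since a $Poisson(\theta)$ variable has variance $\theta$, so it makes a natural competitor to compare against $\bar{X}$. The simulation below is a rough sketch of that comparison; $S^2$, the parameter values, the seed, and the helper names `poisson_draw` and `compare` are assumptions introduced for illustration.

```python
import math
import random
from statistics import fmean, pvariance, variance

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def compare(theta=2.0, n=10, reps=5000, seed=1):
    rng = random.Random(seed)
    xbars, s2s = [], []
    for _ in range(reps):
        sample = [poisson_draw(rng, theta) for _ in range(n)]
        xbars.append(fmean(sample))            # sample mean
        s2s.append(float(variance(sample)))    # sample variance S^2 (divides by n - 1)
    return fmean(xbars), pvariance(xbars), fmean(s2s), pvariance(s2s)

mean_xbar, var_xbar, mean_s2, var_s2 = compare()
```

Both estimators should average close to $\theta$, but the spread of $S^2$ across replications should be visibly larger than that of $\bar{X}$, consistent with $\bar{X}$ being the UMVUE.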

### Example:
- Say $X_1,X_2,...,X_n$ are iid $Poisson(\theta)$. Find the $UMVUE$ of

    $\begin{aligned}
    P(X_i \leq 1) 
    &= P(X_i = 0) + P(X_i = 1) \\
    &= \frac{e^{-\theta} \theta^0}{0!} + \frac{e^{-\theta} \theta^1}{1!} \\
    &= e^{-\theta} + \theta e^{-\theta} = (1 + \theta)e^{-\theta} = \tau(\theta) \\
    \end{aligned}$

    As before, $T=\sum_{i=1}^n X_i$ is a complete sufficient statistic for $\theta$.

    No function $\phi(T)$ with $E(\phi(T))=\tau(\theta)$ is obvious here, so let's use Rao-Blackwell.

    Consider: 
    
    $W(\mathbf{X})=
    \begin{cases}
    1 &X_1\leq 1\\
    0 &Otherwise
    \end{cases}$

    $W\sim Bernoulli(p)$, where $p=P(W=1)=P(X_1 \leq 1)=(1 + \theta)e^{-\theta}$

    $E(W)=p=(1 + \theta)e^{-\theta}$, so $W$ is an unbiased estimator of $\tau(\theta)$.

    By the Rao-Blackwell Theorem,

    $\begin{aligned}
    \phi(T)
    &=E(W|T)\\
    &=\sum_w w\cdot P(W=w|T=t)\\
    &=0\cdot P(W=0|T=t)+1\cdot P(W=1|T=t)\\
    &=P(W=1|T=t)\\
    &=P(X_1\leq 1|\sum_{i=1}^n X_i=t)\\
    &=\frac{P(X_1\leq 1,\sum_{i=1}^n X_i=t)}{P(\sum_{i=1}^n X_i=t)}\\
    &=\frac{P(X_1= 0)\cdot P(\sum_{i=2}^n X_i=t) + P(X_1= 1)\cdot P(\sum_{i=2}^n X_i=t-1)}{P(\sum_{i=1}^n X_i=t)}\\
    \end{aligned}$

    $T\sim Poisson(n\theta)\Rightarrow P(T=t)=\frac{e^{-n\theta}(n\theta)^t}{t!}$

    $\sum_{i=2}^n X_i\sim Poisson((n-1)\theta)$, so

    $P(\sum_{i=2}^n X_i=t)=\frac{e^{-(n-1)\theta}((n-1)\theta)^t}{t!}$

    $P(\sum_{i=2}^n X_i=t-1)=\frac{e^{-(n-1)\theta}((n-1)\theta)^{(t-1)}}{(t-1)!}$

    $\begin{aligned}
    \phi(T)
    &=\frac{\frac{e^{-\theta} \theta^0}{0!}\cdot\frac{e^{-(n-1)\theta}((n-1)\theta)^t}{t!}+\frac{e^{-\theta} \theta^1}{1!}\cdot\frac{e^{-(n-1)\theta}((n-1)\theta)^{(t-1)}}{(t-1)!}}{\frac{e^{-n\theta}(n\theta)^t}{t!}}\\
    &=\frac{(n-1)^t+t(n-1)^{t-1}}{n^t}\\
    &=\frac{(n-1)^{t-1}(t+n-1)}{n^t}\\
    &=\frac{(n-1)^{n\bar{x}-1}(n\bar{x}+n-1)}{n^{n\bar{x}}}
    \end{aligned}$
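
The closed form can be sanity-checked numerically. The sketch below sums $\phi(t)P(T=t)$ over the $Poisson(n\theta)$ pmf to confirm that $E(\phi(T))$ matches $(1+\theta)e^{-\theta}$, and compares $Var(\phi(T))$ with the variance $p(1-p)$ of the indicator $W$. The function names, the truncation point `t_max`, and the values of $\theta$ and $n$ are illustrative assumptions.

```python
import math

def phi(t, n):
    """Rao-Blackwellized estimator of P(X_i <= 1), as a function of t = sum of the x_i."""
    if t == 0:
        return 1.0  # X_1 <= 1 is certain when every observation is zero
    return (n - 1) ** (t - 1) * (t + n - 1) / n ** t

def poisson_pmf(t, lam):
    # Computed on the log scale to avoid overflow in lam**t and t!
    return math.exp(-lam + t * math.log(lam) - math.lgamma(t + 1))

def moments_of_phi(theta, n, t_max=200):
    """Mean and variance of phi(T) for T ~ Poisson(n * theta), truncating the sum at t_max."""
    lam = n * theta
    probs = [poisson_pmf(t, lam) for t in range(t_max + 1)]
    mean = sum(phi(t, n) * p for t, p in enumerate(probs))
    second = sum(phi(t, n) ** 2 * p for t, p in enumerate(probs))
    return mean, second - mean ** 2

theta, n = 1.5, 4
tau = (1 + theta) * math.exp(-theta)   # target tau(theta) = P(X_i <= 1)
mean_phi, var_phi = moments_of_phi(theta, n)
var_w = tau * (1 - tau)                # variance of the Bernoulli indicator W
```

The truncated sum should agree with `tau` up to floating-point error, and `var_phi` should come out smaller than `var_w`, as Rao-Blackwell predicts.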


## Best unbiased estimators - Summary

- Let $W$ be an unbiased estimator of a parameter $\theta$
- Then $MSE(W)=Var(W)$
- In general, we want our unbiased estimators to have low variance (and hence low MSE)

    **Best unbiased estimator = Uniformly minimum variance unbiased estimator (UMVUE)**
- The estimator that has the smallest variance in the set of all unbiased estimators
- Section 7.3 is mostly about different theoretical tricks to see if we have a best unbiased estimator

### Trick 1: Cramer-Rao Lower bound

- Puts a lower bound on the variance of all estimators, given that some regularity conditions hold

- Our estimators can’t have a smaller variance than the Cramer-Rao lower bound (CRLB)

- So if our unbiased estimator has a variance that is equal to CRLB we know that it is the best unbiased estimator

**Efficient estimators**

An estimator $W$ is called an efficient estimator if its variance equals the CRLB
- Note that a best unbiased estimator is not necessarily efficient
- The CRLB may not be attainable
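
As a worked check of efficiency in the Poisson setting used earlier in these notes (a sketch, assuming the usual regularity conditions and writing $I(\theta)$ for the Fisher information of a single observation, a symbol not used above):

$\begin{aligned}
\log f(x|\theta) &= -\theta + x\log\theta - \log x! \\
\frac{\partial^2}{\partial\theta^2}\log f(x|\theta) &= -\frac{x}{\theta^2} \\
I(\theta) &= -E\left(-\frac{X}{\theta^2}\right)=\frac{1}{\theta} \\
CRLB &= \frac{1}{nI(\theta)}=\frac{\theta}{n}=Var(\bar{X})
\end{aligned}$

So for iid $Poisson(\theta)$ data, $\bar{X}$ attains the bound and is an efficient estimator of $\theta$.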

### Trick 2: Complete statistic

All we need is completeness:

**Lehmann-Scheffé**

Let $T$ be a complete sufficient statistic for a parameter $\theta$, and let $\phi(T)$ be any estimator based only on $T$. Then $\phi(T)$ is the unique best unbiased estimator of its expected value.

- Rao-Blackwell + Lehmann-Scheffé:
    - Find a complete sufficient statistic $T$. If it is unbiased for $\tau(\theta)$, then we are done!
    - If *not* unbiased, find a simple unbiased estimator $W$ and set

        $\phi(T)=E(W|T)$

        then $\phi(T)$ is unbiased and based only on a complete sufficient statistic $T$. Therefore $\phi(T)$ is the best unbiased estimator of $\tau(\theta)$.