## 1. IID



Random variables are **independent and identically distributed (iid)** when they are mutually independent and all follow the same probability distribution.

#### example:
Toss 10 fair coins; the results $X_1, X_2, \dots, X_{10}$ are iid.

-----------------------

## 2. Point Estimator

A point estimator of an unknown parameter $\theta$ is $\hat{\Theta} = h(X_1, X_2, \dots, X_n)$, where the $X_i$ are iid and $h(X_1, X_2, \dots, X_n)$ is a **statistic**, i.e. a real-valued function of the data.

#### example:
- sample mean
$\hat{\Theta} = \bar{X} =  \frac{X_1 + X_2 + ... + X_n}{n}$

- sample variance
$\hat{\Theta} = S^2 = \frac{1}{n-1} \sum_{k=1}^{n}(X_k - \bar{X})^2$

----------------------------------------------------
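The two estimators from Section 2 can be written directly from their formulas. The sketch below is only an illustration: the function names `sample_mean` and `sample_variance` and the data values are assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np

def sample_mean(x):
    """Sample mean: (X_1 + ... + X_n) / n."""
    x = np.asarray(x, dtype=float)
    return x.sum() / x.size

def sample_variance(x):
    """Sample variance S^2, using the 1/(n-1) factor."""
    x = np.asarray(x, dtype=float)
    return ((x - sample_mean(x)) ** 2).sum() / (x.size - 1)

# Illustrative data, not taken from the notes.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sample_mean(data))      # 5.0
print(sample_variance(data))  # 32/7, about 4.571
```

The second value should agree with `np.var(data, ddof=1)`, since `ddof=1` selects the $\frac{1}{n-1}$ factor.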

## 3. Mean Square Error and Bias

#### 3.1 Mean Square Error(MSE):
$MSE(\hat{\Theta})=E[(\hat{\Theta}-\theta)^2]$

$\hat{\Theta}$: estimated value

$\theta$: true value

------------------------------------------

#### 3.2 Bias:
$B(\hat{\Theta}) = E[\hat{\Theta}] - \theta$

#### example:
- sample mean
$B(\bar{X}) = E[\bar{X}] - \mu$
   - $\bar{X}$: sample mean
   - $\mu$: population mean

- sample variance
$B(S^2) = E[S^2] - \sigma^2$
   - $S^2$: sample variance
   - $\sigma^2$: population variance

-----------------------------------

#### 3.3 Bias-Variance Decomposition


$\begin{align}
    MSE(\hat{\Theta}) &= E[(\hat{\Theta} - \theta)^2] \\
                      &= Var(\hat{\Theta} - \theta) + (E[\hat{\Theta}-\theta])^2 \\
                      &= Var(\hat{\Theta}) + B(\hat{\Theta})^2
\end{align}$

--------------------
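The decomposition in 3.3 can be checked numerically. The sketch below uses a deliberately biased estimator of the mean, $\hat{\Theta} = 0.9\bar{X}$, which is a hypothetical choice made only for this illustration; the parameter values, seed, and number of trials are likewise assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 2.0, 3.0        # true mean and standard deviation (assumed values)
n, trials = 25, 100_000     # sample size and number of repeated experiments

# Each row is one experiment of n iid normal draws.
samples = rng.normal(loc=mu, scale=sigma, size=(trials, n))

# Hypothetical biased estimator of mu: shrink the sample mean by 10%.
estimates = 0.9 * samples.mean(axis=1)

mse = np.mean((estimates - mu) ** 2)   # empirical E[(Theta_hat - theta)^2]
variance = estimates.var()             # empirical Var(Theta_hat)
bias = estimates.mean() - mu           # empirical E[Theta_hat] - theta

print(mse, variance + bias ** 2)       # the two numbers should agree
print(bias)                            # should be close to -0.1 * mu = -0.2
```

With empirical moments the identity holds up to floating-point error, while the bias and variance themselves are only Monte Carlo approximations of $-0.1\mu$ and $0.81\,\sigma^2/n$.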

## 4. Sample Variance

#### sample variance: $S^2 =  \frac{1}{n-1} \sum_{k=1}^{n}(X_k - \bar{X})^2$

> **Why $S^2 =  \frac{1}{n-1} \sum_{k=1}^{n}(X_k - \bar{X})^2$ and not $\bar{S}^2 =  \frac{1}{n} \sum_{k=1}^{n}(X_k - \bar{X})^2$?**

<span style="color:red">Because $\bar{S}^2 =  \frac{1}{n} \sum_{k=1}^{n}(X_k - \bar{X})^2$ is a **biased** estimator</span>

---------------

**Proof**:
    
> $\begin{align}
    \bar{S}^2 &= \frac{1}{n}( \sum_{k=1}^{n}{X_k}^2 - n \bar{X}^2 ) \\
    E[\bar{X}^2] &= \frac{\sigma^2}{n} + \mu^2 \\ \\
    E[\bar{S}^2] &= \frac{1}{n} (\sum_{k=1}^{n}E[{X_k}^2] - nE[\bar{X}^2])\\
    E[\bar{S}^2] &= \frac{1}{n} (n(\mu^2+\sigma^2) - n(\mu^2+\frac{\sigma^2}{n})) = \sigma^2(\frac{n-1}{n})
\end{align}$

$B(\bar{S}^2) = E[\bar{S}^2] - \sigma^2 = - \frac{\sigma^2}{n} \Leftarrow \text{NOT 0}$

Since $E[\bar{S}^2] = \sigma^2(\frac{n-1}{n})$, multiplying by $\frac{n}{n-1}$ removes the bias: $\frac{n}{n-1}E[\bar{S}^2] = E[\frac{n}{n-1}\bar{S}^2] = \sigma^2$. This gives the unbiased estimator $S^2 = \frac{n}{n-1}\bar{S}^2 = \frac{1}{n-1} \sum_{k=1}^{n}(X_k - \bar{X})^2$.

--------------------
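The bias of the $\frac{1}{n}$ estimator shown in Section 4 can also be seen in a simulation. The sketch below is a minimal check, assuming normal data with $\sigma^2 = 4$ and $n = 5$; these values, the seed, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

sigma2 = 4.0                 # true variance (assumed value)
n, trials = 5, 200_000       # small n makes the bias easy to see

samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(trials, n))

# ddof=0 divides by n (biased), ddof=1 divides by n-1 (unbiased).
mean_biased = samples.var(axis=1, ddof=0).mean()
mean_unbiased = samples.var(axis=1, ddof=1).mean()

print(mean_biased)    # close to sigma2 * (n - 1) / n = 3.2
print(mean_unbiased)  # close to sigma2 = 4.0
```

The average of the $\frac{1}{n}$ estimator settles near $\sigma^2\frac{n-1}{n}$ rather than $\sigma^2$, matching $B(\bar{S}^2) = -\frac{\sigma^2}{n}$.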

## 5. Chi-Squared Distribution, $t$-Distribution, F-Distribution

#### 5.1 Chi-Squared Distribution Definition:

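If $Y = Z_1^2 + Z_2^2 + \dots + Z_n^2$, where $Z_1, Z_2, \dots, Z_n$ are iid $\mathcal{N}(0,1)$, then $Y \sim \chi^2(n)$, the chi-squared distribution with $n$ degrees of freedom.

- If $X_1, X_2, \dots, X_n$ are iid <span style="color:red">normal</span> random variables with variance $\sigma^2$ and sample variance $S^2 =  \frac{1}{n-1} \sum_{k=1}^{n}(X_k - \bar{X})^2$, then the scaled sample variance follows a chi-squared distribution with $n-1$ degrees of freedom: $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$
- If $Y \sim \chi^2(n)$, then as the degrees of freedom $n$ grow large, $\frac{Y-n}{\sqrt{2n}}$ is approximately $\mathcal{N}(0,1)$, since $E[Y] = n$ and $Var(Y) = 2n$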

![](figures/chi2.gif)

-------------------------------
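Both chi-squared facts from 5.1 can be checked by simulation using the moments $E[Y] = n$ and $Var(Y) = 2n$ of $\chi^2(n)$. This is a rough sketch, assuming NumPy only; the degrees of freedom, the normal parameters, and the seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 200_000

# Definition: Y is the sum of n squared standard normals.
n = 6
z = rng.standard_normal(size=(trials, n))
y = (z ** 2).sum(axis=1)
print(y.mean(), y.var())            # close to n = 6 and 2n = 12

# Scaled sample variance of normal data: (n_x - 1) * S^2 / sigma^2.
n_x, mu, sigma = 8, 3.0, 2.0        # assumed values for the illustration
x = rng.normal(loc=mu, scale=sigma, size=(trials, n_x))
scaled = (n_x - 1) * x.var(axis=1, ddof=1) / sigma ** 2
print(scaled.mean(), scaled.var())  # close to n_x - 1 = 7 and 2(n_x - 1) = 14
```

Matching the first two moments does not prove the distribution is chi-squared, but it is a quick sanity check that the degrees of freedom for the scaled sample variance are $n-1$ and not $n$.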

#### 5.2 $t$-Distribution Definition:

If $T = \frac{Z}{\sqrt{Y/n}}$, where $Z \sim \mathcal{N}(0,1)$ and $Y \sim \chi^2(n)$ are independent, then $T$ follows the $t$-distribution with $n$ degrees of freedom.

- If $X_1, X_2, \dots, X_n$ are iid $\mathcal{N}(\mu, \sigma ^ 2)$, the normalized sample mean $T = \frac{\bar{X}-\mu}{S/\sqrt{n}}$ follows the $t$-distribution with $n-1$ degrees of freedom

- As $n \rightarrow \infty$, the $t$-distribution converges to $\mathcal{N}(0,1)$

![](figures/t.gif)

---------------------------
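The behaviour of the normalized sample mean from 5.2 can be illustrated by simulation. The sketch below relies on the standard fact that a $t$-distribution with $\nu > 2$ degrees of freedom has variance $\frac{\nu}{\nu-2}$; the sample size, normal parameters, and seed are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

n, mu, sigma = 10, 5.0, 2.0   # assumed values for the illustration
trials = 200_000

x = rng.normal(loc=mu, scale=sigma, size=(trials, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)     # square root of the sample variance S^2

# Normalized sample mean T = (X_bar - mu) / (S / sqrt(n)).
t_stat = (xbar - mu) / (s / np.sqrt(n))

nu = n - 1                    # degrees of freedom
print(t_stat.var(), nu / (nu - 2))   # both close to 9/7, about 1.286

# Heavier tails than the standard normal: compare P(|T| > 2) with P(|Z| > 2).
z = rng.standard_normal(trials)
tail_t = np.mean(np.abs(t_stat) > 2)
tail_z = np.mean(np.abs(z) > 2)
print(tail_t, tail_z)
```

The variance above 1 and the larger tail frequency both reflect the extra uncertainty from estimating $\sigma$ with $S$; increasing `n` should bring both numbers closer to the normal values.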

#### 5.3 F-Distribution Definition:

If $F = \frac{U/m}{V/n}$, where $U \sim \chi^2(m)$ and $V \sim \chi^2(n)$ are independent, then $F \sim \mathcal{F}(m,n)$

![](figures/f.png)

-----------------------
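The F-distribution definition in 5.3 can be reproduced by building the ratio from two independent chi-squared samples. This sketch uses NumPy's `Generator.chisquare` and `Generator.f` samplers and the standard fact that $E[F] = \frac{n}{n-2}$ for $n > 2$; the degrees of freedom and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

m, n = 5, 20          # numerator and denominator degrees of freedom (assumed)
trials = 200_000

# Independent chi-squared draws, combined as in the definition.
u = rng.chisquare(m, size=trials)
v = rng.chisquare(n, size=trials)
f_ratio = (u / m) / (v / n)

# Direct draws from NumPy's F sampler for comparison.
f_direct = rng.f(m, n, size=trials)

print(f_ratio.mean(), f_direct.mean(), n / (n - 2))  # all close to 1.111
```

Agreement of the means is only a coarse check; comparing histograms or quantiles of `f_ratio` and `f_direct` would give a fuller picture.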

Supporting derivations for the proof in Section 4:

$\begin{align}
\bar{S}^2 &=  \frac{1}{n} \sum_{k=1}^{n}(X_k - \bar{X})^2 \\
          &=  \frac{1}{n} \sum_{k=1}^{n}({X_k}^2 + \bar{X}^2 - 2X_k \bar{X}) \\
          &=  \frac{1}{n}( \sum_{k=1}^{n}{X_k}^2 + \sum_{k=1}^{n} \bar{X}^2 - \sum_{k=1}^{n} 2X_k \bar{X}) \\
          &= \frac{1}{n}( \sum_{k=1}^{n}{X_k}^2 + n \bar{X}^2 -  2\bar{X} \sum_{k=1}^{n} X_k ) \\
          &= \frac{1}{n}( \sum_{k=1}^{n}{X_k}^2 + n \bar{X}^2 -  2\bar{X} \cdot n \bar{X} ) \\
          &= \frac{1}{n}( \sum_{k=1}^{n}{X_k}^2 - n \bar{X}^2 )
\end{align}$

$\begin{align}
Var(\bar{X}) &= Var(\frac{\sum_{i=1}^{n}{X_i}}{n}) = \frac{n Var(X_i)}{n^2} = \frac{\sigma^2}{n}  \\
E[\bar{X}^2] &= Var(\bar{X}) + (E[\bar{X}])^2 =  \frac{\sigma^2}{n} + \mu^2
\end{align}$

$Var(Y)= E[(Y-E[Y])^2] = E[Y^2] - (E[Y])^2$

$Var(Y+a) = Var(Y)$ for any constant $a$