# Sample moments
---
Sample mean<br>
$\hat{\mu} = \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$<br>

Sample variance<br>
$\hat{s}^2 = \frac{1}{n} \sum_{i=1}^n (X_i-\hat{\mu})^2$

Variance estimate around the true (population) mean $\mu$<br>
$\hat{S}^2 = \frac{1}{n} \sum_{i=1}^n (X_i-\mu)^2$

Notice that the sample variance measures deviations from the sample mean $\hat{\mu}$ rather than from the true mean $\mu$, so it is biased:<br>$\mathbb{E}(\hat{s}^2) = \frac{n-1}{n}\sigma^2 < \sigma^2$<br>

Intuitively, $\hat{\mu}$ is the point that minimizes the sum of squared distances to the sample, so distances measured from it are never larger than distances measured from the true mean => variance is underestimated

To remove this bias, an adjusted estimate is used<br>
$\hat{s}_{unbiased}^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i-\hat{\mu})^2$<br>Luckily the correction is a very small change: we replace the denominator $n$ with the smaller value $n-1$ (see the derivation in the Sample Variance bias section below):

$$
\mathbb{E}[\hat{s}_{unbiased}^2] = \mathbb{E}\left[\frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2\right] = \sigma^2
$$
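
To make the two denominators concrete, here is a minimal sketch on a small made-up dataset (the values are illustrative and not taken from the text). It computes both estimates by hand and compares them with `statistics.pvariance` (divides by $n$) and `statistics.variance` (divides by $n-1$) from the Python standard library.

```python
import statistics

# Illustrative data, chosen so the arithmetic is easy to follow
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n                          # sample mean, 5.0
sum_sq = sum((x - mean) ** 2 for x in data)   # sum of squared deviations, 32.0

biased = sum_sq / n          # divides by n
unbiased = sum_sq / (n - 1)  # divides by n - 1

# biased is 4.0, unbiased is 32/7 (about 4.571)
print(biased, statistics.pvariance(data))
print(unbiased, statistics.variance(data))
```

The gap between the two shrinks as $n$ grows, since the ratio of the estimates is $\frac{n-1}{n}$.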


## Sample Mean Properties
---

__Mean (Unbiasedness)__<br>
$\mathbb{E}[\hat{\mu}] = \mu$<br>
On average the estimate equals the true value: errors above and below $\mu$ cancel out in expectation<br>
Proof: $\mathbb{E}[\bar{X}] = \mathbb{E}\left[\frac{1}{n} \sum_{i=1}^n X_i\right]
= \frac{1}{n} \sum_{i=1}^n \mathbb{E}[X_i]
= \frac{1}{n} \cdot n \cdot \mu = \mu$ <br><br>


__Variance__<br>
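$\text{Var}(\hat{\mu}) = \frac{\sigma^2}{n}$; its square root $\frac{\sigma}{\sqrt{n}}$ is the standard error (of the mean)<br>
Proof (using independence of the $X_i$, so the variance of the sum is the sum of the variances): $\text{Var}(\hat{\mu})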
= \text{Var}\left(\frac{1}{n} \sum_{i=1}^n X_i\right)
= \frac{1}{n^2} \sum_{i=1}^n \text{Var}(X_i)
= \frac{1}{n^2} \cdot n \cdot \sigma^2
= \frac{\sigma^2}{n}$ <br><br>
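
Both properties can be checked with a short simulation. The sketch below assumes normally distributed data with hypothetical parameters `mu = 10`, `sigma = 2` and sample size `n = 25`; none of these values come from the text. It draws many samples, computes the mean of each, and compares the average and variance of those means with $\mu$ and $\frac{\sigma^2}{n}$.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma = 10.0, 2.0   # assumed population parameters
n = 25                  # sample size
trials = 20_000         # number of repeated samples

# One sample mean per trial
means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(trials)
]

empirical_mean = statistics.fmean(means)     # should be close to mu
empirical_var = statistics.pvariance(means)  # should be close to sigma^2 / n
theoretical_var = sigma ** 2 / n             # variance of the sample mean
standard_error = sigma / n ** 0.5            # its square root

print(empirical_mean, mu)
print(empirical_var, theoretical_var)
print(standard_error)
```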

__Consistency__<br>
$\bar{X} \xrightarrow{p} \mu \,\, \text{as} \,\, n \to \infty$<br>
The estimate converges towards the real value as sample size grows<br>
Proof: <br><br>

Let's apply Chebyshev's inequality to bound the probability of a large deviation:
$$P(|\bar{X}_n - \mu| \geq \varepsilon) \leq \frac{\text{Var}(\bar{X}_n)}{\varepsilon^2} = \frac{\sigma^2}{n \varepsilon^2}$$

The bound depends on the sample size and shrinks as $n$ grows. Taking the limit:
$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \geq \varepsilon) \leq \lim_{n \to \infty} \frac{\sigma^2}{n \varepsilon^2} = 0$$
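
The following sketch compares the Chebyshev bound with the simulated frequency of large deviations for a few sample sizes. It assumes standard normal data and a tolerance of `eps = 0.2`; both choices are illustrative. The bound is capped at 1 because a probability cannot exceed 1.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 0.0, 1.0   # assumed population parameters
eps = 0.2              # deviation threshold (epsilon)
trials = 1_000         # simulated samples per sample size

def tail_frequency(n):
    """Share of simulated sample means that miss mu by at least eps."""
    misses = 0
    for _ in range(trials):
        x_bar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
        if abs(x_bar - mu) >= eps:
            misses += 1
    return misses / trials

def chebyshev_bound(n):
    """Upper bound sigma^2 / (n * eps^2), capped at 1."""
    return min(1.0, sigma ** 2 / (n * eps ** 2))

results = {n: (tail_frequency(n), chebyshev_bound(n)) for n in (10, 100, 1000)}
for n, (freq, bound) in results.items():
    print(n, freq, bound)
```

The simulated frequencies sit well below the bound, which is expected: Chebyshev's inequality holds for any distribution with finite variance, so it is loose for any particular one.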


__Efficiency__<br>
$\hat{\mu}(x)$ is the <u>best estimator</u> in the class of linear unbiased estimators (BLUE)<br>

The sample average $\hat{\mu}(x)$ is the OLS estimator for a model with only a constant: it is the value that minimizes the sum of squared deviations from the data. This can be shown by minimizing the sum of squares:<br>$\sum_{i=1}^n (y_i - \hat{y})^2 \rightarrow \mathrm{min}$<br>$\hat{y} = \frac{1}{n} \sum_{i=1}^n y_i$
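
As a sketch of that minimization, treat the sum of squares as a function of a single candidate value $c$ (a symbol introduced here only for illustration) and set its derivative to zero:

$$
\frac{d}{dc} \sum_{i=1}^n (y_i - c)^2 = -2 \sum_{i=1}^n (y_i - c) = 0
\quad \Rightarrow \quad \sum_{i=1}^n y_i = n c
\quad \Rightarrow \quad c = \frac{1}{n} \sum_{i=1}^n y_i
$$

The second derivative equals $2n > 0$, so this stationary point is a minimum.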

The Gauss-Markov theorem states that under certain conditions the OLS estimator is BLUE: it has the minimum variance among all linear unbiased estimators

Those conditions include linearity, exogeneity, homoskedasticity, uncorrelated errors and full rank, and they are met in our setting of i.i.d. observations with finite variance<br><br>

If X is normally distributed, then $\bar{X}$ is also the MVUE: it is <u>efficient</u> in the sense of the Rao-Cramer bound, i.e. the best in the class of all unbiased estimators, not only linear ones

Let's write down the log-likelihood function of the normal distribution, treating $\sigma^2$ as known
$\log L(\mu) = \sum_{i=1}^n \log \left[ \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(X_i - \mu)^2}{2\sigma^2} \right) \right]
= -\frac{n}{2} \log(2\pi \sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (X_i - \mu)^2
$

Differentiating once with respect to the parameter $\mu$ gives $\frac{1}{\sigma^2} \sum_{i=1}^n (X_i - \mu)$. Now let's take the second derivative:<br>
$\frac{\partial^2}{\partial \mu^2} \log L(\mu) = -\frac{n}{\sigma^2}
$

Notice it is a constant, so taking the expectation over X leaves it unchanged. The Fisher information is:<br>
$\mathcal{I}_n(\mu) = - \mathbb{E} \left[ \frac{\partial^2}{\partial \mu^2} \log L(\mu) \right] = \frac{n}{\sigma^2}
$

Then by the Rao-Cramer theorem the variance of any unbiased estimate of $\mu$ is bounded below by:
$$\text{Var}(\hat{\mu}) \geq \frac{1}{\mathcal{I}_n(\mu)} = \frac{\sigma^2}{n}
$$

This bound matches the variance of our estimator, $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$, so $\bar{X}$ attains it
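
One way to see what attaining the bound means is to compare the sample mean with another unbiased estimator of $\mu$. The sketch below uses the sample median as that competitor; this choice, the standard normal data and `n = 101` are assumptions made for illustration. For normal data the median's variance is known to be larger, by a factor of roughly $\frac{\pi}{2}$ in large samples (a background fact, not derived in this text).

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

mu, sigma = 0.0, 1.0   # assumed population parameters
n = 101                # odd, so the median is a single observation
trials = 5_000

means, medians = [], []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)      # close to the bound
var_median = statistics.pvariance(medians)  # noticeably above the bound
cramer_rao = sigma ** 2 / n                 # lower bound sigma^2 / n

print(var_mean, var_median, cramer_rao)
```

Because the variances are estimated from a finite number of trials, the simulated value for the mean can land slightly above or below $\frac{\sigma^2}{n}$.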



 <br><br>



__Asymptotic normality__<br>
$\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$<br>As the sample size grows, the distribution of the estimate $\bar{X}$ approaches a normal distribution with parameters $(\mu, \frac{\sigma^2}{n})$, whatever the actual data distribution, provided it has finite variance<br><br>

Let's standardize the sample mean estimator using $\mathbb{E}(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$, i.e. standard deviation $\frac{\sigma}{\sqrt{n}}$<br>

This gives us:<br>
$Z = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \frac{X_i - \mu}{\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i$<br>
The CLT tells us that the normalized sum of i.i.d. random variables with mean 0 and variance 1 converges to the standard normal distribution
$\frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i \xrightarrow{d} \mathcal{N}(0,1)$<br>
Multiplying by $\sigma$ we get the original statement:
$\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$
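
The convergence can be illustrated on clearly non-normal data. The sketch below assumes Exponential(1) observations (mean 1, standard deviation 1) and `n = 200`; both are illustrative choices. It standardizes each simulated sample mean and checks three things that should hold approximately if the limit is standard normal: mean near 0, variance near 1, and about 95% of values inside $\pm 1.96$.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

rate = 1.0                      # Exponential(1), a skewed distribution
mu, sigma = 1 / rate, 1 / rate  # its mean and standard deviation
n = 200
trials = 5_000

z_values = []
for _ in range(trials):
    x_bar = statistics.fmean(random.expovariate(rate) for _ in range(n))
    z_values.append(n ** 0.5 * (x_bar - mu) / sigma)  # standardized mean

z_mean = statistics.fmean(z_values)
z_var = statistics.pvariance(z_values)
coverage = sum(abs(z) <= 1.96 for z in z_values) / trials

print(z_mean, z_var, coverage)
```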

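If the data are normal, $\bar{X}$ is exactly normal for any $n$. When $\sigma$ is unknown and replaced by the sample estimate $\hat{s}_{unbiased}$, the standardized statistic $\frac{\sqrt{n}(\bar{X} - \mu)}{\hat{s}_{unbiased}}$ follows Student's t distribution with $n-1$ degrees of freedom, which matters most for small datasets<br><br>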


 <br><br>






## Sample Variance Properties
---
TBD


## Sample Variance bias
---

Let's show that the sample statistic $\hat{s}^2$ is biased<br>
$\mathbb{E}[\hat{s}^2] = \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2\right] \neq \sigma^2$

After opening the brackets:
$$\sum_{i=1}^n (X_i - \bar{X})^2 = \sum_{i=1}^n X_i^2 - n \bar{X}^2$$
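
The expansion relies on the identity $\sum_{i=1}^n X_i = n \bar{X}$. Written out step by step:

$$
\sum_{i=1}^n (X_i - \bar{X})^2
= \sum_{i=1}^n X_i^2 - 2 \bar{X} \sum_{i=1}^n X_i + n \bar{X}^2
= \sum_{i=1}^n X_i^2 - 2 n \bar{X}^2 + n \bar{X}^2
= \sum_{i=1}^n X_i^2 - n \bar{X}^2
$$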

and taking expectations
$$\mathbb{E}\left[\sum_{i=1}^n (X_i - \bar{X})^2 \right] = \mathbb{E}\left[\sum_{i=1}^n X_i^2\right] - \mathbb{E}\left[n \bar{X}^2\right]
$$

The first term becomes:
$$\mathbb{E}\left[\sum_{i=1}^n X_i^2\right] = \sum_{i=1}^n \mathbb{E}[X_i^2] = n \mathbb{E}[X_1^2]
$$
$$\mathbb{E}[X^2] = \sigma^2 + \mu^2
$$
$$\Rightarrow \quad \mathbb{E}\left[\sum_{i=1}^n X_i^2\right] = n(\sigma^2 + \mu^2)
$$

The second term becomes:
$$\mathbb{E}[n \bar{X}^2] = n \mathbb{E}[\bar{X}^2]
$$
$$\mathbb{E}[\bar{X}^2] = \mathrm{Var}(\bar{X}) + (\mathbb{E}[\bar{X}])^2 = \frac{\sigma^2}{n} + \mu^2
$$
$$\Rightarrow \quad \mathbb{E}[n \bar{X}^2] = n \left(\frac{\sigma^2}{n} + \mu^2 \right) = \sigma^2 + n \mu^2
$$

Overall,
$$\mathbb{E}\left[\sum_{i=1}^n (X_i - \bar{X})^2\right] = n(\sigma^2 + \mu^2) - (\sigma^2 + n \mu^2) = (n - 1) \sigma^2
$$

Dividing by $n$, this proves that
$$\mathbb{E}(\hat{s}^2) = \frac{n - 1}{n} \sigma^2 \neq \sigma^2
$$

Dividing the sum by $(n-1)$ instead of $n$ we get the unbiased version of the estimator
$$\mathbb{E}\left[\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2\right] =  \sigma^2
$$

$$\mathbb{E}[\hat{s}_{unbiased}^2] = \sigma^2$$
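
The result can also be checked numerically. The sketch below assumes normal data with hypothetical parameters `mu = 5`, `sigma = 3` (true variance 9) and a deliberately small sample size `n = 5`, where the bias is easy to see. Averaging each estimator over many samples approximates its expectation.

```python
import random

random.seed(4)  # fixed seed so the run is repeatable

mu, sigma = 5.0, 3.0   # assumed population parameters, true variance is 9
n = 5                  # small sample, so the bias is clearly visible
trials = 50_000

biased_sum = 0.0
unbiased_sum = 0.0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    ss = sum((x - x_bar) ** 2 for x in sample)  # sum of squared deviations
    biased_sum += ss / n            # denominator n
    unbiased_sum += ss / (n - 1)    # denominator n - 1

avg_biased = biased_sum / trials      # approximates (n - 1) / n * sigma^2
avg_unbiased = unbiased_sum / trials  # approximates sigma^2

print(avg_biased, (n - 1) / n * sigma ** 2)
print(avg_unbiased, sigma ** 2)
```

With these assumed values the biased average should land near $\frac{4}{5} \cdot 9 = 7.2$ and the unbiased one near 9, up to simulation noise.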

