
Statistics

Basic Properties

  1. $E(X) = ∑ x p(x)$
  2. $Var(X) = ∑ (x-μ)^2f(x)$
  3. X is around $E(X)$, give or take $SD(X)$
  4. $E(aX + bY) = aE(X) + bE(Y)$
  5. $Var(aX + bY) = a^2Var(X) + b^2Var(Y)$ if $X$ and $Y$ are independent (more generally, uncorrelated)
  6. $Var(X) = E(X^2) - [E(X)]^2$
  7. $Cov(X_1, X_2) = E(X_1X_2) - E(X_1)E(X_2)$
  8. $P(A \cap B) = P(A)P(B)$ if A and B independent
  9. RV is centered when $E(X)=0$, and any RV can be centered via $Y = X - E(X)$, with SD and variance unaffected
  10. In $X = μ + ε$, $μ$ is the unknown constant of interest, and $ε$ represents random measurement error.
  11. if $X$, $Y$ are independent:
    1. $M_{X+Y}(t) = M_X(t)M_Y(t)$
    2. $E(XY)=E(X)E(Y)$; the converse (uncorrelated implies independent) holds if $X$ and $Y$ are bivariate normal, and extends to the multivariate normal

Approximations

Law of Large Numbers

Let $X_1, X_2, …, X_n$ be IID, with expectation $μ$ and variance $σ^2$. Then $\overline{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow[n\to\infty]{} μ$. Let $x_1, x_2, …, x_n$ be realisations of the random variables $X_1, X_2, …, X_n$; then $\overline{x}_n = \frac{1}{n}\sum_{i=1}^n x_i \xrightarrow[n\to\infty]{} μ$

Central Limit Theorem

Let $S_n = \sum_{i=1}^n X_i$, where $X_1, X_2, …, X_n$ are IID with expectation $μ$ and variance $σ^2$. Then $\frac{S_n - nμ}{\sqrt{n}σ} \xrightarrow[n\to\infty]{} \mathcal{N}(0,1)$
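
A minimal simulation sketch of the two limits above (not from the notes; assumes numpy is available, and Exponential(1) data is an arbitrary choice so that $μ = σ = 1$):

```python
# Sketch: LLN and CLT for IID Exponential(1) draws (mu = sigma = 1).
import numpy as np

rng = np.random.default_rng(0)
n, reps = 1_000, 10_000

samples = rng.exponential(scale=1.0, size=(reps, n))      # reps samples of size n
xbar = samples.mean(axis=1)                                # LLN: xbar ≈ mu = 1
z = (samples.sum(axis=1) - n * 1.0) / (np.sqrt(n) * 1.0)   # CLT: standardised S_n

print(xbar.mean(), xbar.std())   # mean ≈ 1, spread ≈ sigma / sqrt(n)
print(z.mean(), z.std())         # ≈ 0 and 1, i.e. approximately N(0, 1)
```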

Distributions

Poisson($λ$)

$E(X) = Var(X) = λ$

Normal $X ∼ \mathcal{N}(μ, σ^2)$

$f(x) = \frac{1}{\sqrt{2π}σ} \exp \left(-\frac{(x-μ)^2}{2σ^2}\right), \quad -∞<x<∞$

  1. When $μ = 0$, $f(x)$ is an even function, and $E(X^k) = 0$ where $k$ is odd
  2. $Y = \frac{X-E(X)}{SD(X)}$ is the standard normal

Gamma $Γ$

$g(t) = \frac{λ^α}{Γ(α)}t^{α-1}e^{-λ t}, \quad t ≥ 0$

$μ_1 = \frac{α}{λ}, μ_2 = \frac{α(α+1)}{λ^2}$

$χ^2$ Distribution

Let $\mathcal{Z} ∼ \mathcal{N}(0,1)$, $\mathcal{U} = \mathcal{Z}^2$ has a $χ^2$ distribution with 1 d.f.

$f_{\mathcal{U}}(u) = \frac{1}{\sqrt{2π}} u^{-\frac{1}{2}} e^{-\frac{u}{2}}, \quad u ≥ 0$

$χ_1^2 ∼ Γ(α=\frac{1}{2}, λ=\frac{1}{2})$

Let $U_1, U_2, …, U_n$ be IID $χ_1^2$, then $V = \sum_{i=1}^n U_i$ has a $χ_n^2$ distribution with $n$ degrees of freedom, and $V ∼ Γ(α=\frac{n}{2}, λ=\frac{1}{2})$

$E(χ_n^2) = n, Var(χ_n^2) = 2n$

$M(t) = \left(1 - 2t\right)^{-\frac{n}{2}}$

t-distribution

Let $\mathcal{Z} ∼ \mathcal{N}(0,1)$, $\mathcal{U}_n ∼ χ_n^2$ be independent, $t_n = \frac{\mathcal{Z}}{\sqrt{U_n / n}}$ has a t-distribution with n d.f.

$f(t) = \frac{Γ((n+1)/2)}{\sqrt{nπ}\,Γ(n/2)}\left(1 + \frac{t^2}{n} \right)^{-\frac{n+1}{2}}$

  1. t is symmetric about 0
  2. $t_n \xrightarrow[n\to\infty]{} \mathcal{N}(0,1)$

F-distribution

Let $U ∼ χ_m^2, V ∼ χ_n^2$ be independent, $W = \frac{U/m}{V/n}$ has an F distribution with (m,n) d.f.

If $X ∼ t_n$, then $X^2 = \frac{\mathcal{Z}^2/1}{U_n/n}$ has an F distribution with (1, n) d.f. The support is $w ≥ 0$.

For $n > 2$, $E(W) = \frac{n}{n-2}$

Sampling

Let $X_1, X_2, …, X_n$ be IID $\mathcal{N}(μ, σ^2)$.

$\text{sample mean, } \overline{X} = \frac{1}{n}\sum_{i=1}^n X_i$

$\text{sample variance, } S^2 = \frac{1}{n-1}\sum_{i=1}^n\left(X_i-\overline{X}\right)^2$

Properties of $\overline{X}$ and $S^2$

  1. $\overline{X}$ and $S^2$ are independent
  2. $\overline{X} ∼ \mathcal{N}(μ, \frac{σ^2}{n})$
  3. $\frac{(n-1)S^2}{σ^2} ∼ χ_{n-1}^2$
  4. $\frac{\overline{X} - μ}{S/\sqrt{n}} ∼ t_{n-1}$ (see the numerical check below)
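
A minimal numerical check of property 4 (an illustration only; assumes numpy/scipy, and the parameter values are arbitrary):

```python
# Sketch: for normal samples, (Xbar - mu) / (S / sqrt(n)) should follow t_{n-1}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 2.0, 10, 20_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                  # sample SD with the 1/(n-1) convention
t_stat = (xbar - mu) / (s / np.sqrt(n))

# Kolmogorov–Smirnov test against t_{n-1}; a large p-value is consistent with property 4
print(stats.kstest(t_stat, stats.t(df=n - 1).cdf))
```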

Survey Sampling

In a population of size $N$, we are interested in a variable $x$. The $i$th individual has fixed value $x_i$.

$\text{mean of population} = μ = \frac{1}{N}\sum_{i=1}^N x_i$

$\text{total of population} = τ = \sum_{i=1}^N x_i = μ N$

$\text{SD of population} = σ$

$σ^2 = \frac{1}{N}\sum_{i=1}^N\left(x_i-μ\right)^2 = \frac{1}{N}\sum_{i=1}^N x_i^2 - μ^2$

Dichotomous case

The population consists of members with value 0 or 1. Let $p$ be the proportion of members with value 1. Then $μ = p, σ^2 = p(1-p)$

Simple Random Sampling (SRS)

Assume $n$ random draws are made without replacement. (Not SRS, will be corrected for later).

Lemma A

The draws $X_i$ all have the same distribution. Denote by $ξ_1, ξ_2, …, ξ_m$ the distinct values assumed in the population, and let the number of members with value $ξ_j$ be $n_j$

$P(X_i =ξ_j) = \frac{n_j}{N}$

$E(X_i) = μ, \quad Var(X_i) = σ^2$

Lemma B

For $i ≠ j$, $Cov(X_i, X_j) = - \frac{σ^2}{N-1}$

We use sample mean $\overline{X}$ to estimate $μ$:

$E(\overline{X}) = μ$ from Lemma A, and

$Var(\overline{X}) = \frac{σ^2}{n} \left(\frac{N-n}{N-1}\right)$ from Lemma B, where $\frac{N-n}{N-1}$ is the finite population correction factor.

In a 0-1 population, let $\hat{p}$ be the proportion of 1s in the sample:

$E(\hat{p}) = p, \quad SD(\hat{p}) = \sqrt{\frac{p(1-p)}{n}\cdot\frac{N-n}{N-1}}$
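
A minimal sketch of this SE with the finite population correction (an illustration; the values of $p$, $n$ and $N$ below are assumptions, not from the notes):

```python
# Sketch: SD(p-hat) = sqrt( p(1-p)/n * (N-n)/(N-1) ) for draws without replacement.
import math

def se_phat(p, n, N):
    return math.sqrt(p * (1 - p) / n * (N - n) / (N - 1))

print(se_phat(p=0.3, n=100, N=10_000))   # fpc ≈ 1: close to sqrt(p(1-p)/n)
print(se_phat(p=0.3, n=100, N=150))      # small population: fpc shrinks the SE
```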

Estimation Problem

Let $X_1, X_2, …, X_n$ be random draws with replacement. Then $\overline{X}$ is an estimator of $μ$, and the observed value of $\overline{X}$, $\overline{x}$, is an estimate of $μ$.

Standard Error (SE)

Since $E(\overline{X}) = μ$, the estimator is unbiased.

The error in a particular estimate $\overline{x}$ is unknown, but on average its size is about $SD(\overline{X}) = \frac{σ}{\sqrt{n}}$

The standard error of $\overline{X}$ is defined to be $SD(\overline{X})$

An unbiased estimator for $σ^2$ is $s^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \overline{X})^2$

| param | est. | SE | Est. SE |
| --- | --- | --- | --- |
| $μ$ | $\overline{X}$ | $\frac{σ}{\sqrt{n}}$ | $\frac{s}{\sqrt{n}}$ |
| $p$ | $\hat{p}$ | $\sqrt{\frac{p(1-p)}{n}}$ | $\sqrt{\frac{\hat{p}(1-\hat{p})}{n-1}}$ |

Without Replacement

The variance of $\overline{X}$ is multiplied by the correction factor $\frac{N-n}{N-1}$ (so the SE by $\sqrt{\frac{N-n}{N-1}}$). Also, $s^2$ is biased for $σ^2$: $E(\frac{N-1}{N}s^2) = σ^2$, but $N$ is normally large so the bias is negligible.

Confidence Interval

An approximate $1-α$ CI for $μ$ is

$\left(\overline{x} - z_{α/2}\frac{s}{\sqrt{n}},\ \overline{x} + z_{α/2}\frac{s}{\sqrt{n}}\right)$
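
A minimal sketch of this interval (assumes scipy for the normal quantile; the numbers fed in are hypothetical):

```python
# Sketch: approximate 1 - alpha CI  xbar ± z_{alpha/2} * s / sqrt(n).
import math
from scipy import stats

def mean_ci(xbar, s, n, alpha=0.05):
    z = stats.norm.ppf(1 - alpha / 2)        # z_{alpha/2}
    half = z * s / math.sqrt(n)
    return xbar - half, xbar + half

print(mean_ci(xbar=10.2, s=3.1, n=50))       # hypothetical summary statistics
```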

Measurement Error

Let $X_1, X_2, …, X_n$ be independent measurements of an unknown constant $μ$: $X_i = μ + ε_i$.

The errors are IID with expectation 0 and variance $σ^2$. The realisations are $x_i = μ + e_i$, where $x_i$ and $e_i$ are realisations of the RVs. Then $\overline{x}$ is an estimate of $μ$, with SE $\frac{σ}{\sqrt{n}}$.

Biased Measurements

Let $X = μ + ε$, where $E(ε) = 0$, $Var(ε) = σ^2$

Suppose $X$ is used to measure an unknown constant $a$, with $a ≠ μ$. Then $X = a + (μ - a) + ε$, where $μ-a$ is the bias.

Mean square error (MSE) is $E((X-a)^2) = σ^2 + (μ - a)^2$

With $n$ IID measurements, $\overline{X} = μ + \overline{ε}$ and

$E((\overline{X} - a)^2) = \frac{σ^2}{n} + \left(μ - a\right)^2$

$\text{MSE} = \text{SE}^2 + \text{bias}^2$, hence $\sqrt{\text{MSE}}$ is a good measure of the accuracy of the estimate $\overline{x}$ of $a$.
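
A small numeric check of the decomposition (an illustration; $μ$, $σ$, $a$ and $n$ below are assumed values):

```python
# Sketch: verify MSE = SE^2 + bias^2 for Xbar as a measurement of the constant a.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, a, n, reps = 10.0, 2.0, 9.0, 25, 100_000

xbar = rng.normal(mu, sigma / np.sqrt(n), size=reps)    # Xbar = mu + mean error
mse = np.mean((xbar - a) ** 2)
print(mse, sigma**2 / n + (mu - a) ** 2)                # the two should agree closely
```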

Estimation of a Ratio

Consider a population of $N$ members on which two characteristics $x$ and $y$ are recorded, and let $r = \frac{μ_y}{μ_x}$. A sample $(X_1, Y_1), (X_2, Y_2), …, (X_n, Y_n)$ is drawn.

An obvious estimator of r is $R = \frac{\overline{Y}}{\overline{X}}$

$Cov(\overline{X},\overline{Y}) = \frac{σ_{xy}}{n}$, where

$σ_{xy} = \frac{1}{N}\sum_{i=1}^N(x_i-μ_x)(y_i-μ_y)$ is the population covariance.

Properties

With SRS, the approximate variance of $R = \overline{Y}/\overline{X}$ is \begin{align*} Var(R) &≈ \frac{1}{μ_x^2}\left(r^2σ_{\overline{X}}^2 + σ_{\overline{Y}}^2 - 2rσ_{\overline{X}\,\overline{Y}}\right) \\ &= \frac{1}{n}\frac{N-n}{N-1}\frac{1}{μ_x^2}\left(r^2σ_x^2 + σ_y^2 - 2rσ_{xy}\right) \end{align*}

Population correlation coefficient $ρ = \frac{σ_{xy}}{σ_xσ_y}$

$E(R) ≈ r + \frac{1}{n}\left(\frac{N-n}{N-1}\right)\frac{1}{μ_x^2}\left(rσ_x^2-ρ\sigma_xσ_y\right)$

$s_{xy} = \frac{1}{n-1}\sum_{i=1}^n\left(X_i - \overline{X}\right)\left(Y_i - \overline{Y}\right)$

Ratio Estimates

$\overline{Y}_R = \frac{μ_x}{\overline{X}}\overline{Y} = μ_xR$

$Var(\overline{Y}_R) ≈ \frac{1}{n}\frac{N-n}{N-1}(r^2σ_x^2 + σ_y^2 -2rρ\sigma_xσ_y)$

$E(\overline{Y}_R) - μ_y ≈ \frac{1}{n}\frac{N-n}{N-1}\frac{1}{μ_x}\left(rσ_x^2 -ρ\sigma_xσ_y\right)$

The bias is of order $\frac{1}{n}$, small compared to its standard error, which is of order $\frac{1}{\sqrt{n}}$.

$\overline{Y}_R$ is better than $\overline{Y}$, having smaller variance, when $ρ > \frac{1}{2}\left(\frac{C_x}{C_y}\right)$, where $C_i = σ_i/μ_i$

Variance of $\overline{Y}_R$ can be estimated by

$s_{\overline{Y}_R}^2 = \frac{1}{n}\frac{N-n}{N-1}\left(R^2s_x^2+s_y^2-2Rs_{xy}\right)$

An approximate $1-α$ C.I. for $μ_y$ is $\overline{Y}_R ± z_{α/2}\,s_{\overline{Y}_R}$
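
A minimal sketch putting the ratio-estimate pieces together (an illustration, not from the notes; the sample arrays, $μ_x$ and $N$ are made up):

```python
# Sketch: ratio estimate Ybar_R, its estimated variance, and an approximate 95% CI.
import numpy as np

def ratio_estimate_ci(x, y, mu_x, N, z=1.96):
    n = len(x)
    R = y.mean() / x.mean()
    sx2, sy2 = x.var(ddof=1), y.var(ddof=1)
    sxy = np.cov(x, y)[0, 1]                      # sample covariance (1/(n-1))
    var_yR = (1 / n) * (N - n) / (N - 1) * (R**2 * sx2 + sy2 - 2 * R * sxy)
    yR = mu_x * R
    se = np.sqrt(var_yR)
    return yR, (yR - z * se, yR + z * se)

rng = np.random.default_rng(3)
x = rng.normal(50, 5, size=30)
y = 1.2 * x + rng.normal(0, 3, size=30)           # made-up correlated sample
print(ratio_estimate_ci(x, y, mu_x=50.0, N=1_000))
```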

Estimation

Let $X_1, X_2, …, X_n$ be IID random variables with density $f(x|θ)$, where $θ ∈ \mathcal{R}^p$ is an unknown constant. Realisations $x_1, x_2, …, x_n$ will be used to estimate $θ$, the estimate being a realisation of the RV $\hat{θ}$. The bias and SE are:

$\text{bias} = E(\hat{θ}) - θ, SE = SD(\hat{θ})$

Moments

Let $X_1, X_2, …, X_n$ be IID with the same distribution as $X$.

$\hat{μ}_k = \frac{1}{n}\sum_{i=1}^n X_i^k$ is an estimator of $μ_k$, the $k$th moment. An estimate is also denoted $\hat{μ}_k$.

Method of Moments

To estimate $θ$, express it as a function of the moments, $θ = g(μ_1, μ_2, …)$, and estimate it by $\hat{θ} = g(\hat{μ}_1,\hat{μ}_2,…)$

The bias and SE of an estimate still depend on the unknown value of the parameter. Suppose 1.67 and 0.38 are estimates of $λ$ and $α$. Data is generated from $Γ(1.67, 0.38)$, and the MOM estimators are written as $\widehat{1.67}$ and $\widehat{0.38}$. Because the sample size is large, $(\hat{λ} - λ, \hat{α}-α) ≈ (\widehat{1.67} - 1.67, \widehat{0.38} - 0.38)$

Monte Carlo is used to generate many realisations of $\widehat{1.67}$ via the $Γ(1.67,0.38)$ distribution. With 10,000 realisations,

$\text{bias}(1.67) = E_{1.67,0.38}(\widehat{1.67} - 1.67) ≈ 0.09$

$SE(1.67) = SD_{1.67,0.38}(\widehat{1.67}) ≈ 0.35$

and $λ$ is estimated as $1.58 ± 0.35$

$\overline{X} \xrightarrow[n\to\infty]{} α/λ, \quad \hat{σ}^2 \xrightarrow[n\to\infty]{} α/λ^2$, so the MOM estimators are consistent (asymptotically unbiased).
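
A minimal sketch of the Monte Carlo step above (an illustration; the sample size $n$ and loop count are assumptions, and the Gamma is parameterised with $\hat{α} = 0.38$, $\hat{λ} = 1.67$ as in the example):

```python
# Sketch: MOM fit for Gamma data, then parametric Monte Carlo for bias and SE of lambda-hat.
import numpy as np

rng = np.random.default_rng(4)

def gamma_mom(x):
    # alpha-hat = xbar^2 / sigma-hat^2, lambda-hat = xbar / sigma-hat^2 (1/n variance)
    xbar, var = x.mean(), x.var()
    return xbar**2 / var, xbar / var

alpha_hat, lambda_hat = 0.38, 1.67      # fitted values from the example above
n, reps = 200, 10_000                   # assumed sample size and number of realisations

lam_sims = np.empty(reps)
for b in range(reps):
    sim = rng.gamma(shape=alpha_hat, scale=1 / lambda_hat, size=n)  # numpy uses scale = 1/lambda
    _, lam_sims[b] = gamma_mom(sim)

print("bias ≈", lam_sims.mean() - lambda_hat)
print("SE   ≈", lam_sims.std())
```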

$\text{Poisson}(λ)$: $\text{bias} = 0, SE ≈ \sqrt{\frac{\overline{x}}{n}}$

$N(μ, σ^2)$: $μ = μ_1$, $σ^2 = μ_2 - μ_1^2$

$Γ(λ, α)$: $\hat{λ} = \frac{\hat{μ}_1}{\hat{μ}_2-\hat{μ}_1^2}=\frac{\overline{X}}{\hat{σ}^2}, \hat{α} = \frac{\hat{μ}_1^2}{\hat{μ}_2-\hat{μ}_1^2}=\frac{\overline{X}^2}{\hat{σ}^2}$

Maximum Likelihood Estimator (MLE)

Let $\{f(⋅ | θ) : θ ∈ Θ\}$ be an (identifiable) parametric family.

Suppose $X_1, X_2, …, X_n$ are IID with density $f(⋅|θ_0)$, where $θ_0 ∈ Θ$ is an unknown constant; we want to estimate $θ_0$ using realisations $x_1, x_2, …, x_n$.

$Pr(X_1=x_1, X_2=x_2, …, X_n=x_n) = \prod_{i=1}^n f(x_i|θ)$ for a discrete distribution.

$θ \mapsto L(θ) = \prod_{i=1}^n f(x_i|θ)$ is the likelihood function.

The maximum likelihood (ML) estimate of $θ_0$ is the number that maximises the likelihood over $θ$.

The estimate is a realisation of the ML estimator $\hat{θ}_0$, which can also be found by maximising $L(θ) = \prod_{i=1}^n f(X_i|θ)$

The bias and SE are:

$\text{bias} = E_{θ_0}(\hat{θ}_0)-θ_0, \quad SE = SD(\hat{θ}_0)$

Poisson Case

$L(λ) = \prod_{i=1}^n\frac{λ^{x_i}e^{-λ}}{x_i!} = \frac{λ^{\sum_{i=1}^n x_i}e^{-nλ}}{\prod_{i=1}^n x_i!}$

$l(λ) = \sum_{i=1}^n x_i \logλ - nλ - \sum_{i=1}^n \log x_i!$

ML estimate of $λ_0$ is $\overline{x}$. ML estimator is $\hat{λ}_0 = \overline{X}$

Normal Case

$l(μ, σ) = -n\logσ - \frac{n\log 2π}{2} - \frac{\sum_{i=1}^n\left(X_i-μ\right)^2}{2σ^2}$

$\frac{∂ l}{∂ μ} = \frac{\sum_{i=1}^n \left(X_i - μ\right)}{σ^2} = 0 \implies \hat{μ} = \overline{X}$

$\frac{∂ l}{∂ σ} = \frac{\sum_{i=1}^n\left(X_i-μ\right)^2}{σ^3} - \frac{n}{σ} = 0 \implies \hat{σ}^2 = \frac{1}{n}\sum_{i=1}^n\left(X_i-\overline{X}\right)^2$

Gamma Case

$l(α, λ) = nα\logλ + (α-1)\sum_{i=1}^n\log X_i - λ\sum_{i=1}^n X_i - n\logΓ(α)$

$\frac{∂ l}{∂ α} = n\logλ + \sum_{i=1}^n\log X_i - \frac{n\,Γ'(α)}{Γ(α)}$

$\frac{∂ l}{∂ λ} = \frac{nα}{λ} - \sum_{i=1}^n X_i$

Setting $\frac{∂ l}{∂ λ} = 0$ gives $\hat{λ} = \frac{\hat{α}}{\overline{X}}$; there is no closed form for $\hat{α}$, which is found numerically.

The bias and SE are estimated through Monte Carlo and bootstrap methods.

Multinomial Case

$f(x_1, …, x_r) = {n \choose {x_1, x_2, …, x_r}} \prod_{i=1}^r p_i^{x_i}$

where $x_i$ is the number of times value $i$ occurs (not the number of trials), and $x_1, x_2, …, x_r$ are non-negative integers summing to $n$. For all $i$:

$E(X_i) = np_i, Var(X_i)=np_i(1-p_i)$

$Cov(X_i,X_j) = -np_ip_j, ∀ i ≠ j$

$l(p) = K + \sum_{i=1}^{r-1} x_i\log p_i + x_r\log(1-p_1-…-p_{r-1})$

$\frac{∂ l}{∂ p_i} = \frac{x_i}{p_i} - \frac{x_r}{p_r} = 0 \text{ assuming MLE exists}$

$\frac{x_i}{\hat{p}_i} = \frac{x_r}{\hat{p}_r} \implies \hat{p}_i = \frac{x_i}{c}, c=\frac{x_r}{\hat{p}_r}$

$\sum_{i=1}^r\hat{p}_i = \sum_{i=1}^r\frac{x_i}{c} = 1 \implies c = \sum_{i=1}^r x_i = n \implies \hat{p}_i = \frac{x_i}{n}$

same as MOM estimator.

MLE vs MOM

  1. ML estimates have smaller SEs than MOM estimates
  2. In some cases the estimates have to be computed numerically via methods like Newton-Raphson, and the bias and SE require bootstrap and Monte Carlo methods

Hardy-Weinberg Equilibrium

Let a locus have two alleles A and a, where the proportion of $a$ in the population is $θ$.

Assuming the population is large and mating is random, the number of $a$ alleles in an individual is the sum of 2 Bernoulli RVs, i.e. $Bin(2, θ)$, and the total number of $a$ alleles in a sample of $n$ individuals is $Bin(2n, θ)$

CIs in MLE

When the sample size is large, $\hat{θ}_0$ is approximately normal.

$\frac{\overline{X} - μ}{S/\sqrt{n}} ∼ t_{n-1}$

Given the realisations $\overline{x}$ and $s$,

$\left(\overline{x} - t_{n-1, α/2}\frac{s}{\sqrt{n}},\ \overline{x} + t_{n-1, α/2}\frac{s}{\sqrt{n}}\right)$

is the exact $1-α$ CI for $μ$.

$\frac{n\hat{σ}^2}{σ^2} ∼ χ_{n-1}^2$

$\left(\frac{n\hat{σ}^2}{χ_{n-1,α/2}^2}, \frac{n\hat{σ}^2}{χ_{n-1,1-α/2}^2}\right)$

is the exact $1-α$ CI for $σ^2$.
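
A minimal sketch computing both exact intervals for one normal sample (assumes scipy; the generated data are arbitrary):

```python
# Sketch: exact t interval for mu and chi-square interval for sigma^2 from a normal sample.
import numpy as np
from scipy import stats

def normal_exact_cis(x, alpha=0.05):
    n = len(x)
    xbar, s = x.mean(), x.std(ddof=1)
    sigma2_hat = x.var(ddof=0)                        # MLE of sigma^2 (1/n convention)

    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)     # t_{n-1, alpha/2}
    mu_ci = (xbar - t_crit * s / np.sqrt(n), xbar + t_crit * s / np.sqrt(n))

    chi_hi = stats.chi2.ppf(1 - alpha / 2, df=n - 1)  # chi^2_{n-1, alpha/2} (upper point)
    chi_lo = stats.chi2.ppf(alpha / 2, df=n - 1)      # chi^2_{n-1, 1-alpha/2}
    var_ci = (n * sigma2_hat / chi_hi, n * sigma2_hat / chi_lo)
    return mu_ci, var_ci

rng = np.random.default_rng(5)
print(normal_exact_cis(rng.normal(3.0, 1.5, size=40)))
```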