### Thalia's note: Hypothesis Testing

When we look for a point or interval estimator of an unknown parameter, we assume that the observed quantity $X$ can be modeled by some density function $f(x;\theta)$.

This assumption is a **statistical hypothesis $H$**; it is usually a statement about the parameter $\theta$.

* The hypothesis to be tested is called the **Null Hypothesis $H_{0}$**

* The negation of the null hypothesis is called the **Alternative Hypothesis $H_{a}$**

A hypothesis $H$ is said to be a **simple hypothesis** if it completely specifies the density of the population; otherwise it is called a **composite hypothesis**.

**Example:**

When you flip a coin, there are two possible outcomes: heads and tails. For a fair coin, heads and tails each have probability 1/2. In this case we say the outcome of a flip follows a Bernoulli distribution, $Bernoulli(x;p)$, with success rate $p = 0.5$.

Suppose we wanted to check whether a coin was fair and balanced. 

**Define the hypothesis:**

$$ H_0: \text{ This is a fair coin} \qquad  H_a: \text{ This is not a fair coin}$$
This is equivalent to:
$$ H_0: p=0.5 \qquad  H_a: p \neq 0.5$$

In this case, $H_{0}$ completely specifies the density of the population, so it is a simple hypothesis. However, $H_{a}$ is a composite hypothesis.


### Test statistics
Evaluation of the null hypothesis is done by using a **test statistic** $W(X_1, X_2,...,X_n)$ and a set $C$ called the **critical region**.

> **Definition of statistic**:
> Given a random sample $X_1$, $X_2$,..., $X_n$ from a population $X$ with pdf $f(x;\theta)$, a statistic is a function $T$ of $X_1$, $X_2$,..., $X_n$, that is, $T = T(X_1, X_2,..., X_n)$, that does not depend on any unknown parameter.

If the observed value of $W$ falls inside the critical region, $W \in C$, then we **reject** the null hypothesis and accept the alternative hypothesis.

A commonly used test statistic is the **likelihood ratio test statistic**.

* If both the null hypothesis and the alternative hypothesis are composite:

> $$ H_0: \theta \in \Omega_{0} \qquad  H_a:\theta \in \Omega_{a}$$
>
> Based on a random sample $X_1$, $X_2$,..., $X_n$, the likelihood ratio test statistic is:
>
> $$W(X_1, X_2,..., X_n) = \frac{\max_{\theta \in \Omega_{0}}L(\theta, X_1, X_2,..., X_n)}{\max_{\theta \in \Omega_{a}}L(\theta, X_1,  X_2,..., X_n)}$$
where $L$ is the likelihood function:
$$ L(\theta, X_1, X_2,..., X_n)=\prod_{i=1}^n f(x_i;\theta)$$

* If both the null hypothesis and the alternative hypothesis are simple:

> $$ H_0: \theta = \theta_{0} \qquad  H_a:\theta = \theta_{a}$$
> The likelihood ratio test statistic is:
> $$W(X_1, X_2,..., X_n) = \frac{L(\theta_{0}, X_1, X_2,..., X_n)}{L(\theta_{a}, X_1, X_2,..., X_n)}$$

* If the null hypothesis is simple but the alternative hypothesis is composite:

> $$ H_0: \theta = \theta_{0} \qquad  H_a:\theta \neq \theta_{0}$$
> The likelihood ratio test statistic is:
> $$W(X_1, X_2,..., X_n) = \frac{L(\theta_{0}, X_1, X_2,..., X_n)}{\max_{\theta \in \Omega}L(\theta, X_1, X_2,..., X_n)}$$
> where $\Omega$ is the whole parameter space.

A **likelihood ratio test** is any test whose critical region $C$ has the form:

$$C = \{(X_1, X_2,..., X_n) \mid W(X_1, X_2,..., X_n)\leq k\}$$

where $k$ is a number in the interval $[0, 1]$.

How do we determine the value of $k$ in the critical region? It is chosen so that the test attains the desired significance level, which is defined in the next section.
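As a rough illustration of the simple-null, composite-alternative case, the coin example can be turned into a likelihood ratio. The sketch below assumes `flips` independent tosses with `heads` successes, and uses the fact that the sample proportion maximizes the Bernoulli likelihood. The function name and the counts in the loop are hypothetical, chosen only for illustration.

```python
def coin_likelihood_ratio(heads: int, flips: int, p0: float = 0.5) -> float:
    """Likelihood ratio W = L(p0) / max_p L(p) for independent Bernoulli flips."""
    tails = flips - heads
    p_hat = heads / flips  # maximum likelihood estimate of p
    # Python evaluates 0.0 ** 0 as 1.0, so all-heads or all-tails samples work.
    numerator = p0 ** heads * (1 - p0) ** tails
    denominator = p_hat ** heads * (1 - p_hat) ** tails
    return numerator / denominator


# W equals 1 when the data agree perfectly with H0 and shrinks towards 0
# as the observed proportion of heads moves away from p0.
for heads in (10, 14, 18):
    print(heads, coin_likelihood_ratio(heads, flips=20))
```

Small values of $W$ are evidence against $H_0$, which is why the critical region has the form $W \leq k$.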

### Evaluation of test

In a hypothesis test, the basic problem is to decide, based on the sample information, whether the null hypothesis is true. There are four possible situations, which determine whether our decision is correct or in error.

|| $H_{0}$ is true | $H_{0}$ is false | 
|---|:-:|:-:|
|Accept $H_{0}$|Correct|  $\color{red}{\text{Type II error}}$ |
|Reject $H_{0}$|$\color{red}{\text{Type I error}}$|Correct|

The **size $\alpha$** of a hypothesis test is the probability of Type $I$ error:

$$\alpha = P(\text{ Type I error }) = P(\text{ Reject } H_{0} \mid H_{0} \text{ is true })$$

A test is said to have **significance level** $\alpha$ if its size is less than or equal to $\alpha$.

Similarly, the probability of a Type $II$ error is defined as:

$$\beta = P(\text{ Type II error }) = P(\text{ Accept } H_{0} \mid H_{0} \text{ is false })$$

The **power** of a hypothesis test is:

$$power = P(\text{ Reject } H_{0} \mid H_{0} \text{ is false }) = 1-\beta$$
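To make these definitions concrete, here is a minimal sketch that computes the size and power of one possible test of the fair-coin hypothesis exactly from the binomial distribution. The rejection rule (reject when the number of heads is at least `margin` away from `flips / 2`), the values `flips=20` and `margin=5`, and the alternative `p = 0.7` are all assumptions made for illustration; they do not come from the note.

```python
from math import comb


def rejection_probability(p: float, flips: int = 20, margin: int = 5) -> float:
    """P(reject H0) when the true probability of heads is p.

    Hypothetical rule: reject H0: p = 0.5 when the number of heads is at
    least `margin` away from flips / 2.
    """
    centre = flips / 2
    total = 0.0
    for heads in range(flips + 1):
        if abs(heads - centre) >= margin:
            # Binomial probability of seeing exactly `heads` heads.
            total += comb(flips, heads) * p ** heads * (1 - p) ** (flips - heads)
    return total


alpha = rejection_probability(0.5)  # size: probability of rejecting a true H0
power = rejection_probability(0.7)  # power against the alternative p = 0.7
beta = 1 - power                    # Type II error probability at p = 0.7
print(f"size={alpha:.4f}  power={power:.4f}  beta={beta:.4f}")
```

Because the alternative $p \neq 0.5$ is composite, the power is a function of the true $p$; the value computed here applies only to the single alternative $p = 0.7$.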

According to the Neyman-Pearson theorem, any critical region $C$ of the form:

$$C = \{(X_1, X_2,..., X_n) \mid \frac{L(\theta_{0}, X_1, X_2,..., X_n)}{L(\theta_{a}, X_1, X_2,..., X_n)}\leq k\}$$

is best (or **uniformly most powerful**) of its size for testing:

$$ H_0: \theta = \theta_{0} \qquad  H_a:\theta = \theta_{a}$$

#### Example

Let $X_1, X_2,..., X_{12}$ be a random sample (i.i.d.) from a normal distribution with mean 0 and variance $\sigma^2$.
What is the most powerful test of significance level (size) 0.025 for testing:

$$ H_0: \sigma^2 = 10 \qquad  H_a: \sigma^2 = 5$$ 


The critical region for the uniformly most powerful test is:
$$
\begin{align}
C &= \{(X_1, X_2,..., X_{12}) \mid \frac{L(\sigma_{0}^2, X_1, X_2,..., X_{12})}{L(\sigma_{a}^2, X_1, X_2,..., X_{12})}\leq k\} \\
& = \{(X_1, X_2,..., X_{12}) \mid \frac{\prod_{i=1}^{12} \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp(\frac{x_i^2}{-2\sigma_0^2})}{\prod_{i=1}^{12} \frac{1}{\sqrt{2\pi\sigma_a^2}}\exp(\frac{x_i^2}{-2\sigma_a^2})}\leq k\} \\
& = \{(X_1, X_2,..., X_{12}) \mid (\frac{\sigma_a^2}{\sigma_0^2})^{6} \exp{(\frac{1}{2}(\frac{1}{\sigma_a^2}-\frac{1}{\sigma_0^2})\Sigma_{i=1}^{12} x_i^2)} \leq k\} \\
& = \{(X_1, X_2,..., X_{12}) \mid (\frac{1}{2})^6 \exp{(\frac{1}{20}\Sigma_{i=1}^{12} x_i^2)} \leq k\} \\
& = \{(X_1, X_2,..., X_{12}) \mid \Sigma_{i=1}^{12} x_i^2 \leq a\} \;  \text{where } a = 20\ln(64k)
\end{align} 
$$

Because $X_i \sim N(0, \sigma^2)$, we have $(\frac{X_i}{\sigma}) \sim N(0,1)$, thus $\Sigma_{i=1}^{12} (\frac{X_i}{\sigma})^2 \sim \chi^2(12)$.

The size of the test:
$$
\begin{align}
\alpha &= P(\text{ Reject } H_{0} \mid H_{0} \text{ is true }) \\
&= P(\Sigma_{i=1}^{12} X_i^2 \leq a \mid \sigma^2=10) \\
&= P(\Sigma_{i=1}^{12} (\frac{X_i}{\sigma})^2 \leq \frac{a}{\sigma^2} \mid \sigma^2=10) \\
&= P(\Sigma_{i=1}^{12} (\frac{X_i}{\sigma})^2 \leq \frac{a}{10}) \\
&= P(\chi^2(12) \leq \frac{a}{10}) \\
&= 0.025
\end{align} 
$$

From the chi-square table, we find $\frac{a}{10} \approx 4.4$, so $a \approx 44$. For this test, if the sum of the squared sample values is at most 44, we reject the null hypothesis and accept the alternative.
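The table lookup can be checked numerically. The sketch below uses only the standard library and relies on the closed form of the chi-square CDF for an even number of degrees of freedom, inverted by bisection. The helper names are hypothetical, and the power calculation at the end is an addition for illustration that uses the definition of power given earlier.

```python
from math import exp


def chi2_cdf_even(x: float, dof: int) -> float:
    """CDF of a chi-square variable with an even number of degrees of freedom.

    Closed form: 1 - exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    if x <= 0:
        return 0.0
    half = x / 2
    term, total = 1.0, 1.0  # j = 0 term
    for j in range(1, dof // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return 1.0 - exp(-half) * total


def chi2_quantile_even(prob: float, dof: int, upper: float = 200.0) -> float:
    """Invert the CDF by bisection on [0, upper]."""
    lo, hi = 0.0, upper
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, dof) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


sigma0_sq, sigma_a_sq, alpha, n = 10.0, 5.0, 0.025, 12
a = sigma0_sq * chi2_quantile_even(alpha, n)   # critical value for sum of squares
power = chi2_cdf_even(a / sigma_a_sq, n)       # P(reject H0 | sigma^2 = 5)
print(f"a = {a:.2f}, power = {power:.3f}")
```

The computed critical value should agree with the table value of about 44 up to rounding.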

**So the procedure of a hypothesis test is:**

1. State the hypotheses 
2. Define the test statistic and determine the significance level $\alpha$
3. Calculate the value of test statistic
4. Interpret results (and see whether test statistic is inside critical region)
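The four steps can be sketched on the variance example above. The observations below are hypothetical values invented purely for illustration, and the critical value 44 is the one obtained from the chi-square table in that example.

```python
# Step 1: hypotheses H0: sigma^2 = 10 versus Ha: sigma^2 = 5.
# Step 2: the test statistic is the sum of squares; at significance level
#         0.025 the critical region is {sum of squares <= 44}.
CRITICAL_VALUE = 44.0

# Hypothetical sample of 12 observations (not real data).
sample = [1.2, -0.8, 2.5, -3.1, 0.4, 1.9, -2.2, 0.7, -1.5, 3.0, -0.3, 1.1]


def sum_of_squares(values):
    """Test statistic for the variance example."""
    return sum(v * v for v in values)


# Step 3: calculate the value of the test statistic.
statistic = sum_of_squares(sample)

# Step 4: check whether the statistic falls inside the critical region.
reject_h0 = statistic <= CRITICAL_VALUE
print(f"statistic = {statistic:.2f}, reject H0: {reject_h0}")
```

For this made-up sample the sum of squares is below the critical value, so the test rejects $H_0$ in favour of $H_a$.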