# Statistical Tests

**Definition**. Let $x_1,x_2\in\mathbb{R}^n$. The notation $x_1\neq x_2$ means that there exists an index $i\in\mathbb{N}_{\leq n}$ such that $x_{1i}\neq x_{2i}$.

Let $E=\text{span}\{x\}$. Every hypothesis $H$ consists of an assumption about the probability density [2]
$$ f(x;\lambda)\;, $$
where $\lambda\in\mathbb{R}^n$.

The hypothesis $H_0$ is said to be the null hypothesis if, for a given $\lambda_0\in\mathbb{R}^n$,
$$ H_0(\lambda=\lambda_0)\;. $$
The alternative hypothesis is formulated as
$$ H_1(\lambda\neq\lambda_0)\;. $$

Since the null hypothesis makes a statement about the probability density in the sample space, it also predicts the probability of observing a point $X$. The critical region $S_c$ with significance level $\alpha$ satisfies
$$ P(X\in S_c|H_0)=\alpha\;. $$

In other words, we determine $S_c$ such that the probability of observing a point $X\in E$ within $S_c$ is $\alpha$, under the assumption that $H_0$ is true. If the point $X$ from the sample actually falls into the region $S_c$, then the hypothesis $H_0$ is rejected. Note that the above equation does not define the critical region $S_c$ uniquely.
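To illustrate the non-uniqueness, the following minimal sketch assumes, purely for illustration, that under $H_0$ the observation $X$ is a single standard normal variable. It builds three different regions that all have probability $\alpha$ under $H_0$. The names `h0`, `alpha`, and `probabilities` are hypothetical and not part of the text above.

```python
from statistics import NormalDist

# Assumption for illustration: under H0, X is a single standard normal variable.
h0 = NormalDist(mu=0.0, sigma=1.0)
alpha = 0.05

# Boundaries of three candidate critical regions.
c_two = h0.inv_cdf(1 - alpha / 2)  # two-sided region: |X| > c_two
c_up = h0.inv_cdf(1 - alpha)       # upper-tail region: X > c_up
c_lo = h0.inv_cdf(alpha)           # lower-tail region: X < c_lo

# Probability of each region under H0.
probabilities = {
    "two-sided": h0.cdf(-c_two) + (1.0 - h0.cdf(c_two)),
    "upper tail": 1.0 - h0.cdf(c_up),
    "lower tail": h0.cdf(c_lo),
}

for name, p in probabilities.items():
    print(f"{name:10s}: P(X in S_c | H0) = {p:.4f}")
```

All three regions meet the condition on $\alpha$, so the significance level alone cannot single out one of them.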

In practice, the set $E$ is not available due to the lack of knowledge of the population. Instead one constructs a test statistic
$$ T(X) $$
and determines a region $U$ of the variable $T$ such that it corresponds to the critical region $S_c$, i.e.,
$$ X\mapsto T(X)\;,\qquad S_c\mapsto U\;. $$

The null hypothesis is rejected whenever $T\in U$.

# Errors

Because of the statistical nature of the sample, it is possible that the null hypothesis is true even though it was rejected because $X \in S_c$. The probability of such an error, called an error of the first kind, is equal to $\alpha$.

There is a second way to make a wrong decision: one does not reject the hypothesis $H_0$ because $X$ was not in the critical region $S_c$, even though $H_0$ was actually false and an alternative hypothesis was true. This is an error of the second kind. Its probability,
$$P(X\notin S_c|H_1)=\beta\;,$$
depends on the alternative hypothesis $H_1$.

This connection with the alternative hypothesis $H_1$ provides us with a method to specify the critical region $S_c$. A test is clearly most reasonable if for a given significance level $\alpha$ the critical region is chosen such that the probability $\beta$ for an error of the second kind is a minimum. The critical region and therefore the test itself naturally depend on the alternative hypothesis under consideration.
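The dependence of $\beta$ on the alternative can be sketched numerically. The example below assumes, for illustration only, a single observation $X\sim N(\lambda,1)$ with $\lambda_0=0$ and $\alpha=0.05$, and compares two candidate critical regions against two specific alternatives $\lambda_1=\pm 1$. The function names and the chosen values are hypothetical.

```python
from statistics import NormalDist

# Assumptions for illustration: a single observation X ~ N(lambda, 1),
# null hypothesis lambda_0 = 0, significance level alpha = 0.05.
std = NormalDist()
alpha = 0.05
c_two = std.inv_cdf(1 - alpha / 2)  # two-sided region: |X| > c_two
c_up = std.inv_cdf(1 - alpha)       # upper-tail region: X > c_up

def beta_two_sided(lam1):
    """P(X not in S_c | lambda = lam1) for S_c = {|X| > c_two}."""
    shifted = NormalDist(mu=lam1, sigma=1.0)
    return shifted.cdf(c_two) - shifted.cdf(-c_two)

def beta_upper(lam1):
    """P(X not in S_c | lambda = lam1) for S_c = {X > c_up}."""
    return NormalDist(mu=lam1, sigma=1.0).cdf(c_up)

for lam1 in (1.0, -1.0):
    print(f"lambda_1 = {lam1:+.1f}: "
          f"beta two-sided = {beta_two_sided(lam1):.3f}, "
          f"beta upper tail = {beta_upper(lam1):.3f}")
```

Under these assumptions the upper-tail region gives the smaller $\beta$ when the alternative lies above $\lambda_0$ and the larger $\beta$ when it lies below, although both regions have the same $\alpha$.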

Once the critical region has been determined, we can consider the probability for rejecting the null hypothesis as a function of the "true" hypothesis, or rather as a function of the parameters that describe it:
$$ M(S_c,\lambda)=P(X\in S_c|H)=P(X\in S_c|\lambda)\;. $$
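As a rough sketch of how $M(S_c,\lambda)$ behaves, the code below evaluates it for one specific setting that is assumed here and not taken from the text: $n$ independent normal observations with known $\sigma$, the statistic $T=(\bar{X}-\lambda_0)/(\sigma/\sqrt{n})$, and the two-sided region $|T|>c$. The function name `rejection_probability` and the default values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def rejection_probability(lam, lam0=0.0, sigma=1.0, n=10, alpha=0.05):
    """M(S_c, lambda) for the two-sided region |T| > c,
    with T = (sample mean - lam0) / (sigma / sqrt(n))."""
    std = NormalDist()
    c = std.inv_cdf(1 - alpha / 2)
    # If the true parameter is lam, T is normal with unit variance and this mean.
    shift = (lam - lam0) * sqrt(n) / sigma
    # P(T < -c) + P(T > c) under the shifted distribution.
    return std.cdf(-c - shift) + (1.0 - std.cdf(c - shift))

for lam in (0.0, 0.25, 0.5, 1.0):
    print(f"lambda = {lam:.2f}: M = {rejection_probability(lam):.4f}")
```

At $\lambda=\lambda_0$ the function equals $\alpha$ by construction, and in this setting it grows as $\lambda$ moves away from $\lambda_0$.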

## Example

Test of the hypothesis that a normal distribution with given variance $\sigma^2$ has the mean $\lambda=\lambda_0$.
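A minimal sketch of such a test is given below. It assumes the statistic $T=(\bar{X}-\lambda_0)/(\sigma/\sqrt{n})$, which is standard normal under $H_0$, and a two-sided region $U=\{|T|>c\}$. Both choices, the function name `z_test`, and the sample values are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_test(sample, lam0, sigma, alpha=0.05):
    """Two-sided test of H0: mean = lam0 for normal data with known sigma.

    Returns the test statistic T, the critical value c, and whether
    H0 is rejected (|T| > c).
    """
    n = len(sample)
    t = (fmean(sample) - lam0) / (sigma / sqrt(n))
    c = NormalDist().inv_cdf(1 - alpha / 2)
    return t, c, abs(t) > c

# Hypothetical measurements, not taken from any real data set.
sample = [5.1, 4.9, 5.3, 5.2, 5.0]

for lam0 in (5.0, 4.5):
    t, c, reject = z_test(sample, lam0=lam0, sigma=0.2)
    print(f"lambda_0 = {lam0}: T = {t:.3f}, c = {c:.3f}, reject H0: {reject}")
```

The one-sided variants follow by replacing the condition `abs(t) > c` with a comparison against a single tail and using `1 - alpha` in place of `1 - alpha / 2`.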

# Import

## Modules

# References

[1] M. Bonamente, "Statistics and Analysis of Scientific Data", Springer, 2017

[2] S. Brandt, "Data Analysis", Springer, 2014

[3] L.-G. Johansson, "Philosophy of Science for Scientists", Springer, 2016