# <span style="color:#54B1FF">Hypothesis Tests:</span> &nbsp; <span style="color:#1B3EA9"><b>Overview</b></span>

<br>

This lesson considers [classical hypothesis testing](https://en.wikipedia.org/wiki/Statistical_hypothesis_testing), a family of procedures that test a [null hypothesis](https://en.wikipedia.org/wiki/Null_hypothesis).

<br>
<br>

A null hypothesis ($H_0$) is a prediction of an experimental result, and it is always a statement of **equality**:

<center>$H_0: \ x = c$</center>

where $x$ is a population parameter (for example, the mean of a random variable), and $c$ is a constant value.

<br>
<br>

For most null hypotheses, $c = 0$, meaning "no effect" or "no difference". This is one reason it is called a "null hypothesis".

<br>
<br>

Alternative hypotheses ($H_1$) are statements of **inequality**.  Example alternative hypotheses are:

<center>$H_1: \ x > c$</center>

<center>$H_1: \ x < c$</center>

<center>$H_1: \ x \ne c$</center>

<br>
<br>

The most important difference between null and alternative hypotheses is that <font color="red">only null hypotheses are testable.</font> This is why "*hypothesis testing*" is sometimes called "*null hypothesis testing*".

For most hypothesis tests, the final result is a probability value, or "p value".

The null hypothesis is rejected when $p < \alpha$, where $\alpha$ is the pre-specified [Type I error rate](https://en.wikipedia.org/wiki/Type_I_and_type_II_errors).

By convention, $\alpha = 0.05$.

Rejecting the null hypothesis in this way is sometimes called "statistical significance". Its probabilistic meaning will be discussed in the next lesson.
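
As a minimal sketch of the decision rule above, the following example computes a p value and compares it to $\alpha$. It assumes that `numpy` and `scipy` are installed; the data values, the constant `c` and the variable names are made up for illustration and do not come from a real experiment.

```python
import numpy as np
from scipy import stats

# Hypothetical data: eight measurements (made-up values, illustration only)
y = np.array([5.1, 4.8, 5.6, 5.3, 4.9, 5.7, 5.4, 5.2])

c = 5.0        # constant in the null hypothesis H0: mean = c
alpha = 0.05   # pre-specified Type I error rate

# One-sample t test: returns the test statistic and the two-sided p value
t, p = stats.ttest_1samp(y, c)

# Decision rule: reject H0 only when p is smaller than alpha
reject_h0 = p < alpha
print(f"t = {t:.3f}, p = {p:.3f}, reject H0: {reject_h0}")
```

Note that `alpha` is set before the test is run; only `p` depends on the data.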



<br>

**Goals**:
* To learn the most common hypothesis testing procedures, and how they are related.
* To learn how to conduct these tests in Python.

In the next lesson, we will show that hypothesis testing is very closely related to the material of the previous lesson (Probability).

___

## Types of hypothesis tests

The following five types of hypothesis tests are considered in this notebook:

* One-sample t test
* Paired t test
* Two-sample t test
* Regression
* One-way ANOVA

These tests are very commonly reported in many different scientific fields.

While these tests have different names, they are all very closely related.
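
As a small illustration of how closely these tests are related, the sketch below checks two well-known equivalences numerically: a paired t test is a one-sample t test on the pairwise differences, and a two-sample t test (with equal variances assumed) gives the same p value as a one-way ANOVA with two groups, with $F = t^2$. It assumes `numpy` and `scipy` are installed, and the data are made-up values. The same data are treated as paired in one comparison and as independent in the other purely for demonstration.

```python
import numpy as np
from scipy import stats

# Hypothetical data (made-up values, illustration only)
a = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.5])
b = np.array([11.2, 11.0, 12.1, 12.9, 11.1, 11.8])

# Paired t test vs. one-sample t test on the pairwise differences
t_paired, p_paired = stats.ttest_rel(a, b)
t_diff, p_diff = stats.ttest_1samp(a - b, 0)

# Two-sample t test (equal variances assumed) vs. one-way ANOVA with two groups
t_two, p_two = stats.ttest_ind(a, b)
F, p_anova = stats.f_oneway(a, b)

# Each comparison checks that the two procedures agree numerically
print(np.isclose(t_paired, t_diff), np.isclose(p_paired, p_diff))
print(np.isclose(t_two**2, F), np.isclose(p_two, p_anova))
```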

<br>
<br>

In particular:  **all classical hypothesis tests are mathematical consequences of the** [<b>Normal distribution</b>](https://en.wikipedia.org/wiki/Normal_distribution).

<br>
<br>

The mathematics behind this statement can be difficult, but the underlying concept is relatively easy to understand. The easiest path is to conduct the tests first and understand their basics, and only later to consider how they relate to the Normal distribution.

This lesson thus focusses on the tests themselves. The next lesson (Simulating Experiments) will consider more deeply how the tests are conceptually connected to the Normal distribution.

<br>
<br>

For now, it is sufficient to understand that these five tests are just special names for specific cases of a single [independent variable](https://en.wikipedia.org/wiki/Dependent_and_independent_variables) (IV) and a single [dependent variable](https://en.wikipedia.org/wiki/Dependent_and_independent_variables) (DV). These cases are summarized in the following table:

| IV type        | Number of IV values | Type of DV | Hypothesis test name  |
| :------------- |:-------------:| -----:| -----:|
| Categorical   | 1 | Scalar | One-sample t test |
| Categorical   | 1 | Paired difference (scalar) | Paired t test |
| Categorical   | 2 | Scalar | Two-sample t test |
| **Continuous**    | $n$ | Scalar | Regression |
| Categorical   | $g$ | Scalar | One-way ANOVA |
| Categorical   | $g$ | Paired difference (scalar) | One-way repeated-measures ANOVA |

where:

* $n$ = sample size
* $g$ = number of groups
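
The five tests considered in this notebook can each be run with a single function call. The sketch below is one possible way to do this, assuming `numpy` and `scipy` are installed and using the `scipy.stats` functions `ttest_1samp`, `ttest_rel`, `ttest_ind`, `linregress` and `f_oneway`. The data are randomly generated for illustration, and the variable names (`y0`, `y1`, `y2`, `x`, `results`) are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # seeded so the example is repeatable

# Hypothetical data drawn from Normal distributions (illustration only)
y0 = rng.normal(loc=0.0, scale=1.0, size=10)
y1 = rng.normal(loc=0.5, scale=1.0, size=10)
y2 = rng.normal(loc=1.0, scale=1.0, size=10)
x = np.linspace(0, 1, 10)        # continuous IV for regression

# One p value per test, keyed by the test names used in the table
results = {
    "One-sample t test": stats.ttest_1samp(y0, 0).pvalue,     # H0: mean of y0 = 0
    "Paired t test":     stats.ttest_rel(y0, y1).pvalue,      # H0: mean difference = 0
    "Two-sample t test": stats.ttest_ind(y0, y1).pvalue,      # H0: group means are equal
    "Regression":        stats.linregress(x, y0).pvalue,      # H0: slope = 0
    "One-way ANOVA":     stats.f_oneway(y0, y1, y2).pvalue,   # H0: all group means are equal
}

for name, p in results.items():
    print(f"{name:18s} p = {p:.3f}")
```

In every case the null hypothesis is a statement of equality, and the function returns a p value that can be compared to $\alpha$.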

<br>
<br>

Note that this notebook does not consider repeated-measures ANOVA (the last row of the table), which is also a very common procedure. This test is possible in Python, but it requires a separate Python package called [statsmodels](https://www.statsmodels.org/stable/index.html). If you are interested in trying one-way repeated-measures ANOVA, please read [this blog](http://www.pybloggers.com/2018/10/repeated-measures-anova-in-python-using-statsmodels/).
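
For readers who want a starting point, the following is a minimal sketch of a one-way repeated-measures ANOVA using the `AnovaRM` class from `statsmodels`. It assumes `numpy`, `pandas` and `statsmodels` are installed. The design (8 subjects, 3 conditions), the column names and the randomly generated data are all hypothetical and are not taken from the linked blog.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)   # seeded so the example is repeatable

n_subjects, g = 8, 3             # hypothetical design: 8 subjects, 3 conditions

# Long-format table: one row per (subject, condition) observation
df = pd.DataFrame({
    "subject":   np.repeat(np.arange(n_subjects), g),
    "condition": np.tile(["A", "B", "C"], n_subjects),
    "y":         rng.normal(loc=0.0, scale=1.0, size=n_subjects * g),
})

# One within-subject factor ("condition"), with each subject measured once per condition
fit = AnovaRM(data=df, depvar="y", subject="subject", within=["condition"]).fit()
print(fit.anova_table)
```

The printed table reports the F statistic, its degrees of freedom and the p value for the within-subject factor.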