
## Frequentist vs. Bayesian Statistics and the t-test

Week 2 | 4.2

---


## "Bayesian" vs. "Frequentist"

"Bayesian" statistics has been getting a lot of hype in recent years (with good reason), but it is easy to forget that both types of statistics are correct; they simply focus on different things.

We will go deeper into Bayesian stats later in the course; for now we will explore the differences between the two approaches to probability.

The overarching goal of both approaches is the same:

---

**We want to make a statement about ALL data points (the population) based on a SAMPLE of data points, and describe our UNCERTAINTY about that statement.**

---

### Estimating a mean value

Say we want to measure the **mean height of male professional athletes**. We measure the height of 100 different athletes and thus have 100 data points in our sample.

Our scenario:

- We want to make a statement about the mean height of **ALL** male professional athletes.
- We only have a **sample** of 100 measured heights from that **population**.

### Estimating the mean height from a Bayesian approach...

As a **Bayesian**, my approach will be to make a statement about the probability of the mean height given the data I have:

- The mean height of all male professional athletes is indeed a single value, but I only have a subset of the measurements.
- I have a belief about how tall these athletes are.
- I am going to make a statement about the probability that the mean height takes a particular value, given the information I have.

- I have 100 observations, or data points, that I will use to **update** my "prior" belief about the heights.

- I have collected **fixed data** which I use to update my inference of the probability for the mean height, which is called my **posterior distribution** of mean heights.
- Thus, there is a **distribution of values for the true mean height with varying probability.**

### Estimating the mean height from a Frequentist approach...

As a **Frequentist** I believe:

- The mean height of male professional athletes is an unknown but **fixed, "true" value**.

- My 100 data points are a **random sample.** That is to say, I have collected **at random** 100 heights from the **population pool**.

- This random sampling procedure is considered **infinitely repeatable**. My inferences about height are based on the idea that this sample is just one of an infinite number of hypothetical samples from the population.

- Our **sampled data is random**, but the **true value of height is fixed** across all hypothetical samples.
- There is **a distribution of possible samples given the true fixed value**.
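The idea of a "distribution of possible samples" can be sketched with a small simulation. The population values below (`true_mean_cm`, `true_sd_cm`) are hypothetical numbers chosen for illustration; in practice the true mean is exactly what we do not know.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical fixed population parameters (assumed for illustration only)
true_mean_cm = 190.0
true_sd_cm = 8.0
n_athletes = 100      # size of each sample, as in the example
n_repeats = 10_000    # number of hypothetical repeated samples

# Each row is one hypothetical random sample of 100 heights
samples = rng.normal(true_mean_cm, true_sd_cm, size=(n_repeats, n_athletes))
sample_means = samples.mean(axis=1)

print("fixed true mean:            ", true_mean_cm)
print("average of the sample means:", sample_means.mean())
print("spread of the sample means: ", sample_means.std(ddof=1))
print("theoretical standard error: ", true_sd_cm / np.sqrt(n_athletes))
```

The true mean never changes between repeats; only the sampled data does, and the sample means scatter around the fixed value.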

### The inverse approaches

**FREQUENTISTS** ask:

### $$P(\text{data}\;|\;\text{true mean})$$

What is the probability of our data given a true and fixed population mean?

---

**BAYESIANS** ask:

### $$P(\text{true mean}\;|\;\text{data})$$

What is the probability of the true mean given the data that we have?



### Pros and cons to both

**Bayesian methods**:

- Pros: Inference on measure is more intuitive. No "absurd" results. Does not require the hypothetical "infinite sampling".
- Cons: Computationally intensive. Does not "guarantee" success rate of experiments. Requires prior belief.

---

**Frequentist methods**:

- Pros: Requires no justification of prior belief. Direct analogy to experimental design theory. Not as computationally expensive.
- Cons: Inference not as intuitive. Requires "asymptotic" sampling axioms. Allows "absurd" results (it is ok if some of the "experiments" are nonsense as long as most are correct).

## Hypothesis testing with frequentist methods

**Frequentist** methods lend themselves well to the idea of experimental design. For example, say we are testing a new drug:

- We randomly select 50 people to be in the placebo control condition and 50 people to receive the treatment.
- Our sample is selected from the broader, unknown population pool.
- In a parallel world we could have ended up with any other random sample of 100 people from the population pool.


### Steps:

1. Form a null and alternative hypothesis
2. Select a significance level, alpha (usually 5%)
3. Select a statistical test
4. Calculate the appropriate statistic 
5. Compare the p-value with alpha
6. If the p-value > alpha, then we fail to reject the null hypothesis
7. If the p-value <= alpha, then we reject the null hypothesis in favor of the alternative hypothesis

### The "null hypothesis"

The **null hypothesis** is a fundamental concept for Frequentist statistical tests. We can define this as **H0**. 

The null hypothesis is, in this example, the hypothesis that there is no difference between placebo and treatment.

**H0:** The measured difference is equal to zero.

The **alternative hypothesis** is the other possible outcome of the experiment: the difference between the placebo and the treatment is real.

**H1:** The measured difference is not zero (greater than or less than zero for a two-tailed test, and one or the other for a one-tailed test).

### The p-value

The **p-value** is the probability that, **GIVEN THE NULL HYPOTHESIS IS TRUE**, we would have sampled data at least as extreme as the data we actually observed.

---

Say in our experiment we follow-up with the experimental and control groups:

- 5 out of 50 patients in the control group indicate that their symptoms are better
- 20 out of 50 patients in the experimental group indicate that their symptoms are better

The **p-value** would be the **probability of measuring a recovery rate at least this much higher than placebo in our experiment (40% vs. 10%, or 400% of the placebo rate) given that in fact there is no difference in recovery rate.**
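To make "given that there is no difference" concrete, here is a minimal sketch that estimates this probability with a permutation test. Note that this is a different technique from the t-test used in the rest of this lesson; it is shown only because it follows the definition directly. The array names and the number of shuffles are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1 = symptoms got better, 0 = symptoms did not get better
control = np.concatenate([np.ones(5), np.zeros(45)])
experimental = np.concatenate([np.ones(20), np.zeros(30)])
observed_diff = experimental.mean() - control.mean()  # 0.4 - 0.1

# If the null hypothesis is true, group labels are arbitrary, so shuffle them
pooled = np.concatenate([control, experimental])
n_perm = 20_000
perm_diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[50:].mean() - shuffled[:50].mean()

# Two-tailed: fraction of shuffles at least as extreme as what we observed
# (the small tolerance guards against floating point round-off)
p_value = np.mean(np.abs(perm_diffs) >= observed_diff - 1e-12)

print("observed difference in recovery rate:", observed_diff)
print("estimated p-value:", p_value)
```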

### Calculating the probability of the data under the null hypothesis: the t-test

Recall that as Frequentists we want to know:

### $$P(\text{data}\;|\;\text{true mean})$$

We obviously don't know the true mean difference in rate of recovery. Instead, **we will assume that the true mean difference is zero: the null hypothesis H0:**

### $$P(\text{data}\;|\;\text{true mean}=0)$$

This is known as the **likelihood**.

In [2]:
import numpy as np
import scipy.stats as stats
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style('whitegrid')

%config InlineBackend.figure_format = 'retina'
%matplotlib inline


In [6]:
# Create the experimental and control group results

# Control: 5 people got better, 45 did not:
# concatenate two arrays: one has zeros for people that didn't get better, the other has ones

# Experimental: 20 people got better, 30 did not:
# concatenate two arrays: one has zeros for people that didn't get better, the other has ones

# calculate the difference between their means in recovery rates:
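One possible way to fill in the cell above is sketched here. The variable names `control`, `experimental` and `mean_difference` are choices made for this example, not names the lesson prescribes.

```python
import numpy as np

# Control: 45 zeros (did not get better) followed by 5 ones (got better)
control = np.concatenate([np.zeros(45), np.ones(5)])

# Experimental: 30 zeros (did not get better) followed by 20 ones (got better)
experimental = np.concatenate([np.zeros(30), np.ones(20)])

# The mean of a 0/1 array is the recovery rate for that group
mean_difference = experimental.mean() - control.mean()

print("control recovery rate:     ", control.mean())
print("experimental recovery rate:", experimental.mean())
print("difference in rates:       ", mean_difference)
```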

### t-tests: calculating the t-statistic

---

How do we calculate the p-value, or significance of our experimental results? For comparing two means (as is the case in this example: the mean difference in symptoms between conditions) we can use the **t-test**, which uses **t-statistics**.

The p-value will be a conversion of the two-sample t-test **t-statistic**, which we calculate first:

### $$t = \frac{\text{mean}(\text{sample}_E) - \text{mean}(\text{sample}_C)}{\sqrt{\frac{\text{var}(\text{sample}_E)}{n_E} + \frac{\text{var}(\text{sample}_C)}{n_C}}}$$

**Calculate the t-statistic for the difference in rates:**

[For a gentle overview, see http://blog.minitab.com/blog/statistics-and-quality-data-analysis/what-are-t-values-and-p-values-in-statistics]


In [None]:
# Create the experimental and control group results

# Control: 5 people got better, 45 did not:
# concatenate two arrays: one has zeros for people that didn't get better, the other has ones

# Experimental: 20 people got better, 30 did not:
# concatenate two arrays: one has zeros for people that didn't get better, the other has ones

# calculate the difference between their means in recovery rates:

# calculate the t-statistic 

- The numerator: the difference between the mean of the experimental sample and the mean of the control sample. Under the null hypothesis H0 the expected difference is zero, so this is the **difference in means** measured against zero.

- The denominator: the square root of the sum of each sample's variance divided by its sample size. This is the **standard error of the difference in means**.
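The numerator and denominator can be computed by hand as a check on the formula. This is a sketch that assumes the sample variance is taken with `ddof=1` (dividing by n - 1); the variable names are choices made for this example.

```python
import numpy as np

# 1 = symptoms got better, 0 = symptoms did not get better
control = np.concatenate([np.zeros(45), np.ones(5)])
experimental = np.concatenate([np.zeros(30), np.ones(20)])

# Numerator: difference between the two sample means
mean_difference = experimental.mean() - control.mean()

# Denominator: standard error of the difference in means
# (ddof=1 gives the sample variance; this is an assumption of the sketch)
var_e = experimental.var(ddof=1)
var_c = control.var(ddof=1)
standard_error = np.sqrt(var_e / experimental.size + var_c / control.size)

t_stat = mean_difference / standard_error
print("difference in means:", mean_difference)
print("standard error:     ", standard_error)
print("t-statistic:        ", t_stat)
```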

### Plotting the t-statistic

From the central limit theorem, we know that (with a large enough sample size) the distribution of sample means is approximately normal. In the case of smaller sample sizes, [we adjust this to be a more conservative Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution).

We can plot the Student's t-distribution centered on zero, which corresponds to our null hypothesis.

In [4]:
# generate points on the x axis between -4 and 4:


# use stats.t.pdf to get values on the probability density function for the t-distribution
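A sketch of this cell follows. The degrees of freedom are an assumption of the example: with 50 people per group, the equal-variance two-sample t-test uses 50 + 50 - 2 = 98. The number of x points (500) is also an arbitrary choice.

```python
import numpy as np
import scipy.stats as stats

# 500 evenly spaced points on the x axis between -4 and 4
xpoints = np.linspace(-4, 4, 500)

# Degrees of freedom for two groups of 50 (assumes the equal-variance t-test)
degrees_of_freedom = 50 + 50 - 2

# Height of the t-distribution's probability density function at each x point
ypoints = stats.t.pdf(xpoints, degrees_of_freedom)

print("peak density:", ypoints.max())
```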


In [7]:
# initialize a matplotlib "figure"


# plot the lines using matplotlib's plot function:


# plot a vertical line for our measured difference in rates t-statistic
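One way the plot might be built is sketched below. The block recomputes the t-statistic so that it runs on its own; the colors, figure size and labels are arbitrary choices, and the degrees of freedom (98) again assume the equal-variance test.

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

# Recreate the data and the t-statistic so this block is self-contained
control = np.concatenate([np.zeros(45), np.ones(5)])
experimental = np.concatenate([np.zeros(30), np.ones(20)])
standard_error = np.sqrt(experimental.var(ddof=1) / experimental.size
                         + control.var(ddof=1) / control.size)
t_stat = (experimental.mean() - control.mean()) / standard_error

# t-distribution centered on zero: what we expect if the null hypothesis is true
xpoints = np.linspace(-4, 4, 500)
ypoints = stats.t.pdf(xpoints, 98)

# Initialize a matplotlib figure and axes
fig, ax = plt.subplots(figsize=(8, 5))

# Plot the density curve, then a vertical line at the measured t-statistic
ax.plot(xpoints, ypoints, linewidth=3, color="darkred",
        label="t-distribution under H0 (df = 98)")
ax.axvline(t_stat, color="black", linestyle="--", linewidth=2,
           label="measured t-statistic")

ax.set_xlabel("t")
ax.set_ylabel("probability density")
ax.legend()
```

With `%matplotlib inline` the figure is displayed when the cell finishes; in a plain script you would call `plt.show()` afterwards.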


In [8]:
# run a t-test on the two samples
# use stats.ttest_ind
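A minimal sketch of this final cell, using `stats.ttest_ind` with its default settings (equal variances assumed, two-tailed). The value `alpha = 0.05` follows the usual 5% convention mentioned in the steps earlier; the variable names are choices made for this example.

```python
import numpy as np
import scipy.stats as stats

# 1 = symptoms got better, 0 = symptoms did not get better
control = np.concatenate([np.zeros(45), np.ones(5)])
experimental = np.concatenate([np.zeros(30), np.ones(20)])

# Two-sample t-test: returns the t-statistic and the two-tailed p-value
result = stats.ttest_ind(experimental, control)
t_stat, p_value = result.statistic, result.pvalue

print("t-statistic:", t_stat)
print("p-value:    ", p_value)

# Compare the p-value with the chosen significance level
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis in favor of the alternative.")
else:
    print("Fail to reject the null hypothesis.")
```

Because both groups have 50 people, the t-statistic here matches the hand calculation from the formula above.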