# Inferential Statistics (frequentist)

## Concepts covered in this lesson

1. Estimation and Estimators
2. Confidence intervals (quantifying sampling error)
3. Hypothesis testing

## Estimation and Estimators

Think of the following study:
- Research question: What's the average weight of people in the Los Angeles (LA) metro area?
- Sampling technique: Ask every third account on Instagram who posts mainly in the LA metro area.

Now, let's answer the following questions:
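1. What is estimation? Using a statistic computed from a sample to learn about an unknown population parameter.
2. What is an estimator? A rule (a function of the observable data) that produces an estimate of a parameter. For example, the sample mean is an estimator of the population mean.
3. What is estimator bias? The difference between an estimator's expected (long-run average) value and the true population parameter.
4. What is sampling error? The difference between a sample statistic and the population parameter that arises because we observe only a sample. It is present even when the sampling is truly random.
5. What is the difference between standard error and standard deviation? Standard deviation describes the spread of individual observations. Standard error is the standard deviation of an estimator's sampling distribution (for the mean, s / sqrt(n)), so it becomes smaller as sample size increases.
6. What is sampling bias? Systematic error from a sampling technique that selects groups that are not representative of the full population (here, only people with Instagram accounts).
7. What is measurement error? The difference between a recorded value and the true value, introduced during the data collection process (here, self-reported weight).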
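The difference between standard deviation and standard error in question 5 can be seen in a small simulation. This is only a sketch: the population mean and standard deviation (`MU`, `SIGMA`) are made-up values, not figures from the LA study.

```python
import numpy as np

rng = np.random.default_rng(0)
MU, SIGMA = 75.0, 12.0  # hypothetical population mean and sd of weights (kg)

results = {}
for n in (25, 100, 400):
    sample = rng.normal(MU, SIGMA, size=n)
    sd = sample.std(ddof=1)   # spread of individual observations
    se = sd / np.sqrt(n)      # estimated spread of the sample mean

    # Draw many samples of the same size and look at how much their means vary.
    repeated_means = rng.normal(MU, SIGMA, size=(2_000, n)).mean(axis=1)
    simulated_se = repeated_means.std(ddof=1)

    results[n] = (sd, se, simulated_se)
    print(f"n={n:4d}  sd={sd:6.2f}  se={se:5.2f}  simulated se={simulated_se:5.2f}")
```

The standard deviation stays near `SIGMA` regardless of `n`, while the standard error shrinks roughly in proportion to 1 / sqrt(n).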

## Confidence intervals (quantifying sampling error)

Let's go back to the weight study above. Say that we will begin collecting our data.

1. How do we know when to stop?
2. How do we quantify the uncertainty in the estimate from the data we have collected so far?
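One common way to approach both questions is through the margin of error of a confidence interval for the mean. The sketch below assumes a normal approximation and a rough guess at the population standard deviation; the helper names `margin_of_error` and `required_sample_size` and the numbers used are illustrative only.

```python
import numpy as np
import scipy.stats


def margin_of_error(sd: float, n: int, ci_level: float = 0.95) -> float:
    """Half-width of a normal-approximation CI for a mean."""
    z = scipy.stats.norm.ppf(0.5 + ci_level / 2)
    return float(z * sd / np.sqrt(n))


def required_sample_size(sd: float, target_margin: float, ci_level: float = 0.95) -> int:
    """Smallest n whose margin of error is at most target_margin."""
    z = scipy.stats.norm.ppf(0.5 + ci_level / 2)
    return int(np.ceil((z * sd / target_margin) ** 2))


# Hypothetical inputs: sd of 12 kg, and we want the mean to within +/- 1 kg.
n_needed = required_sample_size(sd=12.0, target_margin=1.0)
print(n_needed, margin_of_error(sd=12.0, n=n_needed))
```

Under these assumptions, data collection can stop once the margin of error computed from the data so far falls below the target.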

### Calculating CIs using Python

Study: `as_datasets/ExamScores.csv` (exam scores of a class over time)

Write a function that computes confidence intervals for a mean given a `pd.Series` of data, using the following signature.
```
def get_confidence_interval(dataset: pd.Series, ci_level: float) -> Tuple[float, float]:
```
Then, use your function to get the confidence interval for each column in `ExamScores.csv`.

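```python
import numpy as np
import scipy.stats
import pandas as pd
from typing import Tuple


def get_confidence_interval(dataset: pd.Series, ci_level: float = 0.95) -> Tuple[float, float]:
    """Return the (lower, upper) confidence interval for the mean of `dataset`."""
    data = dataset.dropna()  # mean() and std() skip NaN, so n must skip it too
    n = len(data)
    mean = data.mean()
    stdev = data.std()  # sample standard deviation (pandas defaults to ddof=1)
    stderr = stdev / np.sqrt(n)
    if n > 30:
        # Large sample: normal approximation
        return scipy.stats.norm.interval(ci_level, mean, stderr)
    else:
        # Small sample: t distribution with n - 1 degrees of freedom
        dof = n - 1
        return scipy.stats.t.interval(ci_level, dof, mean, stderr)
```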

```python
df_exam = pd.read_csv('../as_datasets/ExamScores.csv')

df_exam_cis = df_exam.apply(get_confidence_interval, axis=0)  # axis=0 applies the function to each column
df_exam_cis
```

| | Exam1 | Exam2 | Exam3 | Exam4 |
|---|---|---|---|---|
| 0 (lower bound) | 80.124504 | 75.427229 | 67.310128 | 74.266876 |
| 1 (upper bound) | 85.275496 | 83.372771 | 79.369872 | 78.733124 |


## Hypothesis testing

Continuing with the exam scores, **how do we know that the class _did better_ on the second exam than on the first exam?**

In other words, what is the **significance** of our test statistic?  

How do we determine that this is **statistically significant**?

When would **statistical significance** not be important **practically**?

### Choosing statistical tests
![statistical test table](testing_table.PNG)

### Errors in hypothesis testing
![confusion matrix with Type 1/2 errors](confusion_matrix.PNG)

### Mean-based testing

#### 1-sample t-test

File: `as_datasets/ExamScores.csv`

Research question: Is the class's mean score for Exam 2 different from the expected score of 86?
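A minimal sketch of this test with `scipy.stats.ttest_1samp`. The scores below are synthetic placeholders, not the contents of `ExamScores.csv`; in the lesson you would pass the `Exam2` column instead.

```python
import numpy as np
import scipy.stats

# Placeholder data: synthetic scores standing in for the Exam2 column.
rng = np.random.default_rng(1)
exam2 = rng.normal(loc=79.0, scale=10.0, size=25)

EXPECTED_SCORE = 86  # hypothesized mean from the research question

# H0: mean(Exam2) == 86   vs   H1: mean(Exam2) != 86 (two-sided)
t_stat, p_value = scipy.stats.ttest_1samp(exam2, popmean=EXPECTED_SCORE)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would lead us to reject the hypothesis that the mean score is 86.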

#### 2-sample unpaired t-test

File: `as_datasets/Memory.csv`

Research question: Does this memory enhancement drug actually reduce the number of memory-related tasks?
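A sketch of an unpaired test with `scipy.stats.ttest_ind`. The column layout of `Memory.csv` is not shown here, so the two arrays below are synthetic placeholders for an outcome measured in a drug group and a placebo group. Using Welch's version (`equal_var=False`) and a one-sided alternative are assumptions made for illustration.

```python
import numpy as np
import scipy.stats

# Placeholder data: synthetic outcomes for two independent groups.
rng = np.random.default_rng(2)
drug = rng.normal(loc=48.0, scale=9.0, size=40)
placebo = rng.normal(loc=52.0, scale=9.0, size=40)

# H0: mean(drug) == mean(placebo)   vs   H1: mean(drug) < mean(placebo)
# equal_var=False selects Welch's t-test, which does not assume equal variances.
t_stat, p_value = scipy.stats.ttest_ind(drug, placebo, equal_var=False, alternative="less")
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.4f}")
```

The two groups contain different individuals, which is why the unpaired test applies here.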

#### 2-sample paired t-test

File: `as_datasets/ExamScores.csv`

Research question: Did the class improve on the second exam?
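A sketch of the paired test with `scipy.stats.ttest_rel`. The scores are synthetic placeholders for the `Exam1` and `Exam2` columns; the key assumption is that row i in both arrays belongs to the same student.

```python
import numpy as np
import scipy.stats

# Placeholder data: each position is one student's score on both exams.
rng = np.random.default_rng(3)
exam1 = rng.normal(loc=82.0, scale=8.0, size=30)
exam2 = exam1 + rng.normal(loc=-3.0, scale=5.0, size=30)

# H0: mean(exam2 - exam1) == 0   vs   H1: mean(exam2 - exam1) > 0 (improvement)
t_stat, p_value = scipy.stats.ttest_rel(exam2, exam1, alternative="greater")
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.4f}")
```

The paired test is equivalent to a 1-sample t-test on the per-student differences, which removes between-student variation from the comparison.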

#### One-way ANOVA

File: `as_datasets/ExamScores.csv`

Research question: Do the exam scores truly have different means?
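A sketch of a one-way ANOVA with `scipy.stats.f_oneway`. The four arrays are synthetic placeholders for the exam columns. Note that this treats the exams as independent groups, which is a simplification if the same students took every exam.

```python
import numpy as np
import scipy.stats

# Placeholder data: synthetic scores for four exams.
rng = np.random.default_rng(4)
exam1 = rng.normal(loc=82.0, scale=8.0, size=30)
exam2 = rng.normal(loc=79.0, scale=8.0, size=30)
exam3 = rng.normal(loc=73.0, scale=8.0, size=30)
exam4 = rng.normal(loc=76.0, scale=8.0, size=30)

# H0: all four exam means are equal   vs   H1: at least one mean differs
f_stat, p_value = scipy.stats.f_oneway(exam1, exam2, exam3, exam4)
print(f"F = {f_stat:.3f}, p = {p_value:.4g}")
```

A small p-value says that at least one mean differs, but not which one.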

#### Linear statistical modeling with OLS

File: `as_datasets/ExamScores.csv`

Same research question as above: Do the exam scores truly have different means?
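The same question can be posed as a linear model with one dummy variable per exam. The sketch below fits OLS with plain NumPy (`np.linalg.lstsq`) rather than a modeling library, on synthetic placeholder scores, and computes the overall F-statistic by hand. It is meant to show that the model's F-test matches the one-way ANOVA, not to prescribe a particular tool.

```python
import numpy as np
import scipy.stats

# Placeholder data: synthetic scores for four exams.
rng = np.random.default_rng(4)
groups = {
    name: rng.normal(loc=mean, scale=8.0, size=30)
    for name, mean in [("Exam1", 82.0), ("Exam2", 79.0), ("Exam3", 73.0), ("Exam4", 76.0)]
}

y = np.concatenate(list(groups.values()))
labels = np.repeat(np.arange(len(groups)), [len(g) for g in groups.values()])

# Design matrix: intercept (Exam1 is the reference) plus one dummy per other exam.
X = np.column_stack(
    [np.ones_like(y)] + [(labels == k).astype(float) for k in range(1, len(groups))]
)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
ss_resid = float(residuals @ residuals)
ss_total = float(((y - y.mean()) ** 2).sum())

df_model = X.shape[1] - 1
df_resid = len(y) - X.shape[1]
f_stat = ((ss_total - ss_resid) / df_model) / (ss_resid / df_resid)
p_value = float(scipy.stats.f.sf(f_stat, df_model, df_resid))

print("coefficients:", np.round(beta, 2))
print(f"F = {f_stat:.3f}, p = {p_value:.4g}")
```

The intercept is the mean of the reference exam, and each remaining coefficient is that exam's mean difference from the reference.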

#### Multiple comparison with Tukey's HSD

File: `as_datasets/ExamScores.csv`

Research question: Which exam(s) did the class struggle with?
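A sketch of pairwise comparisons with `scipy.stats.tukey_hsd`, assuming SciPy 1.8 or later. The scores are synthetic placeholders for the four exam columns.

```python
import numpy as np
import scipy.stats

# Placeholder data: synthetic scores for four exams.
rng = np.random.default_rng(5)
exam1 = rng.normal(loc=82.0, scale=8.0, size=30)
exam2 = rng.normal(loc=79.0, scale=8.0, size=30)
exam3 = rng.normal(loc=73.0, scale=8.0, size=30)
exam4 = rng.normal(loc=76.0, scale=8.0, size=30)

# Compares every pair of groups while controlling the family-wise error rate.
tukey = scipy.stats.tukey_hsd(exam1, exam2, exam3, exam4)
print(tukey)  # table of pairwise mean differences, p-values, and CIs

# tukey.pvalue[i, j] is the adjusted p-value for group i vs group j (0-indexed).
pairwise_p = tukey.pvalue
```

Pairs with a small adjusted p-value are the ones whose means differ. Reading the sign of the mean difference tells you which exam in the pair had the lower scores.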

### Proportion-based testing

#### 1-sample z-test on a proportion

File: `as_datasets/ExamScores.csv`

Research question: Does sufficient evidence exist that the proportion of scores over 80 on Exam 1 differs from a hypothesized proportion?
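A sketch of a 1-sample z-test on a proportion, written directly from the normal approximation. The helper `one_sample_proportion_ztest` is a hypothetical name. The counts and the hypothesized proportion `p0` in the example call are made up, since the lesson does not state the value to test against.

```python
import numpy as np
import scipy.stats


def one_sample_proportion_ztest(successes: int, n: int, p0: float,
                                alternative: str = "two-sided") -> tuple:
    """z-test of H0: p == p0 using the normal approximation."""
    p_hat = successes / n
    se = np.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    if alternative == "two-sided":
        p_value = 2 * scipy.stats.norm.sf(abs(z))
    elif alternative == "greater":
        p_value = scipy.stats.norm.sf(z)
    elif alternative == "less":
        p_value = scipy.stats.norm.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return float(z), float(p_value)


# Hypothetical example: 60 of 100 scores are over 80, tested against p0 = 0.5.
z_stat, p_value = one_sample_proportion_ztest(successes=60, n=100, p0=0.5)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
```

With the real data, `successes` would be the count of Exam 1 scores over 80 and `n` the number of scores.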

#### 2-sample z-test on a proportion

10,000 individuals are divided evenly into two groups. The treatment group is given a vaccine and the control group is given a placebo. 95 of the 5,000 individuals in the treatment group developed the disease, compared with 125 of the 5,000 individuals in the control group. A research team wants to determine whether the vaccine is effective in decreasing the incidence of the disease. Does sufficient evidence exist to conclude that the proportion of individuals who develop the disease is lower among those given the vaccine than among those given a placebo?
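The vaccine scenario above can be worked through with a pooled 2-sample z-test. This sketch computes the statistic by hand from the normal approximation; the variable names are illustrative.

```python
import numpy as np
import scipy.stats

# Counts from the scenario above.
cases_vaccine, n_vaccine = 95, 5_000
cases_placebo, n_placebo = 125, 5_000

p_vaccine = cases_vaccine / n_vaccine
p_placebo = cases_placebo / n_placebo

# Under H0 the two proportions are equal, so pool them to estimate the standard error.
p_pooled = (cases_vaccine + cases_placebo) / (n_vaccine + n_placebo)
se = np.sqrt(p_pooled * (1 - p_pooled) * (1 / n_vaccine + 1 / n_placebo))

# H0: p_vaccine == p_placebo   vs   H1: p_vaccine < p_placebo
z_stat = (p_vaccine - p_placebo) / se
p_value = float(scipy.stats.norm.cdf(z_stat))  # lower tail for a "less than" alternative

print(f"z = {z_stat:.3f}, one-sided p = {p_value:.4f}")
```

This gives z of roughly -2.05 and a one-sided p-value near 0.02, so at a 0.05 significance level the data support the claim that the vaccinated group has a lower disease proportion.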