# Hypothesis testing

## The null and the alternative hypotheses

Hypothesis testing is one of the most important methods in inferential statistics. It assesses whether the data provide enough evidence to support a given hypothesis or research claim. One example of a hypothesis: eating spinach improves long-term memory.
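To make the example concrete, the spinach claim can be written as a pair of competing hypotheses. The notation here is an assumption made for illustration: $\mu_S$ and $\mu_C$ stand for the mean long-term memory score of people who eat spinach and of a comparison group who do not. The text does not say how memory would actually be measured.

$$
H_0:\ \mu_S = \mu_C \qquad \text{versus} \qquad H_1:\ \mu_S > \mu_C
$$

Under this reading, the null hypothesis $H_0$ says that the apparent effect is not real, and the alternative hypothesis $H_1$ is the research claim itself.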

The goal of classical hypothesis testing is to answer the question, “Given a sample and
an apparent effect, what is the probability of seeing such an effect by chance?” Here’s
how we answer that question:

* The first step is to quantify the size of the apparent effect by choosing a test statistic.
* The second step is to define a null hypothesis, which is a model of the system based
on the assumption that the apparent effect is not real.
* The third step is to compute a p-value, which is the probability of seeing an effect
at least as large as the apparent one if the null hypothesis is true.
* The last step is to interpret the result. If the p-value is low, the effect is said to be
statistically significant, which means that it is unlikely to have occurred by chance.
In that case we infer that the effect is more likely to appear in the larger population.
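
The four steps can be sketched with a permutation test for a difference in means. This is a minimal illustration under stated assumptions, not the method this notebook necessarily uses later: the function name `permutation_test`, its parameters, and the two synthetic height samples are all hypothetical, and the absolute difference in means is only one possible test statistic.

```python
import numpy as np


def permutation_test(group_a, group_b, n_iter=10_000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)

    # Step 1: test statistic = absolute difference between the group means
    observed = abs(group_a.mean() - group_b.mean())

    # Step 2: null hypothesis = group labels do not matter,
    # so we pool the values and reassign them at random
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)

    # Step 3: p-value = fraction of shuffles whose statistic
    # is at least as large as the observed one
    count = 0
    for _ in range(n_iter):
        shuffled = rng.permutation(pooled)
        stat = abs(shuffled[:n_a].mean() - shuffled[n_a:].mean())
        if stat >= observed:
            count += 1

    return observed, count / n_iter


# Synthetic data (an assumption for this sketch, not taken from the dataset)
rng = np.random.default_rng(42)
heights_a = rng.normal(178, 7, size=100)
heights_b = rng.normal(165, 7, size=100)

observed, p_value = permutation_test(heights_a, heights_b)

# Step 4: interpret the result against a conventional 5% threshold
print(f"observed difference: {observed:.2f}")
print(f"p-value: {p_value:.4f}")
print("statistically significant" if p_value < 0.05 else "not significant")
```

Simulating the null hypothesis by shuffling avoids any assumption about the shape of the distributions, at the cost of extra computation. The 5% threshold is a convention, not a rule.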

In [1]:
# import libraries
import pandas as pd
import numpy as np
import scipy.stats as ss
import seaborn as sns
import matplotlib.pyplot as plt

# set seaborn style
sns.set()

In [2]:
"""Special functions"""

# Add the directory that holds the helper modules to the import path
import sys
sys.path.insert(0, '../statistics')

# import the functions we need
#from DataManipulation import SampleRows
import DataManipulation as dm
from functions import ecdf
from functions_corr import cov, correlation

In [3]:
# Data
df = pd.read_csv('data/TSheightweight.csv', header=0, index_col=0)
df.head(10)

| Unnamed: 0 | age | sex | wtyrago | finalwt | wtkg2 | htm3 |
|---|---|---|---|---|---|---|
| 0 | 82.0 | 2 | 76.363636 | 185.870345 | 70.91 | 157.0 |
| 1 | 65.0 | 2 | 72.727273 | 126.603027 | 72.73 | 163.0 |
| 2 | 48.0 | 2 | NaN | 181.06321 | NaN | 165.0 |
| 3 | 61.0 | 1 | 73.636364 | 517.926275 | 73.64 | 170.0 |
| 4 | 26.0 | 1 | 88.636364 | 1252.62463 | 88.64 | 185.0 |
| 5 | 42.0 | 1 | 118.181818 | 415.161314 | 109.09 | 183.0 |
| 6 | 40.0 | 2 | 50.0 | 422.810541 | 50.0 | 157.0 |
| 7 | 24.0 | 2 | 131.818182 | 1280.58598 | 122.73 | 178.0 |
| 8 | 37.0 | 1 | 87.727273 | 1245.06044 | 90.0 | 178.0 |
| 9 | 65.0 | 1 | 77.272727 | 382.738158 | 77.27 | 173.0 |
