In [None]:
from datascience import *
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

## A/B Testing Example: Maternal Smoking Status and Baby Weight

Some researchers are interested in determining whether there is an association between maternal
smoking status and baby birth weight. They collect the following data from randomly selected new mothers at
a nearby hospital:

In [None]:
baby = Table.read_table('https://www.inferentialthinking.com/data/baby.csv')
smoking_and_birthweight = baby.select('Maternal Smoker', 'Birth Weight')
smoking_and_birthweight

In [None]:
smoking_and_birthweight.group('Maternal Smoker')

In [None]:
smoking_and_birthweight.group('Maternal Smoker', np.average)

In [None]:
smoking_and_birthweight.hist('Birth Weight', group='Maternal Smoker')

Let's set up a hypothesis testing procedure to determine whether or not there exists a significant
association between a mother's smoking status and her baby's birth weight.

### Step 1: State the null and alternative hypotheses, and state the significance cutoff

**Null**: In the population, the distributions of the birth weights of the babies in the two groups are the same
(they are different in the sample just due to chance).

**Alternative**: In the population, the babies of the mothers who smoked weigh less, on average, than the babies of
the non-smokers.

**Significance Cutoff**: 0.05 (5%)

### Step 2: Decide on a test statistic

We're going to perform an A/B test. Let's define the test statistic as the average birth weight of babies whose mothers
smoked during pregnancy minus the average birth weight of babies whose mothers did not.

Under the alternative hypothesis this difference should be negative, so a large negative observed value of the test
statistic is evidence in favor of the alternative hypothesis.
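To make the sign convention concrete, here is a minimal sketch in plain Python (no `datascience` dependency). The rows in `toy_rows` and the helper `difference_of_group_means` are made up for illustration and are not part of the `baby` data set.

```python
from statistics import mean

# Hypothetical toy data: (maternal_smoker, birth_weight_in_ounces)
toy_rows = [
    (False, 120), (False, 124), (False, 116),
    (True, 110), (True, 114),
]

def difference_of_group_means(rows):
    """Mean weight of the smoker group minus mean weight of the non-smoker group."""
    smokers = [weight for smoker, weight in rows if smoker]
    non_smokers = [weight for smoker, weight in rows if not smoker]
    return mean(smokers) - mean(non_smokers)

toy_stat = difference_of_group_means(toy_rows)
print(toy_stat)  # 112 - 120 = -8
```

The negative value says the smoker group is lighter on average in this toy sample, which is the direction the alternative hypothesis predicts.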

### Step 3: Compute the observed test statistic

In [None]:
ave_birth_weight = smoking_and_birthweight.group('Maternal Smoker', np.average)
ave_birth_weight

In [None]:
group_means = ave_birth_weight.column('Birth Weight average')
# group() sorts the labels, so item(0) is non-smokers (False) and item(1) is smokers (True)
obs_test_stat = group_means.item(1) - group_means.item(0)
obs_test_stat

### Step 4: Simulate the distribution of the test statistic under the null hypothesis

Since we're going to be repeating the procedure from Step 3 many, many times, let's encapsulate it in a function for ease of use.

In [None]:
def difference_of_means(table, label, group_label):
    """Takes: a table, column label of numerical variable,
    column label of group-label variable
    Returns: Difference of means of the two groups
    (mean of the second group minus mean of the first, in sorted label order)"""

    # table with the two relevant columns
    reduced = table.select(label, group_label)

    # table containing group means, one row per group
    means_table = reduced.group(group_label, np.average)
    # array of group means
    means = means_table.column(1)

    return means.item(1) - means.item(0)

difference_of_means(smoking_and_birthweight, 'Birth Weight', 'Maternal Smoker')

Next, let's define a function that randomly shuffles the group labels of a table (here, the smoking status labels of
the `smoking_and_birthweight` table) and then returns the difference between the group means of the shuffled data.

In [None]:
def one_simulated_difference(table, label, group_label):
    """Takes: name of table, column label of numerical variable,
    column label of group-label variable
    Returns: Difference of means of the two groups after shuffling labels"""
    
    # array of shuffled labels
    shuffled_labels = table.sample(with_replacement = False).column(group_label)
    
    # table of numerical variable and shuffled labels
    shuffled_table = table.select(label).with_column(
        'Shuffled Label', shuffled_labels
    )
    
    return difference_of_means(shuffled_table, label, 'Shuffled Label')

one_simulated_difference(smoking_and_birthweight, 'Birth Weight', 'Maternal Smoker')
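The shuffling step can also be sketched without the `datascience` library. The version below is only an illustration under stated assumptions: `shuffled_difference`, `toy_labels`, and `toy_values` are hypothetical names and made-up data, and the seeded `random.Random(0)` is there just to make the sketch reproducible.

```python
import random
from statistics import mean

def shuffled_difference(labels, values, rng):
    """Shuffle the group labels, then return mean(True group) - mean(False group)."""
    shuffled = list(labels)   # copy, so the caller's labels are left untouched
    rng.shuffle(shuffled)     # random permutation: group sizes stay the same
    group_true = [v for lab, v in zip(shuffled, values) if lab]
    group_false = [v for lab, v in zip(shuffled, values) if not lab]
    return mean(group_true) - mean(group_false)

# Hypothetical toy data: smoking labels and birth weights in ounces
toy_labels = [False, False, False, True, True]
toy_values = [120, 124, 116, 110, 114]

rng = random.Random(0)        # seeded generator (an assumption, for reproducibility)
sim = shuffled_difference(toy_labels, toy_values, rng)
print(sim)
```

Because the labels are permuted rather than redrawn, every shuffle keeps the original number of smokers and non-smokers. Only the pairing between label and weight changes, which is exactly what the null hypothesis says should not matter.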

Finally, let's repeat this process many, many times to simulate the distribution of our test statistic under the null
hypothesis:

In [None]:
repititons = 1000
differences  = make_array()

for i in np.arange(repititons):
    new_difference = one_simulated_difference(smoking_and_birthweight, 'Birth Weight', 'Maternal Smoker')
    differences = np.append(differences, new_difference)

### Step 5: Interpret the results of the permutation test

First, let's plot the distribution of the test statistic under the null. 

In [None]:
Table().with_column('Difference Between Group Means', differences).hist()
plots.title('Prediction Under the Null Hypothesis');

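Does it seem plausible that the observed test statistic was generated under the null hypothesis? Not really. The p-value
gives us a quantitative measure of consistency with the null: it is the proportion of simulated differences that are at
least as extreme as the observed one in the direction of the alternative (here, as small or smaller).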

In [None]:
# proportion of simulated differences at or below the observed difference (left tail)
p_value = np.count_nonzero(differences <= obs_test_stat) / repititons
p_value
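As a small worked example of the tail calculation, the sketch below computes a left-tailed empirical p-value in plain Python. The function name `left_tail_p_value` and the values in `toy_simulated` are hypothetical and have nothing to do with the simulation above.

```python
def left_tail_p_value(simulated, observed):
    """Proportion of simulated statistics that are at or below the observed one."""
    at_or_below = sum(1 for s in simulated if s <= observed)
    return at_or_below / len(simulated)

# Hypothetical simulated statistics, not output from the notebook
toy_simulated = [-3.0, -1.5, -0.5, 0.0, 0.5, 1.0, 2.0, 2.5, 3.0, 4.0]

p = left_tail_p_value(toy_simulated, -1.5)
print(p)  # 2 of the 10 toy values are <= -1.5, so 0.2
```

With an observed value far below every simulated one, the same function returns 0.0. That should be read as "smaller than 1 divided by the number of repetitions", not as an exact probability of zero.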

Since our test's p-value is smaller than our significance cutoff, we reject the null hypothesis: the data are inconsistent
with it. There is evidence of an association between maternal smoking status and baby birth weight: babies of mothers who
smoke tend to weigh less than babies of non-smoking mothers.