# ANOVA - Lab

## Introduction

In this lab, you'll get some brief practice generating an ANOVA table (AOV) and interpreting its output. You'll also perform some investigations to compare the method to the t-tests you previously employed to conduct hypothesis testing.

## Objectives

In this lab you will: 

- Use ANOVA for testing multiple pairwise comparisons 
- Interpret results of an ANOVA and compare them to a t-test

## Load the data

Start by loading in the data stored in the file `'ToothGrowth.csv'`: 

In [1]:
# Your code here
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv('ToothGrowth.csv')
print(df.info())
df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 60 entries, 0 to 59
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   len     60 non-null     float64
 1   supp    60 non-null     object 
 2   dose    60 non-null     float64
dtypes: float64(2), object(1)
memory usage: 1.5+ KB
None


| | len | supp | dose |
|---|---|---|---|
| 0 | 4.2 | VC | 0.5 |
| 1 | 11.5 | VC | 0.5 |
| 2 | 7.3 | VC | 0.5 |
| 3 | 5.8 | VC | 0.5 |
| 4 | 6.4 | VC | 0.5 |


In [2]:
df.len.unique()

array([ 4.2, 11.5,  7.3,  5.8,  6.4, 10. , 11.2,  5.2,  7. , 16.5, 15.2,
       17.3, 22.5, 13.6, 14.5, 18.8, 15.5, 23.6, 18.5, 33.9, 25.5, 26.4,
       32.5, 26.7, 21.5, 23.3, 29.5, 17.6,  9.7,  8.2,  9.4, 19.7, 20. ,
       25.2, 25.8, 21.2, 27.3, 22.4, 24.5, 24.8, 30.9, 29.4, 23. ])

In [3]:
df.supp.unique()

array(['VC', 'OJ'], dtype=object)

In [4]:
df.dose.unique()

array([0.5, 1. , 2. ])

## Generate the ANOVA table

Now generate an ANOVA table to analyze the influence of the supplement type and dosage on tooth length:

In [5]:
# Your code here
formula = 'len ~ C(supp) + C(dose)'
lm = ols(formula, df).fit()
table = sm.stats.anova_lm(lm, typ=2)
table

| | sum_sq | df | F | PR(>F) |
|---|---|---|---|---|
| C(supp) | 205.35 | 1.0 | 14.016638 | 4.292793e-04 |
| C(dose) | 2426.434333 | 2.0 | 82.810935 | 1.871163e-17 |
| Residual | 820.425 | 56.0 | | |
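
Each F statistic in the table is the factor's mean square (`sum_sq / df`) divided by the residual mean square. As a quick sanity check, the following sketch recomputes both F values from the table figures alone. The dictionary and variable names are illustrative and not part of the lab.

```python
# Recompute the F statistics from the sums of squares and degrees of freedom
# reported in the ANOVA table above.
ss = {"C(supp)": 205.35, "C(dose)": 2426.434333, "Residual": 820.425}
dof = {"C(supp)": 1, "C(dose)": 2, "Residual": 56}

# Residual mean square: the estimate of within-group variance
ms_resid = ss["Residual"] / dof["Residual"]

# F = (factor mean square) / (residual mean square)
f_stats = {
    term: (ss[term] / dof[term]) / ms_resid
    for term in ("C(supp)", "C(dose)")
}

for term, f_value in f_stats.items():
    print(f"{term}: F = {f_value:.4f}")
```

The printed values should agree with the `F` column of the table to the displayed precision.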


## Interpret the output

Make a brief comment regarding the statistics and the effect of supplement and dosage on tooth length: 

In [6]:
# Your comment here
# From last lecture (and everything else we have been studying recently),
#    "Values less than 0.05 (or whatever we set α to) indicate rejection of the null hypothesis."
#
# Both p-values are far below 0.05 (supp: ~4.3e-4, dose: ~1.9e-17), so we reject the null
# hypothesis for each factor and conclude that both supp and dose have a statistically
# significant effect on tooth length. Dose has the much larger F statistic (82.8 vs. 14.0).

## Compare to t-tests

Now that you've had a chance to generate an ANOVA table, it's interesting to compare the results to those from the t-tests you were working with earlier. Start by breaking the data into two samples: those given the OJ supplement and those given the VC supplement. Afterward, you'll conduct a t-test to compare the tooth length of these two samples:

In [7]:
# Your code here
df_OJ = df.query("supp=='OJ'")
df_OJ.head()

| | len | supp | dose |
|---|---|---|---|
| 30 | 15.2 | OJ | 0.5 |
| 31 | 21.5 | OJ | 0.5 |
| 32 | 17.6 | OJ | 0.5 |
| 33 | 9.7 | OJ | 0.5 |
| 34 | 14.5 | OJ | 0.5 |


In [8]:
df_VC = df.query("supp=='VC'")
df_VC.head()

| | len | supp | dose |
|---|---|---|---|
| 0 | 4.2 | VC | 0.5 |
| 1 | 11.5 | VC | 0.5 |
| 2 | 7.3 | VC | 0.5 |
| 3 | 5.8 | VC | 0.5 |
| 4 | 6.4 | VC | 0.5 |


Now run a t-test between these two groups and print the associated two-sided p-value: 

In [9]:
# Calculate the 2-sided p-value for a t-test comparing the two supplement groups
import flatiron_stats as fistats

p_val_welch = fistats.p_value_welch_ttest(df_OJ['len'], df_VC['len'], two_sided=True)
p_val_welch

0.06063450788093383
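
`flatiron_stats` is the course's helper module, so its internals are not shown here. As a rough sketch of what a Welch's t-test computes, the snippet below derives the t statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. It uses just the five rows printed by `head()` for each supplement rather than the full samples, so it will not reproduce the p-value above. The function name `welch_t_and_df` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_and_df(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(a), len(b)
    # Squared standard error of each group mean (sample variance uses n - 1)
    se2_a = variance(a) / n_a
    se2_b = variance(b) / n_b
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# First five tooth lengths per supplement, as printed by head() above
oj_head = [15.2, 21.5, 17.6, 9.7, 14.5]
vc_head = [4.2, 11.5, 7.3, 5.8, 6.4]

t_stat, dof = welch_t_and_df(oj_head, vc_head)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The degrees of freedom returned here are never larger than the `n_a + n_b - 2` used by a pooled-variance test. A two-sided p-value would then be read from the t distribution with that many degrees of freedom.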

## A 2-Category ANOVA F-test is equivalent to a 2-tailed t-test!

Now, recalculate an ANOVA F-test with only the supplement variable. An ANOVA F-test between two categories is equivalent to a 2-tailed t-test that pools the variances of the two groups (F = t²). So, the p-value in the table should closely match your calculation above.

> Note: there may be a small difference (<0.001) between the two values. The t-test above is Welch's t-test, which does not pool the group variances and therefore uses slightly fewer degrees of freedom than the F-test.

In [10]:
# Your code here; conduct an ANOVA F-test of the oj and vc supplement groups.
lm = ols('len ~ C(supp)', df).fit()
table = sm.stats.anova_lm(lm, typ=2)
table
# Compare the p-value to that of the t-test above.
# They should nearly match; the small gap arises because Welch's t-test does not pool
# the group variances, while the ANOVA F-test does.

| | sum_sq | df | F | PR(>F) |
|---|---|---|---|---|
| C(supp) | 205.35 | 1.0 | 3.668253 | 0.060393 |
| Residual | 3246.859333 | 58.0 | | |
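
The equivalence rests on the identity F = t² for two groups when the t-test uses a pooled variance. The sketch below checks that identity numerically with the standard library, again using only the five rows printed by `head()` for each supplement, so the statistics differ from those of the full data set. Variable names are illustrative.

```python
from math import isclose, sqrt
from statistics import mean, variance

# First five tooth lengths per supplement, as printed by head() earlier in the lab
oj = [15.2, 21.5, 17.6, 9.7, 14.5]
vc = [4.2, 11.5, 7.3, 5.8, 6.4]
n_oj, n_vc = len(oj), len(vc)
df_within = n_oj + n_vc - 2

# Student's (pooled-variance) t statistic
pooled_var = ((n_oj - 1) * variance(oj) + (n_vc - 1) * variance(vc)) / df_within
t_stat = (mean(oj) - mean(vc)) / sqrt(pooled_var * (1 / n_oj + 1 / n_vc))

# One-way ANOVA F statistic for the same two groups
grand_mean = mean(oj + vc)
ss_between = n_oj * (mean(oj) - grand_mean) ** 2 + n_vc * (mean(vc) - grand_mean) ** 2
ss_within = (n_oj - 1) * variance(oj) + (n_vc - 1) * variance(vc)
f_stat = (ss_between / 1) / (ss_within / df_within)

print(f"t^2 = {t_stat ** 2:.6f}, F = {f_stat:.6f}")
print("match:", isclose(t_stat ** 2, f_stat))
```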


## Run multiple t-tests

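While a 2-category ANOVA test is equivalent to a 2-tailed t-test, performing many t-tests leads to the multiple comparisons problem: each test carries its own chance of a false positive, so the chance of at least one spurious result grows with the number of tests. To investigate this, look at the various sample groups you could create from the 2 features: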

In [11]:
# groupby yields (group_name, data) pairs, one per supplement-dose combination
for group_name, data in df.groupby(['supp', 'dose'])['len']:
    print(group_name)

('OJ', 0.5)
('OJ', 1.0)
('OJ', 2.0)
('VC', 0.5)
('VC', 1.0)
('VC', 2.0)
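
Six groups give 15 possible pairs. To get a feel for why testing all of them is risky, the sketch below computes the chance of at least one false positive across that many tests run at α = 0.05. It makes the simplifying assumption that the tests are independent, which is not strictly true here because pairs share groups, so treat the figure as a rough guide.

```python
from math import comb

alpha = 0.05
n_groups = 6                  # supplement-dose groups listed above
n_tests = comb(n_groups, 2)   # number of distinct pairs of groups

# Probability of at least one false positive if every null hypothesis is true
# and the tests are independent (a simplifying assumption)
fwer = 1 - (1 - alpha) ** n_tests

print(f"{n_tests} pairwise tests -> family-wise error rate of about {fwer:.1%}")
```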


Although it is bad practice, examine the effect of running multiple t-tests on the various combinations of these groups. To do this, generate all pairwise combinations of the groups above. For each pair, calculate the p-value of a 2-sided t-test, then print the pair alongside its p-value.

In [12]:
# Your code here; reuse your t-test code above to calculate the p-value for a 2-sided t-test
# for all combinations of the supplement-dose groups listed above. 
# (Since there isn't a control group, compare each group to every other group.)
from itertools import combinations
groups = [group[0] for group in df.groupby(['supp', 'dose'])['len']]
combos = list(combinations(groups, 2))
for combo in combos:
    s1 = df.query(f"supp=='{combo[0][0]}' and dose=={combo[0][1]}")
    s2 = df.query(f"supp=='{combo[1][0]}' and dose=={combo[1][1]}")
    p_val_welch = fistats.p_value_welch_ttest(s1['len'], s2['len'], two_sided=True)
    print(f"combo: {combo}, pval: {p_val_welch}")

combo: (('OJ', 0.5), ('OJ', 1.0)), pval: 8.784919055160323e-05
combo: (('OJ', 0.5), ('OJ', 2.0)), pval: 1.323783877626994e-06
combo: (('OJ', 0.5), ('VC', 0.5)), pval: 0.00635860676409683
combo: (('OJ', 0.5), ('VC', 1.0)), pval: 0.04601033257637566
combo: (('OJ', 0.5), ('VC', 2.0)), pval: 7.196253523966689e-06
combo: (('OJ', 1.0), ('OJ', 2.0)), pval: 0.03919514204624397
combo: (('OJ', 1.0), ('VC', 0.5)), pval: 3.655206737285255e-08
combo: (('OJ', 1.0), ('VC', 1.0)), pval: 0.0010383758722998238
combo: (('OJ', 1.0), ('VC', 2.0)), pval: 0.09652612338267019
combo: (('OJ', 2.0), ('VC', 0.5)), pval: 1.3621326289126046e-11
combo: (('OJ', 2.0), ('VC', 1.0)), pval: 2.3610742028168374e-07
combo: (('OJ', 2.0), ('VC', 2.0)), pval: 0.9638515887233756
combo: (('VC', 0.5), ('VC', 1.0)), pval: 6.811017703167721e-07
combo: (('VC', 0.5), ('VC', 2.0)), pval: 4.681577414622495e-08
combo: (('VC', 1.0), ('VC', 2.0)), pval: 9.155603056631989e-05
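
One common way to account for the number of tests is a Bonferroni correction, which compares each p-value to α divided by the number of tests. The lab itself does not apply a correction, so the following is only an illustrative sketch. The p-values are copied from the output above, and the variable names are assumptions.

```python
alpha = 0.05

# p-values copied from the output above, in the same order
p_values = [
    8.784919055160323e-05, 1.323783877626994e-06, 0.00635860676409683,
    0.04601033257637566, 7.196253523966689e-06, 0.03919514204624397,
    3.655206737285255e-08, 0.0010383758722998238, 0.09652612338267019,
    1.3621326289126046e-11, 2.3610742028168374e-07, 0.9638515887233756,
    6.811017703167721e-07, 4.681577414622495e-08, 9.155603056631989e-05,
]

# Bonferroni: divide the significance level by the number of tests
bonferroni_alpha = alpha / len(p_values)

n_uncorrected = sum(p < alpha for p in p_values)
n_corrected = sum(p < bonferroni_alpha for p in p_values)

print(f"Significant at alpha = {alpha}: {n_uncorrected} of {len(p_values)}")
print(f"Significant at alpha / m = {bonferroni_alpha:.5f}: {n_corrected} of {len(p_values)}")
```

Comparing the two counts shows how many pairwise results depend on ignoring the number of tests performed.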


## Summary

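In this lab, you used ANOVA to generalize hypothesis testing to multiple groups and factors, compared a 2-category ANOVA F-test to a 2-tailed t-test, and saw how running many pairwise t-tests raises the multiple comparisons problem.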