In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("ANOVA.ipynb")

# ANOVA

### Comparing Many Means

The goal of this activity is to solve some problems using what we have learned about ANOVA.  First, let's start with a brief introduction to this notebook.

# 1. What is a Jupyter notebook?
This webpage is called a Jupyter notebook. A notebook is a place to write code and view the results of that code.  It is also a place to share and write text.
In a notebook, each rectangle containing text or code is called a *cell*.

**Text cells** (like this one) can be edited by double-clicking on them. They're written in a simple format called [Markdown](http://daringfireball.net/projects/markdown/syntax) to add formatting and section headings.
After you edit a text cell, click the "run cell" button at the top that looks like ▶| or hold down `shift` + `return` to confirm any changes. 

**Code cells** contain code in the Python 3 language. Running a code cell will execute all of the code it contains.
To run the code in a code cell, first click on that cell to activate it.  It'll be highlighted with a little green or blue rectangle.  Next, either press ▶| or hold down `shift` + `return`.

**Activity 1:** This is a text cell. It is the cell type where we can type text that isn't code. Go ahead and double click in this cell and you will see that you can edit it. 

**Type something here:** ....


**Activity 2:** Click on the code cell below and run the code:

In [None]:
# This cell is a code cell. It is where we type code that can be executed.
# The hash symbol (#) at the start of a line turns that line into a comment, which Python ignores.

print("Hello, World! \N{EARTH GLOBE ASIA-AUSTRALIA}!")

And this one:

In [None]:
# This coding cell imports some python libraries that we will be using throughout this notebook
# Don't worry about what they are, just run this cell before running any other cells below this one

from datascience import *
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
import numpy as np
import otter
grader = otter.Notebook("ANOVA.ipynb")

%matplotlib inline
plt.style.use('fivethirtyeight')

# ANOVA Review

We use ANOVA when we want to investigate the relationship between one numerical variable and one categorical variable with many levels.  Running an ANOVA test allows us to see whether the variability in sample means from one level to the next is so large that it is unlikely to be due to random chance.

The number of levels of the categorical variable determines the number of groups we will compare.  We also need to check conditions before running this test!

1. The observations are independent within and across groups.
2. The data within each group are nearly normal.
3. The variability across the groups is about equal.
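
Condition 3 can also be checked numerically by comparing the standard deviations of the groups. The sketch below uses made-up data and a commonly quoted rule of thumb (the largest standard deviation should be no more than about twice the smallest). Both the data and the cutoff of 2 are assumptions for illustration; they are not part of this lab's grading.

```python
from statistics import stdev

# Hypothetical measurements for three groups (not the lab data)
groups = {
    "A": [10.0, 12.0, 11.0, 13.0, 14.0],
    "B": [20.0, 18.0, 22.0, 21.0, 19.0],
    "C": [15.0, 18.0, 12.0, 16.0, 14.0],
}

# Sample standard deviation of each group
sds = {name: stdev(values) for name, values in groups.items()}

# Ratio of the largest to the smallest standard deviation
sd_ratio = max(sds.values()) / min(sds.values())

# Assumed rule of thumb: a ratio below about 2 suggests similar variability
similar_spread = sd_ratio < 2

print(sds)
print("ratio:", round(sd_ratio, 2), "similar spread:", similar_spread)
```

A numerical check like this complements the boxplots; it does not replace looking at them.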

## Definitions
Here is a brief review of definitions that will be useful in this lab assignment.  To take a deeper dive into how these are derived, check out [this resource](https://www.openintro.org/go/?id=stat_extra_anova_calculations&referrer=os4_pdf).

$ SSG $ - Sum of Squares between Groups. This measures how much the group means ($\bar{x}_i$) differ from the overall mean ($\bar{x}$), weighting each group by its size $n_i$. $$ SSG = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 $$

$ df_G $ - Degrees of freedom for the Groups.  If there are $ k $ different groups, then $$ df_G = k-1$$

$ MSG  $ - Mean Square between the Groups.  This measures the degree to which the means vary from group to group.  $$ MSG = \frac{SSG}{df_G}$$

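$ SSE $ - Sum of Squared Errors.  This is the sum of the squared differences between each observation and the mean of its own group.  Equivalently, if $s_i$ is the standard deviation of group $i$, then $$ SSE = \sum_{i=1}^{k}(n_i - 1)s_i^2 $$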

$ df_E $ - Degrees of freedom for the Errors.  If there are a total of $n$ observations and $k$ different groups, then $$df_E = n - k $$

$ MSE  $ - Mean Square Error.  This measures the degree of variability *within* each group.  $$ MSE = \frac{SSE}{df_E}$$

$ SST $ - Sum of Squares Total.  Think of this as the total variability for all observations.  It is found by adding all the squared differences between each individual observation and the average for ALL outcomes. $$ SST = SSG + SSE $$

$ df_T $ - Degrees of freedom Total. If there are $ n $ observations, then $$df_T = n-1$$

$ F $ Statistic - When the null hypothesis is true, any differences among the sample means are just due to random chance, so MSG and MSE would be about equal and the $F$ Statistic is a value close to 1.  When MSG gets large relative to MSE, then the F statistic gets big and we have evidence in favor of the alternative, that at least one mean is different. $$F = \frac{MSG}{MSE} $$
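
The definitions above can be traced by hand on a small data set. The following is a minimal sketch using only the Python standard library; the three groups are invented for illustration and the variable names are not used elsewhere in this lab.

```python
from statistics import mean

# Hypothetical data: three groups of four observations (not the lab data)
groups = [
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 9.0, 8.0],
    [5.0, 6.0, 7.0, 6.0],
]

all_obs = [x for g in groups for x in g]
k = len(groups)       # number of groups
n = len(all_obs)      # total number of observations
grand_mean = mean(all_obs)

# SSG: squared distance of each group mean from the overall mean, weighted by group size
ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# SSE: squared distance of each observation from its own group mean
sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
# SST: squared distance of each observation from the overall mean
sst = sum((x - grand_mean) ** 2 for x in all_obs)

df_g, df_e, df_t = k - 1, n - k, n - 1
msg = ssg / df_g
mse = sse / df_e
f_value = msg / mse

print("SSG =", round(ssg, 2), "SSE =", round(sse, 2), "SST =", round(sst, 2))
print("MSG =", round(msg, 2), "MSE =", round(mse, 2), "F =", round(f_value, 2))
```

Note that `ssg + sse` matches `sst` up to floating-point rounding, which is the identity $SST = SSG + SSE$ in action.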


# 2. Chicken Diets
The `chickwts` dataset contains observations on two variables: which type of feed chicks were randomly assigned to, and their weight in grams.  ANOVA allows us to compare the mean weight for chicks fed either casein, horsebean, linseed, meat meal, soybean, or sunflower all at once.  

**Question 2.1** Which of the following are the correct hypotheses for this hypothesis test?
- Set **hypotheses2** to the correct number choice from the following list:
    1. $H_o: \mu_c = \mu_h = \mu_l = \mu_m = \mu_{soy} = \mu_{sun}$ <br> $H_A:$ At least one pair of means is the same
    2.  $H_o: \mu_c = \mu_h = \mu_l = \mu_m = \mu_{soy} = \mu_{sun}$ <br> $H_A:$ At least one of the means is different
    3.  $H_o: \mu_c = \mu_h = \mu_l = \mu_m = \mu_{soy} = \mu_{sun}$ <br> $H_A:\mu_c \ne \mu_h \ne \mu_l \ne \mu_m \ne \mu_{soy} \ne \mu_{sun}$
    4. $H_o: \mu_c \ne \mu_h \ne \mu_l \ne \mu_m \ne \mu_{soy} \ne \mu_{sun}$ <br> $H_A: $ At least one pair of means is the same.

In [None]:
#replace the ... with the correct answer
hypotheses2 = ...

Now you can check your work by running the grader check below. If it passes, great job, go on to the next section! 

If it fails, don't worry, you just need to go back and try again.

In [None]:
grader.check("q21")

# Descriptive Statistics 
We start by looking at numerical and visual summaries of the data.  The next cell will load the data into the notebook and give us a view of the first 20 observations of the dataset.  Run the next cell.

In [None]:
#Just run this cell
chickwts = Table.read_table('chickwts.csv')
chickwts.show(20)

**Using some coding**, the cell below will organize the data into a table where each row is one of the types of feed (one of the groups) and each column is a summary statistic.  Run the cell to take a look.

In [None]:
#Just run this cell to see the summary statistics for chick weight by feed type
chickwts.group('feed').relabeled('count', 'sample size').join(
    'feed', chickwts.group(('feed'), np.mean)).join(
        'feed', chickwts.group(('feed'), np.std))


**We should also visualize** the distribution of weight for each feed type.  Run the next cell to see side by side boxplots.

In [None]:
#Let's look at side by side boxplots.  Run this cell
c = pd.DataFrame({'casein':chickwts.where('feed', 'casein').column('weight')})
h = pd.DataFrame({'horsebean':chickwts.where('feed', 'horsebean').column('weight')})
l = pd.DataFrame({'linseed':chickwts.where('feed', 'linseed').column('weight')})
m = pd.DataFrame({'meatmeal':chickwts.where('feed', 'meatmeal').column('weight')})
s1 = pd.DataFrame({'soybean':chickwts.where('feed', 'soybean').column('weight')})
s2 = pd.DataFrame({'sunflower':chickwts.where('feed', 'sunflower').column('weight')})
df = pd.concat([c, h, l, m, s1, s2], axis = 1) 
df.boxplot()

**Question 2.2:** Remember that there are three conditions we should check before doing ANOVA.  The first condition, "The observations are independent within and across groups," has likely been met since the chicks were randomly assigned to the feed types.  For the other two conditions, we should take a look at the boxplots and the summary statistics.   Assign the variable `normal` to True if the boxplots look roughly symmetric, without extreme skew. Otherwise, assign `normal` to False.  
Also, assign `similar_var` to True if it appears that the variability of weight for each group is approximately equal.  Otherwise, assign `similar_var` to False.  

In [None]:
normal = ...
similar_var = ...

In [None]:
grader.check("q22")

In [None]:
#Run this cell to see the ANOVA table
model = ols('weight ~ feed', data=chickwts).fit()
aov_table = sm.stats.anova_lm(model, typ=2)
aov_table

**Question 2.3:** What is the test statistic equal to for this hypothesis test?  Set `f_stat` equal to the correct value, rounded to 2 decimal places.

In [None]:
f_stat = ...

In [None]:
grader.check("q23")

**Question 2.4** What is the p-value for this hypothesis test? Set `p_val` to the correct value, rounded to 4 decimal places.

In [None]:
p_val = ...

In [None]:
grader.check("q24")

**Question 2.5:**
Which of the following is the correct conclusion and interpretation of the hypothesis test?  Use a significance level of $\alpha = 0.01$ and assign `conclusion2` to the correct choice.

1. Since the p-value is less than $\alpha$, we reject the null hypothesis.  The evidence supports the alternative hypothesis: that the average weight of chicks is different for each type of diet.  
2. Since the p-value is less than $\alpha$, we reject the null hypothesis and accept that the average weight of chicks is the same across all of the diets.
3. Since the p-value is greater than or equal to $\alpha$ we reject the null hypothesis.  The evidence supports the alternative hypothesis: at least one pair of means is the same.
4. Since the p-value is less than $\alpha$, we reject the null hypothesis. The evidence supports the alternative hypothesis: that the average weight of chicks is not the same across all of the diets.  At least one of the mean weights is different.
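
The logic behind any ANOVA conclusion is a single comparison between the p-value and $\alpha$. Here is a minimal sketch of that comparison; the function name `anova_decision`, the p-values, and the significance level of 0.05 are all hypothetical and chosen only for illustration.

```python
def anova_decision(p_value, alpha):
    """Return the decision for an ANOVA F-test at significance level alpha."""
    if p_value < alpha:
        return "reject H0: at least one group mean differs"
    return "fail to reject H0: not enough evidence that the means differ"

# Hypothetical p-values, both compared against alpha = 0.05
print(anova_decision(0.003, alpha=0.05))
print(anova_decision(0.200, alpha=0.05))
```

Neither outcome says that *all* of the means differ, and failing to reject is not the same as proving that the means are equal.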


In [None]:
conclusion2 = ...

In [None]:
grader.check("q25")

# 3. Teaching Methods
A study compared  five different methods for teaching descriptive statistics. The  five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam.

**Question 3.1** What are the hypotheses for evaluating if the average exam scores are different for the different teaching methods?  Assign `hypotheses3` to the correct choice.

Potential Hypotheses
1. $H_o: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 $<br> $H_A:$ At least one of the means is different
2. $H_o: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 $<br> $H_A:$ All of the means are different
3.  $H_o: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ <br> $H_A:\mu_1 \ne \mu_2 \ne \mu_3 \ne \mu_4 \ne \mu_5 $
4. $H_o: \mu_1 \ne \mu_2 \ne \mu_3 \ne \mu_4 \ne \mu_5 $ <br> $H_A: $ At least one pair of means is the same.

In [None]:
hypotheses3 = ...

In [None]:
grader.check("q31")

**Question 3.2** What are the degrees of freedom associated with the F-test for evaluating these hypotheses? Assign `dfg` and `dfe` to the correct numerical values.
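
As a reminder of how the degrees of freedom depend on the number of groups $k$ and the total number of observations $n$, here is a small sketch. The helper name `anova_degrees_of_freedom` and the example study (3 groups of 10) are hypothetical.

```python
def anova_degrees_of_freedom(k, n):
    """Return (df_G, df_E, df_T) for k groups and n total observations."""
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more observations than groups")
    return k - 1, n - k, n - 1

# Hypothetical study: 3 groups with 10 observations each
df_g, df_e, df_t = anova_degrees_of_freedom(k=3, n=30)
print(df_g, df_e, df_t)  # 2 27 29
```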

In [None]:
dfg = ...
dfe = ...

In [None]:
grader.check("q32")

**Question 3.3**  Suppose the p-value for this test is 0.0168. At the 0.01 level of significance, what is the correct conclusion?
Assign `conclusion3` to the correct choice from the following list.

Possible Conclusions

1. Since the p-value is greater than or equal to $\alpha$ we accept that all of the mean exam scores are the same.
2. Since the p-value is less than $\alpha$, all of the mean exam scores are different from one another
3. Since the p-value is greater than or equal to $\alpha$, we reject the null hypothesis. The evidence supports the alternative, that all of the mean exam scores are different.
4. Since the p-value is greater than or equal to $\alpha$, we do not have enough evidence to reject that all of the mean exam scores are the same
5. Since the p-value is less than $\alpha$, we reject the null hypothesis. The evidence supports the alternative, that at least one of the mean exam scores is different from the others


In [None]:
conclusion3 = ...

In [None]:
grader.check("q33")

# 4. Coffee and Exercise
Caffeine is the world's most widely used stimulant, with approximately 80% consumed in the form of coffee. Participants in a study investigating the relationship between coffee consumption and exercise were asked to report the number of hours they spent per week on moderate (e.g., brisk walking) and vigorous (e.g., strenuous sports and jogging) exercise. Based on these data the researchers estimated the total hours of metabolic equivalent tasks (MET) per week, a value always greater than 0. The table below gives summary statistics of MET for women in this study based on the amount of coffee consumed.

 
| | 1 or less cup/week | 2-6 cups/week | 1 cup/day | 2-3 cups/day | 4 or more cups/day | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Mean | 18.7 | 19.6 | 19.3 | 19.9 | 17.5 |  |
| SD | 21.1 | 25.5 | 22.5 | 22 | 22 |  |
| n | 12215 | 6617 | 17234 | 12290 | 2383 | 50739 |
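
When only a summary table like this is available, SSG and SSE can still be approximated from the group means, standard deviations, and sample sizes. The sketch below uses invented numbers rather than the coffee data, and it assumes the reported SDs are sample standard deviations (computed with $n_i - 1$ in the denominator).

```python
# Hypothetical summary statistics for three groups (not the coffee data)
means = [10.0, 12.0, 14.0]
sds = [2.0, 3.0, 2.0]
sizes = [5, 5, 5]

n = sum(sizes)
k = len(sizes)

# The overall mean is the size-weighted average of the group means
grand_mean = sum(m * size for m, size in zip(means, sizes)) / n

# SSG: each group's squared distance from the overall mean, weighted by group size
ssg = sum(size * (m - grand_mean) ** 2 for m, size in zip(means, sizes))
# SSE: each group's sample variance multiplied by (group size - 1)
sse = sum((size - 1) * sd ** 2 for sd, size in zip(sds, sizes))

f_value = (ssg / (k - 1)) / (sse / (n - k))
print("SSG =", ssg, "SSE =", sse, "F =", round(f_value, 2))
```

Because published summary statistics are rounded, values computed this way will generally not match software output exactly.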

**Question 4.1** Assume that all of the conditions required for this hypothesis test are met.  What are the hypotheses for evaluating if the average physical activity level varies among the different levels of coffee consumption? Assign `hypotheses4` to the correct choice.

Possible hypotheses: 

1. $H_o: $ All of the means are different <br> $H_A$: The average physical activity level is the same among the five different levels of coffee consumption
2. $H_o: $ The average physical activity level is the same among the five different levels of coffee consumption <br> $H_A: $ All of the means are different.
3. $H_o: $ At least one mean is different <br> $H_A$: The average physical activity level is the same among the five different levels of coffee consumption
4. $H_o: $ The average physical activity level is the same among the five different levels of coffee consumption <br> $H_A: $ At least one mean is different

In [None]:
hypotheses4 = ...

In [None]:
grader.check("q41")

# ANOVA Table for Coffee Data
Below is part of the output associated with this test. In question 4.2, you will assign each of the relevant variables to the correct numerical values that are missing from the table. Scroll to the top of the lab for a refresher on how all of these are computed and related to each other and *please round to two decimal places where necessary.*

|  | df | Sum Sq | Mean Sq | F Value | Pr(>F) |
| --- | --- | --- | --- | --- | --- |
| coffee | *dfG* | *SSG* | *MSG* | *F* | 0.0003 |
| Residual | *dfE* | 25564819 | *MSE* |  |  |
| Total | *dfT* | 25575327 |  |  |  |
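
A partial ANOVA table can be completed from the relationships in the Definitions section. Here is a sketch of the process with hypothetical numbers (4 groups, 40 observations, and made-up sums of squares), not the coffee data.

```python
# Hypothetical partial ANOVA output (not the coffee data):
# 4 groups, 40 observations, with SSE and SST reported by the software
k, n = 4, 40
sse, sst = 720.0, 900.0

df_g, df_e, df_t = k - 1, n - k, n - 1   # degrees of freedom for each row
ssg = sst - sse                          # because SST = SSG + SSE
msg = ssg / df_g                         # mean square between groups
mse = sse / df_e                         # mean square error
f_value = round(msg / mse, 2)            # F statistic, rounded to two decimals

print(df_g, df_e, df_t, ssg, msg, mse, f_value)
```

Letting Python carry the unrounded intermediate values and rounding only at the end avoids compounding rounding error.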

**Question 4.2**  Complete the code below by assigning each variable to the correct numerical value.  You can let Python do the arithmetic for you.

In [None]:
#Enter numbers with no commas. Please round to two decimal places where necessary.
dfG = ...
dfE = ...
dfT = ...
ssG = ...
msG = ...
msE = ...
f = ...
print('The F Statistic is equal to', f)

In [None]:
grader.check("q42")

**Question 4.3**  Look back at the ANOVA table.  Assign `pval` to the numerical value for the p-value in the software output.  Do not round.

In [None]:
pval = ...

In [None]:
grader.check("q43")

**Question 4.4**  What is the correct conclusion for this hypothesis test, using an $\alpha = 0.01$ level of significance? Assign `conclusion4` to the correct choice.

Possible conclusions:

1. Since the p-value is greater than or equal to $\alpha$ we accept that all of the mean physical activity levels are the same.
2. Since the p-value is less than $\alpha$, all of the mean physical activity levels are different from one another
3. Since the p-value is greater than or equal to $\alpha$, we reject the null hypothesis. The evidence supports the alternative, that all of the mean physical activity levels are different.
4. Since the p-value is greater than or equal to $\alpha$, we do not have enough evidence to reject that all of the mean physical activity levels are the same
5. Since the p-value is less than $\alpha$, we reject the null hypothesis. The evidence supports the alternative, that at least one of the mean physical activity levels is different from the others

In [None]:
conclusion4 = ...

In [None]:
grader.check("q44")

**CONGRATULATIONS!** You just finished this Jupyter notebook assignment! 

We hope you enjoyed this format and the chance to play with computer programming!
Be sure to...

- run all of the tests and verify that they all pass, 
- choose **Download as PDF via LaTeX** from the **File** menu,
- submit the .pdf file on **Canvas**.