# Statistical Data Management Session 11: ANOVA


## Exercise 1: Animal-assisted therapy for heart patients *(exercises 10.38 and 10.58 from the book)*

In a study presented at the *American Heart Association Conference* (Nov. 2005) to gauge whether animal-assisted therapy can improve the physiological responses of heart failure patients, 76 heart patients were randomly assigned to one of three groups. Each patient in group T was visited by a human volunteer accompanied by a trained dog, each patient in group V was visited by a volunteer only, and the patients in group C were not visited at all. The anxiety level of each patient was measured (in points) both before and after the visits. The accompanying table gives summary statistics for the drop in anxiety level in each of the three groups. The mean drops of the three groups were compared using analysis of variance. Although an ANOVA table was not provided in the article, sufficient information is given to reconstruct it.

| Group | Sample Size | Mean Drop | Std. Dev. |
|:---| :---:|:---:|:---:|
|Group T: Volunteer + Trained dog|26|10.5|7.6|
|Group V: Volunteer only |25|3.9|7.5|
|Group C: Control group (no visit)|25|1.4|7.5|

  
1. Compute the sum of squares for treatments (SST).

2. Compute the sum of squares for error (SSE).

3. Use the results from parts 1 and 2 to construct the ANOVA table:

| Source | $\qquad$ df $\qquad$ | $\qquad SS \qquad$ | $\qquad \qquad MS \qquad \qquad$ | $\qquad \qquad F \qquad \qquad$ |
|---:| :---:|:---:|:---:|:---:|
|Treatments| $k-1$ |$SST$|$MST=\frac{SST}{k-1}$|$F=\frac{MST}{MSE}$|
|Error| $n-k$ |$SSE$|$MSE=\frac{SSE}{n-k}$| |

 
4. Is there sufficient evidence (at $\alpha = 0.01$) of differences among the mean drops in anxiety level of the patients in the three groups? Look up the critical F value in the table or use Python below.

5. Comment on the validity of the ANOVA assumptions. How might a violation affect the results of the study?

6. If you found evidence of a difference among the treatment means, conduct a post-hoc analysis. Use the Bonferroni method to construct confidence intervals for the differences between treatment means and rank the three treatment means. Use an experimentwise error rate of $\alpha = 0.03$. Interpret the results for the researchers. Look up the relevant t value in the table or use Python below.
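
When only group sizes, means, and standard deviations are available, the two sums of squares can be recovered as $SST = \sum_i n_i(\bar{y}_i - \bar{y})^2$ and $SSE = \sum_i (n_i - 1)s_i^2$, where $\bar{y}$ is the size-weighted grand mean. The following is a minimal sketch of that reconstruction from the table values. The variable names are illustrative and not taken from the book.

```python
import numpy as np
from scipy import stats as sts

# Summary statistics from the table, in the order T, V, C
n = np.array([26, 25, 25])
means = np.array([10.5, 3.9, 1.4])
sds = np.array([7.6, 7.5, 7.5])

k = len(n)                                # number of treatments
n_total = n.sum()                         # total sample size
grand_mean = (n * means).sum() / n_total  # size-weighted overall mean

sst = (n * (means - grand_mean) ** 2).sum()  # between-group variation
sse = ((n - 1) * sds ** 2).sum()             # within-group variation
mst = sst / (k - 1)
mse = sse / (n_total - k)
f_stat = mst / mse

alpha = 0.01
f_crit = sts.f.ppf(1 - alpha, k - 1, n_total - k)  # rejection threshold
p_value = sts.f.sf(f_stat, k - 1, n_total - k)     # upper-tail probability

print(f"SST = {sst:.2f}, SSE = {sse:.2f}")
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print(f"F critical = {f_crit:.2f}, p-value = {p_value:.5f}")
```

With these inputs the sketch gives SST ≈ 1132.2, SSE = 4144 and F ≈ 9.97, to be compared with the critical value it prints.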

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats as sts
%matplotlib inline
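
For the Bonferroni step, each of the $c = k(k-1)/2 = 3$ pairwise intervals has the form $(\bar{y}_i - \bar{y}_j) \pm t_{\alpha/(2c)}\, s\sqrt{1/n_i + 1/n_j}$, with $s = \sqrt{MSE}$ and $n - k$ degrees of freedom. The sketch below assumes the experimentwise $\alpha = 0.03$ is split evenly over the three comparisons. The dictionary layout and names are illustrative choices, not the book's notation.

```python
from itertools import combinations

import numpy as np
from scipy import stats as sts

# group label -> (sample size, mean drop, standard deviation), from the table
groups = {"T": (26, 10.5, 7.6), "V": (25, 3.9, 7.5), "C": (25, 1.4, 7.5)}
alpha_exp = 0.03  # experimentwise error rate

k = len(groups)
n_total = sum(n for n, _, _ in groups.values())
df_error = n_total - k
# Pooled variance estimate (MSE) from the group standard deviations
mse = sum((n - 1) * s ** 2 for n, _, s in groups.values()) / df_error

pairs = list(combinations(groups, 2))  # (T, V), (T, C), (V, C)
c = len(pairs)
# Two-sided critical value with alpha_exp shared equally across c comparisons
t_crit = sts.t.ppf(1 - alpha_exp / (2 * c), df_error)

intervals = {}
for a, b in pairs:
    n_a, mean_a, _ = groups[a]
    n_b, mean_b, _ = groups[b]
    diff = mean_a - mean_b
    half_width = t_crit * np.sqrt(mse * (1 / n_a + 1 / n_b))
    intervals[(a, b)] = (diff - half_width, diff + half_width)
    print(f"mu_{a} - mu_{b}: ({diff - half_width:.2f}, {diff + half_width:.2f})")
```

An interval that excludes zero indicates a difference between those two treatment means. An interval that contains zero gives no evidence to rank that pair.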




## Exercise 2: Does the moment of an exam influence the results?

A university wants to test whether the time of day at which an exam takes place influences the results. Students are assigned at random to one of three groups: an exam in the morning (M), early afternoon (E), or late afternoon (L). Their results are given below. Assuming that the underlying distribution of exam scores is normal and that the variances in the three groups are equal, conduct an ANOVA to determine whether there is a significant difference between the groups (use $\alpha = 0.05$).

1. There is a short way (one line!) to do this in Python. Formulate $H_0$ and $H_a$ and draw the appropriate conclusion.
2. As a challenge: reconstruct this function yourself.

In [None]:
M = [13, 10, 9, 8, 6, 13, 9, 9, 11, 9, 9, 13, 11, 10, 10, 11, 8, 6, 9, 10, 9, 8, 9, 11, 8, 11, 11, 9, 10, 7, 11, 9, 10, 10, 10, 11, 12, 14, 8, 9, 14, 8, 11, 6, 8, 10, 12, 12, 9, 9, 11, 13, 10, 11, 12, 8, 10, 12, 13, 12, 11, 8, 12, 8, 9, 8, 11, 10, 10, 12, 7, 13, 11, 7, 10, 11, 14, 11, 7, 10]
E = [13, 9, 12, 14, 12, 9, 11, 12, 13, 10, 12, 9, 12, 8, 13, 9, 11, 10, 7, 8, 7, 11, 8, 11, 5, 12, 12, 7, 11, 8, 10, 10, 8, 9, 8, 13, 8, 10, 11, 10, 11, 12, 14, 9, 7, 7, 8, 7, 4, 11, 9, 7, 8, 8, 11, 7, 9, 10, 11, 9, 10, 11, 9, 11, 13, 8, 15, 6, 5, 14, 9, 12, 9, 12, 8, 12, 13, 8, 5, 8]
L = [9, 11, 8, 8, 9, 12, 12, 7, 9, 4, 5, 9, 12, 10, 9, 5, 8, 7, 10, 9, 7, 10, 14, 10, 9, 8, 9, 8, 10, 16, 10, 10, 10, 11, 13, 13, 7, 10, 12, 8, 8, 7, 13, 9, 12, 13, 9, 10, 8, 12, 11, 7, 6, 9, 6, 12, 4, 7, 11, 6, 8, 12, 14, 10, 12, 11, 6, 8, 11, 8, 11, 7, 13, 7, 12, 8, 9, 10, 12, 10]
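
One way to approach the reconstruction challenge is to compute the between-group and within-group sums of squares from the raw scores and compare the result with `scipy.stats.f_oneway`, which is the one-line route. The helper `one_way_anova` and the three short `demo_*` lists below are hypothetical, made up for illustration only. They are not the exam data.

```python
import numpy as np
from scipy import stats as sts


def one_way_anova(*samples):
    """Return (F statistic, p-value) for a one-way ANOVA on raw samples."""
    samples = [np.asarray(s, dtype=float) for s in samples]
    k = len(samples)                      # number of groups
    n = sum(len(s) for s in samples)      # total number of observations
    grand_mean = np.concatenate(samples).mean()

    # Between-group (treatment) and within-group (error) sums of squares
    sst = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
    sse = sum(((s - s.mean()) ** 2).sum() for s in samples)

    f_stat = (sst / (k - 1)) / (sse / (n - k))
    p_value = sts.f.sf(f_stat, k - 1, n - k)  # upper tail of the F distribution
    return f_stat, p_value


# Illustrative data only (not the exam scores)
demo_a = [9, 11, 10, 12, 8]
demo_b = [7, 9, 8, 10, 6]
demo_c = [12, 14, 13, 15, 11]

f_manual, p_manual = one_way_anova(demo_a, demo_b, demo_c)
f_scipy, p_scipy = sts.f_oneway(demo_a, demo_b, demo_c)
print(f"manual: F = {f_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  F = {f_scipy:.4f}, p = {p_scipy:.4f}")
```

Applied to this exercise, `sts.f_oneway(M, E, L)` and `one_way_anova(M, E, L)` should return the same pair. Here $H_0$ states that the three group means are equal and $H_a$ that at least two of them differ, so $H_0$ is rejected when the p-value falls below $\alpha = 0.05$.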


