# Hypothesis testing solutions

## Exercise 1

### 1. State the hypotheses

* **Null Hypothesis ($H_0$)**: The two diets produce the same mean weight loss. That is, the population means are equal: $\mu_1 = \mu_2$.
* **Alternative Hypothesis ($H_1$)**: The two diets produce different mean weight loss. That is, the population means are not equal: $\mu_1 \neq \mu_2$.

### 2. Check the Assumptions

For the t-test for independent samples to be valid, the following assumptions should ideally be met:

* **Independence**: The observations within each group are independent, and the groups themselves are independent. This is met as the groups were randomly selected.
* **Normality**: The data in each group should be approximately normally distributed. With small sample sizes ($n=7$ per group), it's difficult to verify this assumption visually or with a formal test, but for the purpose of this exercise, we assume it's met. In a real-world scenario, you might use a **Shapiro-Wilk test** to check for normality, keeping in mind that it has little power to detect departures from normality in samples this small.
* **Homogeneity of Variances**: The population variances of the two groups should be approximately equal. The `scipy.stats.ttest_ind` function assumes equal variances by default (`equal_var=True`). A **Levene's test** could be used to verify this assumption.
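The checks mentioned above can be sketched as follows. This is a minimal illustration, assuming the diet data from this exercise and an assumed cut-off of 0.05 for the assumption checks; with only 7 observations per group, a non-significant result is weak evidence at best, not proof that the assumption holds.

```python
import scipy.stats as stats

diet_1 = [2.0, 2.5, 3.0, 2.8, 2.3, 2.7, 2.5]
diet_2 = [3.0, 3.2, 3.1, 2.9, 2.8, 3.0, 3.2]

alpha = 0.05  # assumed significance level for the assumption checks

# Shapiro-Wilk: H0 is that the sample was drawn from a normal distribution
shapiro_1 = stats.shapiro(diet_1)
shapiro_2 = stats.shapiro(diet_2)
print(f"Shapiro-Wilk p-value (diet 1): {shapiro_1.pvalue:.4f}")
print(f"Shapiro-Wilk p-value (diet 2): {shapiro_2.pvalue:.4f}")

# Levene: H0 is that both groups have the same population variance
levene_result = stats.levene(diet_1, diet_2)
print(f"Levene p-value: {levene_result.pvalue:.4f}")

# A p-value below alpha flags a possible violation of the assumption
for name, p in [("normality (diet 1)", shapiro_1.pvalue),
                ("normality (diet 2)", shapiro_2.pvalue),
                ("equal variances", levene_result.pvalue)]:
    verdict = "possible violation" if p < alpha else "no evidence of violation"
    print(f"{name}: {verdict}")
```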

### 3. Perform Student's t-test

```python
import scipy.stats as stats

diet_1 = [2.0, 2.5, 3.0, 2.8, 2.3, 2.7, 2.5]
diet_2 = [3.0, 3.2, 3.1, 2.9, 2.8, 3.0, 3.2]

# Student's t-test
t_value, p_value = stats.ttest_ind(diet_1, diet_2)

print(f"t-value: {t_value}")
print(f"p-value: {p_value}")
```

```text
t-value: -3.5383407969933938
p-value: 0.004083270191713912
```
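The t-value reported above can be reproduced by hand from the pooled-variance formula $t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_p^2 (1/n_1 + 1/n_2)}$, where $s_p^2$ is the pooled sample variance. The sketch below uses only the standard library; the variable names are illustrative and not part of the original solution.

```python
import math
import statistics

diet_1 = [2.0, 2.5, 3.0, 2.8, 2.3, 2.7, 2.5]
diet_2 = [3.0, 3.2, 3.1, 2.9, 2.8, 3.0, 3.2]

n1, n2 = len(diet_1), len(diet_2)
mean_1, mean_2 = statistics.fmean(diet_1), statistics.fmean(diet_2)

# Sample variances (denominator n - 1)
var_1, var_2 = statistics.variance(diet_1), statistics.variance(diet_2)

# Pooled variance: weighted average of the two sample variances
degrees_of_freedom = n1 + n2 - 2
pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / degrees_of_freedom

# Standard error of the difference in means, then the t statistic
standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_manual = (mean_1 - mean_2) / standard_error

print(f"mean diet 1: {mean_1:.4f}, mean diet 2: {mean_2:.4f}")
print(f"t-value (manual): {t_manual:.4f}")
print(f"degrees of freedom: {degrees_of_freedom}")
```

The manual statistic should agree with the value returned by `stats.ttest_ind`, and the sign shows that it is computed as the mean of the first sample minus the mean of the second.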


### 4. Conclusion

We compare the p-value with a chosen significance level, typically $\alpha = 0.05$.

* Since the **p-value (0.0041)** is less than the significance level **(0.05)**, we **reject the null hypothesis ($H_0$)**.
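* This suggests that there is a **statistically significant difference** in the average weight loss between the two diets. In other words, if the population means were equal, a difference at least as large as the one observed would be unlikely to arise by chance alone. The negative t-value indicates that the sample mean for diet 1 is lower than the sample mean for diet 2.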
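Because the equal-variance assumption was taken on trust, one reasonable robustness check is Welch's t-test, which does not assume equal variances. This is a swapped-in variant, not the method used in the solution above; it is obtained by passing `equal_var=False` to `scipy.stats.ttest_ind`.

```python
import scipy.stats as stats

diet_1 = [2.0, 2.5, 3.0, 2.8, 2.3, 2.7, 2.5]
diet_2 = [3.0, 3.2, 3.1, 2.9, 2.8, 3.0, 3.2]

# Student's t-test (pooled variance), as in the solution above
student_result = stats.ttest_ind(diet_1, diet_2)

# Welch's t-test: same call, but without the equal-variance assumption
welch_result = stats.ttest_ind(diet_1, diet_2, equal_var=False)

print(f"Student: t = {student_result.statistic:.4f}, p = {student_result.pvalue:.4f}")
print(f"Welch:   t = {welch_result.statistic:.4f}, p = {welch_result.pvalue:.4f}")
```

With equal group sizes, as here, the two tests produce the same t statistic; only the degrees of freedom, and therefore the p-value, differ. If both versions lead to the same decision, the conclusion does not hinge on the equal-variance assumption.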

## Exercise 2

### 1. State the hypotheses

* **Null Hypothesis ($H_0$)**: The average corn yield for all three fertilizers is equal. That is, $\mu_1 = \mu_2 = \mu_3$.
* **Alternative Hypothesis ($H_1$)**: At least one fertilizer's mean corn yield differs from the others. That is, $\mu_i \neq \mu_j$ for at least one pair $i \neq j$.

### 2. Check the Assumptions

For a one-way ANOVA to be valid, the following assumptions should be met:

* **Independence**: The samples from each group (fertilizer) are independent. This is met, as the plots are distinct from each other.
* **Normality**: The data within each group should be approximately normally distributed. With only 5 plots per fertilizer, it's difficult to confirm, but we assume it holds.
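* **Homogeneity of Variances**: The population variances of the three groups are equal. This can be tested using Levene's test, as done below: the resulting p-value of about 0.80 is far above 0.05, so the test gives no evidence against equal variances.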

```python
import scipy.stats as stats

fertilizer_1 = [20, 21, 20, 19, 20]
fertilizer_2 = [22, 21, 23, 22, 21]
fertilizer_3 = [24, 23, 22, 23, 24]

# Levene's test for homogeneity of variances
levene_test = stats.levene(fertilizer_1, fertilizer_2, fertilizer_3)
print(f"Levene's test p-value: {levene_test.pvalue}")
```

```text
Levene's test p-value: 0.8039599174006208
```
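The normality assumption was taken on trust above. A per-group Shapiro-Wilk test is one way to probe it, sketched here under the assumption that a 0.05 cut-off is appropriate; with 5 plots per fertilizer the test has very little power, so treat the output as a rough sanity check only.

```python
import scipy.stats as stats

groups = {
    "F1": [20, 21, 20, 19, 20],
    "F2": [22, 21, 23, 22, 21],
    "F3": [24, 23, 22, 23, 24],
}

# Shapiro-Wilk per group: H0 is that the group was drawn from a normal distribution
shapiro_pvalues = {}
for name, values in groups.items():
    result = stats.shapiro(values)
    shapiro_pvalues[name] = result.pvalue
    print(f"{name}: W = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```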


### 3. Perform ANOVA test

```python
# ANOVA test
f_value, p_value = stats.f_oneway(fertilizer_1, fertilizer_2, fertilizer_3)
print(f"f-value: {f_value}")
print(f"p-value: {p_value}")
```

```text
f-value: 20.31578947368421
p-value: 0.000140478247931904
```
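The F statistic is the ratio of the between-group mean square to the within-group mean square. To make that concrete, the sketch below recomputes it from the sums of squares using only the standard library; the variable names are illustrative and not part of the original solution.

```python
import statistics

groups = [
    [20, 21, 20, 19, 20],  # fertilizer 1
    [22, 21, 23, 22, 21],  # fertilizer 2
    [24, 23, 22, 23, 24],  # fertilizer 3
]

all_values = [y for group in groups for y in group]
grand_mean = statistics.fmean(all_values)
k = len(groups)            # number of groups
n_total = len(all_values)  # total number of observations

# Between-group sum of squares: spread of group means around the grand mean
ss_between = sum(
    len(group) * (statistics.fmean(group) - grand_mean) ** 2 for group in groups
)

# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(
    (y - statistics.fmean(group)) ** 2 for group in groups for y in group
)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_manual = ms_between / ms_within

print(f"SS between: {ss_between:.4f}, SS within: {ss_within:.4f}")
print(f"f-value (manual): {f_manual:.4f} with ({k - 1}, {n_total - k}) degrees of freedom")
```

The manual value should agree with the f-value returned by `stats.f_oneway`.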


### 4. Analyze the Conclusions

* The **p-value (0.00014)** is less than the significance level **(0.05)**.
* Therefore, we **reject the null hypothesis ($H_0$)**.
* This indicates that there is a **significant difference** in the average corn yield between at least two of the three fertilizers. However, the ANOVA test alone **does not tell us which specific pairs of fertilizers are different**.

### 5. Post-hoc Test: Obtaining the Best Fertilizer

To find out which fertilizer is superior, we need to perform a **post-hoc test**. Tukey's HSD (Honestly Significant Difference) is a common choice for this.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

data = np.concatenate([fertilizer_1, fertilizer_2, fertilizer_3])
labels = ["F1"] * 5 + ["F2"] * 5 + ["F3"] * 5

# Tukey test
result = pairwise_tukeyhsd(data, labels, alpha=0.05)
print(result)
```

```text
Multiple Comparison of Means - Tukey HSD, FWER=0.05
group1 group2 meandiff p-adj  lower  upper  reject
--------------------------------------------------
    F1     F2      1.8 0.0099 0.4572 3.1428   True
    F1     F3      3.2 0.0001 1.8572 4.5428   True
    F2     F3      1.4 0.0409 0.0572 2.7428   True
--------------------------------------------------
```


**Interpreting the Results:**
The `reject` column shows whether there is a statistically significant difference between a pair of groups.

* `F1` vs `F2`: `reject=True`. There is a significant difference.
* `F1` vs `F3`: `reject=True`. There is a significant difference.
* `F2` vs `F3`: `reject=True`. There is a significant difference.

All fertilizer groups are significantly different from each other. Now, we look at the `meandiff` column, which reports the mean of `group2` minus the mean of `group1`, to see the magnitude and direction of each difference.

* The average yield of F2 is 1.8 kg higher than F1.
* The average yield of F3 is 3.2 kg higher than F1.
* The average yield of F3 is 1.4 kg higher than F2.

Based on these results, **Fertilizer 3** provided the highest average yield, and its advantage over both F1 and F2 is statistically significant, making it the most effective of the three fertilizers tested.
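The direction of each difference can be cross-checked by computing the group means directly. The sketch below uses only the standard library and the same `F1`, `F2`, `F3` labels as the Tukey table; it reproduces the mean differences but not the adjusted p-values or confidence intervals.

```python
import statistics
from itertools import combinations

groups = {
    "F1": [20, 21, 20, 19, 20],
    "F2": [22, 21, 23, 22, 21],
    "F3": [24, 23, 22, 23, 24],
}

# Mean yield per fertilizer
group_means = {name: statistics.fmean(values) for name, values in groups.items()}
for name, mean in group_means.items():
    print(f"{name}: mean yield = {mean:.1f}")

# Pairwise differences, second group minus first, matching the Tukey table
pairwise_diffs = {}
for first, second in combinations(groups, 2):
    pairwise_diffs[(first, second)] = group_means[second] - group_means[first]
    print(f"{second} - {first}: {pairwise_diffs[(first, second)]:.1f}")

# Fertilizer with the highest sample mean
best = max(group_means, key=group_means.get)
print(f"Highest mean yield: {best}")
```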