## Main Task

> We want to analyze the performance of Arsenal FC players based on their positions: forwards, midfielders, and defenders.

We have collected data on the number of goals scored by players in these three positions over a season.  
We want to determine if there is a significant difference in the average number of goals scored among the three positions.

The data:  
* Forwards: [12, 15, 14, 10, 13, 16]
* Midfielders: [8, 7, 9, 10, 8, 7]
* Defenders: [3, 4, 2, 5, 3, 4]

## Why is an ANOVA test appropriate here?

* We are comparing the means of three different groups.
* Even though the group means are visibly different, ANOVA provides a formal statistical test of whether the observed differences are statistically significant or could plausibly have occurred by random chance.
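
One-way ANOVA is usually applied under the assumptions of roughly normal data within each group and similar variances across groups. These assumptions are not stated in the text above, so the following is only a minimal sketch of how they could be checked with SciPy, using the Shapiro-Wilk and Levene tests. With six observations per group, such tests have little power, so the results are a rough sanity check at best.

```python
from scipy import stats

forwards = [12, 15, 14, 10, 13, 16]
midfielders = [8, 7, 9, 10, 8, 7]
defenders = [3, 4, 2, 5, 3, 4]
groups = {"forwards": forwards, "midfielders": midfielders, "defenders": defenders}

# Shapiro-Wilk: the null hypothesis is that the sample comes from a normal distribution.
normality_p = {}
for name, goals in groups.items():
    w_stat, p_normal = stats.shapiro(goals)
    normality_p[name] = p_normal
    print(f"{name}: W = {w_stat:.3f}, p = {p_normal:.3f}")

# Levene: the null hypothesis is that all groups have equal variances.
levene_stat, p_equal_var = stats.levene(forwards, midfielders, defenders)
print(f"Levene: statistic = {levene_stat:.3f}, p = {p_equal_var:.3f}")
```

A p-value above the chosen significance level in these checks means the corresponding assumption is not contradicted by the data; it does not prove the assumption holds.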

## Step By Step Solution

#### State the Hypotheses:
* Null Hypothesis ($H_0$): $\mu_{forwards} = \mu_{midfielders} = \mu_{defenders}$  
The means of the number of goals scored by forwards, midfielders, and defenders are equal.

* Alternative Hypothesis ($H_1$): At least one group mean $\mu$ differs from the others.

#### Significance Level
We will use a significance level of α = 0.05, a conventional choice in hypothesis testing. It means we accept a 5% risk of concluding that a difference exists when there is no actual difference (a Type I error).

#### Calculate the means
* Mean of the forwards ($\bar X_f$): $$\bar X_f = \frac{12 + 15 + 14 + 10 + 13 + 16}{6} = 13.33$$
* Mean of the midfielders ($\bar X_m$): $$\bar X_m = \frac{8 + 7 + 9 + 10 + 8 + 7}{6} = 8.17$$
* Mean of the defenders ($\bar X_d$): $$\bar X_d = \frac{3 + 4 + 2 + 5 + 3 + 4}{6} = 3.50$$

#### Calculate the overall mean
$$\bar X = \frac{12 + 15 + 14 + 10 + 13 + 16 + 8 + 7 + 9 + 10 + 8 + 7 + 3 + 4 + 2 + 5 + 3 + 4}{18} = 8.33$$
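
The group means and the overall mean can be reproduced with NumPy. This is a small sketch; the names `groups`, `group_means`, and `grand_mean` are chosen here for illustration.

```python
import numpy as np

groups = {
    "forwards": [12, 15, 14, 10, 13, 16],
    "midfielders": [8, 7, 9, 10, 8, 7],
    "defenders": [3, 4, 2, 5, 3, 4],
}

# Mean number of goals per position
group_means = {name: float(np.mean(goals)) for name, goals in groups.items()}
for name, mean in group_means.items():
    print(f"{name}: {mean:.2f}")

# Overall (grand) mean across all 18 players
all_goals = np.concatenate(list(groups.values()))
grand_mean = float(all_goals.mean())
print(f"overall: {grand_mean:.2f}")
```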

#### Sum of Squares Between
$$SSB = n_f(\bar X_f - \bar X)^2  + n_m(\bar X_m - \bar X)^2 + n_d(\bar X_d - \bar X)^2$$  
where $n_f = n_m = n_d = 6$.

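$$SSB = 6(13.33 - 8.33)^2  + 6(8.17 - 8.33)^2 + 6(3.50 - 8.33)^2 \approx 290.33$$

The result is computed with the unrounded means; the two-decimal values are shown only for readability.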
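
As a check on the arithmetic, the between-group sum of squares can be computed directly from the data. This is a minimal sketch with illustrative variable names; it keeps full precision for the means, since rounding them to two decimals before squaring shifts the total slightly.

```python
import numpy as np

groups = [
    np.array([12, 15, 14, 10, 13, 16]),  # forwards
    np.array([8, 7, 9, 10, 8, 7]),       # midfielders
    np.array([3, 4, 2, 5, 3, 4]),        # defenders
]

grand_mean = np.concatenate(groups).mean()

# Each group contributes n_i * (group mean - overall mean)^2
ssb = float(sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups))
print(f"SSB = {ssb:.2f}")  # SSB = 290.33
```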

#### Sum of Squares Within
$$
SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2
$$

$$
SSW_{forwards} = (12 - 13.33)^2 + (15 - 13.33)^2 + (14 - 13.33)^2 + (10 - 13.33)^2 + (13 - 13.33)^2 + (16 - 13.33)^2 \approx 23.33
$$

$$
SSW_{midfielders} = (8 - 8.17)^2 + (7 - 8.17)^2 + (9 - 8.17)^2 + (10 - 8.17)^2 + (8 - 8.17)^2 + (7 - 8.17)^2 \approx 6.83
$$

$$
SSW_{defenders} = (3 - 3.50)^2 + (4 - 3.50)^2 + (2 - 3.50)^2 + (5 - 3.50)^2 + (3 - 3.50)^2 + (4 - 3.50)^2 = 5.50
$$

$$
SSW = SSW_{forwards} + SSW_{midfielders} + SSW_{defenders} \approx 23.33 + 6.83 + 5.50 \approx 35.67
$$ 
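
The within-group sum of squares can be verified the same way. The sketch below sums the squared deviations of each observation from its own group mean; the dictionary names are illustrative.

```python
import numpy as np

groups = {
    "forwards": np.array([12, 15, 14, 10, 13, 16]),
    "midfielders": np.array([8, 7, 9, 10, 8, 7]),
    "defenders": np.array([3, 4, 2, 5, 3, 4]),
}

# Squared deviations from each group's own mean, summed per group
ssw_per_group = {
    name: float(((goals - goals.mean()) ** 2).sum())
    for name, goals in groups.items()
}
for name, value in ssw_per_group.items():
    print(f"SSW_{name} = {value:.2f}")

ssw = sum(ssw_per_group.values())
print(f"SSW = {ssw:.2f}")  # SSW = 35.67
```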

#### Degrees of Freedom
* Between the groups: $df_{between} = k - 1 = 3 - 1 = 2$
* Within Groups: $df_{within} = N - k = 18 - 3 = 15$

#### Mean Squares
* Mean Square Between (MSB):
$$MSB = \frac{SSB}{df_{between}} = \frac{290.33}{2} \approx 145.17$$
* Mean Square Within (MSW):
$$MSW = \frac{SSW}{df_{within}} = \frac{35.67}{15} \approx 2.378$$

#### F-Statistic:
$$ F = \frac{MSB}{MSW} = \frac{145.17}{2.378} \approx 61.05 $$
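
The mean squares and the F-statistic follow from the sums of squares and degrees of freedom. The following sketch recomputes the whole chain from the raw data so that no intermediate rounding is involved; variable names are illustrative.

```python
import numpy as np

groups = [
    np.array([12, 15, 14, 10, 13, 16]),  # forwards
    np.array([8, 7, 9, 10, 8, 7]),       # midfielders
    np.array([3, 4, 2, 5, 3, 4]),        # defenders
]

k = len(groups)                        # number of groups
n_total = sum(len(g) for g in groups)  # total number of observations
grand_mean = np.concatenate(groups).mean()

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

msb = ssb / (k - 1)        # df_between = k - 1
msw = ssw / (n_total - k)  # df_within = N - k
f_stat = float(msb / msw)
print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_stat:.2f}")
# MSB = 145.17, MSW = 2.38, F = 61.05
```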

#### Decision Rule
Compare the calculated F-statistic to the critical value from the F-distribution table at α = 0.05 with $df_1 = 2$ and $df_2 = 15$.  

Using an F-table or a calculator, the critical value of F(2, 15) at α = 0.05 is approximately 3.68.
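
Instead of a printed table, the critical value can be looked up with SciPy's F-distribution. This is a short sketch using `scipy.stats.f.ppf`, the inverse of the cumulative distribution function.

```python
from scipy import stats

alpha = 0.05
df_between = 2   # k - 1
df_within = 15   # N - k

# The critical value is the (1 - alpha) quantile of F(df_between, df_within)
f_critical = float(stats.f.ppf(1 - alpha, df_between, df_within))
print(f"Critical value: {f_critical:.2f}")  # Critical value: 3.68
```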

#### Decision
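Since $61.05 > 3.68$, we reject the null hypothesis: at the 5% significance level, the mean number of goals differs for at least one of the three positions.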
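
The same decision can be expressed through a p-value, the probability of seeing an F-statistic at least this large if the null hypothesis were true. The sketch below uses a small helper, `one_way_anova`, which is a hypothetical name introduced here for illustration and not part of any library; the p-value comes from `scipy.stats.f.sf`, the survival function of the F-distribution.

```python
import numpy as np
from scipy import stats


def one_way_anova(groups):
    """Illustrative helper: return (F, df_between, df_within) for a list of samples."""
    arrays = [np.asarray(g, dtype=float) for g in groups]
    k = len(arrays)
    n_total = sum(len(a) for a in arrays)
    grand_mean = np.concatenate(arrays).mean()
    ssb = sum(len(a) * (a.mean() - grand_mean) ** 2 for a in arrays)
    ssw = sum(((a - a.mean()) ** 2).sum() for a in arrays)
    df_between, df_within = k - 1, n_total - k
    return float((ssb / df_between) / (ssw / df_within)), df_between, df_within


forwards = [12, 15, 14, 10, 13, 16]
midfielders = [8, 7, 9, 10, 8, 7]
defenders = [3, 4, 2, 5, 3, 4]

f_stat, df_between, df_within = one_way_anova([forwards, midfielders, defenders])

# Upper-tail probability of the F-distribution at the observed statistic
p_value = float(stats.f.sf(f_stat, df_between, df_within))

alpha = 0.05
reject_null = p_value < alpha
print(f"F = {f_stat:.2f}, p-value = {p_value:.2e}, reject H0: {reject_null}")
```

Comparing the p-value with α and comparing the F-statistic with the critical value always lead to the same decision.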

## Python Implementation

```python
from scipy import stats

# Data
forwards = [12, 15, 14, 10, 13, 16]
midfielders = [8, 7, 9, 10, 8, 7]
defenders = [3, 4, 2, 5, 3, 4]

# Perform one-way ANOVA
f_statistic, p_value = stats.f_oneway(forwards, midfielders, defenders)

# Output the results
print(f"F-statistic: {f_statistic}")
print(f"P-value: {p_value}")

# Decision rule
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: There is a significant difference between the means of the groups.")
else:
    print("Fail to reject the null hypothesis: There is no significant difference between the means of the groups.")
```

```text
F-statistic: 61.05140186915875
P-value: 6.206359406875525e-08
Reject the null hypothesis: There is a significant difference between the means of the groups.
```
