## 🎯 What is ANOVA?

* ANOVA = Analysis of Variance

* It tests whether the means of 3 or more groups differ.

* Null Hypothesis (H₀):

      All group means are equal: μ₁ = μ₂ = μ₃ = ...

* Alternative Hypothesis (H₁):

      At least one group mean differs from the others.

#### ⭐ When do we use ANOVA?

* When you compare 3+ groups.

      Example: Marks of students taught by 3 different teachers. Are their mean scores the same or different?

#### ⭐ Types of ANOVA
1) One-way ANOVA

* One independent variable (factor) with 3+ groups.
* ✔ Teacher → 3 groups
* ✔ Diet types → 4 groups

2) Two-way ANOVA

* Two independent variables (factors), for example:
* ✔ Gender (M/F)
* ✔ Diet type (A/B/C)

### One-way ANOVA: Manual Step-by-Step Calculation
Assume 3 groups:

| Group | Scores   |
| ----- | -------- |
| A     | 8, 9, 6  |
| B     | 5, 4, 7  |
| C     | 10, 9, 8 |
#### STEP 1: Calculate group means
![image.png](attachment:886d9a07-22ee-4013-bf01-90d94dcfdde1.png)
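
Worked from the scores in the table above, the group means and the grand mean come out as follows (the bar notation $\bar{x}$ is chosen here for illustration; values are rounded to three decimals):

$$
\bar{x}_A = \frac{8 + 9 + 6}{3} \approx 7.667, \quad
\bar{x}_B = \frac{5 + 4 + 7}{3} \approx 5.333, \quad
\bar{x}_C = \frac{10 + 9 + 8}{3} = 9.000
$$

$$
\bar{x} = \frac{23 + 16 + 27}{9} = \frac{66}{9} \approx 7.333
$$
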
#### STEP 2: Between-Group Sum of Squares (SSB)
![image.png](attachment:395a49e3-9e99-44ec-a70b-cfeef2045fb2.png)
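
As a sketch of the arithmetic for this step, using the same example scores: each group has $n_i = 3$ scores, the group means are approximately 7.667, 5.333, and 9.000, and the grand mean is approximately 7.333 (the symbols $n_i$, $\bar{x}_i$, and $\bar{x}$ are notation assumed here).

$$
SSB = \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2
    = 3(7.667 - 7.333)^2 + 3(5.333 - 7.333)^2 + 3(9.000 - 7.333)^2
$$

$$
SSB \approx 0.333 + 12.000 + 8.333 = 20.667
$$
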
#### STEP 3: Within-Group Sum of Squares (SSW)
![image.png](attachment:8bc68c8f-32f5-44ba-af38-e13b514a9b03.png)
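
The within-group sum of squares adds up the squared deviations of each score from its own group mean. A worked sketch for the example scores (with $x_{ij}$ assumed here to denote score $j$ in group $i$):

$$
SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
$$

$$
\begin{aligned}
A &: (8 - 7.667)^2 + (9 - 7.667)^2 + (6 - 7.667)^2 \approx 4.667 \\
B &: (5 - 5.333)^2 + (4 - 5.333)^2 + (7 - 5.333)^2 \approx 4.667 \\
C &: (10 - 9)^2 + (9 - 9)^2 + (8 - 9)^2 = 2.000
\end{aligned}
$$

$$
SSW \approx 4.667 + 4.667 + 2.000 = 11.333
$$
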
#### STEP 4: Degrees of Freedom
![image.png](attachment:66dbb86c-955e-44f2-a09f-56c0ddadedb9.png)
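
For the example, with $k = 3$ groups and $N = 9$ scores in total, the degrees of freedom work out as follows (the subscripts B, W, and T for between, within, and total are labels chosen here):

$$
df_B = k - 1 = 2, \qquad
df_W = N - k = 6, \qquad
df_T = N - 1 = 8
$$
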
#### STEP 5: Find Mean Squares
![image.png](attachment:816dcfef-d42a-476f-89d5-5ba132197458.png)
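
Each mean square is a sum of squares divided by its degrees of freedom. A worked sketch for the example, taking $SSB \approx 20.667$ and $SSW \approx 11.333$ from the example scores:

$$
MSB = \frac{SSB}{df_B} = \frac{20.667}{2} \approx 10.333, \qquad
MSW = \frac{SSW}{df_W} = \frac{11.333}{6} \approx 1.889
$$
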
#### STEP 6: Compute F-statistic
![image.png](attachment:dc1a996d-a513-4975-97b8-07d01c1cc661.png)
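
The F-statistic is the ratio of the two mean squares. For the example scores, with $MSB \approx 10.333$ and $MSW \approx 1.889$:

$$
F = \frac{MSB}{MSW} = \frac{10.333}{1.889} \approx 5.47
$$
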
#### STEP 7: Compare with Critical Value
![image.png](attachment:49aee884-99a1-4e8a-b86a-d96d879d71c8.png)
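
A sketch of the decision for the example at $\alpha = 0.05$, using the critical value and p-value that SciPy reports for these scores:

$$
F_{\text{critical}} = F_{0.05}(2, 6) \approx 5.143
$$

$$
F \approx 5.47 > 5.143 \;\Rightarrow\; \text{reject } H_0
\qquad (p \approx 0.044 < 0.05)
$$
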

| Source  | SS      | df | MS      | F    |
| ------- | ------- | -- | ------- | ---- |
| Between | 20.6667 | 2  | 10.3333 | 5.47 |
| Within  | 11.3333 | 6  | 1.8889  | -    |
| Total   | 32.0000 | 8  | -       | -    |
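
The Total row of an ANOVA table rests on the identity SST = SSB + SSW. The snippet below is a minimal sketch that checks this identity for the example scores; the variable names (`sst`, `ssb`, `ssw`) are chosen here for illustration.

```python
import numpy as np

# Scores from the example table (groups A, B, C)
groups = [np.array([8, 9, 6]), np.array([5, 4, 7]), np.array([10, 9, 8])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Total sum of squares: every score against the grand mean
sst = ((all_scores - grand_mean) ** 2).sum()

# Between-group: group means against the grand mean, weighted by group size
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group: every score against its own group mean
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

print("SST       =", round(float(sst), 4))
print("SSB + SSW =", round(float(ssb + ssw), 4))
```

If the two printed values disagree, one of the sums of squares has been miscalculated, which makes this a cheap sanity check on a hand-built table.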


```python
# One-way ANOVA computed manually
import numpy as np
from scipy.stats import f

# ------------------------------
# Example Data
A = np.array([8, 9, 6])
B = np.array([5, 4, 7])
C = np.array([10, 9, 8])
groups = [A, B, C]
alpha = 0.05  # significance level
# ------------------------------

# Group sizes and means
n = [len(g) for g in groups]
means = [np.mean(g) for g in groups]
N = sum(n)       # total number of observations
k = len(groups)  # number of groups

overall_mean = np.mean(np.concatenate(groups))

# SSB: between-group sum of squares
SSB = sum(n[i] * (means[i] - overall_mean)**2 for i in range(k))

# SSW: within-group sum of squares
SSW = sum(((groups[i] - means[i])**2).sum() for i in range(k))

# Degrees of freedom
dfB = k - 1
dfW = N - k

# Mean squares
MSB = SSB / dfB
MSW = SSW / dfW

F_stat = MSB / MSW

# Right-tail p-value and critical value from the F distribution
p_value = 1 - f.cdf(F_stat, dfB, dfW)
F_critical = f.ppf(1 - alpha, dfB, dfW)

print("SSB =", SSB)
print("SSW =", SSW)
print("MSB =", MSB)
print("MSW =", MSW)
print("F =", F_stat)
print("p-value =", p_value)
print("F-critical =", F_critical)
print("Decision:", "Reject H0" if F_stat > F_critical else "Fail to Reject H0")
```


```
SSB = 20.66666666666667
SSW = 11.333333333333332
MSB = 10.333333333333336
MSW = 1.8888888888888886
F = 5.4705882352941195
p-value = 0.04442455150462965
F-critical = 5.143252849784718
Decision: Reject H0
```


```python
# One-way ANOVA using scipy.stats.f_oneway
from scipy import stats

# Sales figures for three regions
north = [10, 12, 9, 11]
south = [8, 9, 7, 10]
east = [13, 15, 14, 16]

f_stat, p_value = stats.f_oneway(north, south, east)

print("F-Statistic:", f_stat)
print("p-value:", p_value)

# Compare the p-value with the 5% significance level
if p_value < 0.05:
    print("Reject H₀ → At least one region's sales differ.")
else:
    print("Fail to Reject H₀ → No significant difference in sales.")
```


```
F-Statistic: 22.399999999999995
p-value: 0.0003203104089926066
Reject H₀ → At least one region's sales differ.
```


```python
import numpy as np
from scipy.stats import f_oneway, f

# --------------------
# Example data
A = np.array([8, 9, 6])
B = np.array([5, 4, 7])
C = np.array([10, 9, 8])
alpha = 0.05  # significance level
# --------------------

# -------- ANOVA using stats --------
F_stat, p_value = f_oneway(A, B, C)

# Degrees of freedom
k = 3  # number of groups (A, B, C)
N = len(A) + len(B) + len(C)  # total number of observations

df_between = k - 1
df_within  = N - k

# Critical F-value at the chosen significance level
F_critical = f.ppf(1 - alpha, df_between, df_within)

# Decision: reject H0 when the F-statistic exceeds the critical value
decision = "Reject H0" if F_stat > F_critical else "Fail to Reject H0"

print("F-statistic:", F_stat)
print("p-value:", p_value)
print("F-critical:", F_critical)
print("Decision:", decision)
```


```
F-statistic: 5.470588235294121
p-value: 0.044424551504629574
F-critical: 5.143252849784718
Decision: Reject H0
```
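
The cells above decide in two ways: comparing F with the critical value, and comparing the p-value with alpha. The two rules always agree, because both are read off the same F distribution. The snippet below is a small sketch of that equivalence for the example; it uses `f.sf` (the survival function, equal to 1 - cdf) for the p-value, and the names `reject_by_critical` and `reject_by_p_value` are chosen here for illustration.

```python
from scipy.stats import f

alpha = 0.05
df_between, df_within = 2, 6   # k - 1 and N - k for 3 groups of 3 scores
F_stat = 5.470588235294121     # F-statistic reported in the output above

# Rule 1: compare the statistic with the critical value
F_critical = f.ppf(1 - alpha, df_between, df_within)
reject_by_critical = bool(F_stat > F_critical)

# Rule 2: compare the right-tail probability (p-value) with alpha
p_value = f.sf(F_stat, df_between, df_within)
reject_by_p_value = bool(p_value < alpha)

print("Reject by critical value:", reject_by_critical)
print("Reject by p-value:      ", reject_by_p_value)
```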


#### 🔍 Why use scipy.stats.f_oneway()?

* It computes the F-statistic and its p-value in a single call:
![image.png](attachment:a78f5114-91cd-4968-a7b6-c8e1c26958f2.png)
* You do not have to calculate SSB, SSW, MSB, and MSW by hand.
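
To see that the shortcut and the manual route agree, the sketch below wraps the manual steps in a helper and compares its result with `f_oneway`. The helper `manual_oneway` is a hypothetical name introduced here; it mirrors the manual calculation in this notebook and is not a description of SciPy's internal implementation.

```python
import numpy as np
from scipy.stats import f, f_oneway


def manual_oneway(*samples):
    """Hypothetical helper: one-way ANOVA F and p-value from sums of squares."""
    samples = [np.asarray(s, dtype=float) for s in samples]
    k = len(samples)                         # number of groups
    n_total = sum(len(s) for s in samples)   # total number of observations
    grand_mean = np.concatenate(samples).mean()

    # Between-group and within-group sums of squares
    ssb = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
    ssw = sum(((s - s.mean()) ** 2).sum() for s in samples)

    # F = MSB / MSW, with the right-tail p-value from the F distribution
    f_stat = (ssb / (k - 1)) / (ssw / (n_total - k))
    p_value = f.sf(f_stat, k - 1, n_total - k)
    return f_stat, p_value


A, B, C = [8, 9, 6], [5, 4, 7], [10, 9, 8]
manual = manual_oneway(A, B, C)
library = f_oneway(A, B, C)

print("Manual :", manual)
print("SciPy  :", (library.statistic, library.pvalue))
print("Match  :", np.allclose(manual, (library.statistic, library.pvalue)))
```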