## Question
Using the following data, perform a one-way analysis of variance using α = .05. Write up the results in APA format.<br>
[Group1: 51, 45, 33, 45, 67]<br>
[Group2: 23, 43, 23, 43, 45]<br>
[Group3: 56, 76, 74, 87, 56]<br>

## Answer

This is a one-way ANOVA problem. We calculate the F statistic, which is the ratio of the variance between the group means to the variance within the groups, and use it to test whether the means of the three groups differ.

**1. F-test :-** $$F = \frac{MSS_B}{MSS_w}$$

**2. Mean Sum of Squares (within Groups)**
$$MSS_w = \frac{\sum_{g \in G} \sum_{X \in g} (X - \bar X_g)^2}{n - k}$$

**3. Mean Sum of Squares (Between Groups)**
$$MSS_B = \frac{\sum_{g \in G} n_g(\bar X_g - \bar X_G)^2}{k - 1}$$

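n = total number of observations (all groups combined)<br>
k = total number of groups<br>
g = an individual group, with $n_g$ observations and mean $\bar X_g$<br>
G = the set of all groups (the grand data set), with grand mean $\bar X_G$<br>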

**Null hypothesis :-** $\mu_1 = \mu_2 = \mu_3$<br>
**Alternative hypothesis :-** at least one group mean differs from the others
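
The formulas above can be sketched in plain Python as follows. This is a minimal illustration and not one of the notebook's own cells; the function name `one_way_anova` and the order of its return values are assumptions made for the example.

```python
def one_way_anova(groups):
    """Return (MSS_B, MSS_w, F) for a list of groups of numbers.

    Hypothetical helper written for illustration only.
    """
    n = sum(len(g) for g in groups)               # total number of observations
    k = len(groups)                               # number of groups
    grand_mean = sum(sum(g) for g in groups) / n  # mean of all observations
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: n_g * (group mean - grand mean)^2, summed over groups.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within groups: (X - group mean)^2, summed over every observation.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    mss_b = ss_between / (k - 1)
    mss_w = ss_within / (n - k)
    return mss_b, mss_w, mss_b / mss_w


data = [
    [51, 45, 33, 45, 67],  # Group1
    [23, 43, 23, 43, 45],  # Group2
    [56, 76, 74, 87, 56],  # Group3
]
mss_b, mss_w, f_stat = one_way_anova(data)
print(mss_b, mss_w, f_stat)
```

The step-by-step cells that follow compute the same quantities one at a time with pandas.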

In [1]:
import numpy as np
import pandas as pd

In [2]:
d1 = {'Group1' : pd.Series([51, 45, 33, 45, 67]),
     'Group2' : pd.Series([23, 43, 23, 43, 45]),
     'Group3' : pd.Series([56, 76, 74, 87, 56])}

df1 = pd.DataFrame(d1)
df1

|   | Group1 | Group2 | Group3 |
|---|---|---|---|
| 0 | 51 | 23 | 56 |
| 1 | 45 | 43 | 76 |
| 2 | 33 | 23 | 74 |
| 3 | 45 | 43 | 87 |
| 4 | 67 | 45 | 56 |


## 1. Calculating $n_g$ :- (number of observations in each group)

In [3]:
n_g = pd.DataFrame(df1.count())
n_g = n_g.T
n_g = n_g.rename(index = {0: 'ng'})
n_g

|   | Group1 | Group2 | Group3 |
|---|---|---|---|
| ng | 5 | 5 | 5 |


## 3. Calculating $\bar x_g$ :- (mean of each group)

In [4]:
mean_g = pd.DataFrame(df1.mean())
mean_g = mean_g.T
mean_g = mean_g.rename(index = {0: 'mean_g'})
mean_g

|   | Group1 | Group2 | Group3 |
|---|---|---|---|
| mean_g | 48.2 | 35.4 | 69.8 |


## 2. Calculating $N_g$ :- (total number of observations across all groups)

In [5]:
N_g = pd.DataFrame(n_g.sum(axis = 1))
N_g = N_g.rename(index = {'ng': 'Ng'})
N_g = N_g.iloc[0, 0]
N_g

15

## 4. Calculating $\bar x_G$ :- (grand mean of all observations across the groups)


In [6]:
df1.loc['sum'] = df1.sum()
sum_G = df1.loc['sum', ['Group1', 'Group2', 'Group3']].sum()
mean_G = (sum_G) / (N_g)
mean_G

51.133333333333333

## 5. Calculating $k$ :- (number of groups)

In [7]:
k = pd.DataFrame(n_g.count(axis = 1))
k = k.iloc[0, 0]
k

3

## 6. Calculating $(X - \bar X_g)^2$ :-

In [8]:
df1['diff1'] = (df1.iloc[0:5, 0] - mean_g.iloc[0, 0])**2
df1['diff2'] = (df1.iloc[0:5, 1] - mean_g.iloc[0, 1])**2
df1['diff3'] = (df1.iloc[0:5, 2] - mean_g.iloc[0, 2])**2
df1.iloc[0:5, :]

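|   | Group1 | Group2 | Group3 | diff1 | diff2 | diff3 |
|---|---|---|---|---|---|---|
| 0 | 51 | 23 | 56 | 7.84 | 153.76 | 190.44 |
| 1 | 45 | 43 | 76 | 10.24 | 57.76 | 38.44 |
| 2 | 33 | 23 | 74 | 231.04 | 153.76 | 17.64 |
| 3 | 45 | 43 | 87 | 10.24 | 57.76 | 295.84 |
| 4 | 67 | 45 | 56 | 353.44 | 92.16 | 190.44 |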


In [9]:
df1.loc['sum'] = df1.iloc[0:5, 3:6].sum(axis=0)
df1.iloc[0:6, 3:6]

|   | diff1 | diff2 | diff3 |
|---|---|---|---|
| 0 | 7.84 | 153.76 | 190.44 |
| 1 | 10.24 | 57.76 | 38.44 |
| 2 | 231.04 | 153.76 | 17.64 |
| 3 | 10.24 | 57.76 | 295.84 |
| 4 | 353.44 | 92.16 | 190.44 |
| sum | 612.8 | 515.2 | 732.8 |


## 7. Calculating $MSS_w$ :- (mean sum of squares within groups)

In [10]:
sum_sqr_within = df1.iloc[5, 3:6].sum()
mean_sum_sqr_within = sum_sqr_within / (N_g - k)
mean_sum_sqr_within

155.06666666666666
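
As a cross-check on this value, the within-group sum of squares can also be obtained from the sample variance of each group, since each group's sum of squared deviations equals $(n_g - 1)$ times its sample variance. The sketch below uses only the standard library; the variable names are chosen for illustration and are not the notebook's.

```python
from statistics import variance

groups = [
    [51, 45, 33, 45, 67],  # Group1
    [23, 43, 23, 43, 45],  # Group2
    [56, 76, 74, 87, 56],  # Group3
]

n = sum(len(g) for g in groups)  # total number of observations
k = len(groups)                  # number of groups

# statistics.variance is the sample variance (divisor n_g - 1), so
# multiplying by (n_g - 1) recovers each group's sum of squared deviations.
ss_within = sum((len(g) - 1) * variance(g) for g in groups)
mss_w = ss_within / (n - k)
print(mss_w)
```

Because the three groups have equal size here, this is the same as averaging the three sample variances.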

## 8. Calculating $MSS_B$ :- (mean sum of squares between groups)

In [23]:
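import math

# n_g * (group mean - grand mean)^2 for each group, with n_g = 5.
# Note: the grand mean 51.1333... is entered here rounded to 51.13.
x = 5*math.pow((48.2 - 51.13), 2)
y = 5*math.pow((35.4 - 51.13), 2)
z = 5*math.pow((69.8 - 51.13), 2)

In [25]:
# Divide the between-group sum of squares by its degrees of freedom, k - 1 = 3 - 1.
mean_sum_sqr_between = (x+y+z)/(3-1)
mean_sum_sqr_between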

1511.46675
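
The cell above types in the group means and a grand mean rounded to two decimals. A possible alternative, sketched here under the assumption that the raw data are rebuilt in a fresh DataFrame (the names `df`, `ss_between`, and `mss_b` are illustrative), computes the same quantity from the unrounded means:

```python
import pandas as pd

df = pd.DataFrame({
    "Group1": [51, 45, 33, 45, 67],
    "Group2": [23, 43, 23, 43, 45],
    "Group3": [56, 76, 74, 87, 56],
})

n_g = df.count()                    # observations per group
group_means = df.mean()             # mean of each group (column)
grand_mean = df.to_numpy().mean()   # mean of all 15 observations
k = df.shape[1]                     # number of groups

# Between-group sum of squares: n_g * (group mean - grand mean)^2, summed.
ss_between = (n_g * (group_means - grand_mean) ** 2).sum()
mss_b = ss_between / (k - 1)
print(mss_b)
```

The result agrees with the hand-entered calculation to about four decimal places, so the rounding does not change the conclusion for this data set.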

## 9. Calculating the F statistic :-

In [27]:
f_stats = mean_sum_sqr_between / mean_sum_sqr_within
f_stats

9.7472060404127259

## 10. Calculating degrees of freedom :-

In [31]:
d_o_f_num = k - 1
d_o_f_denom = N_g - k
print(d_o_f_num)
print(d_o_f_denom)

2
12


## 11. Using the F table :-

In [32]:
F_crit = 3.89  # critical value of F(2, 12) at α = .05, read from an F table
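
If SciPy is available, the table lookup can be checked in code. The sketch below is an optional cross-check, not part of the original notebook: it assumes `scipy` is installed, uses `scipy.stats.f.ppf` for the critical value, and uses `scipy.stats.f_oneway` to recompute the F statistic and its p-value directly from the raw data.

```python
from scipy import stats

group1 = [51, 45, 33, 45, 67]
group2 = [23, 43, 23, 43, 45]
group3 = [56, 76, 74, 87, 56]

alpha = 0.05
dfn, dfd = 2, 12  # numerator (k - 1) and denominator (N - k) degrees of freedom

# Critical value: the point with upper-tail area alpha under F(dfn, dfd).
f_crit = stats.f.ppf(1 - alpha, dfn, dfd)

# One-way ANOVA on the raw data: returns the F statistic and its p-value.
f_stat, p_value = stats.f_oneway(group1, group2, group3)

print(f_crit, f_stat, p_value)
print("reject H0" if f_stat > f_crit else "fail to reject H0")
```

The computed critical value is about 3.89, matching the table, and the p-value is about .003, well below α = .05.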

### Since f_stats (9.75) $\gt$ F_crit (3.89), there is enough evidence to reject the null hypothesis at α = .05.

APA format --> A one-way ANOVA showed a statistically significant difference among the three group means, *F*(2, 12) = 9.75, *p* < .05.