### **Types of Hypothesis Testing**

**Chi-Square (for categorical data)**

This test is used when you are dealing with categorical data (data that falls into categories, like "yes" or "no"). It helps to see if there's a significant relationship between two variables.

**For example, you might use a Chi-square test to see if gender and voting preferences are related.**

Computationally, the test measures how well the observed results fit the expected results.

The observed values in the data are compared to the expected values that would occur if the null hypothesis were true.

It is calculated using the formula:

$$
\chi_c^2 = \sum \frac{(O_i - E_i)^2}{E_i}
$$

where:

**c = Degrees of freedom**

**O = Observed Value**

**E = Expected Value**
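
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a full statistical routine: the function name `chi_square_independence` and the 2x2 table of counts are hypothetical, and the degrees of freedom use the standard rule for a test of independence, (rows - 1) x (columns - 1).

```python
def chi_square_independence(observed):
    """Return (statistic, degrees_of_freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the two variables were unrelated (null hypothesis).
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = gender, columns = voting preference.
observed = [
    [30, 20],
    [20, 30],
]
statistic, dof = chi_square_independence(observed)
print(f"chi-square = {statistic:.2f}, df = {dof}")  # chi-square = 4.00, df = 1
```

The statistic is then compared with a critical value from a Chi-square table at the chosen significance level and degrees of freedom.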

**ANOVA (Analysis of Variance) - comparing more than two means**

This is used when you want to compare the **means** of three or more groups. It is like the two-sample t-test, but for more than two groups.

**For example, if you want to compare the average test scores of students taught with three different teaching methods, ANOVA can help determine if there is a significant difference.**

**Or, in business, a company might use ANOVA to analyze whether three different stores are performing differently in terms of sales.**

**It is also widely used in fields like medical research and social sciences, where comparing group differences can provide valuable insights.**

**Types of ANOVA**

- One-Way ANOVA
It tests whether the means of two or more independent (unrelated) groups differ significantly.
It is often used to investigate whether different levels of a single independent variable (factor) affect a dependent variable.

- Two-Way ANOVA
It compares group means when the observations are classified by two independent variables (factors). It is used to determine each independent variable's main effect and whether there is an interaction effect between them.

- Factorial ANOVA
It is used to examine the effects of multiple independent variables on a dependent variable, along with their interactions. The dependent variable should be continuous, approximately normally distributed, and have a comparable spread among your groups.

- Welch's F-test ANOVA
It compares two or more means when the assumption of equal variances is violated. It is useful when the homogeneity of variances assumption is not satisfied, particularly when sample sizes are unequal.

**Use ANOVA when comparing more than two group means, as it is a better approach than the t-test, and when the outcome is continuous (quantitative). You cannot conduct an ANOVA test if your dependent variable is nominal data.**

The easiest way to carry out an ANOVA calculation is to organize the formulas in an ANOVA table.

**STRUCTURE OF AN ANOVA TABLE** 

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F Value |
| --------------- | ------------ | ----------- | --------------- | ------------- |
| **Between Groups** | $$ SSB = \sum_j n_j(\bar{x}_j - \bar{x})^2 $$ | $$ df_1 = k - 1 $$ | $$ MSB = SSB / (k - 1) $$ | $$ F = MSB / MSE $$ (also written $$ F = MST / MSE $$) |
| **Error** | $$ SSE = \sum_j \sum_i (x_{ij} - \bar{x}_j)^2 $$ | $$ df_2 = N - k $$ | $$ MSE = SSE / (N - k) $$ | |
| **Total** | $$ SST = SSB + SSE $$ | $$ df_3 = N - 1 $$ | | |

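where:

**F = ANOVA coefficient (the F statistic)**

**MSB = Mean sum of squares between the groups**

**MSE = Mean sum of squares due to error (also written MSW, within groups)**

**SST = Total sum of squares**

**k = Total number of groups (populations)**

**n_j = Number of samples in group j**

**$$ \bar{x}_j $$ = Mean of group j; $$ \bar{x} $$ = Grand mean of all observations**

**SSW = Sum of squares within the groups (the same quantity as SSE)**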

**SSB = Sum of squares between the groups**

**SSE = Sum of squares due to error**

**s = Standard deviation of the samples**

**N = Total number of observations**
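
To make the table concrete, here is a minimal Python sketch that fills in each cell for a list of groups. The function name `one_way_anova`, the dictionary keys, and the sample data are assumptions chosen for illustration; only the formulas come from the table above.

```python
def one_way_anova(groups):
    """Compute the entries of a one-way ANOVA table for a list of groups."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # N, total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: group size times squared distance of group mean from grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Error (within groups): squared distance of each observation from its group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between, df_error = k - 1, n_total - k
    msb = ssb / df_between
    mse = sse / df_error
    return {
        "SSB": ssb, "SSE": sse, "SST": ssb + sse,
        "df1": df_between, "df2": df_error, "df3": n_total - 1,
        "MSB": msb, "MSE": mse, "F": msb / mse,
    }


# Hypothetical data: three groups of three observations each.
table = one_way_anova([[2, 4, 6], [5, 7, 9], [8, 10, 12]])
for name, value in table.items():
    print(f"{name}: {value}")
```

For this made-up data the group means are 4, 7, and 10 and the grand mean is 7, so SSB = 54, SSE = 24, and F = 27 / 4 = 6.75.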


## **Calculating ANOVA**
Compare the scores (out of 50) of students taught by three different tutors (Tutor A, Tutor B, Tutor C), with three students per tutor:

- Tutor A: [45, 35, 40]
- Tutor B: [40, 39, 35]
- Tutor C: [35, 32, 30]

**Step 1 - State Hypothesis**
- Null Hypothesis $$ (H_0): \mu_A = \mu_B = \mu_C $$
- Alternative Hypothesis $$ (H_a) $$: At least one $$ \mu $$ differs

**Step 2 - Calculate Group Means and Grand Mean**
- Group Means: $$ \bar{X}_A $$, $$ \bar{X}_B $$, and $$ \bar{X}_C $$
- Grand Mean: $$ \bar{X}_{grand} $$

$$
\bar{X}_A = \frac{45 + 35 + 40}{3} = 40
$$
$$
\bar{X}_B = \frac{40 + 39 + 35}{3} = 38
$$
$$
\bar{X}_C = \frac{35 + 32 + 30}{3} \approx 32.3
$$
$$
\bar{X}_{grand} = \frac{45 + 35 + 40 + 40 + 39 + 35 + 35 + 32 + 30}{9} = \frac{331}{9} \approx 36.8
$$
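
The same means can be checked with a short Python snippet. It is only a sketch; the variable names (`scores`, `group_means`, `grand_mean`) are chosen here for illustration.

```python
# Scores from the worked example, keyed by tutor.
scores = {
    "Tutor A": [45, 35, 40],
    "Tutor B": [40, 39, 35],
    "Tutor C": [35, 32, 30],
}

# Mean of each group.
group_means = {tutor: sum(values) / len(values) for tutor, values in scores.items()}

# Grand mean over all nine observations.
all_scores = [x for values in scores.values() for x in values]
grand_mean = sum(all_scores) / len(all_scores)

for tutor, mean in group_means.items():
    print(f"{tutor}: {mean:.1f}")
print(f"Grand mean: {grand_mean:.1f}")
# Tutor A: 40.0
# Tutor B: 38.0
# Tutor C: 32.3
# Grand mean: 36.8
```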

**Step 3 - Compute Sum of Squares (SS)**

**SSB (Sum of Squares Between Groups):** Accounts for variation due to the treatment or independent variable.
$$ SSB = \sum_i n_i(\bar{X}_i - \bar{X}_{grand})^2 $$

**SSE (Sum of Squares Error, or Within Groups, also written SSW):**
Accounts for variation within groups (random error or residuals).
$$ SSE = \sum_i \sum_j (X_{ij} - \bar{X}_i)^2 $$

**SST (Total Sum of Squares):**
Accounts for the total variation of all observations around the grand mean.
$$ SST = SSB + SSE $$

$$
SSB = 3(40 - 36.8)^2 + 3(38 - 36.8)^2 + 3(32.3 - 36.8)^2
$$
$$
= 3(3.2)^2 + 3(1.2)^2 + 3(-4.5)^2
$$
$$
= 3(10.24) + 3(1.44) + 3(20.25)
$$
$$
= 30.72 + 4.32 + 60.75 = 95.79
$$

<!-- 45 + 35 + 40, 40 -->
Within each group, sum the squared deviations of the scores from the group mean:

Tutor A:
$$
(45 - 40)^2 + (35 - 40)^2 + (40 - 40)^2 = 5^2 + (-5)^2 + 0^2 = 25 + 25 + 0 = 50
$$
<!-- 40 + 39 + 35, 38 -->
Tutor B:
$$
(40 - 38)^2 + (39 - 38)^2 + (35 - 38)^2 = 2^2 + 1^2 + (-3)^2 = 4 + 1 + 9 = 14
$$
<!-- 35 + 32 + 30, 32.3 -->
Tutor C:
$$
(35 - 32.3)^2 + (32 - 32.3)^2 + (30 - 32.3)^2 = 2.7^2 + (-0.3)^2 + (-2.3)^2 = 7.29 + 0.09 + 5.29 = 12.67
$$

    SSW = 50 + 14 + 12.67 = 76.67
    SST = 95.79 + 76.67 = 172.46

**Step 4 - Calculate Degrees of Freedom (df)**

    df1 (Between Groups) = k - 1, where k is the number of groups
    df2 (Within Groups) = N - k, where N is the total number of observations
    df3 (Total) = N - 1
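
For the tutor data, the degrees of freedom follow directly from the number of groups and observations. A minimal sketch, with variable names chosen for illustration and the total taken as N - 1 as in the ANOVA table:

```python
# Scores from the worked example, keyed by tutor.
scores = {
    "Tutor A": [45, 35, 40],
    "Tutor B": [40, 39, 35],
    "Tutor C": [35, 32, 30],
}

k = len(scores)                                           # number of groups
n_total = sum(len(values) for values in scores.values())  # N, total observations

df_between = k - 1       # df1
df_within = n_total - k  # df2
df_total = n_total - 1   # df3

print(df_between, df_within, df_total)  # 2 6 8
```

Note that the between-groups and within-groups degrees of freedom add up to the total, mirroring SST = SSB + SSE.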
