
# CLASS TASK 5
### Types of Hypothesis Testing
##### Given the following concepts:
- Chi-square test (for categorical data)
- ANOVA (for comparing more than two means)
- Two-tailed tests
- Proportion tests

Task: read about them and explain them in a Jupyter notebook.



## 1. Chi-Square Test (for Categorical Data)

**Purpose:**  
Tests whether two categorical variables are significantly associated (test of independence), 
or whether observed frequencies match expected ones (goodness-of-fit test). 
The steps below cover the test of independence.

**Steps:**  
1. State the hypotheses:  
   - $H_0$: The variables are independent (no association).  
   - $H_1$: The variables are dependent (there is an association).  

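2. Construct a contingency table with observed frequencies ($O_{ij}$ for the cell in row $i$, column $j$).  

3. Compute the expected frequency ($E_{ij}$) of each cell using:  
   $$
   E_{ij} = \frac{(\text{Row } i \text{ total})(\text{Column } j \text{ total})}{\text{Grand total}}
   $$

4. Compute the Chi-square statistic by summing over all cells:  
   $$
   \chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
   $$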

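5. Compare the computed $\chi^2$ with the critical value from the Chi-square distribution table 
   with $(r-1)(c-1)$ degrees of freedom, where $r$ and $c$ are the numbers of rows and columns, 
   at the chosen significance level (here $\alpha = 0.05$).  

6. Decision: If $\chi^2_{\text{calculated}} > \chi^2_{\text{critical}}$, reject $H_0$; otherwise, fail to reject $H_0$.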
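
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `chi_square_independence` and the 2x2 counts are made up for the example, and the critical value 3.841 is the standard table value for 1 degree of freedom at $\alpha = 0.05$.

```python
def chi_square_independence(observed):
    """Chi-square statistic, degrees of freedom, and expected counts
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # E_ij = (row i total) * (column j total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell of the table
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2x2 table (made-up counts, for illustration only)
observed = [[30, 10],
            [20, 40]]

chi2, df, expected = chi_square_independence(observed)

# Table value for df = 1 at alpha = 0.05
CHI2_CRITICAL = 3.841
reject_h0 = chi2 > CHI2_CRITICAL

print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

In practice a library routine such as `scipy.stats.chi2_contingency` would likely be used instead. Note that it applies Yates' continuity correction to 2x2 tables by default, so its statistic matches this sketch only with `correction=False`.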



## 2. ANOVA (Analysis of Variance)

**Purpose:**  
Used when comparing the means of **three or more groups** to see if at least one mean is different.

**Steps:**  
1. Hypotheses:  
   - $H_0$: $\mu_1 = \mu_2 = \mu_3 = \dots$ (all means are equal).  
   - $H_1$: At least one mean is different.  

2. Compute the **grand mean** (average of all observations).  

3. Compute the **between-group variability (SSB)**:  
   $$
   SSB = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X}_{grand})^2
   $$  
   where $n_i$ is the size of group $i$ and $\bar{X}_i$ is its mean.

4. Compute the **within-group variability (SSW)**:  
   $$
   SSW = \sum \sum (X_{ij} - \bar{X}_i)^2
   $$

5. Find the **Mean Squares**:  
   $$
   MSB = \frac{SSB}{k-1}, \quad MSW = \frac{SSW}{N-k}
   $$  
   where $k$ = number of groups, $N$ = total sample size.

6. Compute the F-ratio:  
   $$
   F = \frac{MSB}{MSW}
   $$

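7. Compare with the F-critical value at $(k-1, N-k)$ degrees of freedom and the chosen significance level (for example, $\alpha = 0.05$).  

8. Decision: If $F_{\text{calculated}} > F_{\text{critical}}$, reject $H_0$; otherwise, fail to reject $H_0$.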
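
As a rough sketch of these steps, the F-ratio can be computed by hand in plain Python. The function name `one_way_anova` and the three small groups are invented for illustration. The critical value 5.14 is the table value for $(2, 6)$ degrees of freedom at $\alpha = 0.05$.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total sample size N
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: sum of n_i * (mean_i - grand mean)^2
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

    # Within-group sum of squares: squared deviations from each group's own mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return msb / msw, k - 1, n_total - k


# Hypothetical data: three groups of three observations each
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]

f_stat, df_between, df_within = one_way_anova(groups)

# Table value for (2, 6) degrees of freedom at alpha = 0.05
F_CRITICAL = 5.14
reject_h0 = f_stat > F_CRITICAL

print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within}), reject H0: {reject_h0}")
```

For these made-up numbers, SSB = 38 and SSW = 6, so MSB = 19, MSW = 1, and F = 19. A library function such as `scipy.stats.f_oneway` computes the same statistic along with a p-value.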



## 3. Two-Tailed T-Test

**Purpose:**  
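Tests whether two means differ significantly in either direction, without specifying in advance which one is larger. It applies to independent or paired samples; the steps below cover independent samples.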

**Steps (Independent samples):**  
1. Hypotheses:  
   - $H_0$: $\mu_1 = \mu_2$.  
   - $H_1$: $\mu_1 \neq \mu_2$.  

2. Compute sample means ($\bar{X}_1, \bar{X}_2$) and variances ($s_1^2, s_2^2$).  

3. Compute the **pooled variance** (if equal variances assumed):  
   $$
   s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
   $$

4. Compute the test statistic:  
   $$
   t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
   $$

5. Degrees of freedom: $df = n_1 + n_2 - 2$.  

6. Compare with the critical $t$ value at $\alpha = 0.05$ (two-tailed, so $\alpha/2 = 0.025$ in each tail).  

7. Decision: Reject $H_0$ if $|t_{\text{calculated}}| > t_{\text{critical}}$; otherwise, fail to reject $H_0$.
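
The independent-samples steps can be sketched with Python's standard library. The function name `pooled_t_test` and both samples are hypothetical, and the sketch assumes equal variances as in step 3. The critical value 2.306 is the two-tailed table value for 8 degrees of freedom at $\alpha = 0.05$.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(sample1, sample2):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = mean(sample1), mean(sample2)
    # statistics.variance is the sample variance (divides by n - 1)
    var1, var2 = variance(sample1), variance(sample2)

    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / (sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2))
    return t, df


# Hypothetical samples (made-up values, for illustration only)
sample1 = [5, 6, 7, 8, 9]
sample2 = [1, 2, 3, 4, 5]

t_stat, df = pooled_t_test(sample1, sample2)

# Two-tailed table value for df = 8 at alpha = 0.05
T_CRITICAL = 2.306
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Taking the absolute value of $t$ is what makes the test two-tailed: a large difference in either direction leads to rejection. `scipy.stats.ttest_ind` performs the same pooled test by default and also returns a p-value.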



## 4. Proportion Test

**Purpose:**  
Tests hypotheses about a population proportion using a z-test, which relies on the normal approximation and therefore needs a reasonably large sample.

**Steps (One-sample):**  
1. Hypotheses:  
   - $H_0: p = p_0$.  
   - $H_1: p \neq p_0$.  

2. Compute the sample proportion:  
   $$
   \hat{p} = \frac{x}{n}
   $$  
   where $x$ = number of successes, $n$ = sample size.

3. Compute the **standard error (SE)**, using $p_0$ rather than $\hat{p}$ because the test statistic is evaluated under $H_0$:  
   $$
   SE = \sqrt{\frac{p_0 (1 - p_0)}{n}}
   $$

4. Compute the test statistic:  
   $$
   z = \frac{\hat{p} - p_0}{SE}
   $$

5. Compare with the z-critical value (±1.96 at 5% significance, two-tailed).  

6. Decision: Reject $H_0$ if $|z| > 1.96$.
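
A minimal sketch of the one-sample procedure, using only the standard library. The function name `one_sample_proportion_ztest` and the counts (60 successes out of 100, tested against $p_0 = 0.5$) are assumptions for the example. `statistics.NormalDist` supplies the critical value and a two-tailed p-value.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_ztest(successes, n, p0):
    """Return (z, two-tailed p-value) for H0: p = p0."""
    p_hat = successes / n
    # Standard error is computed under H0, so it uses p0
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, null proportion 0.5
z, p_value = one_sample_proportion_ztest(successes=60, n=100, p0=0.5)

# Two-tailed critical value at alpha = 0.05 (approximately 1.96)
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_critical

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Here $\hat{p} = 0.6$ and $SE = 0.05$, so $z = 2.0$, which is just beyond the critical value of 1.96.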
