```{contents}

```


# Central Limit Theorem (CLT)

### Definition

The **Central Limit Theorem (CLT)** states that:

> When we take repeated independent random samples from a population with finite variance and calculate their means, the distribution of these sample means will **approximate a normal distribution (bell curve)**, regardless of the shape of the original population distribution, **if the sample size is large enough (n ≥ 30 is a common rule of thumb; heavily skewed populations may need more).**

---

### Key Points

1. Works even if the population distribution is **not normal** (e.g., skewed, uniform).
2. The **larger the sample size (n)** → the closer the sample mean distribution is to a **normal distribution**.
3. The **mean of the sample means** = the **population mean (μ)**.
4. The **spread of the sample means** is smaller than the spread of the population, and is given by the **Standard Error (SE):**

$$
SE = \frac{\sigma}{\sqrt{n}}
$$

where

* $\sigma$ = population standard deviation
* $n$ = sample size
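
The standard error formula can be checked with a small simulation. The sketch below is illustrative only: the Uniform(0, 1) population, the sample size of 36, the 4000 repetitions, and all variable names are assumptions chosen for the demonstration, not values from these notes.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 36              # size of each sample (illustrative choice)
num_samples = 4000  # number of repeated samples (illustrative choice)

# A Uniform(0, 1) population has standard deviation sqrt(1/12).
sigma = math.sqrt(1 / 12)
theoretical_se = sigma / math.sqrt(n)

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(num_samples)
]

# The standard deviation of the sample means estimates the standard error.
empirical_se = statistics.stdev(sample_means)

print(f"theoretical SE = {theoretical_se:.4f}")
print(f"empirical SE   = {empirical_se:.4f}")
```

The two printed values should be close to each other, and the average of `sample_means` should sit near the population mean of 0.5, in line with Key Points 3 and 4.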

---

### Why It’s Important

* It allows us to **use normal distribution methods** (Z-test, t-test, ANOVA) even when the population is not normal, provided the samples are large enough.
* It is the **foundation of inferential statistics** → helps us make conclusions about populations from samples.

---

###  Example

* Suppose exam scores in a college are **right-skewed** (most students score low, a few very high).
* If we repeatedly take small random samples (say 5 students each), the distribution of their averages will still look skewed.
* But if we take **many larger samples (say n = 50 students each)** and plot the means → the distribution of those means will look **approximately normal (bell curve)**.
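
The example above can be sketched in a simulation. This is a rough illustration under stated assumptions: the right-skewed scores are modelled with an exponential distribution (an assumption, since the notes give no actual score data), and the helper names `sample_mean` and `skewness` are made up for this sketch. Skewness near 0 indicates a roughly symmetric, bell-like shape.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed (exponential) population with mean 1."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mu) / sd) ** 3 for v in values])

num_samples = 5000  # number of repeated samples (illustrative choice)
means_small = [sample_mean(5) for _ in range(num_samples)]   # n = 5
means_large = [sample_mean(50) for _ in range(num_samples)]  # n = 50

skew_small = skewness(means_small)
skew_large = skewness(means_large)

print(f"skewness of sample means, n = 5:  {skew_small:.2f}")
print(f"skewness of sample means, n = 50: {skew_large:.2f}")
```

The skewness for n = 50 should come out clearly smaller than for n = 5, which is the CLT at work: larger samples give a more symmetric distribution of means.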

---

### Visual Idea

Population distribution → could be skewed, uniform, exponential, etc.
⬇️
Take many random samples, each of size n ≥ 30, and calculate their means.
⬇️
Distribution of sample means → looks **approximately normal**.

---

✅ **In short:**
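Even if your data is not normal, **the averages of sufficiently large samples will be approximately normally distributed.**
This is why **ANOVA, t-tests, and Z-tests** can be applied to large samples from non-normal populations: they rely on the sampling distribution of the mean being approximately normal.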



## Types of ANOVA

1. **One-Way ANOVA**

* **Factor**: 1
* **Levels**: ≥ 2 (independent)
* Example: A doctor tests a new medication for headache relief.

  * Groups: 10mg, 20mg, 30mg dosages
  * Factor = Medication
  * Levels = Different dosages (independent of each other)


2. **Repeated Measures ANOVA**

* **Factor**: 1
* **Levels**: ≥ 2 (dependent)
* Example: Running performance of the same individuals over **Day 1, Day 2, Day 3**.

  * Factor = Day (the outcome measured is running performance)
  * Levels = Day 1, Day 2, Day 3 (dependent, since the same participants are measured repeatedly)


3. **Factorial ANOVA**

* **Factors**: ≥ 2
* **Levels**: Each factor has ≥ 2 (independent or dependent)
* Example: Running performance by day and gender

  * Factor 1 = Day (Day 1, Day 2, Day 3 → dependent levels, same runners each day)
  * Factor 2 = Gender (Male, Female → independent levels)
* Used when multiple factors may influence the outcome simultaneously; it can also test whether the factors interact.


✅ **Key Idea:**

* **One-Way ANOVA** → One factor, independent levels
* **Repeated Measures ANOVA** → One factor, dependent levels
* **Factorial ANOVA** → Two or more factors, each with ≥ 2 levels (independent or dependent)



## Hypothesis Testing in ANOVA

### 1. **Purpose of ANOVA**

* Compares the means of **two or more groups**.
* Uses **variance** to determine if group means differ significantly.

---

### 2. **Hypotheses**

* **Null Hypothesis (H₀):**
  μ₁ = μ₂ = μ₃ = … = μₖ (all means are equal)
* **Alternative Hypothesis (H₁):**
  At least one mean differs from the others.

---

### 3. **Test Statistic**

* **F-test** is used.
* Formula:

  $$
  F = \frac{\text{Variance Between Groups}}{\text{Variance Within Groups}}
  $$
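
The ratio can be computed directly from raw group data, where each "variance" is a mean square (a sum of squares divided by its degrees of freedom). The following is a minimal sketch in plain Python; the function name `one_way_f` and the three small groups are made up for illustration and are not data from these notes.

```python
from statistics import fmean

def one_way_f(groups):
    """Return (ms_between, ms_within, f_stat) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each observation is from its own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Made-up data for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ms_between, ms_within, f_stat = one_way_f(groups)
print(ms_between, ms_within, f_stat)
```

For these made-up groups the between-group mean square is about 13 and the within-group mean square is 1, so F is about 13: the group means differ far more than the scatter inside each group would suggest.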

---

### 4. **Example Problem**

A doctor tests a new medication for headache relief with **three dosage levels**:

* 15 mg, 30 mg, 45 mg.
* Participants rate headache relief **1–10**.

We test if there is a **difference between mean relief scores** at α = 0.05.

---

### 5. **Steps in One-Way ANOVA**

#### Step 1: Define Hypotheses

* H₀: μ₁ = μ₂ = μ₃
* H₁: Not all means are equal.

#### Step 2: Significance Level

* α = 0.05 → confidence level = 95%.

#### Step 3: Degrees of Freedom

* Total sample size N = 21 (7 participants × 3 groups); number of groups a = 3.
* **Between groups:** a − 1 = 3 − 1 = 2.
* **Within groups:** N − a = 21 − 3 = 18.
* **Total:** N − 1 = 21 − 1 = 20.

#### Step 4: Decision Rule

* Critical value from F-table (df₁=2, df₂=18, α=0.05) = **3.5546**.
* If F > 3.5546 → Reject H₀.
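
The table value can be reproduced without a lookup in this particular case. When the numerator degrees of freedom equal 2, the upper tail of the F distribution has the closed form P(F > x) = (1 + 2x/df₂)^(−df₂/2), which can be solved for x. The sketch below uses that special case only; the function name is hypothetical, and it does not apply to other numerator degrees of freedom.

```python
def f_critical_df1_equals_2(df2, alpha):
    """Upper-tail critical value of F(2, df2); valid only for numerator df = 2."""
    # Solve (1 + 2x/df2) ** (-df2 / 2) = alpha for x.
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

f_crit = f_critical_df1_equals_2(df2=18, alpha=0.05)
print(round(f_crit, 4))
```

For general degrees of freedom a statistics library is the usual route; for instance, SciPy provides `scipy.stats.f.ppf(1 - alpha, df1, df2)`.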

#### Step 5: Compute F-statistic

* Compute **Sum of Squares (SS):**

  * SS(Between) = 98.67
  * SS(Within) = 10.29
  * SS(Total) = 108.96

* Compute **Mean Squares (MS = SS ÷ df):**

  * MS(Between) = 98.67 ÷ 2 ≈ 49.34
  * MS(Within) = 10.29 ÷ 18 ≈ 0.57

* Compute **F-value:**

  $$
  F = \frac{49.34}{0.57} = 86.56
  $$

#### Step 6: Decision

* Since 86.56 > 3.5546 → Reject H₀.
* ✅ Conclusion: At least one dosage group has a mean relief score that **differs significantly** from the others.
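
Steps 5 and 6 can be re-run from the sums of squares as a quick arithmetic check. The sketch below takes the SS values, degrees of freedom, and critical value from the steps above; the variable names are illustrative. It works with unrounded mean squares, so its F (about 86.3) differs slightly from a figure obtained by dividing mean squares already rounded to two decimals.

```python
# Values taken from the worked example above.
ss_between, ss_within = 98.67, 10.29
df_between, df_within = 2, 18
f_critical = 3.5546

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between  # about 49.34
ms_within = ss_within / df_within     # about 0.57

f_stat = ms_between / ms_within
reject_h0 = f_stat > f_critical

print(f"MS(Between) = {ms_between:.2f}")
print(f"MS(Within)  = {ms_within:.2f}")
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

Either way the statistic is far above the critical value, so the decision to reject H₀ does not depend on the rounding.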

---

## 🔹 Key Takeaways

* ANOVA compares **means of multiple groups** using **variance**.
* The **F-statistic** = variance between groups ÷ variance within groups.
* The decision is made by comparing the F-statistic with the **critical value from the F-table**.
* In the example: medication dosage significantly affects headache relief.

| Source         | SS     | df | MS    | F     | F-critical |
| -------------- | ------ | -- | ----- | ----- | ---------- |
| Between Groups | 98.67  | 2  | 49.34 | 86.56 | 3.55       |
| Within Groups  | 10.29  | 18 | 0.57  | -     | -          |
| Total          | 108.96 | 20 | -     | -     | -          |

