## Why Sample Variance is Divided by (n – 1)

### **Concept Overview**
When we calculate variance for a **sample** (not the full population), we divide by  
$(n - 1)$ instead of $n$ to make the estimate of population variance **unbiased**.  

This adjustment is known as **Bessel’s Correction**.

---

### **Detailed Explanation**

- For the **population variance**, we know the **true mean (μ)** of the entire population.  
  So, we use the formula:  
  $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$  

- For the **sample variance**, we only have a **sample mean ($\bar{x}$)**, which is an **estimate** of the true mean.  
  Because $\bar{x}$ is computed from the same sample, it sits at the center of that sample, so the squared deviations $(x_i - \bar{x})^2$ sum to no more than the squared deviations from the true mean, $(x_i - \mu)^2$.  
  Dividing by $n$ therefore **underestimates** the population variance on average.  

To correct this bias, we divide by $(n - 1)$ instead of $n$.  
Dividing by the smaller number scales the estimate up by a factor of $\frac{n}{n - 1}$, which offsets the underestimation on average and makes the estimator **unbiased**.
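
A quick simulation can make the bias visible. The sketch below assumes a standard normal population (true variance 1) and samples of size 5; the function name, seed, and trial count are illustrative choices, not part of these notes.

```python
import random
import statistics

def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average the divide-by-n and divide-by-(n-1) estimates over many samples."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        # Assumed population: standard normal, so the true variance is 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total_div_n += statistics.pvariance(sample)          # divides by n
        total_div_n_minus_1 += statistics.variance(sample)   # divides by n - 1
    return total_div_n / trials, total_div_n_minus_1 / trials

mean_div_n, mean_div_n_minus_1 = average_variance_estimates()
print(f"divide by n:     {mean_div_n:.3f}")
print(f"divide by n - 1: {mean_div_n_minus_1:.3f}")
```

With these settings the divide-by-$n$ average should land near $\frac{n - 1}{n}\sigma^2 = 0.8$, while the divide-by-$(n - 1)$ average should land near the true value of 1.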

---

### **Mathematically**

- **Population Variance:**  
  $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$  

- **Sample Variance (Unbiased):**  
  $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$  
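
A short sketch of why $(n - 1)$ is the right divisor, assuming the $x_i$ are independent draws from a population with mean $\mu$ and finite variance $\sigma^2$ (assumptions not stated explicitly in these notes). First, the sum of squares about $\bar{x}$ can be rewritten in terms of the sum of squares about $\mu$:

$$
\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i - \mu)^2 - n(\bar{x} - \mu)^2
$$

Taking expectations, and using $E[(x_i - \mu)^2] = \sigma^2$ and $E[(\bar{x} - \mu)^2] = \frac{\sigma^2}{n}$:

$$
E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = n\sigma^2 - n \cdot \frac{\sigma^2}{n} = (n - 1)\sigma^2
$$

Dividing by $(n - 1)$ therefore gives $E[s^2] = \sigma^2$, whereas dividing by $n$ gives $\frac{n - 1}{n}\sigma^2$.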

---

### **Intuitive Example**

Suppose we take a small sample of 3 values:  
$\{4,\ 7,\ 9\}$  

1. Mean ($\bar{x}$) = $\frac{4 + 7 + 9}{3} \approx 6.67$  
2. Deviations from mean ≈ (-2.67, 0.33, 2.33)  
3. Squared deviations ≈ (7.11, 0.11, 5.44)  
4. Sum ≈ 12.67  

Now compute:
- If divided by $n$:  
  $\frac{12.67}{3} \approx 4.22$  
- If divided by $(n - 1)$:  
  $\frac{12.67}{2} \approx 6.33$  

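The second value (6.33) comes from the **unbiased** estimator. For a single sample it is not guaranteed to be closer to the true variance, but averaged over many samples it matches the population variance, while the first value (4.22) tends to be too small.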
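
The two results can be checked in Python. This is a minimal sketch using only the standard library; `statistics.pvariance` divides by $n$ and `statistics.variance` divides by $n - 1$. The variable names are illustrative.

```python
import statistics

sample = [4.0, 7.0, 9.0]
n = len(sample)

# Manual computation, following the numbered steps.
mean = sum(sample) / n
sum_sq = sum((x - mean) ** 2 for x in sample)
var_div_n = sum_sq / n
var_div_n_minus_1 = sum_sq / (n - 1)

# Same quantities from the standard library.
lib_div_n = statistics.pvariance(sample)
lib_div_n_minus_1 = statistics.variance(sample)

print(round(var_div_n, 2), round(var_div_n_minus_1, 2))  # prints: 4.22 6.33
```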

---

### **Key Takeaways**

| Term | Formula | Divisor | Purpose |
|------|----------|----------|----------|
| **Population Variance** | $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$ | $N$ | True measure (known mean) |
| **Sample Variance** | $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ | $(n - 1)$ | Unbiased estimate of population variance |

---

**Summary:**  
> We divide by $(n - 1)$ because the sample mean $\bar{x}$ is itself estimated from the data.  
> Using $n - 1$ corrects the underestimation of variability and makes the sample variance an **unbiased estimator** of the true population variance.


## Random Variable and Its Types

### **Definition**

A **Random Variable (RV)** is a variable that takes different numerical values based on the outcome of a **random experiment**.  
It assigns a **number** to each possible outcome in a sample space.

---

### **Example**

Consider tossing a fair coin:
- Sample space: {Head, Tail}  
- Define a random variable $X$ such that:  
  - $X = 1$ if Head occurs  
  - $X = 0$ if Tail occurs  

Here, $X$ is a **random variable** because its value depends on a random outcome.
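
In code, a random variable is simply a function from outcomes to numbers. The sketch below mirrors the coin example; the seed and the number of tosses are arbitrary choices made for illustration.

```python
import random

# Sample space of a single coin toss.
SAMPLE_SPACE = ["Head", "Tail"]

def X(outcome):
    """Random variable: map each outcome to a number."""
    return 1 if outcome == "Head" else 0

rng = random.Random(42)  # fixed seed so the run is repeatable
outcomes = [rng.choice(SAMPLE_SPACE) for _ in range(10)]
values = [X(o) for o in outcomes]

print(outcomes)
print(values)
```

The randomness lives in the experiment (`rng.choice`); the mapping `X` itself is deterministic.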

---

### **Types of Random Variables**

#### 1. **Discrete Random Variable**
A random variable that takes a **finite** or **countably infinite** number of values.

**Examples:**
- Number of heads in 3 coin tosses → {0, 1, 2, 3}  
- Number of students present in class  
- Number of cars passing a signal in one minute  

**Probability Distribution:**
For discrete random variables, we use the **Probability Mass Function (PMF)**:  
$P(X = x_i)$  
It gives the probability of each possible value.

**Properties:**
- $0 \le P(X = x_i) \le 1$  
- $\sum P(X = x_i) = 1$
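
As a sketch, the PMF for the number of heads in 3 fair coin tosses can be built by enumerating all 8 equally likely outcomes, and both properties can then be checked directly. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely outcomes, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# PMF: P(X = k) = (outcomes with k heads) / (total outcomes).
pmf = {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}

for k, p in pmf.items():
    print(f"P(X = {k}) = {p}")

# Property checks: each probability lies in [0, 1] and they sum to 1.
all_in_range = all(0 <= p <= 1 for p in pmf.values())
total = sum(pmf.values())
```

Using `Fraction` keeps the probabilities exact (1/8, 3/8, 3/8, 1/8), so the sum is exactly 1 rather than a floating-point approximation.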

---

#### 2. **Continuous Random Variable**
A random variable that can take **any value within a given range** (including decimals).

**Examples:**
- Height of students (e.g., 160.2 cm, 171.8 cm)  
- Temperature of a city  
- Time taken to complete a task  

**Probability Distribution:**
For continuous random variables, we use the **Probability Density Function (PDF)** $f(x)$, which gives probabilities as areas under the curve:  
$P(a \le X \le b) = \int_a^b f(x) \, dx$  

**Properties:**
- $f(x) \ge 0$ for all $x$  
- $\int_{-\infty}^{\infty} f(x)\,dx = 1$
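
The sketch below approximates these integrals numerically. It assumes, purely for illustration, that heights follow a normal distribution with mean 170 cm and standard deviation 8 cm; these parameters and the helper `integrate` are hypothetical, not taken from these notes.

```python
from statistics import NormalDist

# Hypothetical model of student heights in cm (illustrative parameters).
heights = NormalDist(mu=170.0, sigma=8.0)

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# P(160 <= X <= 180) as the area under the PDF between 160 and 180.
prob_160_180 = integrate(heights.pdf, 160.0, 180.0)

# Total area, truncated to 10 standard deviations either side of the mean.
total_area = integrate(heights.pdf, 170.0 - 80.0, 170.0 + 80.0)

print(f"P(160 <= X <= 180) ~ {prob_160_180:.4f}")
print(f"total area         ~ {total_area:.4f}")
```

The interval probability should agree with `heights.cdf(180) - heights.cdf(160)`, and the total area should be very close to 1.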

---

### **Key Differences**

| Feature | Discrete Random Variable | Continuous Random Variable |
|:--------|:--------------------------|:----------------------------|
| **Possible Values** | Finite or countably infinite | Uncountably infinite (any value in an interval) |
| **Examples** | Number of students, dice rolls | Height, temperature, time |
| **Function Used** | Probability Mass Function (PMF) | Probability Density Function (PDF) |
| **Probability of a single value** | $P(X = x)$ can be nonzero | $P(X = x) = 0$ |
| **Representation** | Table or bar graph | Curve (area under curve = probability) |

---

### **Visualization**

<p align="center">
  <img src="https://media.geeksforgeeks.org/wp-content/uploads/20240611123446/Uniform-vs-discrete.webp" alt="Discrete vs Continuous Random Variable" width="500"/>
</p>

---

### 💡 **Summary**

- A **Random Variable** links outcomes of random experiments to numbers.  
- **Discrete RV** → countable values (uses PMF)  
- **Continuous RV** → uncountable values (uses PDF)  
- The **total area under the PDF** (continuous case) or the **sum of all PMF values** (discrete case) is always 1.
