A probability distribution function describes how probability is distributed over the values of a random variable.

Three main functions are used to describe a probability distribution:
1. Probability mass function (PMF) - used for discrete data/variables
2. Probability density function (PDF) - used for continuous data/variables
3. Cumulative distribution function (CDF) - used for both discrete and continuous data/variables


---

## 🎯 **1. Probability Mass Function (PMF)**

➡ **Used for:** Discrete random variables (countable values like 0, 1, 2,…).
➡ **Purpose:** Gives the **probability of exact outcomes**.

### ✅ Key Points:

* The graph has **spikes** or **bars** at specific outcomes.
* Total probability across all outcomes equals **1**.
* Example: Rolling a fair die.

### ✅ Conceptual Graph:

```
Probability
   |
 0.2 |      *   *   *   *   *   *      ← Each * represents probability of outcome 1 to 6 (each 1/6)
     |
 0.0 +----------------------------
         1   2   3   4   5   6   ← Outcomes
```
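As a minimal sketch of the fair-die example, the PMF can be stored as a lookup table from outcome to probability. The name `die_pmf` is only illustrative, and exact fractions are used so the total comes out as exactly 1.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Probability of one exact outcome.
p_four = die_pmf[4]

# A valid PMF sums to exactly 1 over all outcomes.
total = sum(die_pmf.values())

print(p_four)  # 1/6
print(total)   # 1
```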

---

## 🎯 **2. Probability Density Function (PDF)**

➡ **Used for:** Continuous random variables (can take any value within a range, so the possible values cannot be listed one by one).
➡ **Purpose:** Describes the **density** of probability along the number line.

### ✅ Key Points:

* Graph is a **smooth curve**.
* Probability is the **area under the curve** between two points.
* Total area under the curve equals **1**.
* Example: Normal distribution (bell-shaped curve for heights, weights, etc.).

### ✅ Conceptual Graph:

```
Density
   |
   |           .-''''''-.
   |         .'          '.
   |       .'              '.
   |-----'--------------------'-----
      -4    0               +4    ← Continuous values
```

Probability between two points = shaded area under curve.
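To make "area under the curve" concrete, here is a rough sketch that approximates the area under a standard normal PDF between -1 and 1 with the trapezoidal rule. The helper `area_under_pdf` and the choice of 1000 steps are assumptions made for illustration; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

def area_under_pdf(dist, a, b, steps=1000):
    """Approximate the area under dist.pdf between a and b (trapezoidal rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability that a standard normal value falls between -1 and 1.
p_within_one = area_under_pdf(standard_normal, -1.0, 1.0)
print(round(p_within_one, 4))  # 0.6827
```

Note that `dist.pdf(x)` on its own is a density, not a probability; only the area over an interval is a probability.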

---

## 🎯 **3. Cumulative Distribution Function (CDF)**

➡ **Used for:** Both discrete and continuous random variables.
➡ **Purpose:** Shows the **cumulative probability** up to a certain point (i.e., $P(X \leq x)$).

---

### ✅ CDF for Discrete Data:

* Graph has **steps** (jumps) at each outcome.
* Example: Rolling a die.

### ✅ Conceptual Graph (Discrete CDF):

```
Cumulative Probability
     |
 1.0 +                       o----
 5/6 |                   o---
 4/6 |               o---
 3/6 |           o---
 2/6 |       o---
 1/6 |   o---
 0.0 +---+---+---+---+---+---+----
         1   2   3   4   5   6    ← Outcomes
```
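The step-shaped CDF of a die is just the running total of its PMF. A small sketch, assuming a fair die and using illustrative names:

```python
from fractions import Fraction
from itertools import accumulate

faces = list(range(1, 7))
pmf = [Fraction(1, 6)] * 6

# CDF at each face = sum of PMF values up to and including that face.
cdf = dict(zip(faces, accumulate(pmf)))

print(cdf[3])  # 1/2
print(cdf[6])  # 1
```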

---

### ✅ CDF for Continuous Data:

* Graph is **smooth** and never decreases.
* Rises from 0 (far left) toward 1 (far right).
* Example: Normal distribution CDF.

### ✅ Conceptual Graph (Continuous CDF):

```
Cumulative Probability
   |
 1 +                    ______
   |                 .-'
   |              .-'
   |          .-'
   |       .-'
 0 +-----'
      -4    0             +4    ← Continuous values
```
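As an illustration of the smooth S-shape, the sketch below evaluates the standard normal CDF at a few points using `statistics.NormalDist` from the standard library. The sample points are arbitrary choices for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Evaluate the CDF from far left to far right.
points = [-4.0, -1.0, 0.0, 1.0, 4.0]
values = [standard_normal.cdf(x) for x in points]

for x, value in zip(points, values):
    print(f"F({x:+.1f}) = {value:.5f}")

# At the mean, exactly half of the probability lies to the left.
print(standard_normal.cdf(0.0))  # 0.5
```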

---

## ✅ **Quick Summary of Graphs:**

| Function | Used For   | Graph Shape                             | Meaning                                |
| -------- | ---------- | --------------------------------------- | -------------------------------------- |
| **PMF**  | Discrete   | Bars/Spikes                             | Probability of exact value             |
| **PDF**  | Continuous | Smooth Curve                            | Probability density (area under curve) |
| **CDF**  | Both       | Steps (discrete) or Smooth (continuous) | Cumulative probability up to point     |

---

## ✅ **Key Insight:**

* PMF → Probability of **exact** value (for discrete).
* PDF → Probability **density** at a value; its area over an **interval** gives the probability (for continuous).
* CDF → Probability **up to a value** (cumulative, works for both types).

---

## 🎯 **1. Probability Mass Function (PMF) — Discrete Situations**

➡ **Use When:** You’re dealing with **countable outcomes** — specific numbers you can list.

### ✅ **Examples:**

* Rolling dice → Probability of getting exactly a **4**.
* Tossing coins → Probability of getting exactly **2 heads** in 3 tosses.
* Number of customers arriving at a store in an hour.
* Number of defective items in a batch.

### ✅ **Why Use PMF?**

* You need the **probability of exact events**.
* The outcomes are separate, countable values (typically whole numbers) with nothing in between.
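The coin example above (exactly 2 heads in 3 tosses) can be checked by listing every outcome. This is a brute-force sketch that assumes a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely sequences of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

# Keep the sequences with exactly two heads.
favourable = [seq for seq in outcomes if seq.count("H") == 2]

p_two_heads = Fraction(len(favourable), len(outcomes))
print(p_two_heads)  # 3/8
```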

---

## 🎯 **2. Probability Density Function (PDF) — Continuous Situations**

➡ **Use When:** Your variable can take **any value within a range**, even decimals.

### ✅ **Examples:**

* Heights of people → Probability density at 170 cm (though exact probability at one point is 0; we focus on intervals).
* Weight of products.
* Time taken to finish a task.
* Speed of cars passing a point.

### ✅ **Why Use PDF?**

* You want the **relative likelihood** over an interval, not exact points.
* You can’t assign probability to exact values in continuous data (area under curve gives probability over a range).

---

## 🎯 **3. Cumulative Distribution Function (CDF) — Both Discrete & Continuous**

➡ **Use When:** You’re interested in the **cumulative probability** — the chance that your variable is **less than or equal** to some value.

### ✅ **Examples:**

* Probability of scoring **at most 70** in an exam.
* Probability that the delivery time is **within 2 hours**.
* Probability that rainfall doesn’t exceed **10 mm**.
* CDF is widely used in risk analysis, quality control, and decision-making.

### ✅ **Why Use CDF?**

* You care about the **probability up to a threshold**.
* It shows what fraction of the distribution lies at or below any given value, which makes medians and percentiles easy to read off.

---

## ✅ **Quick Summary Table (Real-World Usage):**

| Function | When to Use                       | Real-Life Example                                       |
| -------- | --------------------------------- | ------------------------------------------------------- |
| **PMF**  | Countable, exact outcomes         | Rolling dice, counting events (e.g., defects, arrivals) |
| **PDF**  | Continuous values, ranges         | Heights, weights, speeds, durations                     |
| **CDF**  | Cumulative probability (any type) | Passing score cutoffs, delivery time limits, thresholds |

---

## ✅ **How to Decide Which to Use?**

1. **Is your variable discrete or continuous?**

   * **Discrete → PMF (and CDF if cumulative needed).**
   * **Continuous → PDF (and CDF for cumulative analysis).**

2. **Do you need the probability for exact points or ranges?**

   * **Exact point (discrete only) → PMF.**
   * **Interval (continuous) → area under the PDF over that interval (equivalently, a difference of two CDF values).**
   * **Cumulative probability → CDF.**

---

## ✅ **PMF Practice (Discrete Probability)**

### **1. A fair die is rolled.**

a. Probability of getting exactly a 5?
**Answer:**
$P(X = 5) = \frac{1}{6}$

b. What is the PMF for this die roll?
**Answer:**
All outcomes are equally likely:
$P(X = x) = \frac{1}{6}$ for $x = 1, 2, 3, 4, 5, 6$

---

### **2. In a multiple-choice quiz with 4 options per question, what’s the probability of randomly guessing the correct answer?**

**Answer:**
$P(\text{correct answer}) = \frac{1}{4}$

---

### **3. A factory produces light bulbs. On average, 2 out of every 10 bulbs are defective.**

a. Probability of exactly 1 defective bulb in a sample of 3 bulbs?
**Answer:**
This is a **binomial distribution** problem:
$n = 3, p = 0.2$ (defective rate)

$$
P(X = 1) = \binom{3}{1} \times (0.2)^1 \times (0.8)^2 = 3 \times 0.2 \times 0.64 = 0.384
$$

b. Which type of distribution should be used here?
**Answer:**
**Binomial Distribution** (Discrete, count of defective bulbs).
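The binomial calculation in part (a) can be reproduced with a short function. This is a sketch using `math.comb` from the standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# n = 3 bulbs sampled, p = 0.2 chance that each one is defective.
p_one_defective = binomial_pmf(1, 3, 0.2)
print(round(p_one_defective, 3))  # 0.384

# The PMF over all possible counts (0 to 3) sums to 1.
total = sum(binomial_pmf(k, 3, 0.2) for k in range(4))
```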

---

## ✅ **PDF Practice (Continuous Probability)**

### **4. Customer purchase time follows a continuous distribution.**

a. Can you calculate the exact probability that a customer finishes in exactly 5.0 minutes?
**Answer:**
The probability is **0**. For a continuous variable, any single exact value has probability 0; only intervals have non-zero probability.

b. How do you calculate the probability of finishing between 5 and 10 minutes?
**Answer:**
Calculate the **area under the PDF curve** between 5 and 10 minutes: $P(5 \leq X \leq 10) = \int_{5}^{10} f(x)\,dx$.

---

### **5. Heights of adult males follow a normal distribution (mean = 175 cm, SD = 7 cm).**

a. What type of function models this?
**Answer:**
**Probability Density Function (PDF)** — specifically, a **normal distribution PDF**.

b. What does the area under the curve between 170 cm and 180 cm represent?
**Answer:**
It represents the **probability** that a randomly selected adult male has a height between 170 cm and 180 cm.
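For the stated parameters (mean 175 cm, SD 7 cm), that area can be computed as a difference of two CDF values. A minimal sketch using `statistics.NormalDist` from the standard library:

```python
from statistics import NormalDist

# Heights of adult males as given in the question.
heights = NormalDist(mu=175.0, sigma=7.0)

# Area under the PDF between 170 and 180 = F(180) - F(170).
p_between = heights.cdf(180.0) - heights.cdf(170.0)
print(round(p_between, 2))  # 0.52
```

So a little over half of adult males would fall in the 170 to 180 cm band under this model.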

---

## ✅ **CDF Practice (Cumulative Probability)**

### **6. Exam scores CDF: $F(50) = 0.7$**

a. What does this mean?
**Answer:**
70% of students scored **50 or less**.

b. How can CDF help set pass/fail cutoffs?
**Answer:**
By reading off the score that corresponds to a chosen cumulative probability. For example, to pass the top 60% of students, set the cutoff at the score $x$ where $F(x) = 0.4$.
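A rough sketch of that cutoff idea, using an empirical CDF over a made-up list of scores. The data, the function names, and the rule "a score strictly above the cutoff passes" are all assumptions for illustration, not part of the question.

```python
from fractions import Fraction

# Hypothetical exam scores, invented for illustration only.
scores = [35, 42, 48, 50, 55, 61, 67, 72, 80, 91]

def empirical_cdf(data, x):
    """Fraction of observations less than or equal to x."""
    return Fraction(sum(1 for value in data if value <= x), len(data))

def cutoff_for_top_fraction(data, top_fraction):
    """Smallest observed score whose CDF value reaches 1 - top_fraction."""
    target = 1 - top_fraction
    for x in sorted(data):
        if empirical_cdf(data, x) >= target:
            return x
    return max(data)

# Pass the top 60%: find the score where the CDF reaches 0.4.
cutoff = cutoff_for_top_fraction(scores, Fraction(3, 5))
passed = [s for s in scores if s > cutoff]

print(cutoff)       # 50
print(len(passed))  # 6
```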

---

### **7. Daily rainfall CDF at 20 mm is 0.9.**

a. What does this mean?
**Answer:**
There’s a **90% chance** that daily rainfall is **20 mm or less**, and a 10% chance that it exceeds 20 mm.

---

## ✅ **Mixed Concept Questions**

### **8. PMF, PDF, or CDF?**

a. Predicting number of emails per day?
**Answer:**
**PMF** (Discrete count — Poisson distribution often used).

b. Measuring probability that weight lies between 60 kg and 70 kg?
**Answer:**
**PDF** (weight is continuous; the probability is the area under the PDF between 60 kg and 70 kg, which equals $F(70) - F(60)$).

c. Finding probability delivery arrives within 3 days?
**Answer:**
**CDF** (Cumulative probability up to 3 days).
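For part (a), a Poisson PMF could look like the sketch below. The average rate of 4 emails per day is a hypothetical value chosen only to have something to compute; the formula is the standard Poisson PMF.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """Probability of exactly k events when the average count is `rate`."""
    return exp(-rate) * rate**k / factorial(k)

emails_per_day = 4.0  # hypothetical average, not from the question

p_exactly_two = poisson_pmf(2, emails_per_day)
print(round(p_exactly_two, 4))  # 0.1465

# Summing the PMF over enough counts gets arbitrarily close to 1.
total = sum(poisson_pmf(k, emails_per_day) for k in range(60))
```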

---

### **9. Why does PDF’s total area equal 1, but single-point probability is 0?**

**Answer:**

* PDF’s total area under the curve represents total probability (must be 1).
* A **single point has no width** on the number line, so its probability is 0.
* Only **intervals** (areas) yield non-zero probabilities.
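One way to see this numerically is to shrink an interval around a point and watch its probability shrink with it. The sketch below uses a standard normal distribution and interval widths chosen arbitrarily for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Intervals centred on 0 with shrinking widths.
widths = [1.0, 0.1, 0.01, 0.001]
probabilities = [
    standard_normal.cdf(w / 2) - standard_normal.cdf(-w / 2) for w in widths
]

for w, p in zip(widths, probabilities):
    print(f"width {w}: probability {p:.6f}")

# A zero-width "interval" is a single point, so its probability is 0.
point_probability = standard_normal.cdf(0.0) - standard_normal.cdf(0.0)
print(point_probability)  # 0.0
```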

---

## ✅ **Challenge Question**

### **10. Task completion time $X$ (in hours) has the CDF:**

$$
F(x) =
\begin{cases} 
0 & \text{if } x < 0 \\
x/10 & \text{if } 0 \leq x \leq 10 \\
1 & \text{if } x > 10
\end{cases}
$$

a. Probability that task takes more than 7 hours?
**Answer:**
$P(X > 7) = 1 - F(7) = 1 - \frac{7}{10} = 0.3$ (30%).

b. What is the PDF for $X$?
**Answer:**
Differentiate the CDF on $0 \leq x \leq 10$:
$f(x) = \frac{d}{dx} \left( \frac{x}{10} \right) = \frac{1}{10}$
The PDF is **constant** at $\frac{1}{10}$ between 0 and 10 and 0 elsewhere, so $X$ is uniformly distributed on $[0, 10]$.

c. Is $X$ discrete or continuous?
**Answer:**
**Continuous**: its CDF has no jumps, and it has a PDF (constant on the range 0 to 10).
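The three answers can be checked with a direct translation of the piecewise CDF. This is a sketch: the step size `h` and the central-difference estimate of the derivative are illustrative choices, not the only way to recover the PDF.

```python
def cdf(x):
    """Piecewise CDF from the question."""
    if x < 0:
        return 0.0
    if x <= 10:
        return x / 10
    return 1.0

def pdf(x, h=1e-6):
    """Central-difference estimate of the derivative of the CDF."""
    return (cdf(x + h) - cdf(x - h)) / (2 * h)

# (a) P(X > 7) = 1 - F(7)
p_more_than_7 = 1 - cdf(7)
print(round(p_more_than_7, 10))  # 0.3

# (b) The density is 1/10 inside the range and 0 outside it.
print(round(pdf(5.0), 6))   # 0.1
print(round(pdf(12.0), 6))  # 0.0
```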

---