# The Relationship Between PDF, PMF, and CDF

In probability theory and statistics, three functions describe how probability is distributed over the values of a random variable. Which one applies depends on whether the variable is discrete or continuous. These functions are:

1.  **PMF (Probability Mass Function)**
2.  **PDF (Probability Density Function)**
3.  **CDF (Cumulative Distribution Function)**

## 1. Random Variables
Before diving into the functions, we must distinguish between two types of random variables:
*   **Discrete Random Variable:** Takes values in a finite or countably infinite set (e.g., the outcome of a die roll, the number of students in a class).
*   **Continuous Random Variable:** Can take any value within an interval of real numbers, which is an uncountably infinite set (e.g., height, weight, temperature).

## 2. Probability Mass Function (PMF)
*   **Used for:** Discrete Random Variables.
*   **Definition:** The PMF gives the probability that a discrete random variable $X$ is exactly equal to some value $x$.
    $$P(X = x)$$
*   **Properties:**
    *   $0 \le P(X=x) \le 1$
    *   $\sum_{x} P(X=x) = 1$ (The sum of probabilities over all possible values $x$ is 1).
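
The two properties above can be checked directly for a small PMF. The following is a minimal sketch for a fair six-sided die; the names `die_pmf`, `all_in_range`, `total`, and `p_even` are illustrative, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# PMF of a fair six-sided die, stored as a mapping value -> probability.
# Fractions keep the arithmetic exact, so the sum is exactly 1.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Property 1: every probability lies in [0, 1].
all_in_range = all(0 <= p <= 1 for p in die_pmf.values())

# Property 2: the probabilities sum to 1.
total = sum(die_pmf.values())

# The probability of an event is the sum of the PMF over its outcomes,
# here the event "the roll is even".
p_even = sum(p for face, p in die_pmf.items() if face % 2 == 0)

print(all_in_range, total, p_even)
```

The same pattern works for any finite PMF: summing the PMF over the outcomes in an event gives the probability of that event.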

## 3. Probability Density Function (PDF)
*   **Used for:** Continuous Random Variables.
*   **Definition:** The PDF $f(x)$ describes the relative likelihood that a continuous random variable takes a value near $x$. Unlike a PMF, the value of the PDF at a single point is **not** a probability: the probability of any single point in a continuous distribution is 0, and $f(x)$ itself may be greater than 1. Instead, probability is the area under the curve over an interval.
*   **Properties:**
    *   $f(x) \ge 0$
    *   The total area under the curve is 1: $\int_{-\infty}^{\infty} f(x) dx = 1$
    *   Probability between two points $a$ and $b$ is the integral: $P(a \le X \le b) = \int_{a}^{b} f(x) dx$
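
These properties can be verified numerically. The sketch below assumes SciPy is available and uses `scipy.integrate.quad` to integrate the standard normal PDF; the uniform distribution on $[0, 0.5]$ is an added illustration (not part of the text above) showing that a density value can exceed 1.

```python
import numpy as np
from scipy import integrate, stats

# Total area under the standard normal PDF (should be very close to 1).
total_area, _ = integrate.quad(stats.norm.pdf, -np.inf, np.inf)

# P(-1 <= X <= 1) as the area under the PDF between -1 and 1.
p_within_one_sd, _ = integrate.quad(stats.norm.pdf, -1, 1)

# A density is not a probability: Uniform(0, 0.5) has f(x) = 2 on its support,
# yet the total area is still 2 * 0.5 = 1.
uniform_density = stats.uniform(loc=0, scale=0.5).pdf(0.25)

print(total_area, p_within_one_sd, uniform_density)
```

The interval probability comes out at roughly 0.68, the familiar "within one standard deviation" figure for a normal distribution.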

## 4. Cumulative Distribution Function (CDF)
*   **Used for:** Both Discrete and Continuous Random Variables.
*   **Definition:** The CDF gives the probability that the random variable $X$ will take a value less than or equal to $x$.
    $$F(x) = P(X \le x)$$
*   **Relationship with PMF (Discrete):**
    $$F(x) = \sum_{x_i \le x} P(X = x_i)$$
*   **Relationship with PDF (Continuous):**
    $$F(x) = \int_{-\infty}^{x} f(t) dt$$
    Conversely, the PDF is the derivative of the CDF wherever the CDF is differentiable: $f(x) = F'(x)$.
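
Both relationships can be sketched in a few lines of NumPy. This is an illustrative check rather than a definitive implementation: the discrete CDF is built as a running sum of a fair-die PMF, and the continuous PDF is recovered from the standard normal CDF with a finite-difference derivative (`np.gradient`), so the match is only approximate. The variable names are assumptions made for this example.

```python
import numpy as np
from scipy import stats

# Discrete case: the CDF is the running sum of the PMF.
die_outcomes = np.arange(1, 7)
die_pmf = np.full(6, 1 / 6)
die_cdf = np.cumsum(die_pmf)  # F(1), F(2), ..., F(6)

# Continuous case: differentiating the CDF recovers the PDF.
x = np.linspace(-4, 4, 801)
cdf_values = stats.norm.cdf(x)
pdf_from_cdf = np.gradient(cdf_values, x)  # numerical approximation of F'(x)

# Largest gap between the numerical derivative and the exact PDF on the grid.
max_error = np.max(np.abs(pdf_from_cdf - stats.norm.pdf(x)))

print(die_cdf)    # rises in steps of 1/6 up to 1
print(max_error)  # small, limited by the grid spacing
```

A finer grid shrinks `max_error`, which is the numerical counterpart of $f(x) = F'(x)$.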

```python
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

# 1. PMF Example: Rolling a fair die
# Possible values: 1, 2, 3, 4, 5, 6
die_outcomes = [1, 2, 3, 4, 5, 6]
die_probs = [1/6] * 6

plt.figure(figsize=(12, 4))

plt.subplot(1, 3, 1)
plt.stem(die_outcomes, die_probs, basefmt=" ")
plt.title("PMF: Fair Die Roll")
plt.xlabel("Outcome")
plt.ylabel("Probability")
plt.ylim(0, 0.25)

# 2. PDF Example: Normal Distribution
# Continuous variable
x = np.linspace(-4, 4, 100)
pdf_values = stats.norm.pdf(x, loc=0, scale=1)  # Standard Normal (mean=0, std=1)

plt.subplot(1, 3, 2)
plt.plot(x, pdf_values, color='red')
plt.fill_between(x, pdf_values, alpha=0.3, color='red')
plt.title("PDF: Standard Normal Distribution")
plt.xlabel("Value")
plt.ylabel("Density")

# 3. CDF Example: Normal Distribution
cdf_values = stats.norm.cdf(x, loc=0, scale=1)

plt.subplot(1, 3, 3)
plt.plot(x, cdf_values, color='green')
plt.title("CDF: Standard Normal Distribution")
plt.xlabel("Value")
plt.ylabel("Cumulative Probability")
plt.grid(True)

plt.tight_layout()
plt.show()
```

## Summary of Relationships

| Feature | Discrete (PMF) | Continuous (PDF) |
| :--- | :--- | :--- |
| **Function** | Probability Mass Function (PMF) | Probability Density Function (PDF) |
| **Probability at a point** | $P(X=x)$, given directly by the PMF | $P(X=x) = 0$; probability is the area under $f(x)$ over an interval |
| **Cumulative Function** | CDF (Summation) | CDF (Integration) |
| **Visual** | Bar plot / Stem plot | Smooth Curve |
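
The "probability at a point" row can be illustrated with a short sketch. It assumes SciPy's `stats.randint` (a discrete uniform distribution whose upper bound is exclusive) as a stand-in for a fair die and `stats.norm` for the continuous case; the names `die`, `normal`, `p_three`, `cdf_jump`, and `p_interval` are hypothetical.

```python
from scipy import stats

die = stats.randint(low=1, high=7)   # discrete uniform on {1, ..., 6}
normal = stats.norm(loc=0, scale=1)  # standard normal

# Discrete: P(X = 3) is positive and equals the jump of the CDF at 3.
p_three = die.pmf(3)
cdf_jump = die.cdf(3) - die.cdf(2)
print(p_three, cdf_jump)

# Continuous: the probability of a shrinking interval around 0 tends to 0,
# even though the density at 0 is positive.
for eps in (1e-1, 1e-3, 1e-6):
    p_interval = normal.cdf(eps) - normal.cdf(-eps)
    print(eps, p_interval)
```

In the discrete case the CDF jumps by exactly the PMF value, while in the continuous case the interval probability shrinks in proportion to the interval width.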