# 1.1 Probability Distributions

**Syllabus Topic:** 1.1 Probability Distributions

## Introduction
In machine learning, we handle uncertainty with **probability theory**. A **probability distribution** assigns probabilities (or probability densities) to the values a random variable can take, so it tells us which values are more likely to occur.

We distinguish between:
*   **Discrete Distributions** (e.g., Bernoulli, Binomial, Poisson), defined by a Probability Mass Function (PMF) that gives the probability of each value.
*   **Continuous Distributions** (e.g., Gaussian, Beta, Gamma), defined by a Probability Density Function (PDF) that is integrated over an interval to obtain a probability.

In this notebook, we focus on the most fundamental continuous distribution: the **Gaussian (Normal) Distribution**.
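
A quick numerical check can make the PMF/PDF distinction above concrete. The sketch below is illustrative only: the Binomial parameters (`n_trials`, `p_success`) and the Uniform(0, 0.5) density are assumptions chosen for the example, not part of the syllabus material.

```python
import math
import numpy as np

# Discrete case: Binomial(n=10, p=0.3). Each PMF value is itself a probability.
n_trials, p_success = 10, 0.3  # hypothetical example parameters
pmf = np.array([
    math.comb(n_trials, k) * p_success**k * (1 - p_success) ** (n_trials - k)
    for k in range(n_trials + 1)
])
pmf_total = pmf.sum()  # the probabilities add up to 1

# Continuous case: Uniform(0, 0.5). The density equals 2 on the interval,
# which is larger than 1, so a density value is not a probability.
width = 0.5
density_height = 1.0 / width

# Riemann sum of the density over [0, 0.5] using 1000 equal cells.
num_cells = 1000
cell_width = width / num_cells
midpoints = (np.arange(num_cells) + 0.5) * cell_width
pdf_area = np.sum(np.full_like(midpoints, density_height)) * cell_width

# A probability is the area under the density over an interval.
prob_first_tenth = density_height * 0.1  # P(0 <= X <= 0.1)

print(f"Sum of PMF values:        {pmf_total:.6f}")
print(f"Density height:           {density_height:.1f}")
print(f"Area under the density:   {pdf_area:.6f}")
print(f"P(0 <= X <= 0.1):         {prob_first_tenth:.2f}")
```

The PMF values sum to 1 directly, while the density only yields probabilities after integration, which is why a density can exceed 1 without contradiction.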

## 1. The Univariate Gaussian Distribution

The Gaussian distribution $\mathcal{N}(x | \mu, \sigma^2)$ is defined by two parameters:
1.  **Mean ($\mu$)**: The center or expectation of the distribution.
2.  **Variance ($\sigma^2$)**: The spread or width of the distribution. The standard deviation is its square root, $\sigma = \sqrt{\sigma^2}$.

### Formal Definition (PDF)
For a single variable $x$, the Probability Density Function is:

$$ \mathcal{N}(x | \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left( -\frac{(x - \mu)^2}{2\sigma^2} \right) $$

*   The factor $\frac{1}{\sqrt{2\pi\sigma^2}}$ is a normalization constant ensuring the area under the curve integrates to 1.
*   The quadratic term $(x - \mu)^2$ in the exponent creates the symmetric bell shape centered at $\mu$.
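
The formula above can be transcribed into NumPy almost term by term. The following is a minimal sketch, not a replacement for a library routine; the function name `gaussian_pdf` and the demo values `mu_demo` and `sigma2_demo` are assumptions introduced for illustration.

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma2=1.0):
    """Evaluate N(x | mu, sigma2) directly from the density formula."""
    x = np.asarray(x, dtype=float)
    norm_const = 1.0 / np.sqrt(2.0 * np.pi * sigma2)  # normalization constant
    return norm_const * np.exp(-((x - mu) ** 2) / (2.0 * sigma2))

# Hypothetical demo parameters: mean 1, variance 4 (standard deviation 2).
mu_demo, sigma2_demo = 1.0, 4.0
sd_demo = np.sqrt(sigma2_demo)

# Approximate the integral with a Riemann sum over +/- 10 standard deviations.
xs = np.linspace(mu_demo - 10 * sd_demo, mu_demo + 10 * sd_demo, 200001)
step = xs[1] - xs[0]
area = float(np.sum(gaussian_pdf(xs, mu_demo, sigma2_demo)) * step)

# The density peaks at x = mu, where the exponential term equals 1.
peak = float(gaussian_pdf(mu_demo, mu_demo, sigma2_demo))

print(f"Approximate area under the curve: {area:.6f}")
print(f"Peak density at x = mu:           {peak:.6f}")
```

Note that this sketch is parameterized by the variance $\sigma^2$, as in the formula, whereas `scipy.stats.norm` takes the standard deviation through its `scale` argument.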

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import seaborn as sns

# Set plot style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (10, 6)

### Visualizing the PDF and CDF
Let's visualize the Probability Density Function (PDF) and Cumulative Distribution Function (CDF) for a Standard Normal Distribution ($\mu=0, \sigma=1$).

In [None]:
# Parameters
mu = 0
sigma = 1

# Define range for x (four standard deviations on each side of the mean)
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 1000)

# Calculate PDF and CDF (scipy's `scale` is the standard deviation)
pdf_values = norm.pdf(x, loc=mu, scale=sigma)
cdf_values = norm.cdf(x, loc=mu, scale=sigma)

# Plotting
fig, ax = plt.subplots(1, 2, figsize=(14, 5))

# PDF Plot (raw f-strings keep the LaTeX backslashes intact)
ax[0].plot(x, pdf_values, label=rf'$\mu={mu}, \sigma={sigma}$', color='blue', lw=2)
ax[0].fill_between(x, pdf_values, alpha=0.2, color='blue')
ax[0].set_title("Probability Density Function (PDF)")
ax[0].set_xlabel("x")
ax[0].set_ylabel("Probability Density")
ax[0].legend()

# CDF Plot
ax[1].plot(x, cdf_values, label=rf'$\mu={mu}, \sigma={sigma}$', color='red', lw=2)
ax[1].set_title("Cumulative Distribution Function (CDF)")
ax[1].set_xlabel("x")
ax[1].set_ylabel("Cumulative Probability")
ax[1].legend()

plt.tight_layout()
plt.show()
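
The CDF is also what turns a density into interval probabilities: $P(a \le X \le b)$ is the difference of the CDF at the two endpoints. The sketch below illustrates this for the standard normal. It uses Python's standard-library `statistics.NormalDist` as a stand-in for `scipy.stats.norm` so that it runs without extra dependencies, and the helper name `interval_probability` is an assumption for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def interval_probability(dist, a, b):
    """Return P(a <= X <= b) as a difference of CDF values."""
    return dist.cdf(b) - dist.cdf(a)

# Probability mass within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    prob = interval_probability(std_normal, -k, k)
    print(f"P(|X| <= {k} sigma) = {prob:.4f}")
```

The three values come out at roughly 68%, 95%, and 99.7%, which is the familiar rule of thumb for Gaussian data.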

## 2. Sampling from a Gaussian

In practice, we often don't know the true distribution but observe **samples** from it. As the number of samples increases (and the bins are made suitably narrow), the density-normalized histogram of the samples converges to the theoretical PDF.

In [None]:
# Generate samples (reuses mu, sigma, x, and pdf_values from the previous cell)
num_samples = 1000
samples = np.random.normal(loc=mu, scale=sigma, size=num_samples)

# Plot histogram vs theoretical PDF; density=True rescales the bars to unit area
plt.figure(figsize=(8, 5))
plt.hist(samples, bins=30, density=True, alpha=0.6, color='g', label='Empirical Histogram')
plt.plot(x, pdf_values, linewidth=2, color='r', label='Theoretical PDF')
plt.title(rf"Sampling {num_samples} points from $\mathcal{{N}}({mu}, {sigma}^2)$")
plt.legend()
plt.show()
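
The convergence claim in this section can also be checked numerically rather than by eye. The sketch below compares histogram heights with the standard normal density at the bin centers for growing sample sizes. The seed, the bin layout, the sample sizes, and the helper `standard_normal_pdf` are all assumptions chosen for illustration.

```python
import numpy as np

def standard_normal_pdf(x):
    """Density of N(0, 1), written out from the Gaussian formula."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# 30 equal-width bins on [-4, 4]; samples outside this range are ignored.
bin_edges = np.linspace(-4.0, 4.0, 31)
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

max_errors = {}
for n in (100, 10_000, 1_000_000):
    draws = rng.standard_normal(n)
    # density=True rescales the counts so the histogram has unit area.
    heights, _ = np.histogram(draws, bins=bin_edges, density=True)
    gap = np.abs(heights - standard_normal_pdf(bin_centers))
    max_errors[n] = float(np.max(gap))
    print(f"n = {n:>9,}: largest gap between histogram and PDF = {max_errors[n]:.4f}")
```

With the bin width held fixed, the gap shrinks as `n` grows but eventually levels off at a small value set by the bin width, which is why the bins also need to narrow for true convergence.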

## 3. The Multivariate Gaussian (Optional but Important)

For a $D$-dimensional vector $\mathbf{x}$, the Gaussian distribution is defined by:
*   **Mean Vector ($\boldsymbol{\mu}$)**: A $D$-dimensional vector.
*   **Covariance Matrix ($\boldsymbol{\Sigma}$)**: A $D \times D$ symmetric, positive-definite matrix describing spread and correlations.

$$ \mathcal{N}(\mathbf{x} | \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2\pi)^D |\boldsymbol{\Sigma}|}} \exp \left( -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right) $$

Here $|\boldsymbol{\Sigma}|$ denotes the determinant of the covariance matrix. For $D = 1$ with $\boldsymbol{\Sigma} = \sigma^2$, the expression reduces to the univariate density.

Let's visualize a 2D Gaussian.

In [None]:
from scipy.stats import multivariate_normal

# Parameters for 2D Gaussian
mu_2d = [0, 0]
cov_2d = [[1, 0.5], [0.5, 1]]  # Correlated variables (correlation 0.5)

# Create grid; pos has shape (600, 600, 2), one (x1, x2) pair per grid point
x_grid, y_grid = np.mgrid[-3:3:.01, -3:3:.01]
pos = np.dstack((x_grid, y_grid))

# Calculate PDF
rv = multivariate_normal(mu_2d, cov_2d)
z = rv.pdf(pos)

# Plot Contour
plt.figure(figsize=(7, 6))
contour = plt.contourf(x_grid, y_grid, z, levels=20, cmap='viridis')
plt.colorbar(contour, label='Probability Density')
plt.title("2D Multivariate Gaussian Density")
plt.xlabel("$x_1$")
plt.ylabel("$x_2$")
plt.show()
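
The multivariate density formula in this section can be evaluated by hand to see how the covariance matrix shapes the contours. The sketch below is a direct NumPy transcription under the assumption of a small, well-conditioned covariance matrix; the function name `mvn_pdf` and the test points are introduced for illustration and are not part of SciPy.

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    """Evaluate N(x | mean, cov) directly from the density formula."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    dim = mean.shape[0]
    diff = x - mean
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), computed by solving
    # Sigma v = (x - mu) instead of forming the explicit inverse.
    quad_form = diff @ np.linalg.solve(cov, diff)
    norm_const = 1.0 / np.sqrt((2.0 * np.pi) ** dim * np.linalg.det(cov))
    return float(norm_const * np.exp(-0.5 * quad_form))

mean_demo = [0.0, 0.0]
cov_demo = [[1.0, 0.5], [0.5, 1.0]]  # same covariance as the contour plot

density_at_mean = mvn_pdf([0.0, 0.0], mean_demo, cov_demo)
density_along = mvn_pdf([1.0, 1.0], mean_demo, cov_demo)    # along the correlation
density_against = mvn_pdf([1.0, -1.0], mean_demo, cov_demo)  # against it

print(f"Density at the mean: {density_at_mean:.6f}")
print(f"Density at (1,  1):  {density_along:.6f}")
print(f"Density at (1, -1):  {density_against:.6f}")
```

Both test points are the same Euclidean distance from the mean, yet the density is higher at (1, 1) because the positive off-diagonal covariance stretches the distribution along that diagonal.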

## Summary

*   **Distributions** model uncertainty about the values a random variable can take.
*   **PDF** gives the density of a continuous variable; probabilities come from integrating it over an interval.
*   **Gaussian Distribution** is central to ML because of the Central Limit Theorem and its convenient mathematical properties (it is fully specified by its mean and covariance).
*   **Sampling** allows us to generate data that follows a specific distribution.
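
The Central Limit Theorem mentioned in the summary can be illustrated with a short simulation: averages of non-Gaussian draws behave approximately like a Gaussian. This is only a sketch, and the choice of Uniform(0, 1) draws, the seed, and the sizes `draws_per_mean` and `num_means` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
draws_per_mean = 30             # hypothetical number of draws in each average
num_means = 20_000              # hypothetical number of averages to simulate

# Each row holds 30 Uniform(0, 1) draws; average each row.
uniform_draws = rng.uniform(0.0, 1.0, size=(num_means, draws_per_mean))
sample_means = uniform_draws.mean(axis=1)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the average of
# draws_per_mean values has variance (1/12) / draws_per_mean.
standardized = (sample_means - 0.5) / np.sqrt((1.0 / 12.0) / draws_per_mean)

z_mean = float(standardized.mean())
z_std = float(standardized.std())
frac_within_one = float(np.mean(np.abs(standardized) <= 1.0))

print(f"Mean of standardized averages:  {z_mean:.3f}")
print(f"Std of standardized averages:   {z_std:.3f}")
print(f"Fraction within one std of 0:   {frac_within_one:.3f}")
```

If the averages are close to Gaussian, the standardized values should have mean near 0, standard deviation near 1, and about 68% of their mass within one unit of zero.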