# In-Depth Guide to Key Continuous Distributions

This guide provides a comprehensive overview of six fundamental continuous probability distributions: Uniform, Normal (Gaussian), Exponential, Student's t, Chi-square, and F. For each distribution, we include its definition, support, probability density function (PDF), cumulative distribution function (CDF), moments, entropy, hazard function (where relevant), estimation, relationships, sampling methods, and practical applications. All mathematical expressions are presented in LaTeX.

## 1. Continuous Uniform $\mathcal{U}(a, b)$

**Definition:** The uniform distribution models equal likelihood over a finite interval $[a, b]$.

**Parameters:** $a < b$, where $a$ and $b$ are the lower and upper bounds.

**Support:** $x \in [a, b]$.

### PDF
$$f(x) = \begin{cases}
\frac{1}{b - a}, & a \leq x \leq b, \\
0, & \text{otherwise.}
\end{cases}$$

### CDF
$$F(x) = \begin{cases}
0, & x < a, \\
\frac{x - a}{b - a}, & a \leq x \leq b, \\
1, & x > b.
\end{cases}$$

### Quantile (Inverse CDF)
$$Q(p) = a + (b - a)p, \quad 0 \leq p \leq 1.$$

### Moments and Properties

- **Mean:** $\mathbb{E}[X] = \frac{a + b}{2}$
- **Variance:** $\mathrm{Var}(X) = \frac{(b - a)^2}{12}$
- **Mode:** Any $x \in [a, b]$
- **Median:** $\frac{a + b}{2}$
- **Entropy:** $H(X) = \ln(b - a)$
- **Moment Generating Function (MGF):**
  $$M_X(t) = \begin{cases}
  \frac{e^{tb} - e^{ta}}{(b - a)t}, & t \neq 0, \\
  1, & t = 0.
  \end{cases}$$
- **Characteristic Function (CF):**
  $$\varphi_X(t) = \frac{e^{itb} - e^{ita}}{i t (b - a)}.$$

### Likelihood and MLE

For a sample $x_1, \dots, x_n$:
$$\mathcal{L}(a, b) = \prod_{i=1}^n \frac{\mathbf{1}\{a \leq x_i \leq b\}}{b - a} = \frac{\mathbf{1}\{a \leq x_{(1)}, x_{(n)} \leq b\}}{(b - a)^n},$$
where $x_{(1)} = \min_i x_i$, $x_{(n)} = \max_i x_i$.

**Maximum Likelihood Estimators (MLEs):**
- $\hat{a} = x_{(1)}$
- $\hat{b} = x_{(n)}$

**Note:** This is a non-regular estimation problem; standard Fisher information is not applicable.
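
As a minimal sketch of these estimators in Python (the helper name `uniform_mle` and the simulated data are illustrative assumptions, not part of the guide), the MLEs are simply the sample extremes:

```python
import random

def uniform_mle(xs):
    """Return (a_hat, b_hat) = (sample minimum, sample maximum)."""
    if not xs:
        raise ValueError("need at least one observation")
    return min(xs), max(xs)

# Illustrative data: 1000 draws from U(2, 5) with a fixed seed.
rng = random.Random(1)
a_true, b_true = 2.0, 5.0
xs = [rng.uniform(a_true, b_true) for _ in range(1000)]
a_hat, b_hat = uniform_mle(xs)
```

For data that really come from $\mathcal{U}(a, b)$, $\hat{a} \geq a$ and $\hat{b} \leq b$ always hold, so the fitted interval never extends beyond the true one. This one-sided error is a symptom of the non-regular behavior noted above.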

### Sampling
Generate $X = a + (b - a)U$, where $U \sim \mathcal{U}(0, 1)$.
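
The transform can be sketched in a few lines of Python. The function name `sample_uniform`, the seed, and the interval $[2, 5]$ are assumptions chosen only for illustration:

```python
import random

def sample_uniform(a, b, rng):
    """Draw one value from U(a, b) via X = a + (b - a) * U."""
    if not a < b:
        raise ValueError("require a < b")
    u = rng.random()  # U is uniform on [0, 1)
    return a + (b - a) * u

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [sample_uniform(2.0, 5.0, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to (a + b) / 2 = 3.5
```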

### Hazard Function
- **Survival:** $S(x) = 1 - \frac{x - a}{b - a}$ for $x \in [a, b]$.
- **Hazard:** $h(x) = \frac{f(x)}{S(x)} = \frac{1}{b - x}$ for $x \in [a, b)$, increasing to infinity as $x \to b$.

### When to Use
- Modeling complete ignorance within a bounded interval.
- Random offsets or simulation baselines.

---

## 2. Normal (Gaussian) $\mathcal{N}(\mu, \sigma^2)$

**Definition:** The normal distribution models additive effects and is central to statistics due to the Central Limit Theorem (CLT).

**Parameters:** $\mu \in \mathbb{R}$ (mean), $\sigma > 0$ (standard deviation).

**Support:** $x \in \mathbb{R}$.

### PDF
$$f(x) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right).$$

### CDF
No elementary form; denoted as:
$$F(x) = \Phi\left( \frac{x - \mu}{\sigma} \right),$$
where $\Phi$ is the CDF of the standard normal $\mathcal{N}(0, 1)$.

### Standardization
If $X \sim \mathcal{N}(\mu, \sigma^2)$, then:
$$Z = \frac{X - \mu}{\sigma} \sim \mathcal{N}(0, 1).$$
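
Standardization reduces every normal CDF evaluation to $\Phi$, which can be computed from the error function through the identity $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}(z / \sqrt{2})\right]$. The sketch below uses only Python's standard library; the function name `normal_cdf` is an illustrative assumption:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2): standardize, then apply Phi via erf."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P(X <= 13) for X ~ N(10, 3^2) equals Phi(1), roughly 0.8413.
p = normal_cdf(13.0, mu=10.0, sigma=3.0)
```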

### Moments and Properties

- **Mean:** $\mathbb{E}[X] = \mu$
- **Variance:** $\mathrm{Var}(X) = \sigma^2$
- **Skewness:** 0
- **Excess Kurtosis:** 0
- **MGF:**
  $$M_X(t) = \exp\left( \mu t + \frac{1}{2}\sigma^2 t^2 \right).$$
- **CF:**
  $$\varphi_X(t) = \exp\left( i \mu t - \frac{1}{2}\sigma^2 t^2 \right).$$
- **Entropy:**
  $$H(X) = \frac{1}{2} \ln\left( 2\pi e \sigma^2 \right).$$

### Key Identities

- **Sum of independent normals:** If $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$, then $\sum X_i \sim \mathcal{N}\left( \sum \mu_i, \sum \sigma_i^2 \right)$.
- **Linear transformation:** If $X \sim \mathcal{N}(\mu, \sigma^2)$, then $aX + b \sim \mathcal{N}(a\mu + b, a^2 \sigma^2)$.
- **Central Limit Theorem (CLT):** Suitably standardized sums or averages of many i.i.d. variables with finite variance are approximately normal.

### Likelihood and MLE

For a sample $x_1, \dots, x_n$:
$$\ell(\mu, \sigma^2) = -\frac{n}{2} \ln(2\pi \sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2.$$

**MLEs:**
- $\hat{\mu} = \bar{x}$
- $\widehat{\sigma^2}_{\text{MLE}} = \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^2$

**Note:** Unbiased variance estimator uses $\frac{1}{n-1}$ instead of $\frac{1}{n}$.
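
A short sketch makes the difference between the two variance estimators concrete. The helper `normal_mle` and the eight data points are illustrative assumptions:

```python
def normal_mle(xs):
    """Return (mu_hat, MLE variance with 1/n, unbiased variance with 1/(n-1))."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mu_hat = sum(xs) / n
    ss = sum((x - mu_hat) ** 2 for x in xs)  # sum of squared deviations
    return mu_hat, ss / n, ss / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, var_mle, var_unbiased = normal_mle(data)
# Here the squared deviations sum to 32, so var_mle = 32/8 and var_unbiased = 32/7.
```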

### Bayesian Conjugacy

- **Known $\sigma^2$:** Normal prior for $\mu$ yields a normal posterior.
- **Unknown $(\mu, \sigma^2)$:** Normal-Inverse-Gamma or Normal-Inverse-$\chi^2$ prior.

### Sampling
Use the Box-Muller transform:

1. Draw $U_1, U_2 \stackrel{\text{iid}}{\sim} \mathcal{U}(0, 1)$.
2. Compute:
   $$Z_1 = \sqrt{-2 \ln U_1} \cos(2\pi U_2), \quad Z_2 = \sqrt{-2 \ln U_1} \sin(2\pi U_2).$$
3. Set $X = \mu + \sigma Z_1$.
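
The three steps can be sketched as follows. The function names are assumptions made for this example, and the sketch returns both $Z_1$ and $Z_2$ (which are independent) rather than discarding $Z_2$, a choice made here for efficiency:

```python
import math
import random

def box_muller(rng):
    """Return two independent N(0, 1) draws from two uniform draws."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so that log(u1) is finite
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

def sample_normal(mu, sigma, n, rng):
    """Draw n values from N(mu, sigma^2) using both Box-Muller outputs."""
    out = []
    while len(out) < n:
        z1, z2 = box_muller(rng)
        out.extend((mu + sigma * z1, mu + sigma * z2))
    return out[:n]

rng = random.Random(42)
xs = sample_normal(10.0, 2.0, 100_000, rng)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
```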

### When to Use

- Modeling aggregated noise or measurement errors.
- Phenomena driven by additive effects.
- Parametric tests that assume normal errors or rely on CLT approximations (e.g., t-tests, ANOVA).

---

## 3. Exponential $\mathrm{Exp}(\lambda)$

**Definition:** The exponential distribution models waiting times in processes with constant hazard (memoryless property).

**Parameter:** $\lambda > 0$ (rate).

**Support:** $x \geq 0$.

### PDF, CDF, Survival, and Hazard

- **PDF:**
  $$f(x) = \lambda e^{-\lambda x}, \quad x \geq 0.$$
- **CDF:**
  $$F(x) = 1 - e^{-\lambda x}.$$
- **Survival:**
  $$S(x) = e^{-\lambda x}.$$
- **Hazard:**
  $$h(x) = \frac{f(x)}{S(x)} = \lambda \quad (\text{constant}).$$

### Quantile and Memorylessness

- **Quantile:**
  $$Q(p) = -\frac{1}{\lambda} \ln(1 - p), \quad 0 \leq p < 1.$$
- **Memoryless Property:**
  $$\mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t).$$
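
Both identities are easy to check numerically. In the sketch below, the function names and the values of $\lambda$, $s$, and $t$ are arbitrary choices for illustration:

```python
import math

def exp_survival(x, lam):
    """S(x) = exp(-lam * x) for x >= 0."""
    return math.exp(-lam * x)

def exp_quantile(p, lam):
    """Q(p) = -ln(1 - p) / lam for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log1p(-p) / lam  # log1p keeps precision for small p

lam, s, t = 0.5, 3.0, 2.0
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)  # P(X > s+t | X > s)
unconditional = exp_survival(t, lam)                           # P(X > t)
median = exp_quantile(0.5, lam)                                # ln(2) / lam
```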

### Moments and Properties

- **Mean:** $\mathbb{E}[X] = \frac{1}{\lambda}$
- **Variance:** $\mathrm{Var}(X) = \frac{1}{\lambda^2}$
- **MGF:**
  $$M_X(t) = \frac{\lambda}{\lambda - t}, \quad t < \lambda.$$
- **Entropy:**
  $$H(X) = 1 - \ln \lambda.$$

### Likelihood and MLE

For a sample $x_1, \dots, x_n$:
$$\ell(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^n x_i.$$

**MLE:**
$$\hat{\lambda} = \frac{n}{\sum x_i} = \frac{1}{\bar{x}}.$$

- **Sufficient Statistic:** $\sum x_i$.
- **Fisher Information:** $I(\lambda) = \frac{n}{\lambda^2}$.
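
A minimal sketch of the estimator follows. The data are made up, and the standard error $\hat{\lambda} / \sqrt{n}$ is the usual large-sample approximation obtained by inverting the Fisher information, so it should be read as rough for a sample this small:

```python
import math

def exp_mle(xs):
    """Return lambda_hat = n / sum(x_i) = 1 / mean."""
    n, total = len(xs), sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need positive observations")
    return n / total

waits = [0.5, 1.5, 2.0, 4.0]                 # illustrative waiting times
lam_hat = exp_mle(waits)                     # 4 / 8 = 0.5
approx_se = lam_hat / math.sqrt(len(waits))  # sqrt(1 / I(lam_hat))
```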

### Bayesian Conjugacy
If $\lambda \sim \mathrm{Gamma}(\alpha, \beta)$ with shape $\alpha$ and rate $\beta$, the posterior is:
$$\lambda \mid x_1, \dots, x_n \sim \mathrm{Gamma}\left( \alpha + n, \beta + \sum x_i \right).$$
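
The update amounts to two additions. The sketch below assumes $\mathrm{Gamma}(\alpha, \beta)$ denotes shape $\alpha$ and rate $\beta$, the convention under which the update above holds; the prior values and data are illustrative only:

```python
def gamma_posterior(alpha, beta, xs):
    """Posterior (shape, rate) for an exponential rate under a Gamma(shape, rate) prior."""
    return alpha + len(xs), beta + sum(xs)

alpha_post, beta_post = gamma_posterior(2.0, 1.0, [0.5, 1.5, 2.0, 4.0])
posterior_mean = alpha_post / beta_post  # mean of Gamma(shape, rate) is shape / rate
```

With these numbers the posterior is $\mathrm{Gamma}(6, 9)$, whose mean $6/9 \approx 0.67$ lies between the prior mean $2$ and the MLE $0.5$.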

### Sampling
Generate $X = -\frac{\ln U}{\lambda}$, where $U \sim \mathcal{U}(0, 1)$.
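
A sketch of this inverse-CDF sampler is shown below. The name `sample_exponential` and the seed are assumptions, and the uniform draw is shifted to $(0, 1]$ so that the logarithm is always finite:

```python
import math
import random

def sample_exponential(lam, rng):
    """Draw one Exp(lam) value via X = -ln(U) / lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return -math.log(u) / lam

rng = random.Random(7)
lam = 2.0
draws = [sample_exponential(lam, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to 1 / lam = 0.5
```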

### When to Use

- Modeling inter-arrival times in Poisson processes.
- Systems with constant hazard (no aging).
- Queueing theory and reliability analysis.

---

## 4. Student's t $\mathrm{t}_\nu(\mu, s)$

**Definition:** The t-distribution is used for small-sample inference and robust modeling, with heavier tails than the normal.

**Parameters:** $\mu \in \mathbb{R}$ (location), $s > 0$ (scale), $\nu > 0$ (degrees of freedom). Standard form: $\mu = 0$, $s = 1$.

**Support:** $x \in \mathbb{R}$.

### PDF
$$f(x) = \frac{\Gamma\left( \frac{\nu + 1}{2} \right)}{s \sqrt{\nu \pi} \, \Gamma\left( \frac{\nu}{2} \right)} \left( 1 + \frac{1}{\nu} \left( \frac{x - \mu}{s} \right)^2 \right)^{-\frac{\nu + 1}{2}}.$$

### CDF (Standard Form, $\mu = 0, s = 1$)
Expressed using the regularized incomplete Beta function $I_z(a, b)$:
$$F(t) = \frac{1}{2} + \frac{\operatorname{sgn}(t)}{2} \left[ 1 - I_{\frac{\nu}{t^2 + \nu}}\left( \frac{\nu}{2}, \frac{1}{2} \right) \right],$$
or via hypergeometric function:
$$F(t) = \frac{1}{2} + t \cdot \frac{\Gamma\left( \frac{\nu + 1}{2} \right)}{\sqrt{\nu \pi} \, \Gamma\left( \frac{\nu}{2} \right)} \, {}_2F_1\left( \frac{1}{2}, \frac{\nu + 1}{2}; \frac{3}{2}; -\frac{t^2}{\nu} \right).$$
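
Python's standard library has no incomplete Beta or hypergeometric function, so one hedged way to sanity-check CDF values is to integrate the PDF numerically. The sketch below evaluates the density through `math.lgamma` and applies composite Simpson's rule from $0$ to $t$; the function names and step count are assumptions, and this is a checking aid rather than a production routine:

```python
import math

def t_pdf(x, nu, mu=0.0, s=1.0):
    """Location-scale Student's t density, computed on the log scale."""
    z = (x - mu) / s
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi) - math.log(s))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(z * z / nu))

def t_cdf_numeric(t, nu, steps=2000):
    """Standard t CDF as 1/2 + integral of the PDF from 0 to t (Simpson's rule)."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even; h is negative when t < 0
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3

# nu = 1 is the Cauchy case, where F(1) = 1/2 + arctan(1)/pi = 0.75.
cauchy_check = t_cdf_numeric(1.0, 1)
```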

### Key Relationship
If $Z \sim \mathcal{N}(0, 1)$, $V \sim \chi^2_\nu$, and $Z \perp V$, then:
$$T = \frac{Z}{\sqrt{V / \nu}} \sim \mathrm{t}_\nu.$$
For location-scale: $X = \mu + s T$.

### Moments (Standard Form)

- **Mean:** $\mathbb{E}[T] = 0$ ($\nu > 1$)
- **Variance:** $\mathrm{Var}(T) = \frac{\nu}{\nu - 2}$ ($\nu > 2$)
- **Skewness:** 0 ($\nu > 3$)
- **Excess Kurtosis:** $\frac{6}{\nu - 4}$ ($\nu > 4$)
- **MGF:** Does not exist.
- **CF:** Exists but lacks a simple elementary form.

### Heavy Tails and Robustness
The t-distribution has heavier tails than the normal, so models built on it are less sensitive to outliers; the effect is strongest for small $\nu$.

### Likelihood and Estimation
No closed-form MLEs for $(\mu, s, \nu)$. Numerical optimization or Expectation-Maximization (EM) is required. For fixed $\nu$, $\mu$ and $s$ can be estimated via iteratively reweighted least squares (IRLS).

### Sampling

1. Draw $Z \sim \mathcal{N}(0, 1)$, $V \sim \chi^2_\nu$.
2. Compute $T = \frac{Z}{\sqrt{V / \nu}}$.
3. Set $X = \mu + s T$.
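
These steps can be sketched with the standard library alone. The chi-square draw uses `random.gammavariate(shape, scale)` with shape $\nu/2$ and scale 2; the function name, seed, and parameter values are illustrative assumptions:

```python
import math
import random

def sample_t(nu, mu, s, rng):
    """Draw one location-scale t value via T = Z / sqrt(V / nu)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) as Gamma(shape nu/2, scale 2)
    return mu + s * z / math.sqrt(v / nu)

rng = random.Random(3)
nu = 10
draws = [sample_t(nu, 0.0, 1.0, rng) for _ in range(200_000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)  # theory: nu / (nu - 2) = 1.25
```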

### When to Use

- Small-sample inference with unknown variance.
- Robust regression modeling under outliers.
- Bayesian models using scale mixtures of normals.

---

## 5. Chi-square $\chi^2_k$

**Definition:** The chi-square distribution models the sum of squared standard normals.

**Parameter:** $k > 0$ (degrees of freedom).

**Support:** $x > 0$.

**Relationship:** For integer $k$, if $Z_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0, 1)$, then $\sum_{i=1}^k Z_i^2 \sim \chi^2_k$. For any $k > 0$, $\chi^2_k$ is equivalent to $\mathrm{Gamma}\left( \frac{k}{2}, 2 \right)$ in the shape-scale parameterization.

### PDF and CDF

- **PDF:**
  $$f(x) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{k/2 - 1} e^{-x/2}, \quad x > 0.$$
- **CDF:**
  $$F(x) = \frac{\gamma\left( \frac{k}{2}, \frac{x}{2} \right)}{\Gamma(k/2)},$$
  where $\gamma(a, x)$ is the lower incomplete gamma function.

### Moments and Properties

- **Mean:** $\mathbb{E}[X] = k$
- **Variance:** $\mathrm{Var}(X) = 2k$
- **Skewness:** $\sqrt{\frac{8}{k}}$
- **Excess Kurtosis:** $\frac{12}{k}$
- **MGF:**
  $$M_X(t) = (1 - 2t)^{-k/2}, \quad t < \frac{1}{2}.$$
- **Entropy:**
  $$H(X) = \frac{k}{2} + \ln\left( 2 \Gamma(k/2) \right) + \left( 1 - \frac{k}{2} \right) \psi\left( \frac{k}{2} \right),$$
  where $\psi$ is the digamma function.

### Relationships

- **Additivity:** If $X_1 \sim \chi^2_{k_1}$, $X_2 \sim \chi^2_{k_2}$, and $X_1 \perp X_2$, then $X_1 + X_2 \sim \chi^2_{k_1 + k_2}$.
- **F-distribution:** If $X \sim \chi^2_k$, $Y \sim \chi^2_m$, and $X \perp Y$, then $\frac{X/k}{Y/m} \sim F_{k, m}$.

### Estimation Use
Used for variance tests in normal data:
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{n - 1},$$
where $S^2$ is the sample variance.
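
As a small worked sketch, the statistic for testing a hypothesized variance $\sigma_0^2$ can be computed as below. The helper name, the data, and $\sigma_0^2 = 4$ are illustrative assumptions; the result would then be compared with quantiles of $\chi^2_{n-1}$:

```python
import statistics

def chi2_variance_statistic(xs, sigma0_sq):
    """Return (n - 1) * S^2 / sigma0^2, with S^2 the unbiased sample variance."""
    n = len(xs)
    if n < 2 or sigma0_sq <= 0:
        raise ValueError("need n >= 2 and a positive hypothesized variance")
    return (n - 1) * statistics.variance(xs) / sigma0_sq

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stat = chi2_variance_statistic(data, sigma0_sq=4.0)  # 32 / 4 = 8, on 7 degrees of freedom
```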

### Sampling

- Sum the squares of $k$ i.i.d. $\mathcal{N}(0, 1)$ variates.
- Alternatively, use a Gamma sampler with shape $\frac{k}{2}$, scale 2.
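
Both routes can be compared in a short simulation. The sketch assumes integer $k$ for the sum-of-squares route and uses `random.gammavariate(shape, scale)` for the Gamma route; the names and seed are illustrative:

```python
import random

def chi2_sum_of_squares(k, rng):
    """Chi-square(k) draw as a sum of k squared N(0, 1) variates (integer k)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

def chi2_gamma(k, rng):
    """Chi-square(k) draw as Gamma(shape k/2, scale 2); works for any k > 0."""
    return rng.gammavariate(k / 2.0, 2.0)

rng = random.Random(11)
k, n = 5, 50_000
by_squares = [chi2_sum_of_squares(k, rng) for _ in range(n)]
by_gamma = [chi2_gamma(k, rng) for _ in range(n)]
mean_squares = sum(by_squares) / n  # both means should be close to k
mean_gamma = sum(by_gamma) / n
```

The Gamma route costs one draw regardless of $k$, whereas the sum of squares needs $k$ normal draws per sample.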

### When to Use

- Variance inference for normal data.
- Goodness-of-fit tests.
- Components of sums of squares in ANOVA.
- Large-sample approximations in contingency tables.

---

## 6. F Distribution $F_{d_1, d_2}$

**Definition:** The F-distribution models the ratio of scaled chi-square variables.

**Parameters:** $d_1, d_2 > 0$ (degrees of freedom).

**Support:** $x > 0$.

**Relationship:** If $U \sim \chi^2_{d_1}$, $V \sim \chi^2_{d_2}$, and $U \perp V$, then:
$$F = \frac{(U / d_1)}{(V / d_2)} \sim F_{d_1, d_2}.$$

### PDF and CDF

- **PDF:**
  $$f(x) = \frac{1}{\mathrm{B}\left( \frac{d_1}{2}, \frac{d_2}{2} \right)} \left( \frac{d_1}{d_2} \right)^{d_1/2} \frac{x^{d_1/2 - 1}}{\left( 1 + \frac{d_1}{d_2} x \right)^{(d_1 + d_2)/2}}, \quad x > 0.$$
- **CDF:**
  $$F(x) = I_{\frac{d_1 x}{d_1 x + d_2}}\left( \frac{d_1}{2}, \frac{d_2}{2} \right),$$
  where $I_z(a, b)$ is the regularized incomplete Beta function.

### Moments and Properties

- **Mean:** $\mathbb{E}[F] = \frac{d_2}{d_2 - 2}$, $d_2 > 2$
- **Variance:**
  $$\mathrm{Var}(F) = \frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)}, \quad d_2 > 4.$$
- **Mode:**
  $$\frac{(d_1 - 2)}{d_1} \cdot \frac{d_2}{d_2 + 2}, \quad d_1 > 2.$$

### Relationships

- If $T \sim \mathrm{t}_\nu$, then $T^2 \sim F_{1, \nu}$.
- **Reciprocal:** If $X \sim F_{d_1, d_2}$, then $\frac{1}{X} \sim F_{d_2, d_1}$.

### ANOVA and Regression
Used in global F-tests:
$$F = \frac{\text{MS}_{\text{model}}}{\text{MS}_{\text{error}}} = \frac{(SSR / d_1)}{(SSE / d_2)},$$
where $SSR$ is the sum of squares for the model, and $SSE$ is the residual sum of squares.
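
For a concrete case, the sketch below computes this ratio for a one-way ANOVA, where the model sum of squares is the between-group sum of squares, $d_1$ is the number of groups minus one, and $d_2$ is the number of observations minus the number of groups. The function name and the three small groups are made up for illustration:

```python
def one_way_anova_f(groups):
    """Return (F, d1, d2) for a one-way ANOVA over a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ssr = 0.0  # between-group (model) sum of squares
    sse = 0.0  # within-group (residual) sum of squares
    for g in groups:
        g_mean = sum(g) / len(g)
        ssr += len(g) * (g_mean - grand_mean) ** 2
        sse += sum((x - g_mean) ** 2 for x in g)
    d1 = len(groups) - 1
    d2 = len(all_values) - len(groups)
    return (ssr / d1) / (sse / d2), d1, d2

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, d1, d2 = one_way_anova_f(groups)
# SSR = 26 and SSE = 6 here, so F = (26 / 2) / (6 / 6) = 13 on (2, 6) degrees of freedom.
```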

### Sampling

1. Draw $U \sim \chi^2_{d_1}$, $V \sim \chi^2_{d_2}$.
2. Compute $F = \frac{U / d_1}{V / d_2}$.
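
A sketch of this recipe follows. It draws each chi-square variate as a Gamma with shape $d/2$ and scale 2 through `random.gammavariate`; the function name, seed, and degrees of freedom are illustrative assumptions:

```python
import random

def sample_f(d1, d2, rng):
    """Draw one F(d1, d2) value as a ratio of scaled chi-square variates."""
    u = rng.gammavariate(d1 / 2.0, 2.0)  # chi-square(d1)
    v = rng.gammavariate(d2 / 2.0, 2.0)  # chi-square(d2), drawn independently
    return (u / d1) / (v / d2)

rng = random.Random(5)
d1, d2 = 5, 20
draws = [sample_f(d1, d2, rng) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)  # theory: d2 / (d2 - 2) = 20 / 18
```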

### When to Use

- Comparing two variances.
- Omnibus tests in ANOVA.
- Nested model comparisons via ratio of mean squares.

---

## Cross-Distribution Relationships

- **Uniform:** Basis for simulation; relates to other distributions via transformations (e.g., inverse CDF sampling).
- **Normal:** Foundational due to CLT; sums of normals remain normal; standardized normal leads to $\chi^2$, t, and F.
- **Exponential:** Linked to Poisson processes; special case of Gamma; the only continuous distribution with the memoryless property.
- **t:** Arises as a ratio involving $\mathcal{N}(0, 1)$ and $\chi^2$; converges to normal as $\nu \to \infty$.
- **$\chi^2$:** Sum of squared standard normals; equivalent to $\mathrm{Gamma}(k/2, 2)$ (shape $k/2$, scale 2); used in t and F.
- **F:** Ratio of scaled $\chi^2$ variables; used in variance comparisons and ANOVA.

---

## When to Choose What?

- **Uniform:** Use for complete ignorance within a bounded interval or simulation inputs.
- **Normal:** Use for additive effects, measurement errors, or CLT-driven models.
- **Exponential:** Use for waiting times with constant hazard (e.g., Poisson processes).
- **t:** Use for small samples with unknown variance or robust modeling under outliers.
- **$\chi^2$:** Use for variance inference, goodness-of-fit, or sums of squares in ANOVA.
- **F:** Use for comparing variances or omnibus tests in ANOVA and regression.

---

## Estimation Cheat-Sheet (i.i.d. Sample)

- **Uniform $[a, b]$:** $\hat{a} = x_{(1)}$, $\hat{b} = x_{(n)}$ (non-regular case).
- **Normal:** $\hat{\mu} = \bar{x}$, $\widehat{\sigma^2}_{\text{MLE}} = \frac{1}{n} \sum (x_i - \bar{x})^2$.
- **Exponential:** $\hat{\lambda} = \frac{1}{\bar{x}}$.
- **t (location-scale):** No closed-form MLEs; use numerical methods.
- **$\chi^2_k$, $F_{d_1, d_2}$:** Degrees of freedom typically known; if unknown, use method-of-moments or numerical likelihood.

---

## Hazard and Survival (Reliability Perspective)

- **Uniform:** Survival $S(x) = 1 - \frac{x - a}{b - a}$, hazard $h(x) = \frac{1}{b - x}$ (increases to infinity).
- **Exponential:** Constant hazard $h(x) = \lambda$, survival $S(x) = e^{-\lambda x}$.
- **Normal, t, $\chi^2$, F:** No simple closed-form hazards; shapes depend on parameters. The t and F distributions have heavy (polynomially decaying) tails, whereas the $\chi^2$ right tail decays exponentially.

---

## Sampling Recipes

- **Uniform:** $X = a + (b - a)U$, $U \sim \mathcal{U}(0, 1)$.
- **Normal:** Box-Muller transform (see Normal section).
- **Exponential:** $X = -\frac{\ln U}{\lambda}$, $U \sim \mathcal{U}(0, 1)$.
- **t:** Draw $Z \sim \mathcal{N}(0, 1)$, $V \sim \chi^2_\nu$, compute $T = \frac{Z}{\sqrt{V / \nu}}$, then $X = \mu + s T$.
- **$\chi^2_k$:** Sum of squares of $k$ i.i.d. $\mathcal{N}(0, 1)$ or use Gamma sampler.
- **F:** Draw $U \sim \chi^2_{d_1}$, $V \sim \chi^2_{d_2}$, compute $F = \frac{U / d_1}{V / d_2}$.

---

## Common Pitfalls

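- **Normal with Heavy Tails:** A normal model understates tail probabilities for heavy-tailed data; consider a t-distribution instead, which is also the appropriate reference for small samples with unknown variance.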
- **Exponential Misuse:** Verify constant hazard before using for lifetimes.
- **Variance Estimation:** MLE for normal variance uses $\frac{1}{n}$; unbiased estimator uses $\frac{1}{n-1}$.
- **Uniform Endpoints:** Non-regular estimation; standard asymptotics do not apply.
