# Random Variables

**Definition**

A random variable is a mapping $X:\Omega\to\mathbb{R}$ that assigns a real number $X(\omega)$ to each outcome $\omega$.

---

**Definition**

Given $X$ and $A\subset\mathbb{R}$

$X^{-1}(A) = \{\omega\in\Omega: X(\omega)\in A\}$

$\mathbb{P}(X\in A) = \mathbb{P}(X^{-1}(A)) = \mathbb{P}(\{\omega\in\Omega: X(\omega)\in A\})$

$\mathbb{P}(X = x) = \mathbb{P}(X^{-1}(x)) = \mathbb{P}(\{\omega\in\Omega: X(\omega) = x\})$, where $x$ - a particular value of $X$
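As a small illustration of the preimage definition, the sketch below assumes a toy sample space of two fair coin flips and takes $X$ to be the number of heads. The names `omega`, `prob`, `preimage` and `P_X_in` are illustrative choices, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: two fair coin flips, each outcome has probability 1/4.
omega = list(product("HT", repeat=2))
prob = {w: Fraction(1, 4) for w in omega}

def X(w):
    """Random variable: number of heads in the outcome w."""
    return w.count("H")

def preimage(A):
    """X^{-1}(A): all outcomes whose X-value lies in the set A."""
    return [w for w in omega if X(w) in A]

def P_X_in(A):
    """P(X in A), computed as the probability of the preimage of A."""
    return sum((prob[w] for w in preimage(A)), Fraction(0))

print(preimage({1}))    # the two outcomes with exactly one head
print(P_X_in({1}))      # 1/2
print(P_X_in({1, 2}))   # 3/4
```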

### Distribution Functions and Probability Functions

**Definition**

The cumulative distribution function (CDF) is the function $F_X:\mathbb{R}\to[0,1]$, defined by $F_X(x)=\mathbb{P}(X\leq x)$

---

**Theorem**

Let $X$ have CDF $F$ and let $Y$ have CDF $G$. If $F(x) = G(x)$ for all $x$, then $\mathbb{P}(X\in A) = \mathbb{P}(Y\in A)$ for all $A$.

---

**Theorem**

A function $F$ mapping $\mathbb{R}\to[0,1]$ is a CDF for some probability $\mathbb{P}$ $\iff$ $F$ satisfies the following 3 conditions:
- $F$ is non-decreasing: $x_1 < x_2 \Rightarrow F(x_1)\leq F(x_2)$
- $F$ is normalized:
  - $\lim_{x\to-\infty} F(x) = 0$
  - $\lim_{x\to\infty} F(x) = 1$
- $F$ is right-continuous: $\forall x: F(x) = F(x^+)$, where $F(x^+) = \lim_{y\downarrow x}F(y)$

---

**Definition**

$X$ is discrete if it takes countably many values $\{x_1, x_2, ...\}$.

$f_X(x) = \mathbb{P}(X=x)$ - the probability function or a probability mass function
- $\forall x\in\mathbb{R}: f_X(x) \geq 0$
- $\sum_{i} f_X(x_i) = 1$
- $F_X(x) = \mathbb{P}(X\leq x) = \sum_{x_i \leq x} f_X(x_i)$
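A minimal sketch of how the CDF is built from a probability mass function, assuming a fair six-sided die as the example. The sum runs over all $x_i \leq x$ (non-strict), so that it matches $\mathbb{P}(X\leq x)$.

```python
from fractions import Fraction

# Assumed example: PMF of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F_X(x) = sum of f_X(x_i) over all support points x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))

# The first two PMF properties: non-negativity and total mass 1.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

print(cdf(3))     # 1/2
print(cdf(3.5))   # 1/2, the CDF is flat between support points
print(cdf(6))     # 1
```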

---

**Definition**

$X$ is continuous if $\exists f_X$ - the probability density function:
- $\forall x: f_X(x)\geq 0$
- $\int_{-\infty}^\infty f_X(x)dx = 1$
- $\forall a \leq b: \mathbb{P}(a\leq X < b) = \int_a^b f_X(x)dx$
- $F_X(x) = \int_{-\infty}^x f_X(t)dt$
- $f_X(x) = dF_X(x)/dx$ at all points where $F_X(x)$ is differentiable

---

**Lemma**

Let $F$ be the CDF for $X$:
- $\mathbb{P}(X = x) = F(x) - F(x^-)$, where $F(x^-) = \lim_{y\uparrow x}F(y)$
- $\mathbb{P}(x < X \leq y) = F(y) - F(x)$
- $\mathbb{P}(X > x) = 1 - F(x)$
- $X$ is continuous $\Rightarrow F(b) - F(a) = \mathbb{P}(a<X<b) = \mathbb{P}(a\leq X < b) = \mathbb{P}(a < X \leq b) = \mathbb{P}(a\leq X \leq b)$

---

**Definition**

The inverse CDF or quantile function is $F^{-1}(q) = \inf\{x: F(x)>q\}$ for $q\in[0,1]$.

If $F$ is strictly increasing and continuous, then $F^{-1}(q)$ is the unique real number $x$ such that $F(x)=q$.
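For a continuous CDF that is strictly increasing on its support, the quantile can be found by solving $F(x)=q$. The sketch below assumes the $\text{Exp}(\beta)$ CDF $F(x) = 1 - e^{-x/\beta}$ for $x > 0$ with $\beta = 2$, and compares the closed-form inverse with a generic bisection solver. The function names and the search interval are illustrative assumptions.

```python
import math

def exp_cdf(x, beta=2.0):
    """CDF of Exp(beta) with scale beta: 1 - exp(-x / beta) for x > 0, else 0."""
    return 1.0 - math.exp(-x / beta) if x > 0 else 0.0

def exp_quantile(q, beta=2.0):
    """Closed-form solution of exp_cdf(x) = q, valid for 0 <= q < 1."""
    return -beta * math.log(1.0 - q)

def quantile_bisect(cdf, q, lo, hi, tol=1e-10):
    """Solve cdf(x) = q by bisection, assuming cdf is continuous and
    strictly increasing on [lo, hi] with cdf(lo) <= q <= cdf(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

median_closed = exp_quantile(0.5)                      # equals beta * ln 2
median_numeric = quantile_bisect(exp_cdf, 0.5, 0.0, 100.0)
print(median_closed, median_numeric)
```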

---

**Definition**

Two random variables $X$ and $Y$ are equal in distribution if $\forall x: F_X(x) = F_Y(x)$.

### Discrete Random Variables

---

**The point mass distribution** $X\sim\delta_a$

$F(x) = \begin{cases} 0 & x < a \\ 1 & x\geq a \end{cases}$

$f(x) = \begin{cases} 0 & x\neq a \\ 1 & x = a \end{cases}$

---

**The discrete uniform distribution**

$k > 1$ - a given integer

$f(x) = \begin{cases} 1/k & x \in \{1,...,k\} \\ 0 & x \not\in \{1,...,k\} \end{cases}$

---

**The Bernoulli distribution** $X\sim \text{Bernoulli}(p)$

$p\in[0, 1]$

$f(x) = p^x (1-p)^{1-x}$, where $x\in\{0,1\}$

---

**The binomial distribution** $X\sim \text{Binomial}(n, p)$

$p\in[0, 1]$, $n > 0$

$f(x) = \begin{cases} \binom{n}{x}p^x (1-p)^{n-x} & x\in\{0,...,n\} \\ 0 & x\not\in \{0,...,n\} \end{cases}$

if $X_1\sim \text{Binomial}(n_1, p)$ and $X_2\sim \text{Binomial}(n_2, p)$ are independent, then $X_1 + X_2\sim \text{Binomial}(n_1 + n_2, p)$
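The additivity property can be checked numerically. The sketch below assumes $X_1$ and $X_2$ are independent, computes the PMF of $X_1 + X_2$ by discrete convolution, and compares it with the $\text{Binomial}(n_1+n_2, p)$ PMF. The parameter values are arbitrary illustrative choices.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial(n, p) probability mass at x (zero outside {0, ..., n})."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def pmf_of_sum(n1, n2, p):
    """PMF of X1 + X2 for independent X1 ~ Binomial(n1, p), X2 ~ Binomial(n2, p),
    obtained by summing P(X1 = j) * P(X2 = z - j) over j."""
    return [
        sum(binom_pmf(j, n1, p) * binom_pmf(z - j, n2, p) for j in range(z + 1))
        for z in range(n1 + n2 + 1)
    ]

n1, n2, p = 3, 5, 0.3   # illustrative values
conv = pmf_of_sum(n1, n2, p)
direct = [binom_pmf(z, n1 + n2, p) for z in range(n1 + n2 + 1)]
max_gap = max(abs(a - b) for a, b in zip(conv, direct))
print(max_gap)   # difference is at floating-point rounding level
```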

---

**The geometric distribution** $X\sim \text{Geom}(p)$

$p\in(0, 1)$

$f(x) = p(1-p)^{x-1}$, where $x\geq 1$

---

**The Poisson distribution** $X\sim \text{Poisson}(\lambda)$

$\lambda \geq 0$

$f(x) = e^{-\lambda}\frac{\lambda^x}{x!}$, where $x\geq 0$

if $X_1\sim \text{Poisson}(\lambda_1)$ and $X_2\sim \text{Poisson}(\lambda_2)$ are independent, then $X_1 + X_2\sim \text{Poisson}(\lambda_1 + \lambda_2)$

### Continuous Random Variables

---

**The uniform distribution** $X\sim \text{Uniform}(a,b)$

$a < b$

$f(x) = \begin{cases} \frac{1}{b-a} & x\in[a,b] \\ 0 & x\not\in[a,b] \end{cases}$

---

**The normal (Gaussian) distribution** $X\sim N(\mu,\sigma^2)$

$\mu\in\mathbb{R}$, $\sigma > 0$

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp{(-\frac{1}{2\sigma^2}(x-\mu)^2)}$, where $x\in\mathbb{R}$

$\phi(z) = \frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}z^2)$, where $z\in\mathbb{R}$ - the PDF of $N(0,1)$; $\Phi(z) = \int_{-\infty}^z \phi(t)dt$ - the CDF of $N(0,1)$

- $X\sim N(\mu, \sigma^2) \Rightarrow Z = \frac{X-\mu}{\sigma}\sim N(0,1)$
- $Z\sim N(0,1) \Rightarrow X = \mu + \sigma Z \sim N(\mu, \sigma^2)$
- if $\forall i\in\{1,2,...,n\}$ $X_i\sim N(\mu_i, \sigma_i^2)$ are independent $\Rightarrow \sum_{i=1}^n X_i \sim N(\sum_{i=1}^n\mu_i,\sum_{i=1}^n\sigma_i^2)$
- $\mathbb{P}(a<X<b) = \mathbb{P}(\frac{a-\mu}{\sigma}<Z<\frac{b-\mu}{\sigma}) = \Phi(\frac{b-\mu}{\sigma}) - \Phi(\frac{a-\mu}{\sigma})$
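The standardization rule in the last bullet can be sketched as follows, assuming $\Phi$ denotes the CDF of $N(0,1)$ and using the standard identity $\Phi(z) = \frac{1}{2}\left(1 + \text{erf}(z/\sqrt{2})\right)$. The values of $\mu$, $\sigma$, $a$, $b$ are illustrative.

```python
import math

def Phi(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_prob(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), via standardization."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

# Illustrative case: one standard deviation on either side of the mean.
p = normal_interval_prob(1.0, 5.0, mu=3.0, sigma=2.0)
print(round(p, 4))   # 0.6827
```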

---

**The exponential distribution** $X\sim \text{Exp}(\beta)$

$\beta > 0$

$f(x) = \frac{1}{\beta}e^{-\frac{x}{\beta}}$, where $x > 0$

---

**The gamma distribution** $X\sim \text{Gamma}(\alpha, \beta)$

$\Gamma(\alpha) = \int_0^\infty y^{\alpha-1}e^{-y}dy$, where $\alpha > 0$

$\alpha > 0$, $\beta > 0$

$f(x) = \frac{1}{\beta^\alpha\Gamma(\alpha)}x^{\alpha-1}e^{-\frac{x}{\beta}}$, where $x > 0$

- if $\forall i\in\{1,2,...,n\}$ $X_i\sim \text{Gamma}(\alpha_i, \beta)$ are independent $\Rightarrow \sum_{i=1}^n X_i \sim \text{Gamma}(\sum_{i=1}^n \alpha_i, \beta)$

---

**The beta distribution** $X\sim \text{Beta}(\alpha, \beta)$

$\alpha > 0$, $\beta > 0$

$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$, where $0 < x < 1$

---

**The t distribution** $X\sim t_\nu$

$f(x) = \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})}\frac{1}{(1+\frac{x^2}{\nu})^\frac{\nu+1}{2}}$, where $x\in\mathbb{R}$

---

**The Cauchy distribution** $X\sim t_1$

$f(x) = \frac{1}{\pi(1+x^2)}$

---

**The $\chi^2$ distribution** $X\sim \chi_p^2$

$f(x) = \frac{1}{\Gamma(\frac{p}{2})2^\frac{p}{2}}x^{\frac{p}{2}-1}e^{-\frac{x}{2}}$, where $x > 0$

- if $Z_1,...,Z_p \sim N(0,1)$ are independent $\Rightarrow \sum_{i=1}^p Z_i^2 \sim \chi_p^2$
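A rough simulation check of the sum-of-squares property for $p = 2$. In that case the density above reduces to $\frac{1}{2}e^{-x/2}$, so the CDF is $1 - e^{-x/2}$. The seed, sample size and evaluation point are arbitrary assumptions, and the agreement is only approximate because of sampling noise.

```python
import math
import random

def chi2_draw(p, rng):
    """One draw of Z_1^2 + ... + Z_p^2 with independent Z_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p))

rng = random.Random(0)   # fixed seed so the run is reproducible
n_draws = 20000
x0 = 2.0

draws = [chi2_draw(2, rng) for _ in range(n_draws)]
empirical = sum(d <= x0 for d in draws) / n_draws   # empirical P(V <= x0)
exact = 1.0 - math.exp(-x0 / 2.0)                   # chi^2_2 CDF at x0
print(empirical, exact)
```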

### Bivariate Distributions

---

**Definition**

$X$ and $Y$ - discrete random variables

$f(x,y) = f_{XY} = \mathbb{P}(X=x,Y=y)$ - the joint probability mass function
- $\forall x,y\in\mathbb{R}: f(x,y) \geq 0$
- $\sum_{i,j} f(x_i,y_j) = 1$
- $F(x,y) = \mathbb{P}(X\leq x, Y\leq y) = \sum_{x_i \leq x, y_j \leq y} f(x_i, y_j)$

---

**Definition**

$X$ and $Y$ - continuous random variables

$f(x,y) = f_{XY}$ - the joint probability density function
- $\forall x,y: f(x,y)\geq 0$
- $\int_{-\infty}^\infty \int_{-\infty}^\infty f(x,y)dxdy = 1$
- $\forall A\subset \mathbb{R}\times\mathbb{R}: \mathbb{P}((X,Y)\in A) = \int\int_A f(x,y)dxdy$
- $F(x,y) = \mathbb{P}(X\leq x, Y\leq y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(s,t)dtds$

### Marginal Distributions

**Definition**

$(X,Y)\sim f_{XY}$ - probability mass function $\Rightarrow$

$f_X(x) = \mathbb{P}(X=x) = \sum_y\mathbb{P}(X=x,Y=y) = \sum_y f(x,y)$ - marginal mass function for $X$

$f_Y(y) = \mathbb{P}(Y=y) = \sum_x\mathbb{P}(X=x,Y=y) = \sum_x f(x,y)$ - marginal mass function for $Y$
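The marginal mass functions are row and column sums of the joint table. A minimal sketch, assuming a hypothetical joint PMF on $\{0,1\}\times\{0,1\}$ stored as a dictionary keyed by $(x, y)$:

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint PMF f(x, y); the four probabilities sum to 1.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

def marginals(joint):
    """Return (f_X, f_Y): f_X(x) = sum_y f(x, y) and f_Y(y) = sum_x f(x, y)."""
    f_x, f_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        f_x[x] += p
        f_y[y] += p
    return dict(f_x), dict(f_y)

f_x, f_y = marginals(joint)
print(f_x[0], f_x[1])   # 3/10 7/10
print(f_y[0], f_y[1])   # 2/5 3/5
```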

---

**Definition**

$(X,Y)\sim f_{XY}$ - probability density function $\Rightarrow$

$f_X(x) = \int_{-\infty}^\infty f(x,y)dy$

$f_Y(y) = \int_{-\infty}^\infty f(x,y)dx$

### Independent Random Variables

**Definition**

$\forall A, B: \mathbb{P}(X\in A, Y\in B) = \mathbb{P}(X\in A)\mathbb{P}(Y\in B) \Rightarrow$ $X$ and $Y$ are independent

---

**Theorem**

Let $X$ and $Y$ have joint PMF $f_{XY}$. Then $X$ and $Y$ are independent $\iff$ $f_{XY}(x,y) = f_X(x)f_Y(y)$ for all values $x$ and $y$.
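The factorization criterion for a joint PMF can be checked cell by cell. The sketch below uses two hypothetical joint tables, one that factorizes and one that does not. `is_independent` is an illustrative helper name.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """True if f(x, y) == f_X(x) * f_Y(y) for every pair (x, y) of support values."""
    zero = Fraction(0)
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    f_x = {x: sum((joint.get((x, y), zero) for y in ys), zero) for x in xs}
    f_y = {y: sum((joint.get((x, y), zero) for x in xs), zero) for y in ys}
    return all(joint.get((x, y), zero) == f_x[x] * f_y[y] for x, y in product(xs, ys))

# Hypothetical table that factorizes: both marginals are (1/3, 2/3).
independent_joint = {
    (0, 0): Fraction(1, 9), (0, 1): Fraction(2, 9),
    (1, 0): Fraction(2, 9), (1, 1): Fraction(4, 9),
}
# Hypothetical table that does not: f(0, 0) = 1/10 but f_X(0) * f_Y(0) = 3/25.
dependent_joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

print(is_independent(independent_joint))   # True
print(is_independent(dependent_joint))     # False
```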

---

**Theorem**

Let $X$ and $Y$ have joint PDF $f_{XY}$. Then $X$ and $Y$ are independent $\iff$ $f_{XY}(x,y) = f_X(x)f_Y(y)$ for all values $x$ and $y$.

---

**Theorem**

Suppose that the range of $X$ and $Y$ is a (possibly infinite) rectangle. If $f(x,y) = g(x)h(y)$ for some functions $g$ and $h$ (not necessarily PDFs), then $X$ and $Y$ are independent.

### Conditional Distributions

**Definition**

The conditional probability mass function:

$f_{X|Y}(x|y) = \mathbb{P}(X=x|Y=y) = \frac{\mathbb{P}(X=x,Y=y)}{\mathbb{P}(Y=y)} = \frac{f_{XY}(x,y)}{f_Y(y)}$ if $f_Y(y) > 0$
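A minimal sketch of the conditional mass function, assuming a hypothetical joint PMF stored as a dictionary keyed by $(x, y)$. The helper raises an error when $f_Y(y) = 0$, where the conditional PMF is not defined.

```python
from fractions import Fraction

# Hypothetical joint PMF f_XY(x, y).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

def conditional_x_given_y(joint, y):
    """f_{X|Y}(x | y) = f_XY(x, y) / f_Y(y), defined only when f_Y(y) > 0."""
    f_y = sum((p for (_, yj), p in joint.items() if yj == y), Fraction(0))
    if f_y == 0:
        raise ValueError("f_Y(y) = 0: the conditional PMF is undefined")
    return {x: p / f_y for (x, yj), p in joint.items() if yj == y}

cond = conditional_x_given_y(joint, 1)
print(cond[0], cond[1])   # 1/3 2/3
```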

---

**Definition**

The conditional probability density function:

$f_{X|Y}(x|y) = \frac{f_{XY}(x,y)}{f_Y(y)}$ if $f_Y(y) > 0$

$\mathbb{P}(X\in A|Y=y) = \int_A f_{X|Y}(x|y)dx$

### Multivariate Distributions and IID Samples

**Notation**

$X = (X_1,...,X_n)$ - random vector, where $X_1,...,X_n$ are random variables

$f(x_1,...,x_n)$ - the PDF

**Definition**

$\forall A_1,...,A_n: \mathbb{P}(X_1\in A_1,...,X_n\in A_n) = \prod_{i=1}^n \mathbb{P}(X_i\in A_i) \Rightarrow X_1,...,X_n$ are independent

$f_{X_1...X_n}(x_1,...,x_n) = \prod_{i=1}^n f_{X_i}(x_i) \Rightarrow X_1,...,X_n$ are independent

**Definition**

If $X_1,...,X_n$ are independent and each has the same marginal CDF $F$ (_i.e._ $X_1,...,X_n\sim F$) $\Rightarrow$ $X_1,...,X_n$ are independent and identically distributed (IID).

If $F$ has density $f$, then $X_1,...,X_n\sim f$.

$X_1,...,X_n$ represent a random sample of size $n$ from $F$.

### Two Important Multivariate Distributions

**Multinomial distribution** $X\sim\text{Multinomial}(n,p)$

$f(x) = f(x_1,...,x_k) = \binom{n}{x_1,...,x_k}p_1^{x_1}...p_k^{x_k}$, where $\binom{n}{x_1,...,x_k} = \frac{n!}{x_1!...x_k!}$

$p = (p_1,...,p_k)$, where $p_i \geq 0$ and $\sum_{i=1}^k p_i = 1$

$X = (X_1,...,X_k)$, where $X_i$ - number of appearances of some property with probability $p_i$

$n = \sum_{i=1}^k X_i$

**Lemma**

$X\sim\text{Multinomial}(n,p) \Rightarrow X_i\sim\text{Binomial}(n,p_i)$
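The lemma can be checked numerically by summing the multinomial PMF over the other coordinates. The sketch below assumes $k = 3$ categories with illustrative values $n = 4$ and $p = (0.2, 0.3, 0.5)$.

```python
from math import comb, factorial

def multinomial_pmf(x, p):
    """f(x_1, ..., x_k) = n! / (x_1! ... x_k!) * p_1^{x_1} ... p_k^{x_k}, with n = sum(x)."""
    coef = factorial(sum(x))
    for xi in x:
        coef //= factorial(xi)   # every intermediate quotient is an integer
    prob = float(coef)
    for xi, pi in zip(x, p):
        prob *= pi**xi
    return prob

n, p = 4, (0.2, 0.3, 0.5)   # illustrative values

def marginal_x1(x1):
    """P(X_1 = x1): sum the joint PMF over all (x2, x3) with x1 + x2 + x3 = n."""
    return sum(multinomial_pmf((x1, x2, n - x1 - x2), p) for x2 in range(n - x1 + 1))

for x1 in range(n + 1):
    binom = comb(n, x1) * p[0] ** x1 * (1 - p[0]) ** (n - x1)
    print(x1, marginal_x1(x1), binom)   # the two columns agree up to rounding
```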

---

**Standard multivariate normal distribution** $Z\sim N(\mathbf{0},\mathbf{I})$

$f(z) = \prod_{i=1}^k f(z_i) = \frac{1}{(2\pi)^{k/2}}\exp{\left(-\frac{1}{2}\sum_{i=1}^k z_i^2\right)} = \frac{1}{(2\pi)^{k/2}}\exp{\left(-\frac{1}{2}\mathbf{z}^T\mathbf{z}\right)}$

$Z = (Z_1,...,Z_k)$, where $Z_i\sim N(0,1)$

$\mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$

$\mathbf{I} = \begin{bmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1\end{bmatrix}$

---

**Multivariate normal distribution** $X\sim N(\mathbf{\mu},\mathbf{\Sigma})$

$f(x; \mathbf{\mu},\mathbf{\Sigma}) = \frac{1}{(2\pi)^{k/2}\left|\mathbf{\Sigma}\right|^{1/2}}\exp{\left(-\frac{1}{2}(\mathbf{x-\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{x-\mu})\right)}$

$\mathbf{\Sigma}$ - symmetric and positive definite matrix $\Rightarrow$:
- $\mathbf{\Sigma}^{1/2}$ - symmetric
- $\mathbf{\Sigma} = \mathbf{\Sigma}^{1/2} \mathbf{\Sigma}^{1/2}$
- $\mathbf{\Sigma}^{1/2} \mathbf{\Sigma}^{-1/2} = \mathbf{\Sigma}^{-1/2} \mathbf{\Sigma}^{1/2} = \mathbf{I}$, where $\mathbf{\Sigma}^{-1/2} = (\mathbf{\Sigma}^{1/2})^{-1}$

**Theorem**

$Z\sim N(\mathbf{0},\mathbf{I})$ and $X = \mathbf{\mu} + \mathbf{\Sigma}^{1/2} Z \Rightarrow X\sim N(\mathbf{\mu},\mathbf{\Sigma})$

$X\sim N(\mathbf{\mu},\mathbf{\Sigma}) \Rightarrow \mathbf{\Sigma}^{-1/2} (X - \mathbf{\mu}) \sim N(\mathbf{0},\mathbf{I})$
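The two maps in the theorem are inverse to each other. The sketch below illustrates only that algebra (not the distributional claim) for an assumed $2\times 2$ example, using the closed form $\mathbf{\Sigma}^{1/2} = (\mathbf{\Sigma} + s\mathbf{I})/t$ with $s = \sqrt{\det\mathbf{\Sigma}}$ and $t = \sqrt{\text{tr}\,\mathbf{\Sigma} + 2s}$, which holds for $2\times 2$ symmetric positive definite matrices. The values of `mu`, `Sigma` and `z` are arbitrary.

```python
import math

def sqrt_2x2_spd(S):
    """Symmetric square root of a 2x2 symmetric positive definite matrix."""
    (a, b), (c, d) = S
    s = math.sqrt(a * d - b * c)      # sqrt(det S)
    t = math.sqrt(a + d + 2.0 * s)    # trace of the square root
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

def inv_2x2(A):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

mu = [1.0, -2.0]
Sigma = [[4.0, 1.2], [1.2, 1.0]]
root = sqrt_2x2_spd(Sigma)       # Sigma^{1/2}
root_inv = inv_2x2(root)         # Sigma^{-1/2}

z = [0.5, -1.0]                                                  # a standard-normal realization
x = [m + v for m, v in zip(mu, matvec(root, z))]                 # x = mu + Sigma^{1/2} z
z_back = matvec(root_inv, [xi - mi for xi, mi in zip(x, mu)])    # Sigma^{-1/2} (x - mu)

print(matmul(root, root))   # reproduces Sigma up to rounding
print(z_back)               # reproduces z up to rounding
```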

**Theorem**

$X\sim N(\mathbf{\mu},\mathbf{\Sigma}) \Rightarrow$:
- $X_a \sim N(\mathbf{\mu}_a,\mathbf{\Sigma}_{aa})$
- $X_b|X_a = x_a \sim N(\mathbf{\mu}_b + \mathbf{\Sigma}_{ba} \mathbf{\Sigma}_{aa}^{-1}(\mathbf{x}_a - \mathbf{\mu}_a),\mathbf{\Sigma}_{bb} - \mathbf{\Sigma}_{ba} \mathbf{\Sigma}_{aa}^{-1} \mathbf{\Sigma}_{ab})$
- $\mathbf{a}^T X \sim N(\mathbf{a}^T\mathbf{\mu}, \mathbf{a}^T \mathbf{\Sigma} \mathbf{a})$
- $V = (X-\mathbf{\mu})^T\mathbf{\Sigma}^{-1}(X-\mathbf{\mu}) \sim \chi_k^2$

$X = (X_a, X_b)$

$\mathbf{\mu} = \begin{bmatrix} \mathbf{\mu}_a \\ \mathbf{\mu}_b \end{bmatrix}$

$\mathbf{\Sigma} = \begin{bmatrix} \mathbf{\Sigma}_{aa} & \mathbf{\Sigma}_{ab} \\ \mathbf{\Sigma}_{ba} & \mathbf{\Sigma}_{bb} \end{bmatrix}$

### Transformations of Random Variables

$X\sim f_X$ with $F_X$

$Y = r(X)$

**Discrete case**

$f_Y(y) = \mathbb{P}(Y=y) = \mathbb{P}(r(X)=y) = \mathbb{P}(\{x: r(x)=y\}) = \mathbb{P}(X\in r^{-1}(y))$

**Continuous case**

$F_Y(y) = \mathbb{P}(Y\leq y) = \mathbb{P}(r(X)\leq y) = \mathbb{P}(X\in A_y) = \int_{A_y}f_X(x)dx$, where $A_y=\{x: r(x)\leq y\}$

$f_Y(y) = F_Y'(y)$

---

**Definition**

if $r$ is strictly monotone increasing/decreasing $\Rightarrow \exists s = r^{-1}$:

$f_Y(y) = f_X(s(y))\left|\frac{ds(y)}{dy}\right|$
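A numerical sanity check of the monotone change-of-variables formula, assuming the illustrative case $X\sim N(0,1)$ and $Y = r(X) = e^X$, so that $s(y) = \ln y$ and $\left|\frac{ds(y)}{dy}\right| = \frac{1}{y}$. Integrating the resulting density up to $y_0$ should reproduce $\mathbb{P}(X\leq \ln y_0)$. The midpoint-rule integrator is a simple stand-in for a proper quadrature routine.

```python
import math

def phi(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def f_Y(y):
    """Density of Y = exp(X): f_X(s(y)) * |s'(y)| with s(y) = ln y."""
    if y <= 0:
        return 0.0
    return phi(math.log(y)) * abs(1.0 / y)

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

y0 = 2.0
lhs = integrate(f_Y, 0.0, y0)   # P(Y <= y0) from the transformed density
rhs = Phi(math.log(y0))         # P(X <= ln y0) from the original distribution
print(lhs, rhs)
```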

### Transformations of Several Random Variables

$X\sim f_X$ with $F_X$ and $Y\sim f_Y$ with $F_Y$

$Z = r(X,Y)$

**Discrete case**

$f_Z(z) = \mathbb{P}(Z=z) = \mathbb{P}(r(X,Y)=z) = \mathbb{P}(\{(x,y): r(x,y)=z\}) = \mathbb{P}((X,Y)\in r^{-1}(z))$
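The discrete case amounts to adding up the joint probability of every pair $(x, y)$ that $r$ maps to $z$. A minimal sketch, assuming $X$ and $Y$ are independent fair dice (so the joint PMF is the product of the marginals) and $r(x,y) = x + y$. The helper name `pmf_of_transform` is illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed example: PMF of a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def pmf_of_transform(r, f_x, f_y):
    """PMF of Z = r(X, Y) for independent discrete X and Y:
    accumulate f_X(x) * f_Y(y) over all pairs (x, y) with r(x, y) = z."""
    f_z = defaultdict(Fraction)
    for (x, px), (y, py) in product(f_x.items(), f_y.items()):
        f_z[r(x, y)] += px * py
    return dict(f_z)

f_z = pmf_of_transform(lambda x, y: x + y, die, die)
print(f_z[2])    # 1/36, only the pair (1, 1)
print(f_z[7])    # 1/6, six pairs sum to 7
```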

**Continuous case**

$F_Z(z) = \mathbb{P}(Z\leq z) = \mathbb{P}(r(X, Y)\leq z) = \mathbb{P}((X,Y)\in A_z) = \int\int_{A_z}f_{XY}(x,y)dxdy$, where $A_z=\{(x,y): r(x,y)\leq z\}$

$f_Z(z) = F_Z'(z)$