---
title: "Random Variables"
author: "Daniel Smith"
date: "2024-01-16"
categories: [Mathematics, Probability Theory]
title-block-banner: false
jupyter: python3
draft: true
---

A gentle overview of the theory of random variables is provided for those with a basic knowledge of measure theory. Throughout we fix a probability space $(\Omega,\mathcal{F},\mathbb{P})$ and consider $\mathbb{R}$ equipped with the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$.

## Definition - Random Variables

A **random variable** $X$ is a measurable map
$X:\Omega\rightarrow\mathbb{R}$.\
That is, $X:\Omega\rightarrow\mathbb{R}$ is a random variable if for all measurable $A\subset\mathbb{R}$ the pre-image

$$X^{-1}(A) := \{\omega \in \Omega \,|\, X(\omega)\in A \} \subset \Omega$$
is a measurable subset of $\Omega$, i.e. is an element of $\mathcal{F}$.


\
For measurable $A\subset\mathbb{R}$ we introduce the notation
$$\mathbb{P}(X \in A) := \mathbb{P}\left(X^{-1}(A)\right) = \mathbb{P} \left( \left\{ \omega \in \Omega \,|\, X(\omega)\in A  \right\} \right).$$
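
On a finite sample space with $\mathcal{F}$ taken to be the power set, every map $X$ is measurable, so the pre-image and the notation $\mathbb{P}(X\in A)$ can be computed by direct enumeration. The following is a minimal sketch under that assumption; the sample space (one roll of a fair die) and the helper names `preimage` and `prob_in` are illustrative choices, not part of the theory above.

```python
from fractions import Fraction

# Hypothetical finite sample space: the outcomes of one roll of a fair die.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega}  # P({w}) for each outcome w


def X(w):
    """Random variable: 1 if the roll is even, 0 otherwise."""
    return 1 if w % 2 == 0 else 0


def preimage(rv, A):
    """Return rv^{-1}(A) = {w in omega : rv(w) in A} for a finite set A."""
    return {w for w in omega if rv(w) in A}


def prob_in(rv, A):
    """Return P(rv in A) = P(rv^{-1}(A))."""
    return sum((prob[w] for w in preimage(rv, A)), Fraction(0))


print(preimage(X, {1}))  # the even outcomes
print(prob_in(X, {1}))   # 1/2
```

Exact fractions are used instead of floats so that the probabilities can be compared without rounding error.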


A **stochastic process** is an indexed family of random variables
${\{X_t\}_{t\in T}}$ where the indexing set $T$ is not necessarily
countable, and the index $t$ is often interpreted as time.

---

## Definition - The Distribution of a Random Variable

Given a random variable $X$ the map
\begin{align*}
\mathcal{L}_X: \mathcal{B}(\mathbb{R}) &\longrightarrow [0,1]\\
A &\longmapsto \mathbb{P}(X\in A)
\end{align*}

is called the **distribution** (or the **law**) of $X$. The distribution completely describes the random variable and there is often no need to explicitly refer to the sample space $\Omega$.

The distribution $\mathcal{L}_X$ is a (Borel) probability measure on $\left(\mathbb{R},\mathcal{B}(\mathbb{R})\right)$ and is the pushforward/image measure on $\left(\mathbb{R},\mathcal{B}(\mathbb{R})\right)$ induced from $\mathbb{P}$ by $X$. By abuse of notation it is common to write

$$\mathcal{L}_X = \mathbb{P} \circ X^{-1}$$

The **cumulative distribution function** (CDF) $F_X : \mathbb{R} \rightarrow [0,1]$ of a random variable
$X$ is defined by 

$$F_X(x) = \mathbb{P}(X\leq x) := \mathbb{P}(X\in (-\infty,x]) = \mathcal{L}_X((-\infty,x]).$$
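
For a finite sample space the pushforward measure and the CDF can be tabulated explicitly. The sketch below assumes a hypothetical sample space of two rolls of a fair die and takes $X$ to be the sum of the rolls; the function names `law` and `cdf` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical sample space: ordered pairs from two rolls of a fair die.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
prob = {w: Fraction(1, 36) for w in omega}


def X(w):
    """Random variable: the sum of the two rolls."""
    return w[0] + w[1]


def law(rv):
    """Pushforward of prob under rv, stored as {value: probability}."""
    masses = defaultdict(Fraction)
    for w in omega:
        masses[rv(w)] += prob[w]
    return dict(masses)


def cdf(rv, x):
    """F(x) = law((-inf, x]), summing the mass at all values <= x."""
    return sum((p for value, p in law(rv).items() if value <= x), Fraction(0))


print(law(X)[7])   # 1/6
print(cdf(X, 4))   # 1/6
```

Note that once `law(X)` has been computed, `cdf` no longer refers to `omega` at all, which mirrors the remark that the sample space can often be left implicit.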

---

## Remark - Random Variables vs Distributions

Do not confuse a random variable $X: \Omega \rightarrow \mathbb{R}$ with its distribution $\mathcal{L}_X: \mathcal{B}(\mathbb{R}) \rightarrow [0,1]$.

The distribution $\mathcal{L}_X$ completely describes the random variable $X$ in the sense that we can recover any probability $\mathbb{P}(X\in A)$ from the distribution $\mathcal{L}_X$ via

$$\mathbb{P}(X\in A) = \mathcal{L}_X(A).$$

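Since the probabilities of outcomes are the only physically relevant features of $X$, the distribution $\mathcal{L}_X$ carries all useful information about $X$. However, very different random variables can have the same distribution. For example, if $X$ takes the values $0$ and $1$ each with probability $1/2$, then $Y = 1 - X$ has the same distribution as $X$ even though $X(\omega) \neq Y(\omega)$ for every $\omega\in\Omega$.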
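
The distinction can be made concrete on a two-point sample space. The following sketch assumes a single flip of a fair coin and compares the indicator of heads with the indicator of tails; the names `X`, `Y` and `law` are illustrative.

```python
from fractions import Fraction

# Hypothetical sample space: a single flip of a fair coin.
omega = ["H", "T"]
prob = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def X(w):
    """Indicator of heads."""
    return 1 if w == "H" else 0


def Y(w):
    """Indicator of tails, i.e. Y = 1 - X."""
    return 1 - X(w)


def law(rv):
    """Distribution of rv as a dict {value: probability}."""
    masses = {}
    for w in omega:
        masses[rv(w)] = masses.get(rv(w), Fraction(0)) + prob[w]
    return masses


print(law(X) == law(Y))                  # True: same distribution
print(any(X(w) == Y(w) for w in omega))  # False: the maps never agree
```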


---

## Lemma - New Random Variables from Old

By standard properties of measurable functions, if $X,\,Y : \Omega \rightarrow \mathbb{R}$ are random variables, $\lambda\in\mathbb{R}$ and $g:\mathbb{R}\rightarrow\mathbb{R}$ is measurable, then:

- $X + Y$ is a random variable
- $XY$ is a random variable
- $\lambda X$ is a random variable (in particular, combining this with the previous sum rule, $X - Y = X + (-1)Y$ is a random variable)
- $X/Y$ is a random variable if $Y$ is never zero
- $g(X) = g \circ X : \Omega \rightarrow \mathbb{R}$ is a random variable
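
All of these constructions are pointwise operations on maps $\Omega\rightarrow\mathbb{R}$. As a rough illustration, the sketch below builds each of them on a hypothetical finite sample space (two rolls of a fair die, where measurability is automatic) and evaluates one probability for each; the constant `LAMBDA`, the function `g` and the helper `prob_event` are assumptions made for the example.

```python
from fractions import Fraction

# Hypothetical sample space: ordered pairs from two rolls of a fair die.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
prob = {w: Fraction(1, 36) for w in omega}

LAMBDA = Fraction(1, 2)  # an arbitrary scalar


def X(w):
    return w[0]  # first roll


def Y(w):
    return w[1]  # second roll (never zero)


def g(x):
    return x * x  # a measurable function on the reals


# New random variables, each defined pointwise on omega.
def total(w):
    return X(w) + Y(w)


def times(w):
    return X(w) * Y(w)


def scaled(w):
    return LAMBDA * X(w)


def ratio(w):
    return Fraction(X(w), Y(w))


def g_of_X(w):
    return g(X(w))


def prob_event(rv, predicate):
    """P(predicate(rv) holds), by summing over the sample space."""
    return sum((prob[w] for w in omega if predicate(rv(w))), Fraction(0))


print(prob_event(total, lambda v: v == 7))   # 1/6
print(prob_event(g_of_X, lambda v: v <= 4))  # 1/3
```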

---

## Definition - Discrete Random Variables/Distributions

A random variable $X:\Omega\rightarrow\mathbb{R}$ is called **discrete** if there exists a countable set $C\subset\mathbb{R}$ with $\mathcal{L}_X(C)=1$.

In particular, if $X$ exclusively takes values in a discrete set such as $\mathbb{N}$ or $\{0,1,\dots,k\}$ then $X$ is a discrete random variable.

For discrete random variables it is useful to consider the probability of an individual outcome $k\in\mathbb{R}$:

\begin{align*}
\mathbb{P}(X=k) &= \mathbb{P}\left(X^{-1}(\{k\})\right)\\
&= \mathcal{L}_X\left(\{k\}\right).
\end{align*}

We can then define the **probability mass function** (PMF) $p_X : \mathbb{R} \rightarrow [0,1]$ of $X$ by $p_X(x) = \mathbb{P}(X=x).$

The probability mass function $p_X$ is related to the cumulative distribution function $F_X$ via

$$F_X(x) = \sum_{x_i \leq x}p_X(x_i) = \sum_{x_i \leq x}\mathbb{P}(X=x_i),$$

where the sum runs over the countably many values $x_i \leq x$ with $p_X(x_i) > 0$.
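
The relation between the PMF and the CDF translates directly into code. The sketch below assumes a hypothetical PMF with finitely many atoms, stored as a dictionary; the name `cdf_from_pmf` is illustrative.

```python
from fractions import Fraction

# Hypothetical PMF with three atoms, stored as {value: probability}.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def cdf_from_pmf(pmf, x):
    """F(x) as the sum of p(x_i) over all atoms x_i <= x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


print(cdf_from_pmf(pmf, 0.5))  # 1/4
print(cdf_from_pmf(pmf, 1))    # 3/4
```

The resulting CDF is a step function: it is constant between atoms and jumps by $p_X(x_i)$ at each atom $x_i$.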

---

## Example - Bernoulli Distribution

Consider a random variable $X$ taking values in $\{0,1\}$ with probabilities

\begin{align*}
\mathbb{P}(X=1) &= p \\
\mathbb{P}(X=0) &= 1-p
\end{align*}

for some parameter $p\in[0,1]$. 

Then $X$ follows the **Bernoulli distribution** with parameter $p$, denoted $X \sim \text{Ber} (p)$.

Equivalently, a Bernoulli random variable $X$ has probability mass function 

\begin{align*}
p_X(x) = 
\begin{cases}
p \,&\text{if}\, x=1\\
1-p \,&\text{if}\, x=0\\
0 \,&\text{otherwise.}
\end{cases}
\end{align*}

Informally, the Bernoulli distribution models a single random experiment with a yes/no outcome.

For example, a single flip of a coin is a *Bernoulli trial* with probability of success (heads) $p=0.5$, assuming the coin is unbiased.
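
A Bernoulli trial is straightforward to simulate with Python's standard `random` module. The following is a minimal sketch; the helper names `bernoulli_pmf` and `sample_bernoulli`, the seed and the number of flips are arbitrary choices made for the example.

```python
import random


def bernoulli_pmf(x, p):
    """PMF of Ber(p): p at 1, 1 - p at 0, and 0 elsewhere."""
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0.0


def sample_bernoulli(p, rng):
    """Draw one Bernoulli trial: 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n_flips = 100_000
flips = [sample_bernoulli(0.5, rng) for _ in range(n_flips)]
frequency = sum(flips) / n_flips
print(frequency)  # should be close to 0.5
```

The empirical frequency of heads approximates $p$ but will not equal it exactly; the gap typically shrinks as the number of flips grows.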

---

## Example - Binomial Distribution

Consider a random variable $X$ taking values in $\{0,1,\dots,n\}$ for some positive integer $n$ with probabilities

$$ \mathbb{P}(X=k) = {n \choose k}p^k(1-p)^{n-k} $$

for $k\in\{0,1,\dots,n\}$ and some parameter $p\in[0,1]$.

Then $X$ follows the **Binomial distribution** with parameters $n$ and $p$, denoted $X \sim \text{Bin} (n,p)$.

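A binomial random variable arises as the sum of $n$ independent Bernoulli random variables $X_j \sim \text{Ber}(p)$ defined on the same sample space:

$$X = X_1 + X_2 + \dots + X_n$$

because $X=k$ if and only if exactly $k$ of the summands take the value 1. By independence each such configuration has probability $p^k(1-p)^{n-k}$, and there are ${n \choose k}$ of them.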
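
The sum representation can be checked exactly for small $n$ by enumerating all $2^n$ outcomes of the Bernoulli summands. The sketch below assumes the summands are independent with a common parameter $p$; the values $n=4$, $p=1/3$ and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pmf_of_bernoulli_sum(n, p):
    """PMF of X_1 + ... + X_n for independent Ber(p) summands, by enumeration."""
    masses = {k: Fraction(0) for k in range(n + 1)}
    for outcome in product([0, 1], repeat=n):
        k = sum(outcome)
        # Independence: the joint probability is the product of the marginals.
        masses[k] += p**k * (1 - p) ** (n - k)
    return masses


n, p = 4, Fraction(1, 3)  # illustrative values
by_enumeration = pmf_of_bernoulli_sum(n, p)
by_formula = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
print(by_enumeration == by_formula)  # True
```

Each value of $k$ is hit by exactly ${n \choose k}$ of the enumerated outcomes, which is where the binomial coefficient in the formula comes from.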