## Random Variables
In discussing the rules of probability we encountered the notion of $\Omega$, the set of all possible outcomes of a random process. 

To link events such as $E$, and the sample space $\Omega$ that contains them, to data, we must introduce the concept of random variables. The following definition is taken from Larry Wasserman's *All of Statistics*.

**Definition**. A random variable is a mapping

$$ X: \Omega \rightarrow \mathbb{R}$$

that assigns a real number $X(\omega)$ to each outcome $\omega$. $\Omega$ is the sample space. Points
$\omega$ in $\Omega$ are called sample outcomes, realizations, or elements. Subsets of
$\Omega$ are called events. For example, if $\omega = HHTTTTHTT$ and $X$ is defined as the number of heads in the sequence $\omega$, then $X(\omega) = 3$.
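Viewing a random variable as an ordinary function can make the definition more concrete. The following is a minimal sketch, assuming outcomes are encoded as strings of `H` and `T`; the name `num_heads` is made up for illustration.

```python
def num_heads(omega: str) -> int:
    """Random variable X: map an outcome (a string of 'H'/'T') to its head count."""
    return omega.count("H")

omega = "HHTTTTHTT"   # one sample outcome from the text
print(num_heads(omega))  # 3
```

Nothing about `num_heads` is random. The randomness lives in which outcome $\omega$ occurs, and $X$ simply reports a number for it.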

We will assign a real number $P(A)$ to every event $A$, called the probability of
$A$. We also call $P$ a probability distribution or a probability measure.
To qualify as a probability, $P$ must satisfy three axioms: $P(A) \ge 0$ for every event $A$, $P(\Omega)=1$, and the probabilities of disjoint events add (for instance, $P(A \cup B) = P(A) + P(B)$ when $A$ and $B$ are disjoint).
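As a quick check of the three axioms on a small case, here is a sketch for two tosses of a coin. It assumes the four outcomes are equally likely (a fair coin), and the names `omega_space`, `prob_outcome`, `A`, and `B` are illustrative choices rather than anything defined above.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses; equal weights are an assumption (fair coin).
omega_space = ["".join(t) for t in product("HT", repeat=2)]
prob_outcome = {w: Fraction(1, len(omega_space)) for w in omega_space}

def P(event):
    """Probability of an event, i.e. of a subset of the sample space."""
    return sum((prob_outcome[w] for w in event), Fraction(0))

A = {"HH"}          # both tosses are heads
B = {"HT", "TH"}    # exactly one head; disjoint from A

print(all(P({w}) >= 0 for w in omega_space))  # non-negativity
print(P(set(omega_space)))                    # 1
print(P(A | B) == P(A) + P(B))                # True
```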

## A Note on Notation
Notation in probability is uniformly sloppy, especially regarding random variables. 

Taking the Binomial Distribution as an example, one might see the following **correct** notation:

$$
  X \sim Binom(n,p) \\
  P(X=k\,|\,N=n,P=p)=  \binom{n}{k} p^k (1-p)^{(n-k)}
$$

The key thing here is that _events, and only events_ have probability. A probability distribution is the probability of each of various events (often given some pre-conditions or assumptions).

The lower notation makes this explicit: It's the probability that X (standing for the number of heads we _might_ observe) takes a particular value k (e.g. 5 heads), **given** that we are making n flips and the bias on the coin is p. It also gives us a plug-in formula to compute such a probability, and every value used in the formula is put either before the bar (random, part of the probability, value not yet set) or after the bar (has a known value).

Notably, we know that summing over all values of $k$ gives 1, while summing or integrating over the other parameters almost certainly will not.
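The normalization claim can be checked numerically. The sketch below uses the plug-in formula from above with arbitrarily chosen values $n=10$, $p=0.3$, and $k=3$; the helper name `binom_pmf` and the midpoint-rule integration are illustrative assumptions.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k | N = n, P = p) for the Binomial distribution."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3

# Adding up over every value the random variable can take gives 1.
total_over_k = sum(binom_pmf(k, n, p) for k in range(n + 1))
print(total_over_k)  # approximately 1

# Integrating over the parameter p (with k fixed) does not give 1.
k = 3
m = 10_000  # number of midpoint-rule slices on [0, 1]
integral_over_p = sum(binom_pmf(k, n, (i + 0.5) / m) for i in range(m)) / m
print(integral_over_p)  # approximately 1/11
```

For the Binomial formula the integral over $p$ works out to $1/(n+1)$, which is why only the quantity in front of the bar can be treated as normalized.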

Or one might see any of the following **conventions** or **abuses**:
$$
  P(X)\\
  P(x)\\
  P(X=x) = Binom(x;n,p)\\
  P(X=x) = \binom{n}{x} p^x(1-p)^{(n-x)}\\
  P(X=k) = \binom{n}{k} p^k(1-p)^{(n-k)}\\
  X|N,P\\
  X|n,p\\
  P(x|Data)
$$

Whenever you see any of the above, try to translate back to the explicit situation above: what is given/known/behind the bar, what is random/unknown/ahead of the bar, and what values does each thing take? For instance, $P(x|Data)$ is meant to be $P(X=x|Data=observed\ data)$.

### Summary
Any capital letter like $X$ is a random variable. It's some quantity that can take on multiple possible values. We can talk about the probability of it taking on some value $x$ or $k$ or $5$. And, below, we can make graphs that show the probability of $X=k$ or $X\le k$ for various values of k.

## Cumulative Distribution Function

The **cumulative distribution function**, or the **CDF**, is a function

$$F_X : \mathbb{R} \rightarrow [0, 1],$$

defined by

$$F_X (x) = p(X \le x).$$

A note on notation: $X$ is a random variable while $x$ is a particular value of the random variable.

Let $X$ be the random variable representing the number of heads in two coin tosses. Then $X$ can take on the values 0, 1, and 2. The CDF for this random variable can be drawn thus (taken from *All of Statistics*):

![](images/2tosscdf.png)

Notice that this function is right-continuous and defined for all real $x$, even though $X$ itself only takes the integer values 0, 1, and 2.
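The step shape of this CDF can be reproduced directly from the definition $F_X(x) = p(X \le x)$. This is a minimal sketch assuming a fair coin, so the masses at 0, 1, and 2 are 1/4, 1/2, and 1/4; the names `pmf` and `cdf` are illustrative.

```python
from fractions import Fraction

# Mass at each value of X = number of heads in two tosses (fair coin assumed).
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x: float) -> Fraction:
    """F_X(x) = p(X <= x): add up the mass at every value not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

# The CDF is defined at every real x, not only at 0, 1, and 2.
for x in [-1, 0, 0.5, 1, 1.99, 2, 3]:
    print(x, cdf(x))
```

Because the comparison is `value <= x`, the jump at each integer is included at that integer, which is what right-continuity means here.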

## Probability Mass and Density Functions

$X$ is called a **discrete random variable** if it takes countably many values $\{x_1, x_2, \ldots\}$. We define the **probability function** or the **probability mass function** (**pmf**) for $X$ by:

$$f_X(x) = p(X=x)$$

$f_X$ **is a probability**.

The pmf for the number of heads in two coin tosses (taken from *All of Statistics*) looks like this:

![](images/2tosspmf.png)
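The same pmf can be obtained by enumerating the sample space and counting. The sketch below assumes the four outcomes of two tosses are equally likely (a fair coin); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of two tosses, e.g. ('H', 'T'); equal likelihood is assumed.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes map to each value of X = number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# f_X(x) = p(X = x) = (outcomes with x heads) / (all outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, prob in pmf.items():
    print(x, prob)
print(sum(pmf.values()))  # 1
```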

On the other hand, a random variable is called a **continuous random variable** if there exists a function $f_X$ such that $f_X (x) \ge 0$ for all $x$, $\int_{-\infty}^{\infty} f_X (x) dx = 1$, and for every $a \le b$,

$$p(a < X < b) = \int_{a}^{b} f_X (x) dx$$

The function $f_X$ is called the probability density function (pdf). We have the CDF:

$$F_X (x) = \int_{-\infty}^{x}f_X (t) dt $$

and $f_X (x) = \frac{d F_X (x)}{dx}$ at all points x at which $F_X$ is differentiable.

Continuous variables are confusing. Note:

1. $p(X=x) = 0$ for every $x$. You **cannot think** of $f_X(x)$ as $p(X=x)$; that reading holds only for discrete random variables. You can only get probabilities from a pdf by integrating, even if only over a very small part of the space.
2. A pdf can be bigger than 1, unlike a probability mass function, whose values are actual probabilities.
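Both points can be seen numerically. As an illustrative assumption (this density does not appear in the text), take a variable that is uniform on $[0, 0.5]$, whose density is 2 on that interval; `integrate` is a hand-rolled midpoint rule written for this sketch.

```python
def pdf(x: float) -> float:
    """Density of a variable uniform on [0, 0.5] (illustrative choice)."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

print(pdf(0.25))                    # 2.0, a density value above 1
print(integrate(pdf, 0.0, 0.5))     # close to 1: total probability
print(integrate(pdf, 0.2, 0.3))     # close to 0.2: p(0.2 < X < 0.3)
print(integrate(pdf, 0.25, 0.25))   # 0.0: a single point carries no probability
```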

### A continuous example: the Uniform Distribution

Suppose that X has pdf
$$
f_X (x) =
\begin{cases}
1 & \text{for } 0 \leq x\leq 1\\
    0             & \text{otherwise.}
\end{cases}
$$
A random variable with this density is said to have a Uniform(0,1) distribution. This is meant to capture the idea of choosing a point at random between 0 and 1. The CDF is given by:
$$
F_X (x) =
\begin{cases}
0 & x < 0\\
x & 0 \leq x \leq 1\\
1 & x > 1.
\end{cases}
$$
and can be visualized as follows (again from *All of Statistics*):

![](images/unicdf.png)
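One way to check this CDF is to compare it with the fraction of simulated draws that fall at or below a given $x$. The sketch below assumes Python's `random.random()` as the source of Uniform(0,1) draws; the sample size and the seed are arbitrary choices.

```python
import random

def uniform_cdf(x: float) -> float:
    """CDF of the Uniform(0,1) distribution."""
    if x < 0:
        return 0.0
    if x <= 1:
        return x
    return 1.0

random.seed(0)  # fixed seed so the run is repeatable
samples = [random.random() for _ in range(100_000)]

def empirical_cdf(x: float) -> float:
    """Fraction of simulated draws that are <= x."""
    return sum(s <= x for s in samples) / len(samples)

for x in [-0.5, 0.1, 0.3, 0.9, 1.5]:
    print(x, uniform_cdf(x), empirical_cdf(x))
```

The two columns should agree to within sampling noise, and the agreement tightens as the number of draws grows.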

### A discrete example: the Bernoulli Distribution

The **Bernoulli Distribution** represents the distribution of a coin flip. Let the random variable $X$ represent such a coin flip, where $X=1$ is heads, and $X=0$ is tails. Let us further say that the probability of heads is $p$ ($p=0.5$ is a fair coin).

We then say:

$$X \sim Bernoulli(p)$$

which is to be read as $X$ **has distribution** $Bernoulli(p)$. The pmf or probability function associated with the Bernoulli distribution is
$$
f(x) =
\begin{cases}
1 - p & x = 0\\
p & x = 1.
\end{cases}
$$


for $p$ in the range 0 to 1. This pmf may be written as

$$f(x) = p^x (1-p)^{1-x}$$

for $x$ in the set $\{0,1\}$.

$p$ is called a parameter of the Bernoulli distribution.
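The compact formula and the two-case definition can be compared directly, and the parameter $p$ can be recovered approximately from simulated flips. This is a minimal sketch: the function names, the value $p=0.3$, and the seed are illustrative assumptions.

```python
import random

def bernoulli_pmf(x: int, p: float) -> float:
    """f(x) = p^x (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1 - p) ** (1 - x)

def bernoulli_sample(p: float, rng: random.Random) -> int:
    """Draw one flip: 1 (heads) with probability p, else 0 (tails)."""
    return 1 if rng.random() < p else 0

p = 0.3
print(bernoulli_pmf(1, p))  # equals p
print(bernoulli_pmf(0, p))  # equals 1 - p

rng = random.Random(0)  # seeded generator so the run is repeatable
flips = [bernoulli_sample(p, rng) for _ in range(100_000)]
print(sum(flips) / len(flips))  # fraction of heads, close to p
```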