# Discrete Random Variables
Notes based on [Probability course's chapter 3](https://www.probabilitycourse.com/chapter3/3_1_1_random_variables.php).

## Basic Concepts

### Range
A random variable is a real-valued function that assigns a numerical value to each possible outcome of a random experiment. In other words, it maps a set of possibly non-numerical outcomes (the sample space $S$) to a set of numerical values.

$X : S \rightarrow \mathbb{R}$

Say you flip a coin twice. The possible outcomes are then:

In [1]:
coin = ["Heads", "Tails"]
outcomes = [(c1, c2) for c1 in coin for c2 in coin]
outcomes

[('Heads', 'Heads'),
 ('Heads', 'Tails'),
 ('Tails', 'Heads'),
 ('Tails', 'Tails')]

The possible outcomes form a list (the sample space) of length 4. Each outcome can then be mapped to a numerical value.
The set of numerical values that a random variable can take is called its range.

In [2]:
x_range = list(range(len(outcomes)))
x_range

[0, 1, 2, 3]

In [3]:
mapped_x = list(zip(x_range, outcomes))
mapped_x

[(0, ('Heads', 'Heads')),
 (1, ('Heads', 'Tails')),
 (2, ('Tails', 'Heads')),
 (3, ('Tails', 'Tails'))]

$\text{Range}(X) = R_{X} = \{0,1,2,3\}$

### Difference between Discrete and Continuous random variables

A random variable is considered discrete when its range is countable. A range is a set, and a set is countable when it is either:

* a finite set such as $\{1,2,3\}$, or
* a set that can be put in one-to-one correspondence with the natural numbers. Such sets are said to be countably infinite.
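
As an illustration of the second case, the integers are countably infinite because they can be listed one at a time against the natural numbers. The following is a minimal sketch of one such correspondence; `nat_to_int` is a hypothetical helper name introduced here, not something from the source notes.

```python
def nat_to_int(n):
    """Map a natural number n (0, 1, 2, ...) to an integer in the order 0, -1, 1, -2, 2, ..."""
    if n % 2:
        # Odd n are sent to the negative integers: 1 -> -1, 3 -> -2, ...
        return -((n + 1) // 2)
    # Even n are sent to the non-negative integers: 0 -> 0, 2 -> 1, ...
    return n // 2


first = [nat_to_int(n) for n in range(7)]
print(first)  # [0, -1, 1, -2, 2, -3, 3]
```

Every integer appears exactly once in this listing, so a random variable whose range is the integers would still be discrete.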

### Probability Mass Function (PMF)

Random variables are often denoted by capital letters such as $X, Y, Z$. The numbers in a random variable's range, however, are usually denoted by lowercase letters such as $x_{1}, x_{2}, y_{1}, z_{1}$.

To find the probability of the event $X = x_{k}$, the **probability mass function** may be used. The first step is to define the event:

$A = \{ s \in S | X(s) = x_{k} \}$  
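
The event $A$ can be sketched in code for the two-flip experiment. This is only an illustration: it assumes $X$ counts the number of heads in an outcome, and the names `sample_space`, `X` and `event` are hypothetical helpers chosen for this example.

```python
coin = ["Heads", "Tails"]
# Sample space of two coin flips
sample_space = [(c1, c2) for c1 in coin for c2 in coin]


def X(s):
    """Assumed random variable: the number of heads in outcome s."""
    return s.count("Heads")


def event(x_k):
    """Return A = {s in S | X(s) = x_k} as a list of outcomes."""
    return [s for s in sample_space if X(s) == x_k]


A = event(1)
print(A)  # [('Heads', 'Tails'), ('Tails', 'Heads')]
```

Here two of the four outcomes belong to the event $X = 1$.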

Next, define the **probability mass function** as the probability of that event:

$P_{X}(x_{k}) = P(X = x_{k})$ where
> $R_{X} = \{x_{0}, x_{1}, x_{2},...\}$  
> $k = 0, 1, 2,...$



In [4]:
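# Value of X (number of heads) for each outcome in the sample space
S = sorted([outcome.count("Heads") for outcome in outcomes])
# Range of X: the distinct values that X can take
X_range = sorted(set(S))
S, X_range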

([0, 1, 1, 2], [0, 1, 2])

In [5]:
def PMF(S, i):
    """S : value of X for each equally likely outcome, i : index into S. Returns P(X = S[i])."""
    return S.count(S[i]) / len(S)


In [6]:
for i in range(len(S)):
    print(f"PMF for S(x_k) = {S[i]}: {PMF(S, i)}")

PMF for S(x_k) = 0: 0.25
PMF for S(x_k) = 1: 0.5
PMF for S(x_k) = 1: 0.5
PMF for S(x_k) = 2: 0.25


The **probability mass function** is also called the **probability distribution**.
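
To view the PMF as a distribution over the whole range, it can be collected into a single mapping from each value $x_{k}$ to $P_{X}(x_{k})$. The sketch below assumes a fair coin, so that all four outcomes are equally likely, and uses `Counter` and `Fraction` from the standard library to keep the probabilities exact. The names `values` and `pmf` are introduced here for illustration.

```python
from collections import Counter
from fractions import Fraction

coin = ["Heads", "Tails"]
outcomes = [(c1, c2) for c1 in coin for c2 in coin]

# Value of X (number of heads) for each equally likely outcome
values = [outcome.count("Heads") for outcome in outcomes]

# P_X(x_k) = (number of outcomes with X = x_k) / (total number of outcomes)
pmf = {x: Fraction(n, len(values)) for x, n in sorted(Counter(values).items())}

for x, p in pmf.items():
    print(f"P_X({x}) = {p}")

# The probabilities over the whole range add up to 1
print(sum(pmf.values()))
```

Each value in the range appears once, and the probabilities sum to 1, as they must for any PMF.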