RANDOM VARIABLES (COURSE NOTES)

To mathematically reason about a random variable, we need to somehow keep track of the full range of possibilities for what the random variable's value could be, and how probable different instantiations of the random variable are. The resulting formalism may at first seem a bit odd but as we progress through the course, it will become more apparent how this formalism helps us study real-world problems and address these problems with powerful solutions.

To build up to the formalism, first note, computationally, what happened in the code in the previous part.

1. First, there is an underlying probability space $(Ω,P)$, where $\Omega = \{ \text {sunny}, \text {rainy}, \text {snowy}\}$, and

$$\begin{eqnarray}
\mathbb{P}(\text{sunny}) &=& 1/2, \\
\mathbb{P}(\text{rainy}) &=& 1/6, \\
\mathbb{P}(\text{snowy}) &=& 1/3.
\end{eqnarray}$$

2. A random outcome $ω∈Ω$ is sampled using the probabilities given by the probability space $(Ω,P)$. This step corresponds to an underlying experiment happening.

3. Two random variables are generated:
    - $W$ is set to be equal to $ω$. As an equation:
    $$\begin{eqnarray}
    W(\omega) &=&\omega\quad\text{for }\omega\in\{\text{sunny},\text{rainy},\text{snowy}\}.
    \end{eqnarray}$$
    This step perhaps seems entirely unnecessary, as you might wonder, “Why not just call the random outcome $W$ instead of $ω$?” Indeed, this step isn't actually necessary for this particular example, but the formalism for random variables has this step to deal with what happens when we encounter a random variable like $I$.

    - $I$ is set to $1$ if $ω=\text{sunny}$, and $0$ otherwise. As an equation:
    $$\begin{eqnarray}
    I(\omega)
    &=&
    \begin{cases}
      1 & \text{if }\omega=\text{sunny}, \\
      0 & \text{if }\omega\in\{\text{rainy},\text{snowy}\}.
    \end{cases}
    \end{eqnarray}$$
    Importantly, multiple possible outcomes (rainy or snowy) get mapped to the same value $0$ that $I$ can take on.

We see that random variable $W$ maps the sample space $\Omega =\{ \text {sunny},\text {rainy},\text {snowy}\}$ to the same set $\{ \text {sunny},\text {rainy},\text {snowy}\}$. Meanwhile, random variable $I$ maps the sample space $\Omega =\{ \text {sunny},\text {rainy},\text {snowy}\}$ to the set $\{0,1\}$.

In general:

**Definition of a “finite random variable” (in this course, we will just call this a “random variable”):** Given a finite probability space $(Ω,P)$, a finite random variable $X$ is a mapping from the sample space $Ω$ to a set of values $\mathcal{X}$ that random variable $X$ can take on. (We will often call $\mathcal{X}$ the “alphabet” of random variable $X$.)

For example, random variable $W$ takes on values in the alphabet $\{ \text {sunny},\text {rainy},\text {snowy}\}$, and random variable $I$ takes on values in the alphabet $\{0,1\}$.

**Quick summary:** There's an underlying experiment corresponding to probability space $(Ω,P)$. Once the experiment is run, let $ω∈Ω$ denote the outcome of the experiment. Then the random variable takes on the specific value of $X(ω)∈\mathcal{X}$.

**Explanation using a picture:** Continuing with the weather example, we can pictorially see what's going on by looking at the probability tables for: the original probability space, the random variable $W$, and the random variable $I$:

![img](../images/images_sec-random-variables-main.png)

These tables make it clear that a “random variable" really is just reassigning/relabeling what the values are for the possible outcomes in the underlying probability space (given by the top left table):

- In the top right table, random variable $W$ does not do any sort of relabeling so its probability table looks the same as that of the underlying probability space.

- In the bottom left table, the random variable $I$ relabels/reassigns “sunny” to $1$, and both “rainy” and “snowy” to $0$. Intuitively, since two of the rows now have the same label $0$, it makes sense to just combine these two rows, adding their probabilities $(1/6+1/3=1/2)$. This results in the bottom right table.
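
The relabel-and-combine step can be sketched in a few lines of Python. This is a minimal illustration rather than the course's own code: the names `space`, `I_mapping`, and `I_table` are chosen here for the example, and `fractions.Fraction` is used so that the arithmetic stays exact.

```python
from fractions import Fraction

# Underlying probability space from the weather example.
space = {'sunny': Fraction(1, 2), 'rainy': Fraction(1, 6), 'snowy': Fraction(1, 3)}

# Relabeling performed by the random variable I.
I_mapping = {'sunny': 1, 'rainy': 0, 'snowy': 0}

# Rows that receive the same label are merged by adding their probabilities.
I_table = {}
for outcome, prob in space.items():
    label = I_mapping[outcome]
    I_table[label] = I_table.get(label, 0) + prob

print(I_table)  # {1: Fraction(1, 2), 0: Fraction(1, 2)}
```

The two rows labeled $0$ contribute $1/6$ and $1/3$, which is where the combined entry $1/2$ comes from.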

**Technical note:** Even though the formal definition of a finite random variable doesn't actually make use of the probability assignment P, the probability assignment will become essential as soon as we talk about how probability works with random variables.

## Two Ways to Specify a Random Variable in Code

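**Approach 1.** Go with the mathematical definition of a random variable. After importing the course's helper module, specify what the underlying probability space is, define the mappings for $W$ and $I$, and then sample an outcome and apply the mappings to it: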

In [3]:
import sys
sys.path.append('../comp_prob_inference')
from comp_prob_inference import *

In [1]:
prob_space = {'sunny': 1/2, 'rainy': 1/6, 'snowy': 1/3}

In [2]:
W_mapping = {'sunny': 'sunny', 'rainy': 'rainy', 'snowy': 'snowy'}
I_mapping = {'sunny': 1, 'rainy': 0, 'snowy': 0}

In [18]:
random_outcome = sample_from_finite_probability_space(prob_space)
W = W_mapping[random_outcome]
I = I_mapping[random_outcome]

In [19]:
W

'snowy'
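
The helper `sample_from_finite_probability_space` comes from the course's `comp_prob_inference` module, whose source is not shown in these notes. For readers without that module, a minimal stand-in might look like the following. The function name `sample_from_table` and its behavior are assumptions made for illustration, not the module's actual implementation.

```python
import random

def sample_from_table(table, rng=random):
    """Draw one outcome from a dict mapping outcomes to probabilities.

    Hypothetical stand-in for the course helper; assumes the values in
    `table` are nonnegative and sum to 1.
    """
    outcomes = list(table.keys())
    weights = list(table.values())
    # choices() returns a list of k samples; take the single draw.
    return rng.choices(outcomes, weights=weights, k=1)[0]

prob_space = {'sunny': 1/2, 'rainy': 1/6, 'snowy': 1/3}
I_mapping = {'sunny': 1, 'rainy': 0, 'snowy': 0}

rng = random.Random(0)  # seeded so the run is repeatable
outcome = sample_from_table(prob_space, rng)
print(outcome, I_mapping[outcome])
```

Passing a seeded `random.Random` instance keeps the example reproducible; omitting `rng` falls back to the module-level generator.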

**Approach 2.** Remember how we wrote out probability tables for random variables $W$ and $I$? Let's directly store these probability tables:

In [20]:
W_table = {'sunny': 1/2, 'rainy': 1/6, 'snowy': 1/3}
I_table = {0: 1/2, 1: 1/2}

In [22]:
W = sample_from_finite_probability_space(W_table)
I = sample_from_finite_probability_space(I_table)
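
One caveat about Approach 2 is worth seeing in code: drawing $W$ and $I$ separately from their own tables reproduces each table, but not the link between them. In Approach 1, $I=1$ exactly when $W$ is sunny, because both are computed from the same outcome. The sketch below counts how often that link fails under each approach; `draw` is a hypothetical helper built on `random.choices`, standing in for the course's sampler.

```python
import random

rng = random.Random(0)  # seeded for repeatability

def draw(table):
    """Hypothetical sampler: one draw from a dict of outcome -> probability."""
    return rng.choices(list(table), weights=list(table.values()), k=1)[0]

prob_space = {'sunny': 1/2, 'rainy': 1/6, 'snowy': 1/3}
W_mapping = {'sunny': 'sunny', 'rainy': 'rainy', 'snowy': 'snowy'}
I_mapping = {'sunny': 1, 'rainy': 0, 'snowy': 0}
W_table = {'sunny': 1/2, 'rainy': 1/6, 'snowy': 1/3}
I_table = {0: 1/2, 1: 1/2}

n = 10_000
mismatches_1 = 0  # Approach 1: W and I share one outcome
mismatches_2 = 0  # Approach 2: W and I drawn separately
for _ in range(n):
    omega = draw(prob_space)
    W, I = W_mapping[omega], I_mapping[omega]
    mismatches_1 += (I == 1) != (W == 'sunny')

    W, I = draw(W_table), draw(I_table)
    mismatches_2 += (I == 1) != (W == 'sunny')

print(mismatches_1)      # 0: the link always holds
print(mismatches_2 / n)  # roughly 0.5
```

So the two approaches agree on the individual tables for $W$ and $I$, but only Approach 1 keeps the two random variables tied to a single underlying experiment.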

## Random Variables Notation and Terminology

In this course, we denote random variables with capital/uppercase letters, such as $X$, $W$, $I$, etc. We use the phrases “probability table”, “probability mass function” (abbreviated as PMF), and “probability distribution” (often simply called a distribution) to mean the same thing, and in particular we denote the probability table for $X$ by $p_X$ or $p_X(⋅)$.

We write $p_X(x)$ to denote the entry of the probability table that has label $x∈\mathcal{X}$, where $\mathcal{X}$ is the set of values that random variable $X$ takes on. Note that we use lowercase letters like $x$ to denote variables storing nonrandom values. We can also look up values in a probability table using specific outcomes, e.g., from earlier, we have $p_W(\text{rainy})=1/6$ and $p_I(1)=1/2$.

Note that we use the same notation as in math, where a function $f$ might also be written as $f(⋅)$ to explicitly indicate that it is a function of one variable. Both $f$ and $f(⋅)$ refer to a function, whereas $f(x)$ refers to the value of the function $f$ evaluated at the point $x$.

As an example of how to use all this notation, recall that a probability table consists of nonnegative entries that add up to $1$. In fact, each of the entries is at most $1$ (otherwise the numbers would add to more than $1$). For a random variable $X$ taking on values in $\mathcal{X}$, we can write out these constraints as:

$$0 \le p_ X(x) \le 1\quad \text {for all }x\in \mathcal{X}, \qquad \sum _{x \in \mathcal{X}} p_ X(x) = 1.$$
 
Often in the course, if we are making statements about all possible outcomes of $X$, we will omit writing out the alphabet $\mathcal{X}$ explicitly. For example, instead of the above, we might write the following equivalent statement:

$$0 \le p_ X(x) \le 1\quad \text {for all }x, \qquad \sum _ x p_ X(x) = 1.$$
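
As a quick check of these constraints against the weather example, using the tables given earlier for $W$ and $I$:

$$p_W(\text{sunny}) + p_W(\text{rainy}) + p_W(\text{snowy}) = \frac{1}{2} + \frac{1}{6} + \frac{1}{3} = 1, \qquad p_I(0) + p_I(1) = \frac{1}{2} + \frac{1}{2} = 1.$$

Each entry also lies between $0$ and $1$, so both tables satisfy the constraints.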


### Exercise

Consider the following probability space:
```python
prob_space = {'cat': 0.2, 'dog':0.7, 'shark':0.1}
```
Let's define a random variable $X$ that maps 'cat' and 'dog' both to $5$, and 'shark' to $7$.

What is the set of values that X can take on? Express your answer as a Python set.

In [57]:
model = {'cat': 0.2, 'dog':0.7, 'shark':0.1}
mapping = {'cat': 5, 'dog': 5, 'shark':7}

In [106]:
def PMF(mapping, model):
    """Return the PMF of a random variable, given its mapping and the probability model.

    `mapping` sends each outcome to the value the random variable takes on;
    `model` sends each outcome to its probability.

    >>> model = {'cat': 0.2, 'dog':0.7, 'shark':0.1}
    >>> mapping = {'cat': 5, 'dog':5, 'shark':7}
    >>> PMF(mapping, model)
    {5: 0.8999999999999999, 7: 0.1}
    """

    new_model = dict()
    for key, value in mapping.items():
        # Outcomes mapped to the same value have their probabilities added.
        new_model[value] = new_model.get(value, 0) + model[key]

    return new_model

if __name__ == "__main__":
    import doctest
    doctest.testmod()

In [93]:
new_model = PMF(mapping, model)
new_model

{5: 0.8999999999999999, 7: 0.1}

In [94]:
X = set(new_model.keys())
X

{5, 7}

In [107]:
def is_samePMF(X, Y):
    """
    Return True if X and Y are the same PMF (up to rounding error), else False.

    >>> is_samePMF({5: 0.8999999999999999, 7: 0.1}, {5: 0.9, 7: 0.1})
    True

    >>> is_samePMF({5: 0.7, 7: 0.3}, {5: 0.9, 7: 0.1})
    False

    >>> is_samePMF({6: 0.8999999999999999, 7: 0.1}, {5: 0.9, 7: 0.1})
    False

    >>> is_samePMF(dict(), {5: 0.9, 7: 0.1})
    False

    >>> is_samePMF({5: 0.9, 7: 0.1}, {5: 0.1})
    False
    """
    # The two tables must have exactly the same labels.
    if X.keys() != Y.keys():
        return False

    # Compare entries with a small tolerance to absorb floating-point error.
    return all(abs(X[key] - Y[key]) <= 0.00001 for key in X)

if __name__ == "__main__":
    import doctest
    doctest.testmod()

In [108]:
Y = {5: 0.9, 7: 0.1}  # expected table: 'cat' and 'dog' combine to 0.2 + 0.7 = 0.9
is_samePMF(new_model, Y)

True

### Exercise: Probability with Dice

Let random variable $X$ be the sum of rolls of two fair six-sided dice with faces numbered $1$ through $6$.

How many different values can $X$ take on?

In [116]:
two_dice = {(i+1, j+1): 1/36 for i in range(6) for j in range(6)}

In [117]:
mapping = {key: sum(key) for key in two_dice}

In [119]:
X_sum = PMF(mapping, two_dice)
X_sum

{2: 0.027777777777777776,
 3: 0.05555555555555555,
 4: 0.08333333333333333,
 5: 0.1111111111111111,
 6: 0.1388888888888889,
 7: 0.16666666666666669,
 8: 0.1388888888888889,
 9: 0.1111111111111111,
 10: 0.08333333333333333,
 11: 0.05555555555555555,
 12: 0.027777777777777776}

What is the probability that $X=7$? (Hint: An earlier exercise asked you for the event that the two faces sum to $7$.)

In [120]:
X_sum[7]

0.16666666666666669
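
Both dice answers can be cross-checked with exact arithmetic. The sketch below is an independent check rather than part of the original solution: it counts the ways each sum can occur with `collections.Counter` and divides by $36$ using `fractions.Fraction`. The names `counts` and `pmf` are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Each outcome has probability 1/36, so P(X = s) = (number of ways) / 36.
pmf = {s: Fraction(ways, 36) for s, ways in counts.items()}

print(len(pmf))  # 11 distinct values: 2 through 12
print(pmf[7])    # 1/6
```

The floating-point value `0.16666666666666669` above is this same $1/6$, up to rounding error from adding six copies of `1/36`.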

### Exercise: Functions of Random Variables

Consider the random variable $W$ that we have seen before, where $W=\text{sunny}$ with probability $1/2$, $W=\text{rainy}$ with probability $1/6$, and $W=\text{snowy}$ with probability $1/3$. Consider a function $f$ that maps 'sunny' and 'rainy' to $3$, and 'snowy' to $42$.

In [121]:
model   = {'sunny': 1/2, 'rainy': 1/6, 'snowy': 1/3}
mapping = {'sunny': 3, 'rainy': 3, 'snowy': 42}

$f(W)$ is also a random variable. Express the probability table for $f(W)$ as a Python dictionary. (Your answer should be the Python dictionary itself, and not the dictionary assigned to a variable, so please do not include, for instance, “prob_table =" before specifying your answer. You can use fractions. If you use decimals instead, please be accurate and use at least 5 decimal places.)

In [125]:
new_model = PMF(mapping, model)
new_model

{3: 0.6666666666666666, 42: 0.3333333333333333}

Is $(f(W))^2$ also a random variable? If yes, provide the probability table for $(f(W))^2$ as a Python dictionary.

In [126]:
# Squaring acts on the values that f(W) takes on, not on the probabilities.
square_mapping = {value: value**2 for value in new_model}
model_square = PMF(square_mapping, new_model)
model_square

{9: 0.6666666666666666, 1764: 0.3333333333333333}

In [127]:
def is_valid_model(model):
    """
    Return True if the model is a valid probability model, else return False.

    >>> is_valid_model({'hearts': 0, 'clubs': 0.4, 'diamonds': 0.7, 'spades': 0.2})
    False

    >>> is_valid_model({'apple': 0.5, 'orange': 0.4, 'pear': 0.2, 'banana': -0.1})
    False

    >>> is_valid_model({1: 0.4, 2: 0.3, 'cat': 0.3})
    True
    """
    sum_of_values = 0
    for value in model.values():
        # Every probability must be nonnegative.
        if value < 0:
            return False
        sum_of_values += value

    # The probabilities must add up to 1, up to floating-point error.
    return abs(sum_of_values - 1) < 0.00001

if __name__ == "__main__":
    import doctest
    doctest.testmod()

In [128]:
is_valid_model(model_square)

True

So $(f(W))^2$ is also a random variable: it takes on the value $9$ with probability $2/3$ and the value $1764$ with probability $1/3$.