original version: Jessica Hamrick and Tom Griffiths


----

# Joint, Conditional, Marginal Distributions

In [None]:
import numpy as np

## Random Variables and Probability Distributions ($W$ and $P(W)$)

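A random variable is a quantity whose value is uncertain. Its **probability distribution** is a **mapping** from each **value** the variable can take to a **probability**.

Consider a random variable $W$ for the weather, whose distribution $P(W)$ assigns a probability to each weather state.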

|weather|P(W=weather)|
|:---:|:--------:|
|sunny|0.7|
|cloudy|0.2|
|stormy|0.1|

In [None]:
# Returns the probability of a particular weather pattern
def P_W(weather):
    if weather == "sunny":
        return 0.7
    if weather == "cloudy":
        return 0.2
    if weather == "stormy":
        return 0.1

# Check our probabilities
for weather in ("sunny", "cloudy", "stormy"):
    print(weather, P_W(weather))
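
A valid distribution has non-negative probabilities that sum to 1. Here is a minimal sketch of that check, restating the table as a dictionary so it runs on its own (the names `weather_probs`, `all_non_negative`, and `sums_to_one` are illustrative and not part of the original notebook):

```python
import math

# Same values as the P(W) table, restated so this cell is self-contained
weather_probs = {"sunny": 0.7, "cloudy": 0.2, "stormy": 0.1}

# Every probability must be non-negative...
all_non_negative = all(p >= 0 for p in weather_probs.values())

# ...and together they must sum to 1 (up to floating-point rounding)
total = sum(weather_probs.values())
sums_to_one = math.isclose(total, 1.0)

print(all_non_negative, sums_to_one)
```

Comparing with `math.isclose` rather than `==` avoids false alarms from floating-point rounding when the probabilities are added.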

## Joint Distributions ($P(W, T)$)

A joint distribution maps values of **two or more** variables to probabilities.

We'll add another random variable, $T$, for whether there is traffic, and explore a joint distribution over $W$ and $T$.

|weather|traffic|P(W=weather, T=traffic)|
|:-----:|:--:|:------------------:|
|sunny  |yes  |0.1|
|cloudy |yes  |0.1|
|stormy |yes  |0.1|
|sunny  |no  |0.6|
|cloudy |no  |0.1|
|stormy |no  |0.0|

In [None]:
# Returns the probability of a combination of weather pattern and whether there is traffic
def P_WT(weather, traffic):
    states = {
        ("sunny", "yes") : 0.1,
        ("cloudy", "yes"): 0.1,
        ("stormy", "yes"): 0.1,
        ("sunny", "no")  : 0.6,
        ("cloudy", "no") : 0.1,
        ("stormy", "no") : 0.0,
    }
    return states[(weather, traffic)]

# Check our probabilities
for traffic in ("yes", "no"):
    for weather in ("sunny", "cloudy", "stormy"):
        print(weather, traffic, P_WT(weather, traffic))
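
A joint table should contain one entry for every combination of values, and those entries should sum to 1. The sketch below checks both properties; it restates the table as a dictionary, and the names `joint`, `covers_all`, and `joint_total` are illustrative assumptions rather than part of the original notebook:

```python
import itertools
import math

weathers = ("sunny", "cloudy", "stormy")
traffics = ("yes", "no")

# Same values as the joint table, restated so this cell is self-contained
joint = {
    ("sunny", "yes"): 0.1,
    ("cloudy", "yes"): 0.1,
    ("stormy", "yes"): 0.1,
    ("sunny", "no"): 0.6,
    ("cloudy", "no"): 0.1,
    ("stormy", "no"): 0.0,
}

# The table needs one entry for every (weather, traffic) combination
covers_all = set(joint) == set(itertools.product(weathers, traffics))

# All entries together must sum to 1 (up to floating-point rounding)
joint_total = sum(joint.values())

print(covers_all, math.isclose(joint_total, 1.0))
```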

## Conditional Distributions ($P(T|W)$)

What's the chance we see traffic if we know what the weather is? In probability, we can answer this with a **conditional distribution**: the distribution of the traffic **conditioned on**, or **given**, the weather.

Starting from the product rule:
$$P(W, T) = P(T|W)P(W)$$
we can divide both sides by $P(W)$ (which requires $P(W) > 0$) to get a formula for the conditional distribution:
$$P(T|W) = \frac{P(W,T)}{P(W)}$$

In [None]:
# Use the product rule to find the conditional distribution:
def P_T_given_W(traffic, weather):
    return P_WT(weather, traffic) / P_W(weather)

# Check our probabilities
for traffic in ("yes", "no"):
    for weather in ("sunny", "cloudy", "stormy"):
        print(weather, traffic, P_T_given_W(traffic, weather))

So, we've derived the distribution $P(T|W)$ (values rounded to three decimal places):

|weather|traffic|$P(T=\text{traffic} \mid W=\text{weather})$|
|:-----:|:--:|:------------------:|
|sunny  |yes  |0.143|
|cloudy |yes  |0.5|
|stormy |yes  |1.0|
|sunny  |no  |0.857|
|cloudy |no  |0.5|
|stormy |no  |0.0|

So, for example, the chance that we see traffic when the weather is sunny is 14.3%.
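
One property worth checking: for each fixed weather value, the conditional probabilities over traffic sum to 1, while the sum across weather values for a fixed traffic value is not constrained. A minimal sketch, with the tables restated as dictionaries and a hypothetical helper named `conditional`:

```python
import math

# Tables restated so this cell is self-contained
joint = {
    ("sunny", "yes"): 0.1,
    ("cloudy", "yes"): 0.1,
    ("stormy", "yes"): 0.1,
    ("sunny", "no"): 0.6,
    ("cloudy", "no"): 0.1,
    ("stormy", "no"): 0.0,
}
weather_probs = {"sunny": 0.7, "cloudy": 0.2, "stormy": 0.1}

def conditional(traffic, weather):
    # P(T=traffic | W=weather) = P(W=weather, T=traffic) / P(W=weather)
    return joint[(weather, traffic)] / weather_probs[weather]

# For a fixed weather, the probabilities over traffic sum to 1
per_weather_totals = {
    w: conditional("yes", w) + conditional("no", w) for w in weather_probs
}

# For a fixed traffic value, the sum over weather is not constrained to be 1
yes_total_across_weather = sum(conditional("yes", w) for w in weather_probs)

print(per_weather_totals)
print(round(yes_total_across_weather, 3))
```

In other words, $P(T|W)$ is really three separate distributions over $T$, one per weather value.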


## Marginal Distributions ($P(T)$)

Now, what if we just care about how likely traffic is on any day? This can be answered by starting with the joint distribution and deriving a marginal distribution.

A marginal distribution can be obtained by **marginalizing**, or **summing out**, a variable. This means summing the joint probabilities over **all possible assignments** of that variable:

$$P(T) = \sum_{weather}{P(\textit{W=weather, T})}$$
Or, equivalently:
$$P(T) = P(\textit{W=sunny, T}) + P(\textit{W=cloudy, T}) + P(\textit{W=stormy, T})$$

In [None]:
def P_T(traffic):
    return sum(P_WT(weather, traffic) for weather in ("sunny", "cloudy", "stormy"))
    # Or, equivalently:
    # return P_WT("sunny", traffic) + P_WT("cloudy", traffic) + P_WT("stormy", traffic)

# Check our probabilities
for traffic in ("yes", "no"):
    print(traffic, P_T(traffic))

This gives us our marginal distribution $P(T)$ (the printed value for "yes" may show a tiny floating-point rounding error):

|traffic|P(T=traffic)|
|:-----:|:----------:|
|yes|0.3|
|no|0.7|
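
The same summing-out step works for either variable. The sketch below uses a hypothetical helper, `marginal`, that keeps one position of the `(weather, traffic)` key and sums out the other; keeping the traffic position gives $P(T)$, and keeping the weather position should give back the $P(W)$ table from the first section:

```python
from collections import defaultdict
import math

# Joint table restated so this cell is self-contained
joint = {
    ("sunny", "yes"): 0.1,
    ("cloudy", "yes"): 0.1,
    ("stormy", "yes"): 0.1,
    ("sunny", "no"): 0.6,
    ("cloudy", "no"): 0.1,
    ("stormy", "no"): 0.0,
}

def marginal(joint_table, keep):
    """Sum out every variable except the one at tuple position `keep`."""
    result = defaultdict(float)
    for assignment, p in joint_table.items():
        result[assignment[keep]] += p
    return dict(result)

p_weather = marginal(joint, keep=0)  # sums out traffic
p_traffic = marginal(joint, keep=1)  # sums out weather

# Round for display to hide floating-point noise
print({k: round(v, 3) for k, v in p_weather.items()})
print({k: round(v, 3) for k, v in p_traffic.items()})
```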

## When to Add and When to Multiply Probabilities

We add probabilities when we say it is this **or** that, as long as the two outcomes cannot both happen at once. The weather will be cloudy or stormy: $P(\textit{W=cloudy}) + P(\textit{W=stormy})$

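We multiply probabilities when we say it is this **and** that, using the product rule. The weather will be cloudy and there will be traffic: $P(\textit{W=cloudy}) \times P(\textit{T=yes}|\textit{W=cloudy})$. Multiplying the two marginals, $P(\textit{W=cloudy}) \times P(\textit{T=yes})$, is correct only when the variables are independent, which is not the case here: $0.2 \times 0.3 = 0.06$, but the joint table gives $0.1$.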
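
Both rules can be checked numerically against the tables above. The sketch below also compares the product rule with the plain product of the two marginals, which matches the joint probability only when the variables are independent. The variable names are illustrative, and the values are copied from the tables in this notebook:

```python
import math

p_cloudy, p_stormy = 0.2, 0.1   # from the P(W) table
p_traffic_yes = 0.3             # from the P(T) table
p_cloudy_and_traffic = 0.1      # from the joint table
p_traffic_given_cloudy = 0.5    # from the P(T|W) table

# "Or" for mutually exclusive outcomes: add
p_cloudy_or_stormy = p_cloudy + p_stormy

# "And" via the product rule: P(T=yes | W=cloudy) * P(W=cloudy)
product_rule = p_traffic_given_cloudy * p_cloudy

# Multiplying the marginals assumes independence, which does not hold here
product_of_marginals = p_cloudy * p_traffic_yes

print(round(p_cloudy_or_stormy, 3))
print(math.isclose(product_rule, p_cloudy_and_traffic))
print(math.isclose(product_of_marginals, p_cloudy_and_traffic))
```

The last comparison comes out false because weather and traffic are dependent in this example: knowing that it is cloudy raises the chance of traffic from 0.3 to 0.5.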