# Bi-variate Discrete Distributions

Until now we have been dealing with univariate distributions, i.e., distributions of a single random variable. In this notebook, we will discuss bi-variate distributions, i.e., distributions of two random variables. We will start with the simplest case, the joint distribution of two discrete random variables.

## Joint Distribution

Until now we have discussed the probability mass function (PMF) of a single discrete random variable. It answers questions such as "What is the probability that the random variable takes the value $x$?". Very often, however, we are interested in more than one random variable. For example, we might conduct an experiment where we assign different marketing strategies to groups of customers and measure the number of products they buy. In that case we might be interested in the joint outcome of the marketing strategy and the number of products bought.

For discrete distributions with a limited number of outcomes, we can represent the joint distribution as a table. For example, consider the following table of two random variables $X$ and $Y$. $X$ can take the values 0, 1, 2, and 3, and $Y$ can take the values 2 and 3. Each cell of the table contains the probability that the random variables take the corresponding pair of values.

In [3]:
import pandas as pd
import numpy as np

# Import a couple of csv files from github
px = pd.read_csv("https://raw.githubusercontent.com/febse/data/main/econ/prob_review/px.csv")
py = pd.read_csv("https://raw.githubusercontent.com/febse/data/main/econ/prob_review/py.csv")
pxy = pd.read_csv("https://raw.githubusercontent.com/febse/data/main/econ/prob_review/pxy.csv")

# Select specific columns
pxy = pxy[['x', 'y', 'p']]

pxy_table = pd.pivot_table(pxy, values='p', index='x', columns='y')
pxy_table

y      2      3
x
0  0.241  0.009
1  0.089  0.006
2  0.229  0.043
3  0.201  0.182


So we can look up probabilities such as

$$
P(X=0, Y=2) = 0.241
$$

directly from the table. We will write the PMF of two random variables $X$ and $Y$ as 

$$
f_{XY}(x, y) = P(X=x, Y=y).
$$

Like the PMF of a single random variable, the joint PMF must satisfy the following properties:

1. $f_{XY}(x, y) \geq 0$ for all $x$ and $y$.
2. $\sum_x \sum_y f_{XY}(x, y) = 1$.


In [6]:
# Verify that the sum of all probabilities is 1

pxy_table.sum().sum()

1.0
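The lookup and both PMF properties can also be checked in a small self-contained sketch. It assumes the table is typed in by hand from the output above instead of being downloaded, and the names `joint`, `p_02`, `all_nonnegative`, and `sums_to_one` are purely illustrative.

```python
import numpy as np
import pandas as pd

# Joint PMF typed in from the table above (rows: x, columns: y).
# This hand-typed copy is an assumption made to keep the example offline.
joint = pd.DataFrame(
    [[0.241, 0.009],
     [0.089, 0.006],
     [0.229, 0.043],
     [0.201, 0.182]],
    index=pd.Index([0, 1, 2, 3], name="x"),
    columns=pd.Index([2, 3], name="y"),
)

# Look up a single joint probability, P(X=0, Y=2), by row and column label.
p_02 = joint.loc[0, 2]

# Property 1: no probability is negative.
all_nonnegative = bool((joint >= 0).all().all())

# Property 2: the probabilities sum to 1. Compare with a tolerance,
# because floating-point sums are not guaranteed to be exactly 1.
sums_to_one = bool(np.isclose(joint.to_numpy().sum(), 1.0))

print(p_02, all_nonnegative, sums_to_one)
```

Comparing with `np.isclose` instead of `==` is a defensive choice: with other probability tables the sum may differ from 1 by a tiny rounding error.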

## Marginal Distributions

If we have a joint distribution of two random variables, we can calculate the marginal distribution of each one, which is simply its univariate distribution. We obtain it by summing the joint PMF over all values of the other variable. For example, the marginal distribution of $X$ is

$$
f_X(x) = \sum_y f_{XY}(x, y)
$$

and the marginal distribution of $Y$ is

$$
f_Y(y) = \sum_x f_{XY}(x, y)
$$


In [None]:
# The marginal distribution of y: sum down the rows (over all values of x), one total per column

pxy_table.sum(axis=0)

y
2    0.76
3    0.24
dtype: float64

In [11]:
# The marginal distribution of x: sum across the columns (over all values of y), one total per row
pxy_table.sum(axis=1)

x
0    0.250
1    0.095
2    0.272
3    0.383
dtype: float64
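
As a worked check against the joint table, summing the row for $x = 0$ and the column for $y = 2$ by hand gives

$$
f_X(0) = f_{XY}(0, 2) + f_{XY}(0, 3) = 0.241 + 0.009 = 0.250
$$

$$
f_Y(2) = \sum_{x=0}^{3} f_{XY}(x, 2) = 0.241 + 0.089 + 0.229 + 0.201 = 0.760
$$

which agrees with the computed marginals.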

## Conditional Distributions

A lot of questions in applied research boil down to the comparison of conditional probabilities. For example, we might be interested in the probability that a customer buys a product given that they have been exposed to a certain marketing strategy, or that a patient survives a disease given that they have been treated with a certain drug.

To be able to answer these questions, we need to calculate the conditional distribution of one random variable given the other. The conditional distribution of $X$ given $Y$ is defined as

$$
f_{X|Y}(x|y) = P(X=x|Y=y) = \frac{P(X=x, Y=y)}{P(Y=y)} = \frac{f_{XY}(x, y)}{f_Y(y)}
$$

and of course the conditional distribution of $Y$ given $X$ is

$$
f_{Y|X}(y|x) = P(Y=y|X=x) = \frac{P(X=x, Y=y)}{P(X=x)} = \frac{f_{XY}(x, y)}{f_X(x)}
$$


In [17]:
# Conditional distribution of x given y

px_given_y = pxy_table.div(pxy_table.sum(axis=0), axis=1)
px_given_y

y         2         3
x
0  0.317105  0.037500
1  0.117105  0.025000
2  0.301316  0.179167
3  0.264474  0.758333


In [15]:
# Conditional distribution of y given x

py_given_x = pxy_table.div(pxy_table.sum(axis=1), axis=0)
py_given_x


y         2         3
x
0  0.964000  0.036000
1  0.936842  0.063158
2  0.841912  0.158088
3  0.524804  0.475196
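
A quick sanity check on the two tables: each conditional distribution should itself sum to 1 over the variable that is not conditioned on. The sketch below assumes the joint table is typed in by hand from the earlier output, and the names `joint`, `x_given_y`, and `y_given_x` are illustrative.

```python
import numpy as np
import pandas as pd

# Joint PMF typed in from the table shown earlier (rows: x, columns: y).
joint = pd.DataFrame(
    [[0.241, 0.009],
     [0.089, 0.006],
     [0.229, 0.043],
     [0.201, 0.182]],
    index=pd.Index([0, 1, 2, 3], name="x"),
    columns=pd.Index([2, 3], name="y"),
)

# f_{X|Y}(x|y): divide every column by its column total f_Y(y).
x_given_y = joint.div(joint.sum(axis=0), axis=1)

# f_{Y|X}(y|x): divide every row by its row total f_X(x).
y_given_x = joint.div(joint.sum(axis=1), axis=0)

# Each column of x_given_y and each row of y_given_x should sum to 1.
columns_sum_to_one = bool(np.allclose(x_given_y.sum(axis=0), 1.0))
rows_sum_to_one = bool(np.allclose(y_given_x.sum(axis=1), 1.0))

# One entry computed directly from the definition: f_{X|Y}(0|2) = 0.241 / 0.76.
by_hand = 0.241 / 0.76

print(columns_sum_to_one, rows_sum_to_one)
print(np.isclose(x_given_y.loc[0, 2], by_hand))
```

Note that the columns of `y_given_x` and the rows of `x_given_y` do not need to sum to 1, because they mix probabilities from different conditional distributions.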


## Conditional Moments

The conditional distributions are probability distributions in their own right, and we can summarize them in the same way as we summarize univariate distributions. For example, we can calculate the conditional mean of $X$ given $Y$ as

$$
E(X|Y=y) = \sum_x x f_{X|Y}(x|y)
$$

and the conditional variance of $X$ given $Y$ as

$$
Var(X|Y=y) = \sum_x (x - E(X|Y=y))^2 f_{X|Y}(x|y)
$$
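
As a worked example using the joint table and the marginal $f_Y(2) = 0.76$ from above, the conditional mean of $X$ given $Y = 2$ is

$$
E(X|Y=2) = \frac{0 \cdot 0.241 + 1 \cdot 0.089 + 2 \cdot 0.229 + 3 \cdot 0.201}{0.76} = \frac{1.150}{0.76} \approx 1.513
$$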


In [40]:
# In pandas it is easier to calculate the conditional moments in a long format (more rows than columns)

pxy_long = pxy.melt(id_vars=['x', 'y'], var_name='var').rename(columns={'value': 'p'})
pxy_long

   x  y var      p
0  0  2   p  0.241
1  0  3   p  0.009
2  1  2   p  0.089
3  1  3   p  0.006
4  2  2   p  0.229
5  2  3   p  0.043
6  3  2   p  0.201
7  3  3   p  0.182


In [47]:
# Conditional expectation of y given x

pxy_long.groupby('x').apply(lambda d: (d['y'] * d['p'] / d['p'].sum()).sum())


x
0    2.036000
1    2.063158
2    2.158088
3    2.475196
dtype: float64

In [48]:
# Conditional expectation of x given y

pxy_long.groupby('y').apply(lambda d: (d['x'] * d['p'] / d['p'].sum()).sum())



y
2    1.513158
3    2.658333
dtype: float64
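
The conditional variance defined earlier can be computed along the same lines. The following is a minimal sketch that works on the wide table instead of the long format; the joint table is typed in by hand from the earlier output, and the names `joint`, `cond_mean`, and `cond_var` are illustrative.

```python
import numpy as np
import pandas as pd

# Joint PMF typed in from the table shown earlier (rows: x, columns: y).
joint = pd.DataFrame(
    [[0.241, 0.009],
     [0.089, 0.006],
     [0.229, 0.043],
     [0.201, 0.182]],
    index=pd.Index([0, 1, 2, 3], name="x"),
    columns=pd.Index([2, 3], name="y"),
)

# Conditional PMF f_{X|Y}(x|y): each column divided by its total f_Y(y).
x_given_y = joint.div(joint.sum(axis=0), axis=1)

# The values that X can take, one per row of the table.
x_values = joint.index.to_numpy()

# E(X | Y=y) = sum_x x * f_{X|Y}(x|y), one value per column y.
cond_mean = x_given_y.mul(x_values, axis=0).sum(axis=0)

# Squared deviations (x - E(X | Y=y))^2, arranged as a 4 x 2 array
# with the same layout as the table.
sq_dev = (x_values[:, None] - cond_mean.to_numpy()[None, :]) ** 2

# Var(X | Y=y) = sum_x (x - E(X | Y=y))^2 * f_{X|Y}(x|y).
cond_var = (x_given_y * sq_dev).sum(axis=0)

print(cond_mean)
print(cond_var)
```

The conditional means printed here should reproduce the values of $E(X|Y=y)$ obtained with `groupby` above.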