# Discrete Random Variables

This short notebook experiments with the concepts from the Probability Theory 1 video from the class ECON-GA-4002.

We will do the following:

- Propose a bivariate normal distribution by specifying its mean and covariance
- Sample random draws from that distribution
- Use the random draws to construct an M x N discretized representation of our bivariate normal. This will give us a discrete joint distribution in a matrix
- Experiment with the joint distribution

In [1]:
import numpy as np
from scipy.stats import multivariate_normal

In [2]:
# change these if you want
mu = np.array([1, -2])

# A diagonal covariance matrix implies independent RVs (when they are jointly normal)
sigma = np.array([[3, 0], [0, 1]])

In [3]:
# draw 10,000 samples
np.random.seed(42)
samples = multivariate_normal(mu, sigma).rvs(10000)

In [4]:
# discretize the samples
counts, xbins, ybins = np.histogram2d(
    x=samples[:, 0], 
    y=samples[:, 1],
    bins=(12, 10),
)

In [5]:
counts

array([[  0.,   0.,   0.,   3.,   1.,   1.,   2.,   1.,   0.,   0.],
       [  0.,   0.,   3.,   6.,  14.,   6.,   5.,   2.,   0.,   0.],
       [  0.,   1.,   7.,  37.,  55.,  30.,  14.,   6.,   1.,   0.],
       [  0.,  11.,  31., 130., 165., 110.,  47.,   9.,   2.,   0.],
       [  4.,  23., 108., 259., 358., 271., 109.,  22.,   0.,   0.],
       [  2.,  36., 184., 468., 673., 489., 197.,  39.,   3.,   1.],
       [  6.,  30., 213., 549., 764., 567., 200.,  45.,   5.,   0.],
       [  6.,  33., 147., 420., 621., 444., 183.,  31.,   3.,   2.],
       [  4.,  19., 112., 242., 387., 269., 102.,  23.,   3.,   1.],
       [  1.,   5.,  38., 108., 147., 109.,  46.,   9.,   2.,   0.],
       [  0.,   1.,  16.,  27.,  45.,  30.,   8.,   1.,   0.,   0.],
       [  0.,   1.,   1.,   8.,   6.,   9.,   4.,   1.,   0.,   0.]])

In [6]:
xbins

array([-5.79379652, -4.74010796, -3.6864194 , -2.63273084, -1.57904228,
       -0.52535372,  0.52833484,  1.5820234 ,  2.63571197,  3.68940053,
        4.74308909,  5.79677765,  6.85046621])

In [7]:
ybins

array([-5.68836529, -4.87162034, -4.05487538, -3.23813043, -2.42138547,
       -1.60464052, -0.78789557,  0.02884939,  0.84559434,  1.6623393 ,
        2.47908425])
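The edge arrays have one more entry than there are bins, so it can help to map a value to the bin that contains it. The sketch below is one possible way to do that with `np.searchsorted`; the helper name `bin_index` is hypothetical (not part of the original notebook), and the edges are copied from the `ybins` output above.

```python
import numpy as np

# Bin edges copied from the ybins output above (11 edges -> 10 bins)
ybins = np.array([-5.68836529, -4.87162034, -4.05487538, -3.23813043,
                  -2.42138547, -1.60464052, -0.78789557,  0.02884939,
                   0.84559434,  1.6623393,   2.47908425])


def bin_index(value, edges):
    """Hypothetical helper: return i such that edges[i] <= value < edges[i + 1]."""
    # side="right" counts the edges that are <= value; subtract 1 for the bin index
    return int(np.searchsorted(edges, value, side="right") - 1)


j = bin_index(0.029, ybins)
print(j, ybins[j], ybins[j + 1])
```

Note that `np.histogram2d` treats the last bin as closed on the right, so a value exactly equal to the final edge would need special handling that this helper does not include.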

In [8]:
P = counts / counts.sum()  # construct joint probability dist

In [9]:
# check that we have a distribution
assert np.isclose(P.sum(), 1)  # compare floats with a tolerance, not ==
assert np.all(P >= 0)

Now that we have a joint probability distribution, try to answer the following questions:

1. What does a single element of P represent? For example, what does `P[8, 4]` represent?
2. How could you compute the marginal distribution of x? 
3. How could you compute the marginal distribution of y?
4. Given the covariance matrix we picked, and your marginal distribution p(x) from question 2, what do you expect the *conditional* distribution $p(x \mid y \approx 0.029)$ to be? Why? What did you have to compute to answer that question?
5. What happens to questions 1-4 if you change the covariance matrix and repeat?
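If you want to check your answers numerically, the following is a minimal sketch of one way to do it. It rebuilds a joint distribution with NumPy's `default_rng` rather than with the `scipy` call and seed used above, so the draws (and therefore `P`) are an assumption and will not match the notebook's numbers exactly. The names `p_x`, `p_y`, and `p_x_given_y` are illustrative.

```python
import numpy as np

# Assumption: a fresh generator, so these are not the notebook's exact draws
rng = np.random.default_rng(0)
mu = np.array([1, -2])
sigma = np.array([[3, 0], [0, 1]])
samples = rng.multivariate_normal(mu, sigma, size=10_000)

counts, xbins, ybins = np.histogram2d(samples[:, 0], samples[:, 1], bins=(12, 10))
P = counts / counts.sum()  # rows index x bins, columns index y bins

# Marginals: sum the joint distribution over the other variable
p_x = P.sum(axis=1)  # sum across y bins (columns)
p_y = P.sum(axis=0)  # sum across x bins (rows)

# Conditional distribution of x given one y bin: renormalize that column
j = int(np.argmax(p_y))  # pick the most populated y bin to limit sampling noise
p_x_given_y = P[:, j] / p_y[j]

# With a diagonal covariance, the conditional should be close to the marginal
print(np.abs(p_x_given_y - p_x).max())
```

Any gap between `p_x_given_y` and `p_x` here comes from sampling noise; with nonzero off-diagonal terms in `sigma` the two would be expected to differ systematically.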