In [43]:
from datascience import *
import numpy as np
from math import *


## Joint Distributions

Recall from Lessons 13 & 14: let $X$ be a random variable. $X$ has a distribution that is described by a probability mass function (pmf) or probability density function (pdf). 

We can consider multiple random variables simultaneously using joint distributions. 

NOTE: When answering the questions below, you are not required to use Python. If you would like to answer in Markdown, feel free to change the type of the cell, or to use both types of cells. 

#### Example 1: Discrete Joint Distribution

Let $X$ and $Y$ be discrete random variables that can each only take the values 1, 2 or 3, and do so according to the following distribution:


| $Y$ \ $X$ | 1 | 2 | 3 |
| --- | --- | --- | --- |
| 1 | 0.17 | 0.15 | 0.08 |
| 2 | 0.00 | 0.10 | 0.10 |
| 3 | 0.08 | 0.20 | 0.12 |
 


This is an example of a joint probability mass function (joint pmf), and is denoted as $f_{X,Y}(x,y)$. 

For example, the probability that $X$ takes the value 1 AND $Y$ takes the value 3, or $P(X=1,Y=3)$ is equal to 0.08. 
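
The table can also be held in code as a dictionary keyed by $(x, y)$ pairs. The sketch below is one possible representation, not part of the original exercise; the name `joint_pmf` is an assumption introduced for illustration, and plain Python is used instead of the `datascience` library. It checks that the entries form a valid pmf and looks up $P(X=1,Y=3)$.

```python
import math

# Joint pmf from the table above, keyed by (x, y).
# `joint_pmf` is a hypothetical name used only for this sketch.
joint_pmf = {
    (1, 1): 0.17, (2, 1): 0.15, (3, 1): 0.08,
    (1, 2): 0.00, (2, 2): 0.10, (3, 2): 0.10,
    (1, 3): 0.08, (2, 3): 0.20, (3, 3): 0.12,
}

# A valid joint pmf has nonnegative entries that sum to 1.
total = sum(joint_pmf.values())
is_valid = all(p >= 0 for p in joint_pmf.values()) and math.isclose(total, 1.0)

print(is_valid)           # whether the table is a valid pmf
print(joint_pmf[(1, 3)])  # P(X=1, Y=3)
```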

**_1.1_** Find $P(X = 2,Y=3)$. 

**_1.2_** Find $E(X+Y)$. 

*1.1:* *$P(X=2,Y=3)=0.20$*

*1.2:*
$$
\begin{align}
E\left(X+Y\right)&=\sum_{x}\sum_{y}\left(x+y\right)f_{X,Y}\left(x,y\right)\\
&=\left(1+1\right)f_{X,Y}\left(1,1\right)+\left(1+2\right)f_{X,Y}\left(1,2\right)+\left(1+3\right)f_{X,Y}\left(1,3\right)+\\
&\left(2+1\right)f_{X,Y}\left(2,1\right)+\left(2+2\right)f_{X,Y}\left(2,2\right)+\left(2+3\right)f_{X,Y}\left(2,3\right)+\\
&\left(3+1\right)f_{X,Y}\left(3,1\right)+\left(3+2\right)f_{X,Y}\left(3,2\right)+\left(3+3\right)f_{X,Y}\left(3,3\right)\\
&=\left(2\right)\left(0.17\right)+\left(3\right)\left(0.00\right)+\left(4\right)\left(0.08\right)+\left(3\right)\left(0.15\right)+\left(4\right)\left(0.10\right)+\\
&\left(5\right)\left(0.20\right)+\left(4\right)\left(0.08\right)+\left(5\right)\left(0.10\right)+\left(6\right)\left(0.12\right)\\
&=0.34+0.00+0.32+0.45+0.40+1.00+0.32+0.50+0.72\\
E\left(X+Y\right)&=4.05
\end{align}
$$

In [44]:
# 1.2 with code
def jpmfXY(x,y):
    '''Return the joint probability P(X=x, Y=y) from the given table.
    Pairs not listed below, including (1, 2), have probability 0.'''
    if x==1 and y==1:
        return 0.17
    elif x==1 and y==3:
        return 0.08
    elif x==2 and y==1:
        return 0.15
    elif x==2 and y==2:
        return 0.1
    elif x==2 and y==3:
        return 0.2
    elif x==3 and y==1:
        return 0.08
    elif x==3 and y==2:
        return 0.1
    elif x==3 and y==3:
        return 0.12
    else:
        return 0
    
Exy = 0
for x in [1,2,3]:
    for y in [1,2,3]:
        Exy += (x+y)*jpmfXY(x,y)
Exy

4.05

### Marginal Probability

When given a joint pmf like this, we may want to know the distribution of $X$ or $Y$ individually. Specifically, we might want to know $P(X=1)$ or $f_Y(y)$. 

Marginal probability can be found by summing across the remaining variable. Specifically,

$$
f_X(x)=\sum_y f_{X,Y}(x,y)
$$
and 
$$
f_Y(y)=\sum_x f_{X,Y}(x,y)
$$
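
As a rough illustration of the two sums above, the sketch below accumulates both marginals in one pass over a dictionary form of the joint pmf. The names `joint_pmf` and `marginals` are assumptions made for this example, not part of the notebook's own code.

```python
from collections import defaultdict

# Joint pmf from Example 1, keyed by (x, y); hypothetical name for this sketch.
joint_pmf = {
    (1, 1): 0.17, (2, 1): 0.15, (3, 1): 0.08,
    (1, 2): 0.00, (2, 2): 0.10, (3, 2): 0.10,
    (1, 3): 0.08, (2, 3): 0.20, (3, 3): 0.12,
}

def marginals(pmf):
    """Return (f_X, f_Y) as dicts by summing the joint pmf over the other variable."""
    f_x, f_y = defaultdict(float), defaultdict(float)
    for (x, y), p in pmf.items():
        f_x[x] += p  # sum over y for fixed x
        f_y[y] += p  # sum over x for fixed y
    return dict(f_x), dict(f_y)

f_x, f_y = marginals(joint_pmf)
for x in sorted(f_x):
    print('f_X(%d) = %.2f' % (x, f_x[x]))
```

Each marginal is itself a pmf, so its values should sum to 1; that is a quick sanity check on any table.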


**_1.3_** Find $P(X=2)$. 

**_1.4_** Find $f_Y(y)$ (the marginal pmf of $Y$). 

**_1.5_** Find $E(Y)$ and $V(Y)$. 

In [45]:
jpmfTable = Table().with_columns([
    r'Y\X', [1, 2, 3],  # raw string, since '\X' is not a valid escape sequence
    '1', [0.17, 0.0, 0.08],
    '2', [0.15, 0.1, 0.2],
    '3', [0.08, 0.1, 0.12]
]) # A table will make the distribution easier to work with
jpmfTable

print('1.3) P(X=2) = ' + str(jpmfTable.column('2').sum()))
# P(X=2) is found by isolating the X=2 column and summing the probabilities for any Y

fY=Table().with_columns([
    'Y', [1, 2, 3],
    'fY(y)', jpmfTable.drop(r'Y\X').apply(sum)
]) # the pmf of Y is found by summing each row across all X
print('1.4)')
fY

EY = sum(fY.column(0)*fY.column(1))
# E(Y) is found by summing the products of each value of Y and its probability
print('1.5) E(Y) = ' + str(EY))

VY = sum((fY.column('Y') - EY)**2 * fY.column(1))
# V(Y) is found by summing the probability of each value of Y multiplied 
# by the square of the difference between the value and E(Y)
print('     V(Y) = ' + str(VY))

Y\X,1,2,3
1,0.17,0.15,0.08
2,0.0,0.1,0.1
3,0.08,0.2,0.12


1.3) P(X=2) = 0.45
1.4)


Y,fY(y)
1,0.4
2,0.2
3,0.4


1.5) E(Y) = 2.0
     V(Y) = 0.8


### Conditional Probability

We may be interested in the probability $X$ takes a specific value conditioned on the value of $Y$. Recall that conditional probability is given by $P(A|B)=\frac{P(A,B)}{P(B)}$. 

So, essentially, conditional probability can be found by dividing the joint probability by the appropriate marginal probability. 
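
A minimal sketch of that division, assuming the joint pmf is stored as a dictionary keyed by $(x, y)$. The names `joint_pmf` and `conditional_pmf_y_given_x` are hypothetical and introduced only for this example. It conditions on $X=1$, a case not asked for in the questions below.

```python
# Joint pmf from Example 1, keyed by (x, y); hypothetical name for this sketch.
joint_pmf = {
    (1, 1): 0.17, (2, 1): 0.15, (3, 1): 0.08,
    (1, 2): 0.00, (2, 2): 0.10, (3, 2): 0.10,
    (1, 3): 0.08, (2, 3): 0.20, (3, 3): 0.12,
}

def conditional_pmf_y_given_x(pmf, x0):
    """Return f_{Y|X=x0}(y) as a dict: joint probability divided by the marginal P(X=x0)."""
    p_x0 = sum(p for (x, _), p in pmf.items() if x == x0)
    if p_x0 == 0:
        raise ValueError('conditioning event has probability zero')
    return {y: p / p_x0 for (x, y), p in pmf.items() if x == x0}

cond = conditional_pmf_y_given_x(joint_pmf, 1)
for y in sorted(cond):
    print('P(Y=%d | X=1) = %.2f' % (y, cond[y]))
```

The conditional probabilities sum to 1 because dividing by the marginal renormalizes the $X=1$ column of the table.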

**_1.6_** Find $P(Y=1|X=3)$. 

**_1.7_** Find $f_{X|Y=2}(x)$, the conditional pmf of $X$, given $Y=2$. 

**_1.8_** Find $E(X|Y=2)$ and $V(X|Y=2)$. 

In [46]:
# A marginal probability function will make this easier
def marg(XY, n):
    '''Given the letter 'X' or 'Y' and a number n,
    returns the marginal probability of X or Y being equal to n'''
    if XY=='X':
        # Select the column by its label ('1', '2' or '3') rather than by position
        return jpmfTable.column(str(n)).sum()
    else:
        return jpmfTable.drop(r'Y\X').apply(sum)[n-1]
    
print('1.6) P(Y=1|X=3) = ' + str( jpmfXY(3,1)/marg('X',3) ))
#           P(Y=1|X=3)     =    P(X=3 and Y=1) / P(X=3)

1.6) P(Y=1|X=3) = 0.26666666666666666


In [60]:
fXY2 = Table().with_columns([
    'X', [1,2,3],
    'f_X|Y=2 (x)', [jpmfXY(1,2), jpmfXY(2,2), jpmfXY(3,2)] / marg('Y',2) ])
# I wish there was a better way to swap rows & columns like this
print('1.7):')
fXY2

EXY2 = sum(fXY2.column(0)*fXY2.column(1))
# E(X|Y=2) is found by summing the products of each value of X and its conditional probability
print('1.8) E(X|Y=2) = ' + str(EXY2))

VXY2 = sum((fXY2.column(0) - EXY2)**2 * fXY2.column(1))
# V(X|Y=2) is found by summing the conditional probability of each value of X multiplied 
# by the square of the difference between the value and E(X|Y=2)
print('     V(X|Y=2) = ' + str(VXY2))

1.7):


X,f_X|Y=2 (x)
1,0.0
2,0.5
3,0.5


1.8) E(X|Y=2) = 2.5
     V(X|Y=2) = 0.25


**_1.9_** Are $X$ and $Y$ independent? Why or why not? 

*$X$ and $Y$ are not independent: $P(Y=2)=0.2$, but $P(Y=2|X=1)=0$. Independence would require $P(Y=y|X=x)=P(Y=y)$ for every pair $(x,y)$.*

### Covariance and Correlation

Expected value and variance help us characterize $X$ and $Y$ marginally and conditionally, but we may also be interested in measuring the relationship between $X$ and $Y$. For this, we use *covariance*. 

$$
Cov(X,Y)=E[(X-E(X))(Y-E(Y))] = E(XY)-E(X)E(Y)
$$

Note that if $X$ and $Y$ are independent, $Cov(X,Y) =0$. The converse is NOT necessarily true. 
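
To see why the converse fails, here is a small counterexample. It is a hypothetical distribution chosen for illustration and is not part of Example 1: $X$ is uniform on $\{-1,0,1\}$ and $Y=X^2$, so $Y$ is completely determined by $X$, yet the covariance is zero. `Fraction` is used so the arithmetic is exact.

```python
from fractions import Fraction

# Hypothetical example (not from the exercise): X uniform on {-1, 0, 1}, Y = X**2.
third = Fraction(1, 3)
pmf = {(-1, 1): third, (0, 0): third, (1, 1): third}

e_x = sum(x * p for (x, y), p in pmf.items())
e_y = sum(y * p for (x, y), p in pmf.items())
e_xy = sum(x * y * p for (x, y), p in pmf.items())
cov = e_xy - e_x * e_y

# Independence would require f(x, y) == f_X(x) * f_Y(y) for every pair.
p_x0 = sum(p for (x, y), p in pmf.items() if x == 0)  # P(X=0)
p_y0 = sum(p for (x, y), p in pmf.items() if y == 0)  # P(Y=0)
factorizes = pmf[(0, 0)] == p_x0 * p_y0

print('Cov(X,Y) =', cov)
print('f(0,0) == f_X(0) * f_Y(0):', factorizes)
```

Here $P(X=0,Y=0)=\frac{1}{3}$ while $P(X=0)P(Y=0)=\frac{1}{9}$, so the joint pmf does not factor even though $Cov(X,Y)=0$.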

Covariance is dependent on the scales of $X$ and $Y$, so if the two variables are of vastly different scale, we'll want to use covariance's unitless counterpart, correlation, denoted by $\rho$. 

$$
\rho = \frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}
$$

$\rho$ is bounded by the interval $[-1,1]$. When $\rho=1$, $X$ and $Y$ are perfectly positively correlated. Similarly, when $\rho=-1$, $X$ and $Y$ are perfectly negatively correlated. 

**_1.10_** Find $Cov(X,Y)$

**_1.11_** Find $Corr(X,Y)$, or $\rho$. 

In [65]:
# E(Y) & V(Y) were previously calculated
EXY=0
EX=0
for x in [1,2,3]:
    for y in [1,2,3]:
        EXY+= x*y*jpmfXY(x,y)
        EX+= x*jpmfXY(x,y)
        
VX = 0
for x in [1,2,3]:
    for y in [1,2,3]:
        VX+= (x-EX)**2 * jpmfXY(x,y)
        
Cov = EXY - EX*EY
Corr = Cov/(VX*VY)**(1/2)

print('1.10) Cov(X,Y) = ' + str(Cov))
print('1.11) Corr(X,Y) = ' + str(Corr))

1.10) Cov(X,Y) = 0.13000000000000078
1.11) Corr(X,Y) = 0.1964293126950385


In problem 1.2, we found $E(X+Y)$. In order to find $Var(X+Y)$, we need to know how $X$ and $Y$ are correlated: 

$$
Var(X+Y)= Var(X)+Var(Y)+2\,Cov(X,Y)
$$
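
One way to convince yourself of this identity is to compute $Var(X+Y)$ directly from the joint pmf as $E[(X+Y)^2]-[E(X+Y)]^2$ and compare it with the right-hand side. The sketch below does that for the Example 1 table. The helper names `expect` and `var` are assumptions made for this illustration.

```python
import math

# Joint pmf from Example 1, keyed by (x, y); hypothetical name for this sketch.
joint_pmf = {
    (1, 1): 0.17, (2, 1): 0.15, (3, 1): 0.08,
    (1, 2): 0.00, (2, 2): 0.10, (3, 2): 0.10,
    (1, 3): 0.08, (2, 3): 0.20, (3, 3): 0.12,
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())

def var(g):
    """Var[g(X, Y)] = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

lhs = var(lambda x, y: x + y)  # Var(X+Y) computed directly
rhs = var(lambda x, y: x) + var(lambda x, y: y) + 2 * cov_xy

print(math.isclose(lhs, rhs))  # the two sides agree up to rounding
```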

**_1.12_** Find $Var(X+Y)$. 

In [66]:
print('1.12) Var(X+Y) = ' + str(VX + VY + 2*Cov))

1.12) Var(X+Y) = 1.6075000000000017


#### Example 2: Continuous Joint Distribution

All of the concepts above apply to continuous random variables. Consider continuous random variables $X$ and $Y$ with the following joint pdf:

$$
f_{X,Y}(x,y)=k(x+y)
$$

where both $x$ and $y$ are bounded by the interval $[0,1]$. 

**_2.1_** Find the value of $k$ that makes $f$ a valid joint pdf. 

$$
\begin{align}
\int\int f_{X,Y}\left(x,y\right)dxdy&=1\\
\int_0^1\int_0^1k\left(x+y\right)dxdy&=1\\
k\int_0^1\left(\frac{1}{2}x^2+xy\right)|_0^1dy&=1\\
k\int_0^1\left(\frac{1}{2}+y\right)dy&=1\\
k\left(\frac{1}{2}y+\frac{1}{2}y^2\right)|_0^1&=1\\
k\left(\frac{1}{2}+\frac{1}{2}\right)&=1\\
k&=1
\end{align}
$$
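
The result can be checked numerically. The sketch below approximates the double integral of $x+y$ over the unit square with a midpoint rule and takes $k$ as its reciprocal. The helper `midpoint_2d` is a hypothetical function written for this check. Because the integrand is linear, the midpoint rule is exact here up to floating-point rounding.

```python
def midpoint_2d(f, x_lo, x_hi, y_lo, y_hi, n=200):
    """Approximate the double integral of f over a rectangle with an n-by-n midpoint rule."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

# Integral of (x + y) over the unit square; k must be its reciprocal.
mass = midpoint_2d(lambda x, y: x + y, 0, 1, 0, 1)
k = 1 / mass
print(round(k, 6))
```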

**_2.2_** Find $P(X<0.5,Y<0.5)$

$$
\begin{align}
P\left(X<0.5,Y<0.5\right)&=\int_0^{0.5}\int_0^{0.5}\left(x+y\right)dxdy\\
&=\int_0^{0.5}\left(\frac{1}{2}x^2+xy\right)|_0^{0.5}dy\\
&=\int_0^{0.5}\left(\frac{1}{8}+\frac{1}{2}y\right)dy\\
&=\left(\frac{1}{8}y+\frac{1}{4}y^2\right)|_0^{0.5}\\
&=\frac{1}{16}+\frac{1}{16}=\frac{1}{8}
\end{align}
$$

**_2.3_** Find $f_X(x)$ and $f_Y(y)$, the marginal pdfs of $X$ and $Y$. 

**_2.4_** Find $E(X)$ and $E(Y)$. 

*2.3)*
$$
\begin{align}
f_X\left(x\right)&=\int f_{X,Y}\left(x,y\right)dy &f_Y\left(y\right)&=\int f_{X,Y}\left(x,y\right)dx\\
&=\int_0^1\left(x+y\right)dy &&=\int_0^1\left(x+y\right)dx\\
&=\left(xy+\frac{1}{2}y^2\right)|_0^1 &&=\left(\frac{1}{2}x^2+xy\right)|_0^1\\
&=x+\frac{1}{2} &&=\frac{1}{2}+y
\end{align}
$$

*2.4)*
$$
\begin{align}
E\left(X\right)&=\int x f_X\left(x\right)dx &E\left(Y\right)&=\int y f_Y\left(y\right)dy\\
&=\int_0^1 x\left(x+\frac{1}{2}\right)dx &&=\int_0^1 y\left(y+\frac{1}{2}\right)dy\\
&=\int_0^1 \left(x^2+\frac{1}{2}x\right)dx &&=\int_0^1 \left(y^2+\frac{1}{2}y\right)dy\\
&=\left(\frac{1}{3}x^3+\frac{1}{4}x^2\right)|_0^1 &&=\left(\frac{1}{3}y^3+\frac{1}{4}y^2\right)|_0^1\\
&=\frac{1}{3}+\frac{1}{4}=\frac{7}{12} &&=\frac{1}{3}+\frac{1}{4}=\frac{7}{12}
\end{align}
$$
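
As a numerical cross-check of 2.3 and 2.4, the sketch below integrates the joint pdf over $y$ to get $f_X(x)$ at a point, then integrates $x f_X(x)$ to approximate $E(X)$. The helper `midpoint_1d` and the other names are assumptions introduced for this example. The result is an approximation, not an exact value.

```python
def midpoint_1d(g, lo, hi, n=400):
    """Approximate the integral of g over [lo, hi] with an n-point midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def joint_pdf(x, y):
    """f(x, y) = x + y on the unit square (k = 1)."""
    return x + y

def marginal_x(x):
    """f_X(x): integrate the joint pdf over y in [0, 1]."""
    return midpoint_1d(lambda y: joint_pdf(x, y), 0, 1)

e_x = midpoint_1d(lambda x: x * marginal_x(x), 0, 1)

print('f_X(0.3) ~', round(marginal_x(0.3), 6))  # compare with 0.3 + 1/2
print('E(X)     ~', round(e_x, 4))              # compare with 7/12
```

By the symmetry of $x+y$, swapping the roles of $x$ and $y$ gives the same values for $f_Y$ and $E(Y)$.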

**_2.5_** Find $P(X>0.5\vert Y\leq 0.5)$. 

$$
\begin{align}
P\left(X>0.5\vert Y\leq 0.5\right)&=\frac{P\left(X>0.5 \text{ and } Y\leq0.5\right)}{P\left(Y\leq0.5\right)}\\
&=\frac{\int_{0.5}^1\int_0^{0.5}f_{X,Y}\left(x,y\right)dydx}{\int_0^{0.5}f_Y\left(y\right)dy}\\
&=\frac{\int_{0.5}^1\int_0^{0.5}\left(x+y\right)dydx}{\int_0^{0.5}\left(\frac{1}{2}+y\right)dy}\\
&=\frac{\int_{0.5}^1\left(xy+\frac{1}{2}y^2\right)\vert_0^{0.5}dx}{\left(\frac{1}{2}y+\frac{1}{2}y^2\right)|_0^{0.5}}\\
&=\frac{\int_{0.5}^1\left(\frac{1}{2}x+\frac{1}{8}\right)dx}{\left(\frac{1}{4}+\frac{1}{8}\right)}\\
&=\frac{\left(\frac{1}{4}x^2+\frac{1}{8}x\right)|_{0.5}^1}{\frac{3}{8}}\\
&=\frac{8}{3}\left(\left(\frac{1}{4}+\frac{1}{8}\right)-\left(\frac{1}{16}+\frac{1}{16}\right)\right)\\
&=\left(\frac{8}{3}\right)\left(\frac{1}{4}\right)=\frac{2}{3}
\end{align}
$$
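
Since $f_{X,Y}(x,y)=x+y$ has a simple antiderivative, any rectangle probability has a closed form: $\int_a^b\int_c^d (x+y)\,dy\,dx=\frac{b^2-a^2}{2}(d-c)+\frac{d^2-c^2}{2}(b-a)$. The sketch below uses that formula with exact fractions to confirm the conditional probability. The function name `rect_prob` is hypothetical.

```python
from fractions import Fraction

def rect_prob(a, b, c, d):
    """P(a < X < b, c < Y < d) for f(x, y) = x + y on the unit square."""
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    return (b**2 - a**2) / 2 * (d - c) + (d**2 - c**2) / 2 * (b - a)

half = Fraction(1, 2)
joint = rect_prob(half, 1, 0, half)  # P(X > 0.5, Y <= 0.5)
marginal = rect_prob(0, 1, 0, half)  # P(Y <= 0.5)
cond = joint / marginal

print(joint, marginal, cond)
```

The same helper reproduces 2.2 as `rect_prob(0, half, 0, half)`.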

**_2.6_** Find the conditional distributions of $X|Y$ and $Y|X$. Recall that conditional distributions can be found by dividing the joint pdf by the relevant marginal pdf. 

**_2.7_** Find $E(X|Y)$ and $E(Y|X)$.

*2.6)*
$$
\begin{align}
f_{X|Y}\left(x\right)&=\frac{f_{X,Y}\left(x,y\right)}{f_Y\left(y\right)} &f_{Y|X}\left(y\right)&=\frac{f_{X,Y}\left(x,y\right)}{f_X\left(x\right)}\\
&=\frac{x+y}{\frac{1}{2}+y} &&=\frac{x+y}{\frac{1}{2}+x}
\end{align}
$$

*2.7)*
$$
\begin{align}
E\left(X|Y\right)&=\int_0^1 x f_{X|Y}\left(x\right)dx &E\left(Y|X\right)&=\int_0^1 y f_{Y|X}\left(y\right)dy\\
&=\int_0^1 x\left(\frac{x+y}{\frac{1}{2}+y}\right)dx &&=\int_0^1 y\left(\frac{x+y}{\frac{1}{2}+x}\right)dy\\
&=\frac{1}{\frac{1}{2}+y}\int_0^1\left(x^2+xy\right)dx &&=\frac{1}{\frac{1}{2}+x}\int_0^1\left(y^2+xy\right)dy\\
&=\frac{1}{\frac{1}{2}+y} \left(\frac{1}{3}x^3+\frac{1}{2}x^2y\right)|_0^1 &&=
    \frac{1}{\frac{1}{2}+x} \left(\frac{1}{3}y^3+\frac{1}{2}y^2x\right)|_0^1\\
&=\frac{\frac{1}{3}+\frac{1}{2}y}{\frac{1}{2}+y} &&=\frac{\frac{1}{3}+\frac{1}{2}x}{\frac{1}{2}+x}\\
&=\frac{2+3y}{3+6y} &&=\frac{2+3x}{3+6x}
\end{align}
$$
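
The closed form for $E(X|Y)$ can be spot-checked numerically at a few values of $y$. The sketch below is an approximate check under a midpoint rule. The names `midpoint_1d` and `cond_mean_x` are assumptions made for this example.

```python
def midpoint_1d(g, lo, hi, n=2000):
    """Approximate the integral of g over [lo, hi] with an n-point midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def cond_mean_x(y):
    """E(X | Y = y): integrate x * f(x, y) over x, then divide by f_Y(y)."""
    numerator = midpoint_1d(lambda x: x * (x + y), 0, 1)
    f_y = midpoint_1d(lambda x: x + y, 0, 1)
    return numerator / f_y

for y in (0.0, 0.5, 1.0):
    closed_form = (2 + 3 * y) / (3 + 6 * y)
    print(y, round(cond_mean_x(y), 5), round(closed_form, 5))
```

Note that $E(X|Y)$ is a function of $y$, not a single number: it falls from $\frac{2}{3}$ at $y=0$ to $\frac{5}{9}$ at $y=1$.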

**_2.8_** Are $X$ and $Y$ independent? 

*No. Since $f_{X|Y}(x)\neq f_{X}(x)$ and $f_{Y|X}(y)\neq f_{Y}(y)$, $X$ and $Y$ are dependent.*

*Note to self: if $f_{X,Y}(x,y)$ is separable into $g(x)h(y)$ on a rectangular region, $X$ & $Y$ are independent!*

**_2.9_** What is $Cov(X,Y)$? 

$$
\begin{align}
E\left(XY\right)&=\int\int xyf_{X,Y}\left(x,y\right)dxdy &Cov\left(X,Y\right)&=E\left(XY\right)-E\left(X\right)E\left(Y\right)\\
&=\int_0^1\int_0^1 xy\left(x+y\right)dxdy &&=\frac{1}{3}-\left(\frac{7}{12}\right)\left(\frac{7}{12}\right)\\
&=\int_0^1\int_0^1\left(x^2y+xy^2\right)dxdy &&=\frac{48}{144}-\frac{49}{144}\\
&=\int_0^1\left(\frac{1}{3}x^3y+\frac{1}{2}x^2y^2\right)|_0^1dy &&=\frac{-1}{144}\\
&=\int_0^1\left(\frac{1}{3}y+\frac{1}{2}y^2\right)dy\\
&=\left(\frac{1}{6}y^2+\frac{1}{6}y^3\right)|_0^1\\
&=\frac{1}{6}+\frac{1}{6}=\frac{1}{3}
\end{align}
$$
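
For this particular pdf the moments can be written exactly, because $\int_0^1\int_0^1 x^m y^n\,dx\,dy=\frac{1}{(m+1)(n+1)}$. The sketch below uses that fact to recompute $E(X)$, $E(Y)$, $E(XY)$ and the covariance with exact fractions. The function name `moment` is hypothetical, and the formula inside it holds only for $f_{X,Y}(x,y)=x+y$ on the unit square.

```python
from fractions import Fraction

def moment(a, b):
    """E(X**a * Y**b) for f(x, y) = x + y on the unit square.

    x**a * y**b * (x + y) splits into two monomials, and the integral of
    x**m * y**n over the unit square is 1 / ((m + 1) * (n + 1)).
    """
    return Fraction(1, (a + 2) * (b + 1)) + Fraction(1, (a + 1) * (b + 2))

e_x, e_y, e_xy = moment(1, 0), moment(0, 1), moment(1, 1)
cov = e_xy - e_x * e_y

print('E(X) =', e_x, ' E(Y) =', e_y, ' E(XY) =', e_xy)
print('Cov(X,Y) =', cov)
```

The small negative covariance matches the conditional means in 2.7, which decrease slightly as the conditioning value increases.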