# Joint, Marginal, and Conditional Distribution

<a href="#Distribution">Distribution</a>

<a href="#Joint,-Marginal,-Conditional Distribution">Joint, Marginal, Conditional Distribution</a>

<a href="#Independent-Random-Variables">Independent Random Variables</a>

<a href="#Distributions-related-to-Coin-Flips">Distributions related to Coin Flips</a>

<a href="#Distributions-related-to-Dice-Rolling">Distributions related to Dice Rolling</a>

# Distribution

### Random variable

$$
X:\Omega\longrightarrow \mathbb{R}
$$

### Distribution

Let $X$ be a random variable.
We move a brick attached to $\omega$,
to $X(\omega)$ in the real line $\mathbb{R}$.
In this way we move all the bricks in $\Omega$ to $\mathbb{R}$.
Then
the total weight of the bricks moved into $\mathbb{R}$ is 1.
This brick or weight distribution over the real line $\mathbb{R}$ is
the distribution of $X$.
$$\begin{array}{llll}
\mathbb{P}(X=a)&=&\mbox{Weight of the bricks at $a$}\nonumber\\
\\
\mathbb{P}(X\in A)&=&\mbox{Weight of the bricks in $A$}\nonumber\\
\end{array}$$

### PMF/PDF

$$\begin{array}{llll}
\mbox{PMF}&p_{x_i}&=&\mbox{Weight of the bricks attached to $x_i$}\nonumber\\
\\
\mbox{PDF}&f(x)dx&=&\mbox{Weight of the bricks in $[x,x+dx]$}\nonumber
\end{array}$$

### CDF

\begin{eqnarray}
F(x)&=&\mathbb{P}(X\le x)\nonumber\\
\nonumber\\
&=&\left\{\begin{array}{ll}\displaystyle\sum_{x_i\le x}p_{x_i}&\mbox{if $X$ is discrete}\\\displaystyle\int_{-\infty}^x f(s)ds&\mbox{if $X$ is continuous}\end{array}\right.\nonumber\\
\nonumber\\
&=&\mbox{Weight of the bricks cumulatively stacked from $-\infty$ up to $x$}\nonumber
\end{eqnarray}
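As a minimal sketch of the discrete case, the CDF is the running sum of the PMF. The fair six-sided die and the helper name `F` are assumptions chosen for illustration; they do not appear in the text above.

```python
from fractions import Fraction
from itertools import accumulate

# Assumed example: a fair six-sided die, so p_{x_i} = 1/6 for x_i = 1, ..., 6.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# F(x_i) = sum of p_{x_k} over x_k <= x_i: bricks stacked from the left.
cdf = list(accumulate(pmf))

def F(x):
    """CDF of the die at any real x (a right-continuous step function)."""
    return sum((p for v, p in zip(values, pmf) if v <= x), Fraction(0))

for v, c in zip(values, cdf):
    print(v, c)
```

Exact fractions are used so that the last value of `cdf` is exactly 1, matching the total weight of the bricks.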

[<a href="#Joint,-Marginal,-and-Conditional Distribution">Back to top</a>]

# Joint, Marginal, Conditional Distribution

### Random vector

$$
{\bf X}:\Omega\longrightarrow \mathbb{R}^d
$$

### Joint distribution

Let ${\bf X}$ be a random vector.
We move a brick attached to $\omega$,
to ${\bf X}(\omega)$ in $\mathbb{R}^d$.
In this way we move all the bricks in $\Omega$ to $\mathbb{R}^d$.
Then
the total weight of the bricks moved into $\mathbb{R}^d$ is 1.
This brick or weight distribution over $\mathbb{R}^d$ is
the joint distribution of ${\bf X}$.
$$\begin{array}{llll}
\mathbb{P}({\bf X}={\bf a})&=&\mbox{Weight of the bricks at ${\bf a}$}\nonumber\\
\\
\mathbb{P}({\bf X}\in A)&=&\mbox{Weight of the bricks in $A$}\nonumber\\
\end{array}$$

### Joint PMF/PDF
$$\begin{array}{llll}
\mbox{Joint PMF}&p_{{\bf x}}&=&\mbox{Weight of the bricks attached to ${\bf x}$}\\
\\
\mbox{Joint PDF}&f({\bf x})d{\bf x}&=&\mbox{Weight of the bricks in $\prod_{i=1}^d[x_i,x_i+dx_i]$}
\end{array}$$

### Joint CDF
\begin{eqnarray}
F({\bf x})&=&\mathbb{P}({\bf X}\le {\bf x})\nonumber\\
\nonumber\\
&=&\left\{\begin{array}{ll}\displaystyle\sum_{{\bf x}_i\le {\bf x}} p_{{\bf x}_i}&\mbox{if ${\bf X}$ is discrete}\\\displaystyle\int_{-\infty}^{{\bf x}} f({\bf s})d{\bf s}&\mbox{if ${\bf X}$ is continuous}\end{array}\right.\nonumber\\
\nonumber\\
&=&\mbox{Weight of the bricks cumulatively stacked from $-\infty$ up to ${\bf x}$}\nonumber
\end{eqnarray}

### Joint, marginal, conditional distribution

<div align="center"><img src="img/Screen Shot 2018-01-31 at 1.53.03 AM.png" width="50%"></div>

Bayesian Methods for Machine Learning (Coursera)

### How to get joint, marginal, conditional from other two
$$\begin{array}{llll}
\mbox{Chain rule}&\displaystyle p(x,y)=p(x)p(y|x)\\
\\
\mbox{Marginalization}&\displaystyle p(x)=\sum_yp(x,y)\\
\mbox{Conditioning}&\displaystyle p(y|x)=\frac{p(x,y)}{p(x)}
\end{array}$$

### Example - From joint to marginal and conditional
The joint PMF of $X$ and $Y$ is given by

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\\\hline
{\bf 2}&\frac{1}{10}&0&\frac{1}{10}&\\\hline
{\bf 1}&0&\frac{2}{10}&\frac{1}{10}&\\\hline
{\bf 0}&\frac{1}{10}&0&\frac{1}{10}&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$

1. Find the marginal PMF of $X$.
2. Find the marginal PMF of $Y$.
3. Find the conditional PMF of $X$ given $Y=1$.
4. Find the conditional PMF of $Y$ given $X=2$.

1. Sum each column to get the marginal PMF of $X$.

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\\\hline
{\bf 2}&\frac{1}{10}&0&\frac{1}{10}&\\\hline
{\bf 1}&0&\frac{2}{10}&\frac{1}{10}&\\\hline
{\bf 0}&\frac{1}{10}&0&\frac{1}{10}&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
&\frac{3}{10}&\frac{3}{10}&\frac{4}{10}&{\bf \mathbb{P}(X=x_i)}\\\hline
\end{array}
$$

2. Sum each row to get the marginal PMF of $Y$.

$$
\begin{array}{|c|c||c|c|c||c|} \hline
{\bf \mathbb{P}(Y=y_j)}& {\bf y_j}&&&&                                     \\\hline\hline  
\frac{3}{10}&{\bf 3}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\\\hline
\frac{2}{10}&{\bf 2}&\frac{1}{10}&0&\frac{1}{10}&\\\hline
\frac{3}{10}&{\bf 1}&0&\frac{2}{10}&\frac{1}{10}&\\\hline
\frac{2}{10}&{\bf 0}&\frac{1}{10}&0&\frac{1}{10}&\\\hline\hline
&&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$

3. To find the conditional PMF of $X$ given $Y=1$,
we remove all the masses except those on the line $y=1$
and then normalize so that the remaining masses total 1.

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&&&&\\\hline
{\bf 2}&&&&\\\hline
{\bf 1}&0&\frac{2}{10}&\frac{1}{10}&\\\hline
{\bf 0}&&&&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&&&&\\\hline
{\bf 2}&&&&\\\hline
{\bf 1}&0&\frac{2}{3}&\frac{1}{3}&\\\hline
{\bf 0}&&&&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
&0&\frac{2}{3}&\frac{1}{3}&{\bf \mathbb{P}(X=x_i|Y=1)}\\\hline
\end{array}
$$


4. To find the conditional PMF of $Y$ given $X=2$,
we remove all the masses except those on the line $x=2$
and then normalize so that the remaining masses total 1.

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&&&\frac{1}{10}&\\\hline
{\bf 2}&&&\frac{1}{10}&\\\hline
{\bf 1}&&&\frac{1}{10}&\\\hline
{\bf 0}&&&\frac{1}{10}&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$

$$
\begin{array}{|c|c||c|c|c||c|} \hline
{\bf \mathbb{P}(Y=y_j|X=2)}&{\bf y_j}&&&&                                     \\\hline\hline  
\frac{1}{4}&{\bf 3}&&&\frac{1}{4}&\\\hline
\frac{1}{4}&{\bf 2}&&&\frac{1}{4}&\\\hline
\frac{1}{4}&{\bf 1}&&&\frac{1}{4}&\\\hline
\frac{1}{4}&{\bf 0}&&&\frac{1}{4}&\\\hline\hline
&&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$
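The four steps of this example can be sketched in a few lines of Python. The dictionary layout keyed by `(x, y)` and the variable names are assumptions made for illustration; the weights are copied from the joint table.

```python
from fractions import Fraction

# Joint PMF from the table above, keyed by (x, y); weights are in tenths.
tenths = {
    (0, 3): 1, (1, 3): 1, (2, 3): 1,
    (0, 2): 1, (1, 2): 0, (2, 2): 1,
    (0, 1): 0, (1, 1): 2, (2, 1): 1,
    (0, 0): 1, (1, 0): 0, (2, 0): 1,
}
joint = {xy: Fraction(w, 10) for xy, w in tenths.items()}
xs, ys = [0, 1, 2], [0, 1, 2, 3]

# Marginalization: column sums give p(x), row sums give p(y).
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Conditioning: keep one row or column and renormalize it.
x_given_y1 = {x: joint[(x, 1)] / p_y[1] for x in xs}
y_given_x2 = {y: joint[(2, y)] / p_x[2] for y in ys}

print(p_x)
print(p_y)
print(x_given_y1)
print(y_given_x2)
```

Dividing by the marginal is the same operation as "remove everything else and normalize": the divisor is exactly the total mass left on the chosen line.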

### Example - 3 red balls and 1 blue ball

There are 3 red balls and 1 blue ball in the bin.
$$
\{{\color{red}r}, {\color{red}r}, {\color{red}r}, {\color{blue}b}\}
$$
We draw a first ball and record its color.
After recording it, we remove the first ball from the bin.
Then we draw a second ball and record its color.
Find the probability that the first ball is red and the second is blue.



$$
X_i=\left\{\begin{array}{ll}
1&\mbox{if the $i$-th chosen ball is blue}\\
0&\mbox{otherwise}
\end{array}\right.
$$


$$\mathbb{P}(X_1=0,X_2=1)=\mathbb{P}(X_1=0)\mathbb{P}(X_2=1|X_1=0)=\frac{3}{4}\times \frac{1}{3}=\frac{1}{4}$$
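The chain-rule answer can be cross-checked by brute force. This is a sketch that assumes every ordered pair of distinct balls is equally likely, which is what drawing without replacement means here.

```python
from fractions import Fraction
from itertools import permutations

balls = ["r", "r", "r", "b"]  # 3 red, 1 blue

# All ordered draws of two distinct balls (identified by position) are equally likely.
draws = list(permutations(range(len(balls)), 2))
favorable = [(i, j) for i, j in draws if balls[i] == "r" and balls[j] == "b"]

prob = Fraction(len(favorable), len(draws))
print(prob)  # 1/4
```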

### Example - CDF

Using the CDF below, find $\mathbb{P}(X=5)$ and $\mathbb{P}(X\ge20)$.


$$\mathbb{P}(X=5)=F(5)-F(5-)=0.2-0=0.2$$


$$\mathbb{P}(X<20)=F(20-)=0.6\ \ \Rightarrow\ \ 
\mathbb{P}(X\ge20)=1-\mathbb{P}(X<20)=0.4$$

[<a href="#Joint,-Marginal,-and-Conditional Distribution">Back to top</a>]

# Independent Random Variables

<div align="center"><img src="img/Screen Shot 2018-01-31 at 1.14.18 PM.png" width="30%"></div>

Bayesian Methods for Machine Learning (Coursera)

<div align="center"><img src="img/Screen Shot 2018-01-31 at 1.18.03 PM.png" width="30%"></div>

Bayesian Methods for Machine Learning (Coursera)

### Independent random variables

$X$ and $Y$ are independent if for any $x$ and $y$
$$
p(x,y)=p(x)p(y)
$$

$X_1, X_2, \cdots, X_n$ are independent if for any $x_1$, $x_2$,$\cdots$, $x_n$
$$
p(x_1,x_2,\ldots,x_n)=p(x_1)p(x_2)\cdots p(x_n)
$$

### Pairwise independent random variables

$X_1, X_2, \cdots, X_n$ are pairwise independent if for any pair $X_i$, $X_j$ with $i\neq j$
$$
\mbox{$X_i$ and $X_j$ are independent}
$$
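Pairwise independence is weaker than independence. A standard illustration, assumed here and not taken from the text, uses two fair bits $X_1$, $X_2$ and their exclusive-or $X_3$; the sketch below checks both definitions by enumeration.

```python
from fractions import Fraction
from itertools import product

# Assumed example: X1, X2 fair bits, X3 = X1 XOR X2; four equally likely outcomes.
outcomes = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

# Every pair factorizes: p(x_i, x_j) = p(x_i) p(x_j).
pairwise = all(
    prob(lambda w: w[i] == u and w[j] == v)
    == prob(lambda w: w[i] == u) * prob(lambda w: w[j] == v)
    for i, j in [(0, 1), (0, 2), (1, 2)]
    for u, v in product([0, 1], repeat=2)
)

# The full joint does not factorize: p(x_1, x_2, x_3) != p(x_1) p(x_2) p(x_3).
mutual = all(
    prob(lambda w: w == t)
    == prob(lambda w: w[0] == t[0]) * prob(lambda w: w[1] == t[1]) * prob(lambda w: w[2] == t[2])
    for t in product([0, 1], repeat=3)
)

print(pairwise, mutual)  # True False
```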

### Conditionally independent random variables

$X_1$, $\cdots$, $X_n$ are conditionally independent conditioned on $Y$ if
for any $x_1$, $\cdots$, $x_n$, $y$
$$
p(x_1,x_2,\ldots,x_n|y)=p(x_1|y)p(x_2|y)\cdots p(x_n|y)
$$

### Example - Two independent random variables

The marginal PMFs of $X$ and $Y$ are given by

$$
\begin{array}{|c|c||c|c|c||c|} \hline
{\bf \mathbb{P}(Y=y_j)}&{\bf y_j}&&&&                                     \\\hline\hline  
\frac{3}{10}&{\bf 3}&&&&\\\hline
\frac{2}{10}&{\bf 2}&&&&\\\hline
\frac{3}{10}&{\bf 1}&&&&\\\hline
\frac{2}{10}&{\bf 0}&&&&\\\hline\hline
&&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
&&\frac{2}{10}&\frac{3}{10}&\frac{5}{10}&{\bf \mathbb{P}(X=x_i)}\\\hline
\end{array}
$$

Suppose $X$ and $Y$ are independent.
Find the joint PMF of $X$ and $Y$, i.e., fill in the blanks in the table below.

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&&&&\\\hline
{\bf 2}&&&&\\\hline
{\bf 1}&&&&\\\hline
{\bf 0}&&&&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$

$$
\begin{array}{|c|c||c|c|c||c|} \hline
{\bf \mathbb{P}(Y=y_j)}&{\bf y_j}&&&&                                     \\\hline\hline  
\frac{3}{10}&{\bf 3}&\frac{2}{10}\times\frac{3}{10}=\frac{6}{100}&\frac{3}{10}\times\frac{3}{10}=\frac{9}{100}&\frac{5}{10}\times\frac{3}{10}=\frac{15}{100}&\\\hline
\frac{2}{10}&{\bf 2}&\frac{2}{10}\times\frac{2}{10}=\frac{4}{100}&\frac{3}{10}\times\frac{2}{10}=\frac{6}{100}&\frac{5}{10}\times\frac{2}{10}=\frac{10}{100}&\\\hline
\frac{3}{10}&{\bf 1}&\frac{2}{10}\times\frac{3}{10}=\frac{6}{100}&\frac{3}{10}\times\frac{3}{10}=\frac{9}{100}&\frac{5}{10}\times\frac{3}{10}=\frac{15}{100}&\\\hline
\frac{2}{10}&{\bf 0}&\frac{2}{10}\times\frac{2}{10}=\frac{4}{100}&\frac{3}{10}\times\frac{2}{10}=\frac{6}{100}&\frac{5}{10}\times\frac{2}{10}=\frac{10}{100}&\\\hline\hline
&&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
&&\frac{2}{10}&\frac{3}{10}&\frac{5}{10}&{\bf \mathbb{P}(X=x_i)}\\\hline
\end{array}
$$
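Under independence each cell is the product of its row and column marginals. A small sketch, with the marginals copied from the table and the dictionary layout assumed for illustration:

```python
from fractions import Fraction

p_x = {0: Fraction(2, 10), 1: Fraction(3, 10), 2: Fraction(5, 10)}
p_y = {0: Fraction(2, 10), 1: Fraction(3, 10), 2: Fraction(2, 10), 3: Fraction(3, 10)}

# Independence: p(x, y) = p(x) p(y) for every cell.
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

# Print rows from y = 3 down to y = 0, as in the table (fractions appear in lowest terms).
for y in sorted(p_y, reverse=True):
    print(y, [str(joint[(x, y)]) for x in sorted(p_x)])
```

Because the marginals each sum to 1, the twelve products automatically sum to 1 as well.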

### Example - Two dependent random variables

The joint PMF of $X$ and $Y$ is given by

$$
\begin{array}{|c||c|c|c||c|} \hline
{\bf y_j}&&&&                                     \\\hline\hline  
{\bf 3}&\frac{1}{6}&\frac{1}{6}&\frac{1}{6}&\\\hline
{\bf 2}&\frac{1}{6}&\frac{1}{6}&0&\\\hline
{\bf 1}&\frac{1}{6}&0&0&\\\hline
{\bf 0}&0&0&0&\\\hline\hline
&{\bf 0}&{\bf 1}&{\bf 2}&{\bf x_i}\\\hline
\end{array}
$$

Determine whether $X$ and $Y$ are independent.


\begin{eqnarray}
\mbox{Conditional PMF $X$ given $Y=1$}
&&
p_{0|1}=1\nonumber\\
\mbox{Conditional PMF $X$ given $Y=2$}
&&
p_{0|2}=1/2,
\quad
p_{1|2}=1/2\nonumber\\
\mbox{Conditional PMF $X$ given $Y=3$}
&&
p_{0|3}=1/3,
\quad
p_{1|3}=1/3,
\quad
p_{2|3}=1/3\nonumber
\end{eqnarray}
The conditional PMF of $X$ given $Y=y_j$ depends on $y_j$
and hence $X$ and $Y$ are dependent.



\begin{eqnarray}
\mbox{Conditional PMF $Y$ given $X=0$}
&&
p_{1|0}=1/3,
\quad
p_{2|0}=1/3,
\quad
p_{3|0}=1/3\nonumber\\
\mbox{Conditional PMF $Y$ given $X=1$}
&&
p_{2|1}=1/2,
\quad
p_{3|1}=1/2\nonumber\\
\mbox{Conditional PMF $Y$ given $X=2$}
&&
p_{3|2}=1\nonumber
\end{eqnarray}
The conditional PMF of $Y$ given $X=x_i$ depends on $x_i$
and hence $X$ and $Y$ are dependent.
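Equivalently, one can test the definition $p(x,y)=p(x)p(y)$ cell by cell. The sketch below does this for the joint table of this example; the names `violations` and `independent` are illustrative.

```python
from fractions import Fraction

sixth, zero = Fraction(1, 6), Fraction(0)
# Joint PMF from the table above, keyed by (x, y).
joint = {
    (0, 3): sixth, (1, 3): sixth, (2, 3): sixth,
    (0, 2): sixth, (1, 2): sixth, (2, 2): zero,
    (0, 1): sixth, (1, 1): zero,  (2, 1): zero,
    (0, 0): zero,  (1, 0): zero,  (2, 0): zero,
}
xs, ys = [0, 1, 2], [0, 1, 2, 3]

p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Cells where the joint mass differs from the product of the marginals.
violations = [(x, y) for x in xs for y in ys if joint[(x, y)] != p_x[x] * p_y[y]]
independent = not violations

print(independent, violations[:3])
```

A single violating cell is enough to conclude dependence; for instance the cell $(2,2)$ has mass 0 although both marginals are positive there.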


[<a href="#Joint,-Marginal,-and-Conditional Distribution">Back to top</a>]

# Distributions related to Coin Flips

\begin{array}{cl}\hline
\mbox{Distribution}&\mbox{Random variable}\\\hline
B(p)&\mbox{Flip a $p$-coin and check whether we have a head}\\
B(n,p)&\mbox{Flip a $p$-coin $n$ times and count the number of heads}\\
Geo(p)&\mbox{Flip a $p$-coin until first head and count the number of flips}\\
NB(r,p)&\mbox{Flip a $p$-coin until $r$-th head and count the number of flips}\\\hline
\end{array}

\begin{array}{ccc}\hline
\mbox{Distribution}&\mbox{Mean}&\mbox{Variance}\\\hline
B(p)&p&pq\\
B(n,p)&np&npq\\
Geo(p)&\frac{1}{p}&\frac{q}{p^2}\\
NB(r,p)&\frac{r}{p}&\frac{rq}{p^2}\\\hline
\end{array}

Here $q=1-p$ is the probability of a tail.
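Two rows of the table can be checked numerically from the PMFs. This is a sketch under assumed parameters ($n=10$, $p=3/10$); the binomial check is exact, while the geometric sum is truncated at an assumed cutoff because its support is infinite.

```python
from fractions import Fraction
from math import comb

def mean_var(pmf):
    """Mean and variance of a PMF given as {value: probability}."""
    mean = sum(k * w for k, w in pmf.items())
    var = sum((k - mean) ** 2 * w for k, w in pmf.items())
    return mean, var

n, p = 10, Fraction(3, 10)   # assumed illustrative parameters
q = 1 - p

# B(n, p): number of heads in n flips.
binomial = {k: comb(n, k) * p**k * q**(n - k) for k in range(n + 1)}
print(mean_var(binomial))    # compare with (n*p, n*p*q)

# Geo(p): number of flips until the first head, P(X = k) = q^(k-1) p.
pf, qf = float(p), float(q)
geometric = {k: qf ** (k - 1) * pf for k in range(1, 400)}
print(mean_var(geometric))   # compare with (1/p, q/p^2)
```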

### Example - How to generate $\pm 1$ random variables from Bernoulli random variables

If $X$ has a Bernoulli distribution $\text{B}(p)$ with success rate $p$,
then 
$$
Y:=2X-1=\left\{\begin{array}{rl}
1&\mbox{with probability}\ p\\
-1&\mbox{with probability}\ 1-p
\end{array}\right.
$$

In particular,
suppose we flip a fair coin $n$ times independently
and we record the $i$-th flip result as $X_i$
by converting $H$ and $T$ to 1 and 0 respectively.
Then $X_i$ is either 1 or 0 equally likely.
Now, let $Y_i=2X_i-1$.
Then
$$
\mbox{$X_i$ iid with}\
X_i=\left\{\begin{array}{ll}
1&\mbox{with prob}\ 0.5\\
0&\mbox{with prob}\ 0.5
\end{array}\right.
\Rightarrow 
\mbox{$Y_i$ iid with}\
Y_i=\left\{\begin{array}{rl}
1&\mbox{with prob}\ 0.5\\
-1&\mbox{with prob}\ 0.5
\end{array}\right.
$$
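A short simulation sketch of the map $Y_i=2X_i-1$. The helper `bernoulli`, the fixed seed, and the sample size are assumptions made for illustration.

```python
import random

def bernoulli(p, rng):
    """Return 1 with probability p and 0 otherwise."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)            # fixed seed, assumed only for reproducibility
xs = [bernoulli(0.5, rng) for _ in range(10_000)]
ys = [2 * x - 1 for x in xs]      # maps 1 -> +1 and 0 -> -1

# The values are +1 or -1, and the sample mean should be near 2p - 1 = 0.
print(set(ys), sum(ys) / len(ys))
```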

### $X+Y$ when $X\sim B(n,p)$, $Y\sim B(m,p)$

\begin{array}{lllllllll}
\mbox{$X\sim B(n,p)$, $Y\sim B(m,p)$}
&\quad\Rightarrow\quad&
X+Y\sim B(n+m,p)
&&\mbox{Not true in general}\nonumber\\
\mbox{$X\sim B(n,p)$, $Y\sim B(m,p)$}
&\quad\Rightarrow\quad&
X+Y\sim B(n+m,p)
&&\mbox{True if $X$, $Y$ independent}\nonumber
\end{array}

### Proof.
We flip the $p$-coin $n$ times and count the number $X$ of heads. 
We flip this coin $m$ times additionally and count the number $Y$ of heads in these additional coin flips. 
Then,
$X+Y$ is simply the number of heads in these $n+m$ coin flips.
So, $X+Y$ is $B(n+m,p)$. 

### Proof. (Divide and conquer)

Divide:
$$\mathbb{P}(X+Y=k)=\sum_{0\le l\le n,\ 0\le k-l\le m}\mathbb{P}(X=l,Y=k-l)$$


Conquer: 
\begin{eqnarray}
\mathbb{P}(X=l,Y=k-l)
&=&\mathbb{P}(X=l)\mathbb{P}(Y=k-l|X=l)\quad(\mbox{chain rule})\nonumber\\
&=&\mathbb{P}(X=l)\mathbb{P}(Y=k-l)\quad(\mbox{$X$ and $Y$ are independent})\nonumber\\
&=&{n\choose l}p^lq^{n-l}{m\choose k-l}p^{k-l}q^{m-(k-l)}\quad(\mbox{$X\sim B(n,p)$, $Y\sim B(m,p)$})\nonumber
\end{eqnarray}


By Vandermonde's identity,
\begin{eqnarray}
\mathbb{P}(X+Y=k)
&=&\sum_{0\le l\le n, 0\le k-l\le m}{n\choose l}p^lq^{n-l}{m\choose k-l}p^{k-l}q^{m-(k-l)}\nonumber\\
&=&\left[\sum_{0\le l\le n, 0\le k-l\le m}{n\choose l}{m\choose k-l}\right]p^{k}q^{n+m-k}\nonumber\\
&=&{n+m\choose k}p^{k}q^{n+m-k}\nonumber
\end{eqnarray}
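The divide-and-conquer computation is a discrete convolution, which can be verified exactly for particular parameters. The values $n=4$, $m=6$, $p=1/3$ below are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p):
    """List of P(X = k), k = 0..n, for X ~ B(n, p)."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

n, m, p = 4, 6, Fraction(1, 3)   # assumed illustrative parameters
px, py = binom_pmf(n, p), binom_pmf(m, p)

# Divide: sum over l with 0 <= l <= n and 0 <= k - l <= m.
conv = [
    sum(px[l] * py[k - l] for l in range(max(0, k - m), min(n, k) + 1))
    for k in range(n + m + 1)
]

print(conv == binom_pmf(n + m, p))  # True
```

The product `px[l] * py[k - l]` is where independence enters; without it the joint probability would not factor and the convolution would not apply.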

### Example - Number of pairs with the same birthday

There are $n$ people in the class.
Suppose each person's birthday is chosen independently and uniformly over the 365 days.
For each pair $i$ and $j$,
we let $A_{ij}$ be the event that $i$ and $j$ share a birthday
and let $1_{A_{ij}}$ be its indicator.
Then the number $X$ of pairs sharing a birthday can be represented as
$$
X=\sum_{1\le i<j\le n}1_{A_{ij}}
$$
Is $X$ binomial?

The indicator random variable $1_{A_{ij}}$ is either 1 or 0.
So it is a Bernoulli random variable with success rate $p=P(A_{ij})=1/365$;
$$1_{A_{ij}}\sim B(p)$$

\begin{eqnarray}
\mbox{$1_{A_{ij}}$ are  independent}\quad\ \
\ \ \Rightarrow\ \ 
X\sim B(m,p),\quad m={n\choose 2}
&&\mbox{Not true}\nonumber\\
\mbox{$1_{A_{ij}}$ are not independent}
\ \ \Rightarrow\ \ 
X\ \mbox{not binomial}\quad\quad\quad\quad\ \
&&\mbox{True}\nonumber
\end{eqnarray}

Suppose 1 and 2 share a birthday
and suppose 1 and 3 share a birthday.
Then, of course, 2 and 3 share a birthday.
In other words,
$$
P(A_{23}|A_{12},A_{13})=1\neq\frac{1}{365}=P(A_{23})
$$
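The same effect can be seen by listing every outcome for a toy calendar. The sketch assumes 4 days and 3 people instead of 365 days and $n$ people, purely so that the enumeration stays small.

```python
from fractions import Fraction
from itertools import product

days, people = 4, 3   # assumed small calendar so every outcome can be listed
outcomes = list(product(range(days), repeat=people))

def prob(event):
    """Probability of an event under the uniform distribution on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

a12 = lambda w: w[0] == w[1]
a13 = lambda w: w[0] == w[2]
a23 = lambda w: w[1] == w[2]

p_a23 = prob(a23)
p_a23_given = prob(lambda w: a12(w) and a13(w) and a23(w)) / prob(lambda w: a12(w) and a13(w))
print(p_a23, p_a23_given)  # 1/4 1

# Number of matching pairs: with 3 people, exactly 2 matching pairs cannot occur.
x_values = {a12(w) + a13(w) + a23(w) for w in outcomes}
print(sorted(x_values))    # [0, 1, 3]
```

A binomial variable $B(3,p)$ with $0<p<1$ puts positive mass on the value 2, so the missing value 2 confirms that $X$ is not binomial in this toy case.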

# Distributions related to Dice Rolling

\begin{array}{cl}\hline
\mbox{Distribution}&\mbox{Random variable}\\\hline
Cat({\bf p})&\mbox{Roll a ${\bf p}$-die and check the result}\\
Mul(n,{\bf p})&\mbox{Roll a ${\bf p}$-die $n$ times and count how many times each outcome occurs}\\\hline
\end{array}

### Parameter ${\bf p}=(p_1,\ldots,p_K)$ 
$$
p_j\ge 0,\quad\quad\quad\quad\sum_{j=1}^Kp_j=1
$$
### Categorical distribution $Cat({\bf p})$
$$
\mathbb{P}(X_1=0,\ldots,X_j=1,\ldots,X_K=0)=p_j
$$
### Multinomial distribution $Mul(n,{\bf p})$
$$
\mathbb{P}(X_1=n_1,\ldots,X_j=n_j,\ldots,X_K=n_K)=\frac{n!}{n_1!\cdots n_K!}p_1^{n_1}\cdots p_K^{n_K},\quad\quad n_1+\cdots+n_K=n
$$
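A minimal sketch of the multinomial PMF, checking that the probabilities over all count vectors add up to 1. The function name `multinomial_pmf` and the loaded three-sided die ${\bf p}=(1/2,1/3,1/6)$ are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def multinomial_pmf(counts, p):
    """P(X_1 = n_1, ..., X_K = n_K), where n is the sum of the counts."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)          # n! / (n_1! ... n_K!)
    return coef * prod(pj**c for pj, c in zip(p, counts))

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed loaded three-sided die
n = 4

# Sum over every count vector (n_1, ..., n_K) with n_1 + ... + n_K = n.
total = sum(
    multinomial_pmf(c, p)
    for c in product(range(n + 1), repeat=len(p))
    if sum(c) == n
)
print(total)  # 1
```

With $K=2$ the same function reduces to the binomial PMF, which ties the dice-rolling distributions back to the coin-flip ones.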

[<a href="#Joint,-Marginal,-and-Conditional Distribution">Back to top</a>]