# Joint mass function

:::{.callout-note}
Most examples will involve two random variables, but everything can be generalized for more of them.
:::

## Definition
Suppose two discrete random variables $X$ and $Y$ are defined on a common probability space and take the values
$x_1, x_2, \dots$ and $y_1, y_2, \dots,$ respectively. Their joint probability mass function is defined as <br>
$$p(x_i , y_j ) = P\{X = x_i , Y = y_j\}, \qquad i = 1, 2, \dots , \; j = 1, 2, \dots .$$
This function contains all information about the joint distribution of $X$ and $Y$.<br>
Any joint mass function satisfies:
$$p(x,y)\ge 0, \; \forall x,y \in\mathbb{R}$$
$$\sum_{i,j}p(x_i,y_j)=1$$
Conversely, any function with these two properties is a joint probability mass function.

## Marginal mass functions 
The marginal mass functions are <br>
$$p_X(x_i):=P\{X=x_i\}$$
and
$$p_Y(y_j):=P\{Y=y_j\}$$
They are obtained from the joint mass function by summing over the other variable:
$$p_X(x_i)=\sum_jp(x_i,y_j)$$
and
$$p_Y(y_j)=\sum_ip(x_i,y_j)$$

## Example
An urn contains $3$ red, $4$ white and $5$ black balls. We draw $3$ balls at once; let $X$ be the number of red and $Y$ the number of white balls drawn.
The joint mass function, together with the marginals, is:

| $Y \downarrow$  $X \rightarrow$ | 0 | 1 | 2 | 3 |$p_Y(\cdot)$
|---------|:-----|------:|:------:|:------:|:------:|
| __0__    | $\displaystyle \frac{\binom{5}{3}}{\binom{12}{3}}$   |   $\displaystyle \frac{\binom{3}{1}\binom{5}{2}}{\binom{12}{3}}$  |    $\displaystyle \frac{\binom{3}{2}\binom{5}{1}}{\binom{12}{3}}$   |  $\displaystyle \frac{\binom{3}{3}\binom{5}{0}}{\binom{12}{3}}$ |  $\displaystyle \frac{\binom{8}{3}}{\binom{12}{3}}$
| __1__     |  $\displaystyle \frac{\binom{4}{1}\binom{5}{2}}{\binom{12}{3}}$   |    $\displaystyle \frac{\binom{4}{1}\binom{3}{1}\binom{5}{1}}{\binom{12}{3}}$  |   $\displaystyle \frac{\binom{4}{1}\binom{3}{2}}{\binom{12}{3}}$    | 0| $\displaystyle \frac{\binom{4}{1}\binom{8}{2}}{\binom{12}{3}}$ 
| __2__       |  $\displaystyle \frac{\binom{4}{2}\binom{5}{1}}{\binom{12}{3}}$    |      $\displaystyle \frac{\binom{4}{2}\binom{3}{1}}{\binom{12}{3}}$  |   0    |0| $\displaystyle \frac{\binom{4}{2}\binom{8}{1}}{\binom{12}{3}}$ 
| __3__       |  $\displaystyle \frac{\binom{4}{3}}{\binom{12}{3}}$     |     0 |   0    |  0|   $\displaystyle \frac{\binom{4}{3}}{\binom{12}{3}}$ 
| $p_X(\cdot)$       |   $\displaystyle \frac{\binom{9}{3}}{\binom{12}{3}}$    |      $\displaystyle \frac{\binom{3}{1}\binom{9}{2}}{\binom{12}{3}}$  |    $\displaystyle \frac{\binom{3}{2}\binom{9}{1}}{\binom{12}{3}}$   |  $\displaystyle \frac{\binom{3}{3}}{\binom{12}{3}}$ | 1

: Joint probability mass function of $X$ and $Y$, with the marginals in the last row and column.
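
The table can be reproduced with a few lines of Python. This is only a sketch for checking the entries: the helper name `joint` and the constants are illustrative, and exact fractions are used so that the marginals can be compared with the last row and column without rounding.

```python
from fractions import Fraction
from math import comb

RED, WHITE, BLACK, DRAWN = 3, 4, 5, 3
TOTAL = comb(RED + WHITE + BLACK, DRAWN)  # number of equally likely draws

def joint(x, y):
    """P{X = x, Y = y}: x red, y white, and the remaining balls black."""
    black = DRAWN - x - y
    if black < 0:
        return Fraction(0)  # more than three balls would be needed
    return Fraction(comb(RED, x) * comb(WHITE, y) * comb(BLACK, black), TOTAL)

values = range(DRAWN + 1)
# Marginals: sum the joint mass function over the other variable.
p_X = {x: sum(joint(x, y) for y in values) for x in values}
p_Y = {y: sum(joint(x, y) for x in values) for y in values}
```

Summing `joint(x, y)` over all pairs gives $1$, which matches the bottom-right cell of the table.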



# Conditional mass function

## Definition 

Suppose $p_Y(y_j)>0$. The conditional mass function of $X$ given $Y=y_j$ is defined by 
$$p_{X \mid Y}(x \mid y_j):= P\{X=x \mid Y=y_j\}=\frac{\overbrace{p(x,y_j)}^{\text{joint}} }{\underbrace{p_Y(y_j)}_{\text{marginal}} }$$
Since conditional probability is itself a proper probability, this is a proper mass function: for all $x$ and all $y_j$ with $p_Y(y_j)>0$,
$$p_{X \mid Y}(x \mid y_j) \ge 0, \qquad \sum_i p_{X \mid Y}(x_i \mid y_j)=1$$

### Example

Let $X$ and $Y$ have joint mass function 

| $X \downarrow$  $Y \rightarrow$| 0 | 1 | 
|---------|:-----|------:|
| __0__      | 0.4   |  0.2 |
| __1__     | 0.1  | 0.3 |


: Joint mass function of $X$ and $Y$.

The conditional distribution of $X$ given $Y=0$ is
$$p_{X\mid Y}(0 \mid 0)= \frac{p(0,0)}{p_Y(0)}= \frac{p(0,0)}{p(0,0)+p(1,0)}=\frac{0.4}{0.4+0.1}=\frac{4}{5}$$ 
$$p_{X\mid Y}(1 \mid 0)= \frac{p(1,0)}{p_Y(0)}= \frac{p(1,0)}{p(0,0)+p(1,0)}=\frac{0.1}{0.4+0.1}=\frac{1}{5}$$ 
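
The same computation can be sketched in Python. The function name `conditional_x_given_y` and the dictionary layout keyed by `(x, y)` are assumptions made for this illustration; the probabilities are those of the table.

```python
from fractions import Fraction

# Joint mass function from the table, keyed by (x, y).
joint = {
    (0, 0): Fraction(4, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(3, 10),
}

def conditional_x_given_y(joint, y):
    """Return {x: p_{X|Y}(x | y)}; requires p_Y(y) > 0."""
    p_y = sum(p for (_, yj), p in joint.items() if yj == y)  # marginal of Y
    if p_y == 0:
        raise ValueError("conditioning on an event of probability zero")
    return {x: p / p_y for (x, yj), p in joint.items() if yj == y}

given_y0 = conditional_x_given_y(joint, 0)
given_y1 = conditional_x_given_y(joint, 1)
```

Here `given_y0` holds the two values computed above, and each conditional distribution sums to $1$.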

# Independent random variables

## Definition

Random variables $X$ and $Y$ are independent if the events formulated with them are independent, that is, if for every $A, B \subseteq \mathbb{R}$
$$P\{X\in A,Y\in B\}=P\{X\in A\} \cdot P \{ Y \in B\} $$

:::{.callout-tip}
The abbreviation i.i.d. is used for independent and identically distributed random variables.
:::

Two random variables $X$ and $Y$ are independent if and only if their joint mass function factorizes into the product of the marginals:
$$p(x_i,y_j)=p_X(x_i)\cdot p_Y(y_j), \qquad \forall x_i,y_j$$  
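
As a quick illustration, take the joint mass function from the example in the conditional mass function section, with $p(0,0)=0.4$, $p(0,1)=0.2$, $p(1,0)=0.1$ and $p(1,1)=0.3$. A single pair for which the factorization fails is enough to rule out independence:

$$p_X(0)\cdot p_Y(0)=(0.4+0.2)\cdot(0.4+0.1)=0.6\cdot 0.5=0.3\neq 0.4=p(0,0)$$

So $X$ and $Y$ in that example are not independent.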

## Discrete Convolution

Let $X$ and $Y$ be independent, integer-valued random variables with respective mass functions $p_X$ and $p_Y$. Then
$$p_{X+Y}(k) = \sum_{i=-\infty}^\infty p_X(k - i) \cdot p_Y (i), \qquad (\forall k \in \mathbb{Z})$$
This formula is called the discrete convolution of the mass functions $p_X$ and $p_Y$.

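- Proof 
  $$\begin{align*}
  p_{X+Y}(k)&=P\{X+Y=k\}\\
  &=\sum_{i=-\infty}^\infty P\{X=k-i,Y=i\}\\
  &=\sum_{i=-\infty}^\infty p_X(k-i)\cdot p_Y(i)
  \end{align*}$$
  The second equality splits the event $\{X+Y=k\}$ according to the value of $Y$, and the third uses independence.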
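
For mass functions with finitely many values the convolution can be computed directly. The sketch below is one possible way to do it: it loops over pairs $(x, y)$ and adds $p_X(x)\,p_Y(y)$ to the entry for $k = x + y$, which is the same sum as in the formula with $x = k - i$ and $y = i$. The function name `convolve` and the fair-die example are illustrative choices, not part of the notes.

```python
from collections import defaultdict
from fractions import Fraction

def convolve(p_x, p_y):
    """Mass function of X + Y for independent X, Y given as {value: probability}."""
    p_sum = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, px in p_x.items():
        for y, py in p_y.items():
            p_sum[x + y] += px * py  # independence: P{X=x, Y=y} = p_X(x) p_Y(y)
    return dict(p_sum)

# Illustrative example: the sum of two fair dice.
die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve(die, die)
```

The result `two_dice` is supported on $2, \dots, 12$ and its values sum to $1$.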

Let $X \sim \textrm{Poi}(\lambda)$ and $Y \sim \textrm{Poi}(\mu)$ be independent. Then <br>
$X+Y \sim \textrm{Poi}(\lambda+\mu)$

- Proof
  $$\begin{align*}
  p_{X+Y}(k)&=\sum_{i=-\infty}^\infty p_X(k-i)\cdot p_Y(i) \\
  &= \sum_{i=0}^{k} \frac{\lambda^{k-i}}{(k-i)!}e^{-\lambda}\cdot \frac{\mu^i}{i!}e^{-\mu} \\
  &= e^{-\lambda - \mu}\frac{1}{k!}\sum_{i=0}^{k} \frac{k!}{(k-i)!\cdot i!}\lambda^{k-i} \cdot \mu^i \\
  &= e^{-\lambda - \mu}\frac{1}{k!}\sum_{i=0}^{k} \binom{k}{i} \lambda^{k-i} \cdot \mu^i \\
  &= e^{-(\lambda + \mu)}\frac{(\lambda + \mu)^k}{k!}
  \end{align*}$$
  The sum runs only over $0 \le i \le k$ because the Poisson mass function vanishes on negative integers, and the last step is the binomial theorem. The result is the $\textrm{Poi}(\lambda+\mu)$ mass function at $k$.
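
A numerical sanity check of this statement, as a sketch: the rates `lam` and `mu` are arbitrary illustrative values, and the comparison uses floating point, so it confirms agreement only up to rounding.

```python
from math import exp, factorial, isclose

def poisson_pmf(k, rate):
    """Poisson mass function; zero on negative integers."""
    return rate**k / factorial(k) * exp(-rate) if k >= 0 else 0.0

lam, mu = 1.5, 2.5  # arbitrary illustrative rates

def sum_pmf(k):
    # Discrete convolution; only 0 <= i <= k contributes.
    return sum(poisson_pmf(k - i, lam) * poisson_pmf(i, mu) for i in range(k + 1))

# Compare with the Poi(lam + mu) mass function for the first few k.
all_match = all(isclose(sum_pmf(k), poisson_pmf(k, lam + mu)) for k in range(30))
```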

Let $X,Y$ be i.i.d. $\textrm{Geom}(p)$ variables. Then <br>
$X+Y$ is not geometric.

- Proof
  $$\begin{align*}
  p_{X+Y}(k)&=\sum_{i=-\infty}^\infty p_X(k-i)\cdot p_Y(i) \\
  &=\sum_{i=1}^{k-1}(1-p)^{k-i-1}p\cdot (1-p)^{i-1}p \\
  &=(k-1)(1-p)^{k-2}p^2, \qquad k = 2, 3, \dots
  \end{align*}$$  
  Hence $X+Y$ is not geometric; _its distribution is called Negative Binomial._
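
The closed form can be checked with exact arithmetic. In the sketch below the success probability `p` is an arbitrary illustrative value, and the geometric distribution is taken on $\{1, 2, \dots\}$ as in the proof. A geometric mass function has a constant ratio between consecutive values, so comparing two such ratios shows that the sum is not geometric.

```python
from fractions import Fraction

p = Fraction(1, 3)  # arbitrary illustrative success probability

def geom_pmf(k):
    """Geom(p) mass function on k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p if k >= 1 else Fraction(0)

def sum_pmf(k):
    # Discrete convolution; both factors vanish unless 1 <= i <= k - 1.
    return sum(geom_pmf(k - i) * geom_pmf(i) for i in range(1, k))

def closed_form(k):
    return (k - 1) * (1 - p) ** (k - 2) * p**2

# Ratios of consecutive values; these would be equal for a geometric law.
ratios = [sum_pmf(k + 1) / sum_pmf(k) for k in (2, 3)]
```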

Let $X \sim \textrm{Binom}(n,p)$ and $Y \sim \textrm{Binom}(m,p)$ be independent (__notice the same__ $\mathbf p$). Then <br>
$X+Y \sim \textrm{Binom}(n+m,p)$

- Proof
  $$\begin{align*}
  p_{X+Y}(k)&=\sum_{i=-\infty}^\infty p_X(k-i)\cdot p_Y(i) \\
  &=\sum_{i=0}^k \binom{n}{k-i}p^{k-i}(1-p)^{n-k+i}\cdot \binom{m}{i}p^i(1-p)^{m-i} \\
  &=p^k(1-p)^{m+n-k}\sum_{i=0}^k \binom{n}{k-i} \binom{m}{i}  \\
  &=\binom{m+n}{k}  p^k(1-p)^{m+n-k}
  \end{align*}$$  
  This is the $\textrm{Binom}(n+m,p)$ mass function at $k$. The last step uses the identity $\sum_{i=0}^k \binom{n}{k-i} \binom{m}{i}=\binom{m+n}{k}$ (Vandermonde's identity).
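
The combinatorial identity used in the last step can be verified for small cases. This is a minimal sketch with arbitrary illustrative sizes `n` and `m`; the function name `vandermonde_lhs` is made up for the example.

```python
from math import comb

def vandermonde_lhs(n, m, k):
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so terms outside the support of either binomial drop out on their own.
    return sum(comb(n, k - i) * comb(m, i) for i in range(k + 1))

n, m = 4, 6  # arbitrary illustrative sizes
identity_holds = all(
    vandermonde_lhs(n, m, k) == comb(n + m, k) for k in range(n + m + 1)
)
```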

## Continuous convolution

Suppose $X$ and $Y$ are independent continuous random variables with respective densities $f_X$ and $f_Y$. Then their sum is a continuous random variable with density
$$f_{X+Y}(a)=\int_{- \infty} ^ \infty f_X(a-y)\cdot f_Y(y)dy, \qquad (\forall a \in \mathbb{R})$$

## Gamma distribution

Let $X$ and $Y$ be i.i.d. $\mathrm{Exp}(\lambda)$. The density of their sum is, for $a\ge 0$,

$$\begin{align*}
f_{X+Y}(a)&=\int_{-\infty}^\infty f_X(a-y) \cdot f_Y(y)dy \\
&=\int_{0}^a \lambda e^{-\lambda (a-y)}\cdot \lambda e^{-\lambda y}dy \\
&=\lambda^2 e^{-\lambda a}\cdot y \Big |_0^a \\
&= \lambda^2 a \cdot e^{-\lambda a}
\end{align*}$$
This density is called $\mathrm{Gamma}(2,\lambda)$

<br><br>

Now let $X \sim \mathrm{Exp}(\lambda)$ and $Y \sim \mathrm{Gamma}(2,\lambda)$ be independent. Then, for $a\ge 0$,

$$\begin{align*}
f_{X+Y}(a)&=\int_{-\infty}^\infty f_X(a-y) \cdot f_Y(y)dy \\
&=\int_{0}^a \lambda e^{-\lambda (a-y)}\cdot \lambda^2 y \cdot e^{-\lambda y}dy \\
&=\lambda^3 e^{-\lambda a}\cdot \frac{y^2}{2} \Big |_0^a \\
&= \frac{\lambda^3 a^2 \cdot e^{-\lambda a}}{2}
\end{align*}$$
This density is called $\mathrm{Gamma}(3,\lambda)$

<br><br>

Next let $X \sim \mathrm{Exp}(\lambda)$ and $Y \sim \mathrm{Gamma}(3,\lambda)$ be independent. Then, for $a\ge 0$,

$$\begin{align*}
f_{X+Y}(a)&=\int_{-\infty}^\infty f_X(a-y) \cdot f_Y(y)dy \\
&=\int_{0}^a \lambda e^{-\lambda (a-y)}\cdot\frac{\lambda^3 y^2 \cdot e^{-\lambda y}}{2}dy \\
&=\lambda^4 e^{-\lambda a}\cdot \frac{y^3}{2\cdot 3} \Big |_0^a \\
&= \frac{\lambda^4 a^3 \cdot e^{-\lambda a}}{2\cdot 3} \\
&= \frac{\lambda^4 a^3 \cdot e^{-\lambda a}}{3!}
\end{align*}$$
This density is called $\mathrm{Gamma}(4,\lambda)$


<br><br><br>
Continuing in this way, the convolution of $n$ i.i.d. $\mathrm{Exp}(\lambda)$ distributions results in the $\mathrm{Gamma}(n,\lambda)$ density:

$$f(x)=\frac{\lambda^n x^{n-1} \cdot e^{-\lambda x}}{(n-1)!},\qquad \forall x\ge 0 \tag{1}$$
and zero otherwise.
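
The three convolutions computed by hand can be cross-checked numerically. The sketch below approximates the convolution integral with a midpoint rule on $[0, a]$, which is valid here only because both densities vanish on the negative half-line. The rate `lam`, the evaluation point `a`, the step count and the helper names are all illustrative assumptions.

```python
from math import exp, factorial, isclose

lam = 2.0  # arbitrary illustrative rate

def gamma_density(n):
    """Density of Gamma(n, lam) for a positive integer n; n = 1 is Exp(lam)."""
    def f(x):
        if x < 0:
            return 0.0
        return lam**n * x ** (n - 1) * exp(-lam * x) / factorial(n - 1)
    return f

def convolve_at(f, g, a, steps=10_000):
    """Midpoint-rule approximation of the integral of f(a - y) g(y) over [0, a]."""
    h = a / steps
    mids = ((j + 0.5) * h for j in range(steps))
    return h * sum(f(a - y) * g(y) for y in mids)

a = 1.5  # arbitrary evaluation point
# Exp * Gamma(n) should agree with Gamma(n + 1) for n = 1, 2, 3.
checks = [
    isclose(convolve_at(gamma_density(1), gamma_density(n), a),
            gamma_density(n + 1)(a), rel_tol=1e-6)
    for n in (1, 2, 3)
]
```
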
<br><br><br><br>
In particular, $\mathrm{Gamma}(1,\lambda) \equiv  \mathrm{Exp}(\lambda)$.
<br><br><br><br>

Since $f$ is a density that vanishes for $x<0$, its integral over $[0,\infty)$ must equal $1$:
$$\begin{align*}
\int_{0}^\infty f(x)\,dx &= \int_{0}^\infty \frac{\lambda^n x^{n-1} \cdot e^{-\lambda x}}{(n-1)!}\, dx \\
&=  \int_{0}^\infty \frac{ (\lambda x)^{n-1} \cdot e^{-\lambda x}}{(n-1)!} \lambda\, dx \\
&=1
\end{align*}$$
Substituting $z= \lambda x, \;dz=\lambda\, dx$ in the equation above, we get:
$$(n-1)! = \int_{0}^\infty  z^{n-1} \cdot e^{-z}\,  dz $$

The Gamma function is defined for every real number $\alpha > 0$ by
$$\Gamma(\alpha) :=\int_{0}^\infty  z^{\alpha-1} \cdot e^{-z}\,  dz$$
In particular, $\Gamma(n)=(n-1)!$ for every positive integer $n$.
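
The relation between the Gamma function and factorials can be checked with Python's standard library, which provides `math.gamma`. The sketch also evaluates one non-integer argument, using the known value $\Gamma(1/2)=\sqrt{\pi}$, to show that the function is defined between the integers as well.

```python
from math import factorial, gamma, isclose, pi, sqrt

# Gamma(n) = (n - 1)! for positive integers n (compared up to rounding).
ints_match = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 15))

# A non-integer argument: Gamma(1/2) equals sqrt(pi).
half_matches = isclose(gamma(0.5), sqrt(pi))
```
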
<br><br>
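Replacing $(n-1)!$ by $\Gamma(\alpha)$ in equation $(1)$ extends the definition to every real $\alpha>0$. The $\mathrm{Gamma}(\alpha,\lambda)$ density is

$$f(x)=\frac{\lambda^\alpha x^{\alpha-1} \cdot e^{-\lambda x}}{\Gamma(\alpha)}, \qquad \forall x\ge 0$$
and zero otherwise.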

<br><br>
If $X \sim \mathrm{Gamma}(\alpha , \lambda),$ then

$$EX= \frac{\alpha}{\lambda}, \qquad \mathrm{Var}X=\frac{\alpha}{\lambda ^2}$$
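
A sketch of where the expectation comes from, assuming the density $\lambda^\alpha x^{\alpha-1} e^{-\lambda x}/\Gamma(\alpha)$ on $x \ge 0$ and the recursion $\Gamma(\alpha+1)=\alpha\,\Gamma(\alpha)$, which follows from integration by parts:

$$\begin{align*}
EX &= \int_0^\infty x\cdot \frac{\lambda^\alpha x^{\alpha-1} e^{-\lambda x}}{\Gamma(\alpha)}\,dx \\
&= \frac{1}{\lambda\,\Gamma(\alpha)}\int_0^\infty (\lambda x)^{\alpha} e^{-\lambda x}\,\lambda\,dx \\
&= \frac{\Gamma(\alpha+1)}{\lambda\,\Gamma(\alpha)} = \frac{\alpha}{\lambda}
\end{align*}$$

For integer $\alpha = n$ this agrees with adding up the expectations $1/\lambda$ of $n$ exponential variables.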

# Expectation, covariance

## Expectation

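For a discrete random variable $X$ with mass function $p$, and for a continuous random variable $X$ with density $f$, the expectation is defined respectively as
$$EX = \sum_i x_i\cdot p(x_i), \qquad EX=\int_{-\infty}^\infty x f(x)\,dx$$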

### Properties of Expectation

- Simple monotonicity property:
  If $a \le X \le b$, then $a \le EX \le b$. <br>
  Proof:<br>
  $$\begin{align*}
  a =a\cdot 1 &=a \sum_i p(x_i) \\
  & \le\sum_i x_i\, p(x_i) \\
  & \le b \sum_i p(x_i) =b \cdot 1=b
  \end{align*}$$
- Expectation of _functions of variables_<br>
  Let $X$ and $Y$ be random variables with joint mass function $p$, and let $g: \mathbb{R}\times \mathbb{R} \rightarrow \mathbb{R}$ be a function. Then 
  $$\mathbf{E}g(X,Y)=\sum_{i,j}g(x_i,y_j)\cdot p(x_i,y_j)$$
- Expectation of sums and differences:<br>
  - Let $X$ and $Y$ be random variables. Then $E(X+Y)=EX+EY$ and $E(X-Y)=EX-EY$.<br>
    Proof:
    $$\begin{align*}
    E(X\pm Y) &= \sum _{i,j}(x_i \pm y_j)\cdot p(x_i, y_j) \\
    &=\sum_i \sum_j x_i \cdot p(x_i, y_j) \pm \sum_i \sum_j y_j \cdot p(x_i, y_j) \\
    &=\sum_i x_i\cdot p_X(x_i) \pm \sum_j y_j\cdot p_Y(y_j)\\
    &=EX \pm EY
    \end{align*}$$
  - Let $X$ and $Y$ be random variables such that $X \le Y$. Then $EX \le EY$.<br>
    Proof: <br>
    The difference $Y-X$ is non-negative, so its expectation is also non-negative:
    $$\begin{align*}
    & &E(Y-X) &\ge0 \\
    & \Rightarrow & EY-EX &\ge 0 \\
    & \Rightarrow & -EX &\ge -EY \\
    &\Rightarrow & EX &\le EY
    \end{align*}$$
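
The formula for $\mathbf{E}g(X,Y)$ and the rule for sums and differences can be tried out on a small joint mass function. The sketch below reuses the $2\times 2$ table from the conditional mass function example; the helper name `expect` is an illustrative choice.

```python
from fractions import Fraction

# Joint mass function keyed by (x, y), taken from the 2x2 example table.
joint = {
    (0, 0): Fraction(4, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(3, 10),
}

def expect(g):
    """E g(X, Y): sum of g(x, y) * p(x, y) over all pairs (x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

EX = expect(lambda x, y: x)          # uses only the X coordinate
EY = expect(lambda x, y: y)          # uses only the Y coordinate
E_sum = expect(lambda x, y: x + y)   # should equal EX + EY
E_diff = expect(lambda x, y: x - y)  # should equal EX - EY
```

Note that these variables are not independent, so the check illustrates that the sum rule does not require independence.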


  