## Theory

- Just like discrete random variables, continuous random variables have covariances. As in section 44, the only change is that we write the expectations as integrals rather than summations; the underlying logic is exactly the same.

#### Section 44.1 (Covariance)

- **Definition 44.1**: If X and Y are random variables, the covariance of X and Y, written $Cov[X,Y]$, is defined as $$Cov[X,Y] = E[(X - E[X])(Y - E[Y])]$$

- **Theorem 44.1**: The covariance can equivalently be computed as $$Cov[X,Y] = E[XY] - E[X]E[Y]$$
    - Proof:
    - $$\begin{align}
        Cov[X,Y] &= E[(X-E[X])(Y - E[Y])] \\
        &= E[XY - XE[Y] - YE[X] + E[X]E[Y]] \\
        &= E[XY] - E[X]E[Y] - E[Y]E[X] + E[X]E[Y] \\
        &= E[XY] - E[X]E[Y] 
        \end{align}$$

- **Example 44.1 (Covariance Between the First and Second Arrival Times)**: In Example 41.1, we saw that the joint density of the first arrival time X and the second arrival time Y in a Poisson process of rate $\lambda = 0.8$ is $$ f(x,y) = \begin{cases} 0.64 e^{-0.8y} & 0 \lt x \lt y \\ 0 & \text{otherwise} \end{cases}$$ What is the covariance between X and Y?
    - **Note:** This is tedious to derive directly, but we'll do it for practice
    - Intuitively, the later X occurs, the later Y occurs, since by definition the second arrival cannot come before the first. So we should expect the covariance to be positive

    - First, we make use of the relationship $Cov[X,Y] = E[XY] - E[X]E[Y]$
    - $E[X] = \frac{1}{\lambda}$
        - because X follows an exponential distribution with $\lambda = 0.8$
    - $E[Y] = \frac{1}{\lambda} + \frac{1}{\lambda} = \frac{2}{\lambda}$
        - We know that Y is the sum of 2 exponential random variables (because Y is the time to arrival 1 + the time between arrival 1 and arrival 2). 
        - These interarrival times are independent and identically distributed by the Poisson process assumptions. 
        - So $E[Y] = E[X_1] + E[X_2] = 2 \cdot E[X_1] = 2 \cdot \frac{1}{\lambda}$
    - $E[XY] = 4.6875$
        - $$\begin{align}
            E[XY] &= \int_{y=0}^{\infty} \int_{x=0}^{y} xy \cdot f(x,y) \, dx \, dy \\ 
            &= \int_{y=0}^{\infty} y \cdot 0.64 e^{-0.8y} \int_{x=0}^{y} x \, dx \, dy & f(x,y) \text{ does not depend on } x \\ 
            &= \int_{y=0}^{\infty} y \cdot 0.64 e^{-0.8y} \cdot \frac{y^2}{2} \, dy \\ 
            &= \int_{y=0}^{\infty} y^3 \cdot 0.32 e^{-0.8y} \, dy & u = y^3, du = 3y^2 dy, dv = 0.32 e^{-0.8y} dy, v = -0.4 e^{-0.8y} \\ 
            &= \left[-y^3 \cdot 0.4 e^{-0.8y}\right]^{\infty}_{0} + \int_{y=0}^{\infty} y^2 \cdot 1.2 e^{-0.8y} \, dy & u = y^2, du = 2y \, dy, dv = 1.2 e^{-0.8y} dy, v = -1.5 e^{-0.8y} \\ 
            &= \left[-y^3 \cdot 0.4 e^{-0.8y} - y^2 \cdot 1.5 e^{-0.8y}\right]^{\infty}_{0} + \int_{y=0}^{\infty} y \cdot 3 e^{-0.8y} \, dy & u = y, du = dy, dv = 3 e^{-0.8y} dy, v = -3.75 e^{-0.8y} \\ 
            &= \left[-y^3 \cdot 0.4 e^{-0.8y} - y^2 \cdot 1.5 e^{-0.8y} - y \cdot 3.75 e^{-0.8y}\right]^{\infty}_{0} + \int_{y=0}^{\infty} 3.75 e^{-0.8y} \, dy \\ 
            &= \left[-y^3 \cdot 0.4 e^{-0.8y} - y^2 \cdot 1.5 e^{-0.8y} - y \cdot 3.75 e^{-0.8y} - 4.6875 e^{-0.8y}\right]^{\infty}_{0} \\ 
            &= (-0-0-0-0) - (-0-0-0-4.6875) \\
            &= 4.6875
            \end{align}$$
    - $Cov[X,Y] = E[XY] - E[X]E[Y] = 4.6875 - \frac{2}{0.8^2} = 4.6875 - 3.125 = 1.5625$, which is positive, as expected
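
As a sanity check on the integration above, the following minimal Python sketch evaluates $E[XY]$ two ways: with the standard identity $\int_0^\infty y^n e^{-\lambda y} \, dy = \frac{n!}{\lambda^{n+1}}$, and with a crude midpoint rule. The function names, the truncation point `upper`, and the step count are illustrative assumptions, not part of the original example.

```python
import math

LAM = 0.8  # rate of the Poisson process in Example 44.1


def e_xy_closed_form(lam: float) -> float:
    """E[XY] via the identity: integral of y^n e^{-lam*y} over [0, inf) = n!/lam^(n+1).

    Integrating x from 0 to y leaves (lam^2 / 2) * y^3 * e^{-lam*y} to integrate over y.
    """
    return (lam**2 / 2) * math.factorial(3) / lam**4


def e_xy_midpoint(lam: float, upper: float = 60.0, steps: int = 200_000) -> float:
    """E[XY] via a midpoint rule on [0, upper].

    Assumption: the tail beyond `upper` is negligible, which is reasonable
    here because the integrand decays like e^{-lam*y}.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (lam**2 / 2) * y**3 * math.exp(-lam * y)
    return total * h


e_xy = e_xy_closed_form(LAM)
cov_xy = e_xy - (1 / LAM) * (2 / LAM)  # E[XY] - E[X]E[Y]

print(f"E[XY] closed form : {e_xy:.4f}")
print(f"E[XY] midpoint    : {e_xy_midpoint(LAM):.4f}")
print(f"Cov[X, Y]         : {cov_xy:.4f}")
```

Both estimates of $E[XY]$ should agree with the hand computation to several decimal places, which makes it easy to catch a sign or coefficient slip in the integration by parts.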
    

#### Section 44.2 (Covariance properties)

- Let $X, Y, Z$ be random variables, and let $c$ be a constant

- Properties
    - $Var[X] = Cov[X,X]$ 
        - $$\begin{align}
            \text{Proof:} \\
            Cov[X,X] &= E[(X-E[X])(X-E[X])] \\
            &= E[X^2 - XE[X] - XE[X] + E[X]^2] \\
            &= E[X^2] - E[X]E[X] - E[X]E[X] + E[X]^2 \\
            &= E[X^2] - E[X]^2 \\
            &= Var[X]
            \end{align}$$
    - $Cov[cX, Y] = c \cdot Cov[X,Y]$ and $Cov[X, cY] = c \cdot Cov[X,Y]$
        - $$\begin{align}
            \text{Proof:} \\
            Cov[cX, Y] &= E[(cX - E[cX])(Y - E[Y])] \\
            &= E[cXY - cXE[Y] - E[cX]Y + E[cX]E[Y]] \\
            &= cE[XY] - cE[X]E[Y] - cE[X]E[Y] + cE[X]E[Y] \\
            &= cE[XY] - cE[X]E[Y] \\
            &= c \cdot (E[XY] - E[X]E[Y]) \\
            &= c \cdot Cov[X,Y]
            \end{align}$$
    - $Cov[X+Y, Z] = Cov[X,Z] + Cov[Y,Z]$ and $Cov[X, Y+Z] = Cov[X,Y] + Cov[X,Z]$
        - $$\begin{align}
            \text{Proof:} \\
            Cov[X+Y,Z] &= E[(X+Y - E[X+Y])(Z - E[Z])] \\
            &= E[XZ + YZ - (X+Y)E[Z] - E[X+Y]Z + E[X+Y]E[Z]] \\
            &= E[XZ + YZ - E[Z]X - E[Z]Y - E[X+Y]Z + E[X+Y]E[Z]] \\
            &= E[XZ + YZ - E[Z]X - E[Z]Y - E[X]Z - E[Y]Z + E[X]E[Z] + E[Y]E[Z]] & \text{by linearity of expectation} \\
            &= E[XZ] + E[YZ] - E[Z]E[X] - E[Z]E[Y] - E[X]E[Z] - E[Y]E[Z] + E[X]E[Z] + E[Y]E[Z] \\
            &= (E[XZ] - E[X]E[Z]) + (E[YZ] - E[Y]E[Z]) \\
            &= Cov[X,Z] + Cov[Y,Z]
            \end{align}$$
    - $Cov[X,Y] = Cov[Y,X]$
        - $$\begin{align}
            \text{Proof:} \\
            Cov[X,Y] &= E[(X-E[X])(Y-E[Y])] \\
            &= E[(Y-E[Y])(X-E[X])] \\
            &= Cov[Y,X]
            \end{align}$$
    - $Cov[X,c] = 0$
        - $$\begin{align}
            \text{Proof:} \\
            Cov[X,c] &= E[(X-E[X])(c-E[c])] \\
            &= E[(X-E[X]) \cdot 0] \\
            &= 0
            \end{align}$$
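
The properties above can also be checked numerically. The sketch below is a minimal illustration, not a proof: it treats a finite sample as an empirical distribution, for which the same identities hold up to floating-point rounding. The helper names (`mean`, `cov`, `close`), the seed, and the choice of sample distributions are all assumptions made for the example.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def cov(xs, ys):
    """Covariance under the empirical distribution: E[(X - E[X])(Y - E[Y])]."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def close(a, b):
    return math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1_000
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [x + rng.gauss(0, 1) for x in xs]        # deliberately correlated with xs
zs = [rng.expovariate(0.8) for _ in range(n)]
c = 3.0

x_plus_y = [x + y for x, y in zip(xs, ys)]

checks = {
    "Var[X] = Cov[X,X]": close(cov(xs, xs), mean([x * x for x in xs]) - mean(xs) ** 2),
    "Cov[cX,Y] = c Cov[X,Y]": close(cov([c * x for x in xs], ys), c * cov(xs, ys)),
    "Cov[X+Y,Z] = Cov[X,Z] + Cov[Y,Z]": close(cov(x_plus_y, zs), cov(xs, zs) + cov(ys, zs)),
    "Cov[X,Y] = Cov[Y,X]": close(cov(xs, ys), cov(ys, xs)),
    "Cov[X,c] = 0": close(cov(xs, [c] * n), 0.0),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Each line should print `True`; a `False` would point to a bug in the helper functions rather than a failure of the property.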
    

- **Example 44.2**: We solved Example 44.1 using three rounds of integration by parts. That is a truly terrible way of solving it. We can make it much easier by applying the properties above:
    - To recap, we want to find $Cov[X,Y]$. The difficulty is that $X$ and $Y$ are not independent, so computing the covariance directly is tricky. Let's rewrite the problem in another way. 
    - We know that $Y$ is the sum of two interarrival times, and in a Poisson process the interarrival times are independent and identically distributed (exponential with rate $\lambda$). 
    - Hence, $Cov[X,Y] = Cov[X, X+A]$, where $A$ is the time between the arrival of the first and the second particle
    - $$\begin{align}
        Cov[X, X+A] &= Cov[X,X] + Cov[X,A] \\
        &= Cov[X,X] & \text{by independence, } Cov[X,A] = 0 \\
        &= Var[X] \\
        &= \frac{1}{0.8^2} & \text{the variance of an exponential is } \frac{1}{\lambda^2} \\
        &= 1.5625
        \end{align}$$
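
A quick Monte Carlo simulation gives an independent check of this result. The sketch below assumes, as in the example, that $X$ and $A$ are independent exponentials with rate $0.8$ and sets $Y = X + A$. The sample size, seed, and names (`first`, `gap`, `second`, `sample_cov`) are illustrative choices.

```python
import random

LAM = 0.8     # rate of the Poisson process
N = 200_000   # number of simulated (X, Y) pairs (arbitrary choice)

rng = random.Random(42)  # fixed seed so the run is repeatable

first = [rng.expovariate(LAM) for _ in range(N)]   # X: time of first arrival
gap = [rng.expovariate(LAM) for _ in range(N)]     # A: time between arrivals 1 and 2
second = [x + a for x, a in zip(first, gap)]       # Y = X + A


def sample_cov(xs, ys):
    """Average of (x - mean_x)(y - mean_y) over the sample."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


cov_xy = sample_cov(first, second)
cov_xa = sample_cov(first, gap)
theory = 1 / LAM**2  # Var[X] for an exponential with rate LAM

print(f"simulated Cov[X, Y]: {cov_xy:.3f}")
print(f"simulated Cov[X, A]: {cov_xa:.3f}")
print(f"theoretical value  : {theory:.4f}")
```

The simulated $Cov[X,Y]$ should land close to $1.5625$ and the simulated $Cov[X,A]$ close to $0$, with small deviations due to sampling noise.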

- **Example 44.3 (Standard deviation of arrival times)**: In Example 43.3, we saw that the expected value of the $r$-th arrival time in a Poisson process of rate $\lambda =0.8$ is $\frac{r}{0.8}$. What is the standard deviation of the $r$-th arrival time?
    - Let $T_r$ be the arrival time of the $r$-th arrival
    - We know that $T_r$ is the sum of the $r$ interarrival times, which are independent and identically distributed 
        - $$T_r = t_1 + t_2 + \cdots + t_r$$
    - Hence, the expectation of $T_r$ is
        - $$E[T_r] = E[t_1] + E[t_2] + \cdots + E[t_r] = \frac{1}{\lambda} + \frac{1}{\lambda} + \cdots + \frac{1}{\lambda} = \frac{r}{\lambda}$$
    - Using $Var[X] = Cov[X,X]$ and the additivity of covariance
        - $$\begin{align}
            Var[T_r] &= Cov[T_r, T_r] \\
            &= Cov[t_1 + t_2 + \cdots + t_r, t_1 + t_2 + \cdots + t_r] \\
            &= \sum_{i=j} Cov[t_i, t_j] + \sum_{i \neq j} Cov[t_i, t_j] \\
            &= \sum_{i=j} Cov[t_i, t_j] & \text{by independence, } Cov[t_i, t_j] = 0 \text{ for } i \neq j \\
            &= \sum_{i=1}^{r} Var[t_i] \\
            &= r \cdot Var[t_1] \\
            &= \frac{r}{\lambda^2} \\
            SD[T_r] &= \sqrt{\frac{r}{\lambda^2}} \\
            &= \frac{\sqrt{r}}{\lambda}
            \end{align}$$
    - With $\lambda = 0.8$, this gives $SD[T_r] = \frac{\sqrt{r}}{0.8} = 1.25\sqrt{r}$
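
The formula $SD[T_r] = \frac{\sqrt{r}}{\lambda}$ can be checked by simulating the $r$-th arrival time as a sum of $r$ independent exponential interarrival times. This is a minimal sketch; the value `R = 5`, the sample size, the seed, and the helper name `rth_arrival_time` are hypothetical choices made only for illustration.

```python
import math
import random

LAM = 0.8     # rate of the Poisson process
R = 5         # which arrival to look at (hypothetical choice)
N = 100_000   # number of simulated arrival times (arbitrary choice)

rng = random.Random(7)  # fixed seed so the run is repeatable


def rth_arrival_time(r: int, lam: float) -> float:
    """T_r = t_1 + ... + t_r with independent Exponential(lam) interarrival times."""
    return sum(rng.expovariate(lam) for _ in range(r))


samples = [rth_arrival_time(R, LAM) for _ in range(N)]

sim_mean = sum(samples) / N
sim_sd = math.sqrt(sum((t - sim_mean) ** 2 for t in samples) / N)

theory_mean = R / LAM            # r / lambda
theory_sd = math.sqrt(R) / LAM   # sqrt(r) / lambda

print(f"mean: simulated {sim_mean:.3f}, theory {theory_mean:.3f}")
print(f"SD  : simulated {sim_sd:.3f}, theory {theory_sd:.3f}")
```

Note that the standard deviation grows like $\sqrt{r}$ while the mean grows like $r$, so later arrival times are relatively more predictable; trying a few values of `R` shows this directly.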
