In [18]:
import sympy as sp

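def factorial(n):
    """Return n! for a non-negative integer n (0! = 1)."""
    if n < 1:
        return 1
    else:
        return n*factorial(n-1)

def binomial(n, m):
    """Return the binomial coefficient "n choose m" as an integer."""
    # There is no way to choose a negative number of items,
    # or more items than are available, so both cases give 0.
    if m < 0:
        return 0
    elif m > n:
        return 0
    else:
        # Integer division keeps the result exact (no floats).
        return factorial(n)//( factorial(n-m)*factorial(m) )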

## Marginal Probability and Density

1. Let $Y_1$ and $Y_2$ be discrete random variables with joint probability function $p(y_1, y_2)$. Then the *marginal probability functions* of $Y_1$ and $Y_2$ are

$$ p_1(y_1) = \sum_{y_2} p(y_1, y_2) \quad\mbox{and}\quad p_2(y_2) = \sum_{y_1} p(y_1, y_2) $$

2. Let $Y_1$ and $Y_2$ be continuous random variables with joint PDF $f(y_1, y_2)$. Then the *marginal density functions* of $Y_1$ and $Y_2$ are

$$ f_1(y_1) = \int_{-\infty}^\infty f(y_1, y_2) dy_2 \quad\mbox{and}\quad f_2(y_2) = \int_{-\infty}^\infty f(y_1, y_2) dy_1 $$

## Example 

A congressional committee made up of 4 democrats, 3 republicans, and 1 independent is forming a subcommittee of three people to work on some legislation. Suppose the subcommittee is selected completely at random, with each member equally likely to be chosen. Let $Y_1$ be the number of democrats on the subcommittee and $Y_2$ the number of republicans.

Find the marginal probabilities for $Y_1$ and $Y_2$.  
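
One way to check a hand computation for this example is to enumerate the joint probability function and sum it over the other variable. The sketch below assumes the subcommittee is a uniformly random 3-person subset of the 8 members; the name `joint_pmf` and its keyword arguments are illustrative choices, not part of the notes.

```python
from fractions import Fraction
from math import comb

def joint_pmf(y1, y2, dems=4, reps=3, inds=1, size=3):
    """P(Y1 = y1, Y2 = y2) for a subcommittee chosen uniformly at random."""
    y3 = size - y1 - y2  # number of independents implied by y1 and y2
    if min(y1, y2, y3) < 0:
        return Fraction(0)
    # math.comb returns 0 when asked to choose more items than are available
    ways = comb(dems, y1) * comb(reps, y2) * comb(inds, y3)
    return Fraction(ways, comb(dems + reps + inds, size))

values = range(4)  # each count can be 0, 1, 2, or 3

# Marginals: sum the joint probability function over the other variable
p1 = {y1: sum(joint_pmf(y1, y2) for y2 in values) for y1 in values}
p2 = {y2: sum(joint_pmf(y1, y2) for y1 in values) for y2 in values}

for y in values:
    print(y, p1[y], p2[y])
```

Under these assumptions the marginal for $Y_1$ comes out as $1/14, 3/7, 3/7, 1/14$ and the marginal for $Y_2$ as $5/28, 15/28, 15/56, 1/56$, for the values $0, 1, 2, 3$ respectively.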


## Conditional Probability

This brings us to conditional probability. Suppose $Y_1$ and $Y_2$ are discrete random variables with joint probability function $p(y_1, y_2)$. We define the *conditional probability that $Y_1 = y_1$ given that $Y_2 = y_2$* by:

$$ p(y_1 | y_2) = P(Y_1 = y_1 | Y_2 = y_2) = \frac{P(Y_1 = y_1, Y_2 = y_2)}{P(Y_2 = y_2) }  $$

provided that $P(Y_2 = y_2) > 0$. Note that the denominator is the marginal probability function $p_2(y_2)$.

### Example

Consider the subcommittee being formed above. What is the probability distribution for the number of democrats on the subcommittee given that 1 republican is on the subcommittee?
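
A possible numerical check for this example divides the joint probability by the marginal $p_2(1)$, following the definition above. As before, this is a sketch that assumes a uniformly random 3-person subcommittee; `joint_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def joint_pmf(y1, y2, dems=4, reps=3, inds=1, size=3):
    """P(Y1 = y1, Y2 = y2) for a subcommittee chosen uniformly at random."""
    y3 = size - y1 - y2  # number of independents implied by y1 and y2
    if min(y1, y2, y3) < 0:
        return Fraction(0)
    ways = comb(dems, y1) * comb(reps, y2) * comb(inds, y3)
    return Fraction(ways, comb(dems + reps + inds, size))

values = range(4)

# Marginal probability of exactly one republican: p_2(1)
p2_at_1 = sum(joint_pmf(y1, 1) for y1 in values)

# Conditional distribution p(y1 | y2 = 1) = p(y1, 1) / p_2(1)
conditional = {y1: joint_pmf(y1, 1) / p2_at_1 for y1 in values}

for y1, prob in conditional.items():
    print(y1, prob)
```

With one republican on the subcommittee, the other two seats hold either one democrat and the independent, or two democrats, so only $y_1 = 1$ and $y_1 = 2$ get positive probability ($2/5$ and $3/5$).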

### Example

We roll two dice and let $Z_1$ be the value of the first die, and $Z_2$ the sum of the two dice. 

- What is the probability that the first die is a 1 given that the sum on the two dice is 2?

- What is the probability that the first die is a 1 given that the sum on the two dice is 7?
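
Both questions can be checked by brute force, assuming two fair six-sided dice so that all 36 ordered outcomes are equally likely. The function name `conditional` is just a label for this sketch.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes
outcomes = list(product(range(1, 7), repeat=2))

def conditional(first, total):
    """P(Z1 = first | Z2 = total), computed by counting outcomes."""
    with_total = [(a, b) for a, b in outcomes if a + b == total]
    with_both = [(a, b) for a, b in with_total if a == first]
    return Fraction(len(with_both), len(with_total))

print(conditional(1, 2))  # 1
print(conditional(1, 7))  # 1/6
```

A sum of 2 forces both dice to show 1, while a sum of 7 is compatible with every value of the first die, which is why the second answer equals the unconditional probability $1/6$.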

### Conditional Probability for Continuous Random Variables

Suppose $Y_1$ and $Y_2$ are continuous random variables with a joint PDF $f(y_1, y_2)$. Then the *conditional density functions* are given by

$$ f(y_1 | y_2) = \frac{ f(y_1, y_2) }{f_2(y_2) } \quad\mbox{and}\quad f(y_2|y_1) = \frac{f(y_1, y_2)}{f_1(y_1)} $$

### Example 

Consider the random variables $Y_1$ and $Y_2$ with joint density given by:

$$ f(y_1, y_2) = \left\{ \begin{matrix} 2 & 0 \leq y_2 \leq y_1 \leq 1 \\ 0 & \mbox{otherwise} \end{matrix}\right. $$

Find the probability that $Y_1 \leq \frac12$ given that $Y_2 \leq \frac34$.
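
The conditional probability here is a ratio of two double integrals of the joint density, $P(Y_1 \leq \frac12, Y_2 \leq \frac34) / P(Y_2 \leq \frac34)$. A minimal SymPy sketch, with the integration limits read off the triangular region $0 \leq y_2 \leq y_1 \leq 1$ (the variable names `numerator` and `denominator` are illustrative):

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2")
density = sp.Integer(2)  # joint density on the region 0 <= y2 <= y1 <= 1

# P(Y1 <= 1/2, Y2 <= 3/4): on the region, y1 <= 1/2 already forces y2 <= 1/2,
# so integrate y2 from 0 to y1 and then y1 from 0 to 1/2.
numerator = sp.integrate(density, (y2, 0, y1), (y1, 0, sp.Rational(1, 2)))

# P(Y2 <= 3/4): integrate y1 from y2 to 1 and then y2 from 0 to 3/4.
denominator = sp.integrate(density, (y1, y2, 1), (y2, 0, sp.Rational(3, 4)))

result = numerator / denominator
print(numerator, denominator, result)  # 1/4 15/16 4/15
```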

### Example

Back to our example from the beginning of class. Suppose we are testing our virus test on 1000 people and we find the following:

|     | Infected | Not Infected | 
| --- | --- | --- |
| Tested Positive | 120 | 10 | 
| Tested Negative | 30  | 840 | 

Note that in practice *Infected* and *Not Infected* would stand for something we can actually observe, such as *Ended up hospitalized*, *Were showing enough symptoms*, or *Tested positive on another known test*.

How likely is it that someone who has tested negative is in fact infected? (Here we really mean *someone from our study*; we will come back to how we can use this data to extrapolate to the broader application of this test.)
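
For the study population this is the conditional probability $P(\text{Infected} \mid \text{Tested Negative})$, which can be read straight off the table. The sketch below hard-codes the four counts from the table; the dictionary layout is just one convenient choice.

```python
from fractions import Fraction

# Counts from the table: (test result, true status) -> number of people
counts = {
    ("positive", "infected"): 120,
    ("positive", "not infected"): 10,
    ("negative", "infected"): 30,
    ("negative", "not infected"): 840,
}
total = sum(counts.values())  # 1000 people in the study

# Joint probability of testing negative and being infected
p_negative_and_infected = Fraction(counts[("negative", "infected")], total)

# Marginal probability of testing negative
p_negative = Fraction(
    counts[("negative", "infected")] + counts[("negative", "not infected")],
    total,
)

p_infected_given_negative = p_negative_and_infected / p_negative
print(p_infected_given_negative, float(p_infected_given_negative))
```

This gives $30/870 = 1/29$, or roughly 3.4% of the people in the study who tested negative.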

# Independent Random Variables

The question of whether two random variables are independent has come up a few times already in class. Let's make it formal now.

## Definition

Let $Y_1$ and $Y_2$ be two jointly distributed random variables with marginal cumulative distribution functions $F_1(y_1)$ and $F_2(y_2)$. Then the two variables are *independent* if the joint distribution function is given by

$$ F(y_1, y_2) = F_1(y_1) F_2(y_2) $$

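If $Y_1$ and $Y_2$ are not independent they are called *dependent*.

This definition could equally be phrased in terms of probability functions or density functions: discrete random variables are independent when $p(y_1, y_2) = p_1(y_1) p_2(y_2)$ for all $y_1$ and $y_2$, and continuous random variables are independent when $f(y_1, y_2) = f_1(y_1) f_2(y_2)$ for all $y_1$ and $y_2$.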

## What does this mean

Two random variables are independent if we can compute the probability of both taking particular values by computing the probability for each one separately and then taking the product. *Think dice.*
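
To make the dice picture concrete, the following sketch assumes two fair six-sided dice and checks the product rule by counting outcomes. It also contrasts this with the first die and the sum of the two dice, which fail the rule. The helper `prob` is a hypothetical name used only here.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on (first, second)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# The two dice values: the joint probability is the product of the marginals
dice_independent = all(
    prob(lambda o: o == (a, b))
    == prob(lambda o: o[0] == a) * prob(lambda o: o[1] == b)
    for a in range(1, 7)
    for b in range(1, 7)
)

# The first die and the sum: the product rule already fails at (1, 2)
joint = prob(lambda o: o[0] == 1 and sum(o) == 2)
product_of_marginals = prob(lambda o: o[0] == 1) * prob(lambda o: sum(o) == 2)

print(dice_independent)             # True
print(joint, product_of_marginals)  # 1/36 1/216
```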

### Example

A class at the university has 15 mathematics majors, 8 software engineering majors, and 3 students from other majors. The instructor is going to choose a team of 5 students for a project. If the students are chosen at random, we let $Y_1$ be the number of mathematics majors and $Y_2$ the number of software engineering majors on the team.

Is $Y_1$ independent of $Y_2$?
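
One way to explore this question is to compute the joint probability function and both marginals exactly, then test the product rule on every pair of values. The sketch assumes the team is a uniformly random 5-person subset of the 26 students; the constant and function names are illustrative.

```python
from fractions import Fraction
from math import comb

MATH, SOFTWARE, OTHER, TEAM = 15, 8, 3, 5

def joint_pmf(y1, y2):
    """P(Y1 = y1, Y2 = y2) for a team chosen uniformly at random."""
    y3 = TEAM - y1 - y2  # students from other majors implied by y1 and y2
    if min(y1, y2, y3) < 0:
        return Fraction(0)
    ways = comb(MATH, y1) * comb(SOFTWARE, y2) * comb(OTHER, y3)
    return Fraction(ways, comb(MATH + SOFTWARE + OTHER, TEAM))

values = range(TEAM + 1)
p1 = {a: sum(joint_pmf(a, b) for b in values) for a in values}
p2 = {b: sum(joint_pmf(a, b) for a in values) for b in values}

# Independence requires p(y1, y2) = p1(y1) * p2(y2) for every pair
independent = all(
    joint_pmf(a, b) == p1[a] * p2[b] for a in values for b in values
)
print(independent)  # False
```

A single counterexample is enough: a team cannot contain 5 mathematics majors and 5 software engineering majors at once, so $p(5, 5) = 0$, yet $p_1(5)$ and $p_2(5)$ are both positive.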

### Example

Let two continuous random variables have a joint density function given by:

$$ f(y_1, y_2) = \left\{ \begin{matrix} 2 y_1 & 0 \leq y_1 \leq 1 \quad\mbox{and}\quad 0 \leq y_2 \leq 1 \\ 0 & \mbox{otherwise} \end{matrix} \right. $$

Are $Y_1$ and $Y_2$ independent?
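
A sketch of the check with SymPy: compute both marginal densities on the unit square and compare their product with the joint density. The symbol and variable names are choices made for this sketch.

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2")
joint = 2 * y1  # joint density on the unit square 0 <= y1, y2 <= 1

# Marginal densities: integrate out the other variable over [0, 1]
f1 = sp.integrate(joint, (y2, 0, 1))  # 2*y1
f2 = sp.integrate(joint, (y1, 0, 1))  # 1

# On the square, the joint density equals the product of the marginals
factorizes = sp.simplify(joint - f1 * f2) == 0
print(f1, f2, factorizes)  # 2*y1 1 True
```

Because the support is a rectangle and the density factors there, the product rule holds everywhere, which is consistent with $Y_1$ and $Y_2$ being independent.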

### Example 

Let two continuous random variables have a joint density function given by:

$$ f(y_1, y_2) = \left\{ \begin{matrix} 2 & 0 \leq y_2 \leq y_1 \leq 1 \\ 0 & \mbox{otherwise} \end{matrix} \right. $$

Are $Y_1$ and $Y_2$ independent?
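
The same kind of check can be sketched for the triangular region. Here the limits of integration depend on the other variable, which is what ends up breaking the product rule. Names are again illustrative.

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2")
joint = sp.Integer(2)  # joint density on the triangle 0 <= y2 <= y1 <= 1

# Marginal densities: the limits come from the constraint y2 <= y1
f1 = sp.integrate(joint, (y2, 0, y1))  # 2*y1       for 0 <= y1 <= 1
f2 = sp.integrate(joint, (y1, y2, 1))  # 2 - 2*y2   for 0 <= y2 <= 1

# Inside the triangle the joint density is 2, but the product is not
difference = sp.simplify(f1 * f2 - joint)
factorizes = difference == 0
print(f1, f2, factorizes)
```

The product $f_1(y_1) f_2(y_2) = 4 y_1 (1 - y_2)$ is not identically 2 on the triangle, so under this check the variables are dependent.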

## Independence and Conditional Probability

Note that independence has a consequence for conditional probability:

$$ p(y_1 | y_2 ) = \frac{ p(y_1, y_2) }{ p_2(y_2) } $$

However, if $Y_1$ and $Y_2$ are independent, then $p(y_1, y_2) = p_1(y_1) p_2(y_2)$, and so:

$$ p(y_1 | y_2) = \frac{ p_1(y_1) p_2(y_2) }{ p_2(y_2) } = p_1(y_1) $$

That is, the probability of $Y_1$ conditioned on $Y_2$ is the same as the marginal probability of $Y_1$. In other words, $Y_2$ is not contributing any information to the probability we assign to $Y_1$.

### Example 

Back to our example from the beginning of class. Suppose we are testing our virus test on 1000 people and we find the following:

|     | Infected | Not Infected | 
| --- | --- | --- |
| Tested Positive | 5 | 120 | 
| Tested Negative | 35  | 840 | 

Show that the test and the infection are independent (and conclude that this is not a very useful test).
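
A quick way to verify the claim is to compare each cell's joint probability with the product of its row and column marginals. The sketch below hard-codes the four counts from the table above; the dictionary layout and variable names are assumptions of the sketch.

```python
from fractions import Fraction

# Counts from the table: (test result, true status) -> number of people
counts = {
    ("positive", "infected"): 5,
    ("positive", "not infected"): 120,
    ("negative", "infected"): 35,
    ("negative", "not infected"): 840,
}
total = sum(counts.values())  # 1000 people

results = ("positive", "negative")
statuses = ("infected", "not infected")

# Marginal probabilities of each test result and each infection status
p_result = {r: Fraction(sum(counts[(r, s)] for s in statuses), total) for r in results}
p_status = {s: Fraction(sum(counts[(r, s)] for r in results), total) for s in statuses}

# Independence: every joint probability equals the product of its marginals
independent = all(
    Fraction(counts[(r, s)], total) == p_result[r] * p_status[s]
    for r in results
    for s in statuses
)
print(p_result["positive"], p_status["infected"], independent)  # 1/8 1/25 True
```

In particular $P(\text{Infected} \mid \text{Tested Positive}) = 5/125 = 1/25$, the same as the overall infection rate $40/1000$, so a positive result tells us nothing new.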