# 6.041x - Unit 2: Conditioning and Independence
### Notes by Leo Robinovitch

***
## Core Concepts:

* **Conditional Probability:**
  * Probabilities from a revised model that takes into account information about the outcome of a probabilistic experiment  
  * Fundamental definition: $P(A|B) = \frac{P(A \cap B)}{P(B)}$ (valid when $P(B) > 0$)
  * Same rules apply:
    * $P(A|B) \geq 0$
    * $P(\Omega|B) = \frac{P(\Omega \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$
    * $P(B|B) = 1$
    * Additivity: If $A \cap C = \emptyset$, then $P(A \cup C | B) = P(A|B) + P(C|B)$?
      * LHS: $\frac{P((A \cup C) \cap B)}{P(B)} = \frac{P((A \cap B) \cup (C \cap B))}{P(B)} = \frac{P(A \cap B) + P(C \cap B)}{P(B)}$ because $A \cap B$ and $C \cap B$ are disjoint, and this matches the RHS
    * Therefore all axioms of probability ALSO hold for conditional probabilities!
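As a sanity check of the definition and of additivity, here is a minimal sketch that counts outcomes for two fair six-sided dice. The dice setup and the helper names `prob` and `cond_prob` are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair six-sided dice (assumed example).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform law."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(A ∩ B) / P(B); only defined when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

A = {w for w in omega if w[0] + w[1] == 8}   # sum is 8
B = {w for w in omega if w[0] % 2 == 0}      # first die is even
C = {w for w in omega if w[0] + w[1] == 3}   # sum is 3, disjoint from A

print(cond_prob(A, B))
# Additivity for disjoint A and C under the conditional law:
print(cond_prob(A | C, B) == cond_prob(A, B) + cond_prob(C, B))
```

Exact fractions are used so that the equality checks are not affected by floating-point rounding.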
    
    
  
*  **Three Key Tools** for conditional probability:
  1. Multiplication rule
  2. Total probability theorem
  3. Bayes' Rule (foundation of inference)  
  
  
* **Multiplication Rule:**
  * From definition of conditional probability: $P(A \cap B) = P(B)P(A|B) = P(A)P(B|A)$
  * Therefore, $P(A \cap B \cap C) = P((A \cap B) \cap C) = P(A \cap B)*P(C|A \cap B) = P(A)*P(B|A)*P(C|A \cap B) \to$ the probability of a leaf in a tree diagram is the product of the (conditional) probabilities on the branches along the path to that leaf
  * Precisely, $P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1) \prod\limits_{i=2}^{n}P(A_i|A_1 \cap \dots \cap A_{i-1})$  
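The multiplication rule can be sketched as a running product along one path of the tree. The card-drawing scenario below (three cards drawn without replacement from a standard 52-card deck, none of them an ace) and the function name `chain_rule` are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def chain_rule(conditionals):
    """Multiply P(A_1), P(A_2|A_1), ..., P(A_n|A_1 ∩ ... ∩ A_{n-1})."""
    result = Fraction(1)
    for p in conditionals:
        result *= p
    return result

# A_i = "the i-th card drawn is not an ace" (hypothetical example).
# Each factor is conditioned on all earlier draws being non-aces.
branches = [Fraction(48, 52), Fraction(47, 51), Fraction(46, 50)]
p_no_ace = chain_rule(branches)

# Cross-check by counting 3-card hands that avoid the 4 aces.
p_by_counting = Fraction(comb(48, 3), comb(52, 3))
print(p_no_ace, p_no_ace == p_by_counting)
```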
  
  
* **Total Probability Theorem:**
  * Partition $\Omega$ into $A_1, A_2, \dots, A_n$, with $P(A_i)$ and $P(B|A_i)$ known for every $i$
  * $P(B) = \sum\limits_{i}P(A_i)P(B|A_i) \to$ a weighted average of the $P(B|A_i)$, where the $P(A_i)$ are the weights (note that the $P(A_i)$'s sum to 1)
  * If the partition is countably infinite ($i \to \infty$), the theorem still holds by countable additivity, with the finite sum becoming an infinite series
  
  
* **Bayes' Rule:**
  * Again, partition $\Omega$ into $A_1, A_2, \dots, A_n$, with $P(A_i)$ and $P(B|A_i)$ known for every $i$
    * Here, $P(A_i) \to$ "Initial Beliefs"
  * Revised beliefs: $P(A_i|B) = \frac{P(A_i \cap B)}{P(B)}$
    * Numerator: Multiplication Rule: $P(A_i \cap B) = P(A_i)P(B|A_i)$
    * Denominator: Total Probability Theorem: $P(B) = \sum\limits_{j}P(A_j)P(B|A_j)$
  * Final Bayes Theorem: $P(A_i|B) = \frac{P(A_i)P(B|A_i)}{\sum\limits_{j}P(A_j)P(B|A_j)}$
    * Foundation for inference: given model about world $P(B|A_i)$, draw conclusions about causes $P(A_i|B)$  
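The two steps of Bayes' rule (multiplication rule for the numerator, total probability theorem for the denominator) can be sketched as follows. The function name `bayes_posterior` and the three-scenario numbers are hypothetical, chosen only to illustrate the formula.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return [P(A_i|B)] from priors P(A_i) and likelihoods P(B|A_i)."""
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1 (the A_i partition Omega)")
    # Numerators: multiplication rule, P(A_i ∩ B) = P(A_i) P(B|A_i).
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: total probability theorem, P(B) = sum_j P(A_j) P(B|A_j).
    p_b = sum(joint)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return [j / p_b for j in joint]

# Assumed example: three scenarios with initial beliefs 1/2, 3/10, 1/5.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(2, 5), Fraction(4, 5)]
posteriors = bayes_posterior(priors, likelihoods)
print(posteriors, sum(posteriors))
```

The exact `sum(priors) != 1` check relies on `Fraction` inputs; with floats, a tolerance-based comparison would be the safer choice.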
    
    
* **Independence:**
  * Intuition: two events are independent if the occurrence of one event does not change our beliefs about the occurrence of the other
    * Completely different from disjoint events! In fact, if $P(A)>0$ and $P(B)>0$, disjoint $A$ and $B$ cannot be independent
  * If $P(B|A) = P(B)$, B is independent of A and vice versa
  * As such, conditional probability/multiplication rule becomes $P(A \cap B) = P(A)P(B|A) = P(A)P(B)$
  * This is the true (symmetric) definition of independence: $\mathbf{P(B \cap A) = P(B)P(A)}$
    * Note that this definition applies even when $P(A) = 0$ (whereas the conditional probability $P(B|A)$ is only defined when $P(A) > 0$)
  * If $A$ and $B$ are independent, then $A$ and $B^c$ are also independent (and likewise $A^c$ and $B^c$)
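The symmetric definition is easy to check by enumeration. The sketch below again assumes two fair six-sided dice; the events and the helper `independent` are illustrative choices, not from the lecture.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))  # two fair dice (assumed example)

def prob(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event), len(omega))

def independent(a, b):
    """Symmetric definition: P(A ∩ B) == P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

A = {w for w in omega if w[0] % 2 == 0}   # first die is even
B = {w for w in omega if sum(w) == 7}     # sum is 7
D = {w for w in omega if sum(w) == 2}     # sum is 2
E = {w for w in omega if sum(w) == 12}    # sum is 12, disjoint from D

print(independent(A, B))           # A and B
print(independent(A, omega - B))   # A and the complement of B
print(independent(D, E))           # disjoint events with positive probability
```

The last check illustrates why disjoint events with positive probability cannot be independent: $P(D \cap E) = 0$ while $P(D)P(E) > 0$.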
  
  
* **Conditional Independence:**
  * Given C, are A and B independent?
  * Mathematically, $P(A \cap B|C) = P(A|C)P(B|C)$?
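  * Independence and conditional independence do not imply one another: independent A and B need not remain independent given C, and events that are conditionally independent given C need not be independent
    * Example: Two coins, A and B, chosen with $P(A) = P(B) = 0.5$. Once a coin is chosen, its tosses are independent, with $P(Heads|A) = 0.9$ and $P(Heads|B) = 0.1$. If the tosses were also unconditionally independent, we would have $P(Toss_{11} = H|Previous10 = H) = P(Toss_{11} = H)$.
      * $P(Toss_{11} = H) = P(A)P(Heads|A) + P(B)P(Heads|B) = 0.5*0.9 + 0.5*0.1 = 0.5$
      * $P(Toss_{11} = H|Previous10 = H) \approx 0.9$, because ten heads in a row make it very likely that coin A was chosen.
      * As such, $0.5$ differs from $\approx 0.9$: the tosses are conditionally independent given the coin, but not independent.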
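The "approximately 0.9" in the two-coin example can be computed exactly. This sketch assumes, as the example does, that tosses are independent once the coin is fixed; the helper name `p_all_heads` is hypothetical.

```python
from fractions import Fraction

# Model from the example: each coin is chosen with probability 1/2.
p_coin = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
p_heads = {"A": Fraction(9, 10), "B": Fraction(1, 10)}

def p_all_heads(n):
    """P(first n tosses are all heads), via the total probability theorem.

    Given the coin, tosses are assumed independent, so the conditional
    probability of n heads is P(Heads|coin) ** n.
    """
    return sum(p_coin[c] * p_heads[c] ** n for c in p_coin)

# Unconditional probability that any single toss is heads.
p_toss11 = p_all_heads(1)
# P(toss 11 is H | first 10 are H) = P(first 11 are H) / P(first 10 are H).
p_toss11_given_10 = p_all_heads(11) / p_all_heads(10)
print(float(p_toss11), float(p_toss11_given_10))
```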
  
  
* **Independence about Collections of Events:**
  * Intuition: information about some of the events does not change the probability related to remaining events
  * Mathematically, $P(A_i \cap A_j \cap \dots \cap A_m) = P(A_i)P(A_j) \dots P(A_m)$ for every choice of distinct indices i, j, ..., m


* **Pairwise Independence:**  
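  * Pairwise independence: $P(A_i \cap A_j) = P(A_i)P(A_j)$ for all $i \neq j$
      * This alone does not imply the product rule for combos of 3, 4, 5, etc.; independence of the whole collection requires it for every sub-collection (not just pairs)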
  * Example: Two independent, fair coin tosses:
    * $H_1$: first toss is heads
    * $H_2$: second toss is heads
    * $C$: two tosses had same result
      * Pairwise independent: $P(H_1 \cap C) = 1/4 = P(H_1)*P(C)$
      * Not independent: $P(H_1 \cap H_2 \cap C) = 1/4 \neq P(H_1)P(H_2)P(C) = 1/8$
      * Intuition: $P(C|H_1) = 1/2$, but $P(C|H_1 \cap H_2) = 1$. As such, $C$ is not independent from $H_1$ and $H_2$ **collectively**
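The two-toss example can be verified by listing the four equally likely outcomes. This is a small sketch; the helper `prob` and the dictionary `events` are names introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

omega = set(product("HT", repeat=2))  # two independent fair tosses

def prob(event):
    """Probability of an event under the uniform law on omega."""
    return Fraction(len(event), len(omega))

H1 = {w for w in omega if w[0] == "H"}    # first toss is heads
H2 = {w for w in omega if w[1] == "H"}    # second toss is heads
C = {w for w in omega if w[0] == w[1]}    # both tosses agree

events = {"H1": H1, "H2": H2, "C": C}

# Every pair satisfies the product rule.
for (name_a, a), (name_b, b) in combinations(events.items(), 2):
    print(name_a, name_b, prob(a & b) == prob(a) * prob(b))

# The triple does not.
p_triple = prob(H1 & H2 & C)
p_product = prob(H1) * prob(H2) * prob(C)
print(p_triple, p_product)
```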

***
## Lecture 2 Exercises:

**#1 True or False:**

*A. If Ω is finite and we have a discrete uniform probability law, and if B≠∅, then the conditional probability law on B, given that B occurred, is also discrete uniform.*
* True, because the outcomes inside B maintain the same relative proportions as in the original probability law.

*B. If Ω is finite and we have a discrete uniform probability law, and if B≠∅, then the conditional probability law on Ω, given that B occurred, is also discrete uniform.*
* False. Outcomes in Ω that are outside B have zero conditional probability, so it cannot be the case that all outcomes in Ω have the same conditional probability.



**#2 Let the sample space be the unit square, Ω=[0,1]^2, and let the probability of a set be the area of the set. Let A be the set of points (x,y)∈[0,1]^2 for which y≤x. Let B be the set of points for which x≤1/2. Find P(A∣B).**

* $P(A|B) = \frac{P(A \cap B)}{P(B)}$
  * $P(A \cap B) = 1/2 * 1/2 * 1/2  = 1/8$
  * $P(B) = 1/2$
  * Therefore, $P(A|B) = 1/4$  
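The triangle area used for $P(A \cap B)$ can also be written as an integral over the strip $x \leq 1/2$, which gives a worked check of the same numbers:

$$P(A \cap B) = \int_0^{1/2}\int_0^{x} dy\,dx = \int_0^{1/2} x\,dx = \frac{1}{8}, \qquad P(A|B) = \frac{1/8}{1/2} = \frac{1}{4}$$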
  
  
**#3 True or False:**

*A.* $P(A\cap B \cap C^c)=P(A \cap B)P(C^c∣A \cap B)$
  * True
  
*B.* $P(A \cap B \cap C^c)=P(A)P(C^c∣A)P(B∣A \cap C^c)$
  * True
  
*C.* $P(A \cap B \cap C^c)=P(A)P(C^c \cap A∣A)P(B∣A \cap C^c)$
  * True, because $P(C^c \cap A∣A) = P(C^c|A)$
  
*D.* $P(A \cap B∣C)=P(A∣C)P(B∣A \cap C)$
  * True --> the multiplication rule $P(A \cap B) = P(A)P(B|A)$ applied within the conditional model given C (every term is additionally conditioned on C)
  
  
**#4 We have an infinite collection of biased coins, indexed by the positive integers. Coin i has probability $2^{-i}$ of being selected. A flip of coin i results in Heads with probability $3^{-i}$. We select a coin and flip it. What is the probability that the result is Heads? The geometric sum formula may be useful here:**  
<img src="l2_ex4.png", style="height:px;width:=250px;">
  
* $P(Heads) = \sum\limits_{i=1}^{\infty}P(A_i)P(Heads|A_i) = \sum\limits_{i=1}^{\infty}2^{-i}3^{-i} = \sum\limits_{i=1}^{\infty}\left(\frac{1}{6}\right)^{i} = \frac{1/6}{1-1/6} = \frac{1}{5}$  
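A numerical check of the infinite sum: the partial sums over the first $n$ coins should approach $1/5$. This is a sketch; the function name `p_heads_partial` is an assumption.

```python
from fractions import Fraction

def p_heads_partial(n):
    """Partial sum of P(coin i) * P(Heads | coin i) over coins 1..n."""
    return sum(Fraction(1, 2**i) * Fraction(1, 3**i) for i in range(1, n + 1))

# The partial sums increase toward the limit 1/5.
for n in (1, 2, 5, 20):
    print(n, float(p_heads_partial(n)))
```

Each partial sum equals $\frac{1}{5}(1 - 6^{-n})$, so the gap to $1/5$ shrinks by a factor of 6 with every additional coin.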


**#5 A test for a certain rare disease is assumed to be correct 95% of the time: if a person has the disease, the test result is positive with probability 0.95, and if the person does not have the disease, the test result is negative with probability 0.95. A person drawn at random from a certain population has probability 0.001 of having the disease.**

*A. Find the probability a random person tests positive:*
  * $P(+|Has) = 0.95$
  * $P(-|!Has) = 0.95$
  * $P(Has) = 0.001$
  * Infer $P(-|Has) = 0.05$, $P(+|!Has) = 0.05$, $P(!Has) = 0.999$
  * Total Probability Theorem: $P(+) = P(Has)P(+|Has) + P(!Has)P(+|!Has) = 0.001*0.95 + 0.999*0.05 = 0.0509$
  
*B. Given that the person just tested positive, what is the probability he actually has the disease?*
  * $P(Has|+) = \frac{P(+|Has)P(Has)}{P(+)} = \frac{0.95*0.001}{0.0509} \approx 0.0187 = 1.87\%$
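The arithmetic in parts A and B can be reproduced in a few lines. The variable names are illustrative; the numbers come straight from the problem statement.

```python
# Given quantities from the problem statement.
p_has = 0.001            # P(Has)
p_pos_given_has = 0.95   # P(+|Has)
p_pos_given_not = 0.05   # P(+|!Has) = 1 - P(-|!Has)

# Part A: total probability theorem.
p_pos = p_has * p_pos_given_has + (1 - p_has) * p_pos_given_not

# Part B: Bayes' rule.
p_has_given_pos = p_has * p_pos_given_has / p_pos

print(round(p_pos, 4), round(p_has_given_pos, 4))
```

Even with a 95% accurate test, the posterior stays below 2% because the disease is so rare that false positives from the healthy population dominate the positive results.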





***
## Lecture 3 Exercises:  
Unfortunately, the course made the in-lecture exercises unavailable 3 weeks after archiving the course.

***
### Solved Problems:

**#1 **

***
### Problem Set 2:

**#1 **