# CE2 Mathematics: Probability & Statistics 2018 
# Lecture 2 - Chapter 1
## Sections 1.4-1.7

### To launch the notebook:
- From the menu above, select `Kernel` and `Restart & Run All`.  
- Make sure that the `%matplotlib inline` and `%run Codes/Lecture02.ipy` cells were executed without an error. A number will appear to the left of each (e.g. `In [1]`) if successful.
- If you prefer to follow the lecture in slide format, select the Lecture Overview cell and click on the RISE button <kbd><i class="fa-bar-chart fa"></i></kbd> in the menu above.

### Solutions to worked problems during lecture:
- The answers are protected by passwords; the format of each password can be found by hovering your mouse over the <i class="fa fa-info-circle fa-2x" title="Password details"></i> icon.
- Type your answer within the quotes in `password=''`, then press `Shift` and `Enter` (or click the `>| Run` button above). If your answer is correct, the full solution will be displayed. If it is incorrect, a random string of text will appear.

In [1]:
%matplotlib inline

In [2]:
%run Codes/Lecture02.ipy

# Lecture Overview
The lecture today will finish Chapter 1 and introduce several important relationships that will be used throughout the course:
- 1.4 **Conditional Probability & Chain Rule**
- 1.5 **Total Law**
- 1.6 **Bayes' Theorem**
- 1.7 **Independence**

Examples will be used to illustrate throughout.

# 1.4 Does Conditional Probability Change the Fundamental Properties (P1-P4)?
No: conditioning changes the value of the probability, but not the event. In the same way that the function $P(\cdot)$ assigns a probability to any event $A \subseteq \Omega$, the function $P(\cdot |B)$ assigns a conditional probability (given $B$) to any event $A \subseteq \Omega$. Thus, all the identities we have encountered so far will still work, for example:


$$
\begin{align}
\text{Recall P1: } &P(\overline A) = 1 - P(A) \\
&\color{red}{P(\overline A|B) = 1 - P(A|B)} \\
\text{Recall P4: } &P(A \cup B) = P(A) + P(B) - P(A,B) \\ &\color{red}{P(A \cup B|C) = P(A|C) + P(B|C) - P(A,B|C)}
\end{align}
$$

## 1.4 Example: Conditional Probability - Rolling Two Dice
- $P(\omega_1,\omega_2) = \frac{1}{36}$ for all $\omega_1, \omega_2 \in \{1,...,6\}$
- Event $A = \{(\omega_1,\omega_2): \omega_1 \leq 4,\omega_2 \leq 4 \}$ contains 16 outcomes
- Event $\color{red}{B} = \{(\omega_1,\omega_2): \omega_1 + \omega_2 \geq 7\}$ contains 21 outcomes

![](attachment:TwoDice_AllEvents_Fig.svg)

- $P(\color{red}{B}|A)$ - Given event $A$ has happened, the three outcomes $(A,\color{red}{B})$ each now occur with a probability of $\frac{1}{16}$, so $P(\color{red}{B}|A) = \frac{3}{16}$

![](attachment:TwoDice_GivenA_Fig.svg)

- $P(A|\color{red}{B})$ - Given event $\color{red}{B}$ has happened, the three outcomes $(A,\color{red}{B})$ each now occur with a probability of $\frac{1}{21}$, so $P(A|\color{red}{B}) = \frac{3}{21} = \frac{1}{7}$

![](attachment:TwoDice_GivenB_Fig.svg)
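
The two conditional probabilities in this example can be checked by enumerating all 36 outcomes. The following is a minimal sketch using exact fractions; the helper names `prob` and `cond_prob` are illustrative and are not part of the lecture code.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered outcomes of two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) when every outcome in omega is equally likely."""
    return Fraction(len(event), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(event & given) / prob(given)

A = {(w1, w2) for (w1, w2) in omega if w1 <= 4 and w2 <= 4}
B = {(w1, w2) for (w1, w2) in omega if w1 + w2 >= 7}

p_B_given_A = cond_prob(B, A)  # 3 of the 16 outcomes in A
p_A_given_B = cond_prob(A, B)  # 3 of the 21 outcomes in B

# P1 still holds after conditioning: P(not A | B) = 1 - P(A | B)
not_A = omega - A
assert cond_prob(not_A, B) == 1 - p_A_given_B

print(p_B_given_A, p_A_given_B)  # prints: 3/16 1/7
```

Note that $P(B|A) \neq P(A|B)$ even though both are built from the same three outcomes; only the conditioning event, and hence the denominator, differs.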

# 1.4 Chain Rule
- Rearranging the definition of Conditional Probability gives $P(A,B) = P(A|B)P(B)$, which can be applied repeatedly to derive the **Chain Rule**:
$$P(A,B,C) =  P(C|A,B) P(B|A) P(A)$$
- We first condition $P(A,B,C)$ on $A$:
$$P(A,B,C) = \color{green}{P(B,C|A)}  P(A)$$
- Then we condition $\color{green}{P(B,C|A)}$ on $B$:
$$\color{green}{P(B,C|A)} =  \color{blue}{P(C|B,A) P(B|A)}$$
- Some "tricks" for more complicated expressions are provided in `Probability_Stats_MathsII_ChainRule_Ch1.pdf` in the Summary folder on BBl
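
The Chain Rule can be verified numerically on the two-dice sample space from the earlier example. This is a sketch under stated assumptions: events $A$ and $B$ are the ones defined above, while the third event `C` (both dice show the same number) is hypothetical and introduced only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely outcomes of two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(event & given) / prob(given)

A = {w for w in omega if w[0] <= 4 and w[1] <= 4}
B = {w for w in omega if w[0] + w[1] >= 7}
C = {w for w in omega if w[0] == w[1]}  # hypothetical event: a double

# Left-hand side: the joint probability counted directly (only (4, 4) qualifies).
joint = prob(A & B & C)

# Right-hand side: P(C|A,B) P(B|A) P(A)
chain = cond_prob(C, A & B) * cond_prob(B, A) * prob(A)

assert joint == chain == Fraction(1, 36)
```

The factors are $\frac{1}{3} \cdot \frac{3}{16} \cdot \frac{16}{36}$, and the intermediate denominators cancel to leave $\frac{1}{36}$.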

In [3]:
Math(NOTE_2A)


## Example: Chain Rule - 2015 Exam, Q11a part (i)
Your friend has come to visit you in London and wants to know if they can fish in the river Thames.  You are worried that due to pollution there will not be many fish available.  To quantify this uncertainty, you define the following events.
- $A$ = {The river Thames is polluted}
- $B$ = {A sample of water from the Thames detects pollution}
- $C$ = {Fishing is permitted in the Thames}

You estimate the following information:

- $P(A) = 0.30$
- $P(B|A) = 0.75 \qquad P(B|\overline{A}) = 0.20$
- $P(C|A,B) = 0.20 \qquad P(C|\overline{A},B) = 0.15$
- $P(C|A,\overline{B}) = 0.80 \qquad P(C|\overline{A},\overline{B}) = 0.90$

Where $\overline{A}$ corresponds to the Thames _not_ being polluted, etc.

1. What is the probability that the river is polluted, is detected to be polluted, and fishing is allowed (i.e.  $P(A,B,C)$)?

In [4]:
display_answer(question=0, password='0.045')


# 1.5 Total Law of Probability
- Suppose that events $A_1,...,A_n$ form a **partition** of $\Omega$; that is: 
  1. They are pairwise **mutually exclusive** (*pg 5 Lecture Notes*)
  2. They are **collectively exhaustive** (*pg 5 Lecture Notes*)
  3. They all have nonzero probability, $P(A_i) > 0$ for $i = 1,...,n$

- Then for any event $B \subseteq \Omega$, the Law of Total Probability states that:
$$P(B) = \sum_{k=1}^{n} P(B,A_k) = \sum_{k=1}^{n} P(B | A_k) P(A_k)$$

![](attachment:Total_Probability.jpeg)
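
As a numerical check of the Law of Total Probability, the sketch below recomputes $P(B)$ for the two-dice event $B = \{\omega_1 + \omega_2 \geq 7\}$. The choice of partition, $A_k$ = {the first die shows $k$}, is an assumption made here for illustration and does not come from the lecture notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely outcomes of two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(event & given) / prob(given)

B = {w for w in omega if w[0] + w[1] >= 7}

# Illustrative partition: A_k = {first die shows k}, k = 1..6
partition = [{w for w in omega if w[0] == k} for k in range(1, 7)]

# Check the three partition conditions before applying the law.
n = len(partition)
assert all(not (partition[i] & partition[j])
           for i in range(n) for j in range(i + 1, n))  # mutually exclusive
assert set().union(*partition) == omega                  # collectively exhaustive
assert all(prob(A_k) > 0 for A_k in partition)           # nonzero probability

# P(B) = sum_k P(B | A_k) P(A_k)
total = sum(cond_prob(B, A_k) * prob(A_k) for A_k in partition)

assert total == prob(B) == Fraction(21, 36)
```

Here $P(B|A_k) = \frac{k}{6}$ and $P(A_k) = \frac{1}{6}$, so the sum is $\frac{1+2+\dots+6}{36} = \frac{21}{36}$, matching the 21 outcomes counted earlier.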

## Example: Total Law of Probability - 2015 Exam, Q11a part (ii)
Your friend has come to visit you in London and wants to know if they can fish in the river Thames.  You are worried that due to pollution there will not be many fish available.  To quantify this uncertainty, you define the following events.
- $A$ = {The river Thames is polluted}
- $B$ = {A sample of water from the Thames detects pollution}
- $C$ = {Fishing is permitted in the Thames}

You estimate the following information:

- $P(A) = 0.30$
- $P(B|A) = 0.75 \qquad P(B|\overline{A}) = 0.20$
- $P(C|A,B) = 0.20 \qquad P(C|\overline{A},B) = 0.15$
- $P(C|A,\overline{B}) = 0.80 \qquad P(C|\overline{A},\overline{B}) = 0.90$

Where $\overline{A}$ corresponds to the Thames _not_ being polluted, etc.

2. What is the probability that a sample of water is _not_ detected to be polluted, and fishing is allowed in the Thames (i.e. $P(\overline{B},C)$)?

In [5]:
display_answer(question=1, password='0.564')


# 1.6 Bayes' Theorem
- **Bayes' Theorem** allows us to reverse the order of conditioning:
$$P(B|A) = \frac{P(A|B) P(B)}{\color{red}{P(A)}}$$

- or equivalently, by invoking the Total Law above for a partition $B_1,...,B_n$ of $\Omega$ that includes $B$:
$$P(B|A) = \frac{P(A|B)P(B)}{\color{red}{\sum_{k=1}^{n} P(A|B_k)P(B_k)}}$$
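
Both forms of Bayes' Theorem can be checked against a direct count on the two-dice example, using the events $A$ and $B$ defined in Section 1.4. This is a minimal sketch; the two-set partition $\{B, \overline{B}\}$ used for the denominator is the simplest possible choice and is assumed here for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely outcomes of two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(event & given) / prob(given)

A = {w for w in omega if w[0] <= 4 and w[1] <= 4}
B = {w for w in omega if w[0] + w[1] >= 7}
not_B = omega - B

# Direct count: P(B | A)
direct = cond_prob(B, A)

# Bayes' Theorem: P(B|A) = P(A|B) P(B) / P(A)
bayes = cond_prob(A, B) * prob(B) / prob(A)

# Same, with P(A) expanded by the Total Law over the partition {B, not B}
p_A_total = cond_prob(A, B) * prob(B) + cond_prob(A, not_B) * prob(not_B)
bayes_total = cond_prob(A, B) * prob(B) / p_A_total

assert direct == bayes == bayes_total == Fraction(3, 16)
```

The second form is the one typically needed in practice, since $P(A)$ is often not given directly but the conditionals $P(A|B_k)$ are.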

## Example: Bayes' Theorem - 2015 Exam, Q12a
Various forms of diagnostic screening are commonly used in an attempt to spot serious illnesses early.  Let's say you have a medical procedure to determine if you have a particular form of cancer.  Let $T$ denote the event the test is positive for cancer, and $C$ denote the event you actually do have cancer (thus, $\overline{C}$ denotes the event that you do _not_ have cancer).  From the general prevalence of this form of cancer, we are given that $P(C) = 0.0001$.  The accuracy of the diagnostic test is:

$P(T|C) = 0.90 \qquad P(T|\overline{C}) = 0.001$

1. Draw a Bayes' Net for this problem and use the conditional relationships to write the full joint distribution.
2. Write down the expression for the probability that you have cancer, given the test is positive (i.e. $P(C|T)$) using the probabilistic inference approach for Bayes' Nets.  What is the common name for the equation you derived?  Compute this probability.
3. Would you recommend this test as a method for screening this particular type of cancer?  Explain why or why not.

In [6]:
display_answer(question=2, password='0.0825')


# 1.7 Independence
- Events can be considered **"independent"** (denoted as $A \perp B$) if knowing one does not tell us anything about the other.
$$\color{red}{P(A, B) = P(A)P(B)}$$

- The above formula is the standard test for independence
- If $A \perp B$ and $P(B) > 0$, it also follows that:
$$P(A|B) = \frac{P(A,B)}{P(B)} = P(A)$$
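
The product test for independence is straightforward to apply by enumeration. The sketch below reuses the two-dice events $A$ and $B$ from Section 1.4 and adds two hypothetical events, `D` (first die is even) and `E` (second die shows 5 or 6), chosen only to illustrate an independent pair.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely outcomes of two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def independent(X, Y):
    """Standard test for independence: P(X, Y) == P(X) P(Y)."""
    return prob(X & Y) == prob(X) * prob(Y)

A = {w for w in omega if w[0] <= 4 and w[1] <= 4}
B = {w for w in omega if w[0] + w[1] >= 7}
D = {w for w in omega if w[0] % 2 == 0}  # hypothetical: first die is even
E = {w for w in omega if w[1] >= 5}      # hypothetical: second die is 5 or 6

# A and B are dependent: P(A, B) = 3/36, but P(A) P(B) = (16/36)(21/36)
assert not independent(A, B)

# D and E depend on different dice, and the product test confirms independence
assert independent(D, E)
assert prob(D & E) / prob(E) == prob(D)  # consequence: P(D | E) = P(D)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.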

## Example: Independence
- Imagine we toss a coin twice. The sample space of this experiment contains four equally likely outcomes.
$$P(HH) = P(TT) = P(HT) = P(TH) = \frac{1}{4}$$
- Define events $A$ and $B$ as the events where the first and second coin tosses result in heads, respectively.
  - Are events $(A,B)$ independent?
- Define event $C$ to include outcomes where **at least one toss is heads**.
  - Are events $(A,C)$ independent?
  - Are events $(B,C)$ independent?
  
<i class="fa fa-info-circle fa-2x" title="Hint: the password for the answer is P(A,B), P(A,C), P(B,C)."></i>

In [9]:
display(VBox([input_box, two_coins_out], display = 'flex', align_items = 'center'))


In [10]:
display_answer(question=3, password='1/4,1/2,1/2')


# Summary

- This lecture concludes Chapter 1
- Main take-home point: we introduced some important probability relationships that will be used throughout this term, specifically
  - 1.3 Inclusion-Exclusion formula
  - 1.4 Conditional Probability & Chain Rule
  - 1.5 Total Law
  - 1.6 Bayes' Theorem
  - 1.7 Independence

- These are summarised at the start of Chapter 1 in the lecture notes
- Problem Set 1, also a Jupyter notebook with password-protected solutions, will be released on BBl this week
- Lecture 3 will begin Chapter 2