In [1]:
# HIDDEN 
# flake8: noqa
import sys
import os
projroot = lambda p: p if os.path.isfile(os.path.join(p, "Pipfile")) or p == '/' else projroot(os.path.dirname(p))
sys.path.append(projroot(os.getcwd()))

import pandas as pd
import urllib
import math
import numpy as np
from typing import Tuple, Any
from IPython.display import display, Math, Latex, display_latex, HTML

 # Bayes formula


In [2]:
# HIDDEN 
display(Latex(r""" 
\begin{equation}
P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} 
\end{equation}"""))




 Over the years I've encountered numerous probability puzzles.  These puzzles usually defy human intuition and have
 surprising answers.  The common thread is that they can all be solved with Bayes' theorem in a
 consistent way.  Below we will go through three examples.

 # Example 1


  The probability of a disease in the population is 1%.  A diagnostic test is 90% accurate on a diseased person, with a 5%
  false-positive rate on a healthy person.  If a randomly selected person tests positive, what's the probability
  that they actually have the disease?

  - Let A: has disease
  - Let B: positively diagnosed
  - $P(B \mid A)$: probability of a positive diagnosis given the disease = 0.9
  - $P(A)$: probability of having the disease = 0.01
  - $P(B)$: probability of a positive diagnosis = 0.01 * 0.9 + 0.99 * 0.05 = 0.0585
  - $P(A \mid B)$: probability of having the disease given a positive diagnosis = (0.9 * 0.01) / 0.0585 = 0.154, or about 15%

 A positive diagnosis results in only a 15% probability of actually having the disease.  This seems low, but it makes sense intuitively:
 because the disease is rare, a randomly chosen person who tests positive is more likely to be healthy and misdiagnosed than to be truly ill.  A common-sense
 thing to do is to get a second test.  We can now replace $P(A)$ with 0.15 instead of 0.01; applying Bayes' theorem
 again (assuming the second test errs independently of the first) shows that a second positive diagnosis raises the probability of having the disease to about 76%.
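 The two updates above can be reproduced with a few lines of Python.  This is only a minimal sketch: the function name `bayes_posterior` and its parameter names are my own, and it assumes the two tests are independent given the person's true disease status.  With the unrounded first posterior as the new prior, the second update comes out at roughly 0.77 (about 0.76 if the prior is rounded to 0.15).

```python
def bayes_posterior(prior: float, p_pos_given_disease: float, p_pos_given_healthy: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem."""
    # P(B): total probability of a positive test (true positives + false positives)
    p_positive = prior * p_pos_given_disease + (1 - prior) * p_pos_given_healthy
    return p_pos_given_disease * prior / p_positive


# First positive test: the prior is the 1% base rate
first = bayes_posterior(0.01, 0.9, 0.05)
# Second positive test: the previous posterior becomes the new prior
# (assumption: the second test is independent of the first given disease status)
second = bayes_posterior(first, 0.9, 0.05)

print(f"after one positive test:  {first:.3f}")
print(f"after two positive tests: {second:.3f}")
```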




 # Example 2

 Monty Hall.  There are 3 closed doors, 1 with a prize behind it.  The contestant randomly selects 1 door.  The host, knowing what's
 behind the doors, picks one of the remaining two doors that is empty and reveals it to the contestant.  The host then asks the contestant
 whether he/she wants to switch.  Should the contestant switch?

 Before we reframe the question in Bayesian terms, let's fix the parameters: say the contestant picks door1 and
 the host reveals door3.  So the question becomes

 1. $P(Prize=door1 \mid Open=door3)$:  probability of prize behind door1 given host opens door3
 2. $P(Prize=door2 \mid Open=door3)$:  probability of prize behind door2 given host opens door3


 Which of the above two is greater?  If 2 is greater, then definitely switch!

 Note that $P(Prize=door3 \mid Open=door3)$ is not under consideration: by the definition of the problem the host never
 reveals the prize, so this case is eliminated from the probability space.

 Next we solve for 1 and 2

 1. $P(Prize=door1 \mid Open=door3)$ = $\frac{P(Open=door3 \mid Prize=door1) P(Prize=door1)}{P(Open=door3)}$
     - $P(Open=door3 \mid Prize=door1)$, probability that the host opens door3 given the prize is behind door1, = 1/2
     - $P(Prize=door1)$, probability that the prize is behind door1, = 1/3
     - $P(Open=door3)$, probability that the host opens door3 in general, = 1/3 * 1/2 + 1/3 * 1 + 1/3 * 0 = 1/2
     - result = (1/2 * 1/3) / (1/2) = 1/3
     - This also matches intuition: the contestant picked 1 door out of 3 at random and stayed with the pick, so the probability
       of winning should remain 1/3.

 2. $P(Prize=door2 \mid Open=door3)$ = $\frac{P(Open=door3 \mid Prize=door2) P(Prize=door2)}{P(Open=door3)}$
     - $P(Open=door3 \mid Prize=door2)$, probability that the host opens door3 given the prize is behind door2, = 1.  In other words, the host has no choice.
     - $P(Prize=door2)$, probability that the prize is behind door2, = 1/3
     - $P(Open=door3)$, probability that the host opens door3 in general, = 1/2
     - result = (1 * 1/3) / (1/2) = 2/3


 2/3 > 1/3, so switching is a good idea!
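 The same calculation can be written as a generic prior-times-likelihood update.  The sketch below uses exact fractions; the names `prior`, `likelihood_host` and `posterior` are mine, and the likelihoods encode the assumptions stated above (the contestant picked door1, and the host knowingly opens an empty door, choosing at random when two are available).

```python
from fractions import Fraction

doors = ["door1", "door2", "door3"]

# Prior: the prize is equally likely to be behind any door
prior = {d: Fraction(1, 3) for d in doors}

# Likelihood P(Open=door3 | Prize=d), with the contestant on door1
# and a host who knows where the prize is
likelihood_host = {
    "door1": Fraction(1, 2),  # host picks door2 or door3 at random
    "door2": Fraction(1),     # host is forced to open door3
    "door3": Fraction(0),     # host never reveals the prize
}


def posterior(prior, likelihood):
    """Bayes' theorem over a finite set of hypotheses."""
    # P(Open=door3): the normalising constant
    evidence = sum(prior[d] * likelihood[d] for d in prior)
    return {d: prior[d] * likelihood[d] / evidence for d in prior}


post_host = posterior(prior, likelihood_host)
for door, p in post_host.items():
    print(door, p)
```

 Staying with door1 wins with probability 1/3 and switching to door2 wins with probability 2/3.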

 Another way to think of this problem is that when the host removes an empty door, the host removes uncertainty:
 the 2 doors the contestant did not pick together have a 2/3 chance of containing the prize; now that the empty one is removed, the entire
 2/3 probability is assigned to the only remaining closed door.

 Now suppose the question is modified as follows: the host's toddler son wanders in and accidentally knocks over door3,
 which is revealed to be empty.  Does switching doors improve the contestant's probability of winning?

 Here the evidence $Open=door3$ means that door3 was knocked over and turned out to be empty.  The toddler knocks over
 door2 or door3 at random, regardless of where the prize is.

 1. $P(Prize=door1 \mid Open=door3)$ = $\frac{P(Open=door3 \mid Prize=door1) P(Prize=door1)}{P(Open=door3)}$
     - $P(Open=door3 \mid Prize=door1)$, probability that the toddler knocks over door3 and it is empty given the prize is behind door1, = 1/2
     - $P(Prize=door1)$, probability that the prize is behind door1, = 1/3
     - $P(Open=door3)$, probability that the toddler knocks over door3 and it is empty in general, = 1/3 * 1/2 + 1/3 * 1/2 + 1/3 * 0 = 1/3
     - result = (1/2 * 1/3) / (1/3) = 1/2

 2. $P(Prize=door2 \mid Open=door3)$ = $\frac{P(Open=door3 \mid Prize=door2) P(Prize=door2)}{P(Open=door3)}$
     - $P(Open=door3 \mid Prize=door2)$, probability that the toddler knocks over door3 and it is empty given the prize is behind door2, = 1/2
     - $P(Prize=door2)$, probability that the prize is behind door2, = 1/3
     - $P(Open=door3)$, probability that the toddler knocks over door3 and it is empty in general, = 1/3
     - result = (1/2 * 1/3) / (1/3) = 1/2


 1/2 = 1/2, switching makes no difference!
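 To compare the two scenarios side by side, here is a small exact enumeration.  It is a sketch under my own assumptions: the function name `posterior_after_door3_opens` is hypothetical, the contestant is always on door1, and in the toddler case the evidence is taken to include the fact that door3 was found empty.  Under those assumptions both remaining doors end up equally likely in the toddler case, so switching neither helps nor hurts.

```python
from fractions import Fraction
from typing import Dict


def posterior_after_door3_opens(host_knows: bool) -> Dict[int, Fraction]:
    """Posterior over the prize door, given door 3 was opened and found empty.

    The contestant is assumed to have picked door 1.
    """
    weights: Dict[int, Fraction] = {}
    for prize in (1, 2, 3):
        if host_knows:
            # A knowing host only opens an empty door other than door 1
            options = [d for d in (2, 3) if d != prize]
        else:
            # The toddler knocks over door 2 or door 3 at random
            options = [2, 3]
        for opened in options:
            weight = Fraction(1, 3) * Fraction(1, len(options))
            # Keep only outcomes matching the evidence: door 3 opened AND empty
            if opened == 3 and prize != 3:
                weights[prize] = weights.get(prize, Fraction(0)) + weight
    total = sum(weights.values())
    return {prize: w / total for prize, w in weights.items()}


print("host knows:", posterior_after_door3_opens(host_knows=True))
print("toddler:   ", posterior_after_door3_opens(host_knows=False))
```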

 From the contestant's perspective, both scenarios appear the same.  However, the actual probability of winning the prize
 after switching doors differs a great deal depending on the **circumstances** behind the scenario: whether the empty door
 was opened with or without knowledge of the prize location.  Reflect on this for a moment,
 then proceed to the next example.


 # Example 3

 ## Version 1
 We have a random sample of parents with exactly two children.  We then ask one parent whether at least one of the children
 is a daughter.  If the parent says yes, what's the probability that the other child is also a daughter?

 If the other child is also a daughter, then both children are daughters; we denote this event as DD.  We denote the event
 that at least one child is a daughter as D.  So we are seeking $P(DD \mid D)$.

 $P(DD \mid D)$ = $\frac{P(D \mid DD) P(DD)}{P(D)}$

   - $P(D \mid DD)$, probability at least one is a daughter given both are daughters: 1
   - $P(DD)$, probability both are daughters: 1/4
   - $P(D)$, probability of at least one daughter = 1 - probability of both sons = 1 - 1/4 = 3/4
   - result = (1 * 1/4) / (3/4) = 1/3


 The probability that the other child is a daughter is 1/3.
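 The 1/3 can also be checked by simply listing the four equally likely two-child families.  This is a minimal sketch that assumes each child is independently a daughter or a son with probability 1/2; the variable names are mine.

```python
from fractions import Fraction
from itertools import product

# All (older, younger) combinations; assumed equally likely
families = list(product("DS", repeat=2))

# Condition on the parent's answer: at least one daughter
at_least_one_daughter = [f for f in families if "D" in f]

# Among those, the families where both children are daughters
both_daughters = [f for f in at_least_one_daughter if f == ("D", "D")]

p_dd_given_d = Fraction(len(both_daughters), len(at_least_one_daughter))
print(p_dd_given_d)
```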


 ## Version 2
 Same as version 1, except we ask each parent whether they have a daughter named Lucy.  If the parent says yes, what's the
 probability that the other child is also a daughter?

 Now we are looking for $P(DD \mid Lucy)$.

 $P(DD \mid Lucy)$ = $\frac{P(Lucy \mid DD) P(DD)}{P(Lucy)}$

   - $P(DD)$, probability both are daughters: 1/4
   - Let L be the probability that any given girl is named Lucy.  L is small, and its actual value doesn't really matter.
   - $P(Lucy \mid DD)$, probability that at least one girl is named Lucy given there are two daughters: approximately 2 * L (ignoring the negligible chance that both are named Lucy).
   - $P(Lucy)$, probability that the family has a daughter named Lucy: 1/4 * 2 * L (two daughters) + 1/2 * L (exactly one daughter) = L
   - result = (2 * L * 1/4) / L = 1/2


 So the probability that the other child is a daughter is approximately 1/2.
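 To see how close to 1/2 the answer is for a given name frequency, the calculation can be done without dropping the small terms.  The sketch below rests on my own simplifying assumptions: each girl is named Lucy independently with probability L (so two Lucys in one family are allowed), and the function name is hypothetical.  Under those assumptions the result works out to (2 - L) / (4 - L), which approaches 1/2 as L shrinks.

```python
from fractions import Fraction


def p_two_daughters_given_lucy(L: Fraction) -> Fraction:
    """P(DD | family has a daughter named Lucy), L = P(a given girl is named Lucy)."""
    # P(Lucy | DD): at least one of the two daughters is named Lucy
    p_lucy_given_dd = 1 - (1 - L) ** 2
    # P(Lucy): two-daughter families (1/4) plus one-daughter families (1/2)
    p_lucy = Fraction(1, 4) * p_lucy_given_dd + Fraction(1, 2) * L
    return Fraction(1, 4) * p_lucy_given_dd / p_lucy


for L in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(L, float(p_two_daughters_given_lucy(L)))
```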


 This example also illustrates the essence of Bayesian reasoning: the posterior probability depends on the inputs (priors and likelihoods), that is, on
 our knowledge of the world!