# Report02 - Nathan Yee

This notebook contains Report 02 for Computational Bayesian Statistics, Fall 2016.

MIT License: https://opensource.org/licenses/MIT

In [2]:
import numpy as np

import thinkbayes2
from thinkbayes2 import Pmf, Cdf, Suite
import thinkplot

%matplotlib inline

## Original type 1 - multi-colored pens

Suppose you have a black, a violet, a red, and a green pen. Each pen is identifiable by its colored cap or its colored ink. For example, the red pen has a red cap and red ink. The cap color is visible, but the ink color can only be seen on paper. Suppose you randomly mix up the pens' caps so that every cap is equally likely to end up on any ink color. Later that day, you use the black-capped pen and the ink is not black. What is the probability that the red-capped pen has red ink?

To keep the hypothesis names short, I will use the following naming convention.  
B: Black  
V: Violet  
R: Red  
G: Green  
c: cap  
i: ink  
n: not  

So BcnBi stands for: "Black cap not Black ink"

### Solution 1 - Two Hypotheses

This problem can be solved with either two or four hypotheses. We will start with two hypotheses:  
hypo 1: The red cap has red ink  
hypo 2: The red cap does not have red ink  

In [15]:
class Pens2(Suite):

    def Likelihood(self, data, hypo):
        """Computes the likelihood of `data` given `hypo`. In this case, we only
        update once, with a single piece of data, so this function serves as a
        lookup for the precomputed likelihoods.
        
        data: BcnBi
        hypo: RcRi, RcnRi
        
        returns: float
        """
        if hypo == 'RcRi':
            return 2/3
        elif hypo == 'RcnRi':
            return 7/9
        else:
            return 0  # unrecognized hypothesis name

With two hypotheses, calculating the likelihood is a little tricky.  
For the first hypothesis, RcRi, the probability that the black cap does not have black ink is 2/3. With the red ink taken, the black cap is on one of the three remaining inks (G, V, B), and two of them (G, V) are not black.  
For the second hypothesis, RcnRi, the probability that the black cap does not have black ink is 7/9. To see why, break it into three equally likely scenarios: the red cap has violet, green, or black ink. If RcVi, the black cap could have red, green, or black ink, and two of those three are not black. If RcGi, the black cap could have red, violet, or black ink (again 2 out of 3). If RcBi, the black cap could have red, green, or violet ink (3 out of 3). Because the three scenarios are equally likely, the probability that the black cap does not have black ink given RcnRi is (2 + 2 + 3)/(3 + 3 + 3) = 7/9.
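
The two likelihoods can be double-checked by brute force. The following is a minimal sketch, assuming that "randomly mix up the caps" means all 24 cap-to-ink arrangements are equally likely. The names `CAPS` and `likelihood_black_mismatch` are illustrative and are not part of thinkbayes2.

```python
from fractions import Fraction
from itertools import permutations

CAPS = ('B', 'V', 'R', 'G')  # cap colors; the inks use the same letters


def likelihood_black_mismatch(red_has_red_ink):
    """P(black cap is not on black ink | red cap is / is not on red ink)."""
    matching = 0  # arrangements consistent with the hypothesis
    mismatch = 0  # ...of which the black cap is not on black ink
    for inks in permutations(CAPS):
        cap_to_ink = dict(zip(CAPS, inks))
        if (cap_to_ink['R'] == 'R') != red_has_red_ink:
            continue  # arrangement contradicts the hypothesis
        matching += 1
        if cap_to_ink['B'] != 'B':
            mismatch += 1
    return Fraction(mismatch, matching)


print(likelihood_black_mismatch(True))   # hypothesis RcRi
print(likelihood_black_mismatch(False))  # hypothesis RcnRi
```

Under that assumption the two calls return 2/3 and 7/9, matching the values hard-coded in `Likelihood`.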

In [16]:
pens2 = Pens2()
pens2['RcRi'] = 1  # possible ink colors: R
pens2['RcnRi'] = 3 # possible ink colors: V, G, B

In [17]:
pens2.Normalize()
pens2.Print()

RcRi 0.25
RcnRi 0.75


In [18]:
pens2.Update('BcnBi')
pens2.Print()

RcRi 0.2222222222222222
RcnRi 0.7777777777777778


After a single update, the probability that the red-capped pen has red ink has fallen from 25% to 22.2%.
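
The same number can be checked by hand. Applying Bayes' theorem with the priors (1/4 and 3/4) and the likelihoods (2/3 and 7/9) used in this section gives:

$$
P(\text{RcRi} \mid \text{BcnBi})
= \frac{\frac{1}{4}\cdot\frac{2}{3}}{\frac{1}{4}\cdot\frac{2}{3} + \frac{3}{4}\cdot\frac{7}{9}}
= \frac{\frac{1}{6}}{\frac{1}{6} + \frac{7}{12}}
= \frac{\frac{1}{6}}{\frac{3}{4}}
= \frac{2}{9} \approx 0.222
$$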

### Solution 1 - Four Hypotheses
In the previous two-hypothesis solution, writing the likelihood function was somewhat difficult. We will redo the problem with four hypotheses to get a simpler likelihood function.

In [19]:
class Pens4(Suite):

    def Likelihood(self, data, hypo):
        """Computes the likelihood of `data` given `hypo`. In this case, we only
        update once, with a single piece of data, so this function serves as a
        lookup for the precomputed likelihoods.
        
        data: BcnBi
        hypo: RcRi, RcVi, RcGi, RcBi
        
        returns: float
        """
        if hypo == 'RcRi':
            return 2/3
        elif hypo == 'RcVi':
            return 2/3
        elif hypo == 'RcGi':
            return 2/3
        elif hypo == 'RcBi':
            return 3/3
        else:
            return 0  # unrecognized hypothesis name

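With four hypotheses, calculating the likelihood is much easier. Each hypothesis fixes the ink under the red cap, which leaves three equally likely inks for the black cap:  
'RcRi': black cap could have violet, green, or black ink (2 of 3 are not black)  
'RcVi': black cap could have red, green, or black ink (2 of 3 are not black)  
'RcGi': black cap could have red, violet, or black ink (2 of 3 are not black)  
'RcBi': black cap could have red, violet, or green ink (3 of 3 are not black)  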
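
The counting in the list above can be written out directly. This is a rough sketch that uses plain dictionaries instead of `Suite`; `INKS` and `likelihood` are hypothetical names introduced only for illustration, and the code assumes the black cap is equally likely to sit on any of the three inks not under the red cap.

```python
from fractions import Fraction

INKS = ['R', 'V', 'G', 'B']


def likelihood(red_cap_ink):
    """P(black cap is not on black ink | red cap is on `red_cap_ink`)."""
    # Inks still available to the black cap once the red cap's ink is fixed.
    remaining = [ink for ink in INKS if ink != red_cap_ink]
    not_black = [ink for ink in remaining if ink != 'B']
    return Fraction(len(not_black), len(remaining))


# Uniform prior over the ink under the red cap, then one Bayesian update.
prior = {ink: Fraction(1, 4) for ink in INKS}
unnormalized = {ink: prior[ink] * likelihood(ink) for ink in INKS}
total = sum(unnormalized.values())
posterior = {ink: p / total for ink, p in unnormalized.items()}

print(posterior['R'])  # posterior probability of RcRi
```

Working in `Fraction` keeps the arithmetic exact, so the result for red ink under the red cap is exactly 2/9 rather than a rounded float.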

In [25]:
hypos = ['RcRi', 'RcVi', 'RcGi', 'RcBi']
pens4 = Pens4(hypos)
pens4.Print()

RcBi 0.25
RcGi 0.25
RcRi 0.25
RcVi 0.25


In [26]:
pens4.Update('BcnBi')
pens4['RcRi']

0.22222222222222224

It turns out that it does not matter whether we use two or four hypotheses. After seeing that the black cap does not have black ink, the probability that the red cap has red ink comes out to 22.2% either way.
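
As an independent sanity check, the answer can also be estimated by simulation. The sketch below is not part of the original analysis: it assumes the caps are shuffled uniformly at random, discards every trial in which the black cap lands on black ink, and reports how often the red cap is on red ink among the remaining trials. The function name `simulate`, the trial count, and the seed are arbitrary choices.

```python
import random


def simulate(trials=100_000, seed=0):
    """Estimate P(red cap on red ink | black cap not on black ink)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    caps = ['B', 'V', 'R', 'G']
    kept = 0       # trials where the black cap is not on black ink
    red_match = 0  # ...of which the red cap is on red ink
    for _ in range(trials):
        inks = caps[:]
        rng.shuffle(inks)  # random assignment of inks to caps
        cap_to_ink = dict(zip(caps, inks))
        if cap_to_ink['B'] == 'B':
            continue  # inconsistent with the observation, so discard
        kept += 1
        if cap_to_ink['R'] == 'R':
            red_match += 1
    return red_match / kept


estimate = simulate()
print(estimate)
```

With enough trials the estimate should land close to 2/9 ≈ 0.222, in agreement with both Bayesian updates.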