In [1]:
from __future__ import print_function, division

%matplotlib inline

from thinkbayes2 import Hist, Pmf, Suite
from sympy import symbols

Suppose there are two full bowls of cookies. Bowl #1 has 10 chocolate chip 
and 30 plain cookies, while bowl #2 has 20 of each. Our friend Fred picks 
a bowl at random, and then picks a cookie at random. We may assume there is
no reason to believe Fred treats one bowl differently from another, 
likewise for the cookies. The cookie turns out to be a plain one. How 
probable is it that Fred picked it out of Bowl #1?

First, I defined the hypotheses: the cookie came from bowl 1, or the
cookie came from bowl 2. Before seeing any data, each bowl is equally
likely, so the priors p(H) are set equal (line one). Next, I multiplied
each prior by the likelihood of drawing a plain cookie from that bowl: 75 percent for bowl 1 (line two) and 50 percent for bowl 2 (line three). Finally, I normalized (line four) and printed the results (line five).

In [13]:
pmf = Pmf(dict(Bowl1 = 1,Bowl2 = 1)) #line one
pmf['Bowl1'] *= .75 #line two
pmf['Bowl2'] *= .5 #line three
pmf.Normalize() #line four
pmf.Print() #line five

Bowl1 0.6
Bowl2 0.4
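
The same update can be checked without thinkbayes2. The sketch below is a minimal stand-in, not the library's implementation; `bayes_update` is a hypothetical helper name introduced here, and the likelihoods are written as the cookie counts from the problem statement.

```python
def bayes_update(priors, likelihoods):
    """Multiply each prior by its likelihood, then normalize to sum to 1."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Equal priors; likelihood of a plain cookie is 30/40 for bowl 1, 20/40 for bowl 2.
cookie_posterior = bayes_update(
    priors={"Bowl1": 1, "Bowl2": 1},
    likelihoods={"Bowl1": 30 / 40, "Bowl2": 20 / 40},
)
print(cookie_posterior)
```

Because the result is normalized at the end, the priors can be any pair of equal numbers, which is why 1 and 1 work as well as 0.5 and 0.5.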


Suppose you're on a game show, and you're given the choice of three doors: 
Behind one door is a car; behind the others, goats. You pick a door, say 
Door A [but the door is not opened], and the host, who knows what's behind 
the doors, opens another door, say Door B, which has a goat. He then says 
to you, "Do you want to pick Door C?" Is it to your advantage to switch 
your choice? He opens B with probability p and C with probability 1-p.

Because the answer to this problem is written in terms of a variable, the variable must be defined first (line one). Before the host opens a door, each door is equally likely to hide the car, so the pmf is initialized with A, B, and C having equal weight (line two). Each prior p(H) is then multiplied by the likelihood of the host opening door B under that hypothesis. If the car is behind door A, the host opens door B with probability p, as stated in the problem (line three). The host cannot open door B if the car is behind door B, so that likelihood is zero (line four). The host must open door B if the car is behind door C, so that likelihood is 1 (line five). Finally, as always, I normalized and printed the result (lines six and seven).

In [3]:
p = symbols('p') #line one
pmf = Pmf('ABC') #line two
pmf['A'] *= p #line three
pmf['B'] *= 0 #line four
pmf['C'] *= 1 #line five
pmf.Normalize() #line six
pmf.Print() #line seven

A 0.333333333333333*p/(0.333333333333333*p + 0.333333333333333)
B 0
C 0.333333333333333/(0.333333333333333*p + 0.333333333333333)
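
The symbolic output above simplifies to p/(p+1) for door A and 1/(p+1) for door C, since the common factor of one third cancels. As a rough check, the sketch below evaluates the same update for a few concrete values of p using exact fractions; `monty_posterior` is a hypothetical helper name, not part of thinkbayes2.

```python
from fractions import Fraction

def monty_posterior(p):
    """Posterior over the car's location after you pick A and the host opens B.

    p is the probability that the host opens B when the car is behind A.
    """
    prior = Fraction(1, 3)
    unnormalized = {"A": prior * p, "B": prior * 0, "C": prior * 1}
    total = sum(unnormalized.values())
    return {door: value / total for door, value in unnormalized.items()}

# Try a host who never, sometimes, or always opens B when the car is behind A.
for p in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(p, monty_posterior(p))
```

With p = 1/2 this gives the familiar answer of 2/3 for switching to C. Under these assumptions, switching never does worse than staying for any p between 0 and 1, because 1/(p+1) is at least p/(p+1).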


According to the CDC, "Compared to nonsmokers, men who smoke are about 23 
times more likely to develop lung cancer and women who smoke are about 13 
times more likely." Also, among adults in the U.S. in 2014, the following 
were smokers:

    Nearly 19 of every 100 adult men (18.8%)
    Nearly 15 of every 100 adult women (14.8%)

If you learn that a woman has been diagnosed with lung cancer, and you know
nothing else about her, what is the probability that she is a smoker?

The priors p(H) come from the problem statement and are set in the pmf 
initialization (line one). The likelihoods are 13 for smokers and 1 for 
nonsmokers, because women who smoke are about 13 times more likely to 
develop lung cancer; each prior is multiplied by its likelihood (lines two 
and three). Finally, I normalized and printed the result (lines four and five).

In [5]:
pmf = Pmf(dict(Smoker = 14.8, Nonsmoker = 85.2)) #line one
pmf['Smoker'] *= 13 #line two
pmf['Nonsmoker'] *= 1 #line three
pmf.Normalize() #line four
pmf.Print() #line five


Nonsmoker 0.306916426513
Smoker 0.693083573487
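
An alternative way to reach the same number is the odds form of Bayes's rule: posterior odds equal prior odds times the likelihood ratio. The sketch below uses only the figures from the problem statement; the variable names are illustrative.

```python
# Prior odds of smoker : nonsmoker among adult women (14.8% vs. 85.2%).
prior_odds = 14.8 / 85.2

# Women who smoke are about 13 times more likely to develop lung cancer.
likelihood_ratio = 13

posterior_odds = prior_odds * likelihood_ratio

# Convert odds back to a probability.
posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 4))
```

The printed value agrees with the Smoker entry above to four decimal places.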


The blue M&M was introduced in 1995.  Before then, the color mix in a bag 
of plain M&Ms was (30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 
10% Tan).  Afterward it was (24% Blue , 20% Green, 16% Orange, 14% Yellow, 
13% Red, 13% Brown). 

A friend of mine has two bags of M&Ms, and he tells me that one is from 
1994 and one from 1996.  He won't tell me which is which, but he gives me 
one M&M from each bag.  One is yellow and one is green.  What is the 
probability that the yellow M&M came from the 1994 bag?

For this problem, I defined Mixa as the case where the yellow M&M came from
the 1994 bag and Mixb as the case where the yellow M&M came from the 1996 bag (line one). Before seeing the data, the two cases are equally likely, so each gets a weight of 1 in the pmf initialization. The likelihood for each hypothesis is the probability of drawing the yellow M&M from its bag multiplied by the probability of drawing the green M&M from the other bag. As always, the priors p(H) are multiplied by the likelihoods (lines two and three). Finally, I normalized and printed the results (lines four and five).

In [7]:
pmf = Pmf(dict(Mixa = 1,Mixb = 1)) #line one
pmf['Mixa'] *= .2*.2 #line two
pmf['Mixb'] *= .14*.1 #line three
pmf.Normalize() #line four
pmf.Print() #line five

Mixa 0.740740740741
Mixb 0.259259259259
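
To make the origin of the two likelihoods explicit, the sketch below looks them up from the color mixes given in the problem rather than typing the products by hand. The dictionary and variable names are illustrative.

```python
# Color mixes in percent, as given in the problem statement.
mix94 = dict(brown=30, yellow=20, red=20, green=10, orange=10, tan=10)
mix96 = dict(blue=24, green=20, orange=16, yellow=14, red=13, brown=13)

# Mixa: yellow from the 1994 bag, green from the 1996 bag.
like_a = mix94["yellow"] * mix96["green"]
# Mixb: yellow from the 1996 bag, green from the 1994 bag.
like_b = mix96["yellow"] * mix94["green"]

# Equal priors cancel, so the posterior is the normalized likelihood.
posterior_a = like_a / (like_a + like_b)
posterior_b = like_b / (like_a + like_b)
print(posterior_a, posterior_b)
```

Working in percentages instead of fractions does not change the posterior, because the scale factor cancels during normalization.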


Elvis Presley had a twin brother who died at birth.  What is 
the probability that Elvis was an identical twin?

First, the priors p(H) are defined in the pmf initialization (line one): 8 percent of twins are taken to be identical and 92 percent fraternal. The likelihood of two boys is one half for identical twins, because the only two possibilities are two boys and two girls. The likelihood of two boys is one quarter for fraternal twins, because there are four equally likely boy-girl combinations. In the code, the likelihoods are written as 1 and .5, because only their ratio matters: two boys are twice as likely with identical twins. Finally, I multiplied by the likelihoods, normalized, and printed the result (lines two through five).

In [None]:
pmf = Pmf(dict(identical=8,fraternal=92)) #line one
pmf['identical'] *= 1 #line two
pmf['fraternal'] *= .5 #line three
pmf.Normalize() #line four
pmf.Print() #line five
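
The claim that only the ratio of the likelihoods matters can be checked directly. The sketch below, which assumes the same 8 and 92 priors as the cell above, computes the posterior twice with exact fractions: once with likelihoods 1 and 1/2, and once with 1/2 and 1/4. `posterior_identical` is a hypothetical helper name.

```python
from fractions import Fraction

def posterior_identical(like_identical, like_fraternal):
    """Posterior probability of identical twins, using priors of 8 and 92."""
    prior_identical, prior_fraternal = 8, 92
    identical = prior_identical * like_identical
    fraternal = prior_fraternal * like_fraternal
    return identical / (identical + fraternal)

# Likelihoods as written in the code cell (1 and .5).
scaled = posterior_identical(Fraction(1), Fraction(1, 2))
# Likelihoods as described in the text (one half and one quarter).
unscaled = posterior_identical(Fraction(1, 2), Fraction(1, 4))
print(scaled, unscaled)
```

Both calls return 4/27, roughly 0.148, so rescaling the likelihoods by a common factor leaves the posterior unchanged.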

According to a recent study, 25 percent of girls aged 3 are less than 36
inches tall. Similarly, 10 percent of boys aged 3 are less than 36 inches tall. 
What are the odds that a 3-year-old who is less than 36 inches tall is a boy?

A child is essentially equally likely to be a boy or a girl, so the priors 
p(H) are both 1 (line one). The likelihoods are the percentages given in the problem. I multiplied the priors by the likelihoods (lines two and three), normalized (line four), and printed the results (line five).

In [12]:
pmf = Pmf(dict(Boy=1,Girl=1)) #line one
pmf['Boy'] *= .1 #line two
pmf['Girl'] *= .25 #line three
pmf.Normalize() #line four
pmf.Print() #line five

Boy 0.285714285714
Girl 0.714285714286
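
The question asks for odds, while the pmf reports probabilities. A minimal sketch of the conversion, using exact fractions and the likelihoods from the problem, might look like this; the variable names are illustrative.

```python
from fractions import Fraction

# Likelihood of being under 36 inches for each hypothesis (equal priors cancel).
boy = Fraction(1, 10)
girl = Fraction(1, 4)

# Posterior probability that the child is a boy.
posterior_boy = boy / (boy + girl)

# Odds in favor of a boy: probability divided by its complement.
odds_boy = posterior_boy / (1 - posterior_boy)
print(posterior_boy, odds_boy)
```

The posterior probability is 2/7, matching the Boy entry above, which corresponds to odds of 2 to 5 that the child is a boy.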


The national unemployment rate is 4.9 percent. The highest unemployment rate in the country is in New Mexico, at 6.7 percent. What are the odds that an unemployed person is from New Mexico?

First, I defined the priors p(H) for New Mexico and the nation based on the 
share of the national population that lives in New Mexico 
(.65 percent) (line one). Next, I multiplied by the likelihoods, which are the New Mexico and national unemployment rates (lines two and three). Finally, I normalized and printed the results (lines four and five).

In [14]:
pmf = Pmf(dict(NM=.0065,Nation=1)) #line one
pmf['NM'] *= .067 #line two
pmf['Nation'] *= .049 #line three
pmf.Normalize() #line four
pmf.Print() #line five

NM 0.00880945878974
Nation 0.99119054121
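
As a cross-check, Bayes's rule can also be applied directly, treating the national rate as the overall probability of being unemployed instead of as a competing hypothesis. This is a sketch under that assumption, using the same three figures as the cell above; the variable names are illustrative.

```python
# Figures from the problem statement and the explanation above.
p_nm = 0.0065             # share of the national population living in New Mexico
p_unemp_given_nm = 0.067  # New Mexico unemployment rate
p_unemp = 0.049           # national unemployment rate (includes New Mexico)

# Bayes's rule: P(NM | unemployed) = P(NM) * P(unemployed | NM) / P(unemployed)
p_nm_given_unemp = p_nm * p_unemp_given_nm / p_unemp
print(p_nm_given_unemp)
```

This comes out slightly higher than the NM entry above (about 0.0089 rather than 0.0088). The gap appears because the "Nation" hypothesis in the cell includes New Mexico, so the two hypotheses there are not mutually exclusive.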
