# Report 02
### Daniel Bishop

License: Attribution 4.0 International (CC BY 4.0) 

In [25]:
from thinkbayes2 import Pmf

## The Chess Problem
### Copied from Quiz 1

> Two identical twins are members of my chess club, but they never show up on the same day; in fact, they strictly alternate the days they show up. I can't tell them apart except that one is a better player than the other: Avery beats me 60% of the time and I beat Blake 70% of the time. If I play one twin on Monday and win, and the other twin on Tuesday and lose, which twin did I play on which day?

Since the twins always alternate their attendance, there are only two possible hypotheses for who showed up on each day:
* Avery on Monday and then Blake on Tuesday, or
* Blake on Monday and then Avery on Tuesday

In [73]:
class Chess(Pmf):
    def __init__(self, hypos):
        Pmf.__init__(self)
        for hypo in hypos:
            self.Set(hypo, 1)
        self.Normalize()
        self.day = False
        
    def Update(self, data):
        self.day = not self.day  # move to next day
        for hypo in self.Values():
            like = self.Likelihood(data, hypo)
            self.Mult(hypo, like)
        self.Normalize()
    
    '''
    the probability of the player winning or losing vs each opponent;
    'alternate' holds the same probabilities for the other twin
    '''
    player_outcomes = {
        'blake': {
            'win': .7,
            'lose': .3,
            'alternate': { # quick hack to let each hypo access the likelihoods of the other
                'win': .4,
                'lose': .6
            }
         },
        'avery': {
            'win': .4,
            'lose': .6,
            'alternate': {
                'win': .7,
                'lose': .3
            }
        }
    }
    
    '''
    data can be either 'win' or 'lose'.
    the likelihood of each outcome changes based on self.day, which flips on every
    update because each new piece of data represents a new day.
    
    the hypothesis is the opponent that the player faced on Monday, which is the first
    piece of data (self.day is True on Monday and False on Tuesday).
    '''
    def Likelihood(self, data, hypo):
        if (self.day):
            return self.player_outcomes[hypo][data]
        else:
            # choose the likelihood of the other opponent because they swap on the odd days
            return self.player_outcomes[hypo]["alternate"][data]
            

We can now use our model to predict which player we faced on Monday, and which we faced on Tuesday.

In [74]:
pmf = Chess(['blake', 'avery'])
dataset = ['win']
for data in dataset:
    pmf.Update(data)
for hypo, prob in pmf.Items():
    print(hypo, prob)

avery 0.36363636363636365
blake 0.6363636363636364


After the win on Monday, it is more likely (about 64%) that we played against Blake, since we are more likely to beat Blake (70%) than Avery (40%).
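
The Monday update can be checked by hand with Bayes's theorem. Writing $B$ for "Blake on Monday" and $A$ for "Avery on Monday", each with a prior of $0.5$ (the uniform prior the `Chess` class sets up), the posterior after one win is:

$$P(B \mid \text{win}) = \frac{P(\text{win} \mid B)\,P(B)}{P(\text{win} \mid B)\,P(B) + P(\text{win} \mid A)\,P(A)} = \frac{0.7 \cdot 0.5}{0.7 \cdot 0.5 + 0.4 \cdot 0.5} = \frac{7}{11} \approx 0.636$$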

In [75]:
pmf = Chess(['blake', 'avery'])
dataset = ['win', 'lose']
for data in dataset:
    pmf.Update(data)
for hypo, prob in pmf.Items():
    print(hypo, prob)

avery 0.22222222222222224
blake 0.7777777777777778


After losing on Tuesday, we are even more confident (about 78%) that we played against Blake on Monday, since that would mean that we played Avery on Tuesday, and Avery is twice as likely to beat us as Blake is (60% versus 30%).

In [76]:
pmf = Chess(['blake', 'avery'])
dataset = ['win', 'win']
for data in dataset:
    pmf.Update(data)
for hypo, prob in pmf.Items():
    print(hypo, prob)

avery 0.5
blake 0.5000000000000001


However, if we had instead won on Tuesday, the two hypotheses would be equally likely again: each one requires one win against Blake and one win against Avery, so the data cannot distinguish them.
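
As a cross-check on the three results above, here is a minimal sketch that computes the same posteriors with exact fractions and without `thinkbayes2`. The names `P_WIN`, `OTHER`, `likelihood`, and `posterior_blake_first` are hypothetical helpers introduced only for this illustration; the win probabilities come from the problem statement.

```python
from fractions import Fraction

# Probability that I win against each twin, taken from the problem statement.
P_WIN = {'blake': Fraction(7, 10), 'avery': Fraction(4, 10)}
OTHER = {'blake': 'avery', 'avery': 'blake'}


def likelihood(first_opponent, outcomes):
    """Probability of the outcome sequence if `first_opponent` played on day one."""
    prob = Fraction(1)
    opponent = first_opponent
    for outcome in outcomes:
        p_win = P_WIN[opponent]
        prob *= p_win if outcome == 'win' else 1 - p_win
        opponent = OTHER[opponent]  # the twins strictly alternate
    return prob


def posterior_blake_first(outcomes):
    """Posterior that Blake played on day one, starting from a 50/50 prior."""
    blake = likelihood('blake', outcomes)
    avery = likelihood('avery', outcomes)
    return blake / (blake + avery)


for outcomes in (['win'], ['win', 'lose'], ['win', 'win']):
    print(outcomes, posterior_blake_first(outcomes))
```

The exact values are 7/11, 7/9, and 1/2, which agree with the floating-point results printed by the `Chess` model.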

## Kim Rhode

From the chapter 4 exercises of Think Bayes:  

> At the 2016 Summer Olympics in the Women's Skeet event, Kim Rhode faced Wei Meng in the bronze medal match. They each hit 15 of 25 skeets, sending the match into sudden death. In the first round, both hit 1 of 2 skeets. In the next two rounds, they each hit 2 skeets. Finally, in the fourth round, Rhode hit 2 and Wei hit 1, so Rhode won the bronze medal, making her the first Summer Olympian to win an individual medal at six consecutive summer games.
  
> But after all that shooting, what is the probability that Rhode is actually a better shooter than Wei? If the same match were held again, what is the probability that Rhode would win?

From the single match that we are given information on, we take Kim's overall accuracy to be 64.5% and Wei's to be 61.3%, which corresponds to 20 of 31 and 19 of 31 shots. Treating these point estimates as each shooter's true per-shot accuracy, we can simulate rematches to estimate the probability that Kim Rhode would win.

In [114]:
import random

class Match():
    def __init__(self):
        self.kim_score = 0
        self.wei_score = 0

    def run(self):
        '''
        simulate standard match with overtime and return name of winner
        '''
        self.kim_score = 0
        self.wei_score = 0
        
        for i in range(25):
            if (random.random() <= .645):
                self.kim_score += 1
            if (random.random() <= .613):
                self.wei_score += 1
                
        if (self.kim_score == self.wei_score):
            self.overtime()
            
        if (self.kim_score > self.wei_score):
            return 'kim'
        else: 
            return 'wei'
            
    def overtime(self):
        for i in range(2):
            if (random.random() <= .645):
                self.kim_score += 1
            if (random.random() <= .613):
                self.wei_score += 1
        
        if (self.kim_score == self.wei_score):
            self.overtime()
        
kim_wins = 0
wei_wins = 0
match = Match()
num_matches = 100000
for i in range(num_matches):
    winner = match.run()
    if (winner == 'kim'):
        kim_wins += 1
    else:
        wei_wins += 1
        
print("kim wins: " + str(kim_wins/num_matches))
print("wei wins: " + str(wei_wins/num_matches))

kim wins: 0.59471
wei wins: 0.40529


From repeated simulation, we can estimate that Kim Rhode would win a rematch about 60% of the time, based on both shooters' accuracy during the Olympic match.
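
To get a sense of how precise that simulated estimate is, the sketch below computes the standard error of a binomial proportion from the printed result. It assumes the simulated matches are independent, and it only measures the sampling noise of the simulation; it says nothing about the uncertainty in the 64.5% and 61.3% accuracy figures themselves. The variable names are illustrative.

```python
import math

# Values copied from the simulation output above.
p_hat = 0.59471        # observed fraction of matches won by Kim
num_matches = 100000   # number of simulated matches

# Standard error of a binomial proportion: sqrt(p * (1 - p) / n).
std_err = math.sqrt(p_hat * (1 - p_hat) / num_matches)

# Approximate 95% interval using the normal approximation.
low = p_hat - 1.96 * std_err
high = p_hat + 1.96 * std_err

print(f"standard error: {std_err:.4f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

With 100,000 matches the standard error is under 0.002, so the run-to-run variation of the simulation is much smaller than the gap between the two shooters' win rates.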

## Original Problem

### Extended Socks in the Dark

Suppose you have two drawers of socks: Drawer 1 with 20% white socks and Drawer 2 with 60% white socks. The rest of the socks are black. You always pick from the same drawer. Each time you pick a sock, if it is black, you put it in Drawer 2. If it is white, you put it aside in a safe place. What is the probability that you were picking from Drawer 1 after drawing 3 black socks and then 2 white socks, in that order?

In [146]:
class Drawer():
    def __init__(self, white, black):
        self.white = white
        self.black = black
    
    def add(self, color):
        if (color == "white"):
            self.white += 1
        elif (color == "black"):
            self.black += 1
            
    def pick(self, color):
        if (color == "white"):
            self.white -= 1
        elif (color == "black"):
            self.black -= 1
    
    def getPercentTotal(self, color):
        if (color == "white"):
            mix = self.white / (self.white + self.black)
        elif (color == "black"):
            mix = self.black / (self.white + self.black)
        else:
            mix = 0
        
        if (mix < 0):
            return 0
        else:
            return mix

class Drawers(Pmf):
    def __init__(self, hypos):
        Pmf.__init__(self)
        for hypo in hypos:
            self.Set(hypo, 1)
        self.Normalize()
        self.mixes = {
            'Drawer 1': Drawer(20, 80),
            'Drawer 2': Drawer(60, 40)
        }
        
        
    def Update(self, data):
        for hypo in self.Values():
            like = self.Likelihood(data, hypo)
            self.Mult(hypo, like)
        self.Normalize()
        
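    def Likelihood(self, data, hypo):
        # remove the picked sock from the hypothesized drawer first,
        # so the fraction returned below is computed on the remaining socks
        self.mixes[hypo].pick(data)
        if (data == 'black'):
            # black socks always go into Drawer 2, whichever drawer they came from
            self.mixes['Drawer 2'].add(data)
                
        return self.mixes[hypo].getPercentTotal(data)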

Now that we have set up the model to accommodate the specific pick-and-drop method, we can update it with the picking of 3 black socks followed by 2 white socks.

In [147]:
pmf = Drawers(['Drawer 1', 'Drawer 2'])
dataset = ['black', 'black', 'black', 'white', 'white']
for data in dataset:
    pmf.Update(data)
for hypo, prob in pmf.Items():
    print(hypo, prob)

Drawer 1 0.44923621876057607
Drawer 2 0.5507637812394239


Thus Drawer 2 is the more likely source, with a probability of about 55%. It started with more white socks, and under that hypothesis the black socks were put back where they came from, so picking them did not change its contents.
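
The `Drawers` model removes each sock before computing its fraction, and both hypotheses share the same Drawer 2 counts. As a cross-check, here is a minimal sketch under slightly different assumptions: each hypothesis tracks only the drawer it picks from, the fraction is computed before the sock is removed, and each drawer starts with 100 socks as in the code above. The function names `sequence_likelihood` and `posterior_drawer_1` are hypothetical.

```python
from fractions import Fraction


def sequence_likelihood(white, black, draws, black_returns):
    """Probability of `draws` from one drawer holding `white` and `black` socks.

    White socks are always set aside. Black socks go back into the same
    drawer if `black_returns` is True, otherwise they leave it.
    """
    prob = Fraction(1)
    for color in draws:
        total = white + black
        if color == 'white':
            prob *= Fraction(white, total)  # fraction before removal
            white -= 1
        else:
            prob *= Fraction(black, total)
            if not black_returns:
                black -= 1
    return prob


def posterior_drawer_1(draws):
    """Posterior that the draws came from Drawer 1, with a 50/50 prior."""
    # Drawer 1: black socks move to Drawer 2, so Drawer 1 shrinks.
    like_1 = sequence_likelihood(20, 80, draws, black_returns=False)
    # Drawer 2: black socks are put back where they came from.
    like_2 = sequence_likelihood(60, 40, draws, black_returns=True)
    return like_1 / (like_1 + like_2)


draws = ['black', 'black', 'black', 'white', 'white']
print(float(posterior_drawer_1(draws)))
```

Under these assumptions the posterior for Drawer 1 comes out near 0.475, a little higher than the 0.449 printed above, so Drawer 2 remains the slightly more likely source.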