Making Choices with Monty (Hall, that is)
=======================================================

Adapted from: cs109.org

The Monty Hall Problem
-------------------------

Here's a fun and perhaps surprising probability riddle, and a good way to get some practice writing Python functions and using the other Python programming concepts we learned in the previous lessons.

In a gameshow, contestants try to guess which of 3 closed doors contains a cash prize (goats are behind the other two doors). Of course, the odds of choosing the correct door are 1 in 3. As a twist, the host of the show opens a door after a contestant makes his or her choice. This door is always one of the two the contestant did not pick, and is also always one of the goat doors (note that it is always possible to do this, since there are two goat doors). At this point, the contestant has the option of keeping his or her original choice, or switching to the other unopened door. The question is: is there any benefit to switching doors? The answer surprises many people who haven't heard the question before.



## The big picture
Let's start by listing some of the main components of simulating somebody playing the 3-door game. We'll implement each of the main components as a separate function and then have a main driver program that simulates the entire game many times by calling our functions. Our goal is to compare the strategy of always switching with the strategy of never switching by simulating each 10000 times and computing the winning percentage for each strategy.

* Simulate randomly putting the prize behind one of the doors
* ...
* ...
* ...
* ...
* Main program that calls these functions

#### Objectives

*   Generate random sequences of integers
*   Design functions to simulate the various parts of the Monty Hall 3-Door problem
*   Use Python help from the notebook
*   Write a master simulation program that uses the component functions we write
*   Report the results of a simulation run

### Simulating the Prize Door

Let's start by creating a simple simulation of randomly putting the prize behind one of the three doors. To do that, we need to be able to generate a random integer representing door 1, 2, or 3. Since Python uses "0-based" lists and arrays, we are going to number the doors 0, 1 and 2 to make our life easier. Let's play with numpy's random number generation features to figure out how to do this. Then, we'll spec out and write the actual function to simulate a series of doors containing the prize.

In [1]:
import numpy as np

In [2]:
# How to generate 10 random numbers between 0 and 2?
#np.random.randint?

In [3]:
print (np.random.randint(0,3,10))

[2 2 1 1 1 1 2 1 0 0]
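
The upper bound passed to `np.random.randint` is exclusive, which is why `randint(0,3,10)` never produces a 3. Here is a minimal sketch to confirm that on a larger sample. The seed value and the variable name `draws` are arbitrary choices for illustration, not part of the exercise.

```python
import numpy as np

np.random.seed(109)  # arbitrary seed, only so that reruns give the same draws

# low (0) is inclusive, high (3) is exclusive: values come from {0, 1, 2}
draws = np.random.randint(0, 3, 10000)

print(np.unique(draws))                  # distinct door numbers that appeared
print(np.bincount(draws) / len(draws))   # share of draws landing on each door
```

Each share should come out close to one third, which is what we want for a prize hidden uniformly at random.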


Ok, now you'll write a function called `simulate_prizedoor` to do this. See the detailed spec below and add your own code to complete the task. 

In [4]:

# Fill in key line
def simulate_prizedoor(nsim):
    """
    Function
    --------
    simulate_prizedoor

    Generate a random array of 0s, 1s, and 2s, representing
    hiding a prize behind door 0, door 1, or door 2

    Parameters
    ----------
    nsim : int
        The number of simulations to run

    Returns
    -------
    sims : array
        Random array of 0s, 1s, and 2s

    Example
    -------
    print(simulate_prizedoor(3))
    [0 0 2]

    Note that when you print a numpy array, by default, it displays it as above.
    If you do:

    prize_doors = simulate_prizedoor(3)
    type(prize_doors)

    you'll see that prize_doors is a numpy ndarray.
    """
    #compute here
    answer = np.random.randint(0,3,nsim)
    return answer



In [5]:
# Now test your function
print (simulate_prizedoor(3))

[2 0 0]


Next, write a function that simulates the contestant's guesses for `nsim` simulations. Call this function `simulate_guess`. The specs:

In [6]:
# Fill in missing line or lines
def simulate_guess(nsim):
    """
    Function
    --------
    simulate_guess

    Return any strategy for guessing which door a prize is behind. This
    could be a random strategy, one that always guesses 2, whatever.

    Parameters
    ----------
    nsim : int
        The number of simulations to generate guesses for

    Returns
    -------
    guesses : array
        An array of guesses. Each guess is a 0, 1, or 2

    Example
    -------
    # Here's output from strategy of always guessing door 0
    print(simulate_guess(5))
    [0 0 0 0 0]
    """
    #compute here
    guess = np.random.randint(0,3,nsim)
    return guess

In [7]:
# Now test your function
guesses = simulate_guess(10)
print (guesses)
print (type(guesses))

[2 0 2 0 1 2 0 0 0 1]
<class 'numpy.ndarray'>


Next, write a function, `goat_door`, to simulate randomly revealing one of the goat doors that a contestant didn't pick. This will likely take you a little more code. There are a few different cases to consider depending on where the prize is and which door the contestant picked. Certainly you can do it via "brute force" checking of the handful of possibilities. Here are the detailed function specs.

In [8]:

# You fill in missing parts

def goat_door(prizedoors,guesses):
    """
    Function
    --------
    goat_door

    Simulate the opening of a "goat door" that doesn't contain the prize,
    and is different from the contestant's guess

    Parameters
    ----------
    prizedoors : array
        The door that the prize is behind in each simulation
    guesses : array
        The door that the contestant guessed in each simulation

    Returns
    -------
    goats : array
        The goat door that is opened for each simulation. Each item is 0, 1, or 2, and is different
        from both prizedoors and guesses

    Examples
    --------
    print(goat_door(np.array([0, 1, 2]), np.array([1, 1, 1])))
    [2 2 0]
    """    
    #compute here
    # There is a more elegant way, but let's just brute force it
    nsims = len(prizedoors)
    # Use the builtin int: the np.int alias was removed in NumPy 1.24
    goat = np.zeros(nsims, dtype=int)
    for i in range(nsims):
        if prizedoors[i] == 0:
            if guesses[i] == 1:
                goat[i] = 2
            elif guesses[i] == 2:
                goat[i] = 1
            else:
                goat[i] = np.random.choice([1,2])
        
        elif prizedoors[i] == 1:
            if guesses[i] == 0:
                goat[i] = 2
            elif guesses[i] == 2:
                goat[i] = 0
            else :
                goat[i] = np.random.choice([0,2])
        
        elif prizedoors[i] == 2:
            if guesses[i] == 0:
                goat[i] = 1
            elif guesses[i] == 1:
                goat[i] = 0
            else:
                goat[i] = np.random.choice([0,1])
        
    return goat


**Hacker Extra** Now certainly there are shorter, more "Pythonic" ways of implementing the `goat_door` function. Push yourself to find one. Here's a skeleton of my alternative approach (with a key part left out). Hint: Learn about *list comprehensions* (e.g. see p418 in Python for Data Analysis book or just Google).

    def goat_door2(prizedoors,guesses):
    
        # Here's a shorter way
        nsims = ???
        goat = np.zeros(nsims, dtype=int)
        for i in range(nsims):
            possible_goats = ??? This is the key line ???
            goat[i] = np.random.choice(possible_goats)
        return goat

In [9]:
# Make this work
def goat_door2(prizedoors,guesses):

    # Here's a shorter way
    nsims = len(prizedoors)
    goat = np.zeros(nsims, dtype=int)
    for i in range(nsims):
        possible_goats = np.setdiff1d([0,1,2],[guesses[i],prizedoors[i]])
        goat[i] = np.random.choice(possible_goats)
    return goat
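
For comparison, the loop can be dropped entirely. The sketch below is one possible vectorized variant, not the intended solution to the exercise. The name `goat_door_vec` is hypothetical. It relies on two facts: the door numbers 0, 1 and 2 sum to 3, so when the prize and the guess differ the only goat door left is `3 - prize - guess`; and when they coincide, adding 1 or 2 modulo 3 to the guess gives one of the other two doors.

```python
import numpy as np

def goat_door_vec(prizedoors, guesses):
    """Vectorized sketch of goat_door (hypothetical name, same inputs/outputs)."""
    prizedoors = np.asarray(prizedoors)
    guesses = np.asarray(guesses)

    # Prize and guess differ: only one door is left for the host to open
    forced = 3 - prizedoors - guesses

    # Prize and guess coincide: step 1 or 2 doors away from the guess (mod 3)
    offsets = np.random.randint(1, 3, size=guesses.shape)
    free = (guesses + offsets) % 3

    # Pick the right case element by element
    return np.where(prizedoors == guesses, free, forced)

print(goat_door_vec(np.array([0, 2]), np.array([1, 1])))
```

Both games in the usage line have a forced reveal, so the printed doors are 2 and 0. The trade-off is readability: the loop versions spell out the rules of the game, while this one leans on arithmetic that needs a comment to explain.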

Now, clearly we need to test if all of our functions are working correctly.

In [10]:
prizes = simulate_prizedoor(10)
guesses = simulate_guess(10)
goats = goat_door(prizes,guesses)
goats2 = goat_door2(prizes,guesses)
print (prizes)
print (guesses)
print (goats)
print (goats2)

[2 1 1 1 2 1 1 2 2 2]
[1 2 1 2 2 0 2 0 2 1]
[0 0 0 0 1 2 0 1 1 0]
[0 0 0 0 1 2 0 1 1 0]
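
Comparing printed arrays by eye works for 10 games but not for 10000. As a rough sketch, the rules the host must follow can be turned into assertions. The helper name `check_goats` is made up for this illustration, and the arrays are copied from the output printed above.

```python
import numpy as np

def check_goats(prizedoors, guesses, goats):
    """Raise AssertionError if any revealed door breaks the rules of the game."""
    assert np.all(np.isin(goats, [0, 1, 2])), "goat door must be 0, 1, or 2"
    assert np.all(goats != prizedoors), "host revealed the prize"
    assert np.all(goats != guesses), "host opened the contestant's door"
    return True

# Arrays copied from the printed run above
prizes = np.array([2, 1, 1, 1, 2, 1, 1, 2, 2, 2])
guesses = np.array([1, 2, 1, 2, 2, 0, 2, 0, 2, 1])
goats = np.array([0, 0, 0, 0, 1, 2, 0, 1, 1, 0])

print(check_goats(prizes, guesses, goats))  # prints True
```

The same check can be pointed at the output of either `goat_door` or `goat_door2` on a much larger run.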


Write a function, `switch_guess`, that represents the strategy of always switching a guess after the goat door is opened.

In [11]:
#your code here
def switch_guess(guesses, goatdoors):
    """
    Function
    --------
    switch_guess

    The strategy that always switches a guess after the goat door is opened

    Parameters
    ----------
    guesses : array
         Array of original guesses, for each simulation
    goatdoors : array
         Array of revealed goat doors for each simulation

    Returns
    -------
    The new door after switching. Should be different from both guesses and goatdoors

    Examples
    --------
    >>> print(switch_guess(np.array([0, 1, 2]), np.array([1, 2, 1])))
    [2 0 0]
    """  
    nsims = len(goatdoors)
    choice = np.zeros(nsims, dtype=int)
    for i in range(nsims):
        choice[i] = np.random.choice(np.setdiff1d([0,1,2],[guesses[i],goatdoors[i]]))
    
    return choice
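
Because the goat door always differs from the guess, exactly one door remains after the reveal, so switching involves no randomness. A minimal loop-free sketch follows. The name `switch_guess_vec` is hypothetical, and the function assumes every goat door differs from the corresponding guess.

```python
import numpy as np

def switch_guess_vec(guesses, goatdoors):
    """Loop-free sketch of switch_guess (hypothetical name)."""
    # Doors 0, 1 and 2 sum to 3, so the remaining door is 3 minus the other two.
    # Assumes guesses[i] != goatdoors[i] for every i.
    return 3 - np.asarray(guesses) - np.asarray(goatdoors)

print(switch_guess_vec(np.array([0, 1, 2]), np.array([1, 2, 1])))  # [2 0 0]
```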

In [12]:
final_choice = switch_guess(guesses,goats)

print ("Prizes:\t\t", prizes)
print ("Guesses:\t", guesses)
print ("Goats:\t\t", goats)
print ("Final choice:\t", final_choice)

Prizes:		 [2 1 1 1 2 1 1 2 2 2]
Guesses:	 [1 2 1 2 2 0 2 0 2 1]
Goats:		 [0 0 0 0 1 2 0 1 1 0]
Final choice:	 [2 1 2 1 0 1 1 2 0 2]


Last function: write a `win_percentage` function that takes an array of `guesses` and `prizedoors`, and returns the percent of correct guesses.

In [13]:
def win_percentage(guesses, prizedoors):
    """
    Function
    --------
    win_percentage

    Calculate the percent of times that a simulation of guesses is correct

    Parameters
    -----------
    guesses : array
        Guesses for each simulation
    prizedoors : array
        Location of prize for each simulation

    Returns
    --------
    percentage : number between 0 and 100
        The win percentage

    Examples
    ---------
    >>> print(win_percentage(np.array([0, 1, 2]), np.array([0, 0, 0])))
    33.333
    """
            
    return 100 * (guesses == prizedoors).mean()


In [14]:
print (win_percentage(final_choice, prizes))

70.0


Now, put it together. Simulate 10000 games where the contestant keeps the original guess, and 10000 games where the contestant switches doors after a goat door is revealed. Compute the percentage of time the contestant wins under each strategy. Is one strategy better than the other?

In [15]:

def switch_vs_stay(nsim):
    """
    Function
    --------
    switch_vs_stay

    Calculate the percent of times that the switching strategy wins and the
    percent of times that the stay strategy (don't switch) wins

    Parameters
    -----------
    nsim : int
        Number of simulations


    Returns
    --------
    win_pct : List containing the two win percentages
        The win percentage for switch and for not switch

    Examples
    ---------
    print(switch_vs_stay(1000))
    [66.2, 33.8]
    """    
    # Use the requested number of simulations instead of a hard-coded count
    prizes = simulate_prizedoor(nsim)
    guesses = simulate_guess(nsim)
    goats = goat_door2(prizes,guesses)
    final_choice = switch_guess(guesses,goats)
    
    
    winpct_switch = win_percentage(final_choice,prizes)
    winpct_stay =  win_percentage(guesses, prizes)
    
    return [winpct_switch, winpct_stay]

In [16]:
results = switch_vs_stay(10000)

In [17]:
print (results)

[66.759999999999991, 33.239999999999995]
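
The simulated percentages can be checked against an exact count. The sketch below enumerates the 9 equally likely (prize, guess) pairs, assuming the prize and the guess are independent and uniform over the three doors, as in the simulation functions above. Staying wins exactly when the first guess is right. Switching wins exactly when the first guess is wrong, because the host is then forced to open the only other goat door.

```python
from fractions import Fraction

switch_wins = Fraction(0)
stay_wins = Fraction(0)

for prize in range(3):
    for guess in range(3):
        p_case = Fraction(1, 9)  # each (prize, guess) pair is equally likely
        if guess == prize:
            # First guess was right: staying wins, switching lands on a goat
            stay_wins += p_case
        else:
            # First guess was wrong: the one unopened door left holds the prize
            switch_wins += p_case

print(switch_wins, stay_wins)  # 2/3 1/3
```

Switching wins two thirds of the time and staying wins one third, in line with the simulated 66.76 and 33.24.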
