## The Monty Hall Problem

The Monty Hall problem is a puzzle loosely based on the TV game show *Let's Make a Deal*, and its answer strikes many people as counterintuitive. The setup is straightforward enough: Suppose you are a contestant on a game show, and the host, Monty Hall, shows you 3 doors. One of the doors has a prize behind it and the other 2 doors have goats behind them. Monty tells you to pick a door. After you've picked a door, Monty does something gnarly: He opens one of the 2 remaining doors and reveals a goat. Then he asks you if you want to switch to the other unopened door or stick with your original choice. Obviously we are assuming that you would rather have the prize than a goat. The question is: Should you switch or not?

Many people think that it doesn't matter whether you switch or not. They reason that the prize is equally likely (probability $\frac{1}{2}$) to be behind the door you chose and the door that Monty Hall didn't open. In fact, the optimal strategy is to always switch doors. Switching gives you a $\frac{2}{3}$ probability of getting the prize, while not switching gives you a $\frac{1}{3}$ probability. This tends to surprise people. I think one of the reasons for this is that people misunderstand conditional probability, but there also seem to be deeper psychological reasons. In fact, this is a problem that cognitive psychologists like to study.

My purpose here is not to get into the psychology of the Monty Hall problem. Instead, I want to code up the problem in Python and then simulate lots and lots of Monty Hall games (say, 100,000 games). This lets us estimate the probability of winning the prize under each strategy, and we will see that the proportion of wins is very close to $\frac{2}{3}$ when always switching and $\frac{1}{3}$ when never switching.

After running the simulation, I want to prove the solution to the Monty Hall problem using a bit of probability theory.

### 1. The logical explanation

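First, let's give a relatively simple explanation of why always switching is the optimal strategy in the Monty Hall problem. Your initial guess is right with probability $\frac{1}{3}$ and wrong with probability $\frac{2}{3}$. If it is wrong, Monty cannot open your door or the prize door, so he must open the only other goat door, which means the one door left is the prize door. Switching therefore wins exactly when your initial guess was wrong, which happens with probability $\frac{2}{3}$.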
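
To double-check the $\frac{2}{3}$ figure before simulating anything, here is a minimal sketch that enumerates every equally likely combination of prize door and initial guess and adds up the exact win probabilities with fractions. The function name `exact_win_probabilities` is hypothetical (it is not used elsewhere in this notebook), and the sketch assumes that Monty picks uniformly at random between the two goat doors when the contestant's first guess is correct.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)

def exact_win_probabilities():
    '''Return (P(win if staying), P(win if switching)) as exact fractions.
    Hypothetical helper, assuming a uniformly random prize door and guess.'''
    stay = Fraction(0)
    switch = Fraction(0)
    for prize, guess in product(DOORS, repeat=2):
        # Each (prize, guess) pair has probability 1/9.
        p_case = Fraction(1, 9)
        # Monty may only open a door that is neither the prize nor the guess.
        goat_options = [d for d in DOORS if d not in (prize, guess)]
        for goat in goat_options:
            # Assumption: Monty chooses uniformly among his allowed doors.
            p = p_case / len(goat_options)
            # The door the contestant ends up with after switching.
            switched = next(d for d in DOORS if d not in (guess, goat))
            stay += p * (guess == prize)
            switch += p * (switched == prize)
    return stay, switch

stay, switch = exact_win_probabilities()
print(stay, switch)   # prints: 1/3 2/3
```

The simulation that follows estimates the same two numbers by sampling instead of enumerating.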

In [1]:
import numpy as np
import random

In [2]:
def prizedoor(n):
    '''Given a positive integer n, return a random array of length n
    where each entry is 0, 1, or 2, representing the door that hides
    the prize in each of n simulated games.'''
    
    return np.random.randint(3, size = n)

def initial_guess(n):
    '''This function also returns a random array of length n consisting
    of 0s, 1s, and 2s representing n simulations of contestants guessing
    a door randomly.'''
    
    return np.random.randint(3, size = n)

print( prizedoor(5), initial_guess(5) )

[0 1 0 1 0] [2 2 2 0 2]


Goat door

In [3]:
def goatdoor(prizedoors, guesses):
    '''This function simulates opening a "goat door" for each prize door
    and each guess.  The goat door should not be the prize door and should 
    be different from the contestant's guess.'''
    
    goatdoors = []
    for pair in zip(prizedoors, guesses):
        if pair[0] == pair[1]:
            other_doors = [ d for d in [0,1,2] if d != pair[0] ]
            goatdoors.append( random.choice(other_doors) )
        else:
            other_door = [ d for d in [0,1,2] if d not in (pair[0], pair[1]) ]
            goatdoors.append( other_door[0] )
        
    return np.array(goatdoors)
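
The loop above is easy to read but gets slow for very large `n`. As a possible alternative, here is a sketch of a vectorized version. It relies on the fact that the door numbers 0, 1, and 2 sum to 3, so when the guess is wrong the only door Monty can open is `3 - prize - guess`. The name `goatdoor_vectorized` and the optional `rng` parameter are assumptions made for this illustration, not part of the code above.

```python
import numpy as np

def goatdoor_vectorized(prizedoors, guesses, rng=None):
    '''Vectorized sketch of choosing a goat door for each game.
    Hypothetical alternative to the loop-based goatdoor.'''
    rng = np.random.default_rng() if rng is None else rng
    prizedoors = np.asarray(prizedoors)
    guesses = np.asarray(guesses)

    # Wrong guess: Monty is forced to open the one remaining door.
    # Since 0 + 1 + 2 == 3, that door is 3 - prize - guess.
    forced = 3 - prizedoors - guesses

    # Correct guess: Monty picks one of the other two doors at random,
    # i.e. the guess shifted by 1 or 2 (mod 3).
    offsets = rng.integers(1, 3, size=guesses.shape)
    free = (guesses + offsets) % 3

    return np.where(prizedoors == guesses, free, forced)

# Usage: the opened door is never the prize door or the contestant's guess.
rng = np.random.default_rng(0)
prizes = rng.integers(0, 3, size=1000)
picks = rng.integers(0, 3, size=1000)
goats = goatdoor_vectorized(prizes, picks, rng)
print(np.all(goats != prizes) and np.all(goats != picks))
```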

Switch guess

In [4]:
def switch_guess(guesses, goatdoors):
    '''This function implements the strategy of always switching
    a guess after a goat door is opened.'''
    
    switches = []
    for pair in zip(guesses, goatdoors):
        switch_door = [ d for d in [0,1,2] if d not in (pair[0], pair[1]) ]
        switches.append( switch_door[0] )
        
    return np.array(switches)
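
Because the three door numbers sum to 3, the door you switch to can also be computed arithmetically as `3 - guess - goat`. The small sketch below checks this shortcut against the list-comprehension approach for every valid pair. The function name `switch_door` is hypothetical.

```python
def switch_door(guess, goat):
    '''Return the door that is neither the guess nor the opened goat door.
    Assumes guess and goat are two different doors from {0, 1, 2}.'''
    return 3 - guess - goat

# Compare the shortcut with the filtering approach for all valid pairs.
for guess in range(3):
    for goat in range(3):
        if guess != goat:
            expected = [d for d in (0, 1, 2) if d not in (guess, goat)][0]
            assert switch_door(guess, goat) == expected
```

The same expression works elementwise on NumPy arrays, so `3 - guesses - goatdoors` would give the switched doors for all games at once.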
    

Now let's simulate the game both with and without the switching strategy:

In [5]:
def run_monty(n, switch=True):
    '''Simulate n Monty Hall games.  The default value of switch
    represents the contestants always switching their initial guess.
    Set switch=False to simulate the contestants not switching.'''
    
    prizedoors = prizedoor(n)
    guesses = initial_guess(n)
    goatdoors = goatdoor(prizedoors, guesses)
    if switch:
        contestantdoors = switch_guess(guesses, goatdoors)
    else:
        contestantdoors = guesses
        
    # Count the games in which the final door matches the prize door.
    wins = int(np.sum(prizedoors == contestantdoors))
            
    # print the relevant stats
    print('Number of games: ', n)
    print('Number of wins:  ', wins)
    print('Win Percentage:   %.2f%%' % round(100 * wins / n, 2))

run_monty(100000, switch=True)
print()
run_monty(100000, switch=False)

Number of games:  100000
Number of wins:   66601
Win Percentage:   66.60%

Number of games:  100000
Number of wins:   33505
Win Percentage:   33.51%
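
How close is "very close"? One rough way to judge is to compare each observed win proportion with its theoretical value in units of the binomial standard error $\sqrt{p(1-p)/n}$. The sketch below does this for the two runs above. The helper name `binomial_z_score` is hypothetical, and the win counts are simply copied from the printed output.

```python
import math

def binomial_z_score(wins, n, p):
    '''Number of standard errors between the observed win proportion
    wins / n and the theoretical probability p.'''
    standard_error = math.sqrt(p * (1 - p) / n)
    return (wins / n - p) / standard_error

# Win counts taken from the two simulation runs above.
print(binomial_z_score(66601, 100000, 2 / 3))   # switching
print(binomial_z_score(33505, 100000, 1 / 3))   # not switching
```

The two values come out at roughly -0.44 and 1.15, so both estimates lie within about one standard error of $\frac{2}{3}$ and $\frac{1}{3}$, which is what you would expect from random sampling noise alone.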
