This script simulates choosing prize doors on a game show where there is a best, mid, and worst prize. The function generate_door_choices() randomly assigns each door (1, 2, and 3) as the best, mid, or worst prize. It simulates the player randomly making an initial choice, after which one of the non-chosen doors is revealed (i.e., eliminated). It then randomly chooses between the remaining two doors. In other words, sometimes the player switches doors before the final reveal. The simulation is run 10,000 times.

In [1]:
import random as rnd
import pandas as pd


def generate_door_choices():
# setting rank of each door
    doors = [1,2,3]
    best = rnd.randrange(1,4,1)
    doors.remove(best)
    worst = rnd.choice(doors)
    

    if doors.index(worst) == 0:
        mid = doors[1]
    else:
        mid = doors[0]
        
#The ranking_dict keeps track of how prize doors 1,2, and 3 were ranked       
        
    ranking_dict = {
        best : "best",
        mid : "mid",
        worst : "worst"
        }
    
    
# Simulate the first round of choosing a door, then eliminate a door,
# i.e., reveal one of the two non-chosen doors at random
        

    your_first_choice = rnd.randrange(1,4,1)
    
    eliminate = rnd.randrange(1,4,1)
    while eliminate == your_first_choice:
        eliminate = rnd.randrange(1,4,1)

    doors = [1,2,3]
    doors.remove(eliminate)
  
#simulating a random 2nd choice
    your_second_choice = rnd.choice(doors)
    if your_second_choice == your_first_choice:
        switch = "no"
    else:
        switch = "yes"
        
    return [ranking_dict[your_first_choice],ranking_dict[eliminate],ranking_dict[your_second_choice], switch]
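
For comparison, the same round can be written more compactly. This is a minimal sketch, not the notebook's own code: the name `simulate_round`, the `rng` parameter, and the seed are assumptions introduced for illustration. It shuffles the prize labels instead of removing doors one at a time, and picks the eliminated door directly from the two non-chosen doors rather than re-drawing until it differs from the first choice. Both approaches should give the same distribution.

```python
import random


def simulate_round(rng):
    """Hypothetical compact variant of generate_door_choices().

    Returns [first_choice, eliminated, second_choice, switch],
    with prizes given as "best", "mid", or "worst".
    """
    doors = [1, 2, 3]

    # Shuffle the prize labels and pair them with doors 1-3.
    prizes = ["best", "mid", "worst"]
    rng.shuffle(prizes)
    ranking = dict(zip(doors, prizes))

    first = rng.choice(doors)
    # Eliminate one of the two doors the player did not pick, at random.
    eliminated = rng.choice([d for d in doors if d != first])
    # The second pick is uniform over the two doors still closed.
    second = rng.choice([d for d in doors if d != eliminated])

    switch = "yes" if second != first else "no"
    return [ranking[first], ranking[eliminated], ranking[second], switch]


rng = random.Random(0)  # the seed is arbitrary, chosen only for repeatability
sample_rounds = [simulate_round(rng) for _ in range(1000)]
```

Passing in a `random.Random` instance keeps the sketch reproducible without touching the global random state that generate_door_choices() uses.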
    


A data frame is created to hold the simulated results.

In [2]:
# Run the simulation 10,000 times and build the data frame in one step
tenthousand = pd.DataFrame(
    [generate_door_choices() for _ in range(10000)],
    columns=["first_choice", "eliminated", "second_choice", "switch"],
)

switch_10000 = tenthousand[tenthousand["switch"] == "yes"]
remain_10000 = tenthousand[tenthousand["switch"] == "no"]






In [10]:
#show first 10 of the switch subset
switch_10000.head(10)

Unnamed: 0,first_choice,eliminated,second_choice,switch
1,best,mid,worst,yes
5,mid,best,worst,yes
6,mid,worst,best,yes
8,worst,best,mid,yes
9,best,worst,mid,yes
10,worst,best,mid,yes
11,best,mid,worst,yes
13,best,mid,worst,yes
15,mid,worst,best,yes
16,mid,best,worst,yes


In [9]:
#show first 10 of the remain subset
remain_10000.head(10)

Unnamed: 0,first_choice,eliminated,second_choice,switch
0,worst,mid,worst,no
2,best,worst,best,no
3,best,worst,best,no
4,best,mid,best,no
7,worst,mid,worst,no
12,mid,best,mid,no
14,mid,best,mid,no
17,worst,best,worst,no
19,mid,best,mid,no
20,best,worst,best,no


Functions are created to find the proportion of times a switch led to a better prize, a switch led to the best prize given the best prize was not eliminated, and remaining led to the best prize given the best prize was not eliminated.

In [3]:

def switch_was_good(df_switch):
    df_good_switch = len(df_switch.query("second_choice == 'best'")) + len(df_switch.query("first_choice == 'worst' and second_choice == 'mid'"))
    return(df_good_switch/len(df_switch))

def ending_with_best_on_switch(df_switch):
#ONLY COUNTS IF BEST NOT ELIMINATED
    df_best_remains_and_switch = df_switch.query("eliminated != 'best'")
    return(len(df_best_remains_and_switch.query("second_choice == 'best'"))/len(df_best_remains_and_switch))

def ending_with_best_on_remain(df_remain):
#ONLY COUNTS IF BEST NOT ELIMINATED
    df_best_remains_and_remain = df_remain.query("eliminated != 'best'")
    return(len(df_best_remains_and_remain.query("second_choice == 'best'"))/len(df_best_remains_and_remain))


In [4]:
switch_good_val = round(switch_was_good(switch_10000),4)
print( "The switch led to a better prize at a rate of : " , switch_good_val , ", which implies remaining would have been better at a rate of: ", round(1 - switch_good_val,4) )

The switch led to a better prize at a rate of :  0.4985 , which implies remaining would have been better at a rate of:  0.5015


In [5]:
print("switched and ended with best prize given the best prize was not eliminated: ",round( ending_with_best_on_switch(switch_10000), 4))

switched and ended with best prize given the best prize was not eliminated:  0.5002


In [7]:
print("remained with first pick and ended with the best prize given the best prize was not eliminated ", round( ending_with_best_on_remain(remain_10000),4))

remained with first pick and ended with the best prize given the best prize was not eliminated  0.5041
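
To judge whether the three rates above differ meaningfully from 0.5, one can compare each deviation with the standard error of a proportion. The sketch below is a rough check under stated assumptions: the notebook does not print the subset sizes, so the values in `assumed_n` are guesses (about half of the 10,000 rounds are switches, and the best prize survives the elimination in about two thirds of those). The helper name `standard_error` is also introduced here for illustration.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion with true rate p over n trials."""
    return math.sqrt(p * (1 - p) / n)


# Observed rates copied from the printed outputs above.
observed = {
    "switch_was_good": 0.4985,
    "ending_with_best_on_switch": 0.5002,
    "ending_with_best_on_remain": 0.5041,
}

# ASSUMPTION: approximate subset sizes, not taken from the notebook output.
assumed_n = {
    "switch_was_good": 5000,
    "ending_with_best_on_switch": 3333,
    "ending_with_best_on_remain": 3333,
}

z_scores = {}
for name, rate in observed.items():
    se = standard_error(0.5, assumed_n[name])
    # Number of standard errors between the observed rate and 0.5.
    z_scores[name] = (rate - 0.5) / se
    print(f"{name}: rate={rate}, se={se:.4f}, z={z_scores[name]:+.2f}")
```

Under these assumed sizes, every rate lies within one standard error of 0.5, so the small differences between them are consistent with sampling noise.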


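This shows that switching does not increase the probability of making a better choice or of ending with the best prize. Because the eliminated door is picked at random from the two non-chosen doors (it can even be the best prize), the reveal does not favor either remaining door. The second round of choosing is still a coin flip, because all you know is that one of the two remaining doors has a better prize than the other.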

Note: The last two rates (ending_with_best_on_switch and ending_with_best_on_remain) are computed on different subsets, so they are not complements, i.e., the rates do not need to add to 1.
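
The coin-flip conclusion can also be checked exactly, without random sampling, by enumerating every equally likely outcome of one round. This is a sketch under the same rules as the simulation (random prize assignment, random first choice, random elimination among the two non-chosen doors, random second choice). The names `RANK`, `outcomes`, and `rate` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import permutations

RANK = {"best": 0, "mid": 1, "worst": 2}  # lower number means better prize

# Each entry is (first_prize, eliminated_prize, second_prize, switched).
# Every combination below has the same probability: 1/6 * 1/3 * 1/2 * 1/2.
outcomes = []
for prizes in permutations(["best", "mid", "worst"]):
    for first in range(3):
        for eliminated in range(3):
            if eliminated == first:
                continue  # the first choice is never eliminated
            for second in range(3):
                if second == eliminated:
                    continue  # the eliminated door cannot be picked again
                outcomes.append(
                    (prizes[first], prizes[eliminated], prizes[second], second != first)
                )


def rate(event, given):
    """Exact conditional probability of `event` among outcomes matching `given`."""
    pool = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in pool if event(o)), len(pool))


# Same three quantities as the functions defined earlier.
switch_better = rate(lambda o: RANK[o[2]] < RANK[o[0]], lambda o: o[3])
best_on_switch = rate(lambda o: o[2] == "best", lambda o: o[3] and o[1] != "best")
best_on_remain = rate(lambda o: o[2] == "best", lambda o: not o[3] and o[1] != "best")

print(switch_better, best_on_switch, best_on_remain)  # prints: 1/2 1/2 1/2
```

All three exact rates are 1/2, which agrees with the simulated values of 0.4985, 0.5002, and 0.5041.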