## Assignment 2

This assignment is based on content discussed in module 2 and works toward a solution to the famous Monty Hall problem. The problem description is retrieved from https://en.wikipedia.org/wiki/Monty_Hall_problem


## Learning outcomes

- Program a simple simulation to solve a concrete statistical problem
- Develop insight into the Bayesian probabilistic viewpoint
- Recognize that statistical intuition can sometimes be wrong


**Question 1**
Given below is the description of the problem.  

Suppose you're on a game show and you're given the choice of three doors. 

Behind one door is a car; behind the others, goats. The car and the goats were placed randomly behind the doors before the show.

The rules are:

After you have chosen a door, the door remains closed for the time being. 
The game show host, Monty Hall, who knows what is behind the doors, now has to open one of the two remaining doors, and the door he opens must have a goat behind it. If both remaining doors have goats behind them, he chooses one randomly. 

After Monty opens a door with a goat, he will ask you to decide whether you want to stay with your first choice or to switch to the last remaining door. 
Imagine that you chose Door 1 and the host opens Door 3, which has a goat. 
He then asks you "Do you want to switch to Door Number 2?" Is it to your advantage to change your choice? 

**NOTES:**
1. The player may initially choose any of the three doors (not just Door 1).
2. The host opens a different door revealing a goat (not necessarily Door 3).
3. The host gives the player a second choice between the two remaining unopened doors. 


![image.png](attachment:image.png)

(Source: https://en.wikipedia.org/wiki/Monty_Hall_problem#/media/File:Monty_open_door.svg )

[Monty Hall problem]

- Write Python code to solve the Monty Hall problem. Simulate at least a thousand games using three doors for each strategy and show the results in such a way as to make it easy to compare the effects of each strategy.


## Solution

To solve the Monty Hall problem, we need to model the game. The model must include the following steps: picking the first door, removing a door with a goat, choosing whether to switch doors, and recording the result.

For the simulation, we focus on two strategies: the player either switches doors or keeps the same door. The reason is simplicity. The door numbers are only labels, and as long as there are three doors, the labels do not affect the probabilities. If we modelled Door 1, Door 2 and Door 3 separately and also tracked whether the player switched, we would have 6 scenarios to consider. By focusing only on whether the player switches, we have just 2 scenarios to simulate. We randomize the door that is picked, how the prizes are placed behind the doors, and whether or not the player switches.
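
The claim that the door labels do not affect the probabilities can be checked exactly, without simulation. The following is a minimal sketch and not part of the original solution: the function name `exact_win_probabilities` and the `DOORS` tuple are hypothetical, and it assumes the rules stated in the question (the car is placed uniformly at random, and the host opens a goat door the player did not pick, choosing at random when two are available).

```python
from fractions import Fraction

DOORS = (1, 2, 3)  # hypothetical labels, used only for this illustration


def exact_win_probabilities():
    """Enumerate every (first pick, car position, opened door) combination.

    Returns a dict mapping each first pick to a (stay, switch) pair of
    exact win probabilities.
    """
    results = {}
    for pick in DOORS:
        stay = Fraction(0)
        switch = Fraction(0)
        for car in DOORS:
            # doors the host is allowed to open: not the pick, not the car
            host_options = [d for d in DOORS if d != pick and d != car]
            for opened in host_options:
                # car position is uniform; host picks uniformly among options
                weight = Fraction(1, 3) * Fraction(1, len(host_options))
                # the one door that is neither picked nor opened
                remaining = next(d for d in DOORS if d not in (pick, opened))
                if pick == car:
                    stay += weight
                if remaining == car:
                    switch += weight
        results[pick] = (stay, switch)
    return results


probabilities = exact_win_probabilities()
for pick, (stay, switch) in probabilities.items():
    print(f"first pick Door {pick}: stay {stay}, switch {switch}")
```

Every first pick gives the same pair of probabilities (1/3 for staying, 2/3 for switching), which is why the simulation only needs to distinguish switching from staying.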

The simulation loop has the following structure:
- list the outcomes and create a DataFrame representing the doors
- randomize the doors so that the prizes are randomly allocated to the doors
- randomly pick a door for a player
- "show" and remove a door that has a goat
- randomize the decision to switch or not
- implement the decision to switch
- record the results
- update counter
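
The steps above can also be sketched in plain Python as a single function that plays one game. This is an illustrative alternative, not the notebook's implementation: the names `play_game` and `rng` are hypothetical, the strategy is passed in as a parameter instead of being randomized inside the loop, and the seed and number of games are arbitrary assumptions.

```python
import random


def play_game(switch, rng):
    """Play one Monty Hall game; return True if the player wins the car."""
    doors = ['Door 1', 'Door 2', 'Door 3']
    car = rng.choice(doors)         # prize is placed at random
    first_pick = rng.choice(doors)  # player picks a door at random
    # host opens a goat door that the player did not pick
    # (chosen at random when both remaining doors hide goats)
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # move to the only door that is neither picked nor opened
        final_pick = next(d for d in doors if d not in (first_pick, opened))
    else:
        final_pick = first_pick
    return final_pick == car


rng = random.Random(0)  # fixed seed, an assumption for repeatability
n_games = 10_000        # arbitrary number of games per strategy
switch_rate = sum(play_game(True, rng) for _ in range(n_games)) / n_games
stay_rate = sum(play_game(False, rng) for _ in range(n_games)) / n_games
print(f"switch win rate: {switch_rate:.3f}")
print(f"stay win rate:   {stay_rate:.3f}")
```

Passing the strategy as an argument gives exactly the same number of games for each strategy, whereas randomizing the decision inside the loop, as the notebook does, gives slightly different counts for each.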

In [1]:
#import
import pandas as pd
import numpy as np

In [2]:
# simulation
# list to keep a record of the outcome of every game
countList = []
# tuple holding the record of the current game
count = ()
# the player's second choice, i.e. the door chosen after switching
playerChoice2 = ''
# list of doors remaining
doorlist = []
# counter for games where the player switched
countYes = 0
# counter for games where the player did not switch
countNo = 0
# loop until both strategies have been played at least 1000 times
while countYes <= 1000 or countNo <= 1000:
    # list outcomes
    outcomes = [('Door 1', 'goat'),('Door 2','goat'),('Door 3','car')]
    # put outcomes into df
    doors = pd.DataFrame(outcomes, columns=['Door Number','Prize'])
    # randomize the doors
    doors['Door Number'] = np.random.choice(doors['Door Number'], 3, replace=False)
    doors['Prize'] = np.random.choice(doors['Prize'], 3, replace=False)
    # player picks a door
    playerChoice = np.random.choice(doors['Door Number'], 1)[0]
    # door that is not player selected and has a goat is removed/shown
    doorRemove = doors[(doors['Prize'] != 'car') & (doors['Door Number'] != playerChoice)]
    doorRemove = np.random.choice(doorRemove['Door Number'], 1)[0]
    # drop the opened door and index the remaining doors by door number
    doors = doors[doors['Door Number'] != doorRemove].set_index('Door Number')
    # randomize whether the player picks to switch or not
    switch = np.random.choice(['yes','no'], 1)[0]
    # switches the door if picked to switch, else leaves it as is
    if switch == 'yes':
        # create a list of the doors remaining
        doorlist = doors.index.values.tolist()
        # remove the player's choice from doors remaining
        doorlist.remove(playerChoice)
        # assign the remaining door to the playerChoice2, to represent the switch
        playerChoice2 = doorlist[0]
        # add a tuple of first choice, whether they switched or not, their final choice and result
        count = (playerChoice, switch, playerChoice2, doors.loc[playerChoice2,'Prize'])
        # add count tuple to list
        countList.append(count)
        # update counter
        countYes += 1
    else:
        # add tuple of choice, switch, final choice and result
        count = (playerChoice, switch, playerChoice, doors.loc[playerChoice,'Prize'])
        # add tuple to list
        countList.append(count)
        # update counter
        countNo += 1

In [5]:
# look at the count list of records of all simulated games
countList

[('Door 2', 'no', 'Door 2', 'goat'),
 ('Door 3', 'yes', 'Door 2', 'car'),
 ('Door 1', 'no', 'Door 1', 'goat'),
 ('Door 2', 'no', 'Door 2', 'car'),
 ('Door 1', 'no', 'Door 1', 'goat'),
 ('Door 3', 'no', 'Door 3', 'goat'),
 ('Door 2', 'yes', 'Door 3', 'car'),
 ('Door 1', 'no', 'Door 1', 'goat'),
 ('Door 1', 'yes', 'Door 3', 'car'),
 ('Door 3', 'no', 'Door 3', 'goat'),
 ('Door 2', 'yes', 'Door 3', 'car'),
 ('Door 3', 'yes', 'Door 1', 'car'),
 ('Door 3', 'no', 'Door 3', 'car'),
 ('Door 2', 'no', 'Door 2', 'goat'),
 ('Door 3', 'no', 'Door 3', 'goat'),
 ('Door 2', 'no', 'Door 2', 'goat'),
 ('Door 3', 'yes', 'Door 1', 'car'),
 ('Door 2', 'no', 'Door 2', 'car'),
 ('Door 1', 'no', 'Door 1', 'goat'),
 ('Door 3', 'no', 'Door 3', 'car'),
 ('Door 3', 'yes', 'Door 1', 'goat'),
 ('Door 3', 'no', 'Door 3', 'car'),
 ('Door 1', 'yes', 'Door 2', 'car'),
 ('Door 3', 'yes', 'Door 2', 'car'),
 ('Door 3', 'no', 'Door 3', 'car'),
 ('Door 2', 'yes', 'Door 1', 'goat'),
 ('Door 2', 'yes', 'Door 3', 'car'),
 ('Do

In [7]:
len(countList)

2046

In [8]:
# create a df for the results of the game
countFrame = pd.DataFrame(countList, columns=['Initial Choice', 'Switch','Final Choice','Prize'])
countFrame.head()

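|   | Initial Choice | Switch | Final Choice | Prize |
|---|----------------|--------|--------------|-------|
| 0 | Door 2 | no | Door 2 | goat |
| 1 | Door 3 | yes | Door 2 | car |
| 2 | Door 1 | no | Door 1 | goat |
| 3 | Door 2 | no | Door 2 | car |
| 4 | Door 1 | no | Door 1 | goat |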


In [9]:
# encode the Prize column numerically,
# where 1 represents a success (winning the car)
# and 0 represents a failure (selecting a goat)
countFrame['Prize'] = countFrame['Prize'].map({'car':1,'goat':0})
countFrame.head()

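|   | Initial Choice | Switch | Final Choice | Prize |
|---|----------------|--------|--------------|-------|
| 0 | Door 2 | no | Door 2 | 0 |
| 1 | Door 3 | yes | Door 2 | 1 |
| 2 | Door 1 | no | Door 1 | 0 |
| 3 | Door 2 | no | Door 2 | 1 |
| 4 | Door 1 | no | Door 1 | 0 |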


In [13]:
# percentage of successes when switched
countFrame[countFrame['Switch'] == 'yes']['Prize'].sum() / len(countFrame[countFrame['Switch'] == 'yes']) * 100

64.63536463536464

In [14]:
# percentage of successes when player did not switch
countFrame[countFrame['Switch'] == 'no']['Prize'].sum() / len(countFrame[countFrame['Switch'] == 'no']) * 100

36.9377990430622
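
The two percentages can also be produced side by side in one table, which makes the strategies easier to compare. The sketch below is a hedged illustration, not the notebook's own cell: it rebuilds a small frame from the first ten records printed in the `countList` output (the names `sample`, `sampleFrame` and `summary` are hypothetical) so that it runs on its own. Ten games are far too few to draw conclusions from; the same `groupby` call applied to the full `countFrame` would give the complete comparison.

```python
import pandas as pd

# first ten records copied from the countList output shown earlier
sample = [
    ('Door 2', 'no', 'Door 2', 'goat'),
    ('Door 3', 'yes', 'Door 2', 'car'),
    ('Door 1', 'no', 'Door 1', 'goat'),
    ('Door 2', 'no', 'Door 2', 'car'),
    ('Door 1', 'no', 'Door 1', 'goat'),
    ('Door 3', 'no', 'Door 3', 'goat'),
    ('Door 2', 'yes', 'Door 3', 'car'),
    ('Door 1', 'no', 'Door 1', 'goat'),
    ('Door 1', 'yes', 'Door 3', 'car'),
    ('Door 3', 'no', 'Door 3', 'goat'),
]
sampleFrame = pd.DataFrame(
    sample, columns=['Initial Choice', 'Switch', 'Final Choice', 'Prize']
)
# same encoding as the notebook: 1 = car, 0 = goat
sampleFrame['Prize'] = sampleFrame['Prize'].map({'car': 1, 'goat': 0})

# one row per strategy: games played, games won, and win percentage
summary = sampleFrame.groupby('Switch')['Prize'].agg(
    games='count', wins='sum', win_pct='mean'
)
summary['win_pct'] = summary['win_pct'] * 100
print(summary)
```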

From the results above, the player won the car in 64.6% of the games where they switched doors and in 36.9% of the games where they did not. This is close to, but not exactly, the theoretical expectation of about 66.7% when switching and 33.3% when staying: the simulation is off by roughly 2 to 4 percentage points, a gap that random variation can produce with only about 1,000 games per strategy. Even so, it demonstrates that the strategy that leads to the more favourable outcome is switching when the option is presented. Switching approximately doubles your chances of winning the car.
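
To put the gap between the simulated and theoretical win rates in context, it can be compared with the standard error of a proportion, sqrt(p(1 - p) / n). The sketch below is an illustration under stated assumptions, not part of the original analysis: `standard_error` is a hypothetical helper, `n_per_strategy = 1000` approximates the number of games per strategy, and the observed rates are the rounded values reported above.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion with true probability p over n trials."""
    return math.sqrt(p * (1 - p) / n)


n_per_strategy = 1000  # assumption: approximate number of games per strategy

# (strategy, theoretical win probability, observed win rate from the notebook)
comparisons = [
    ('switch', 2 / 3, 0.646),
    ('stay', 1 / 3, 0.369),
]

for label, expected, observed in comparisons:
    se = standard_error(expected, n_per_strategy)
    deviation = observed - expected
    print(
        f"{label}: expected {expected:.3f}, observed {observed:.3f}, "
        f"standard error {se:.4f}, deviation {deviation / se:+.2f} SE"
    )
```

With about 1,000 games per strategy the standard error is roughly 1.5 percentage points, so deviations of a few percentage points are not surprising. Simulating more games would shrink the standard error and should bring the observed rates closer to 2/3 and 1/3.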