# Random Data and Sampling 

In this final section, we will talk about random data and sampling. A lot of fun can be had with random data, especially when creating Monte Carlo simulations and games. 

We will learn some basic random operations and then talk about probability distributions and simulations. 

## Why Use Random Data? 

Randomized data comes up a lot in data science and machine learning. In stochastic gradient descent, we train a machine learning model iteratively, randomly sampling one or more data points from the dataset on each iteration. This is done because processing the entire dataset on every iteration is computationally expensive. We also see randomness used in other optimization algorithms, like hill climbing and simulated annealing. 
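
As a rough illustration of that sampling step, here is a minimal sketch of drawing random mini-batches with NumPy. The toy `data` array, the `batch_size` of 4, and the seed are assumptions made up for this example; no model is actually trained here.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(42)

# Hypothetical toy dataset: 10 data points with 2 features each
data = np.arange(20).reshape(10, 2)
batch_size = 4  # assumed mini-batch size

for step in range(3):
    # Draw distinct row indices at random, then select those rows
    batch_idx = rng.choice(len(data), size=batch_size, replace=False)
    batch = data[batch_idx]
    print(f"step {step}: sampled rows {batch_idx}")
```

Each iteration touches only `batch_size` rows instead of the full dataset, which is where the savings come from.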

We can also use random data for Monte Carlo simulations, meaning we use random data to model real-life events. 
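
To make the idea concrete before the larger example, here is a small sketch that estimates the probability of rolling a sum of 7 with two dice. The dice scenario, the variable names, and the number of rolls are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
n_rolls = 100_000               # assumed number of simulated rolls

# rng.integers excludes the upper bound, so this yields values 1 through 6
die_1 = rng.integers(1, 7, size=n_rolls)
die_2 = rng.integers(1, 7, size=n_rolls)

# Fraction of simulated rolls where the two dice sum to 7
estimate = np.mean(die_1 + die_2 == 7)
exact = 6 / 36  # 6 of the 36 equally likely outcomes sum to 7

print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

The simulated fraction should land close to the exact value, and it generally gets closer as `n_rolls` grows.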

## Monty Hall Problem with Monte Carlo

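A **Monte Carlo** simulation is a type of model that uses randomized data to understand something in the real world. Let's put a new angle on the classic Monty Hall Problem, where a game show host presents a contestant with three doors. One of the doors has a prize, and the other two have goats. After the contestant chooses a door, the host opens one of the remaining doors to reveal a goat and offers the contestant the chance to switch to the other unopened door. We will simulate many games to see whether staying or switching wins more often. 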



In [95]:
import numpy as np 

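n = 100  # number of simulated games

prize_doors = np.random.choice(3, n)   # door hiding the prize in each game
chosen_doors = np.random.choice(3, n)  # contestant's initial pick in each game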

In [96]:
prize_doors

array([2, 0, 0, 0, 2, 1, 0, 2, 2, 2, 2, 1, 0, 0, 0, 2, 1, 1, 1, 0, 0, 2,
       2, 1, 2, 2, 2, 2, 1, 0, 2, 2, 0, 1, 0, 2, 2, 0, 1, 0, 0, 2, 0, 1,
       2, 0, 1, 1, 2, 2, 1, 1, 1, 0, 1, 1, 0, 0, 1, 2, 1, 2, 2, 0, 2, 1,
       2, 2, 1, 1, 2, 1, 2, 1, 1, 1, 2, 0, 0, 2, 0, 0, 2, 2, 0, 1, 0, 0,
       2, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0])

In [97]:
chosen_doors

array([0, 1, 0, 2, 1, 1, 2, 1, 2, 1, 2, 2, 2, 2, 0, 1, 2, 0, 0, 0, 1, 0,
       2, 2, 2, 1, 0, 2, 1, 2, 0, 0, 2, 1, 1, 0, 2, 1, 2, 2, 2, 1, 2, 1,
       1, 1, 1, 2, 0, 0, 2, 0, 0, 0, 2, 1, 2, 1, 1, 2, 2, 2, 0, 2, 0, 2,
       1, 0, 1, 1, 2, 0, 2, 1, 1, 1, 0, 2, 0, 2, 1, 0, 1, 0, 0, 1, 1, 0,
       2, 0, 2, 0, 2, 1, 0, 1, 0, 1, 2, 1])

In [98]:
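opened_doors = np.zeros(n, dtype=int)  # door the host opens in each game

for i in range(n): 
    doors = np.arange(0,3)
    # the host opens a door that is neither the prize nor the contestant's pick
    opened_doors[i] = np.random.choice(
        doors[(doors != prize_doors[i]) & (doors != chosen_doors[i])]
    )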

opened_doors

array([1, 2, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 2, 2, 2, 2, 1,
       1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 2, 1, 0, 2, 0, 1, 1, 0, 1, 2,
       0, 2, 0, 0, 1, 1, 0, 2, 2, 1, 0, 2, 1, 2, 0, 0, 0, 1, 1, 1, 1, 0,
       0, 1, 0, 0, 1, 2, 1, 2, 0, 0, 1, 1, 1, 1, 2, 1, 0, 1, 1, 0, 2, 1,
       0, 2, 0, 2, 1, 2, 2, 2, 2, 2, 1, 2])

In [99]:
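switch_doors = np.zeros(n, dtype=int)  # door the contestant ends up with after switching

for i in range(n): 
    doors = np.arange(0,3)
    # exactly one door is neither the original pick nor the opened door
    switch_doors[i] = np.random.choice(
        doors[(doors != chosen_doors[i]) & (doors != opened_doors[i])]
    )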

switch_doors

array([2, 0, 2, 0, 2, 2, 0, 2, 1, 2, 1, 1, 0, 0, 2, 2, 1, 1, 1, 1, 0, 2,
       0, 1, 1, 2, 2, 0, 2, 0, 2, 2, 0, 2, 0, 2, 1, 0, 1, 0, 0, 2, 0, 0,
       2, 0, 2, 1, 2, 2, 1, 1, 1, 2, 1, 0, 0, 0, 2, 1, 1, 0, 2, 0, 2, 1,
       2, 2, 2, 2, 0, 1, 0, 0, 2, 2, 2, 0, 2, 0, 0, 2, 2, 2, 2, 2, 0, 2,
       1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0])

In [100]:
stay_wins = np.sum(chosen_doors == prize_doors)    # games won by keeping the first pick
switch_wins = np.sum(switch_doors == prize_doors)  # games won by switching

print(f"STAY WINS: {stay_wins}")
print(f"SWITCH WINS: {switch_wins}")

STAY WINS: 34
SWITCH WINS: 66
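
These counts sit close to the theoretical win probabilities of 1/3 for staying and 2/3 for switching. With only 100 games the counts will vary from run to run, so here is a sketch of a vectorized variant that runs many more games. It relies on a shortcut rather than the step-by-step loops: because the host always reveals a goat, switching wins exactly when the first pick was wrong. The names `n_games`, `prize`, and `chosen`, and the seed, are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
n_games = 100_000               # assumed number of simulated games

prize = rng.integers(0, 3, size=n_games)   # door hiding the prize
chosen = rng.integers(0, 3, size=n_games)  # contestant's initial pick

# Staying wins when the first pick was right; switching wins when it was wrong
stay_rate = np.mean(chosen == prize)
switch_rate = np.mean(chosen != prize)

print(f"STAY WIN RATE: {stay_rate:.3f}")
print(f"SWITCH WIN RATE: {switch_rate:.3f}")
```

Skipping the explicit host and switch steps removes the Python loops, at the cost of baking the game's logic into a single comparison.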
