# Simple Simulations in Python

To write a simulation, we must identify all factors that might influence the outcome of the simulation and write Python code to simulate each of these factors.

## Simulation
Our goal is to store the result of every run of the simulation in a DataFrame. Once the data is in a DataFrame, you can use all the tools and techniques you already know to select a subset of rows, group data, compute descriptive statistics, and more!

Almost all simulations follow a similar pattern. We only need to modify a few select parts of that pattern to solve a variety of different problems.

## Simulation Pattern
Every simulation we will write will follow a six-step pattern:

1. We will create an initially empty Python list called `data` to accumulate each run of our simulation. This will always be `data = []`.
2. We will write a for-loop to run a block of code for each run of our simulation. For a 10,000 run simulation, `for i in range(10000):`.
3. Inside of the for-loop, we will simulate all real-world factors. For a simple simulation of a six-sided die roll, `roll = random.randint(1, 6)` is the only real-world variable.
4. Inside of the for-loop, we will accumulate all real-world factors we simulated in a Python dictionary called `d`.
    - We will always name the key in our dictionary the same as our real-world factor, except the key must have quotes around it.
    - For example, if we have a single real-world variable `roll`, our dictionary `d` is: `d = { "roll": roll }`.
    - If we have two real-world variables `red` and `blue`, our dictionary `d` separates the two entries with a comma: `d = { "red": red, "blue": blue }`.
    - If the real-world variable is `height`, our dictionary `d` is: `d = { "height": height }`.
    - If we have two real-world variables `one` and `two`, our dictionary `d` is: `d = { "one": one, "two": two }`.
    - The value is always the variable itself, written as the variable name without quotes. (The effect of this is that we are creating a column in our DataFrame labeled with the name of our variable.)
5. Inside of the for-loop, we will append our dictionary to our list `data`. This will always be: `data.append(d)`.
6. Finally, outside of the for-loop, we will save our `data` as a DataFrame `df`. This will always be: `df = pd.DataFrame(data)`, which creates a DataFrame out of `data`.
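
As a minimal sketch, the six steps can be applied to the `height` variable mentioned in step 4. The normal distribution, its mean of 170 and standard deviation of 10, the run count of 1,000, and the fixed seed are all assumptions chosen only for illustration; they are not part of the pattern itself.

```python
import random
import pandas as pd

random.seed(0)  # assumption: fixed seed so the sketch is repeatable

data = []                            # Step 1: empty list `data`
for i in range(1000):                # Step 2: for-loop, one pass per run
    height = random.gauss(170, 10)   # Step 3: simulate the factor (assumed distribution)
    d = {"height": height}           # Step 4: the key matches the variable name
    data.append(d)                   # Step 5: append `d` to `data`
df = pd.DataFrame(data)              # Step 6: create the DataFrame (outside of the for-loop)

print(df.shape)  # one row per run, one column per factor
```

Only steps 2, 3, and 4 change from one simulation to the next; steps 1, 5, and 6 stay the same.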

## Simulate Rolling a Die

One of the simplest simulations we can write is rolling a fair, six-sided die.

### Example: Simulating Rolling a Six-sided Die

For the six-sided die example, the full code to roll the die 600 times and save the results is six lines long, plus the two imports:

In [4]:
import random
import pandas as pd

data = []                        # Step 1: empty list `data`
for i in range(600):             # Step 2: for-loop
    roll = random.randint(1, 6)  # Step 3: simulate real-world factors
    d = { "roll": roll }         # Step 4: accumulate factors in dictionary `d`
    data.append(d)               # Step 5: append `d` to `data`
df = pd.DataFrame(data)          # Step 6: create the DataFrame (outside of the for-loop)

In [5]:
df.head()

| | roll |
|---|---|
| 0 | 2 |
| 1 | 2 |
| 2 | 4 |
| 3 | 1 |
| 4 | 2 |
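
Because the results are stored in a DataFrame, the usual DataFrame tools apply directly. The sketch below reruns the 600-roll simulation and then counts how often each face appeared and estimates the probability of rolling a 6. The fixed seed and the names `counts` and `p_six` are assumptions added for illustration.

```python
import random
import pandas as pd

random.seed(42)  # assumption: fixed seed so the sketch is repeatable

data = []
for i in range(600):
    roll = random.randint(1, 6)
    d = {"roll": roll}
    data.append(d)
df = pd.DataFrame(data)

# Number of times each face appeared, ordered by face value
counts = df["roll"].value_counts().sort_index()

# Fraction of rolls that were a 6 (the mean of a True/False column)
p_six = (df["roll"] == 6).mean()

print(counts)
print("estimated P(roll == 6):", p_six)
```

For a fair die the estimate should land near 1/6 (about 0.167), although 600 rolls still leaves noticeable random variation.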


### Example: Simulating Rolling Two Six-sided Dice

If we want to roll two six-sided dice, there are now two real-world factors in every run of the simulation. Let's think of one die as the "white" die (variable `white`) and the other as the "black" die (variable `black`):

In [6]:
# Step 1: empty list `data`:
data = []

# Step 2: for-loop:
for i in range(600):
    # Step 3: simulate all real-world factors:
    black = random.randint(1, 6)
    white = random.randint(1, 6)

    # Step 4: accumulate all factors in dictionary `d`:
    d = { "white": white, "black": black }

    # Step 5: append `d` to `data`:
    data.append(d)

# Step 6: create the DataFrame (outside of the for-loop):
df = pd.DataFrame(data)

In [7]:
df.head()

| | white | black |
|---|---|---|
| 0 | 3 | 4 |
| 1 | 6 | 3 |
| 2 | 6 | 5 |
| 3 | 3 | 6 |
| 4 | 3 | 5 |
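
With both dice stored as columns, a derived quantity such as the sum of the two dice is a single column operation. The sketch below reruns the two-dice simulation, adds a column named `total`, and estimates the probability that the dice sum to 7. The fixed seed, the column name `total`, and the name `p_seven` are assumptions for illustration.

```python
import random
import pandas as pd

random.seed(42)  # assumption: fixed seed so the sketch is repeatable

data = []
for i in range(600):
    black = random.randint(1, 6)
    white = random.randint(1, 6)
    d = {"white": white, "black": black}
    data.append(d)
df = pd.DataFrame(data)

# New column: the sum of the two dice in each run
df["total"] = df["white"] + df["black"]

# Fraction of runs where the two dice sum to 7
p_seven = (df["total"] == 7).mean()
print("estimated P(total == 7):", p_seven)
```

Six of the 36 equally likely outcomes sum to 7, so the estimate should be near 6/36 = 1/6.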


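### Example: Simulating Picking a Card from a Deck of 52 Cards

Using Python, collect 30,000 observations of picking a card from a deck of 52 cards. Each pick is independent of the others, so the card is effectively returned to the deck after every draw.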

In [8]:
# Simulation using numbers:
data = []
for i in range(30000):
    card = random.randint(1, 52)
    d = {'card': card}
    data.append(d)

df = pd.DataFrame(data)
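
A card number from 1 to 52 can be translated into a suit and a rank. The sketch below shows one possible mapping, assuming the cards are numbered suit by suit (1 to 13 are clubs, 14 to 26 are hearts, and so on). This ordering and the helper name `card_to_suit_rank` are hypothetical and chosen only for illustration.

```python
suits = ['club', 'heart', 'diamond', 'spade']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']

def card_to_suit_rank(card):
    """Map a card number 1..52 to a (suit, rank) pair under the assumed ordering."""
    # Shift to 0..51, then split into a suit index (0..3) and a rank index (0..12)
    suit_index, rank_index = divmod(card - 1, 13)
    return suits[suit_index], ranks[rank_index]

print(card_to_suit_rank(1))    # ('club', 'A')
print(card_to_suit_rank(14))   # ('heart', 'A')
print(card_to_suit_rank(52))   # ('spade', 'K')
```

Every number from 1 to 52 maps to a different (suit, rank) pair, so a uniformly random number corresponds to a uniformly random card.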

In [9]:
# Simulation using suit and rank:
data = []
suits = ['club', 'heart', 'diamond', 'spade']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']

for i in range(30000):
    suit = random.choice(suits)
    rank = random.choice(ranks)
    d = {'suit': suit, 'rank': rank}
    data.append(d)

df = pd.DataFrame(data)

In [10]:
df.head()

| | suit | rank |
|---|---|---|
| 0 | heart | 8 |
| 1 | spade | 9 |
| 2 | diamond | 5 |
| 3 | spade | 5 |
| 4 | spade | J |


### Analysis

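1. What is the probability of drawing a heart, and how many hearts do we expect in 30,000 draws?
- $P(\text{heart}) = 13/52 = 1/4$, so we expect $30000 \times (1/4) = 7500$ hearts.

2. What is the probability of drawing a queen, and how many queens do we expect in 30,000 draws?
- $P(\text{queen}) = 4/52 = 1/13$, so we expect $30000 \times (1/13) \approx 2308$ queens.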

In [17]:
print("number of hearts", len(df[df['suit']=='heart']))
print("number of queens", len(df[df['rank']=='Q']))

number of hearts 7406
number of queens 2304
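
Dividing each count by the number of draws turns it into an estimated probability: 7406/30000 ≈ 0.247 for hearts and 2304/30000 ≈ 0.077 for queens, close to the theoretical 1/4 = 0.25 and 1/13 ≈ 0.077. The sketch below computes the same estimates directly from a fresh run of the simulation. The fixed seed and the names `p_heart` and `p_queen` are assumptions for illustration, and the exact values depend on the random draws.

```python
import random
import pandas as pd

random.seed(0)  # assumption: fixed seed so the sketch is repeatable

suits = ['club', 'heart', 'diamond', 'spade']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']

data = []
for i in range(30000):
    suit = random.choice(suits)
    rank = random.choice(ranks)
    d = {'suit': suit, 'rank': rank}
    data.append(d)
df = pd.DataFrame(data)

# The mean of a True/False column is the fraction of rows that are True
p_heart = (df['suit'] == 'heart').mean()
p_queen = (df['rank'] == 'Q').mean()

print("estimated P(heart):", p_heart, "theoretical:", 13 / 52)
print("estimated P(queen):", p_queen, "theoretical:", 4 / 52)
```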

