# Statistical Simulation Using Python

Probability is the study of random phenomena. Random variables fall into two broad categories: discrete and continuous.

Discrete random variables take values from a finite or countably infinite set, while continuous random variables can take any value within a range. The function that gives the probability of a discrete random variable taking each particular value is called the probability mass function (PMF). A continuous random variable takes any single exact value with probability zero, so it is described instead by a probability density function (PDF), whose area over an interval gives the probability of landing in that interval. The cumulative distribution function (CDF) gives the probability that a random variable is less than or equal to a given value, so the probability of falling between two numbers is the difference of the CDF at those two numbers.
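
As a small illustration of these definitions, the sketch below writes out the PMF and CDF of a fair six-sided die and the PDF and CDF of a uniform distribution on [0, 1]. The helper names (`die_pmf`, `die_cdf`, `uniform_pdf`, `uniform_cdf`) are made up for this example, and it uses only the standard library.

```python
from fractions import Fraction

# Discrete example: PMF of a fair six-sided die
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def die_cdf(x):
    """P(X <= x) for the die: sum the PMF over faces not exceeding x."""
    return sum(p for face, p in die_pmf.items() if face <= x)

# Continuous example: uniform distribution on [0, 1]
def uniform_pdf(x):
    """Density is 1 on [0, 1] and 0 elsewhere."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def uniform_cdf(x):
    """P(X <= x) for the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

print(die_pmf[3])                             # P(X = 3)
print(die_cdf(4) - die_cdf(1))                # P(2 <= X <= 4)
print(uniform_cdf(0.75) - uniform_cdf(0.25))  # P(0.25 < X <= 0.75)
```

Note that `uniform_pdf(0.5)` returns a density, not a probability. Probabilities for the continuous case come from differences of the CDF.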

In [2]:
import numpy as np 

In [8]:
# Initialize seed and parameters
np.random.seed(586852)

In [28]:
deck_of_cards = [(i,j) for i in ['Heart','Club','Diamond','Spade'] for j in range(0,13)]
np.random.shuffle(deck_of_cards)
# Print out the top three cards
card_choices_after_shuffle = deck_of_cards[:3]
print(card_choices_after_shuffle)

[('Spade', 2), ('Heart', 11), ('Spade', 10)]


### Simulation Basics

**Steps to perform a simulation:**
1. Define possible outcomes
2. Assign probabilities to each outcome
3. Define relationships between variables
4. Repeat the process a large number of times 
5. Analyze the results
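
The five steps above can be sketched on a toy question: how often do two fair dice sum to 7? This is only an illustrative outline. The seed, the trial count, and the `play_once` helper are arbitrary choices for the example, and it uses the standard library's `random` module rather than NumPy to stay dependency-free.

```python
import random

rng = random.Random(12345)  # arbitrary seed, chosen only for reproducibility

# Steps 1 and 2: possible outcomes and their probabilities
faces = [1, 2, 3, 4, 5, 6]
weights = [1 / 6] * 6

def play_once():
    # Step 3: the relationship between variables is the sum of two rolls
    first, second = rng.choices(faces, weights=weights, k=2)
    return first + second == 7

# Step 4: repeat the process many times
n_trials = 100_000
hits = sum(play_once() for _ in range(n_trials))

# Step 5: analyze the results
estimate = hits / n_trials
print(f"Estimated P(sum is 7) = {estimate:.4f} (exact value is 1/6)")
```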


In [171]:
# Below is the code for the dice roll simulation
# If the player rolls the same number on both dice, they win
dice, probabilities, num_of_dice = [1,2,3,4,5,6], [1/6,1/6,1/6,1/6,1/6,1/6], 2
sims, wins = 100, 0
for i in range(sims):
    dice_rolls = np.random.choice(dice, size=num_of_dice, p=probabilities)
    if dice_rolls[0] == dice_rolls[1]:
        wins += 1

print(f'Out of {sims} simulations, the number of times the dice rolled the same number is {wins}')

Out of 100 simulations, the number of times the dice rolled the same number is 16
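
As a sanity check on the simulated count, the exact probability of rolling doubles can be found by enumerating all 36 equally likely outcomes. The sketch below also computes the binomial standard deviation of the win count for 100 games. The variable names are illustrative.

```python
from itertools import product
from math import sqrt

faces = range(1, 7)
pairs = list(product(faces, repeat=2))  # all 36 equally likely rolls
p_doubles = sum(a == b for a, b in pairs) / len(pairs)

sims = 100
expected_wins = sims * p_doubles
# Standard deviation of a binomial count with `sims` trials
std_dev_wins = sqrt(sims * p_doubles * (1 - p_doubles))

print(f"P(doubles) = {p_doubles:.4f}")
print(f"Expected wins in {sims} games = {expected_wins:.1f} +/- {std_dev_wins:.1f}")
```

The expected count is about 16.7 with a standard deviation of roughly 3.7, so a result of 16 wins is well within the normal run-to-run variation.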


Simulations are useful for understanding how a system behaves, including how its behavior changes under different conditions. Another example is a lottery. Whether a single ticket wins is random, so a simulation cannot predict it, but it can estimate the average payoff of buying a ticket.

In [239]:
# Initialize size and simulate outcome
lottery_ticket_cost, num_tickets, grand_prize = 10, 1000, 10000
chance_of_winning = 1/num_tickets
size = 2000
payoffs = [-lottery_ticket_cost, grand_prize-lottery_ticket_cost]
probs = [1-chance_of_winning, chance_of_winning]
outcomes = np.random.choice(a=payoffs, size=size, p=probs, replace=True)
# Mean of outcomes.
answer = np.mean(outcomes)
print("Average payoff from {} simulations = {}".format(size, answer))

Average payoff from 2000 simulations = 0.0
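
For comparison, the expected payoff can be computed directly from the two outcomes and their probabilities. This sketch reuses the parameter values from the cell above and uses `Fraction` so that the arithmetic is exact.

```python
from fractions import Fraction

lottery_ticket_cost, num_tickets, grand_prize = 10, 1000, 10000
chance_of_winning = Fraction(1, num_tickets)

# Expected value = sum over outcomes of (probability * payoff)
expected_payoff = (chance_of_winning * (grand_prize - lottery_ticket_cost)
                   + (1 - chance_of_winning) * (-lottery_ticket_cost))

print(f"Exact expected payoff = {expected_payoff}")
```

With these parameters the game is exactly fair. A simulated average of exactly 0.0 over 2000 draws corresponds to exactly two winning tickets, so other seeds would generally give a nonzero average.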


In [310]:
# Initialize simulations and cost of ticket
sims, lottery_ticket_cost = 3000, 0

# Use a while loop to increment `lottery_ticket_cost` until the average outcome falls below zero
while True:
    outcomes = np.random.choice([-lottery_ticket_cost, grand_prize-lottery_ticket_cost],
                 size=sims, p=[1-chance_of_winning, chance_of_winning], replace=True)
    if outcomes.mean() < 0:
        break
    else:
        lottery_ticket_cost += 1
answer = lottery_ticket_cost - 1

print("The highest price at which it makes sense to buy the ticket is {}".format(answer))

The highest price at which it makes sense to buy the ticket is 3
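
It can help to compare this estimate with the analytical break-even price, which is the ticket cost at which the expected payoff is zero. The sketch below also computes the standard error of the simulated mean for 3000 draws. It assumes the same `num_tickets` and `grand_prize` as above, and the other names are illustrative.

```python
from math import sqrt

num_tickets, grand_prize, sims = 1000, 10000, 3000
p = 1 / num_tickets

# Expected payoff is p * grand_prize - cost, which is zero at this cost
break_even_price = p * grand_prize

# payoff = grand_prize * (win indicator) - cost, so the cost does not affect the spread
payoff_std = grand_prize * sqrt(p * (1 - p))
std_error_of_mean = payoff_std / sqrt(sims)

print(f"Break-even ticket price = {break_even_price:.2f}")
print(f"Standard error of the mean over {sims} draws = {std_error_of_mean:.2f}")
```

The break-even price is 10, while the standard error of the simulated mean is about 5.8. Because the loop stops the first time a single batch of draws has a negative mean, noise of that size can plausibly end it well before the price reaches 10, which would explain an answer as low as 3. Increasing `sims` should narrow the gap.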


Let's work on some examples of probability and simulation.

In [322]:
# Shuffle deck & count card occurrences in the hand
n_sims, two_kind = 10000, 0
for i in range(n_sims):
    np.random.shuffle(deck_of_cards)
    hand, cards_in_hand = deck_of_cards[0:5], {}
    for suit, numeric_value in hand:
        # Count occurrences of each numeric value
        cards_in_hand[numeric_value] = cards_in_hand.get(numeric_value, 0) + 1
    
    # Condition for getting at least 2 of a kind
    if max(cards_in_hand.values()) >=2: 
        two_kind += 1

print("Probability of seeing at least two of a kind = {} ".format(two_kind/n_sims))

Probability of seeing at least two of a kind = 0.496 
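
The simulated value can be checked against a counting argument: a five-card hand has no repeated rank when it uses five distinct ranks (out of 13) with any of the four suits for each card. The sketch below applies that complement rule, and the variable names are illustrative.

```python
from math import comb

total_hands = comb(52, 5)             # all five-card hands
all_distinct = comb(13, 5) * 4 ** 5   # five different ranks, any suit for each
p_at_least_two_kind = 1 - all_distinct / total_hands

print(f"Exact P(at least two of a kind) = {p_at_least_two_kind:.4f}")
```

The exact value is about 0.493, which is close to the simulated estimate.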


In [323]:
# Pre-set constant variables
deck, sims, coincidences = np.arange(1, 14), 10000, 0

for _ in range(sims):
    # Draw all the cards without replacement to simulate one game
    draw = np.random.choice(deck, size=13, replace=False) 
    # Check if any card lands in the position matching its value
    coincidence = (draw == deck).any()
    if coincidence:
        coincidences += 1

# Calculate probability of winning (a game with no coincidences)
prob_of_winning = 1-coincidences/sims
print("Probability of winning = {}".format(prob_of_winning))

Probability of winning = 0.35750000000000004
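
A game with no coincidences is a permutation in which no card sits in its own position, known as a derangement. The sketch below counts derangements of 13 cards with the standard recurrence D(n) = (n - 1)(D(n - 1) + D(n - 2)) and compares the resulting probability with 1/e. The function name `count_derangements` is made up for this example.

```python
from math import exp, factorial

def count_derangements(n):
    """Number of permutations of n items with no item in its original position."""
    if n == 0:
        return 1
    if n == 1:
        return 0
    prev2, prev1 = 1, 0  # D(0), D(1)
    for k in range(2, n + 1):
        prev2, prev1 = prev1, (k - 1) * (prev1 + prev2)
    return prev1

p_no_coincidence = count_derangements(13) / factorial(13)

print(f"Exact P(no coincidence) = {p_no_coincidence:.4f}")
print(f"1/e                     = {exp(-1):.4f}")
```

Both values are about 0.368, so the simulated estimate is within roughly one percentage point of the exact probability.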
