# Simulations

Business decisions often rely on simulating events and decisions to understand likely outcomes and quantify risk. Can you think of any examples?

In [None]:
# imports
import numpy as np
import random
import matplotlib.pyplot as plt
import scipy
from scipy import stats

%matplotlib inline

# Part 1

We are going to use the `TSLA.csv` file found on QTools.

In [None]:
# Quick exercise:  

# run the starter code below

tsla = np.genfromtxt("TSLA.csv", usecols=5, delimiter=",", skip_header=1)
tsla_delta = np.diff(tsla)


In [None]:
tsla[-5:]

In [None]:
tsla_delta[-5:]

In [None]:
# calculate the mean and standard deviation for the dataset tsla_delta
tmean = tsla_delta.mean()
tmean

1.4291264980237153

In [None]:
tsigma = tsla_delta.std()
tsigma

13.166176040917861

In [None]:
stats.describe(tsla_delta)

DescribeResult(nobs=253, minmax=(-88.11001599999997, 55.64001399999995), mean=1.4291264980237153, variance=174.03608118940954, skewness=-0.8872627772175465, kurtosis=10.14449359960682)

In [None]:
# let's simulate 50 observations using this information
# I am going to refer to this as 1 experiment
days = np.random.normal(tmean, tsigma, 50)
days.shape


(50,)

In [None]:
days.mean()

1.3678659764810412

In [None]:
days.std()

11.970172684005036

In [None]:
# how many days in our simulation were 5 or above
len(days[days>=5])

16

In [None]:
# actual data
len(tsla_delta[tsla_delta >= 5])

69

In [None]:
len(tsla_delta)

253

In [None]:
# calculate the % of days (real) that were 5 or above
len(tsla_delta[tsla_delta >= 5]) / len(tsla_delta)


0.2727272727272727

In [None]:
# what % of days (simulated) are 5 or above
len(days[days>=5]) / len(days)

0.32
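
As a cross-check, the normal model used for the simulation implies a fixed probability of a day at 5 or above, and that probability can be computed directly instead of simulated. The sketch below hard-codes the mean and standard deviation printed earlier so that it runs without `TSLA.csv`; `p_model` is an illustrative name, not part of the original notebook.

```python
from scipy import stats

# values printed above for tsla_delta (hard-coded so this cell runs on its own)
tmean = 1.4291264980237153
tsigma = 13.166176040917861

# P(X >= 5) under Normal(tmean, tsigma); sf is the survival function, 1 - cdf
p_model = stats.norm.sf(5, loc=tmean, scale=tsigma)
print(round(p_model, 3))
```

The model probability comes out near 0.39, noticeably higher than the 0.27 observed in the real data. The large kurtosis reported by `stats.describe` suggests the real daily changes are not well described by a normal distribution, so the simulation should be read as a rough approximation.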

## Exercise:

We are going to expand on the example above.

1. Using the same mean and standard deviation calculated for Tesla, each experiment will draw 100 simulated days
1. Calculate the number of days where the daily change is $7 or more
1. Conduct this experiment a total of 75 times
1. What is the average across all experiments?
1. How does this compare to the observed data in `tsla_delta`?

In [None]:
# experiment = []

# for _ in range(??):
#     sim_days = np.random.normal(??)
#     sim_successful_days = ?
#     experiment.append(sim_successful_days)
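
One possible way to fill in the skeleton is sketched below. It is not the only approach. The mean and standard deviation are hard-coded from the outputs above so the cell runs on its own, and the seed and the name `avg_successful_days` are illustrative assumptions.

```python
import numpy as np

# values printed earlier for tsla_delta
tmean = 1.4291264980237153
tsigma = 13.166176040917861

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

experiment = []
for _ in range(75):                                  # 75 experiments
    sim_days = np.random.normal(tmean, tsigma, 100)  # 100 simulated days each
    sim_successful_days = (sim_days >= 7).sum()      # days with a change of 7 or more
    experiment.append(sim_successful_days)

# average number of "successful" days per 100-day experiment
avg_successful_days = np.mean(experiment)
print(avg_successful_days)
```

Since each experiment has 100 days, the average can be read as a percentage and compared with `100 * (tsla_delta >= 7).mean()` from the observed data.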

    

# Part 2 - Coin Flipping (Anything with a Yes/No Outcome)

In [None]:
# basics
# 1 = success, 0 = failure; p lists the probability of each outcome in the same order
SUCCESS_PROB = .5
np.random.choice([1,0], size=1, p=[SUCCESS_PROB, 1-SUCCESS_PROB])

array([1])

In [None]:
# size=10 is the number of trials.  We will refer to this as 1 run of the experiment:
# outcomes are listed as [1,0] so that 1 (success) is paired with SUCCESS_PROB
flip = np.random.choice([1,0], size=10, p=[SUCCESS_PROB, 1-SUCCESS_PROB])
flip

array([1, 0, 0, 1, 1, 0, 0, 1, 1, 1])

In [None]:
# the sum is the number of successes in the experiment
flip.sum()

6

In [None]:
# something a little more practical:
# Let's say in professional sports (hockey) 10% of the shots on net result in a goal

# assume all shots are of equal quality

# let's say a goalie faces 100 shots.  How many goals (successes) will they give up?

In [None]:
GOAL_P = .1
shots = np.random.choice([1,0], size=100, p=[GOAL_P, 1-GOAL_P ])
#shots[:5]

In [None]:
shots[:20]

array([1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0])

In [None]:
shots.sum()/len(shots)

0.16
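
Because each shot is an independent yes/no trial with the same probability, the number of goals in a batch of shots follows a binomial distribution, so the total can also be drawn in a single call. A minimal sketch, reusing `GOAL_P` from above; the seed, `n_shots` and `many_runs` are illustrative names.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed for repeatability

GOAL_P = .1
n_shots = 100

# one draw = number of goals allowed on n_shots shots
goals = np.random.binomial(n_shots, GOAL_P)

# 1000 repeated runs of the same 100-shot experiment
many_runs = np.random.binomial(n_shots, GOAL_P, size=1000)
print(goals, many_runs.mean())
```

Individual runs vary quite a bit, but the average across many runs should settle close to `n_shots * GOAL_P`, which is 10 goals here.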

## Breakout Room Exercise:

The GM of the Boston Bruins contacted your analytics firm to help understand a shift in strategy they are looking to employ next season.

The team allowed 167 goals last season on 2103 shots.  

Moving forward, the team wants to understand how the following strategy would impact the number of goals allowed:

1.  Shots can be taken from 3 zones: A, B and C.
1.  The team wants to employ a strategy to shift where shots are taken from.  They believe that they can restrict their opponents to the following: Zone A accounts for 15% of the shots, Zone B for 65%, and Zone C for the remaining 20%.
1.  The shot success rates for Zones A, B and C are 5%, 10% and 8% respectively
1.  The Bruins expect to allow 2100 shots next season

Run 100 experiments, where for each experiment, there are 2100 shots using the information provided above.

Across the 100 experiments, how many times would the Bruins reduce the number of goals allowed relative to 167 from the prior season?

Would you recommend the strategy?


In [None]:
# many ways to do this
# break the problem down
# for each shot, we need to pick a zone, and then associate the probability of the shot going in
# each experiment will have 2100 shots
# will run 100 experiments
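
The breakdown in the comments above can be sketched as follows. This is one of many possible approaches: the constant names, the seed, and the use of a uniform draw to decide each goal are assumptions made for illustration.

```python
import numpy as np

np.random.seed(2)  # arbitrary seed for repeatability

ZONE_SHARE = [.15, .65, .20]             # share of shots from zones A, B, C
ZONE_GOAL_P = np.array([.05, .10, .08])  # goal probability per zone
N_SHOTS = 2100
N_EXPERIMENTS = 100
LAST_SEASON_GOALS = 167

goals_allowed = []
for _ in range(N_EXPERIMENTS):
    # pick a zone (0=A, 1=B, 2=C) for every shot
    zones = np.random.choice(3, size=N_SHOTS, p=ZONE_SHARE)
    # a shot is a goal when a uniform draw falls below its zone's goal probability
    is_goal = np.random.random(N_SHOTS) < ZONE_GOAL_P[zones]
    goals_allowed.append(is_goal.sum())

goals_allowed = np.array(goals_allowed)
fewer_than_last_season = (goals_allowed < LAST_SEASON_GOALS).sum()

# expected goals without simulating: blended goal rate times number of shots
expected_goals = N_SHOTS * np.dot(ZONE_SHARE, ZONE_GOAL_P)
print(fewer_than_last_season, goals_allowed.mean(), expected_goals)
```

The blended goal rate is 0.15(0.05) + 0.65(0.10) + 0.20(0.08) = 0.0885, or about 186 goals on 2100 shots, compared with 167 goals on 2103 shots last season (a rate of about 0.079). Most simulated seasons should therefore land above 167.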



# Part 3

Let's say we know the average number of events that occur in a given timeframe.  We can model the number of events in any one timeframe with the Poisson distribution.
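
A useful property to keep in mind is that a Poisson distribution has a single parameter, which is both its mean and its variance. The sketch below checks this empirically; `lam`, `draws` and the seed are illustrative choices.

```python
import numpy as np

np.random.seed(3)  # arbitrary seed for repeatability

lam = 1200  # average number of events per period (e.g. visitors per day)
draws = np.random.poisson(lam, size=10_000)

# both values should be close to lam
print(draws.mean(), draws.var())
```

The standard deviation is therefore about the square root of `lam` (roughly 35 for 1200), which gives a feel for how far a typical day strays from the average.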



In [None]:
# visitors to your website in a given day
# arguments: expected value (lam), then the number of draws (size)


# np.random.poisson(1200)

In [None]:
# let's simulate for a week, and then a year
week = np.random.poisson(1200, 7)
year = np.random.poisson(1200, 365)
print(week)
#print(year)

In [None]:
# let's do a quick plot
# you will dive much deeper into plotting in your next course
plt.hist(year)

In [None]:
# Example

# A cafe owner contacted your analytics firm to help understand capacity

# On average, they sell 1250 bagels a week, all baked in-house
# They will lose money if they do not have enough ingredients to satisfy demand

# Over a 52 week period, how many weeks can they expect to sell 1300 or more bagels?

bagels = np.random.poisson(1250, 52)
plt.hist(bagels)
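
The histogram shows the spread, but the question asks for a count. A minimal sketch of two ways to get it is below: counting in one simulated year, and using the Poisson tail probability directly. The seed and the names `weeks_at_or_above`, `p_week` and `expected_weeks` are illustrative.

```python
import numpy as np
from scipy import stats

np.random.seed(4)  # arbitrary seed for repeatability

# one simulated year of weekly bagel sales
bagels = np.random.poisson(1250, 52)
weeks_at_or_above = (bagels >= 1300).sum()

# for a discrete distribution, P(X >= 1300) = P(X > 1299), which is sf(1299)
p_week = stats.poisson.sf(1299, 1250)
expected_weeks = 52 * p_week
print(weeks_at_or_above, round(expected_weeks, 1))
```

A single simulated year is noisy, so the count will change from run to run; `expected_weeks` is the long-run average under the Poisson assumption.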

## Breakout Room Exercise

The same cafe owner from above has re-engaged your analytics firm to help estimate employee expenses, which are a function of the number of orders.

1.  The cafe is open 252 days a year
1.  On average, they have 2200 orders a day
1.  The owner has estimated that they need 1 employee for every 300 orders
1.  The daily cost for an employee is $125/day

Simulate this exercise 100 times.  What is the average annual employee cost?
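
One way to approach this is sketched below, under two assumptions that the exercise does not state: daily orders follow a Poisson distribution with a mean of 2200, and staffing is rounded up to whole employees. The constant names and the seed are illustrative.

```python
import numpy as np

np.random.seed(5)  # arbitrary seed for repeatability

DAYS_OPEN = 252
AVG_ORDERS = 2200
ORDERS_PER_EMPLOYEE = 300
COST_PER_EMPLOYEE = 125
N_SIMULATIONS = 100

# rows = simulated years, columns = days in a year (assumed Poisson orders)
orders = np.random.poisson(AVG_ORDERS, size=(N_SIMULATIONS, DAYS_OPEN))

# assumption: staff are whole people, so round up
employees = np.ceil(orders / ORDERS_PER_EMPLOYEE)

# total cost per simulated year, then the average across simulations
annual_cost = (employees * COST_PER_EMPLOYEE).sum(axis=1)
avg_annual_cost = annual_cost.mean()
print(avg_annual_cost)
```

If fractional staffing were allowed instead (no rounding), the answer would be lower, so the rounding rule is worth confirming with the owner.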