To get started, we will recap what we did in your first model, build a correlated version of the same model, and compare the two.
using Pkg
Pkg.add("Distributions")
Pkg.add("StatsBase")
Pkg.add("Statistics")
Pkg.add("Dates")
Pkg.add("MCHammer")
Pkg.add("DataFrames")
Pkg.add("Gadfly")
Pkg.add("TimeSeries")
# Random ships with Julia and is needed for Random.seed!
using DataFrames, MCHammer, Gadfly, Distributions, Statistics, StatsBase, Dates, TimeSeries, Random
n_trials = 10000
Random.seed!(1)
Revenue = rand(TriangularDist(2500000,4000000,3000000), n_trials)
Expenses = rand(TriangularDist(1400000,3000000,2000000), n_trials)
Input_Table = DataFrame(Revenue=Revenue, Expenses=Expenses)
# The Model
Profit = Revenue - Expenses
#Trial Results : the Profit vector (OUTPUT)
Profit
# Trials or Results Table (OUTPUT)
Trials = DataFrame(Revenue = Revenue, Expenses = Expenses, Profit = Profit)
cormat(Trials)
Using the corvar() function, we are going to correlate Revenue and Expenses at 0.8 and generate the results tables for both the correlated and uncorrelated versions.
#Apply correlation to random samples
Rev_Exp_Cor = 0.8
cor_matrix = [1 Rev_Exp_Cor; Rev_Exp_Cor 1]
#Validate input correlation. You can also use cormat() to define the correlation
#matrix from historical data.
cor_matrix
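To build some intuition for what applying a correlation matrix to existing samples involves, here is a minimal sketch of one common approach: reordering independent samples so that their ranks follow correlated normal scores. It is not necessarily how corvar() is implemented. The helper `reorder_to_correlate` is a hypothetical name introduced for illustration, and the uniform inputs are stand-ins for Revenue and Expenses.

```julia
using Random, Statistics, LinearAlgebra

# Hypothetical helper (not part of MCHammer): reorder each column of `samples`
# so that its ranks follow correlated normal scores. Each column keeps exactly
# the same values, so the marginal distributions are unchanged.
function reorder_to_correlate(samples::Matrix{Float64}, target::Matrix{Float64};
                              rng = Random.default_rng())
    n, k = size(samples)
    # Rows of `scores` are normal draws whose covariance is `target`
    scores = randn(rng, n, k) * cholesky(Symmetric(target)).U
    out = similar(samples)
    for j in 1:k
        ranks = invperm(sortperm(scores[:, j]))  # rank of each score in its column
        out[:, j] = sort(samples[:, j])[ranks]   # place sorted sample values by rank
    end
    return out
end

rng = MersenneTwister(1)
x = rand(rng, 10_000, 2)           # two independent uniform inputs (stand-ins)
target = [1.0 0.8; 0.8 1.0]
y = reorder_to_correlate(x, target; rng = rng)

cor(x[:, 1], x[:, 2])              # close to 0 before reordering
cor(y[:, 1], y[:, 2])              # close to, but not exactly, the 0.8 target
```

Because the reordering works on ranks, the achieved Pearson correlation generally lands near the target rather than exactly on it, which is why checking the result with cor() or cormat() is a sensible habit.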
It is very important to join the input trials into a single array or table before applying correlation. This step is also necessary in order to produce a sensitivity_chrt().
using Distributions, StatsBase, DataFrames, MCHammer
Input_Table = DataFrame(Revenue=Revenue, Expenses=Expenses)
Correl_Trials = corvar(Input_Table, n_trials, cor_matrix)
rename!(Correl_Trials, [:Revenue, :Expenses])
#Using the correlated inputs to calculate the correlated profit
Correl_Trials.Profit = Correl_Trials.Revenue - Correl_Trials.Expenses
#Verify results
cormat(Correl_Trials)
Input Correlation:
cor(Revenue,Expenses)
Input Correlation for the Correlated Model:
cor(Correl_Trials.Revenue, Correl_Trials.Expenses)
Make sure to put a line in your project that lists all the outputs you can query with the charting and stats functions.
println("Model Outputs: Trials, Correl_Trials, Profit, Correl_Trials.Profit")
Let us compare the profit distribution of the uncorrelated model with that of the correlated one.
compresults_df = DataFrame(uprofit = Profit, cprofit = Correl_Trials.Profit )
plot(stack(compresults_df), x=:value, color=:variable, Geom.density)
Using GetCertainty(), we can do some simple probability accounting to assess the likelihood of making 1M or less in profit:
GetCertainty(Profit, 1000000, 0)
GetCertainty(Correl_Trials.Profit, 1000000, 0)
By accounting for the correlation, we can see that the probability of making 1M or less in profit (that is, of missing our profit objective) dropped by about 5 percentage points.
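The idea behind this kind of probability accounting can be sketched with the standard library alone: the probability is estimated as the share of trials at or below the threshold. The helper `share_at_or_below` is a hypothetical name and the profit values are made up for illustration. How GetCertainty() computes its result internally is an assumption here.

```julia
using Statistics

# Hypothetical helper (not part of MCHammer): share of trials at or below a threshold
share_at_or_below(x, threshold) = mean(x .<= threshold)

# Made-up illustrative profits, not model output
profits = [0.6e6, 0.9e6, 1.0e6, 1.2e6, 1.5e6]

p = share_at_or_below(profits, 1_000_000)   # 3 of the 5 values qualify, so 0.6
```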
fractiles() allows you to get the percentiles at various increments to be able to compare results along a continuum.
#Uncorrelated
fractiles(Profit)
#Correlated
fractiles(Correl_Trials.Profit)
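For readers who want to see what a table of percentiles at fixed increments looks like without MCHammer, the following sketch uses `quantile` from Julia's Statistics standard library. The 10% increments and the sample data are assumptions chosen for illustration, and the output format of fractiles() may differ.

```julia
using Statistics

# Illustrative data: the numbers 1 through 101
sample_data = collect(1.0:101.0)

# Percentile levels from 0% to 100% in steps of 10% (an assumed increment)
levels = 0.0:0.1:1.0

pct = quantile(sample_data, levels)

# Pair each level with its value so two result sets can be compared row by row
table = hcat(collect(levels), pct)
```

Running the same call on the uncorrelated and correlated profit vectors gives two columns that can be compared level by level.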
Sensitivity of uncorrelated results:
sensitivity_chrt(Trials,3)
Sensitivity of correlated results:
sensitivity_chrt(Correl_Trials,3)
- Accounting for correlation meant a reduction of about 5 percentage points (42.5% vs. 47.7%) in the probability of not making our goals.
- The worst case goes from -290k to 230k, a swing of 520k.
- The critical driver in both cases is expenses.
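The narrowing of the profit distribution can be sanity-checked analytically. The sketch below uses the standard variance formula for a triangular distribution and the identity Var(R - E) = Var(R) + Var(E) - 2·rho·sd(R)·sd(E). It assumes the correlation between Revenue and Expenses behaves like a Pearson correlation of 0.8, which is only approximately true for a simulated model, and the function names are hypothetical.

```julia
# Variance of a triangular distribution with minimum a, maximum b and mode c
tri_var(a, b, c) = (a^2 + b^2 + c^2 - a*b - a*c - b*c) / 18

# Same parameters as the Revenue and Expenses inputs in this model
sd_rev = sqrt(tri_var(2_500_000, 4_000_000, 3_000_000))
sd_exp = sqrt(tri_var(1_400_000, 3_000_000, 2_000_000))

# Standard deviation of Profit = Revenue - Expenses for a given correlation rho
profit_sd(rho) = sqrt(sd_rev^2 + sd_exp^2 - 2 * rho * sd_rev * sd_exp)

sd_uncorrelated = profit_sd(0.0)   # roughly 454,000
sd_correlated   = profit_sd(0.8)   # roughly 204,000
```

A positive correlation between Revenue and Expenses shrinks the spread of Profit, which is consistent with the less severe worst case in the correlated model.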