Most startups fail.
But some markets, ideas, and teams are better than others. After learning about startups in class, I want to estimate some of the associated probabilities.
    
Prior: 9/10 startups fail, 1/10 startups succeed

From a naive perspective, three things need to align for a company to succeed: the right market, the right idea (product), and the right team.
    
Likelihood (rough, unnormalized guesses that are rescaled to sum to 1 in the code below): given a successful startup, what is the probability that it has...

a good market, a good product, and a good team? 0.9

a good market, a good product, and a bad team?  0.2

a good market, a bad product, and a good team?  0.9

a good market, a bad product, and a bad team?   0.8

a bad market, a good product, and a good team?  0.1

a bad market, a good product, and a bad team?   0.05

a bad market, a bad product, and a good team?   0.05

a bad market, a bad product, and a bad team?    0.001
    


In [82]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import normalize as norm

In [83]:
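# Prior over outcomes: [P(success), P(failure)]
prior = np.array([0.1, 0.9])
# Unnormalized guesses for the 8 market/product/team combinations, in the order listed above
initial_guess = np.array([0.9,  0.2, 0.9, 0.8, 0.1, 0.05, 0.05, 0.001])
# Success likelihoods: normalize the guesses so they sum to 1
normalized_likelihoods = initial_guess / initial_guess.sum()
# Failure likelihoods: take the complement of each guess, then normalize
inverse = 1 - initial_guess
inverse_normalized = inverse / inverse.sum()
print(normalized_likelihoods.sum())
print(inverse_normalized)
print(normalized_likelihoods)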

0.9999999999999999
[0.020004 0.160032 0.020004 0.040008 0.180036 0.190038 0.190038 0.19984 ]
[0.2999   0.066644 0.2999   0.266578 0.033322 0.016661 0.016661 0.000333]


In [84]:
np.set_printoptions(precision=6)
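
The cells that follow use `prior_choices` and `env_choices`, which are not defined in the cells shown here. As a minimal sketch, assuming the outcome labels are ordered to match `prior` and the environment labels follow the order of the list above (market, product, team), they could be built as follows. Only the string "Success" is confirmed by the simulation code; "Failure" and the `G`/`B` label format are hypothetical.

```python
from itertools import product

import numpy as np

# Outcome labels, ordered to match prior = [P(success), P(failure)].
# "Success" is the label the simulation compares against; "Failure" is an assumed name.
prior_choices = np.array(["Success", "Failure"])

# Hypothetical environment labels: one letter each for market, product, team
# (G = good, B = bad), in the same order as the eight guesses: GGG, GGB, ..., BBB.
env_choices = np.array(["".join(combo) for combo in product("GB", repeat=3)])

print(env_choices)
```

Any labels would work, as long as there are two outcomes and eight environments in the same order as the probability vectors.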

In [122]:
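def simulate_investor(prior, prior_choices, likelihood, inverse_likelihood, env_choices, N=1000) -> np.ndarray:
    """Simulate startup outcomes and their environments from the
    prior and the two likelihood vectors.
    Args:
        prior - probabilities of each outcome in prior_choices
        prior_choices - outcome labels; the label "Success" selects the success likelihood
        likelihood - probabilities of each environment given success
        inverse_likelihood - probabilities of each environment given failure
        env_choices - labels for the market/product/team combinations
        N - the number of simulation runs
    Returns:
        N by 2 array with the success status and the sampled
        market/product/team combination.
    """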
    print(f"Simulating {N} Start up runs")
    # Sample from Prior
    print(f"Prior: {prior}\nSuccess Likelihoods:{likelihood}\nFailure Likelihood:{inverse_likelihood}")
    
    sim_success_choices = np.random.choice(prior_choices, size=(N), p=prior)

    runs = np.zeros(shape=(N, 2), dtype=object)
    runs[:,0] = sim_success_choices

    pos_sim = np.random.choice(env_choices, size=(N), p=likelihood)
    neg_sim = np.random.choice(env_choices, size=(N), p=inverse_likelihood)
    
    runs[:,1] = np.where(runs[:,0] == "Success", pos_sim, neg_sim)
    return runs
    
sims = simulate_investor(prior, prior_choices, normalized_likelihoods, inverse_normalized, env_choices, 10000)

Simulating 10000 Start up runs
Prior: [0.1 0.9]
Success Likelihoods:[0.2999   0.066644 0.2999   0.266578 0.033322 0.016661 0.016661 0.000333]
Failure Likelihood:[0.020004 0.160032 0.020004 0.040008 0.180036 0.190038 0.190038 0.19984 ]
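
Because `np.random.choice` draws from NumPy's global random state, the simulated table will differ slightly on every run. If reproducible numbers are wanted, one option is a seeded `Generator`. The sketch below shows only the prior-sampling step; the seed value and the "Failure" label are arbitrary assumptions, not taken from the notebook.

```python
import numpy as np

# Seeded generator; the seed value 0 is an arbitrary choice for illustration.
rng = np.random.default_rng(seed=0)

prior = np.array([0.1, 0.9])  # [P(success), P(failure)]

# Draw 10,000 outcomes according to the prior.
outcomes = rng.choice(["Success", "Failure"], size=10_000, p=prior)

# Fraction of simulated startups that succeeded; should be close to 0.1.
success_rate = (outcomes == "Success").mean()
print(success_rate)
```

The same `rng.choice` call could replace each `np.random.choice` call in `simulate_investor` if the generator were passed in as an argument.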


In [125]:
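def get_table(results, prior_choices, env_choices):
    """Return the empirical joint frequency of each (outcome, environment) pair."""
    rv = np.zeros(shape=(len(prior_choices), len(env_choices)))
    for row, success_value in enumerate(prior_choices):
        # Boolean mask of the runs with this outcome, computed once per row
        suc = results[:, 0] == success_value
        for col, env in enumerate(env_choices):
            # Count the runs that have both this outcome and this environment
            rv[row][col] = (suc & (results[:, 1] == env)).sum()
    return rv / results.shape[0]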

In [130]:
simulated_table = get_table(sims, prior_choices, env_choices)
print(simulated_table)

[[0.0296 0.0069 0.0301 0.026  0.0033 0.002  0.0018 0.    ]
 [0.0159 0.1439 0.0162 0.0382 0.1609 0.1717 0.1699 0.1836]]


Analytical Probabilities

In [131]:
analytical = np.zeros(shape=(2, 8))
analytical[0,:] = prior[0] * normalized_likelihoods
analytical[1,:] = prior[1] * inverse_normalized

print(analytical)

[[2.999000e-02 6.664445e-03 2.999000e-02 2.665778e-02 3.332223e-03
  1.666111e-03 1.666111e-03 3.332223e-05]
 [1.800360e-02 1.440288e-01 1.800360e-02 3.600720e-02 1.620324e-01
  1.710342e-01 1.710342e-01 1.798560e-01]]
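
One use of the joint table is to apply Bayes' rule and ask how likely success is once the environment is known. The sketch below rebuilds the joint probabilities from the values above and divides the success row by the column totals. The names `joint`, `marginal`, and `posterior_success` are illustrative and not part of the original notebook.

```python
import numpy as np

prior = np.array([0.1, 0.9])  # [P(success), P(failure)]
initial_guess = np.array([0.9, 0.2, 0.9, 0.8, 0.1, 0.05, 0.05, 0.001])

# Same normalization as earlier in the notebook.
success_lik = initial_guess / initial_guess.sum()
failure_lik = (1 - initial_guess) / (1 - initial_guess).sum()

# Joint probabilities: row 0 is success, row 1 is failure.
joint = np.vstack([prior[0] * success_lik, prior[1] * failure_lik])

# P(environment), summing over both outcomes.
marginal = joint.sum(axis=0)

# Bayes' rule: P(success | environment) = P(success, environment) / P(environment)
posterior_success = joint[0] / marginal
print(posterior_success)
```

Under these guesses, a good market, product, and team raises the chance of success well above the 10% prior, while an all-bad environment pushes it close to zero.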


In [133]:
np.abs(analytical - simulated_table)

array([[3.900033e-04, 2.355548e-04, 1.099967e-04, 6.577807e-04,
        3.222259e-05, 3.338887e-04, 1.338887e-04, 3.332223e-05],
       [2.103601e-03, 1.288058e-04, 1.803601e-03, 2.192799e-03,
        1.132406e-03, 6.657932e-04, 1.134207e-03, 3.744029e-03]])
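
To judge whether differences of this size are plausible sampling noise, one rough check is the binomial standard error of each simulated cell, sqrt(p(1-p)/N), where p is the analytical joint probability and N = 10000 is the number of runs used above. This is a sketch that assumes the runs are independent; `joint` and `std_err` are illustrative names.

```python
import numpy as np

prior = np.array([0.1, 0.9])  # [P(success), P(failure)]
initial_guess = np.array([0.9, 0.2, 0.9, 0.8, 0.1, 0.05, 0.05, 0.001])

# Rebuild the analytical joint table (row 0 = success, row 1 = failure).
success_lik = initial_guess / initial_guess.sum()
failure_lik = (1 - initial_guess) / (1 - initial_guess).sum()
joint = np.vstack([prior[0] * success_lik, prior[1] * failure_lik])

N = 10_000  # number of simulated runs

# Standard error of an estimated proportion p from N independent draws.
std_err = np.sqrt(joint * (1 - joint) / N)
print(std_err)
```

Absolute differences within a couple of standard errors of zero are what independent sampling would typically produce, so the printed array gives a scale against which the differences above can be compared.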