# Modeling Uncertainty

Regardless of whether the universe is inherently unpredictable or we simply lack the knowledge to model events accurately, the world often appears unpredictable. So far, however, we have only looked at functions that generate completely predictable outcomes. This is a problem because, when modeling real-life events, the best statement we can usually make is that some outcome is highly likely to occur, not that it is guaranteed. Uncertainty needs to be addressed. There are basically three ways to interpret past and future events:

**Determinism** means that the future can be predicted exactly from the past.

**Causal nondeterminism** means that the future cannot be predicted exactly from the past because some things are uncaused. There is true randomness in the universe.

**Predictive nondeterminism** means that the future could in principle be predicted from the past, but we don’t have enough information. There is chaos in the universe, but not true randomness.

For people who want to get things done (as opposed to just debating the nature of the universe), the third option is the wisest outlook to take. The conclusion to take away is that it doesn't matter whether things are predictable in principle: we can't predict them in practice because we don't have enough information. So it's best to treat them as probabilistic.

## 3 Facts of Probability

There is a lot of room for making miscalculations and getting things wrong in the field of probability, but fundamentally the field follows just three simple rules. It's not complicated:

- Probabilities are always in the range 0 to 1 (0% to 100%). Events cannot have a < 0% or > 100% chance of happening.

- If the probability of an event occurring is **p**, the probability of it not occurring is **1-p**.

- When events are independent of each other, the probability of all of the events occurring is equal to the product of the probabilities of each event.
    - Example: if event1 has probability 0.5 and event2 has probability 0.5, the probability of both occurring is 0.5 × 0.5 = 0.25.

People often make the mistake of assuming events are independent when they actually are not. This often leads people to seriously incorrect answers when calculating odds.
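The rules above, and the independence pitfall, can be checked by brute force on a small case. The following is a minimal sketch that enumerates all 36 outcomes of two fair dice; the helper `prob` and the event names are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given as a predicate over (die1, die2)."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def sum_is_twelve(o):
    return o[0] + o[1] == 12

# Rule 2: the complement of an event has probability 1 - p.
p_not_six = 1 - prob(first_is_six)

# Rule 3: the two dice are independent, so the product rule holds.
p_both = prob(lambda o: first_is_six(o) and second_is_six(o))
product_independent = prob(first_is_six) * prob(second_is_six)

# "First die is 6" and "sum is 12" are NOT independent,
# so multiplying their probabilities gives the wrong answer.
p_six_and_twelve = prob(lambda o: first_is_six(o) and sum_is_twelve(o))
naive_product = prob(first_is_six) * prob(sum_is_twelve)

print('P(first is not 6)       =', p_not_six)
print('P(both 6), enumerated   =', p_both)
print('P(both 6), product rule =', product_independent)
print('P(first 6 and sum 12)   =', p_six_and_twelve)
print('Naive product (wrong)   =', naive_product)
```

Using `Fraction` keeps the probabilities exact, which makes it easy to see when the product rule matches the enumerated answer and when it does not.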

### How to Deal with this

**Monte Carlo** and simulations can be good tools for dealing with predictive nondeterminism. You have exact equations for how everything plays out, so simulation works. But you don’t have good information about the initial state, and the output is very sensitive to initial conditions. So you try a bunch of possible initial conditions, equally probable if possible for simplicity, and look at the distribution of simulation results. You don’t say that the future will be drawn from that distribution, you say that the simulation results represent your best guess about what the future will be.
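As a rough illustration of this idea, the sketch below uses the logistic map as a stand-in for "exact equations that are very sensitive to initial conditions". The choice of system, the parameter `r=3.9`, the range of starting values, and all function names are assumptions made for this example, not part of the material above.

```python
import random

def logistic_step(x, r=3.9):
    """One exact, deterministic update of the logistic map."""
    return r * x * (1 - x)

def simulate(x0, steps=50):
    """Run the deterministic model forward from initial state x0."""
    x = x0
    for _ in range(steps):
        x = logistic_step(x)
    return x

def monte_carlo(num_samples=10000, low=0.20, high=0.21, seed=0):
    """Sample equally probable initial states and simulate each one."""
    rng = random.Random(seed)  # local generator, leaves the global seed alone
    return [simulate(rng.uniform(low, high)) for _ in range(num_samples)]

results = monte_carlo()

# Summarize the distribution of final states in five equal-width bins.
bins = [0] * 5
for x in results:
    bins[min(int(x * 5), 4)] += 1

for i, count in enumerate(bins):
    print(f'{i / 5:.1f}-{(i + 1) / 5:.1f}: {count / len(results):.3f}')
```

Even though every starting value lies in a narrow range, the final states spread over much of the interval from 0 to 1. The printed fractions are the "best guess" distribution described above.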

## Stochastic Processes

These are ongoing processes where predicting the next state might depend on both the previous states and some random element.
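A simple random walk is one common example of such a process: each new position depends on the previous position plus a random step. The sketch below is only an illustration; the function name `random_walk` and its parameters are made up for this example.

```python
import random

def random_walk(num_steps, rng):
    """Return the list of positions visited, starting at 0."""
    position = 0
    path = [position]
    for _ in range(num_steps):
        # Next state = previous state + a random element (-1 or +1).
        position += rng.choice([-1, 1])
        path.append(position)
    return path

rng = random.Random(0)  # local generator so the example is repeatable
path = random_walk(10, rng)
print(path)
```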

In [None]:
import random

def rollDieDeterministic():
    """ returns an int between 1 and 6"""
    return 5

def rollDieStochastic():
    """ returns a randomly chosen int
    between 1 and 6""" 
    return random.randrange(1, 7)  # upper bound is exclusive, so this yields 1..6

print(rollDieDeterministic())
print(rollDieStochastic())

In the above code, `rollDieDeterministic` is **deterministic**, not stochastic. This means we know exactly what the output will be: it always returns 5. The docstring merely specifies that an integer between 1 and 6 must be returned, and that's what the function does. In contrast, `rollDieStochastic` is **stochastic**. We know a single die will always land on an integer from 1 through 6. This set of outcomes is fixed; the numbers on a die won't suddenly change values. But in practice, people cannot calculate which number a die will land on from the physics of the hand shake, wind, temperature, drop distance, etc. The possible outcomes are structured, but there is chaos in the selection of each outcome.
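One way to sanity-check a stochastic die function is to tally a large number of rolls and confirm that every face from 1 through 6 shows up with roughly equal frequency. The sketch below defines its own helper, `roll_die`, so that it runs on its own; the name and the number of rolls are arbitrary choices for this example.

```python
import random
from collections import Counter

def roll_die(rng):
    """Return a randomly chosen int between 1 and 6, inclusive."""
    return rng.randint(1, 6)  # randint includes both endpoints

rng = random.Random(0)  # local generator so the example is repeatable
num_rolls = 60000
counts = Counter(roll_die(rng) for _ in range(num_rolls))

# Each face should appear with frequency close to 1/6 (about 0.1667).
for face in range(1, 7):
    print(face, round(counts[face] / num_rolls, 4))
```

If a face never appears, the function's range is probably off by one.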

## More on Stochastic Models

Let's look at two different simulations of dice rolls. The probabilities of dice rolls are known and can be calculated easily, so these simulations are intended to demonstrate how simulations work in general; they are obviously not a very practical use of simulation.

#### Match Multiple Dice Rolls

Let's discuss the following code in `runSim()`, which prints the actual and estimated probability of a series of dice rolls being exactly equal to some target series of rolls.

This function takes 2 parameters:
- `goal`, a user-defined sequence of die rolls, given as a string of digits such as `'11111'`, that we are trying to match by rolling randomly
- `numTrials`, the user-specified number of trials to run, each testing whether a random sequence of rolls matches `goal`

The function then iterates through `numTrials` trials:
 - It generates as many random die rolls as there are digits in `goal`.
 - It checks whether the resulting sequence of rolls is exactly the same as `goal`.
After all trials have run, it prints the actual and estimated probability of matching the goal, rounded to 8 decimal places.

In [38]:
random.seed(0)  # See note below this code cell

def runSim(goal, numTrials):
    total = 0
    for i in range(numTrials):
        result = ''
        for j in range(len(goal)):
            result += str(rollDieStochastic())
        if result == goal:
            total += 1
    print('Actual probability =',
          round(1/(6**len(goal)), 8)) 
    estProbability = round(total/numTrials, 8)
    print('Estimated Probability  =',
          round(estProbability, 8))

**Note:** The `random` module produces pseudorandom numbers: a deterministic algorithm generates a sequence of numbers that merely appears random. By default, the generator is seeded from something that is constantly changing, such as the current system time (commonly measured from January 1, 1970) or a randomness source provided by the operating system. `random.seed()` lets us specify the starting value ourselves. In the cell above, the generator is seeded with 0, so every run that starts from that seed produces the exact same sequence of results. This is essential for debugging stochastic programs.
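The effect of seeding can be seen directly by seeding the generator with the same value twice and comparing the rolls. This is a minimal sketch; the helper `sample_rolls` is made up for this example.

```python
import random

def sample_rolls(n):
    """Return a list of n pseudorandom die rolls from the global generator."""
    return [random.randint(1, 6) for _ in range(n)]

random.seed(0)
first = sample_rolls(5)

random.seed(0)  # reset the generator to the same starting state
second = sample_rolls(5)

print(first)
print(second)
print(first == second)  # True: same seed, same sequence
```

Note that this re-seeds the global generator, so any code that runs afterwards continues from that state.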

In [39]:
# Run the simulation. Adjust goal and numTrials as you see fit.
# runSim('11111', 10000)

#### Box Cars

"Box cars" is a dice term for rolling two sixes. `fracBoxCars` takes a user-supplied parameter `numTests`; in each test it rolls two dice with `rollDieStochastic` and checks whether both rolls equal 6. If so, the roll is counted in `numBoxCars`. Finally, the estimated probability is returned.

In [40]:
def fracBoxCars(numTests):
    numBoxCars = 0.0
    for i in range(numTests):
        if rollDieStochastic() == 6 and rollDieStochastic() == 6:
            numBoxCars += 1
    return numBoxCars/numTests

In [None]:
#Try it yourself
print('Frequency of double 6 =', str(fracBoxCars(100000)*100) + '%')
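Because the two dice are independent, the exact probability of box cars is 1/6 × 1/6 = 1/36, or about 2.78%. The sketch below compares a simulated estimate against that value. It defines its own helper, `estimate_box_cars`, with a local generator so that it runs on its own; both the name and the seed are choices made for this example.

```python
import random

def estimate_box_cars(num_tests, rng):
    """Estimate the probability of rolling two sixes by simulation."""
    hits = 0
    for _ in range(num_tests):
        first = rng.randint(1, 6)
        second = rng.randint(1, 6)
        if first == 6 and second == 6:
            hits += 1
    return hits / num_tests

exact = 1 / 36  # product rule for two independent dice
estimate = estimate_box_cars(100000, random.Random(0))

print('Exact probability     =', round(exact * 100, 4), '%')
print('Estimated probability =', round(estimate * 100, 4), '%')
print('Absolute difference   =', round(abs(estimate - exact) * 100, 4), '%')
```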

#### Final Thoughts on Dice Roll Simulations

1) It takes a lot of dice rolls to get a good estimate of the frequency of occurrence of unlikely events. In fact, there are ways to identify how many trials would be enough.

2) Do not confuse sample probability with actual probability.

3) As mentioned above, it's not necessary to run these kinds of simulations on dice rolls. The probabilities of these events have a perfectly good closed-form answer. However, there are many examples to come where this is not the case and running simulations will be very useful.
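For point 1, one rough way to judge how many trials are enough is the standard error of a sample proportion, sqrt(p(1-p)/n), which assumes the trials are independent. The sketch below turns that formula around to get the number of trials needed for a target standard error. It is a rule of thumb rather than a full treatment, and the function names are made up for this example.

```python
import math

def standard_error(p, num_trials):
    """Standard error of an estimated probability after num_trials independent trials."""
    return math.sqrt(p * (1 - p) / num_trials)

def trials_needed(p, target_se):
    """Smallest number of trials whose standard error is at most target_se (up to rounding)."""
    return math.ceil(p * (1 - p) / target_se ** 2)

p_box_cars = 1 / 36  # true probability of rolling two sixes

for target in (0.01, 0.001, 0.0001):
    n = trials_needed(p_box_cars, target)
    se = standard_error(p_box_cars, n)
    print(f'target SE {target}: about {n} trials (SE = {se:.6f})')
```

Each tenfold reduction in the target standard error requires roughly a hundredfold increase in the number of trials, which is why unlikely events need so many rolls.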