### Randomness
Elements of Data Science Week 7

## Simulation Learning Goals
Simulate a task that depends on probability, such as a die roll, then repeat it to estimate the distribution of outcomes and its characteristics (mean, ...)
- Probability
    - np.random.choice()
- Simulation: Sample the distribution
    - Repeat and collect outcomes
    - Iteration: 
        `for i in np.arange(samples)`
- Examine resulting distribution of outcomes
    - Probability distribution
    - Uncertainty
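
The steps above can be sketched in plain NumPy. This is a minimal illustration rather than the lesson's own code: the die, the value `samples = 1000`, and the seeded generator `rng` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded generator (an assumption, for repeatable results)
faces = np.arange(1, 7)                 # the six faces of a fair die
samples = 1000                          # hypothetical number of repetitions

rolls = np.zeros(samples, dtype=int)    # collection array for the outcomes
for i in np.arange(samples):
    rolls[i] = rng.choice(faces)        # probability: one random draw per repetition

# Examine the resulting distribution of outcomes
values, counts = np.unique(rolls, return_counts=True)
proportions = counts / samples
print(dict(zip(values, proportions)))   # each proportion should be near 1/6
print('mean:', rolls.mean())            # should be near 3.5
```

With more repetitions the proportions settle closer to 1/6; the spread that remains at a given sample size is the uncertainty the last bullet refers to.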

#### Random distributions play a large role in statistical inference

In [None]:
import numpy as np
from datascience import *

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
# Fix for datascience plots: Python 3.10+ removed the collections.Iterable alias
import collections
import collections.abc
collections.Iterable = collections.abc.Iterable

## Coin toss

In [None]:
toss = np.array(['Heads', 'Tails'])
tosses = np.random.choice(toss, 10)  # 10 tosses, heads and tails equally likely
tosses

In [None]:
tosses != 'Tails'

In [None]:
np.count_nonzero(tosses == 'Heads')/len(tosses)

### Simulate
- Simulate a set of 100 coin tosses, how many heads?
- Repeat simulation 20,000 times
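
As a rough check on what to expect: the number of heads in 100 fair tosses has mean 100 × 0.5 = 50 and standard deviation √(100 × 0.5 × 0.5) = 5. The sketch below repeats the experiment using NumPy only. The smaller repetition count, the seeded generator `rng`, and the preallocated array are choices made for this illustration, not part of the lesson's code.

```python
import numpy as np

rng = np.random.default_rng(42)          # seeded generator (an assumption, for repeatability)
toss = np.array(['Heads', 'Tails'])
num_tosses = 100
num_repetitions = 2000                   # fewer than 20,000 to keep the sketch fast

heads = np.zeros(num_repetitions, dtype=int)   # preallocated collection array
for i in np.arange(num_repetitions):
    outcomes = rng.choice(toss, num_tosses)            # one set of 100 tosses
    heads[i] = np.count_nonzero(outcomes == 'Heads')   # count the heads in that set

print('simulated mean:', heads.mean())   # should be close to 50
print('simulated SD:  ', heads.std())    # should be close to 5
```

Filling a preallocated array avoids copying the whole array on every `np.append` call, which matters more as the number of repetitions grows.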

In [None]:
def one_simulated_value():
    """Returns the number of heads in 100 simulated coin tosses"""
    outcomes = np.random.choice(toss, 100)
    return np.count_nonzero(outcomes == 'Heads')

In [None]:
num_repetitions = 20000   # number of repetitions

heads = make_array() # empty collection array

for i in np.arange(num_repetitions):   # repeat the process num_repetitions times
    new_value = one_simulated_value()  # simulate one value using the function defined
    heads = np.append(heads, new_value) # augment the collection array with the simulated value

# That's it! The simulation is done.

In [None]:
simulation_results = Table().with_columns(
    'Repetition', np.arange(1, num_repetitions + 1),
    'Number of Heads', heads
)

In [None]:
simulation_results.show(3)

In [None]:
simulation_results.hist('Number of Heads', bins = np.arange(30.5, 69.6, 1))
plt.title('Simulation of 100 Coin Tosses')
plt.savefig('Simcoin.png')

## Die roll betting simulation
Bet a dollar on a single roll of a six-sided die.

Outcomes:

    - 1 or 2: lose a dollar (-$1)
    - 3 or 4: no change ($0)
    - 5 or 6: gain a dollar (+$1)
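
Before simulating, it can help to work out what the bet should average to. Each of the three outcomes covers two of the six faces, so for a fair die each has probability 1/3. The expected gain is then (-1)(1/3) + 0(1/3) + 1(1/3) = 0, with standard deviation √(2/3) ≈ 0.816. A minimal sketch of that arithmetic, where the names `gains` and `probs` are introduced only for this example:

```python
import numpy as np

gains = np.array([-1, 0, 1])             # net gain for each outcome
probs = np.array([1/3, 1/3, 1/3])        # two of six faces per outcome (assumes a fair die)

expected_gain = np.sum(gains * probs)    # long-run average gain per bet
sd_gain = np.sqrt(np.sum((gains - expected_gain) ** 2 * probs))

print('expected gain per bet:', expected_gain)
print('SD of gain per bet:', round(sd_gain, 3))
```

A simulation of many bets should show roughly equal counts of the three outcomes and an average gain near zero.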

In [None]:
def bet_on_one_roll():
    """Returns my net gain on one bet"""
    x = np.random.choice(np.arange(1, 7))  # roll a die once and record the number of spots
    if x <= 2:
        return -1
    elif x <= 4:
        return 0
    elif x <= 6:
        return 1

In [None]:
outcomes = np.array([])

for i in np.arange(600):
    outcome_of_bet = bet_on_one_roll()
    outcomes = np.append(outcomes, outcome_of_bet)
    
print(outcomes[0:10])
len(outcomes)

In [None]:
outcome_table = Table().with_column('Outcome', outcomes)
outcome_table.group('Outcome').barh(0)

## Sampling

In [None]:
# Population of 100 marbles, each purple or green with equal chance
marbles = np.random.choice(['purple', 'green'], 100)
population = Table().with_columns('Color', marbles)
population


In [None]:
population.where('Color','purple').num_rows/population.num_rows

In [None]:
sample = population.sample(10)
sample

In [None]:
sample.where('Color','purple').num_rows

In [None]:
outcomes = np.array([])

for i in np.arange(1000):
    outcome = population.sample(10).where('Color','purple').num_rows/10
    outcomes = np.append(outcomes, outcome)
    
print(outcomes[0:10])
len(outcomes)

In [None]:
outcomes.mean()

In [None]:
outcome_table = Table().with_column('Outcome', outcomes)
outcome_table

In [None]:
outcome_table.hist(0)