# Randomness and Code Review
Elements of Data Science

## Simulation Learning Goals
Simulate a task that depends on probability, such as a die roll, then repeat it many times to obtain the distribution of outcomes and its characteristics (mean, ...).
- Probability
    - np.random.choice()
- Simulation: Sample the distribution
    - Repeat and collect outcomes
    - Iteration: `for i in np.arange(samples)`
- Examine resulting distribution of outcomes
    - Probability distribution
    - Uncertainty
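
The goals above can be sketched with a die roll. This is a minimal illustration and not part of the original notebook: the names `faces`, `num_rolls`, `rolls`, and `face_counts` are hypothetical, and the sample size of 1000 is an arbitrary choice. It uses only NumPy, so it runs without the `datascience` package.

```python
import numpy as np

faces = np.arange(1, 7)          # the six faces of a fair die
num_rolls = 1000                 # hypothetical sample size, chosen for illustration

rolls = np.array([], dtype=int)  # empty collection array
for i in np.arange(num_rolls):   # repeat the random process num_rolls times
    rolls = np.append(rolls, np.random.choice(faces))  # one roll per iteration

# Characteristics of the simulated distribution
sample_mean = np.mean(rolls)
face_counts = np.array([np.count_nonzero(rolls == f) for f in faces])

print("mean of rolls:", sample_mean)
print("count of each face (1 to 6):", face_counts)
```

Because the rolls are random, the mean and the counts change from run to run; with more rolls the mean should settle near 3.5, the average of the six faces.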

In [None]:
import numpy as np
from datascience import *

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

## Coding review

#### Data types
Data types matter because different types have different methods.

In [None]:
x = "Groundhog"
type(x)


In [None]:
x == x.lower().replace('hog','pig')

In [None]:
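# One simulated die roll stored in a table
x = Table().with_columns('data', np.random.choice(np.arange(1,7)))
print(type(x))  # print explicitly: a notebook cell only displays its last expression
x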
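
To make the point about types and methods concrete, here is a small sketch comparing a string with a NumPy array. The variable names `word` and `numbers` are introduced only for this illustration.

```python
import numpy as np

word = "Groundhog"          # a str
numbers = np.arange(1, 7)   # a NumPy array holding 1 through 6

print(type(word), type(numbers))

# Each type brings its own methods
print(word.upper())         # str method: returns an upper-case copy
print(numbers.sum())        # ndarray method: adds up the elements

# A method of one type is generally not available on the other
print(hasattr(word, "sum"), hasattr(numbers, "upper"))
```

Checking `type(...)` first is a quick way to find out which methods you can expect to call on a value.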

#### Iteration
Use a loop to repeat a block of code once for each item in a sequence.

In [None]:
names = []
count = 0
for animal in 'Dogs':   # iterating over a string yields one character at a time
    count += 1
    names.append(animal)
    print(animal)
print(count, names)

#### Conditional

In [None]:
best_animal = 'Groundhog'
for animal in make_array('Horse','Dog', 'Groundhog'):
    print(animal)
    if animal == best_animal:
        print("Best animal found")
    elif animal == 'Horse':
        print("I like horses")
    else:
        print("not my favorite animal")

## Pizza Choice

In [None]:
import time
import numpy as np

In [None]:
pizza_choices = np.array(['cheese','buffalo chicken', 'pepperoni'])

In [None]:
pizzas = np.random.choice(pizza_choices,10)
pizzas

In [None]:
count = 0
for i in pizzas:
    count += 1
    print(count, ":", i, "chosen from", pizza_choices)
    time.sleep(1)

## Coin toss

In [None]:
toss = np.array(['Heads', 'Tails'])
tosses = np.random.choice(toss, 100)
tosses

In [None]:
tosses == 'Tails'

In [None]:
np.count_nonzero(tosses == 'Heads')
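
The comparison `tosses == 'Heads'` produces an array of `True`/`False` values, which can be counted or averaged. The sketch below is self-contained and redefines `toss` and `tosses`; the names `is_heads`, `num_heads`, and `prop_heads` are hypothetical additions for illustration.

```python
import numpy as np

toss = np.array(['Heads', 'Tails'])
tosses = np.random.choice(toss, 100)     # 100 simulated tosses

is_heads = tosses == 'Heads'             # boolean array, one entry per toss
num_heads = np.count_nonzero(is_heads)   # True counts as nonzero
prop_heads = np.mean(is_heads)           # True is treated as 1, False as 0

print("heads:", num_heads, "proportion:", prop_heads)
```

Averaging a boolean array gives the proportion of `True` entries, so `prop_heads` is equal to `num_heads / 100` here.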

### Simulate
- Simulate a set of 100 coin tosses and count the heads
- Repeat the simulation 20,000 times and collect the counts

In [None]:
def one_simulation():
    outcomes = np.random.choice(toss, 100)
    return np.count_nonzero(outcomes == 'Heads')

In [None]:
num_repetitions = 20000   # number of repetitions

heads = make_array() # empty collection array

for i in np.arange(num_repetitions):   # repeat the process num_repetitions times
    new_value = one_simulation()  # simulate one value using the function defined
    heads = np.append(heads, new_value) # augment the collection array with the simulated value

# That's it! The simulation is done.

In [None]:
heads

In [None]:
simulation_results = Table().with_columns(
    'Repetition', np.arange(1, num_repetitions + 1),
    'Number of Heads', heads
)

In [None]:
simulation_results.show(100)

In [None]:
simulation_results.hist('Number of Heads', bins = np.arange(30.5, 69.6, 1))
plt.title('Simulation of 100 Coin Tosses')
plt.savefig('Simcoin.png')
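
The histogram can be complemented with numeric summaries of the simulated counts. The sketch below is an illustration rather than part of the original notebook: it reruns a smaller simulation (2,000 repetitions, an arbitrary choice to keep it fast) using only NumPy, and the names `sim_mean` and `sim_sd` are hypothetical. For comparison, probability theory gives a mean of 50 and a standard deviation of 5 for the number of heads in 100 fair tosses.

```python
import numpy as np

toss = np.array(['Heads', 'Tails'])

def one_simulation():
    """Toss a fair coin 100 times and return the number of heads."""
    outcomes = np.random.choice(toss, 100)
    return np.count_nonzero(outcomes == 'Heads')

num_repetitions = 2000                  # smaller than 20,000, assumed here for speed

heads = np.array([], dtype=int)         # empty collection array
for i in np.arange(num_repetitions):
    heads = np.append(heads, one_simulation())

sim_mean = np.mean(heads)               # center of the simulated distribution
sim_sd = np.std(heads)                  # spread, one way to express the uncertainty

print("simulated mean:", sim_mean, "(theory: 50)")
print("simulated SD:  ", sim_sd, "(theory: 5)")
print("smallest and largest counts:", heads.min(), heads.max())
```

The simulated values will differ slightly from run to run, but they should land close to the theoretical ones, and most counts should fall within the range covered by the histogram bins.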