## Simulation Exercises

Using the repo setup directions, set up a new local and remote repository named `statistics-exercises`. The local version of your repo should live inside `~/codeup-data-science`.

Do your work for this exercise in either a Python file named `simulation.py` or a Jupyter notebook named `simulation.ipynb`.

## Generating Random Numbers with Numpy

The `numpy.random` module provides a number of functions for generating random numbers.

- `np.random.choice`: selects random options from a list
- `np.random.random`: generates floats in the half-open interval [0, 1)
- `np.random.uniform`: generates numbers between a given lower and upper bound
- `np.random.randn`: generates numbers from the standard normal distribution
- `np.random.normal`: generates numbers from a normal distribution with a specified mean and standard deviation
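
As a quick illustration of the functions listed above, the following sketch draws five values with each one. The seed, option labels, bounds, mean, and standard deviation are arbitrary values chosen for illustration and are not part of the exercises.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, only so the sketch is repeatable

# Pick 5 values at random from the given options
choice_draws = np.random.choice(["heads", "tails"], size=5)

# 5 floats in [0, 1)
unit_draws = np.random.random(size=5)

# 5 floats in [10, 20)
uniform_draws = np.random.uniform(low=10, high=20, size=5)

# 5 draws from the standard normal distribution (mean 0, standard deviation 1)
std_normal_draws = np.random.randn(5)

# 5 draws from a normal distribution with mean 100 and standard deviation 15
normal_draws = np.random.normal(loc=100, scale=15, size=5)
```

Passing a tuple such as `size=(n_trials, n_dice)` instead of a single integer returns a 2-D array, which is the pattern the exercises below rely on.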

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'retina'

import viz # curriculum example visualizations (must live in same dir)

np.random.seed(123)

1. How likely is it that you roll doubles when rolling two dice?

In [2]:
n_trials = nrows = 10_000
n_dice = ncols = 1

roll1 = np.random.choice([1, 2, 3, 4, 5, 6], size = (n_trials, n_dice))
roll2 = np.random.choice([1, 2, 3, 4, 5, 6], size = (n_trials, n_dice))
roll = (roll1, roll2)
roll

(array([[6],
        [3],
        [5],
        ...,
        [2],
        [1],
        [1]]),
 array([[4],
        [5],
        [4],
        ...,
        [1],
        [5],
        [5]]))

In [3]:
doubles = (roll1 == roll2)
doubles

array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])

In [4]:
sums_by_trial = doubles.sum(axis=1)
sums_by_trial

array([0, 0, 0, ..., 0, 0, 0])

In [5]:
win_rate = doubles.mean()
win_rate

0.1677
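
As a sanity check on the simulated rate, the exact probability can be found by enumerating every outcome of two dice. This is a minimal sketch that assumes fair, independent six-sided dice; the variable names are illustrative.

```python
from itertools import product

# All 36 ordered outcomes of rolling two fair six-sided dice
outcomes = list(product(range(1, 7), repeat=2))

# Doubles are the outcomes where both dice show the same face
n_doubles = sum(first == second for first, second in outcomes)

exact_doubles_rate = n_doubles / len(outcomes)
print(exact_doubles_rate)
```

Six of the 36 outcomes are doubles, so the exact value is 1/6 (about 0.1667), which is close to the simulated 0.1677.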

2. If you flip 8 coins, what is the probability of getting exactly 3 heads? What is the probability of getting more than 3 heads?

In [9]:
n_flips = nrows = 10_000
n_coins = ncols = 8

toss = np.random.choice([True, False], size = (n_flips, n_coins))
toss

array([[False, False, False, ...,  True,  True,  True],
       [ True, False,  True, ...,  True, False, False],
       [ True,  True, False, ...,  True, False, False],
       ...,
       [ True,  True,  True, ..., False,  True, False],
       [ True,  True,  True, ...,  True,  True,  True],
       [False, False,  True, ..., False, False, False]])

In [10]:
toss.sum(axis=1)

array([4, 4, 4, ..., 4, 8, 2])

In [13]:
(toss.sum(axis=1) == 3).sum()

2211

In [15]:
print(f'Probability of getting exactly 3 Heads: {(toss.sum(axis=1) == 3).sum() / n_flips}')

Probability of getting exactly 3 Heads: 0.2211


In [16]:
print(f'Probability of getting more than 3 Heads: {(toss.sum(axis=1) > 3).sum() / n_flips}')

Probability of getting more than 3 Heads: 0.64
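
The simulated values can be compared against the exact binomial probabilities. The sketch below assumes fair, independent coins and uses `math.comb`, which requires Python 3.8 or later; the variable names are illustrative.

```python
from math import comb

n_coins = 8
n_outcomes = 2 ** n_coins  # 256 equally likely head/tail sequences

# Exactly 3 heads: count the ways to choose which 3 of the 8 coins land heads
p_exactly_3 = comb(n_coins, 3) / n_outcomes

# More than 3 heads: 4, 5, 6, 7, or 8 heads
p_more_than_3 = sum(comb(n_coins, k) for k in range(4, n_coins + 1)) / n_outcomes

print(p_exactly_3, p_more_than_3)
```

The exact values are 56/256 (about 0.2188) and 163/256 (about 0.6367), so the simulated 0.2211 and 0.64 are within ordinary sampling noise for 10,000 trials.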


3. There are approximately 3 web development cohorts for every 1 data science cohort at Codeup. Assuming that Codeup randomly selects an alumnus to put on a billboard, what are the odds that the two billboards I drive past both have data science students on them?
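
One possible way to simulate the billboard question is sketched below. It assumes all cohorts are the same size, so that a randomly chosen alumnus is a data science student with probability 1/4, and that the two billboards are chosen independently. Those assumptions, the trial count, and the variable names are mine, not part of the exercise.

```python
import numpy as np

np.random.seed(123)

n_trials = 100_000
n_billboards = 2
p_data_science = 1 / 4  # assumption: 1 data science cohort out of every 4, equal cohort sizes

# True means the billboard shows a data science student
billboards = np.random.random(size=(n_trials, n_billboards)) < p_data_science

# A trial counts only when both billboards show a data science student
p_both_data_science = billboards.all(axis=1).mean()
print(p_both_data_science)
```

Under these assumptions the exact answer is 0.25 * 0.25 = 0.0625, so the simulated value should land close to that.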

4. Codeup students buy, on average, 3 poptart packages a day, with a standard deviation of 1.5, from the snack vending machine. If on Monday the machine is restocked with 17 poptart packages, how likely is it that I will be able to buy some poptarts on Friday afternoon? (Remember, if you have a mean and standard deviation, you can use `np.random.normal`.) You'll need to make a judgement call on how to handle some of your values.
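
A rough sketch of one way to approach the poptart question follows. It makes three judgement calls that are assumptions on my part: only Monday through Thursday purchases happen before Friday afternoon, negative daily sales are clipped to 0, and sales are rounded to whole packages. Different choices (for example, counting Friday's purchases as well) will give a noticeably different answer.

```python
import numpy as np

np.random.seed(123)

n_trials = 100_000
n_days = 4            # assumption: Monday through Thursday
starting_stock = 17

daily_sales = np.random.normal(loc=3, scale=1.5, size=(n_trials, n_days))

# Judgement calls: sales cannot be negative, and packages are sold whole
daily_sales = np.round(np.clip(daily_sales, 0, None))

# Poptarts remain on Friday afternoon if fewer than 17 packages were sold
packages_sold = daily_sales.sum(axis=1)
p_poptarts_left = (packages_sold < starting_stock).mean()
print(p_poptarts_left)
```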

5. Compare Heights
* Men have an average height of 178 cm and a standard deviation of 8 cm.
* Women have an average height of 170 cm and a standard deviation of 6 cm.
* Since you have means and standard deviations, you can use `np.random.normal` to generate observations.
* If a man and woman are chosen at random, what is the likelihood the woman is taller than the man?
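
The height comparison can be sketched by drawing one man and one woman per trial and comparing them. This assumes heights are normally distributed and that the two draws are independent; the trial count and variable names are illustrative.

```python
import numpy as np

np.random.seed(123)

n_trials = 100_000

# One randomly chosen man and one randomly chosen woman per trial (heights in cm)
men = np.random.normal(loc=178, scale=8, size=n_trials)
women = np.random.normal(loc=170, scale=6, size=n_trials)

# Fraction of trials in which the woman is taller than the man
p_woman_taller = (women > men).mean()
print(p_woman_taller)
```

Under the same assumptions, the difference in heights is normal with mean -8 and standard deviation 10, which gives an exact probability of about 0.21.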

6. When installing Anaconda on a student's computer, there's a 1 in 250 chance that the download is corrupted and the installation fails. What are the odds that after having 50 students download Anaconda, no one has an installation issue? 100 students?
* What is the probability that we observe an installation issue within the first 150 students that download Anaconda?
* How likely is it that 450 students all download Anaconda without an issue?
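
A possible simulation for the installation questions is sketched below. It assumes each download fails independently with probability 1/250; the helper `p_no_issues` and the other names are hypothetical, introduced only for this example.

```python
import numpy as np

np.random.seed(123)

n_trials = 10_000
p_corrupt = 1 / 250


def p_no_issues(n_students):
    """Estimate the chance that none of n_students has a corrupted download."""
    # True marks a corrupted download; one row per simulated group of students
    corrupted = np.random.random(size=(n_trials, n_students)) < p_corrupt
    return (~corrupted.any(axis=1)).mean()


p_clean_50 = p_no_issues(50)
p_clean_100 = p_no_issues(100)
p_issue_150 = 1 - p_no_issues(150)  # at least one issue among 150 students
p_clean_450 = p_no_issues(450)
print(p_clean_50, p_clean_100, p_issue_150, p_clean_450)
```

Each estimate can be checked against the closed form `(249 / 250) ** n_students` for the probability that no download is corrupted.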

7. There's a 70% chance on any given day that there will be at least one food truck at Travis Park. However, you haven't seen a food truck there in 3 days. How unlikely is this?
* How likely is it that a food truck will show up sometime this week?
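
The food truck questions can be sketched the same way. This assumes each day is independent of the others and interprets "this week" as a full 7 days; both are assumptions, as are the variable names.

```python
import numpy as np

np.random.seed(123)

n_trials = 100_000
p_truck = 0.7

# True means at least one food truck showed up that day
three_days = np.random.random(size=(n_trials, 3)) < p_truck
p_no_truck_3_days = (~three_days.any(axis=1)).mean()

# Assumption: a week is 7 independent days
week = np.random.random(size=(n_trials, 7)) < p_truck
p_truck_this_week = week.any(axis=1).mean()
print(p_no_truck_3_days, p_truck_this_week)
```

For comparison, the exact values under these assumptions are 0.3 ** 3 = 0.027 for three truck-free days and 1 - 0.3 ** 7 (about 0.9998) for seeing a truck at some point in the week.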

8. If 23 people are in the same room, what are the odds that at least two of them share a birthday? What if it's 20 people? 40?
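
One way to simulate the birthday question is sketched below. It assumes 365 equally likely birthdays (ignoring leap years) and independent people; the helper `p_shared_birthday` is a hypothetical name used only for this example.

```python
import numpy as np

np.random.seed(123)

n_trials = 10_000


def p_shared_birthday(n_people):
    """Estimate the chance that at least two of n_people share a birthday."""
    # Each row is one simulated room; birthdays are day numbers 0 through 364
    birthdays = np.random.randint(0, 365, size=(n_trials, n_people))

    # After sorting each row, a shared birthday shows up as two equal neighbors
    sorted_days = np.sort(birthdays, axis=1)
    has_match = (np.diff(sorted_days, axis=1) == 0).any(axis=1)
    return has_match.mean()


p_23 = p_shared_birthday(23)
p_20 = p_shared_birthday(20)
p_40 = p_shared_birthday(40)
print(p_23, p_20, p_40)
```

Sorting each row keeps the check vectorized, which avoids looping over rooms in Python. Under these assumptions the exact probabilities are about 0.507, 0.411, and 0.891 for 23, 20, and 40 people.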