# Case Study: Hacker Statistics
This notebook will allow you to apply all the concepts you've learned in this course. You will use hacker statistics to calculate your chances of winning a bet. Use random number generators, loops, and Matplotlib to gain a competitive edge!

# Random Numbers

In [None]:
import numpy as np

#### Random float
Randomness has many uses in science, art, statistics, cryptography, gaming, gambling, and other fields. You're going to use randomness to simulate a game.

All the functionality you need is contained in the `random` package, a sub-package of `numpy`. In this exercise, you'll be using two functions from this package:
- `seed()`: sets the random seed, so that your results are reproducible between simulations. As an argument, it takes an integer of your choosing. If you call the function, no output will be generated.
- `rand()`: if you don't specify any arguments, it generates a random float between zero and one.
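
To see what "reproducible" means here, a minimal sketch: seeding twice with the same integer should give the same sequence of floats. The variable names `first_run` and `second_run` are made up for this illustration.

```python
import numpy as np

# Seed, then draw two floats.
np.random.seed(123)
first_run = [np.random.rand(), np.random.rand()]

# Re-seed with the same integer and draw again.
np.random.seed(123)
second_run = [np.random.rand(), np.random.rand()]

# Same seed, same sequence of random floats.
print(first_run == second_run)  # True
```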

In [None]:
# Use seed() to set the seed
np.random.seed(123)

# Generate your first random float with rand(), and print it out.
np.random.rand()

#### Roll the dice
In the previous exercise, you used `rand()`, which generates a random float between `0` and `1`.

You can just as well use `randint()`, also a function of the random package, to generate integers randomly. The following call generates the integer 4, 5, 6 or 7 randomly. **8 is not included**:

In [None]:
np.random.randint(4, 8)

- Use `randint()` with the appropriate arguments to randomly generate the integer `1`, `2`, `3`, `4`, `5` or `6`. This simulates a dice throw. Print it out.
- Repeat the call to see if the second throw is different.
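
One way to convince yourself of the bounds is to draw many throws at once and look at the smallest and largest value. This is only a sketch: it relies on the optional `size` argument of `randint()`, and the name `throws` is chosen for illustration.

```python
import numpy as np

np.random.seed(123)

# Draw 1000 throws in one call using the size argument of randint().
throws = np.random.randint(1, 7, size=1000)

# The lower bound (1) can occur; the upper bound (7) never does.
print(throws.min(), throws.max())
```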

In [None]:
# Use randint() to simulate a dice
print(f'Dice one: {np.random.randint(1, 7)}')

# Use randint() again
print(f'Dice two: {np.random.randint(1, 7)}')

#### Determine your next move
In the Empire State Building bet, your next move depends on the number you throw with the dice. This maps directly onto an `if-elif-else` construct!

Assume that you're currently at step 50.
- If dice is 1 or 2, you go one step down.
- If dice is 3, 4 or 5, you go one step up.
- Else, throw the dice again. The number you throw is the number of steps you go up.

Can you write the script?

In [None]:
# Use seed() to set the seed
np.random.seed(123)

# Starting step
step = 50

# Roll the dice
dice = np.random.randint(1,7)

# If dice is 1 or 2, you go one step down.
if dice <= 2 :
    step = step - 1
# If dice is 3, 4 or 5, you go one step up.
elif dice <= 5 :
    step = step + 1
# Else, you throw the dice again. The number you throw is the number of steps you go up.
else :
    step = step + np.random.randint(1,7)

# Print out dice and step
print(f'Dice Roll: {dice} \nStep: {step}')

Given the value of `dice`, was `step` updated correctly?
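
If you want to check every branch without depending on the seed, one option is to wrap the rule in a small function and feed it each possible throw. This is a sketch only: `next_step` and its `bonus` parameter (the second throw, used when the first is a 6) are hypothetical names, not part of the exercise.

```python
def next_step(step, dice, bonus):
    """Return the new step; bonus is the second throw, used only when dice is 6."""
    if dice <= 2:
        return step - 1
    elif dice <= 5:
        return step + 1
    else:
        return step + bonus

# Try every possible first throw from step 50, with a fixed second throw of 4.
for dice in range(1, 7):
    print(dice, next_step(50, dice, bonus=4))
```

Throws of 1 and 2 should give 49, throws of 3 to 5 should give 51, and a 6 should give 54.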

# Random Step
If you use a dice throw to determine your next step, you can call this a random step. What if you throw the dice 100 times in a row? You would have a succession of random steps, or in other words, a random walk.

A random walk is a well-known concept in science. For example, the financial status of a gambler can be modeled as a random walk. To record every step in your random walk, you need to learn how to gradually build a list with a `for` loop.
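
The list-building pattern on its own can be sketched with fixed moves instead of dice throws. The `moves` list below is made-up data for illustration.

```python
# Hypothetical fixed moves, standing in for dice outcomes.
moves = [1, -1, 1, 1]

# Start at 0 and append each new position to the list.
walk = [0]
for move in moves:
    walk.append(walk[-1] + move)

print(walk)  # [0, 1, 0, 1, 2]
```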

#### The next step
You have already written Python code that determines the next step based on the previous step. Now it's time to put this code inside a `for` loop so that we can simulate a random walk.

In [None]:
np.random.seed(123)

# Initialize random_walk
random_walk = [0]

# Simulate the random walk: the for loop should run 100 times
for x in range(100) :

    # Set step: last element in random_walk
    step = random_walk[-1]

    dice = np.random.randint(1,7)

    if dice <= 2:
        step = step - 1
    elif dice <= 5:
        step = step + 1
    else:
        step = step + np.random.randint(1,7)

    # append next step to random_walk
    random_walk.append(step)

print(random_walk)

#### How low can you go?
You have code that calculates your location in the Empire State Building after 100 dice throws. However, there's something we haven't thought about - you can't go below 0!

A typical way to solve problems like this is by using `max()`. If you pass `max()` two arguments, the biggest one gets returned. For example, to make sure that a variable `x` never goes below `10` when you decrease it, you can use: `x = max(10, x - 1)`
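
A short sketch of that clamping pattern, using the same floor of `10` as the example above (the starting value `12` and the name `history` are arbitrary choices for illustration):

```python
x = 12
history = [x]

# Decrease x five times, but never let it drop below 10.
for _ in range(5):
    x = max(10, x - 1)
    history.append(x)

print(history)  # [12, 11, 10, 10, 10, 10]
```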

In [None]:
np.random.seed(123)
random_walk = [0]

for x in range(100) :
    step = random_walk[-1]
    dice = np.random.randint(1,7)

    if dice <= 2:
        # Use max() to make sure step can't go below 0
        step = max(0, step - 1)
    elif dice <= 5:
        step = step + 1
    else:
        step = step + np.random.randint(1,7)

    random_walk.append(step)

print(random_walk)

#### Visualize the walk
Let's visualize this random walk! Remember how you could use `matplotlib` to build a line plot? `plt.plot(x, y)`

The first list you pass is mapped onto the `x` axis and the second list is mapped onto the `y` axis.

If you pass only one argument, Matplotlib uses the index of the list for the `x` axis and the values in the list for the `y` axis.

In [None]:
# Import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.dpi'] = 180

In [None]:
np.random.seed(123)
random_walk = [0]

for x in range(100) :
    step = random_walk[-1]
    dice = np.random.randint(1,7)

    if dice <= 2:
        step = max(0, step - 1)
    elif dice <= 5:
        step = step + 1
    else:
        step = step + np.random.randint(1,7)

    random_walk.append(step)

# Plot random_walk
plt.plot(random_walk)

# Show the plot
plt.show()

# Distribution

#### Simulate multiple walks
A single random walk is one thing, but that doesn't tell you if you have a good chance at winning the bet.

To get an idea about how big your chances are of reaching 60 steps, you can repeatedly simulate the random walk and collect the results.
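
One way to organize the repetition, sketched here as an alternative to nesting the loops, is to wrap a single walk in a function and call it several times. The helper name `simulate_walk` and its `n_throws` parameter are assumptions for this illustration, not part of the exercise.

```python
import numpy as np

def simulate_walk(n_throws=100):
    """Simulate one random walk with the dice rules used in this notebook."""
    walk = [0]
    for _ in range(n_throws):
        step = walk[-1]
        dice = np.random.randint(1, 7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1, 7)
        walk.append(step)
    return walk

np.random.seed(123)

# Collect 10 independent walks in a list of lists.
all_walks = [simulate_walk() for _ in range(10)]
print(len(all_walks), len(all_walks[0]))  # 10 101
```

Each walk holds 101 values: the starting position plus one entry per throw.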

In [None]:
np.random.seed(123)

# Initialize all_walks (don't change this line)
all_walks = []

# Simulate random walk 10 times
for i in range(10):

    # Code from before
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)

        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)
        random_walk.append(step)

    # Append random_walk to all_walks
    all_walks.append(random_walk)

# Print all_walks
print(all_walks)

#### Visualize all walks
`all_walks` is a list of lists: every sub-list represents a single random walk. If you convert this list of lists to a NumPy array, you can start making interesting plots!

The nested `for` loop is already coded for you - don't worry about it. For now, focus on the code that comes after this `for` loop.

In [None]:
np.random.seed(123)

# initialize and populate all_walks
all_walks = []
for i in range(10):
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)
        random_walk.append(step)
    all_walks.append(random_walk)

# Convert all_walks to NumPy array: np_aw
np_aw = np.array(all_walks)

# Plot np_aw and show
plt.plot(np_aw)
plt.show()

# Clear the figure
plt.clf()

# Transpose np_aw: np_aw_t
np_aw_t = np.transpose(np_aw)

# Plot np_aw_t and show
plt.plot(np_aw_t)
plt.show()

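You can clearly see how the different simulations of the random walk went. Transposing the 2D NumPy array was crucial: `plt.plot()` draws one line per column, so without the transpose you get 101 lines of 10 points each instead of 10 walks of 101 points each.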
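
Comparing shapes on a small made-up array shows what the transpose changes. The `toy_walks` data below is hypothetical: 3 walks of 5 positions each.

```python
import numpy as np

# Hypothetical toy data: each row is one walk.
toy_walks = np.array([
    [0, 1, 2, 3, 4],
    [0, 0, 1, 2, 1],
    [0, 1, 0, 1, 2],
])

# Rows are walks, columns are positions in time.
print(toy_walks.shape)                # (3, 5)

# After transposing, each column is one walk.
print(np.transpose(toy_walks).shape)  # (5, 3)
```

Since Matplotlib treats each column of a 2D array as a separate line, the transposed array is the one that gives one line per walk.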

#### Implement clumsiness
With this neatly written code of yours, changing the number of times the random walk should be simulated is super-easy. You simply change the argument of `range()` in the outer `for` loop.

There's still something we forgot! You're a bit clumsy, and you have a 0.1% chance of falling down. That calls for another random number generation. Basically, you can generate a random float between `0` and `1`. If this value is less than or equal to 0.001, you should reset step to 0.
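
A 0.1% chance per throw sounds negligible, but it adds up over a walk. As a rough back-of-the-envelope sketch, assuming the fall check runs once per throw and the 100 checks are independent (variable names are illustrative):

```python
p_fall = 0.001   # chance of falling on a single throw
n_throws = 100   # throws per walk

# Chance of getting through all throws without a single fall.
p_no_fall = (1 - p_fall) ** n_throws

# Chance of at least one fall somewhere in the walk.
p_at_least_one_fall = 1 - p_no_fall
print(round(p_at_least_one_fall, 3))  # 0.095
```

So under these assumptions, roughly one walk in ten is reset to 0 at some point.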

In [None]:
np.random.seed(123)

# Simulate random walk 250 times
all_walks = []
for i in range(250) :
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)

        # Implement clumsiness
        if np.random.rand() <= 0.001 :
            step = 0

        random_walk.append(step)
    all_walks.append(random_walk)

# Create and plot np_aw_t
np_aw_t = np.transpose(np.array(all_walks))
plt.plot(np_aw_t)
plt.show()

#### Plot the distribution
All these fancy visualizations have sidetracked us. We still have to solve the million-dollar problem: What are the odds that you'll reach step 60 or higher in the Empire State Building?

Basically, you want to know about the end points of all the random walks you've simulated. These end points have a certain distribution that you can visualize with a histogram.
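
Selecting the end points can be sketched on a small made-up example. The `toy_walks` list is hypothetical; the point is that the last row of the transposed array holds the same values as the last column of the original.

```python
import numpy as np

# Hypothetical toy data: 3 walks of 4 positions each.
toy_walks = [
    [0, 1, 2, 3],
    [0, 0, 1, 5],
    [0, 1, 0, 1],
]

# Last row of the transposed array: one end point per walk.
np_t = np.transpose(np.array(toy_walks))
ends_from_transpose = np_t[-1, :]

# Equivalent: last column of the untransposed array.
ends_from_original = np.array(toy_walks)[:, -1]

print(ends_from_transpose)  # [3 5 1]
```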

Note that if your code is taking too long to run, you might be plotting a histogram of the wrong data!

In [None]:
np.random.seed(123)

# Simulate random walk 500 times
all_walks = []
for i in range(500) :
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)
        if np.random.rand() <= 0.001 :
            step = 0
        random_walk.append(step)
    all_walks.append(random_walk)

np_aw_t = np.transpose(np.array(all_walks))

# Select last row from np_aw_t: ends
ends = np_aw_t[-1, :]

# Plot histogram of ends, display plot
plt.hist(ends)
plt.show()

#### Calculate the odds
The histogram of the previous exercise was created from a NumPy array `ends`, which contains 500 integers. Each integer represents the end point of a random walk. To calculate the chance that this end point is greater than or equal to 60, you can count the number of integers in `ends` that are greater than or equal to 60 and divide that number by 500, the total number of simulations.
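
The count-and-divide step can also be sketched with a boolean comparison on the array. The `toy_ends` values below are made up and stand in for the real `ends`.

```python
import numpy as np

# Hypothetical end points of 5 walks.
toy_ends = np.array([55, 60, 73, 41, 62])

# Element-wise comparison gives True where the walk reached at least 60.
reached = toy_ends >= 60

# Count the True values and divide by the number of walks.
chance = reached.sum() / len(toy_ends)
print(chance)  # 0.6
```

Three of the five toy end points are at least 60, hence 0.6.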

Well then, what's the estimated chance that you'll reach at least 60 steps high if you play this Empire State Building game? The `ends` array from the previous cell is everything you need.

In [None]:
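# Collect the end points that are at least 60
true_vals = []
for end in ends:
    if end >= 60:
        true_vals.append(end)

# Estimated chance: share of simulated walks that ended at step 60 or higher
odds = len(true_vals) / len(ends)
odds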