# Hacker Stats

Hacker stats are statistics generated by using modern computation to simulate some probabilistic situation. Often, these situations involve games of chance.

Today, the game of chance in question will be a game of dice, which will require us to explore the concept of a random walk.

A random walk is a series of values in which each value is a random increment or decrement from the previous value.

I like to think of something like a stock price: the stock may move up a bit or down a bit at each moment, and the resulting path can then be graphed.
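
To make the definition concrete before we get to the dice, here is a minimal sketch of a random walk whose moves are simply one step up or one step down. The seed value and the names `moves` and `walk` are only illustrative assumptions; this is not the dice game itself.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the sketch is repeatable

# Ten random moves, each one step down (-1) or one step up (+1).
moves = np.random.choice([-1, 1], size=10)

# A running total of the moves, starting from 0, gives the walk itself.
walk = np.concatenate(([0], np.cumsum(moves)))
print(walk)
```

Every value in `walk` differs from the previous one by exactly one step, which is the defining property of this simple walk.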

In our case, we are simulating a game in which we start at step 0 and try to reach step 50. We will roll a dice 100 times, and each roll moves us closer to (or further from) our goal.

- We will begin our random walk at 0, represented in Python as a list of length 1: `[0]`

- We will roll a dice

- The next value in the list will be determined by the outcome of the dice roll. The game logic will be defined when we get there.

- We will then build out an entire random walk by rolling the dice 100 times with a `for loop`.

- We will then simulate these random walks 100 or even 1000 times, using another `for loop` on top of the previous one.


At the end of the notebook there are problems with solutions that may look similar to real-world practice. The functions found there are exactly the same as the functions shown in the notebook, but with fewer comments. As we work through the notebook, we will see heavily commented versions of each function definition, but it can be hard to see how they all fit together when viewed out of context. So use the last part of the notebook as a guiding light and the heavily commented versions as reference!

Let's get started!

## Imports

In [0]:
# For generating numbers
import numpy as np

# For plotting long term behavior
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

## 0. Setting a Random Seed

A `random seed` is a crucial part of a random number generator. Random number generators create what are called *pseudo*-random numbers, because they are not truly random.

These numbers depend on a `random_seed`, which is either chosen by numpy in the background or set explicitly by the programmer with `np.random.seed(some_num)`.

Setting the seed explicitly makes our random operations reproducible: the same seed followed by the same calls produces the same outputs.

Ex: if we all set the same `seed` value and then run 10 different functions from the `np.random` module in the same order, we will ALL see the same sequence of random numbers generated.

In [8]:
# No matter how many times we run this cell, the output will ALWAYS be 5.
# This is because .seed() is set to 22 before the call to .randint(),
# which effectively resets the behavior of randint() on each run.
np.random.seed(22)
np.random.randint(10)

5
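
As a small sketch of the same idea with more than one number, the block below draws five values, resets the seed, and draws five more. The variable names `first_run` and `second_run` are illustrative only.

```python
import numpy as np

np.random.seed(22)
first_run = np.random.randint(1, 7, size=5)

np.random.seed(22)  # reset to the same seed
second_run = np.random.randint(1, 7, size=5)

# Same seed, same calls, same order: the two runs match exactly.
print(np.array_equal(first_run, second_run))  # True
```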

<br>

## 1. Dice Game

Each step will add to the complexity of the previous one!

### Roll a Single Dice
<hr>

The call to `np.random.randint(1, 7)` below returns one of the following integers at random (the upper bound `7` is exclusive):
`1`, `2`, `3`, `4`, `5` or `6`

This effectively simulates the act of rolling a die!

In [0]:
dice = np.random.randint(1,7)
print(dice)

<br>

### Roll `n` Dice
<hr>

This `for loop` will execute as many dice rolls as we like.

ex) if we want `10` dice rolls, set `num_rolls = 10`

ex) if we want `50` dice rolls, set `num_rolls = 50`

The goal is to eventually combine *this* loop with some sort of game logic, to recreate the dice game.

The game logic will use the result of each roll to move our player **UP** or **DOWN**.

In [0]:
num_rolls = 10
for roll in range(num_rolls):
    dice = np.random.randint(1,7)
    print(dice)

### We can also store each result in a list, for later use!
**Store in a python `list` with the `.append()` method!**

In [0]:
# How many dice do you want to roll?
num_rolls = 10


# prepare an empty list
dice_rolls = [] 


# loop num_rolls times (10 here)
for roll in range(num_rolls): 
    
    # "Roll a dice" and save output to variable dice
    dice = np.random.randint(1,7)
    
    # Append each iteration of dice to dice_rolls
    dice_rolls.append(dice)


    
# print final list outside of for loop
print(dice_rolls)
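
As an aside, `np.random.randint()` also accepts a `size` argument, so the same ten rolls can be drawn in a single call. This is only an alternative sketch; the loop version above is the one the rest of the notebook builds on.

```python
import numpy as np

num_rolls = 10

# One call returns a numpy array holding num_rolls independent rolls.
dice_rolls = np.random.randint(1, 7, size=num_rolls)
print(dice_rolls)

# Convert to a plain Python list if list methods such as .append() are needed.
print(dice_rolls.tolist())
```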

<hr>
<br>
<br>
<br>

### Game Logic - Taking a Step

The game logic will decide just how our player will behave. Each dice roll can result in either moving forward or backward.

### Rules
---
These rules are entirely arbitrary; we can change them and see drastic changes in the long-term outcomes.
- Start at position 5
- Roll a dice
- If we roll a `1`, we stay put.
- If we roll a `2` or `3`, we move back one step.
- If we roll a `4` or `5`, we move forward one step.
- If we roll a `6`, we move forward two steps.

In [0]:
# Set the initial step to 5 so that stepping back cannot take us to a negative position such as next_step = -1.
init_step = 5 

# Roll a Single Dice 
dice = np.random.randint(1,7)

# Game Logic
if dice == 1:
    next_step = init_step
elif dice <= 3:
    next_step = init_step - 1
elif dice <= 5:
    next_step = init_step + 1 
else:
    next_step = init_step + 2
    
# Where did we step?
print("init_step:", init_step)
print("dice rolled:", dice)
print("next_step:", next_step)
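
To see why changing the rules changes the long-term behavior, it can help to work out the average movement per roll. The sketch below does this for the rule set listed above, assuming a fair six-sided die; the names `step_change` and `expected_change` are illustrative.

```python
from fractions import Fraction

# Change in position for each face, following the rules listed above.
step_change = {1: 0, 2: -1, 3: -1, 4: 1, 5: 1, 6: 2}

# Each face of a fair die has probability 1/6.
expected_change = sum(Fraction(change, 6) for change in step_change.values())
print(expected_change)  # 1/3
```

Under these rules the player drifts forward by a third of a step per roll on average. Editing any entry in `step_change` shifts that average, which is what drives the long-term results.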

<hr>
<br>

<h2 style="color:red"><center>---Exercise 1 and 2---</center></h2>

### A Simple *Random* Walk

In the above exercise, we were able to take a current position, roll a dice, and find our new position. However, we haven't saved this change anywhere. What if we wanted to take this new position and then roll a dice from there? How do we keep updating the position with consecutive dice rolls?

The answer is a Python list! This will be different from the simple append we did earlier, because we will not be appending the dice roll itself; we will be appending the updated position of the player based on the dice roll.

This means we have to initialize a list with our first position, in this case `0`.
From there, we append each new position to the list by taking the latest position and running the dice game on it.

### Steps
---
- At the beginning of the `loop`, `random_walk = [0]`
- `1st` iteration: if we roll a 6, we move up 2, so `random_walk = [0, 2]`
- `2nd` iteration: if we roll a 1, we move back 1, so `random_walk = [0, 2, 1]`
- Repeat `n` times, where `n` is the number of rolls; the final `random_walk` then has `n + 1` entries.

In [0]:
# Always initialize lists outside of for loops, otherwise you reinitialize on each iteration.
random_walk = [0]

for roll in range(10):
    
    
    print(random_walk)  # Look at random walk change as we iterate through loop
    
    
    init_step = random_walk[-1]  # grab latest value in random_walk, 0 at first.

    
    dice = np.random.randint(1,7)  # Roll dice

    
    # Think of this section as a unit that receives init_step and dice, and returns next_step.
    if dice <= 2:
        next_step = max(0, init_step - 1) # if we roll a 1 or 2, move down 1. If we end up at position -1, take 0 instead.  
    
    elif dice <= 5:
        next_step = init_step + 1  # If we roll a 3,4 or 5 move up 1.
        
    else:
        next_step = init_step + 2  # If we roll a 6, move up 2


    # add to random_walk
    random_walk.append(next_step)
    

print()    
print("Final:", random_walk)

<h2 style="color:red"><center>---Exercise 3---</center></h2>

## Simulate Games Using Functions

We have now seen how to create a single random_walk, but what if we wanted to make ten, a hundred, or even a thousand `random_walks`? How would we go about doing that?

The answer is that we would just put all of the code above inside another for loop, along with yet another Python list in which each element is a random_walk. However, this can get pretty hairy.

For your viewing pleasure, the code below simulates 100 random_walks of 10 steps each. We will go over a better approach after this.

[This is Equivalent to](#func)
<a id=prog></a>

In [0]:
num_walks = 100
num_steps = 10


# Always initialize lists outside of for loops, otherwise you reinitialize on each iteration.
all_random_walks = []
for n in range(num_walks):
    random_walk = [0]
    for roll in range(num_steps):
        init_step = random_walk[-1]  
        dice = np.random.randint(1,7)  

        if dice <= 2:
            next_step = max(0, init_step - 1) 
        elif dice <= 5:
            next_step = init_step + 1 
        else:
            next_step = init_step + 2

        random_walk.append(next_step)
    
    all_random_walks.append(random_walk)
    
    # Print every random_walk as it is appended; this saves us from the messy printout below.
    print(random_walk)

    
# un-comment below to see all_random_walks in all of its messy glory
# print(all_random_walks)

<br>
<br>
<hr>

<h2 style="color:red"><center>---Exercise 4---</center></h2>

## Now, for the better approach.

If this feels fragmented or hard to follow, I have placed all of the functions at the bottom, in order. They achieve the results of the above, but broken down into small functions.

I will present each function and describe how it gets used by the next, slowly building out a program similar to the one above, except much more functional and reusable.

### `dice_roll()`

>**Input:** None

>**Output:** `dice`; a random integer from 1 to 6 (inclusive)

In [0]:
def dice_roll():
    return np.random.randint(1,7)

<br>

### `dice_game()`

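This function takes in the previous step as `prev_step`, performs the dice game logic, and outputs a `next_step`. Note that this version moves back on a roll of `1`, `2` or `3`, whereas the loop version above moved back only on a `1` or `2`.

> **Input:** `prev_step` - Integer

> **Output:** `next_step` - Integer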


In [0]:
def dice_game(prev_step):
    dice = dice_roll()
    if dice <= 3: 
        next_step = max(0, prev_step - 1)
        
    elif dice <= 5:
        next_step = prev_step + 1
        
    else: 
        next_step = prev_step + 2
        
    return next_step
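
Because `dice_game()` rolls the dice internally, its output changes from call to call. One way to inspect the rules on their own is to pass the roll in as an argument. The helper below, `apply_roll()`, is a hypothetical name introduced only for this sketch; it copies the same branching logic.

```python
def apply_roll(prev_step, dice):
    """Hypothetical helper: same rules as dice_game(), but the roll is passed in."""
    if dice <= 3:
        return max(0, prev_step - 1)
    elif dice <= 5:
        return prev_step + 1
    else:
        return prev_step + 2


# Starting from step 10, list where each face of the dice would send us.
for face in range(1, 7):
    print(face, apply_roll(10, face))

# The floor at 0 only matters when we are already at the bottom.
print(apply_roll(0, 1))  # 0
```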

<br>
<br>

### `build_random_walk()`

>**Input:** `num_steps`: how many steps for the walk.

>**Output:** A Single `random_walk`

Basically, we've managed to get the first two chunks into functions that can now be called from within the loop in the `build_random_walk()` function. Any time we call `build_random_walk()`, it will return a single `random_walk`.

Notice how, within the for loop, we call `dice_game()`, where we pass in the `curr_step` and its logic computes and returns the `next_step`. Effectively, on each iteration, we are adding a new `next_step` to `rand_walk`.

At the very end, we actually call the function and save the `rand_walk` returned by `build_random_walk()` into the variable `a_rand_walk`.

In [0]:
def build_random_walk(num_steps = 100):
    rand_walk = [0]
    for roll in range(num_steps):
        curr_step = rand_walk[-1]
        next_step = dice_game(curr_step)
        rand_walk.append(next_step)
    return rand_walk

# Call build_random_walk(), see its output.
a_rand_walk = build_random_walk()
print(a_rand_walk)

<br>

### `simulate_games()`

>**Input:** `num_sims` - Integer

>**Output:** `all_walks` - Python list of `random_walks`
>> `num_sims` will decide how many `random_walks` will be contained inside of `all_walks`


We will do something very similar to before, this time to make many `random_walks` instead of one. This is the power of functions: we can put pieces of code inside them and repeat them with for loops.

We will see `build_random_walk()` being called here. The loop will iterate as many times as the user-defined `num_sims` dictates, making a new random_walk of 100 steps each time.

Each random_walk is then appended to `all_random_walks`, which is returned and stored in `all_walks`. At the end it contains as many random_walks as you'd like: 1000, 10000, you name it.

In [0]:
def simulate_games(num_sims):
    all_random_walks = []
    for sim in range(num_sims):
        rand_walk = build_random_walk(num_steps = 100)
        all_random_walks.append(rand_walk)
        
    return all_random_walks


all_walks = simulate_games(num_sims = 10)

# un-comment to see each walk in all_walks, will get unruly as you increase num_sims
# for walk in all_walks:
#     print(walk)
#     print()

<br>

### `prepare_data()`

>**Input:** `all_walks` - python list of all random walks

>**Output:** `all_walks_t` - the transpose of a numpy array containing all random walks

In [0]:
def prepare_data(all_walks):
    
    """For easier plotting"""
    
    np_walks = np.array(all_walks)
    np_walks_t = np.transpose(np_walks)
    return np_walks_t 
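
To see what the transpose does, here is a small sketch on two made-up walks (toy data, not simulation output). Matplotlib's `plot()` draws each column of a 2D array as its own line, so after transposing, each walk becomes one column and therefore one line.

```python
import numpy as np

# Two made-up walks of three positions each.
toy_walks = [[0, 1, 2],
             [0, 0, 1]]

np_walks = np.array(toy_walks)
np_walks_t = np.transpose(np_walks)

print(np_walks.shape)    # (2, 3): one row per walk
print(np_walks_t.shape)  # (3, 2): one row per step, one column per walk
print(np_walks_t[-1])    # [2 1]: the final position of each walk
```

The last row of the transposed array holds the final position of every walk, which is what the histogram function below relies on.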

<br>

### `plot_distribution()`

>**Input:** `all_walks` - transposed numpy array of all `random_walks` (the output of `prepare_data()`)

>**Output:** Histogram of all `final_steps`

In [0]:
def plot_distribution(all_walks):
    plt.figure()
    ax = plt.gca()
    
    # The last row of the transposed array holds the final outcome of each walk.
    final_steps = all_walks[-1]
    ax.hist(final_steps)
    
    
    ax.set_title('Dice Game Outcomes')
    ax.set_xlabel('Game Outcome')
    ax.set_ylabel('# Games')
    plt.show()

<br>

### `plot_all_walks()`

>**Input:** `all_walks` - numpy array of all `random_walks`

>**Output:** plots every single `random_walk`

In [0]:
def plot_all_walks(all_walks):
    plt.figure(figsize=(12,8))
    ax = plt.gca()
    ax.set_title("All Random Walks")
    ax.set_xlabel("Dice Roll (Time)")
    ax.set_ylabel("Steps Taken (Random Walk)")
    ax.plot(all_walks)
    plt.show()

<hr>
<br>
<br>

### Run Simulations and Plot Results

In [0]:
all_walks = simulate_games(num_sims = 100)
all_walks_t = prepare_data(all_walks)
plot_distribution(all_walks_t)
plot_all_walks(all_walks_t)
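
The histogram shows where games tend to end, and the same simulated data can be turned into a single number: the estimated chance of reaching the goal. The sketch below is self-contained and makes a few assumptions. It uses the `dice_game()` rules, it treats "reaching the goal" as finishing at step 50 or higher after 100 rolls, and the helper name `final_position()` and the seed are illustrative.

```python
import numpy as np

np.random.seed(22)  # arbitrary seed so the estimate is repeatable


def final_position(num_steps=100):
    """Play one game with the dice_game() rules and return the last step."""
    step = 0
    for _ in range(num_steps):
        dice = np.random.randint(1, 7)
        if dice <= 3:
            step = max(0, step - 1)
        elif dice <= 5:
            step += 1
        else:
            step += 2
    return step


num_sims = 1000
final_steps = np.array([final_position() for _ in range(num_sims)])

# Share of simulated games that finished at step 50 or higher.
prob_goal = np.mean(final_steps >= 50)
print(prob_goal)
```

Increasing `num_sims` makes the estimate more stable, at the cost of a longer run time.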

<h2 style="color:red"><center>---Exercise 5---</center></h2>

<br>
<br>
<a id=func></a>

### In perspective: Each function builds on the last.

Use this section when reviewing the process. There is no need to go through it again in class, but it is very useful to see how these functions build on one another to form the final data structure.

[This is Equivalent To](#prog)

In [0]:
def dice_roll():
    '''Rolls a single die'''

    return np.random.randint(1,7)


def dice_game(curr_step):
    '''Takes a curr_step, rolls the dice, and returns next_step'''
    
    dice = dice_roll()
    if dice <= 3: 
        next_step = max(0, curr_step - 1)
    elif dice <= 5:
        next_step = curr_step + 1
    else: 
        next_step = curr_step + 2
    return next_step


def build_random_walk(num_steps):
    '''Saves each next_step to form a single random_walk'''
    
    rand_walk = [0]
    for roll in range(num_steps):
        curr_step = rand_walk[-1]
        next_step = dice_game(curr_step)
        rand_walk.append(next_step)
        
    return rand_walk


def simulate_games(num_sims):
    '''Saves each random_walk to a large list of all walks'''
    
    all_random_walks = []
    for sim in range(num_sims):
        rand_walk = build_random_walk(num_steps = 100)
        all_random_walks.append(rand_walk)
        
    return all_random_walks

The above are only function definitions. The cell after the `Plotting Functions` section calls both sets of functions to perform the hacker stats.

<br>

### Plotting Functions

In [0]:

def prepare_data(all_walks):
    
    """Transposes numpy array version of python list, for easier plotting"""
    
    np_walks = np.array(all_walks)
    np_walks_t = np.transpose(np_walks)
    return np_walks_t 


def plot_distribution(all_walks):
    '''plots a distribution of numbers, in this case the final step of each random walk'''
    
    plt.figure()
    ax = plt.gca()
    final_steps = all_walks[-1]
    ax.hist(final_steps)
    ax.set_title('Dice Game Outcomes')
    ax.set_xlabel('Game Outcome')
    ax.set_ylabel('# Games')
    plt.show()
    
    
def plot_all_walks(all_walks):
    '''This will plot every single random walk in a set of all_walks'''
    
    plt.figure(figsize=(12,8))
    ax = plt.gca()
    ax.set_title("All Random Walks")
    ax.set_xlabel("Dice Roll (Time)")
    ax.set_ylabel("Steps Taken (Random Walk)")
    ax.plot(all_walks)
    plt.show()

#### Create 100 random walks based on the rules of dice_game.

In [0]:
all_walks = simulate_games(100)
all_walks_t = prepare_data(all_walks)
plot_distribution(all_walks_t)
plot_all_walks(all_walks_t)