<a id='top'></a>

# CSCI 3202, Spring 2018
# Assignment 5
# Due:  Wednesday 11 April 2018 by 12:00 PM

<br>

### Your name: Samuel Leon

<br>

**Note:** Some packages to load, helper functions and unit tests are defined at [the bottom of this notebook](#helpers). They're also defined up here, because I care.

Shortcuts:  [top](#top) || [1](#p1) | [1a](#p1a) | [1b](#p1b) | [1c](#p1c) | [1d](#p1d) | [1e](#p1e) | [1f](#p1f) | [1g](#p1g) || [2](#p2) | [2a](#p2a) | [2b](#p2b) | [2c](#p2c) | [2d](#p2d) | [2e](#p2e) || [3](#p3) | [3a](#p3a) | [3b](#p3b) | [3c](#p3c) | [3d](#p3d) | [3e](#p3e) | [3f](#p3f) | [3g](#p3g) || [helpers](#helpers)

In [6]:
from scipy import stats
import unittest
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

---

<a id='p1'></a>[Back to top](#top)

## Problem 1:  Bayesian network to model heart disease

The following Bayesian network is based loosely on a study that examined heart disease risk factors in 167 elderly individuals in South Carolina.  This study is [linked here](https://piazza.com/class_profile/get_resource/jc4v74a5uu5wa/jeyiv7kvs7r7ck) and posted to Piazza under the Resources tab.  Note that this figure uses Y and N to represent Yes and No, whereas in class we used the equivalent T and F to represent True and False Boolean values.

<img src="http://www.cs.colorado.edu/~tonyewong/home/resources/hw05_bayesnet_heartdisease.png" style="width: 650px;"/>

<a id='p1a'></a>

### (1a) 

Create a `BayesNet` object to model this.  Below are the codes for the (conditional) probability `P` function and `BayesNode` class as well, that we used in class on Friday (16 March) to represent the variable nodes and calculate probabilities. You can code this however you want, subject to the following constraints:
1. the nodes are represented using the `BayesNode` class and can work with the `P` function for probabilities,
1. your `BayesNet` structure keeps track of which nodes are in the Bayes net, as well as
1. which nodes are the parents/children of which other nodes.

Some *suggested* skeleton codes for a class structure are given. The suggestions for methods to implement are in view of the fact that we will need to calculate some probabilities, which is going to require us to `find_node`s and `find_values` that nodes can take on.

In [7]:
## For the sake of brevity...
T, F = True, False

## From class:
def P(var, value, evidence={}):
    '''The probability distribution for P(var | evidence), 
    when all parent variables are known (in evidence)'''
    if len(var.parents)==1:
        # only one parent
        row = evidence[var.parents[0]]
    else:
        # multiple parents
        row = tuple(evidence[parent] for parent in var.parents)
    return var.cpt[row] if value else 1-var.cpt[row]

## Also from class:
class BayesNode:
    
    def __init__(self, name, parents, values, cpt):
        if isinstance(parents, str):
            parents = parents.split()
            
        if len(parents)==0:
            # if no parents, empty dict key for cpt
            cpt = {(): cpt}
        elif isinstance(cpt, dict):
            # if there is only one parent, only one tuple argument
            if cpt and isinstance(list(cpt.keys())[0], bool):
                cpt = {(v): p for v, p in cpt.items()}

        self.variable = name
        self.parents = parents
        self.cpt = cpt
        self.values = values
        self.children = []
        
    def __repr__(self):
        return repr((self.variable, ' '.join(self.parents)))    

    
##===============================================##
## Suggested skeleton codes for a BayesNet class ##
##===============================================##

class BayesNet:
    '''Bayesian network containing only boolean-variable nodes.'''

    def __init__(self, nodes):
        '''Initialize the Bayes net by adding each of the nodes,
        which should be a list BayesNode class objects ordered
        from parents to children (`top` to `bottom`, from causes
        to effects)'''
        
        self.bn_var = nodes 

                
    def add(self, node):
        '''Add a new BayesNode to the BayesNet. The parents should all
        already be in the net, and the variable itself should not be'''
        assert node.variable not in self.variables
        assert all((parent in self.variables) for parent in node.parents)
        self.bn_var.append(node) 

            
    def find_node(self, var):
        '''Find and return the BayesNode in the net with name `var`'''
        for node in self.bn_var:
            if (node.variable == var):
                return node
        

        
    def find_values(self, var):
        '''Return the set of possible values for variable `var`'''
        node = self.find_node(var)
        return node.values
    
    def __repr__(self):
        return 'BayesNet({})'.format(self.nodes)

#### Unit tests

In [8]:
tests_to_run = unittest.TestSuite()
tests_to_run.addTest(Tests_Problem1("test_onenode"))
tests_to_run.addTest(Tests_Problem1("test_twonode"))
unittest.TextTestRunner().run(tests_to_run)

..
----------------------------------------------------------------------
Ran 2 tests in 0.006s

OK


<unittest.runner.TextTestResult run=2 errors=0 failures=0>

<a/ id='p1b'></a>

### (1b)

Craft a function `get_prob(X, e, bn)` to return the **normalized** probability distribution of variable `X` in Bayes net `bn`, given the evidence `e`.  That is, return $P(X \mid e)$. The arguments are:
* `X` is some representation of the variable you are querying the probability distribution of. Either a string (the variable name from the `BayesNode` or a `BayesNode` object itself are good options.
* `e` is some representation of the evidence your probability is conditioned on. When given an empty argument (or `None`) for `e`, `get_prob` should return the marginal distribution $P(X)$.
* `bn` is your `BayesNet` object.

You may do this using the `enumeration` algorithm from class (pseudocode is in the book), or by brute force (i.e., use a few `for` loops). Either way, you should be using your `BayesNet` object to keep track of all the nodes and relationships between nodes so your `get_prob` function knows these things.

In [9]:
'''
e_var is the whole list of nodes like: [('Smoking ^ Alcohol', ''), ('Moderate Exercise', ''), ('High Blood Pressure', 'Smoking ^ Alcohol Moderate Exercise'), ('Atheroscierosis', ''), ('Family History', ''), ('Heart Disease', 'Atheroscierosis High Blood Pressure Family History'), ('Angina Pectoris', 'Heart Disease'), ('Rapid Heartbeats', 'Heart Disease')]
e_var[0] is the first tuple in the list ('Smoking ^ Alcohol', '')
e_var[0].values is a tuple (True, False)
e_var[0].variable is the name, Smoking ^ Alcohol in this example

enumerater takes the list of values, evar, and a list of evidence e, and sums the probabilites as they're iterated through.
We do this using a parent node and a summand, wherein the parent has the probability of all prior nodes and the child
is the next subsequent step
'''
def iterate_list(list_obj):
    if len(list_obj) == 0: return list_obj
    else: return list_obj[1:]
    
def enumerater(e_var, e):    
    evidence = e.copy()
    if len(e_var) == 0: return 1

    if e_var[0].variable in evidence:
        parent = P(e_var[0], evidence[e_var[0].variable], evidence)
        e_temp = e_var.copy()
        e_temp.pop(0)
        
        child = enumerater(e_temp, evidence)
        return (parent*child)
    else:
        summand = 0 
        for i in e_var[0].values: #(True, False)
            evidence[e_var[0].variable] = i #Assign true/false to names
            parent = P(e_var[0], evidence[e_var[0].variable], evidence)
            child = enumerater(iterate_list(e_var), evidence)
            summand += (parent*child)
        return summand
                                   
def get_prob(X, e, bn):
    evidence = e.copy()
    X = bn.find_node(X)
    list_vals = []
    for i in X.values:
        evidence[X.variable] = i
        list_vals.append(enumerater(bn.bn_var, evidence))
        
    total = sum(list_vals)
    for i in range(len(list_vals)):
        list_vals[i] /= total
    #turn everything into a proper distribution
        
    dict_list = {}
    for i in range(len(X.values)):
        dict_list[X.values[i]] = list_vals[i]
    print(dict_list)
    return dict_list

Use your `get_prob` function to calculate the following probabilities. Print them to the screen and compare to the original Bayes net figure given to make sure the output passes these "unit tests".

1. The marginal probability of `Family History` is $P(FH=T)=0.15$
2. The probability of *not* experiencing `Angina Pectoris`, given `Heart Disease` is observed, is $P(Ang=F \mid HD=T)=1-0.85=0.15$
3. The probability of `High Blood Pressure`, given a person does `Smoke and/or use Alcohol` but does not get `Moderate Exercise`, is $P(HBP=T \mid Sm=T, ME=F)=0.72$

In [10]:
ath = BayesNode('Atheroscierosis', '', (T,F), 0.53)
fam_hist = BayesNode('Family History', '', (T,F), 0.15)
M_E = BayesNode('Moderate Exercise', '', (T,F), 0.5)

S_A = BayesNode('Smoking ^ Alcohol', '', (T,F), 0.2)
H_BP = BayesNode('High Blood Pressure', ['Smoking ^ Alcohol', 'Moderate Exercise'], (T,F), {(T,T) : 0.6, (T,F) : 0.72, (F,T) : 0.33, (F,F) : 0.51})
heart = BayesNode('Heart Disease', ['Atheroscierosis', 'High Blood Pressure', 'Family History'], (T,F), {(T,T,T):0.92, (T,T,F):0.91, (T,F,T):0.81, (T,F,F):0.77, (F,T,T):0.75, (F,T,F):0.69, (F,F,T):0.38, (F,F,F):0.23})
ang = BayesNode('Angina Pectoris', ['Heart Disease'], (T,F), {T:0.85, F:0.4})
rapid_h = BayesNode('Rapid Heartbeats', ['Heart Disease'], (T,F), {T:0.99, F:0.3})


nodes = [S_A, M_E, H_BP, ath, fam_hist, heart, ang, rapid_h]
Bnet = BayesNet(nodes)

p1 = get_prob('Family History', {}, Bnet)
p2 = get_prob('Angina Pectoris', {'Heart Disease':True}, Bnet)
p3 = get_prob('High Blood Pressure', {'Smoking ^ Alcohol':True, 'Moderate Exercise':False}, Bnet)

print("P(Family History)", p1)
print("P(Angina Pectoris | Heart Disease)", p2)
print("P(High Blood Pressure | Smokes, Alcohol | - Moderate Exercise)", p3)

{True: 0.15, False: 0.85}
{True: 0.85, False: 0.15000000000000002}
{True: 0.7199999999999999, False: 0.28}
P(Family History) {True: 0.15, False: 0.85}
P(Angina Pectoris | Heart Disease) {True: 0.85, False: 0.15000000000000002}
P(High Blood Pressure | Smokes, Alcohol | - Moderate Exercise) {True: 0.7199999999999999, False: 0.28}


<a/ id='p1c'></a>

### (1c)

Calculate the probability of observing someone with `High Blood Pressure`, $P(HBP=T)$, *by hand*, showing all work in Markdown/LateX below.

**Your answer:**

$P(HBP = T)$

$\Sigma$ **BELOW**


$P(+Sm)*P(+ME)*P(HBP|+Sm, +ME) = (.2*.5*.6)$

$P(+Sm)*P(-ME)*P(HBP|+Sm, -ME) = (.2*.5*.72)$

$P(-Sm)*P(+ME)*P(HBP|-Sm, +ME) = (.8*.5*.33)$

$P(-Sm)*P(-ME)*P(HBP |-Sm, -ME) = (.8*.5*.51)$

In [11]:
print((.2*.5*.6) + (.2*.5*.72) + (.8*.5*.33) + (.8*.5*.51))

0.468


**Verify** your calculation using your `get_prob` function.

In [12]:
get_prob('High Blood Pressure', {}, Bnet)

{True: 0.4680000000000001, False: 0.532}


{False: 0.532, True: 0.4680000000000001}

<a/ id='p1d'></a>

### (1d)

Now calculate the following probabilities using your `get_prob` function.

[i] The probability of an arbitrary individual having `Heart Disease`, $P(HD=T)$

In [13]:
get_prob('Heart Disease', {}, Bnet)

{True: 0.6617765600000001, False: 0.33822343999999993}


{False: 0.33822343999999993, True: 0.6617765600000001}

[ii] The probability that an individual does *not* have `Heart Disease`, given that `Rapid Heartbeat` was observed, $P(HD=F \mid Rapid=T)$

In [14]:
get_prob('Heart Disease', {'Rapid Heartbeats':True}, Bnet)

{True: 0.865895362727999, False: 0.13410463727200098}


{False: 0.13410463727200098, True: 0.865895362727999}

[iii] The probability of an individual having `High Blood Pressure` if they have `Heart Disease` and a `Family History`, $P(HBP=T \mid HD=T, FH=T)$

In [15]:
get_prob('High Blood Pressure', {'Heart Disease':True, 'Family History':True}, Bnet)

{True: 0.5486791513343575, False: 0.4513208486656426}


{False: 0.4513208486656426, True: 0.5486791513343575}

[iv] The probability that an individual is a `Smoker/Alcohol User` if they have `Heart Disease`, $P(Sm=T \mid HD=T)$

In [16]:
get_prob('Smoking ^ Alcohol', {'Heart Disease':True}, Bnet)

{True: 0.2163440784303391, False: 0.7836559215696609}


{False: 0.7836559215696609, True: 0.2163440784303391}

[v] How would you expect the probability in [iv] to change if you also know the individual has `High Blood Pressure`?  Verify your hypothesis by calculating the relevant probability.

**Your answer:**

I would expect the probability of Smoking and Alcohol to go up, as it would seem to be more likely that they either do not Exercise or they use Alcohol/Smoke if someone has High Blood Pressure.

In [17]:
get_prob('Smoking ^ Alcohol', {'Heart Disease':True, 'High Blood Pressure':True}, Bnet)

{True: 0.28205128205128205, False: 0.717948717948718}


{False: 0.717948717948718, True: 0.28205128205128205}

[vi] How would you expect the probability in [v] to change if you also know that the individual does *not* get `Moderate Exercise` (in addition to having `Heart Disease` and `High Blood Pressure`)?  Explain your answer using concepts from class.  Verify your answer by calculating the relevant probability.

I would expect the probability that they are Smokers/Drinkers would be lower, as it's still a decent chance to have Heart Disease if you don't have moderate exercise- but that does not necessarily indicate that you are a smoker/drinker.

**Your answer:**

In [18]:
get_prob('Smoking ^ Alcohol', {'Heart Disease':True, 'High Blood Pressure':True, 'Moderate Exercise':False}, Bnet)

{True: 0.2608695652173913, False: 0.7391304347826088}


{False: 0.7391304347826088, True: 0.2608695652173913}

---

<a id='p2'></a>[Back to top](#top)

<img src="https://inhabitat.com/wp-content/blogs.dir/1/files/2014/02/norman-bike-riding-dog.png" style="width: 350px;"/>

## Problem 2:  Bayesian network to model decision-making

Let's consider using a Bayesian network to model our decision about whether or not to ride our bike to work today.  This decision depends heavily on the weather, so let's focus on that.

In class, we focused on Boolean variables.  For example, we might base our biking decision on whether or not it is raining.  But in reality, it probably matters *how hard* it is raining.  So suppose we break the variable `Rain` up into three discrete bins: `none`, `light` and `heavy`.

The temperature also factors into our decision.  There is definitely a sweet spot, where temperatures are neither too warm nor too cold, so it is very likely we would enjoy riding our bike.  So we can model the variable `Temperature` also using three discrete bins: `cold`, `moderate` and `warm`.

So a Bayesian network to model our decision for whether or not to bike to work could be as follows, where the first letter of each discrete bin is used to denote that variable value (i.e., `R=h` stands for heavy rain conditions).

<img src="http://www.cs.colorado.edu/~tonyewong/home/resources/bayesnet_biking2.png" style="width: 650px;"/>

<a/ id='p2a'></a>

### (2a)

Modify the `P` probability function to be able to handle these ternary parent nodes.

In [19]:
def P(var, value, evidence={}):
    '''The probability distribution for P(var | evidence), 
    when all parent variables are known (in evidence)'''
    if len(var.parents)==1:
        # only one parent
        row = evidence[var.parents[0]]
    else:
        # multiple parents
        row = tuple(evidence[parent] for parent in var.parents)
    if len(var.values) == 2:
        if value:
            return var.cpt[row]
        return (1 - var.cpt[row])
    elif isinstance(var.cpt[row], float):
        print(var.values)
        return var.cpt[row]
    return var.cpt[row][value]

Set up `BayesNode` objects for each of `Rain`, `Temp` and `Bike`, and create a `BayesNet` object to model the Bayesian network for this decision.  Again, you can use whatever structure you wish for your `BayesNet`, but please use the `BayesNode` class.  You may need to make minor modifications to the `BayesNode` class (e.g., changing/adding attributes), although none are strictly necessary.

In [20]:
Rain = BayesNode('Rain', '', ('none','light','heavy'), {'none':0.8, 'light':0.15, 'heavy':0.05})
Temp = BayesNode('Temp', '', ('cold', 'moderate', 'warm'), {'cold':0.3, 'moderate':0.6, 'warm':0.1})
Bike = BayesNode('Bike', ['Rain', 'Temp'], (T,F), {('none','cold'):0.7, ('none','moderate'):0.99, ('none','warm'):0.9, ('light','cold'):0.4, ('light','moderate'):0.6, ('light','warm'):0.5, ('heavy','cold'):0.2, ('heavy','moderate'):0.4, ('heavy','warm'):0.3})

b_net_nodes = [Rain, Temp, Bike]
b_net = BayesNet(b_net_nodes)

**Verify** that your modified probability function `P` is working by checking the following "unit tests". Print the output to screen and compare to what you expect from the figure above.

1. The marginal probability of no rain is $P(Rain=n)=0.8$
1. The marginal probability of light rain is $P(Rain=l)=0.15$
1. The marginal probability of heavy rain is $P(Rain=h)=0.05$
1. The probability of biking given that it is raining heavily and the temperature is cold, is $P(Bike=T \mid Rain=h, Temp=c)=0.2$

In [21]:
get_prob('Rain', {}, b_net)
get_prob('Bike', {'Rain':'heavy', 'Temp':'cold'}, b_net)

{'none': 0.7999999999999999, 'light': 0.14999999999999997, 'heavy': 0.049999999999999996}
{True: 0.2, False: 0.8}


{False: 0.8, True: 0.2}

<a/ id='p2b'></a>

### (2b)

Make any necessary modifications to your `get_prob` function from Problem 1, so that you can use it to calculate marginal probabilities and conditional probabilities for this problem. It is possible that your function does not require any modifications.

In [22]:
#I do not need to construct additional Pylons.

Use `get_prob` to calculate $P(Bike)$, the probability distribution for whether or not you will ride your bike on any given day.

In [23]:
get_prob('Bike', {}, b_net)

{True: 0.8112, False: 0.18880000000000002}


{False: 0.18880000000000002, True: 0.8112}

Use `get_prob` to calculate the probability that you will ride your bike, given that it is lightly raining.

In [24]:
get_prob('Bike', {'Rain':'light'}, b_net)

{True: 0.53, False: 0.47}


{False: 0.47, True: 0.53}

<a/ id='p2c'></a>

### (2c)

We are trapped indoors because some jerk gave us a ton of Intro to Artificial Intelligence homework to do.  Suppose we look out the window and see people biking. They sure do look like they're having fun! *Given* this information, we can actually make inferences regarding the temperature outside!  What is the probability distribution for temperature, given that we observe people biking?

First, compute this using your `get_prob` function.

In [25]:
get_prob('Temp', {'Bike':T}, b_net)

{'cold': 0.2329881656804734, 'moderate': 0.6671597633136096, 'warm': 0.09985207100591718}


{'cold': 0.2329881656804734,
 'moderate': 0.6671597633136096,
 'warm': 0.09985207100591718}

In [49]:
x = get_prob('Bike', {'Temp':'cold'}, b_net)[1]
x2 = get_prob('Bike', {'Temp':'moderate'}, b_net)[1]
x3 = get_prob('Bike', {'Temp':'warm'}, b_net)[1]

print("Prob(Temp | Bike): ", x*x2*x3)

{True: 0.63, False: 0.37}
{True: 0.902, False: 0.09800000000000002}
{True: 0.8100000000000002, False: 0.18999999999999995}
Prob(Temp | Bike):  0.4602906000000001


<a/ id='p2d'></a>

### (2d)

Confirm your answer to **2c** by hand, showing *all* relevant work below in a LateX/Markdown cell.

$P(Temp | Bike) = \frac{P(Bike | Temp) * P(Temp)}{P(Bike)}$

This step is a gigantic B%#%

We need to calculate P(Bike) to calculate this. To do so, we would need to calculate everything as we did in **1.C)**. This is something I do not wish to do, as it is a great deal of work. So, I will write out the equation and then use the relevant probability from my code.


$P(Bike) = P(n)*P(c)*P(B | n, c) + P(\neg{n})*P(c)*P(B | \neg{n}, c)
+ P(n)*P(\neg{c})*P(B | n, \neg(c)) + P(\neg{n})*P(\neg{c})*P(B | \neg{n}, \neg{c}) + P(l)*P(c)*P(B | l, c) + P(\neg{l})*P(c)*P(B | \neg{l}, c)
+ P(l)*P(\neg{c})*P(B | l, \neg(c)) + P(\neg{l})*P(\neg{c})*P(B | \neg{l}, \neg{c}) P(h)*P(c)*P(B | h, c) + P(\neg{h})*P(c)*P(B | \neg{h}, c)
+ P(h)*P(\neg{c})*P(B | h, \neg(c)) + P(\neg{h})*P(\neg{c})*P(B | \neg{h}, \neg{c})$

Yay. That's adequately ugly.

So, to calculate that *monstrosity*, we plug in the appropriate values. This comes out to be $18\%$. (Roughly). 
Then, we do some twiddel-y hand wave-y magic and plug in some numbers to the equation below.

$$
P(C | Bike) = [P(n) * P(b | n, c) + P(l) * P(b | l, c) + P(h) * P(h | h, c)] * P(c)
$$
$$
P(W | Bike) = [P(n) * P(b | n, w) + P(l) * P(b | l, w) + P(h) * P(h | h, w)]*P(w)
$$
$$
P(M | Bike) = [P(n) * P(b | n, m) + P(l) * P(b | l, n) + P(h) * P(h | h, m)]*P(m)
$$

Then, we need to normalize the values over eachother. So, we sum the values together and divide each by the total. 

I do the calculations below.

**Your answer:**


In [27]:
pbc = (0.8*0.7 + 0.15*0.4 + 0.05*0.2)*0.3
pbm = (0.8*0.99 + 0.15*0.6 + 0.05*0.4)*0.6
pbw = (0.8*0.9 + 0.15*0.5 + 0.05*0.3)*0.1

total = pbc + pbm + pbw
pbc/= total
pbm/= total
pbw/= total
print("P(B| c): {}, P(B | m): {}, P(B | w): {}".format(pbc, pbm, pbw))

P(B| c): 0.23298816568047337, P(B | m): 0.6671597633136096, P(B | w): 0.09985207100591718


**Hey! Those numbers match up! Yay! It worked.**

<a/ id='p2e'></a>

### (2e)

Finally, confirm your confirmation of the probability distribution for `Temp` by using approximate Bayesian computation and 10,000 samples.  That is, use the **prior sampling** and **"rejection sampling"** techniques from class to estimate the probabilities associated with each possible value for `Temp`, given that there are people biking outside.

In [28]:
def sample_2parent(row, variable, parents):
    return np.random.choice([T,F], p=[P(variable, T, {parents[0]: row[parents[0]], parents[1]: row[parents[1]]}), 
                                      P(variable, F, {parents[0]: row[parents[0]], parents[1] : row[parents[1]]})])
#thank 
#god 
#for 
#the 
#notebook

n_sample = 10000
p_dist_Temp = [0.3, 0.6, 0.1]
p_dist_Rain = [0.8, 0.15, 0.05]

Temp_list = ['cold', 'moderate', 'warm']
Rain_list = ['none', 'light', 'heavy']
sample_Temp = np.random.choice(Temp_list, size = n_sample, p=p_dist_Temp)

dfSample = pd.DataFrame({'Temp': sample_Temp})
dfSample['Rain'] = np.random.choice(Rain_list, size = n_sample, p=p_dist_Rain)
dfSample['Bike'] = dfSample.apply(sample_2parent, axis=1, variable = Bike, parents=Bike.parents)



As a "Unit Test", check what the probability of riding your bike is, given no other information.  Make sure this approximately matches your answer to **2b**.

In [29]:
df = dfSample[(dfSample['Bike'] == True)]
print('Probability it was cold: {}'.format((sum(df['Temp'] == 'cold'))/len(df))) 
print('Probability it was moderate: {}'.format((sum(df['Temp'] == 'warm'))/len(df)))
print('Probability it was warm: {}'.format((sum(df['Temp'] == 'moderate'))/len(df)))

Probability it was cold: 0.22198888202594194
Probability it was moderate: 0.10389129092032119
Probability it was warm: 0.6741198270537369


---

<a id='p3'></a>

## Problem 3:  Markov models - random walk to Taco Bell

Your friend Chris went to a party last night and drank way too much Fun Juice.  Chris is feeling a bit hungry, so he leaves the party, which is at the corner of 6th Street East and 3rd Street North, and heads for Taco Bell, which is located at the corner of 2nd Street East and 1st Street North.  A figure depicting this neighborhood is given below.

<img src="http://www.cs.colorado.edu/~tonyewong/home/resources/random_walk_to_taco_bell.png" style="width: 650px;"/>

Fun Juice makes Chris' sense of direction quite poor, so at each intersection along his way, he picks any one of the available directions with equal probability.  Chris at least knows not to go north of 4th Street North, south of 0th Street North, east of 7th Street East, or west of 0th Street East, and has the common decency not to cut through anyone's yard (i.e., he only walks along streets).

Suppose Chris only cares about traveling between from one intersection to another, and considers one *move* to be walking one block, from one intersection to an adjacent intersection.

Since this grid is precisely a Cartesian coordinate grid, let the bottom-left corner of the neighborhood be represented by $(0,0)$.  Then the available moves from that location are to walk East (right, in the $+x$ direction) to $(1,0)$ or North to $(0,1)$.

<a id='p3a'></a>

### (3a)

Create a class for `Neighborhood` and for `Agent`, to represent the neighborhood and the agent trying to get to Taco Bell:

`Neighborhood(n_northsouth, n_eastwest, taco_bell)`:
* `n_northsouth` and `n_eastwest` provide the number of streets running north-south and the number of streets running east-west, respectively.  In the given figure, there are 5 streets running east-west, for example.
* `taco_bell` is a tuple providing the coordinates of Taco Bell in this neighborhood.
* has attributes for:
  * number of streets running north-south
  * number of streets running east-west
  * all of the intersections in the neighborhood
  * the location of the Taco Bell in the neighborhood
* Implement in your code a check to make sure the location of the Taco Bell is within the neighborhood's coordinates.  Assume that the south-west corner of the neighborhood is always $(0,0)$.

`Agent(name, loc, neighborhood)`:
* In the constructor, provide the agent with a `name` (string), initial location as a tuple (`loc`), and a `Neighborhood` to live in. Store these as attributes.
* Fill in the rest of the needed methods for the agent:
  * `available_moves()` returns a list of tuples, representing the locations the agent can walk to from its current location
  * `random_move()` returns one of the possible moves
  * `walk(move)` updates the agent's location, if they make the given argument `move`
  * `at_taco_bell()` returns True if the agent is at the same intersection as the neighborhood Taco Bell, and False otherwise

In [30]:
from random import randint as randint
class Neighborhood:
    def __init__(self, n_northsouth, n_eastwest, taco_bell):
        '''Set up the layout of the neighborhood by giving the # streets
        running North/South (n_northsouth) and the # streets running 
        East/West). Based on these, store all the available locations 
        (intersections) in the neighborhood, and make sure the given
        coordinates for Taco Bell are valid'''
        
        self.northsouth = n_northsouth
        self.eastwest = n_eastwest
        self.intersections = []
        self.tacobell_loc = taco_bell
        
        for i in range(0,n_northsouth):
            for j in range(0,n_eastwest):
                self.intersections.append(tuple((i,j)))
        
        
class Agent:
    def __init__(self, name, loc, neighborhood):
        
        self.name = name
        self.current_loc = loc
        self.neighborhood = neighborhood
        self.moves_made = [loc]
        
        
    def available_moves(self):
        '''Return a list of available intersections the agent can move to,
        based on the layout of the agent's neighborhood'''
        
        moves = []
        x,y = self.current_loc
        
        if x > 0: 
            moves.append(tuple((x-1,y))) 
            if x < self.neighborhood.northsouth:
                moves.append(tuple((x+1,y)))
        else: moves.append(tuple((x+1,y)))
            
        if y > 0:
            moves.append(tuple((x,y-1)))
            if y < self.neighborhood.eastwest:
                moves.append(tuple((x,y+1)))
        else: moves.append(tuple((x,y+1)))
            
        return moves
        

    def random_move(self):
        '''Return a random move out of the available moves
        from the agent`s current location'''
        
        potential_moves = self.available_moves()
        random_loc = randint(0, len(potential_moves)-1)
        self.moves_made.append(potential_moves[random_loc])
        return potential_moves[random_loc]
        
    def walk(self):
        
        '''Update the agent to a new location'''
        
        self.current_loc = self.random_move()
        
        
    def at_taco_bell(self):
        '''Return True if the agent is at Taco Bell, and False otherwise'''
        
        if self.current_loc == self.neighborhood.tacobell_loc:
            return True
        return False
        
    
    def __repr__(self):
        return '{} at {}'.format(self.name, self.current_loc)

#### Unit tests

In [31]:
tests_to_run = unittest.TestSuite()
tests_to_run.addTest(Tests_Problem3('test_moves_corner'))
tests_to_run.addTest(Tests_Problem3('test_moves_center'))
tests_to_run.addTest(Tests_Problem3('test_tacos'))
unittest.TextTestRunner().run(tests_to_run)

...
----------------------------------------------------------------------
Ran 3 tests in 0.004s

OK


<unittest.runner.TextTestResult run=3 errors=0 failures=0>

<a/ id='p3b'></a>

### (3b)

Create a neighborhood and Chris agent to represent the situation above.

Then, run an ensemble of 1,000 simulations to obtain a sample for the number of blocks Chris must travel in order to arrive at Taco Bell at (2,1), starting from the party at (6,3).  Report the expected number of blocks traveled.

In [34]:
neighborhood = Neighborhood(8,5, (2,1))
num_trials = 1000
path_taken = []
total_blocks_traversed = 0

for i in range(0,1000):
    chris = Agent('Chris', (2,5), neighborhood)
    while not chris.at_taco_bell():
        chris.walk()    
    total_blocks_traversed += len(chris.moves_made)-1
    path_taken.append(chris.moves_made)

average = total_blocks_traversed / num_trials
print("Avg num blocks to return to Taco Bell: ", average)

Avg num blocks to return to Taco Bell:  95.81


**Reflection:**  The sequence of states (coordinates) that Chris passed through in his travels is a **Markov chain** - each new state depended only on the previous one.  This process, called a **random walk**, can be a useful way to explore a state space.


<a/ id='p3c'></a>

### (3c)

Let us explore one of the characteristics of a Markov chain that we often find ourselves interested in:  the **expected time of first return** to a state.  In particular, let's examine Chris' **expected time of first return to Taco Bell**.

Build an ensemble of 1,000 simulations of how many blocks Chris must travel to return to Taco Bell, given that he starts at Taco Bell in the neighborhood depicted in the figure above.

We can estimate the expected time of first return to Taco Bell using the **mean** number of blocks Chris must travel in order to make it back to those tasty tacos. Report this estimate based on your simulation results.

In [35]:
neighborhood = Neighborhood(8, 5, (2,1))
n_trials = 1000
total = 0

for i in range(n_trials):
    agent_chris = Agent("Chris", (2,1), neighborhood)
    agent_chris.walk()
    while not agent_chris.at_taco_bell():
        agent_chris.walk()
    total += len(agent_chris.moves_made) - 1
average = total/n_trials
print("Avg num blocks to return to Taco Bell: ", average)


Avg num blocks to return to Taco Bell:  47.646


<a id='p3d'></a>

### (3d)

Your friend Dan also is leaving the party.  Dan leaves just after Chris.  Dan, however, has a preference for heading west (left, in the $-x$ direction) when he can, for that is where Ralphie roams.  In fact, his preference for being nearer to Ralphie is so strong that if it is possible for him to move west, then he does so 60% of the time, with the other 40% of the probability distributed equally among his other options.

Revise the `Agent` class - or better yet, *sub-class it*, to represent Dan's strange yet endearing decision-making for which random direction to move in.

In [36]:
class doubleAgent(Agent):
   
    def random_move(self):
        '''Return a random move out of the available moves
        from the agent`s current location'''
        options = self.available_moves()
        if (options[0][0] == self.current_loc[0]-1): #if west is available
            buffalo_calling = np.random.choice(['Yeh', 'Neh'], p = [0.6, 0.4])
            if buffalo_calling == 'Yeh': x = 0
        x = randint(0, len(options)-1)
        self.moves_made.append(options[x])
        return options[x]

In [40]:
neighborhood = Neighborhood(8,5, (2,1))

agent_dan = doubleAgent('Dan', (6,3), neighborhood)
agent_dan.walk()
print(agent_dan.current_loc)

(6, 2)


Now simulate 1,000 samples for how many blocks Dan will need to travel in order to make it to Taco Bell from the party.  Report the expected number of blocks Dan must travel to get some delicious chalupas.

In [41]:
neighborhood = Neighborhood(8, 5, (2,1))
n_trials = 1000
total = 0

for i in range(n_trials):
    agent_dan = Agent("Dan", (6,3), neighborhood)
    agent_dan.walk()
    while not agent_dan.at_taco_bell():
        agent_dan.walk()
    total += len(agent_dan.moves_made) - 1
average = total/n_trials
print("Avg num blocks to get to Taco Bell: ", average)

Avg num blocks to get to Taco Bell:  125.072


<a/ id='p3e'></a>

### (3e)

What is Dan's expected time of first return to Taco Bell, as measured by the number of blocks traveled?

In [42]:
neighborhood = Neighborhood(8, 5, (2,1))
n_trials = 1000
total = 0

for i in range(n_trials):
    agent_dan = Agent("Dan", (2,1), neighborhood)
    agent_dan.walk()
    while agent_dan.at_taco_bell() == False:
        agent_dan.walk()
    total += len(agent_dan.moves_made) - 1
average = total/n_trials
print("Avg num blocks to return to Taco Bell: ", average)

Avg num blocks to return to Taco Bell:  46.528


### (3f)

Consider the smaller neighborhood depicted below, with only 3 north-south streets, 2 east-west streets, and a Taco Bell located at $(1,1)$.

<img src="http://www.cs.colorado.edu/~tonyewong/home/resources/random_walk_to_taco_bell_small.png" style="width: 250px;"/>

There are 6 distinct states in the state space, one for each of the 6 intersections.  Recall that the transition probability matrix denotes the probability of moving from the state given by the row of the matrix to the state given by the column.

Recall that the transition probability matrix is given by

$$T = \left[\begin{array}{cccccc} 
  q((0,0),(0,0)) & q((0,0),(1,0)) & q((0,0),(2,0)) & q((0,0),(0,1)) & q((0,0),(1,1)) & q((0,0),(2,1)) \\
  q((1,0),(0,0)) & q((1,0),(1,0)) & q((1,0),(2,0)) & q((1,0),(0,1)) & q((1,0),(1,1)) & q((1,0),(2,1)) \\
  q((2,0),(0,0)) & q((2,0),(1,0)) & q((2,0),(2,0)) & q((2,0),(0,1)) & q((2,0),(1,1)) & q((2,0),(2,1)) \\
  q((0,1),(0,0)) & q((0,1),(1,0)) & q((0,1),(2,0)) & q((0,1),(0,1)) & q((0,1),(1,1)) & q((0,1),(2,1)) \\
  q((1,1),(0,0)) & q((1,1),(1,0)) & q((1,1),(2,0)) & q((1,1),(0,1)) & q((1,1),(1,1)) & q((1,1),(2,1)) \\
  q((2,1),(0,0)) & q((2,1),(1,0)) & q((2,1),(2,0)) & q((2,1),(0,1)) & q((2,1),(1,1)) & q((2,1),(2,1)) \\  
\end{array} \right]$$

where $q(s_1, s_2)$ is the probability of moving from state $s_1$ to state $s_2$ in one step.

**For example:**  If Dan is currently at $(1,0)$, then:
* $q((1,0),(0,0))=0.6$
* $q((1,0),(2,0))=0.2$
* $q((1,0),(1,1))=0.2$
* $q((1,0),(0,1))=q((1,0),(2,1))=0$
* $q((1,0),(1,0))=0$ because Dan must move somewhere.

This row is already filled in for you.

Finish filling in the rest of the transition probability matrices below for each of Chris and Dan.

In [43]:
                # (0,0)
T_dan = np.array([[ 0 , 0.5, 0  , 0.5, 0  , 0  ],  #(0,0)
                  [0.6, 0  , 0.2, 0  , 0.2, 0  ], 
                  [ 0 , 0.6, 0  , 0  , 0  , 0.4], 
                  [0.5, 0  , 0  , 0  , 0.5, 0  ],
                  [ 0 , 0  , 0.2, 0.6, 0.2, 0  ], 
                  [ 0 , 0  , 0.4, 0  , 0.6, 0  ]]) #(2,1)

                #   (0,0)
T_chris = np.array([[ 0 , 0.5, 0  , 0.5, 0  , 0  ], #(0,0)
                    [1/3, 0  , 1/3, 0  , 1/3, 0  ], 
                    [ 0 , 0.5, 0  , 0  , 0  , 0.5], 
                    [0.5, 0  , 0  , 0  , 0.5, 0  ], 
                    [ 0 , 1/3, 1/3, 1/3, 0  , 0  ], 
                    [ 0 , 0  , 0.5, 0  , 0.5, 0  ]]) #(2,1)

Use your transition probability matrices to estimate the probabilities (independently) that each of Chris and Dan will be at Taco Bell in 2 moves, given that they starts at $(0,0)$.

In [44]:
print("Dan: \n", np.matmul(T_dan, T_dan))
print("Chris: \n", np.matmul(T_chris, T_chris))

Dan: 
 [[ 0.55  0.    0.1   0.    0.35  0.  ]
 [ 0.    0.42  0.04  0.42  0.04  0.08]
 [ 0.36  0.    0.28  0.    0.36  0.  ]
 [ 0.    0.25  0.1   0.55  0.1   0.  ]
 [ 0.3   0.12  0.04  0.12  0.34  0.08]
 [ 0.    0.24  0.12  0.36  0.12  0.16]]
Chris: 
 [[ 0.41666667  0.          0.16666667  0.          0.41666667  0.        ]
 [ 0.          0.44444444  0.11111111  0.27777778  0.          0.16666667]
 [ 0.16666667  0.          0.41666667  0.          0.41666667  0.        ]
 [ 0.          0.41666667  0.16666667  0.41666667  0.          0.        ]
 [ 0.27777778  0.16666667  0.11111111  0.          0.27777778  0.16666667]
 [ 0.          0.41666667  0.16666667  0.16666667  0.          0.25      ]]


What is the probability that either are at Taco Bell after 3 moves, if they start from $(0,0)$?  Why is this?

In [45]:
chris = (T_chris[0,1]*T_chris[1,4])+(T_chris[0,3]*T_chris[3,4])
dan = (T_dan[0,1]*T_dan[1,4])+(T_dan[0,3]*T_dan[3,4])
print("Chris: {}, Dan: {}".format(chris, dan))

Chris: 0.41666666666666663, Dan: 0.35


**Your answer:**

At any point in our undirected graph we are, at most, 2 moves from Taco Bell. If we were to take 3 steps in any direction (including stepping backwards/retracing our steps), we would *never* be at Taco Bell, as Taco Bell is on an even number of edges away (and we would, therefore, overshoot Taco Bell).

However, if we were to change the parity of our graph's edges, it would be a different story. As of now, however, the probability would be 0.

<a/ id='p3g'></a>

### (3g)

If Chris and Dan both spend all of their time walking around and eating at Taco Bell, what is the long-run proportion of Chris and Dan's time that each will spend at Taco Bell?  That is, if you run the random walk simulation for a very, very long time, what is your estimate for the proportion of states in the Markov chain that are at Taco Bell?

First, estimate these probabilities using long Markov chain random walk simulations for each of Chris and Dan. About 100,000 moves should be enough to stabilize your estimates.

In [46]:
neighborhood = Neighborhood(7,2,(1,1))
agent_chris = Agent('Chris', (0,0), neighborhood)
agent_dan = Agent('Dan', (0,0), neighborhood)
trials = 100000
dan = 0
chris = 0

for i in range(trials):
    agent_chris.walk()
    agent_dan.walk()
    if agent_chris.at_taco_bell():
        chris += 1
    if agent_dan.at_taco_bell():
        dan += 1
        
print("Proportion of time Chris is at Taco Bell: {}%".format(chris/trials))
print("Proportion of time Dan is at Taco Bell: {}%".format(dan/trials))

Proportion of time Chris is at Taco Bell: 0.05398%
Proportion of time Dan is at Taco Bell: 0.05378%


Now estimate these probabilities using your transition probability matrices.  You should notice something a little bit strange when you compare these results against those from the simulations above.  Provide a couple sentences interpreting these results.

It would seem that the results are following a schema of some sort. They're periodically resetting (every three columns, in this case). 

My best guess would be to assume that this is due to the nature of the graphical structure of our city. Since we are alternating from an even and odd parity edge at each step, the probability that we reach Taco Bell varies. 

Likewise, Dan's is a little more turbulent as he's more likely to move West if he can, and therefore will have a harder time reaching Taco Bell without first getting in front of it. 

Chris, on the other hand, has a much more periodic structure, as he moves randomly. 

**Your answer:**

In [47]:
C = T_chris
D = T_dan
for i in range(100000):
    C = np.matmul(C, T_chris)
    D = np.matmul(D, T_dan)
print('100,000 moves: Chris = \n{}'.format(C))
print('100,000 moves: Dan = \n{}'.format(D))

100,000 moves: Chris = 
[[ 0.14718615  0.23376623  0.19047619  0.13852814  0.19480519  0.0952381 ]
 [ 0.14718615  0.23376623  0.19047619  0.13852814  0.19480519  0.0952381 ]
 [ 0.14718615  0.23376623  0.19047619  0.13852814  0.19480519  0.0952381 ]
 [ 0.14718615  0.23376623  0.19047619  0.13852814  0.19480519  0.0952381 ]
 [ 0.14718615  0.23376623  0.19047619  0.13852814  0.19480519  0.0952381 ]
 [ 0.14718615  0.23376623  0.19047619  0.13852814  0.19480519  0.0952381 ]]
100,000 moves: Dan = 
[[ 0.22556391  0.16917293  0.09398496  0.2481203   0.22556391  0.03759398]
 [ 0.22556391  0.16917293  0.09398496  0.2481203   0.22556391  0.03759398]
 [ 0.22556391  0.16917293  0.09398496  0.2481203   0.22556391  0.03759398]
 [ 0.22556391  0.16917293  0.09398496  0.2481203   0.22556391  0.03759398]
 [ 0.22556391  0.16917293  0.09398496  0.2481203   0.22556391  0.03759398]
 [ 0.22556391  0.16917293  0.09398496  0.2481203   0.22556391  0.03759398]]


<br><br><br>

<a id='helpers'></a>

---

[Back to top](#top)

## Some things that might be useful

Easiest way to start:  Click this cell, go to "Cell" in the toolbar above, and click "Run All Below"

In [4]:
from scipy import stats
import unittest
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Unit tests

In [5]:
class Tests_Problem1(unittest.TestCase):
    def setUp(self):
        self.p1 = BayesNode('p1', '', [T,F], 0.3)
        self.p2 = BayesNode('p2', '', [T,F], 0.6)
        self.c  = BayesNode('c', ['p1', 'p2'], [T,F], {(T,T):0.1, (T,F):0.2, (F,T):0.3, (F,F):0.4})
    def test_onenode(self):
        self.assertEqual(P(self.p1, T), 0.3)
    def test_twonode(self):
        self.assertEqual(P(self.c, F, {'p1':T, 'p2':F}), 0.8)
        
class Tests_Problem3(unittest.TestCase):
    def test_moves_corner(self):
        nh = Neighborhood(2,2, (1,1))
        chris = Agent('Chris', (0,0), nh)
        self.assertTrue(((1,0) in chris.available_moves()) and ((0,1) in chris.available_moves()))
    def test_moves_center(self):
        nh = Neighborhood(3,3, (1,1))
        chris = Agent('Chris', (1,1), nh)
        self.assertTrue(((1,0) in chris.available_moves()) and ((0,1) in chris.available_moves()) and
                        ((1,2) in chris.available_moves()) and ((2,1) in chris.available_moves()))
    def test_tacos(self):
        nh = Neighborhood(3,3, (1,1))
        chris = Agent('Chris', (1,1), nh)
        self.assertTrue(chris.at_taco_bell())

[Back to top](#top)