# Bayesian Theory Of Mind Problem Set


1. [Building The Model](#Building-The-Model)
  1. [Initialize The Mind](#init)
  2. [Distance Calculation](#distance-calc)
  3. [Getting Points Within A Range](#within-range)
2. [The Mind Model](#The-Mind-Model)
  1. [Updating Beliefs](#updating-beliefs)
  2. [Updating Transition Matrix](#transition-matrix)
  3. [Inferring Intent](#inferring-intent)
3. [Analysis of Bayesian Theory of Mind](#analysis)

## Building The Model

<img src="scenario.png"/>


Before we can infer intent, we need a way to represent our "world", the space in which the inference happens. Here, the world is a 15x15 square grid that our agent (in this case, Mark Watney) can move around. Resources A, B, and C sit at known locations on the grid. Mark is trying to retrieve one of them, but we're not sure which one he wants. This is where the intent comes in!

More specifically, we will write a class, `Mind`, that keeps track of our model's estimate of Mark's intent, his belief about the world, and how likely he is to be going for each resource given his location.

Below, we'll initialize this Mind class, then dive a bit deeper into how to use it.



### Initialize The Mind (5 pts) <a id="init"/>

Our first step is to initialize the Bayesian Theory of Mind (BToM). Below is a skeleton implementation of the class we'll use to model it. We've initialized the world, the possible beliefs, and the positions for you, but you'll have to do the rest. Let's initialize both our beliefs and our intents as a *uniform distribution*. That is, every belief starts out with the same probability, and so does every intent.



In [106]:
class Mind:
    def __init__(self):
        # coordinates of the three resource locations
        self.world = [(10,0), (0,9), (6,10)]
        self.map_length = 15 # side length of the square grid
        self.actual_world = 'ABC' # resource actually found at each location in self.world, in order
        self.transition_matrix = [] # filled in by update_transition_matrix
    
        # possible beliefs: every assignment of the resources to the three locations
        self.beliefs_worlds = ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
        self.prev_position = (0,0)
        self.position = (0,0)

        # uniform priors over the six possible worlds and the three intents
        self.beliefs = [1/6, 1/6, 1/6, 1/6, 1/6, 1/6]
        self.intents = {'A': 1/3,'B': 1/3,'C': 1/3}

In [108]:
from nose.tools import assert_equal

my_mind = Mind()

assert_equal(my_mind.beliefs, [1/6, 1/6, 1/6, 1/6, 1/6, 1/6])
assert_equal(my_mind.intents, {'A': 1/3,'B': 1/3,'C': 1/3})

print ("success")


success
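
As a small illustration of what "uniform" means here, the sketch below builds the same six candidate worlds and the same priors programmatically instead of typing them out. The variable names (`resources`, `belief_worlds`) are chosen for this example only and are not part of the `Mind` class.

```python
from itertools import permutations

resources = "ABC"

# Every ordering of the resources is one candidate belief about the world.
belief_worlds = ["".join(p) for p in permutations(resources)]

# Uniform prior: each candidate starts with the same probability.
beliefs = [1 / len(belief_worlds)] * len(belief_worlds)
intents = {r: 1 / len(resources) for r in resources}

print(belief_worlds)
print(beliefs)
print(intents)
```

With three resources there are 3! = 6 orderings, which is where the 1/6 in the class above comes from.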


### Distance Calculation (5 pts) <a id="distance-calc"/>

For our model to reason about the world, it needs to be able to calculate distances between points on the grid. Write a function that calculates the straight-line (Euclidean) distance between two points p1 and p2, both tuples of the form (x, y).

In [109]:
# helper to get the Euclidean distance between two (x, y) points
def get_distance(p1, p2):
    return ((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)**.5

Let's test that our distance calculation works as expected.

In [110]:
from nose.tools import assert_equal

assert_equal(get_distance((1,0), (2,0)), 1)
assert_equal(get_distance((1,1), (2,2)), (2)**0.5)
assert_equal(get_distance((1,0), (1,2)), 2)

print ("success")


success
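
For comparison, here is a minimal sketch of the same calculation using the standard library. It assumes Python 3.8 or newer, where `math.dist` is available; the name `get_distance_stdlib` is made up for this example.

```python
import math

def get_distance_stdlib(p1, p2):
    # math.dist returns the Euclidean distance between two points
    # given as equal-length sequences of coordinates.
    return math.dist(p1, p2)

print(get_distance_stdlib((1, 0), (2, 0)))
print(get_distance_stdlib((1, 1), (2, 2)))
```

Either version works for this problem set; writing the formula by hand keeps the notebook free of version requirements.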


### Getting Points Within A Range (10 pts) <a id="within-range"/>

Our next step is to find the resources that Mark can see from a given location. Implement the method below so that, given a Mind object and a location, it returns the positions of all the resources Mark can see. A resource is visible when it is at most 5 squares away in both the x and the y direction.

In [111]:
def within_range(mind, position):
    """ 
    Given a mind object and location, outputs the (x,y) coordinates of resources near the location
    
    Input: mind - a mind object which contains location of all the resources on map
           position - current position on map
    Output: locs - a list of resource locations (in x,y) visible from current position
    """
    locs = []
    for loc in mind.world:
        # visible only if within 5 squares along both axes
        if abs(position[0]-loc[0])<=5 and abs(position[1]-loc[1])<=5:
            locs.append(loc)
    return locs


In [112]:
from nose.tools import assert_equal

my_mind = Mind()

assert_equal(within_range(my_mind, (8,0)), [(10, 0)])
assert_equal(within_range(my_mind, (0,0)), [])
assert_equal((6,10) in within_range(my_mind, (3,9)) and (0,9) in within_range(my_mind, (3,9)), True)
print ("success")


success
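
The visibility rule above is the same as asking whether the Chebyshev distance (the larger of the x and y offsets) is at most 5. The sketch below spells that out on a plain list of coordinates; `chebyshev`, `visible_resources`, and the `radius` parameter are names invented for this illustration.

```python
def chebyshev(p1, p2):
    # Largest offset along either axis.
    return max(abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))

def visible_resources(resources, position, radius=5):
    # A resource is visible when both offsets are <= radius,
    # i.e. when the Chebyshev distance is <= radius.
    return [loc for loc in resources if chebyshev(loc, position) <= radius]

resources = [(10, 0), (0, 9), (6, 10)]
print(visible_resources(resources, (8, 0)))
print(visible_resources(resources, (0, 0)))
print(visible_resources(resources, (3, 9)))
```

The visible region is therefore a square centred on Mark, not a circle, which is why `get_distance` is not used for this check.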


## The Mind Model

Next, we need to write the functions that use the mind model to infer intent.


### Updating Beliefs (20 pts) <a id="updating-beliefs"/>

In this section, we'll work on updating what our BToM thinks that Mark believes.

The function below is passed a mind instance, the position at which we just saw a resource, and which resource it was. Update our probability distribution over what we think Mark believes given this new information, following the approach given in the tutorial!

Hint: If the position of the resource we just discovered matches some possible state of the world, Mark is also much more likely to think that that is the true state of the world.



In [113]:

# updates belief in the possible arrangements of resources in world
def update_beliefs(mind, position, resource):
    """ Computes the new beliefs
    
    Input: mind - the BToM
           position - Position of the resource we just discovered
           resource - Which resource is it (A, B, or C)
    Output: None, but mutates mind.beliefs"""
    
    # split the possible worlds by whether they place `resource` at `position`
    i = mind.world.index(position)
    increasing_beliefs = []
    decreasing_beliefs = []
    for j in range(len(mind.beliefs_worlds)):
        world = mind.beliefs_worlds[j]
        if world[i] == resource:
            increasing_beliefs.append(j)
        else:
            decreasing_beliefs.append(j)

    # halve the belief in every world that contradicts the observation
    sum_decrease_beliefs = 0
    for b in decreasing_beliefs:
        mind.beliefs[b] /= 2
        sum_decrease_beliefs += mind.beliefs[b]

    # give each consistent world an equal share of the leftover mass;
    # the share is added to the existing value, so mind.beliefs is not rescaled to sum to 1
    leftover = 1-sum_decrease_beliefs
    increasing_each = leftover / len(increasing_beliefs)
    for b in increasing_beliefs:
        mind.beliefs[b] += increasing_each
    

In [114]:
from nose.tools import assert_equal

my_mind = Mind()

update_beliefs(my_mind, (10,0), 'A')
assert_equal(my_mind.beliefs[0], 0.5)
update_beliefs(my_mind, (0,9), 'B')
assert_equal(my_mind.beliefs[0], 0.8125)

print ("success")


success
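
To see the arithmetic behind the two asserted values, the sketch below replays the same halve-and-boost rule on a plain list using exact fractions. It is a standalone illustration, not part of the graded code: `halve_and_boost` and `normalize` are hypothetical helpers, and whether to renormalize is a modelling choice that the tests in this notebook do not require.

```python
from fractions import Fraction

WORLD = [(10, 0), (0, 9), (6, 10)]
BELIEF_WORLDS = ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']

def halve_and_boost(beliefs, position, resource):
    """Apply the halve-and-boost rule to a plain list and return a new list."""
    slot = WORLD.index(position)
    matches = [w[slot] == resource for w in BELIEF_WORLDS]
    # Halve every world that contradicts the observation.
    updated = [b if m else b / 2 for b, m in zip(beliefs, matches)]
    # Split the leftover mass equally and add it to the consistent worlds.
    leftover = 1 - sum(b for b, m in zip(updated, matches) if not m)
    boost = leftover / sum(matches)
    return [b + boost if m else b for b, m in zip(updated, matches)]

def normalize(beliefs):
    # Optional extra step (an assumption of this sketch): rescale to sum to 1.
    total = sum(beliefs)
    return [b / total for b in beliefs]

prior = [Fraction(1, 6)] * 6
after_a = halve_and_boost(prior, (10, 0), 'A')
after_b = halve_and_boost(after_a, (0, 9), 'B')

print(after_a[0], sum(after_a))   # 1/2 4/3
print(after_b[0])                 # 13/16
print(sum(normalize(after_a)))    # 1
```

The values 1/2 and 13/16 (0.8125) match the asserts above. The sum of 4/3 shows that the raw beliefs drift away from a proper distribution; only their relative sizes matter later, because the intents are normalized when they are computed.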


### Updating Transition Matrix (20 pts) <a id="transition-matrix"/>

Next, we'll estimate the probability that Mark is looking for resource X given that he's currently at some location Y. We've already written the update method for you, so you can assume that the current position and previous position are already set correctly when this runs.

Hint: You'll probably need to use the current position and the previous position. What would make it more likely that you're going for resource X: moving towards it or away from it?



In [115]:
# gets the probabilities you are looking for resource x at location y based on current position and prev position
def update_transition_matrix(mind):
    
    '''
    Given a mind object, update its transition probability matrix after accounting for current position and previous position.
    
    Input: mind - a mind object
    Output: None, but should mutate the mind object.
    
    '''
    
    # how much closer the last move brought Mark to each resource (positive = closer)
    dists = []

    for resource_pos in mind.world:
        dist1 = get_distance(resource_pos, mind.prev_position)
        dist2 = get_distance(resource_pos, mind.position)
        diff_dist = dist1-dist2
        dists.append(diff_dist)
    print (dists)

    # shift every value up by the range (max - min) to avoid negatives, then normalize
    # e.g. differences -2, 2, 2 (range 4) -> 2, 6, 6 -> 2/14, 6/14, 6/14
    max_value = max(dists)
    min_value = min(dists)
    range_values = max_value - min_value

    for i in range(len(dists)):
        dists[i] += range_values

    total = sum(dists)
    probabilities = []
    for d in dists:
        probabilities.append(d/total)

    mind.transition_matrix = probabilities



In [116]:
from nose.tools import assert_equal

my_mind = Mind()



my_mind.position = (1,0)
update_transition_matrix(my_mind)
assert_equal(my_mind.transition_matrix, [0.44756872175704115, 0.2177541879876714, 0.3346770902552874])
my_mind.prev_position = (1,0)
my_mind.position = (1,1)

update_transition_matrix(my_mind)
assert_equal(my_mind.transition_matrix,[0.19990528833005725, 0.4109589772800559, 0.38913573438988686])


print ("success")


[1.0, -0.0553851381374173, 0.48156390219165246]
[-0.0553851381374173, 0.9931273898388682, 0.8847097465119482]
success
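
The shift-and-normalize step can be tried on its own with the small example from the code comments (differences of -2, 2, 2). The helper name `shift_and_normalize` is invented for this sketch, which works on a plain list rather than a `Mind` object.

```python
def shift_and_normalize(diffs):
    # Shift every value up by the range so the smallest one lands on the old maximum.
    spread = max(diffs) - min(diffs)
    shifted = [d + spread for d in diffs]
    # Divide by the total so the results sum to 1.
    total = sum(shifted)
    return [s / total for s in shifted]

probs = shift_and_normalize([-2, 2, 2])
print(probs)       # roughly [2/14, 6/14, 6/14]
print(sum(probs))
```

Two edge cases are worth keeping in mind with this scheme. If the position does not change, every difference is zero and the division fails with a `ZeroDivisionError`. If Mark moves away from all three resources at once, the maximum difference is negative and the shifted values stay negative. The tests in this notebook do not exercise either case.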


### Inferring Intent (20 pts) <a id="inferring-intent"/>

Last thing! We need to put it all together and infer the intent of Mark using the functions we've defined!

This function is passed a Mind object. Using it, we should compute the normalized probability that Mark's intent is to get each particular resource. Look at the tutorial for more on how this works!

More specifically, you should mutate mind.intents and update it with the new intents! We've already implemented receive_observation, which updates the model after each observation, but it relies on update_intents, which you must finish.
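
One way to write down the update that the cell below performs, as a sketch in notation introduced only for this formula: let $I_t(r)$ be the current intent for resource $r$, $b(w)$ the belief in possible world $w$, $T_k$ the transition probability for location $k$, and $k_w(r)$ the location at which world $w$ places resource $r$.

$$
I_{t+1}(r) = \frac{I_t(r)\,\sum_{w} b(w)\,T_{k_w(r)}}{\sum_{r'} I_t(r')\,\sum_{w} b(w)\,T_{k_w(r')}}
$$

The numerator weights the old intent by how well Mark's last move fits resource $r$, averaged over where he might believe $r$ is. The denominator rescales the three values so that they sum to 1, which also means any constant factor applied to every score cancels out.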


In [117]:
# updates intents
def update_intents(mind):
    '''
    Updates the intents based on beliefs, transition probabilities, and old intents.

    Input: mind - a mind object
    Output: None, but should mutate the mind object.
    '''
    
    scores = {'A': 0, 'B': 0, 'C': 0}
    for i in range(len(mind.beliefs_worlds)):
        world = mind.beliefs_worlds[i]
        belief_prob = mind.beliefs[i]
        # j indexes the location; world[j] is the resource this world places there
        for j in range(3):
            resource = world[j]
            trans_prob = mind.transition_matrix[j]
            # the constant 100 cancels when the scores are normalized below
            score = belief_prob * trans_prob * 100 * mind.intents[resource]
            scores[resource] += score

    sum_scores = scores['A'] + scores['B'] + scores['C']

    # normalize so the intents sum to 1
    new_intents = {}
    for s in scores:
        new_intents[s] = scores[s] / sum_scores

    mind.intents = new_intents

In [118]:
# the entire model pretty much runs here. this calls the other functions
def receive_observation(mind, x, y):
    mind.position = (x,y)
    near_locations = within_range(mind, (x,y))
    # every resource visible from here is an observation that updates the beliefs
    for loc in near_locations:
        i = mind.world.index(loc)
        resource = mind.actual_world[i]
        update_beliefs(mind, loc, resource)

    # update transition
    update_transition_matrix(mind)

    # update intents
    update_intents(mind)

    mind.prev_position = (x,y)

In [120]:
from nose.tools import assert_equal

my_mind = Mind()

receive_observation(my_mind,1, 0)
receive_observation(my_mind,2, 0)
receive_observation(my_mind,3, 0)
receive_observation(my_mind,4, 0)
receive_observation(my_mind,5, 0)
receive_observation(my_mind,6, 0)
receive_observation(my_mind,5, 0)
receive_observation(my_mind,4, 0)
receive_observation(my_mind,3, 0)

#{'A': 0.035648663780556554, 'B': 0.48217566810972184, 'C': 0.48217566810972157}

assert_equal(abs(my_mind.intents['A'] - 0.035648663780556554) < 0.001, True)

assert_equal(abs(my_mind.intents['B'] - 0.48217566810972184) < 0.001, True)

assert_equal(abs(my_mind.intents['C'] - 0.48217566810972157) < 0.001, True)

print ("success")

[1.0, -0.0553851381374173, 0.48156390219165246]
[1.0, -0.16415931915546977, 0.41001027322994155]
[1.0, -0.26728852321225105, 0.3300231053584568]
[1.0, -0.3620248212909658, 0.2422674817249817]
[1.0, -0.44677233919089687, 0.14816340606467904]
[1.0, -0.5210236854049679, 0.049875621120889946]
[-1.0, 0.5210236854049679, -0.049875621120889946]
[-1.0, 0.44677233919089687, -0.14816340606467904]
[-1.0, 0.3620248212909658, -0.2422674817249817]
success


## Analysis of Bayesian Theory of Mind (10 pts) <a id="analysis"/>

In the box below, highlight some of the advantages and disadvantages of Bayesian Theory of Mind. This is open-ended! What makes it different from other approaches, and in which situations does that make it a better choice? Are there any drawbacks you can think of?