Notebook to accompany the lecture on Decision Theory <br>
By Jimmy Mulder, Hogeschool Utrecht

Imports

In [None]:
import random
import numpy as np
import matplotlib.pyplot as plt
import math

data = (np.random.randn(100) * 2 + 30)

Binomial example: adapt this code so that it matches the figure in slide 7

In [None]:
coinflip_outcomes = random.choices(["heads","tails"], weights=[50,50], k=5)
print(coinflip_outcomes)
num_heads = coinflip_outcomes.count("heads")
plt.hist(num_heads)
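One possible direction, as a rough sketch only: the cell above counts the heads of a single 5-flip experiment, so there is only one number to plot. The sketch below assumes the figure on slide 7 shows how often each number of heads occurs when the experiment is repeated many times. The number of repetitions (`num_experiments`) and the seeded generator `rng` are assumptions made for illustration, not part of the lecture material.

```python
import math
import random
from collections import Counter

rng = random.Random(0)  # local, seeded generator so the run is repeatable

flips_per_experiment = 5
num_experiments = 10_000  # assumption: any reasonably large number will do

# Repeat the 5-flip experiment many times and record the number of heads each time.
heads_counts = [
    rng.choices(["heads", "tails"], weights=[50, 50], k=flips_per_experiment).count("heads")
    for _ in range(num_experiments)
]
tally = Counter(heads_counts)

# Exact binomial probabilities for a fair coin, for comparison.
exact = {
    k: math.comb(flips_per_experiment, k) * 0.5 ** flips_per_experiment
    for k in range(flips_per_experiment + 1)
}

for k in range(flips_per_experiment + 1):
    print(k, "heads: simulated", tally[k] / num_experiments, "exact", exact[k])
```

The list `heads_counts` holds one value per experiment, which is the kind of input a histogram needs.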


Discrete example: Add to this code to recreate the figure in slide 10

In [None]:
die_roll = random.randint(1,6)
print(die_roll)
plt.hist(die_roll)
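As a hedged starting point: a single roll gives a single bar, so the sketch below assumes the figure on slide 10 shows the relative frequency of each face over many rolls. The number of rolls and the seeded generator `rng` are illustrative assumptions.

```python
import random
from collections import Counter

rng = random.Random(1)  # local, seeded generator so the run is repeatable
num_rolls = 6_000       # assumption: chosen only for illustration

# Roll a fair six-sided die many times and count how often each face appears.
rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
frequencies = Counter(rolls)

for face in range(1, 7):
    print("face", face, "relative frequency", frequencies[face] / num_rolls)

# The sample mean should be close to the theoretical expected value of 3.5.
print("sample mean:", sum(rolls) / num_rolls)
```

With more rolls, the relative frequencies should settle closer to 1/6 each.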

Lottery example: try to recreate the plots on slide 11. 
Increase the ticket cost and add code to calculate your odds of making a profit

In [None]:
number_of_players = 10
number_of_tickets = 5
ticket_cost = 0
all_outcomes = [None] * number_of_players
for i in range(0, number_of_players):
    lottery_outcomes = random.choices([0, 5, 20, 100], weights=[84,10,5,1], k=number_of_tickets)
    all_outcomes[i] = sum(lottery_outcomes)
plt.hist(all_outcomes, bins=50)
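A minimal sketch of the profit calculation, under stated assumptions: the prizes and weights are taken from the cell above, while the ticket price of 3 and the large number of simulated players are hypothetical values picked for illustration. It computes the expected prize per ticket directly from the weights and estimates the chance of ending with a profit by simulation.

```python
import random

rng = random.Random(2)  # local, seeded generator so the run is repeatable

prizes = [0, 5, 20, 100]
weights = [84, 10, 5, 1]   # same payout scheme as the notebook cell
number_of_tickets = 5
ticket_cost = 3            # hypothetical price per ticket
number_of_players = 10_000 # assumption: many players give a smoother estimate

# Expected prize per ticket: each prize weighted by its probability.
expected_prize_per_ticket = sum(p * w for p, w in zip(prizes, weights)) / sum(weights)

# Simulate every player's profit: total winnings minus the total ticket cost.
profits = []
for _ in range(number_of_players):
    winnings = sum(rng.choices(prizes, weights=weights, k=number_of_tickets))
    profits.append(winnings - ticket_cost * number_of_tickets)

share_with_profit = sum(1 for profit in profits if profit > 0) / number_of_players

print("expected prize per ticket:", expected_prize_per_ticket)
print("estimated chance of a profit:", share_with_profit)
```

Comparing `expected_prize_per_ticket` with `ticket_cost` shows whether the lottery pays off on average, while `share_with_profit` estimates how often an individual player comes out ahead.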


Continuous example: try varying the bin size (i.e. decrease the interval) and see what happens

In [None]:
from scipy.stats import norm
bin_size = 0.5
plt.hist(data, density = True, bins=np.arange(25,36,bin_size))

mu, std = norm.fit(data) 

# Plot the PDF.
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mu, std)
  
plt.plot(x, p, 'k', linewidth=2)
  
plt.show()
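To see the effect of the bin size in numbers rather than in a plot, here is a small standard-library sketch. It is not the notebook's method: it swaps `scipy.stats.norm` for `statistics.NormalDist`, draws its own stand-in `samples` with the same mean and spread as `data`, and uses a hypothetical helper `density_histogram` over the assumed range 24 to 36.

```python
import random
from statistics import NormalDist

rng = random.Random(3)  # local, seeded generator so the run is repeatable
samples = [rng.gauss(30, 2) for _ in range(100)]  # stand-in for the notebook's `data`

fitted = NormalDist.from_samples(samples)  # sample mean and standard deviation

def density_histogram(values, low, high, bin_size):
    """Return (left_edge, density) pairs, where density = count / (n * bin_size)."""
    num_bins = round((high - low) / bin_size)
    counts = [0] * num_bins
    for value in values:
        if low <= value < high:
            index = min(int((value - low) / bin_size), num_bins - 1)
            counts[index] += 1
    return [(low + i * bin_size, count / (len(values) * bin_size))
            for i, count in enumerate(counts)]

for bin_size in (2.0, 0.5):
    print("bin size", bin_size)
    for left_edge, density in density_histogram(samples, 24, 36, bin_size):
        centre = left_edge + bin_size / 2
        print(f"  {left_edge:5.1f}  histogram={density:.3f}  fitted pdf={fitted.pdf(centre):.3f}")
```

Because each bar height is divided by the bin width, the bars stay on the same scale as the fitted density for both bin sizes. With only 100 samples, the narrower bins will typically look noisier.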

Let's dig a little deeper into the marshmallow experiment. First, decide for yourself what the utility function for marshmallows is. For me, it looks something like this:

In [None]:
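def MM_utility(number_of_marshmallows):
    if number_of_marshmallows < 10:
        # Enjoyment grows logarithmically: each extra marshmallow adds less than the one before.
        return math.log(number_of_marshmallows, 2)
    else:
        # From 10 marshmallows on, subtract a nausea penalty (zero at exactly 10) that keeps growing.
        return math.log(number_of_marshmallows, 2) - math.pow(number_of_marshmallows - 10, 1.1)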

MMlist = []
for i in range (1,20):
    MMlist.append(MM_utility(i))

plt.plot(range(1,20),MMlist)
plt.xlabel("number of marshmallows")
plt.ylabel("funpoints")

As you can see in this plot, I enjoy 2 marshmallows more than 1, but after 10 marshmallows I start to get nauseous, and after about 13 I regret eating any at all. <br>
Try to come up with your own utility function!

In [None]:
def MM_utility(number_of_marshmallows):
    return 0 #add your code here

MMlist = []
for i in range (1,20):
    MMlist.append(MM_utility(i))

plt.plot(range(1,20),MMlist)
plt.xlabel("number of marshmallows")
plt.ylabel("funpoints")

Now let's think about the experiment: in which case is it rational for you to wait, and when is it more rational to simply eat the marshmallow right away? Let's vary the number of marshmallows, the amount of trust you have in the psychologists, and how long you have to wait (or how much you hate waiting)! Then go back, change your utility function, and try again.

In [None]:
Utility_one_MM = MM_utility(1)

amount_of_extra_marshmallows = 1 # try varying this variable.
Utility_of_extra_marshmallows = MM_utility(amount_of_extra_marshmallows + 1)
trust_level = 0.8 # probability that the experimenter keeps their promise. try varying this variable.

waiting_penalty = 1.4 # an arbitrary amount of funpoints. try varying this variable.

expected_utility_of_waiting = Utility_of_extra_marshmallows * trust_level - waiting_penalty

if (Utility_one_MM > expected_utility_of_waiting):
    print("eat the marshmallow!")
else:
    print("have some patience!")
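The comparison above can also be turned around: instead of fixing the trust level, ask how much trust is needed before waiting pays off. Waiting wins when `U(extra + 1) * trust - penalty > U(1)`, so the break-even point is `trust = (U(1) + penalty) / U(extra + 1)`. The sketch below is one way to compute it. `example_utility` is a copy of the author's example utility function from earlier in the notebook, and `break_even_trust` is a hypothetical helper name introduced here.

```python
import math

def example_utility(number_of_marshmallows):
    """Copy of the author's example utility function from earlier in the notebook."""
    if number_of_marshmallows < 10:
        return math.log(number_of_marshmallows, 2)
    return math.log(number_of_marshmallows, 2) - math.pow(number_of_marshmallows - 10, 1.1)

def break_even_trust(extra_marshmallows, waiting_penalty, utility=example_utility):
    """Trust level at which waiting and eating right away have equal expected utility."""
    utility_now = utility(1)
    utility_later = utility(extra_marshmallows + 1)
    if utility_later <= 0:
        return math.inf  # the reward has no positive utility, so waiting never pays off
    return (utility_now + waiting_penalty) / utility_later

for extra in (1, 3, 7):
    print(extra, "extra marshmallows: break-even trust", break_even_trust(extra, waiting_penalty=1.4))
```

A result above 1 means no level of trust is high enough. With the author's example function, a penalty of 1.4, and a single extra marshmallow, the threshold is 1.4, so waiting only becomes rational once the reward grows or the penalty shrinks.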

Advanced topics: exploration vs exploitation. In this exercise, we'll play the slot machines and try to find the optimal strategy. Follow the instructions in the comments to recreate the experiment we discussed during the lecture. Then vary the payout schemes and the length of the exploration phase. How are these two concepts related to each other?

In [None]:
def play_slot_machine(machine_number):
    slot_machine_1_payout = 100
    slot_machine_1_payout_chance = 0.1 # slot machine 1 gives a 10% chance of winning 100 dollars. Now add 2 more machines!

    if (machine_number == 1):
        return random.choices([slot_machine_1_payout,0], weights=[slot_machine_1_payout_chance,1 - slot_machine_1_payout_chance], k=1)[0] # add the other 2 machines

total_rounds = 1000
exploration_rounds_per_machine = 100

for i in range(exploration_rounds_per_machine): # range(n) runs exactly n rounds; range(1, n) would run one fewer
    print(play_slot_machine(1)) # adapt or add to this code to use and store the output of all 3 machines, so you can calculate the average profit of each machine!

exploration_phase_profit = 0 # change!

best_machine_number = 1 # write logic for this

exploitation_phase_profit = 0 # leave this at 0
for i in range(total_rounds - exploration_rounds_per_machine * 3): # for the remaining rounds, exploit the best machine
    exploitation_phase_profit += play_slot_machine(best_machine_number)

print("total profit is: " + str(exploration_phase_profit + exploitation_phase_profit))
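For reference, here is one possible shape of an explore-then-exploit loop, written as a self-contained sketch rather than as the intended solution. Machine 1 matches the notebook; the payouts and chances of machines 2 and 3 are hypothetical, as are the names `machines` and `play`.

```python
import random

rng = random.Random(4)  # local, seeded generator so the run is repeatable

# machine number -> (payout, chance of winning).
# Machine 1 is from the notebook; machines 2 and 3 are hypothetical examples.
machines = {
    1: (100, 0.10),
    2: (20, 0.40),
    3: (500, 0.03),
}

def play(machine_number):
    """Return the payout of one round on the given machine (0 if it does not pay out)."""
    payout, chance = machines[machine_number]
    return payout if rng.random() < chance else 0

total_rounds = 1000
exploration_rounds_per_machine = 100

# Exploration: play every machine equally often and keep all results.
exploration_results = {
    m: [play(m) for _ in range(exploration_rounds_per_machine)] for m in machines
}
exploration_profit = sum(sum(results) for results in exploration_results.values())

# Pick the machine with the highest average payout observed so far.
average_payout = {m: sum(r) / len(r) for m, r in exploration_results.items()}
best_machine = max(average_payout, key=average_payout.get)

# Exploitation: spend all remaining rounds on that machine.
exploitation_rounds = total_rounds - exploration_rounds_per_machine * len(machines)
exploitation_profit = sum(play(best_machine) for _ in range(exploitation_rounds))

print("observed averages:", average_payout)
print("best machine after exploration:", best_machine)
print("total profit:", exploration_profit + exploitation_profit)
```

A short exploration phase leaves more rounds for exploitation, but it raises the risk that a rare, high-payout machine looks worse than it really is. Changing `exploration_rounds_per_machine` and the entries in `machines` makes that trade-off visible.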