<a href="https://colab.research.google.com/github/erinbugbee/2023CLIHC-SpeedyIBL-Workshop/blob/main/03_Exercise_IowaGambling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exercise: Using SpeedyIBL for the Iowa Gambling Task

In the original paper (Bechara et al., 1994), the following procedure was followed:

There were 4 decks of cards (A, B, C, and D).

Participants had to choose 100 cards in total, one at a time.

Each time they chose a card, they received feedback about winning and/or losing some money.

Participants did not know what each card would yield in advance (i.e., like a lottery).

Participants started with a "loan" of \$2000 and were told to make a profit.

Decks A and B always yielded \$100.

Decks C and D always yielded \$50.

For each card chosen, there is also a 50\% chance of having to pay a penalty.

For decks A and B, the penalty is \$250.

For decks C and D, the penalty is \$50.
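
To see why decks C and D are the better long-run options, it can help to work out the expected payoff per card from the numbers above. The sketch below does only that arithmetic; the names `DECKS` and `expected_value` are illustrative assumptions and are not part of SpeedyIBL or the original study materials.

```python
# Payoff structure as described above: (gain, penalty, probability of penalty).
# These values restate the task description; the names are hypothetical helpers.
DECKS = {
    "A": (100, 250, 0.5),
    "B": (100, 250, 0.5),
    "C": (50, 50, 0.5),
    "D": (50, 50, 0.5),
}


def expected_value(gain, penalty, p_penalty):
    """Expected net payoff of drawing one card from a deck."""
    return gain - p_penalty * penalty


expected = {deck: expected_value(*params) for deck, params in DECKS.items()}
print(expected)
```

Under these payoffs, decks A and B lose \$25 per card on average, while decks C and D gain \$25 per card on average, so C and D are the options with the maximum expected reward.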

Learn more about the task here: https://www.psytoolkit.org/experiment-library/igt.html

In [None]:
# TODO: Install speedyibl

# TODO: Import speedyibl

# TODO: Define an agent

# TODO: Define the options

import random
# Define a reward function
def reward(choice):
    # Choice A or B
    if choice == "A" or choice == "B":
        r = 100
        if random.random() <= 0.5:
            r -= 250
    # Choice C or D
    else:
        r = 50
        if random.random() <= 0.5:
            r -= 50
    return r

In [None]:
# Run experiments
import time  # to measure elapsed time per trial
runs = 1000  # number of runs (simulated participants)
trials = 100  # number of trials per run (one card choice each)


def run(agent, reward, n_runs, trials):
  average_p = [] # per run: whether each choice was a maximum-expected-reward option (C or D)
  average_r = [] # per run: reward received on each trial
  average_time = [] # per run: cumulative elapsed time after each trial
  for r in range(n_runs):
    pmax = []
    rewards = []
    ttime = [0]
    agent.reset() # clear the memory for a new run
    for i in range(trials):
      start = time.time()
      # TODO: Agent chooses one option from the list of four options

      # determine the reward that agent can receive
      re = reward(choice)
      # TODO: Store the observed reward to memory of the agent

      end = time.time()
      ttime.append(ttime[-1] + end - start)
      pmax.append(choice == "C" or choice == "D")
      rewards.append(re)
    average_p.append(pmax) # save performance of each run
    average_r.append(rewards)
    average_time.append(ttime) # save time of each run
  return average_r, average_p

In [None]:
# Run for 1000 runs and 100 trials with defined reward function
average_r, average_p = run(agent, reward, runs, trials)

In [None]:
# For plotting
import matplotlib.pyplot as plt
import numpy as np

# Plot PMAX over rounds, which is the proportion of choices that are the options with the maximum reward expectation (Choice C or D)
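# The "o--" format string sets circle markers and a dashed line in one place;
# passing "o-" together with linestyle="--" defines the line style twice and triggers a matplotlib warning
plt.plot(range(trials), np.mean(np.asarray(average_p), axis=0), "o--", color="darkgreen", markersize=2, label="speedyIBL")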
plt.xlabel("Round")
plt.ylabel("PMAX")
plt.title("Performance")
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Plot Average Reward over rounds
plt.plot(range(trials), np.mean(np.asarray(average_r), axis=0), "o--", color="darkgreen", markersize=2, label="speedyIBL")
plt.xlabel("Round")
plt.ylabel("Average Reward")
plt.title("Performance")
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Initialize agent with default utility of 10 instead of 110
agent = Agent(default_utility=10)
average_r, average_p = run(agent, reward, runs, trials)

In [None]:
# Plot PMAX over rounds with default utility 10
plt.plot(range(trials), np.mean(np.asarray(average_p), axis=0), "o--", color="darkgreen", markersize=2, label="speedyIBL")
plt.xlabel("Round")
plt.ylabel("PMAX")
plt.title("Performance")
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Plot Average Reward over rounds with default utility 10
plt.plot(range(trials), np.mean(np.asarray(average_r), axis=0), "o--", color="darkgreen", markersize=2, label="speedyIBL")
plt.xlabel("Round")
plt.ylabel("Average Reward")
plt.title("Performance")
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# TODO: Create an agent with default utility 110 and different noise (default = 0.25) and decay (default = 0.5) values, then plot PMAX and Average Reward over rounds


# TODO: Run the simulation

In [None]:
# TODO: Plot PMAX over rounds

In [None]:
# TODO: Plot Average Reward over rounds