
1. define the attributes
2. define the similarity functions
3. define the options

In [1]:
pip install pyibl --upgrade

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pyibl
  Downloading pyibl-5.0-py3-none-any.whl (18 kB)
Collecting ordered-set
  Downloading ordered_set-4.1.0-py3-none-any.whl (7.6 kB)
Collecting pyactup>=2.0
  Downloading pyactup-2.0.9-py3-none-any.whl (18 kB)
Collecting pylru
  Downloading pylru-1.2.1-py3-none-any.whl (16 kB)
Installing collected packages: pylru, ordered-set, pyactup, pyibl
Successfully installed ordered-set-4.1.0 pyactup-2.0.9 pyibl-5.0 pylru-1.2.1


The choices an agent decides between are not limited to atomic entities like those used in the examples above. They can also be structured using “attributes,” which must be declared when the agent is created.

As a concrete example, we’ll have our agent decide which of two buttons, 'left' or 'right', to push. On each trial exactly one of these buttons will be illuminated. Which one is illuminated is decided randomly, with even chances for either. Pushing the left button earns a base reward of 1, and the right button a base reward of 2; but when a button is illuminated its reward is doubled.
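
The reward rule just described can be tabulated with a few lines of plain Python. This is only an illustrative sketch: the names `reward` and `payoffs` are made up for this example and are not part of PyIBL.

```python
def reward(button: str, illuminated: bool) -> int:
    """Payoff rule from the text: base 1 (left) or 2 (right), doubled when lit."""
    base = 1 if button == "left" else 2
    return base * 2 if illuminated else base

# Tabulate all four button/illumination combinations.
payoffs = {
    (button, lit): reward(button, lit)
    for button in ("left", "right")
    for lit in (False, True)
}
for (button, lit), value in payoffs.items():
    print(f"{button:5} illuminated={lit!s:5} -> {value}")
```

Note that an illuminated left button and a non-illuminated right button both pay 2.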

We’ll define our agent to have two attributes, "button" and "illuminated". The first is which button, and the second is whether or not that button is illuminated. In this example the "button" value is the decision to be made, and the "illuminated" value represents the context, or situation, in which this decision is being made.

We’ll start by creating an agent, and two choices, one for each button.

In [5]:
from pyibl import Agent
from random import random
import math
import sys
a = Agent(["button", "illuminated"], default_utility=5)
left = { "button": "left", "illuminated": False }
right = { "button": "right", "illuminated": False }

While we’ve created them both with the button un-illuminated, the code that actually runs the experiment will turn one of them on, randomly.

In [3]:
def push_button():
    # Light exactly one button at random, with even chances for either.
    left["illuminated"] = random() <= 0.5
    right["illuminated"] = not left["illuminated"]
    result = a.choose([left, right])
    # Base reward is 1 for left and 2 for right, doubled if the chosen button is lit.
    reward = 1 if result["button"] == "left" else 2
    if result["illuminated"]:
        reward *= 2
    a.respond(reward)
    return result

push_button()

{'button': 'right', 'illuminated': False}

Now we’ll reset the agent, and then run it 2,000 times, counting how many times each button is picked, and how many times an illuminated button is picked.

In [4]:
a.reset()
results = {'left': 0, 'right': 0, True: 0, False: 0}
for _ in range(2000):
    result = push_button()
    results[result["button"]] += 1
    results[result["illuminated"]] += 1

results

{'left': 493, 'right': 1507, True: 1514, False: 486}

As we might have expected, the right button is favored, as are illuminated ones; but since an illuminated left button is worth exactly as much as a non-illuminated right one, neither preference completely dominates.
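
To see where these proportions come from, it may help to enumerate the only two situations that can occur, since exactly one button is lit on each trial. The sketch below is plain Python with hypothetical names (`reward`, `contexts`, `share_right`, `share_lit`); it assumes an idealized chooser that always takes the better payoff and splits ties evenly, which is a simplification rather than a description of how the IBL agent actually decides.

```python
def reward(button, illuminated):
    """Payoff rule from the text: base 1 (left) or 2 (right), doubled when lit."""
    base = 1 if button == "left" else 2
    return base * 2 if illuminated else base

# Exactly one button is lit per trial, so there are only two contexts.
contexts = {
    lit_button: {b: reward(b, b == lit_button) for b in ("left", "right")}
    for lit_button in ("left", "right")
}

share_right = 0.0  # how often the idealized chooser picks the right button
share_lit = 0.0    # how often it picks whichever button is lit
for lit_button, payoffs in contexts.items():
    best = max(payoffs.values())
    winners = [b for b, v in payoffs.items() if v == best]
    for b in winners:
        weight = 0.5 / len(winners)  # each context occurs with probability 0.5
        share_right += weight * (b == "right")
        share_lit += weight * (b == lit_button)

print(contexts)
print(share_right, share_lit)
```

When the left button is lit both buttons pay 2, so the idealized chooser is indifferent; when the right button is lit, right pays 4 against 1. Under these assumptions both shares come out to 0.75, which is close to the counts observed above (1507 and 1514 out of 2000).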

# Partial matching

In the previous examples, prior experience only applied if the prior decisions, or their contexts, exactly matched the ones being considered for the current choice. But often we want to choose the option that most closely matches, though not necessarily exactly, for some definition of “closely.” To do this we define a similarity function for those attributes we want to partially match, and specify a mismatch_penalty parameter.

In this example there will be a continuous function, f(), that maps a number between zero and one to a reward value. At each round the model will be passed five random numbers between zero and one, and be asked to select the one that it expects will give the greatest reward. We’ll start by defining an agent that expects choices to have a single attribute, n.

In [6]:
a = Agent(["n"])

We’ll define a similarity function for that attribute: a function of two arguments, the two values of the attribute to be compared. When the values are the same the similarity should be 1, and when they are maximally dissimilar, 0. The similarity function we’re supplying is linear in the distance between its arguments: it is 0 when one argument is 0 and the other is 1, and rises to 1 as the arguments approach each other. So, for example, 0.31 and 0.32 have a large similarity, 0.99, but 0.11 and 0.93 have a small similarity, 0.18.
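
The example values quoted above can be checked directly with a standalone version of the same linear similarity. The function name `linear_similarity` is chosen here for illustration only.

```python
def linear_similarity(x: float, y: float) -> float:
    """1 when the values are equal, falling linearly to 0 at a distance of 1."""
    return 1 - abs(x - y)

close_pair = linear_similarity(0.31, 0.32)
far_pair = linear_similarity(0.11, 0.93)
print(f"{close_pair:.2f}")  # 0.99
print(f"{far_pair:.2f}")    # 0.18
```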

In [9]:
a.similarity(["n"], lambda x, y: 1 - abs(x - y))
print(a.similarity)

<bound method Agent.similarity of <Agent agent-3 139866362101280>>


The mismatch_penalty is a non-negative number that says how much to penalize past experiences for poor matches to the options currently under consideration. The larger its value, the more mismatches are penalized. We’ll experiment with different values of the mismatch_penalty in our model.
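
As a rough intuition for how the penalty interacts with similarity, the sketch below assumes the usual ACT-R-style form of partial matching, in which a past experience's activation is reduced by the penalty times (1 − similarity). That formula and the name `mismatch_cost` are assumptions made for illustration; this is not PyIBL's internal code.

```python
def mismatch_cost(mismatch_penalty: float, similarity: float) -> float:
    """Assumed ACT-R-style cost: how much activation a partial match loses."""
    return mismatch_penalty * (1 - similarity)

# Compare a near match (0.99) with a poor match (0.18) at several penalties.
for penalty in (0, 1, 5):
    near = mismatch_cost(penalty, 0.99)
    poor = mismatch_cost(penalty, 0.18)
    print(f"penalty={penalty}: near match loses {near:.2f}, poor match loses {poor:.2f}")
```

With a penalty of 0 the similarity has no effect at all, while a larger penalty increasingly favors experiences that closely resemble the current option.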

Let’s define a function that will run our model, with the number of iterations, the mismatch_penalty, and the reward function supplied as parameters. Note that we reset the agent at the beginning of this function. We then supply one starting datum for the model to use: the value of the reward function when applied to zero. After asking the agent to choose one of five randomly generated values, our run_model function works out which would have given the highest reward, and keeps track of how often the model made that choice. We round the running fraction of correct choices to two decimal places so that it displays nicely.

In [10]:
def run_model(trials, mismatch, f):
    a.reset()
    a.mismatch_penalty = mismatch
    # Seed the model with a single experience: the reward at n = 0.
    a.populate([{"n": 0}], f(0))
    number_correct = 0
    fraction_correct = []
    for t in range(trials):
        # Offer five random values in [0, 1) and let the agent pick one.
        options = [ {"n": random()} for _ in range(5) ]
        choice = a.choose(options)
        best = -sys.float_info.max
        best_choice = None
        for o in options:
            v = f(o["n"])
            if o == choice:
                # Tell the agent the reward its own choice earned.
                a.respond(v)
            if v > best:
                best = v
                best_choice = o
        if choice == best_choice:
            number_correct += 1
        # Running fraction of correct choices, rounded to two decimal places.
        fraction_correct.append(float(f"{number_correct / (t + 1):.2f}"))
    return fraction_correct
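
As a point of comparison for the fractions run_model returns, here is a sketch of the same bookkeeping with a chooser that ignores experience and picks one of the five values uniformly at random. It uses only the standard library; the function `random_baseline`, its `seed` parameter, and the identity reward function passed to it are hypothetical stand-ins, not part of the model above.

```python
from random import Random

def random_baseline(trials, f, seed=0):
    """Running fraction of trials where a purely random pick was the best option."""
    rng = Random(seed)
    number_correct = 0
    fraction_correct = []
    for t in range(trials):
        options = [rng.random() for _ in range(5)]
        choice = rng.choice(options)       # ignores all past experience
        if choice == max(options, key=f):  # was it the highest-reward option?
            number_correct += 1
        fraction_correct.append(round(number_correct / (t + 1), 2))
    return fraction_correct

# Hypothetical reward function for illustration: the reward equals the value.
baseline = random_baseline(2000, lambda x: x)
print(baseline[-1])
```

With five options, random guessing should be right about one time in five, so a model that learns anything useful should end up well above roughly 0.2.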