<a href="https://colab.research.google.com/github/Kinds-of-Intelligence-CFI/measurement-layout-tutorial/blob/main/tutorial-notebooks/4_BuildingGoodBenchmarks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building Good Benchmarks

**Lead Presenter**: Kozzy Voudouris

In this tutorial, we introduce some key concepts that should guide the development of purpose-built benchmarks for use with measurement layouts. We incrementally build a measurement layout for studying the cognitive capability of object permanence in a complex three-dimensional environment. Finally, we evaluate some agents on a suite of tests.

## Preamble

First, let's import the libraries, functions, and data that we will need.

In [None]:
!pip install arviz --quiet
!pip install erroranalysis --quiet
!pip install numpy --quiet
!pip install pymc --quiet

In [None]:
import arviz as az
import erroranalysis as ea
import gc
import graphviz
import math
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import random as rm
import seaborn as sns

from IPython.display import Image
from scipy import stats
from sklearn.model_selection import train_test_split
from google.colab import files
from pymc import model

print(f"Running on PyMC v{pm.__version__}") #Note, colab imports an older version of PyMC by default. This won't cause problems for this tutorial, but may do if you use a different backend (e.g., gpu) and a jax/numpyro sampler. In which case, run `!pip install 'pymc>5.9' --quiet`

In [None]:
data_url = 'https://raw.githubusercontent.com/Kinds-of-Intelligence-CFI/measurement-layout-tutorial/main/data/4_PCTB_data.csv'
pctb_dataset = pd.read_csv(data_url)

Let's inspect the dataset. There are **551 instances**, **8 metafeatures**, and the performances of **7 agents**.

The metafeatures are as follows:
1. `basicTask` - is the task a basic task? Values (discrete, binary): `0` (No), `1` (Yes).
2. `pctbGridTask` - is the task a PCTB Grid task? Values (discrete, binary): `0` (No), `1` (Yes).
3. `mainGoalSize` - what size is the goal in the task? Value range: `0.5` - `5.0`.
4. `goalPosition` - what is the relative position of the goal with respect to the agent's starting position? Value range: `-1.5` - `1.5` (0 is centre point).
5. `goalOccluded` - is the goal occluded when the agent starts the episode? Values (discrete, binary): `0` (No), `1` (Yes).
6. `minDistToGoal` - how far is the goal from the agent? This is calculated as the Manhattan distance to the goal, avoiding any obstacles/pits. Value range: `9.0` - `47.0`.
7. `minNumTurnsGoal` - how many right-angle turns would the agent take on the trajectory described by `minDistToGoal`? Value range: `0.0` - `3.0`.
8. `numChoices` - how many choices does the agent have in the task? Value range: `1.0` - `12.0`.

The agents performed each of these tasks, and whether they obtained the goal (`1`) or not (`0`) was recorded. The agents are as follows:
1. `Random_Agent` - An agent which randomly samples one of the 9 actions in the Animal-AI Environment (`no action`, `forwards`, `backwards`, `left rotate`, `right rotate`, `forwards left`, `forwards right`, `backwards left`, `backwards right`). It then takes that action for a number of steps sampled from $U(1, 20)$.
2. `Heuristic_Agent` - An agent that navigates towards green goals, following a rigid rule.
3. `Dreamer_basic` - A dreamer-v3 agent trained for 2M steps on a set of 300 basic tasks, of which all tasks where `basicTask == 1` are a subset.
4. `Dreamer_basic_control` - A dreamer-v3 agent trained for 10M steps on a set of 2372 basic and practice tasks, of which all tasks where `basicTask == 1 or (pctbGridTask == 1 and goalOccluded == 0)` are a subset.
5. `PPO_basic` - A PPO agent trained for 2M steps on a set of 300 basic tasks, of which all tasks where `basicTask == 1` are a subset.
6. `PPO_basic_control` - A PPO agent trained for 10M steps on a set of 2372 basic and practice tasks, of which all tasks where `basicTask == 1 or (pctbGridTask == 1 and goalOccluded == 0)` are a subset.
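
To make the `Random_Agent` description in item 1 concrete, here is a minimal sketch of such a policy. It is not the code used to generate the dataset: the function name `random_agent_actions` is hypothetical, and it assumes that $U(1, 20)$ means a discrete uniform distribution over the integers 1 to 20 inclusive.

```python
import random

# The nine actions listed for the Animal-AI Environment above.
ACTIONS = [
    "no action", "forwards", "backwards", "left rotate", "right rotate",
    "forwards left", "forwards right", "backwards left", "backwards right",
]

def random_agent_actions(num_steps, rng):
    """Return `num_steps` actions from a random policy (hypothetical helper)."""
    actions = []
    while len(actions) < num_steps:
        action = rng.choice(ACTIONS)
        # Assumption: U(1, 20) is a discrete uniform over 1..20 inclusive.
        duration = rng.randint(1, 20)
        actions.extend([action] * duration)
    # Truncate the final run so the episode has exactly `num_steps` steps.
    return actions[:num_steps]

episode = random_agent_actions(100, random.Random(0))
print(episode[:5])
```

Because each sampled action is repeated for several steps, the agent moves in short bursts rather than jittering on the spot, which is what allows it to reach a goal by chance.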

In [None]:
pctb_dataset

# Building A Complex Measurement Layout

We will incrementally build a complex measurement layout for studying object permanence. We will evaluate this measurement layout by using the two reference agents, `Random_Agent` and `Heuristic_Agent`, as baselines. Finally, we will apply this measurement layout to the dreamer and PPO agents.

## Object Permanence

First, let's set up a simple measurement layout for the *Object Permanence* ability.

The leaf node of the measurement layout is task success, so we define it as a Bernoulli variable parameterised by a probability of success. To turn the margin between ability and demand into a probability, we need a logistic function. Because we are dealing with bounded capabilities, we scale the logistic function so that an agent with maximum ability on a task with minimum demand succeeds with probability 0.999, while an agent with minimum ability on a task with maximum demand succeeds with probability 0.001. This is a convenient parameterisation for our case of bounded capabilities.

In [None]:
def logistic999(x, min, max):    # x is first shifted by min and scaled by (max-min): a shifted x of -(max-min) gives prob 0.001, and a shifted x of (max-min) gives prob 0.999
  x = x - min
  max = max - min
  x = 6.90675478 * x / max
  return 1 / (1 + np.exp(-x))
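
As a quick sanity check of this parameterisation, the sketch below repeats the same arithmetic in a standalone function. The name `logistic999_check`, the argument names `lo` and `hi`, and the bounds 0 and 100 are assumptions chosen for illustration only.

```python
import numpy as np

def logistic999_check(x, lo, hi):
    # Same arithmetic as logistic999; `lo`/`hi` stand in for `min`/`max`.
    scaled = 6.90675478 * (x - lo) / (hi - lo)
    return 1 / (1 + np.exp(-scaled))

lo, hi = 0.0, 100.0  # hypothetical ability bounds

p_top = logistic999_check(hi, lo, hi)              # shifted margin = +(hi - lo)
p_middle = logistic999_check(lo, lo, hi)           # shifted margin = 0
p_bottom = logistic999_check(2 * lo - hi, lo, hi)  # shifted margin = -(hi - lo)

print(p_top, p_middle, p_bottom)
```

The constant 6.90675478 is approximately $\ln(999)$, which is why the extremes land on 0.999 and 0.001, and why a shifted margin of zero gives a probability of 0.5.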

In [None]:
def setupOPModel(data, agent_col_name: str):

  # get results column
  results = data[agent_col_name]

  # define bounds
  abilityMin = {}
  abilityMax = {}

  minPermAbility = ((data["minDistToGoal"] * data["numChoices"]).min())

  maxPermAbility = ((data["minDistToGoal"] * data["numChoices"]).max())

  abilityMin["objPermAbility"] = minPermAbility
  abilityMax["objPermAbility"] = maxPermAbility


  m = pm.Model()
  with m:

    # Define abilities and their priors

    objPermAbility = pm.Uniform("objPermAbility", minPermAbility, maxPermAbility)

    # Define environment variables as MutableData

    goalDist = pm.MutableData("goalDistance", data["minDistToGoal"].values)
    numChoices = pm.MutableData("numChoices", data["numChoices"].values)
    opTest = pm.MutableData("goalOccluded", data["goalOccluded"].values)

    # Margins

    objPermMargin = (objPermAbility - (goalDist * numChoices * opTest))
    objPermP = pm.Deterministic("objPermP", logistic999(objPermMargin, min = minPermAbility, max = maxPermAbility))

    taskSuccess = pm.Bernoulli("taskSuccess", objPermP, observed = results)

  return m, abilityMin, abilityMax

In [None]:
m, abilityMin, abilityMax = setupOPModel(pctb_dataset, 'Random_Agent')
gv = pm.model_graph.model_to_graphviz(m)
gv

Let's run this measurement layout on two of our agents: the `Random_Agent` and the `Heuristic_Agent`.

In [None]:
model_random, abilityMin, abilityMax = setupOPModel(pctb_dataset, 'Random_Agent')
with model_random:
  data_random = pm.sample(1000, target_accept=0.95)

model_heuristic, abilityMin, abilityMax = setupOPModel(pctb_dataset, 'Heuristic_Agent')
with model_heuristic:
  data_heuristic = pm.sample(1000, target_accept=0.95)

Let's compare the inferred object permanence capability of these agents, by plotting their capabilities as forest plots:

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['objPermAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['objPermAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

We can also look at the summary statistics from the two measurement layouts:

In [None]:
summary_random = az.summary(data_random['posterior']['objPermAbility'])
summary_random

In [None]:
summary_heuristic = az.summary(data_heuristic['posterior']['objPermAbility'])
summary_heuristic

We are inferring that the heuristic agent has higher object permanence than the random agent, but both agents are relatively low on object permanence.

## Introducing Navigation

These tasks are fundamentally search tasks. The agent must navigate towards the reward. As such, an agent with object permanence but poor at navigation may fail many of these tasks. Moreover, an agent without object permanence, but that is good at navigating, may accidentally obtain the reward on occasion.

The simplest way to frame this is to say that navigation and object permanence are non-compensatory: being good at navigation does not compensate for being bad at object permanence, and vice versa. For the purposes of this tutorial, we can proceed with this formulation, although it may be more accurate to implement an asymmetric compensatory relationship between these two (since, arguably, navigation is more compensatory for object permanence than vice versa).
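
One common way to express a non-compensatory relationship is to multiply the per-capability success probabilities. The sketch below uses made-up probabilities for three hypothetical agents to show that the product can never exceed the weaker of the two factors.

```python
import numpy as np

# Hypothetical per-capability success probabilities for three agents:
# strong at both, weak object permanence only, weak navigation only.
obj_perm_p = np.array([0.95, 0.10, 0.95])
nav_p = np.array([0.95, 0.95, 0.10])

# Non-compensatory interaction: multiply the probabilities, so a weakness
# in either capability caps the overall chance of success.
final_p = obj_perm_p * nav_p

print(final_p)
```

A high value in one capability only helps if the other is high as well, which is the behaviour that the non-compensatory framing asks for.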

Navigation demands can be implemented in terms of how far away the goal is along with the circuitousness of the route. We can define this as the product of distance and number of turns.

Let's extend the measurement layout to include navigation:

In [None]:
def setupOPNavModel(data, agent_col_name: str):

  # get results column
  results = data[agent_col_name]

  # define bounds
  abilityMin = {}
  abilityMax = {}

  minPermAbility = ((data["minDistToGoal"] * data["numChoices"]).min())
  minNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).min())

  maxPermAbility = ((data["minDistToGoal"] * data["numChoices"]).max())
  maxNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).max())

  abilityMin["objPermAbility"] = minPermAbility
  abilityMax["objPermAbility"] = maxPermAbility

  abilityMin["navAbility"] = minNavAbility
  abilityMax["navAbility"] = maxNavAbility


  m = pm.Model()
  with m:

    # Define abilities and their priors

    objPermAbility = pm.Uniform("objPermAbility", minPermAbility, maxPermAbility)

    navAbility = pm.Uniform("navAbility", minNavAbility, maxNavAbility)

    # Define environment variables as MutableData

    goalDist = pm.MutableData("goalDistance", data["minDistToGoal"].values)
    numChoices = pm.MutableData("numChoices", data["numChoices"].values)
    opTest = pm.MutableData("goalOccluded", data["goalOccluded"].values)
    numTurnsGoal = pm.MutableData("minTurnsToGoal", data["minNumTurnsGoal"].values)

    # Margins

    objPermMargin = (objPermAbility - (goalDist * numChoices * opTest))
    objPermP = pm.Deterministic("objPermP", logistic999(objPermMargin, min = minPermAbility, max = maxPermAbility))

    navP = pm.Deterministic("navP", logistic999(navAbility - (goalDist * numTurnsGoal), min = minNavAbility, max = maxNavAbility))

    # Define final margin with non-compensatory interaction

    finalP = pm.Deterministic("finalP", (objPermP * navP))

    taskSuccess = pm.Bernoulli("taskSuccess", finalP, observed = results)

  return m, abilityMin, abilityMax

In [None]:
m, abilityMin, abilityMax = setupOPNavModel(pctb_dataset, 'Random_Agent')
gv = pm.model_graph.model_to_graphviz(m)
gv

Let's run this new measurement layout with the same two agents:

In [None]:
model_random, abilityMin, abilityMax = setupOPNavModel(pctb_dataset, 'Random_Agent')
with model_random:
  data_random = pm.sample(1000, target_accept=0.95)

model_heuristic, abilityMin, abilityMax = setupOPNavModel(pctb_dataset, 'Heuristic_Agent')
with model_heuristic:
  data_heuristic = pm.sample(1000, target_accept=0.95)

Let's compare the inferred object permanence and navigation capabilities of these agents, by plotting their capabilities as forest plots:

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['objPermAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['objPermAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['navAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['navAbility'], right=abilityMax['navAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['navAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['navAbility'], right=abilityMax['navAbility'])

The cognitive profile for the heuristic agent is as expected: high navigation capability, low object permanence capability. However, the random agent is not as expected: we are inferring a high object permanence capability (albeit with very wide credibility intervals). In fact, navigation is more complex than simply the distance to the goal combined with how circuitous the route is. Perhaps there is evidence of a bias towards goals in certain directions. We can implement this next:

## Introducing Directional Bias

The position of the goal relative to the agent's starting position is encoded as a float centred at 0, with left of centre being negative.
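
A minimal sketch of how a signed position can interact with a signed bias parameter, assuming the bias is simply multiplied by the position and added to the navigation margin. The bias value of 0.5 is hypothetical.

```python
import numpy as np

# Hypothetical bias: positive values favour goals to the right of centre.
right_left_bias = 0.5

# Goal positions: left of centre, centre, right of centre.
goal_position = np.array([-1.5, 0.0, 1.5])

# The effect is added to the navigation margin, so it makes goals on the
# favoured side easier and goals on the other side harder.
right_left_effect = right_left_bias * goal_position

print(right_left_effect)
```

Under this formulation a bias of zero leaves the margin untouched, and flipping the sign of the bias flips which side is favoured.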

In [None]:
def setupOPNavDirectionalModel(data, agent_col_name: str):

  # get results column
  results = data[agent_col_name]

  # define bounds
  abilityMin = {}
  abilityMax = {}

  minPermAbility = ((data["minDistToGoal"] * data["numChoices"]).min())
  minNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).min())

  maxPermAbility = ((data["minDistToGoal"] * data["numChoices"]).max())
  maxNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).max())

  abilityMin["objPermAbility"] = minPermAbility
  abilityMax["objPermAbility"] = maxPermAbility

  abilityMin["navAbility"] = minNavAbility
  abilityMax["navAbility"] = maxNavAbility

  abilityMin["rightLeftBias"] = -np.inf
  abilityMax["rightLeftBias"] = np.inf

  m = pm.Model()
  with m:

    # Define abilities and their priors

    objPermAbility = pm.Uniform("objPermAbility", minPermAbility, maxPermAbility)

    navAbility = pm.Uniform("navAbility", minNavAbility, maxNavAbility)

    rightLeftBias = pm.Normal("rightLeftBias", 0,1)

    # Define environment variables as MutableData

    goalDist = pm.MutableData("goalDistance", data["minDistToGoal"].values)
    numChoices = pm.MutableData("numChoices", data["numChoices"].values)
    opTest = pm.MutableData("goalOccluded", data["goalOccluded"].values)
    numTurnsGoal = pm.MutableData("minTurnsToGoal", data["minNumTurnsGoal"].values)
    rightLeftPosition = pm.MutableData("rightLeftPosition", data["goalPosition"].values)

    # Margins

    objPermMargin = (objPermAbility - (goalDist * numChoices * opTest))
    objPermP = pm.Deterministic("objPermP", logistic999(objPermMargin, min = minPermAbility, max = maxPermAbility))

    rightLeftEffect = pm.Deterministic("rightLeftEffect", rightLeftBias * rightLeftPosition)

    navP = pm.Deterministic("navP", logistic999((navAbility - (goalDist * numTurnsGoal)) + rightLeftEffect, min = minNavAbility, max = maxNavAbility))

    # Define final margin with non-compensatory interaction

    finalP = pm.Deterministic("finalP", (objPermP * navP))

    taskSuccess = pm.Bernoulli("taskSuccess", finalP, observed = results)

  return m, abilityMin, abilityMax

In [None]:
m, abilityMin, abilityMax = setupOPNavDirectionalModel(pctb_dataset, 'Random_Agent')
gv = pm.model_graph.model_to_graphviz(m)
gv

In [None]:
model_random, abilityMin, abilityMax = setupOPNavDirectionalModel(pctb_dataset, 'Random_Agent')
with model_random:
  data_random = pm.sample(1000, target_accept=0.95)

model_heuristic, abilityMin, abilityMax = setupOPNavDirectionalModel(pctb_dataset, 'Heuristic_Agent')
with model_heuristic:
  data_heuristic = pm.sample(1000, target_accept=0.95)

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['objPermAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['objPermAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['navAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['navAbility'], right=abilityMax['navAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['navAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['navAbility'], right=abilityMax['navAbility'])

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['rightLeftBias']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=-3, right=3)

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['rightLeftBias']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=-3, right=3)

The estimated object permanence capability for the random agent has been pulled down slightly, but it is still curiously high. Perhaps visual acuity can explain some of the extra variance in performance.

## Introducing Visual Acuity

As in the previous tutorial, we define visual acuity in terms of the size and distance of the goal (when it is visible) from the agent at the start of the episode.
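
To illustrate, the sketch below treats the ratio of distance to size as the visual acuity demand and compares it with an ability on a log scale. The three example tasks use values from the metafeature ranges listed earlier, and the ability of 20 is a hypothetical number.

```python
import numpy as np

# Three example tasks: near and large, far and large, far and small.
goal_dist = np.array([9.0, 47.0, 47.0])
goal_size = np.array([5.0, 5.0, 0.5])

# Demand grows with distance and shrinks with goal size.
demand = goal_dist / goal_size

# Hypothetical visual acuity ability, on the same scale as the demand.
ability = 20.0

# A positive log-margin means the ability exceeds the demand.
log_margin = np.log(ability) - np.log(demand)

print(demand, log_margin)
```

The far, small goal is the only one whose demand exceeds this ability, so it is the only task with a negative margin.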

In [None]:
def setupOPNavDirectionalVisualModel(data, agent_col_name: str):

  # get results column
  results = data[agent_col_name]

  # define bounds
  abilityMin = {}
  abilityMax = {}

  minPermAbility = ((data["minDistToGoal"] * data["numChoices"]).min())
  minNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).min())
  minVAcuityAbility = ((data["minDistToGoal"]/data["mainGoalSize"]).min())

  maxPermAbility = ((data["minDistToGoal"] * data["numChoices"]).max())
  maxNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).max())
  maxVAcuityAbility = ((data["minDistToGoal"]/data["mainGoalSize"]).max())

  abilityMin["objPermAbility"] = minPermAbility
  abilityMax["objPermAbility"] = maxPermAbility

  abilityMin["navAbility"] = minNavAbility
  abilityMax["navAbility"] = maxNavAbility

  abilityMin["rightLeftBias"] = -np.inf
  abilityMax["rightLeftBias"] = np.inf

  abilityMin["visualAcuityAbility"] = minVAcuityAbility
  abilityMax["visualAcuityAbility"] = maxVAcuityAbility

  m = pm.Model()
  with m:

    # Define abilities and their priors

    objPermAbility = pm.Uniform("objPermAbility", minPermAbility, maxPermAbility)

    navAbility = pm.Uniform("navAbility", minNavAbility, maxNavAbility)

    rightLeftBias = pm.Normal("rightLeftBias", 0,1)

    vAcuityAbility = pm.Uniform("visualAcuityAbility", minVAcuityAbility, maxVAcuityAbility)

    # Define environment variables as MutableData

    goalDist = pm.MutableData("goalDistance", data["minDistToGoal"].values)
    numChoices = pm.MutableData("numChoices", data["numChoices"].values)
    opTest = pm.MutableData("goalOccluded", data["goalOccluded"].values)
    numTurnsGoal = pm.MutableData("minTurnsToGoal", data["minNumTurnsGoal"].values)
    rightLeftPosition = pm.MutableData("rightLeftPosition", data["goalPosition"].values)
    goalSize = pm.MutableData("goalSize", data["mainGoalSize"].values)

    # Margins

    objPermMargin = (objPermAbility - (goalDist * numChoices * opTest))
    objPermP = pm.Deterministic("objPermP", logistic999(objPermMargin, min = minPermAbility, max = maxPermAbility))

    rightLeftEffect = pm.Deterministic("rightLeftEffect", rightLeftBias * rightLeftPosition)

    navP = pm.Deterministic("navP", logistic999((navAbility - (goalDist * numTurnsGoal)) + rightLeftEffect, min = minNavAbility, max = maxNavAbility))

    visualAcuityP = pm.Deterministic("visualAcuityP", logistic999((np.log(vAcuityAbility) - np.log(goalDist/goalSize)), min = minVAcuityAbility, max = maxVAcuityAbility))

    # Define final margin with non-compensatory interaction

    finalP = pm.Deterministic("finalP", (objPermP * navP * visualAcuityP))

    taskSuccess = pm.Bernoulli("taskSuccess", finalP, observed = results)

  return m, abilityMin, abilityMax

In [None]:
m, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'Random_Agent')
gv = pm.model_graph.model_to_graphviz(m)
gv

In [None]:
model_random, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'Random_Agent')
with model_random:
  data_random = pm.sample(1000, target_accept=0.95)

model_heuristic, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'Heuristic_Agent')
with model_heuristic:
  data_heuristic = pm.sample(1000, target_accept=0.95)

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['objPermAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['objPermAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['objPermAbility'], right=abilityMax['objPermAbility'])

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['navAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['navAbility'], right=abilityMax['navAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['navAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['navAbility'], right=abilityMax['navAbility'])

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['rightLeftBias']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=-3, right=3)

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['rightLeftBias']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=-3, right=3)

In [None]:
forest_plot_random = az.plot_forest(data=data_random['posterior'][['visualAcuityAbility']])
axes_random = forest_plot_random.ravel()[0]
axes_random.set_xlim(left=abilityMin['visualAcuityAbility'], right=abilityMax['visualAcuityAbility'])

forest_plot_heuristic = az.plot_forest(data=data_heuristic['posterior'][['visualAcuityAbility']])
axes_heuristic = forest_plot_heuristic.ravel()[0]
axes_heuristic.set_xlim(left=abilityMin['visualAcuityAbility'], right=abilityMax['visualAcuityAbility'])

Introducing visual acuity tidies a lot of things up. We get a cognitive profile that we expect for both agents. Both have quite good navigation abilities, low object permanence, and some differences in the left/right bias and visual acuities.

## Making Inferences About Real Agents

Let's now run this measurement layout on our four DRL agents, and then inspect some diagnostics to determine how well the model is fitting.

In [None]:
model_ppo_basic, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'PPO_basic')
with model_ppo_basic:
  data_ppo_basic = pm.sample(1000, target_accept=0.95)

model_ppo_basic_control, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'PPO_basic_control')
with model_ppo_basic_control:
  data_ppo_basic_control = pm.sample(1000, target_accept=0.95)

model_dreamer_basic, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'Dreamer_basic')
with model_dreamer_basic:
  data_dreamer_basic = pm.sample(1000, target_accept=0.95)

model_dreamer_basic_control, abilityMin, abilityMax = setupOPNavDirectionalVisualModel(pctb_dataset, 'Dreamer_basic_control')
with model_dreamer_basic_control:
  data_dreamer_basic_control = pm.sample(1000, target_accept=0.95)

In [None]:
def capability_forest_plot(capability_idata, ax_min, ax_max):
  """
  capability_idata : the posterior for the capability
  ax_min : the minimum value on the x axis for the forest plot
  ax_max : the maximum value on the x axis for the forest plot
  """
  forest_plot = az.plot_forest(data = capability_idata)
  axes = forest_plot.ravel()[0]
  axes.set_xlim(left = ax_min, right = ax_max)

  return None

The dreamer agents have slightly higher object permanence than the PPO agents:

In [None]:
capability_forest_plot(data_ppo_basic['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_ppo_basic_control['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_dreamer_basic['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_dreamer_basic_control['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])

They all have high navigation abilities:

In [None]:
capability_forest_plot(data_ppo_basic['posterior'][['navAbility']], ax_min = abilityMin['navAbility'], ax_max = abilityMax['navAbility'])
capability_forest_plot(data_ppo_basic_control['posterior'][['navAbility']], ax_min = abilityMin['navAbility'], ax_max = abilityMax['navAbility'])
capability_forest_plot(data_dreamer_basic['posterior'][['navAbility']], ax_min = abilityMin['navAbility'], ax_max = abilityMax['navAbility'])
capability_forest_plot(data_dreamer_basic_control['posterior'][['navAbility']], ax_min = abilityMin['navAbility'], ax_max = abilityMax['navAbility'])

They are not particularly biased to one side:

In [None]:
capability_forest_plot(data_ppo_basic['posterior'][['rightLeftBias']], ax_min = -3, ax_max = 3)
capability_forest_plot(data_ppo_basic_control['posterior'][['rightLeftBias']], ax_min = -3, ax_max = 3)
capability_forest_plot(data_dreamer_basic['posterior'][['rightLeftBias']], ax_min = -3, ax_max = 3)
capability_forest_plot(data_dreamer_basic_control['posterior'][['rightLeftBias']], ax_min = -3, ax_max = 3)

The visual acuity of dreamer is higher than that of PPO, but not significantly so. Both agents received the same 64x64 RGB pixel input.

In [None]:
capability_forest_plot(data_ppo_basic['posterior'][['visualAcuityAbility']], ax_min = abilityMin['visualAcuityAbility'], ax_max = abilityMax['visualAcuityAbility'])
capability_forest_plot(data_ppo_basic_control['posterior'][['visualAcuityAbility']], ax_min = abilityMin['visualAcuityAbility'], ax_max = abilityMax['visualAcuityAbility'])
capability_forest_plot(data_dreamer_basic['posterior'][['visualAcuityAbility']], ax_min = abilityMin['visualAcuityAbility'], ax_max = abilityMax['visualAcuityAbility'])
capability_forest_plot(data_dreamer_basic_control['posterior'][['visualAcuityAbility']], ax_min = abilityMin['visualAcuityAbility'], ax_max = abilityMax['visualAcuityAbility'])

## Model Diagnostics

PyMC is powerful and intuitive enough that we can straightforwardly design measurement layouts. But how do we know if the data are appropriate for them, or whether there are any issues with how they were fitted? There are several diagnostics we can run on measurement layouts, of which some of the most useful are presented here.

First, we can look at the traces for each of the capabilities, visualised below. For each chain (we have used 2 chains above), the left plot shows the posterior distribution. We want these to look relatively similar to each other, as this means that each chain converged to a similar posterior. On the right, we see a time series plot showing the value sampled at each draw of the chain. We want there to be relative homogeneity here, with no trends or long flat stretches, suggesting that the chain explored the posterior evenly throughout sampling. Note, depending on the prior, there might be spikes for certain values (if, for instance, the prior is a Cauchy distribution or something similarly heavy-tailed).

In [None]:
az.plot_trace(data_ppo_basic['posterior'][['objPermAbility']])
az.plot_trace(data_ppo_basic_control['posterior'][['objPermAbility']])
az.plot_trace(data_dreamer_basic['posterior'][['objPermAbility']])
az.plot_trace(data_dreamer_basic_control['posterior'][['objPermAbility']])

In [None]:
az.plot_trace(data_ppo_basic['posterior'][['navAbility']])
az.plot_trace(data_ppo_basic_control['posterior'][['navAbility']])
az.plot_trace(data_dreamer_basic['posterior'][['navAbility']])
az.plot_trace(data_dreamer_basic_control['posterior'][['navAbility']])

In [None]:
az.plot_trace(data_ppo_basic['posterior'][['rightLeftBias']])
az.plot_trace(data_ppo_basic_control['posterior'][['rightLeftBias']])
az.plot_trace(data_dreamer_basic['posterior'][['rightLeftBias']])
az.plot_trace(data_dreamer_basic_control['posterior'][['rightLeftBias']])

In [None]:
az.plot_trace(data_ppo_basic['posterior'][['visualAcuityAbility']])
az.plot_trace(data_ppo_basic_control['posterior'][['visualAcuityAbility']])
az.plot_trace(data_dreamer_basic['posterior'][['visualAcuityAbility']])
az.plot_trace(data_dreamer_basic_control['posterior'][['visualAcuityAbility']])

A second diagnostic is an energy plot, which also enables us to check whether the MCMC algorithm (usually, as here, NUTS) has explored the full posterior distribution. In the energy plot, we simply want the distribution of marginal energy during sampling, and distribution of energy transitions between steps (see [here](https://www.pymc.io/projects/docs/en/latest/learn/core_notebooks/pymc_overview.html#model-checking) and [here](https://arxiv.org/abs/1604.00695)), to overlap and look similar. Everything looks good for our four agents with our complex measurement layout:

In [None]:
az.plot_energy(data_ppo_basic)
az.plot_energy(data_ppo_basic_control)
az.plot_energy(data_dreamer_basic)
az.plot_energy(data_dreamer_basic_control)

In the literature on Bayesian statistics, several convergence diagnostics have been proposed. [Vehtari et al. (2021)](https://projecteuclid.org/journals/bayesian-analysis/volume-16/issue-2/Rank-Normalization-Folding-and-Localization--An-Improved-R%cb%86-for/10.1214/20-BA1221.full) present a comprehensive overview. Two diagnostics that we can use out of the box to gauge the convergence of multiple chains when fitting a measurement layout are $\hat{R}$ and *Effective Sample Size* (ESS).

$\hat{R}$ is, roughly, the square root of the ratio of the variance of the draws pooled across all chains to the average variance within each individual chain. If the chains are not converging, then the between-chain variance inflates the pooled variance relative to the within-chain variance, so values higher than 1 indicate lack of convergence. In practice, Vehtari et al. suggest values higher than 1.01 indicate a lack of convergence. Below, we present $\hat{R}$ for each capability.
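
To build intuition, here is a sketch of the classic (Gelman-Rubin) form of $\hat{R}$ on synthetic chains. This is a simplification: it is not the rank-normalised split-$\hat{R}$ of Vehtari et al., and the function name `classic_rhat` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def classic_rhat(chains):
    """Classic Gelman-Rubin R-hat for an array of shape (n_chains, n_draws)."""
    n_draws = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()            # mean within-chain variance
    between = n_draws * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(pooled / within)

rng = np.random.default_rng(0)

# Four chains sampling the same distribution.
mixed = rng.normal(0.0, 1.0, size=(4, 1000))

# The same chains, but two of them are shifted to a different region.
stuck = mixed + np.array([[0.0], [0.0], [5.0], [5.0]])

rhat_mixed = classic_rhat(mixed)
rhat_stuck = classic_rhat(stuck)

print(rhat_mixed, rhat_stuck)
```

Chains that agree give a value very close to 1, whereas chains that settle in different regions give a value well above it.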

ESS is, roughly, "how many independent draws contain the same information as the dependent sample obtained by the MCMC algorithm. The higher the ESS the better" (Vehtari et al., 2021, p. 672). We can distinguish between $ESS_{bulk}$ and $ESS_{tail}$ too, where the latter is the ESS in the tails of the posterior distribution, outside of the credibility interval. This is especially useful if credibility intervals are to be used downstream in inference. Below, note that each chain runs for 2000 iterations (1000 warm-up, which are discarded, and 1000 retained draws; unless you have changed it), so ESS should be judged against the number of retained draws.
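
The idea behind ESS can be sketched with a crude single-chain estimator that divides the number of draws by one plus twice the sum of the positive autocorrelations. This is an illustrative simplification, not the rank-normalised bulk or tail ESS described by Vehtari et al.; the function name `crude_ess`, the lag cut-off, and the synthetic chains are all assumptions.

```python
import numpy as np

def crude_ess(chain, max_lag=200):
    """Crude ESS: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = chain - chain.mean()
    n = len(x)
    var = np.dot(x, x) / n
    rho_sum = 0.0
    for lag in range(1, max_lag):
        rho = np.dot(x[:-lag], x[lag:]) / (n * var)
        if rho < 0:  # stop at the first negative autocorrelation estimate
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

rng = np.random.default_rng(0)
n = 4000

# Independent draws: each one carries new information.
independent = rng.normal(size=n)

# Strongly autocorrelated draws: an AR(1) process with coefficient 0.9.
noise = rng.normal(size=n)
correlated = np.empty(n)
correlated[0] = noise[0]
for t in range(1, n):
    correlated[t] = 0.9 * correlated[t - 1] + noise[t]

ess_independent = crude_ess(independent)
ess_correlated = crude_ess(correlated)

print(ess_independent, ess_correlated)
```

Both chains have the same number of draws, but the autocorrelated chain is worth far fewer independent draws, which is what a low ESS signals.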

In [None]:
az.summary(data_ppo_basic['posterior'][['objPermAbility', 'navAbility', 'rightLeftBias', 'visualAcuityAbility']])

In [None]:
az.summary(data_ppo_basic_control['posterior'][['objPermAbility', 'navAbility', 'rightLeftBias', 'visualAcuityAbility']])

In [None]:
az.summary(data_dreamer_basic['posterior'][['objPermAbility', 'navAbility', 'rightLeftBias', 'visualAcuityAbility']])

In [None]:
az.summary(data_dreamer_basic_control['posterior'][['objPermAbility', 'navAbility', 'rightLeftBias', 'visualAcuityAbility']])

# Extending To The Multivariate Case

The test battery has been extended to incorporate another kind of object permanence test, as well as another response variable: whether the agent made the correct choice or not.

Let's inspect the dataset. There are **551 instances**, **12 metafeatures**, and the performances of **6 agents** across two measures.

The metafeatures are as follows:
1. `basicTask` - is the task a basic task? Values (discrete, binary): `0` (No), `1` (Yes).
2. `pctbGridTask` - is the task a PCTB Grid task? Values (discrete, binary): `0` (No), `1` (Yes).
3. `cvChickTask` - is the task a CV Chick task? Values (discrete, binary): `0` (No), `1` (Yes).
4. `mainGoalSize` - what size is the goal in the task? Value range: `0.5` - `5.0`.
5. `goalPosition` - what is the relative position of the goal with respect to the agent's starting position? Value range: `-1.5` - `1.5` (0 is centre point).
6. `goalOccluded` - is the goal occluded when the agent starts the episode? Values (discrete, binary): `0` (No), `1` (Yes).
7. `lavaPresence` - is there lava in the arena? Values (discrete, binary): `0` (No), `1` (Yes).
8. `minDistToGoal` - how far is the goal from the agent? This is calculated as the Manhattan distance to the goal, avoiding any obstacles/pits. Value range: `9.0` - `58.0`.
9. `minNumTurnsGoal` - how many right-angle turns would the agent take on the trajectory described by `minDistToGoal`? Value range: `0.0` - `11.0`.
10. `minDistToCorrectChoice` - how far is the choice point from the agent? This is calculated as the Manhattan distance to the choice point, avoiding any obstacles/pits. For tasks without a forced-choice component, it is equivalent to the distance to the goal. Value range: `8.0` - `47.0`.
11. `minNumTurnsChoice` - how many right-angle turns would the agent take on the trajectory described by `minDistToCorrectChoice`? Value range: `0.0` - `3.0`.
12. `numChoices` - how many choices does the agent have in the task? Value range: `1.0` - `12.0`.

The agents performed each of these tasks, and whether they obtained the goal (`1`) or not (`0`) was recorded, as well as whether they made the correct choice in a forced-choice task (`1`) or not (`0`). For tasks where there was no choice to be made, this value is the same as whether they succeeded on the task. There are two columns for each agent, `*_Success` and `*_Choice`. The agents are as follows:
1. `Random_Agent` - An agent which randomly samples one of the 9 actions in the Animal-AI Environment (`no action`, `forwards`, `backwards`, `left rotate`, `right rotate`, `forwards left`, `forwards right`, `backwards left`, `backwards right`). It then takes that action for a number of steps sampled from $U(1, 20)$.
2. `Heuristic_Agent` - An agent that navigates towards green goals, following a rigid rule.
3. `Dreamer_basic` - A dreamer-v3 agent trained for 2M steps on a set of 300 basic tasks, of which all tasks where `basicTask == 1` are a subset.
4. `Dreamer_basic_control` - A dreamer-v3 agent trained for 10M steps on a set of 2372 basic and practice tasks, of which all tasks where `basicTask == 1 or (pctbGridTask == 1 and goalOccluded == 0) or (cvChickTask == 1 and goalOccluded == 0)` are a subset.
5. `PPO_basic` - A PPO agent trained for 2M steps on a set of 300 basic tasks, of which all tasks where `basicTask == 1` are a subset.
6. `PPO_basic_control` - A PPO agent trained for 10M steps on a set of 2372 basic and practice tasks, of which all tasks where `basicTask == 1 or (pctbGridTask == 1 and goalOccluded == 0) or (cvChickTask == 1 and goalOccluded == 0)` are a subset.
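
The subset conditions in the agent descriptions translate directly into pandas boolean masks, which is handy for separating tasks an agent has seen in training from held-out ones. The sketch below uses a small made-up frame (`toy`) rather than the real dataset, and assumes the column is spelled `cvChickTask` as in the metafeature list.

```python
import pandas as pd

# Toy stand-in for the metafeature columns (values invented for illustration)
toy = pd.DataFrame({
    "basicTask":    [1, 0, 0, 0, 0],
    "pctbGridTask": [0, 1, 1, 0, 0],
    "cvChickTask":  [0, 0, 0, 1, 1],
    "goalOccluded": [0, 0, 1, 0, 1],
})

# Tasks covered by the training set of the *_basic agents
basic_only = toy["basicTask"] == 1

# Tasks covered by the training set of the *_basic_control agents
basic_and_practice = (
    (toy["basicTask"] == 1)
    | ((toy["pctbGridTask"] == 1) & (toy["goalOccluded"] == 0))
    | ((toy["cvChickTask"] == 1) & (toy["goalOccluded"] == 0))
)

# Everything else is held out: the occluded PCTB Grid and CV Chick tasks
held_out = toy[~basic_and_practice]
print(held_out)
```

Note the use of `|` and `&` with parentheses: Python's `or` and `and` do not work element-wise on pandas Series.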

In [None]:
data_url = 'https://raw.githubusercontent.com/Kinds-of-Intelligence-CFI/measurement-layout-tutorial/main/data/4_PCTBCVChick_data.csv'
pctbcvchick_dataset = pd.read_csv(data_url)

pctbcvchick_dataset

First, we have a new metafeature for lava, so let's incorporate it into the univariate model. Because `lavaPresence` is a binary metafeature, we give the corresponding ability a Beta prior, which is bounded between 0 and 1:

In [None]:
def setupOPNavDirectionalVisualLavaModel(data, agent_col_name: str):

  # get results column
  results = data[agent_col_name]

  # define bounds
  abilityMin = {}
  abilityMax = {}

  minPermAbility = ((data["minDistToGoal"] * data["numChoices"]).min())
  minNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).min())
  minVAcuityAbility = ((data["minDistToGoal"]/data["mainGoalSize"]).min())

  maxPermAbility = ((data["minDistToGoal"] * data["numChoices"]).max())
  maxNavAbility = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).max())
  maxVAcuityAbility = ((data["minDistToGoal"]/data["mainGoalSize"]).max())

  abilityMin["objPermAbility"] = minPermAbility
  abilityMax["objPermAbility"] = maxPermAbility

  abilityMin["navAbility"] = minNavAbility
  abilityMax["navAbility"] = maxNavAbility

  abilityMin["rightLeftBias"] = -np.inf
  abilityMax["rightLeftBias"] = np.inf

  abilityMin["visualAcuityAbility"] = minVAcuityAbility
  abilityMax["visualAcuityAbility"] = maxVAcuityAbility

  abilityMin["lavaAbility"] = 0
  abilityMax["lavaAbility"] = 1

  m = pm.Model()
  with m:

    # Define abilities and their priors

    objPermAbility = pm.Uniform("objPermAbility", minPermAbility, maxPermAbility)

    navAbility = pm.Uniform("navAbility", minNavAbility, maxNavAbility)

    rightLeftBias = pm.Normal("rightLeftBias", 0,1)

    vAcuityAbility = pm.Uniform("visualAcuityAbility", minVAcuityAbility, maxVAcuityAbility)

    lavaAbility = pm.Beta("lavaAbility", 1, 1)

    # Define environment variables as MutableData

    goalDist = pm.MutableData("goalDistance", data["minDistToGoal"].values)
    numChoices = pm.MutableData("numChoices", data["numChoices"].values)
    opTest = pm.MutableData("goalOccluded", data["goalOccluded"].values)
    numTurnsGoal = pm.MutableData("minTurnsToGoal", data["minNumTurnsGoal"].values)
    rightLeftPosition = pm.MutableData("rightLeftPosition", data["goalPosition"].values)
    goalSize = pm.MutableData("goalSize", data["mainGoalSize"].values)
    lavaPresence = pm.MutableData("lavaPresence", data["lavaPresence"].values)

    # Margins

    objPermMargin = (objPermAbility - (goalDist * numChoices * opTest))
    objPermP = pm.Deterministic("objPermP", logistic999(objPermMargin, min = minPermAbility, max = maxPermAbility))

    rightLeftEffect = pm.Deterministic("rightLeftEffect", rightLeftBias * rightLeftPosition)

    navP = pm.Deterministic("navP", logistic999((navAbility - (goalDist * numTurnsGoal)) + rightLeftEffect, min = minNavAbility, max = maxNavAbility))

    visualAcuityP = pm.Deterministic("visualAcuityP", logistic999((np.log(vAcuityAbility) - np.log(goalDist/goalSize)), min = minVAcuityAbility, max = maxVAcuityAbility))

    lavaP = pm.Deterministic("lavaP", logistic999(lavaAbility - lavaPresence, min = 0, max = 1))


    # Define final margin with non-compensatory interaction

    finalP = pm.Deterministic("finalP", (objPermP * navP * visualAcuityP * lavaP))

    taskSuccess = pm.Bernoulli("taskSuccess", finalP, observed = results)

  return m, abilityMin, abilityMax
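
The non-compensatory interaction in `finalP` means that a high probability on one component cannot make up for a low probability on another. A minimal sketch with made-up probabilities (the numbers and the comparison with an average are assumptions for illustration, not values from the model):

```python
import math

# Hypothetical per-component success probabilities for a single task
component_p = {
    "objPermP": 0.9,
    "navP": 0.9,
    "visualAcuityP": 0.9,
    "lavaP": 0.1,  # the agent almost always fails when lava is present
}

# Non-compensatory: the product is dragged down by the weakest component
non_compensatory = math.prod(component_p.values())

# For contrast, a compensatory rule such as the mean hides the weak component
compensatory = sum(component_p.values()) / len(component_p)

print(f"product: {non_compensatory:.4f}")
print(f"mean:    {compensatory:.4f}")
```

Under the product, the agent must clear every demand to succeed, which matches the way `finalP` multiplies the component probabilities.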

In [None]:
m, abilityMin, abilityMax = setupOPNavDirectionalVisualLavaModel(pctbcvchick_dataset, 'Random_Agent_Success')
gv = pm.model_graph.model_to_graphviz(m)
gv

Let's try it out on the random agent and see what lava ability is inferred:

In [None]:
model_random, abilityMin, abilityMax = setupOPNavDirectionalVisualLavaModel(pctbcvchick_dataset, 'Random_Agent_Success')
with model_random:
  data_random = pm.sample(1000, target_accept=0.95)

In [None]:
az.plot_trace(data_random['posterior'][['lavaAbility']])

Now let's introduce the second response variable, `correctChoice` (the `*_Choice` columns), and the corresponding metafeatures. Navigating to the goal is quite demanding in both the PCTB and CV Chick tasks. Navigating to the choice point, however, is demanding for the PCTB tasks (where it is equivalent to navigating to the goal) but easy for the CV Chick tasks (where the agent only needs to navigate to the end of the ramp). Let's implement that:

In [None]:
def setupMultivariateModel(data, agent_success_name: str, agent_choice_name: str):

  # get results column
  successes = data[agent_success_name]
  choices = data[agent_choice_name]

  # define bounds
  abilityMin = {}
  abilityMax = {}

  minSuccessNav = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).min())
  minChoiceNav = ((data["minDistToCorrectChoice"] * data["minNumTurnsChoice"]).min())

  maxSuccessNav = ((data["minDistToGoal"] * data["minNumTurnsGoal"]).max())
  maxChoiceNav = ((data["minDistToCorrectChoice"] * data["minNumTurnsChoice"]).max())

  minPermAbility = ((data["minDistToGoal"] * data["numChoices"]).min())
  minNavAbility = min([minSuccessNav, minChoiceNav])
  minVAcuityAbility = ((data["minDistToGoal"]/data["mainGoalSize"]).min())

  maxPermAbility = ((data["minDistToGoal"] * data["numChoices"]).max())
  maxNavAbility = max([maxSuccessNav, maxChoiceNav])
  maxVAcuityAbility = ((data["minDistToGoal"]/data["mainGoalSize"]).max())

  abilityMin["objPermAbility"] = minPermAbility
  abilityMax["objPermAbility"] = maxPermAbility

  abilityMin["navAbility"] = minNavAbility
  abilityMax["navAbility"] = maxNavAbility

  abilityMin["rightLeftBias"] = -np.inf
  abilityMax["rightLeftBias"] = np.inf

  abilityMin["visualAcuityAbility"] = minVAcuityAbility
  abilityMax["visualAcuityAbility"] = maxVAcuityAbility

  abilityMin["lavaAbility"] = 0
  abilityMax["lavaAbility"] = 1

  m = pm.Model()
  with m:

    # Define abilities and their priors

    objPermAbility = pm.Uniform("objPermAbility", minPermAbility, maxPermAbility)

    navAbility = pm.Uniform("navAbility", minNavAbility, maxNavAbility)

    rightLeftBias = pm.Normal("rightLeftBias", 0,1)

    vAcuityAbility = pm.Uniform("visualAcuityAbility", minVAcuityAbility, maxVAcuityAbility)

    lavaAbility = pm.Beta("lavaAbility", 1, 1)

    # Define environment variables as MutableData

    goalDist = pm.MutableData("goalDistance", data["minDistToGoal"].values)
    numChoices = pm.MutableData("numChoices", data["numChoices"].values)
    opTest = pm.MutableData("goalOccluded", data["goalOccluded"].values)
    numTurnsGoal = pm.MutableData("minTurnsToGoal", data["minNumTurnsGoal"].values)
    rightLeftPosition = pm.MutableData("rightLeftPosition", data["goalPosition"].values)
    goalSize = pm.MutableData("goalSize", data["mainGoalSize"].values)
    lavaPresence = pm.MutableData("lavaPresence", data["lavaPresence"].values)

    choiceDist = pm.MutableData("choiceDistance", data["minDistToCorrectChoice"].values)
    numTurnsChoice = pm.MutableData("minTurnsToChoice", data["minNumTurnsChoice"].values)

    # Margins

    objPermMargin = (objPermAbility - (goalDist * numChoices * opTest))
    objPermP = pm.Deterministic("objPermP", logistic999(objPermMargin, min = minPermAbility, max = maxPermAbility))

    rightLeftEffect = pm.Deterministic("rightLeftEffect", rightLeftBias * rightLeftPosition)

    lavaP = pm.Deterministic("lavaP", logistic999(lavaAbility - lavaPresence, min = 0, max = 1))

    navSuccessP = pm.Deterministic("navSuccessP", logistic999((navAbility - (goalDist * numTurnsGoal)), min = minNavAbility, max = maxNavAbility) * lavaP)
    navChoiceP = pm.Deterministic("navChoiceP", logistic999((navAbility - (choiceDist * numTurnsChoice)) + rightLeftEffect, min = minNavAbility, max = maxNavAbility))

    visualAcuityP = pm.Deterministic("visualAcuityP", logistic999((np.log(vAcuityAbility) - np.log(goalDist/goalSize)), min = minVAcuityAbility, max = maxVAcuityAbility))

    # Define final margin with non-compensatory interaction

    choiceP = pm.Deterministic("choiceP", (navChoiceP * objPermP))
    successP = pm.Deterministic("successP", (visualAcuityP * navSuccessP * objPermP))

    taskSuccess = pm.Bernoulli("taskSuccess", successP, observed=successes)
    taskChoice = pm.Bernoulli("taskChoice", choiceP, observed=choices)

  return m, abilityMin, abilityMax
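
Because both `taskSuccess` and `taskChoice` are observed, the model's log-likelihood is the sum of two Bernoulli log-likelihoods, and shared terms such as `objPermP` are informed by both responses. The sketch below reproduces that arithmetic in NumPy for three made-up tasks; the probabilities, outcomes, and the helper `bernoulli_loglik` are assumptions for illustration only.

```python
import numpy as np

def bernoulli_loglik(p, y):
    """Sum of Bernoulli log-likelihoods for probabilities p and 0/1 outcomes y."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log1p(-p)))

# Hypothetical component probabilities for three tasks
obj_perm_p = np.array([0.90, 0.40, 0.80])
nav_choice_p = np.array([0.95, 0.90, 0.70])
nav_success_p = np.array([0.80, 0.60, 0.50])  # lava term already folded in
visual_acuity_p = np.array([0.99, 0.90, 0.95])

# Same structure as choiceP and successP in the model
choice_p = nav_choice_p * obj_perm_p
success_p = visual_acuity_p * nav_success_p * obj_perm_p

# Hypothetical observed outcomes
choices = np.array([1, 0, 1])
successes = np.array([1, 0, 0])

# Both responses contribute to the joint log-likelihood
joint_loglik = bernoulli_loglik(success_p, successes) + bernoulli_loglik(choice_p, choices)
print(f"joint log-likelihood: {joint_loglik:.3f}")
```

Since `obj_perm_p` multiplies into both products, neither predicted probability can exceed it, and evidence from either response column constrains the object permanence ability.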

In [None]:
m, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'Random_Agent_Success', 'Random_Agent_Choice')
gv = pm.model_graph.model_to_graphviz(m)
gv

Let's run this measurement layout with our battery of 6 agents:

In [None]:
model_random, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'Random_Agent_Success', 'Random_Agent_Choice')
with model_random:
  data_random = pm.sample(1000, target_accept=0.95)

model_heuristic, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'Heuristic_Agent_Success', 'Heuristic_Agent_Choice')
with model_heuristic:
  data_heuristic = pm.sample(1000, target_accept=0.95)

model_ppo_basic, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'PPO_basic_Success', 'PPO_basic_Choice')
with model_ppo_basic:
  data_ppo_basic = pm.sample(1000, target_accept=0.95)

model_ppo_basic_control, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'PPO_basic_control_Success', 'PPO_basic_control_Choice')
with model_ppo_basic_control:
  data_ppo_basic_control = pm.sample(1000, target_accept=0.95)

model_dreamer_basic, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'Dreamer_basic_Success', 'Dreamer_basic_Choice')
with model_dreamer_basic:
  data_dreamer_basic = pm.sample(1000, target_accept=0.95)

model_dreamer_basic_control, abilityMin, abilityMax = setupMultivariateModel(pctbcvchick_dataset, 'Dreamer_basic_control_Success', 'Dreamer_basic_control_Choice')
with model_dreamer_basic_control:
  data_dreamer_basic_control = pm.sample(1000, target_accept=0.95)
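
Since each agent follows the same pattern, the fitting cells could also be written as a loop over agent names. The sketch below only shows the bookkeeping: `response_columns`, `fit_all`, and the `fit_one` callable are hypothetical helpers, and a stand-in function is used so that the example runs without PyMC.

```python
AGENTS = [
    "Random_Agent",
    "Heuristic_Agent",
    "PPO_basic",
    "PPO_basic_control",
    "Dreamer_basic",
    "Dreamer_basic_control",
]

def response_columns(agent):
    """Return the (success, choice) column names for one agent."""
    return f"{agent}_Success", f"{agent}_Choice"

def fit_all(agents, fit_one):
    """Apply fit_one(success_col, choice_col) to every agent and collect the results."""
    return {agent: fit_one(*response_columns(agent)) for agent in agents}

# Stand-in for model building and sampling, so the sketch runs on its own.
# In the notebook, fit_one could instead build the model with
# setupMultivariateModel(...) and call pm.sample(...) inside its context.
results = fit_all(AGENTS, lambda success_col, choice_col: (success_col, choice_col))

for agent, columns in results.items():
    print(agent, columns)
```

Keeping the results in a dictionary keyed by agent name also avoids accidentally fitting the same agent twice.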

In [None]:
capability_forest_plot(data_random['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_heuristic['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_ppo_basic['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_ppo_basic_control['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_dreamer_basic['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])
capability_forest_plot(data_dreamer_basic_control['posterior'][['objPermAbility']], ax_min = abilityMin['objPermAbility'], ax_max = abilityMax['objPermAbility'])

Feel free to continue playing around with these models in the remainder of the session.