# Chapter 1: Introduction to CogniBench
CogniBench is a framework for benchmarking cognitive models using behavioral data. It is built mainly on top of the [sciunit](https://github.com/scidash/sciunit) and [gym](https://github.com/openai/gym) libraries. It uses the test-model-capability categorization implemented in sciunit to run test suites, each consisting of several tests, on a set of models. For a full list of features, please refer to the README.md file or the CogniBench documentation website.

## A first example: Testing multiple models interactively
As a toy example, we test two models interactively. Let us first import the models. cognibench offers several single-subject model implementations, and tests can be used with either single- or multi-subject models. In this section, we showcase testing multi-subject models and the automatic creation of multi-subject models from single-subject implementations.

In [1]:
# Required to suppress sciunit config not found logs
import sciunit
import os
sciunit.settings['CWD'] = os.getcwd()

In [2]:
from cognibench.models.decision_making import RandomRespondModel, NWSLSModel
from cognibench.models.utils import multi_from_single_cls

# Create multi-subject model classes from the single-subject implementations
MultiRandomRespondModel = multi_from_single_cls(RandomRespondModel)
MultiNWSLSModel = multi_from_single_cls(NWSLSModel)
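
To give a feel for what a helper like `multi_from_single_cls` does conceptually, here is a minimal, self-contained sketch. It is not cognibench's actual implementation: the names `make_multi_cls`, `ToySingleModel`, `subject_models` and the `act` method are assumptions made for illustration. The generated class simply keeps one single-subject instance per subject and forwards calls to the selected one.

```python
import random


class ToySingleModel:
    """Hypothetical single-subject model that picks actions uniformly at random."""

    def __init__(self, n_action, seed=None):
        self.n_action = n_action
        self._rng = random.Random(seed)

    def act(self, stimulus):
        # Ignore the stimulus and return a random discrete action.
        return self._rng.randrange(self.n_action)


def make_multi_cls(single_cls):
    """Build a multi-subject class from a single-subject class (illustrative only)."""

    class MultiModel:
        def __init__(self, n_subj, **kwargs):
            # One independent single-subject model per subject.
            self.subject_models = [single_cls(**kwargs) for _ in range(n_subj)]

        def act(self, subject_idx, stimulus):
            # Delegate to the model that belongs to the requested subject.
            return self.subject_models[subject_idx].act(stimulus)

    MultiModel.__name__ = "Multi" + single_cls.__name__
    return MultiModel


MultiToyModel = make_multi_cls(ToySingleModel)
multi_toy = MultiToyModel(n_subj=3, n_action=5)
action = multi_toy.act(subject_idx=0, stimulus=2)
```

Building the class through a factory function means each single-subject model only needs to be written once, which is the convenience the text attributes to the automatic multi-subject creation.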

First, we need data to run the tests. In general, the type of data depends heavily on the particular model. In this example, we assume that the data is stored in the `observations` variable. Later chapters explain this part in more detail.

In [3]:
observations = [{'stimuli': [], 'actions': [], 'rewards': []}]
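
The empty lists above keep the example minimal. For orientation only, a populated entry might look like the sketch below, assuming one dictionary per subject and one list element per trial. The values and the `check_aligned` helper are hypothetical and are not part of cognibench; the actual data format is covered in later chapters.

```python
# Hypothetical data for two subjects; all values are made up for illustration.
example_observations = [
    {"stimuli": [0, 3, 1], "actions": [2, 2, 4], "rewards": [1, 0, 1]},
    {"stimuli": [5, 5], "actions": [0, 1], "rewards": [0, 0]},
]


def check_aligned(subject_obs):
    """Return the trial count if all three lists have the same length."""
    lengths = {len(subject_obs[key]) for key in ("stimuli", "actions", "rewards")}
    if len(lengths) != 1:
        raise ValueError("stimuli, actions and rewards must have equal length")
    return lengths.pop()


# Number of trials recorded for each subject.
trial_counts = [check_aligned(obs) for obs in example_observations]
```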

cognibench offers the `cognibench.testing.InteractiveTest` class for interactive tests. This class tests a given model using `(stimulus, action, reward)` tuples: it first shows the stimulus to the model, then gets the model's prediction, and finally returns the true `(stimulus, action, reward)` tuple. The model therefore has the chance to update itself after each prediction.
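
The procedure described above can be sketched as a plain loop. This is a simplified illustration, not the code of `InteractiveTest`; the `CountingModel` class, its `predict` and `update` methods, and the `run_interactive` function are hypothetical names chosen for this sketch.

```python
class CountingModel:
    """Hypothetical model: predicts action probabilities from past action counts."""

    def __init__(self, n_action):
        # Start with one pseudo-count per action so probabilities are never zero.
        self.counts = [1] * n_action

    def predict(self, stimulus):
        total = sum(self.counts)
        return [c / total for c in self.counts]

    def update(self, stimulus, reward, action):
        # Learn from the true action once it has been revealed.
        self.counts[action] += 1


def run_interactive(model, stimuli, actions, rewards):
    """Show each stimulus, record the prediction, then reveal the true outcome."""
    predictions = []
    for stimulus, action, reward in zip(stimuli, actions, rewards):
        predictions.append(model.predict(stimulus))  # prediction comes first
        model.update(stimulus, reward, action)       # update happens afterwards
    return predictions


model = CountingModel(n_action=2)
preds = run_interactive(model, stimuli=[0, 0], actions=[1, 1], rewards=[1, 1])
```

The key point is the ordering inside the loop: the model never sees the true action for a trial before it has committed to a prediction for that trial.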

The `InteractiveTest` class requires the observations, the score type to use and, in this case, the boolean switch signaling that we are using multi-subject models.

In [4]:
from cognibench.testing import InteractiveTest
from cognibench.scores import NLLScore
from cognibench.utils import partialclass

test = InteractiveTest(
    name='Interactive negative log-likelihood test',
    observation=observations,
    score_type=partialclass(NLLScore, min_score=0, max_score=1e4),
    multi_subject=True,
    optimize_models=False
)
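
For reference, a negative log-likelihood is commonly computed by summing -log(p) over the observed actions, where p is the probability the model assigned to the action that was actually taken. The sketch below does this with the standard library only. It is not the `NLLScore` implementation, and `negative_log_likelihood` is a hypothetical helper; whether cognibench sums or normalizes the terms is not stated here.

```python
import math


def negative_log_likelihood(predicted_probs, observed_actions):
    """Sum of -log(p) for the probability assigned to each observed action."""
    total = 0.0
    for probs, action in zip(predicted_probs, observed_actions):
        total -= math.log(probs[action])
    return total


# Two trials with two possible actions each (made-up probabilities).
nll = negative_log_likelihood([[0.5, 0.5], [0.25, 0.75]], [0, 1])

# With no observations there is nothing to sum, so the result is 0.
empty_nll = negative_log_likelihood([], [])
```

Lower values indicate that the model assigned higher probability to the observed behavior.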

Now, the models. All of the models we have imported in this tutorial operate on discrete action and discrete observation spaces, so we need to specify the dimension of each space. In addition, each model requires certain parameters; here we assume that these parameters have already been set.

In [5]:
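ndim_action = 5       # size of the discrete action space
ndim_observation = 8  # size of the discrete observation space
num_subjects = 5      # number of subjects in each multi-subject model

# Instantiate one multi-subject model per model class
multi_rr = MultiRandomRespondModel(n_subj=num_subjects, n_action=ndim_action, n_obs=ndim_observation)
multi_nwsls = MultiNWSLSModel(n_subj=num_subjects, n_action=ndim_action, n_obs=ndim_observation)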

Finally, we run the test on the list of models. Since we don't have any observations in this example, both scores are 0.

In [6]:
test_suite = sciunit.TestSuite([test], name='Test suite')
model_list = [multi_rr, multi_nwsls]
test_suite.judge(model_list)

| | Interactive negative log-likelihood test |
| --- | --- |
| RandomRespondModel | 0 |
| NWSLSModel | 0 |
