#### Computational Modeling of Cognition and Behavior

# Introduction

***


## 1.1 Models and Theories in Science
***

<br>

1. Data never speak for themselves; they require a model to be understood and explained, and that model is itself unobservable.

2. Verbal theorizing can't replace quantitative analysis.

3. There are always many candidate models; we must select between them.

4. Model selection requires both:
    - Quantitative evaluation.
    - Intellectual judgment.

## 1.2 Quantitative Modeling in Cognition
***

### 1.2.1 Models and Data

<br>

Appropriate models can explain a great deal of variance, which is not intuitively obvious when looking at the data.

<br>

Models fall into two broad categories:

1. Models that describe data

2. Models that explain an underlying cognitive process

<br>

### 1.2.2 Data Description

<br>

Models are used to summarise and communicate data; even a mean can be considered a model.

Select the model that best represents the data. For example, the median or a trimmed mean can be better than the mean when the data are skewed.

Is learning a "**Power Law**" or exponential improvement? Improvements in new skills are large in the first trials, but then plateau. To answer the question, each candidate must be defined as a function. The power law is:
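
A minimal sketch of why the choice of summary matters, using made-up, right-skewed response times. The values and the `trimmed_mean` helper are illustrative assumptions, not taken from the source:

```python
import statistics

def trimmed_mean(values, proportion=0.2):
    """Mean after dropping `proportion` of the values from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    trimmed = ordered[k:len(ordered) - k] if k > 0 else ordered
    return statistics.mean(trimmed)

# Hypothetical response times (ms); one slow trial produces right skew.
rts = [420, 450, 460, 480, 500, 510, 530, 560, 600, 2400]

mean_rt = statistics.mean(rts)
median_rt = statistics.median(rts)
trimmed_rt = trimmed_mean(rts, proportion=0.1)

print(f"mean:         {mean_rt:.2f}")
print(f"median:       {median_rt:.2f}")
print(f"trimmed mean: {trimmed_rt:.2f}")
```

The single slow trial pulls the mean well above the bulk of the data, while the median and trimmed mean stay close to the typical response time.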

$$
RT=N^{-\beta}
$$

$RT$ - Time to perform task 

$N$ - Number of learning trials to date

$\beta$ - Learning rate

<br>

Following Palmeri (1997)'s data (see fig 1.5), Heathcote et al. (2000) suggested the following exponential function is a better fit:

$$
RT=e^{-\alpha N}
$$

$\alpha$ - Learning rate

To fit the data, they also included an asymptote ($+A$) and a multiplier ($\times B$) (see fig 1.5).

However, both models are very similar and, in this case, the power function has a slightly lower root-mean-squared deviation (RMSD) than the exponential model.
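
The comparison can be sketched in code. This is a minimal illustration, assuming the asymptote is added and the multiplier scales the practice-dependent term ($A + B \times \ldots$). The data and parameter values are invented and are not fitted to Palmeri's data:

```python
import math

def power_rt(n, a, b, beta):
    """Power function with asymptote a and multiplier b: a + b * n**(-beta)."""
    return a + b * n ** (-beta)

def exp_rt(n, a, b, alpha):
    """Exponential function with asymptote a and multiplier b: a + b * exp(-alpha * n)."""
    return a + b * math.exp(-alpha * n)

def rmsd(observed, predicted):
    """Root-mean-squared deviation between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

trials = list(range(1, 11))

# Hypothetical "observed" RTs, generated from a power function for illustration only.
observed = [power_rt(n, a=300, b=1000, beta=0.8) for n in trials]

# Candidate predictions with hand-picked (not estimated) parameter values.
pred_power = [power_rt(n, 300, 1000, 0.8) for n in trials]
pred_exp = [exp_rt(n, 300, 1000, 0.5) for n in trials]

print("RMSD power:      ", rmsd(observed, pred_power))
print("RMSD exponential:", rmsd(observed, pred_exp))
```

In a real analysis the parameters of each function would be estimated from the data before comparing RMSD values.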

<br>

Each model has different implications:

- **Power function**: The rate of learning *decreases* with increasing practice.

- **Exponential function**: The learning rate, relative to what remains to be learned, remains constant with practice. Learning continues to enhance your knowledge by a constant fraction.
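
The difference can be made explicit by differentiating both functions. As a simplifying assumption, this ignores the asymptote and multiplier and treats $N$ as continuous:

$$
\frac{d\,RT}{dN} = -\beta N^{-\beta-1} = -\frac{\beta}{N}\,RT \quad \text{(power)}, \qquad \frac{d\,RT}{dN} = -\alpha e^{-\alpha N} = -\alpha\,RT \quad \text{(exponential)}
$$

Relative to the current $RT$, the power function improves at rate $\beta/N$, which shrinks as practice accumulates, whereas the exponential function improves at the constant rate $\alpha$.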

<br>

Heathcote et al. (2000) analysed a larger body of data and found that the exponential model is a better description of skill acquisition, resulting in a paradigm shift. Three points follow:

1. The choice of model has implications about the underlying process.

2. Model choice must be based on strict quantitative criteria (Chapter 10).

3. Heathcote et al.'s model selection considered individual subjects, not just averages across participants. This raises the issue of how best to apply a model to data from multiple participants (Chapter 5).

<br>

Questions may also be asked regarding the rate of forgetting. Two conclusions from Wixted (2004):

1. "The degree of learning does not affect the rate of forgetting".

2. The rate of loss *decelerates* over time.

<br>

This follows the same pattern as before: quantitative data are required to select a model, and the choice of model has psychological implications.

<br>

**Normative Behaviour**: "How people would behave if following the rules of logic or probability".

People tend to violate normative expectations, even in very simple examples.

<br>

Descriptive models contain no psychological content; they simply describe the data. Other models do, such as "process models", which explain an underlying cognitive process.

<br>

<br>

### 1.2.3 Cognitive Process Models

<br>

**Generalized Context Model (GCM)**: Classifies stimuli. It stores every category example encountered during training in memory and refers to those examples to categorise test stimuli.

<br>

$i$: A particular test stimulus. <br>

$j$: A particular example stimulus. <br>

$I$, $J$: The number of elements in the respective sets <br>

$\mathfrak{I}$: the set of test stimuli <br>

$\mathfrak{J}$: the set of examples <br>

$j = 1,2,...,J$ hence $j \in \mathfrak{J}$ 

<br>

Note: Lowercase ($i,j$) denotes specific set elements; uppercase ($I,J$) denotes the number of elements in a set.

Test stimuli are compared to all stored examples: the *similarity* between $i$ and each $j$ is determined. GCM assumes that similarity is proximity in perceptual space, with distance formally defined using the Pythagorean theorem:

$$
d_{ij} = \sqrt{ \left(\sum_{k=1}^{K}(x_{ik} - x_{jk})^2 \right)}
$$

$d_{ij}$: distance between $i$ and $j$ <br>
$k$: dimension <br>
$x_{ik}$: value of dimension $k$ for test item $i$ <br>
$x_{jk}$: value of dimension $k$ for the stored example $j$

<br>


The number of dimensions is arbitrary. Two are easy to demonstrate, but the original example used four dimensions to characterize cartoon faces (eye height, eye separation, nose length, mouth height).

GCM postulates that similarity is defined as:

$$
s_{ij} = e^{-c \cdot d_{ij}}
$$

$c$: a parameter <br>
$s_{ij}$: similarity

<br>

That is, similarity decreases exponentially with distance (see fig 1.7). The GCM can generalize to never-before-seen data.

Using this method, every test stimulus can be given a similarity score to each memorized example. Now a decision can be made. Activations can be summed separately across examples from each category. The relative magnitude of the sums is as follows: the summed category $A$ similarity, divided by the sum of the category $A$ and category $B$ similarities:
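
As a worked example with made-up values (two dimensions and $c = 0.5$, both illustrative assumptions), a test item at $(1, 2)$ and a stored example at $(4, 6)$ give:

$$
d_{ij} = \sqrt{(1-4)^2 + (2-6)^2} = \sqrt{9 + 16} = 5, \qquad s_{ij} = e^{-0.5 \cdot 5} = e^{-2.5} \approx 0.082
$$

A closer stored example at $(2, 2)$ gives $d_{ij} = 1$ and $s_{ij} = e^{-0.5} \approx 0.607$, so nearby examples contribute far more similarity than distant ones.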

$$
P(R_{i} = A|i) = \frac{\left(\sum_{j\in A}^{} s_{ij} \right)}{\left(\sum_{j\in A}^{} s_{ij} \right) + \left(\sum_{j\in B}^{} s_{ij} \right)}
$$

<br>

$A,B$: Categories <br>
$P(R_{i} = A|i)$: The probability of classifying stimulus $i$ into category $A$.
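
The three steps (distance, similarity, choice probability) can be sketched together. This is a minimal illustration rather than a reference implementation: the stored examples, the test items, and the value $c = 1$ are made up.

```python
import math

def similarity(test_item, example, c):
    """s_ij = exp(-c * d_ij), where d_ij is the Euclidean distance."""
    d = math.dist(test_item, example)  # Euclidean distance (Python 3.8+)
    return math.exp(-c * d)

def prob_category_a(test_item, examples_a, examples_b, c=1.0):
    """P(R_i = A | i): summed similarity to A over summed similarity to A and B."""
    sum_a = sum(similarity(test_item, ex, c) for ex in examples_a)
    sum_b = sum(similarity(test_item, ex, c) for ex in examples_b)
    return sum_a / (sum_a + sum_b)

# Hypothetical stored examples in a two-dimensional perceptual space.
# Category B mirrors category A around the point (3, 3).
examples_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
examples_b = [(5.0, 5.0), (4.5, 4.0), (4.0, 4.5)]

near_a = prob_category_a((1.2, 1.4), examples_a, examples_b)
midpoint = prob_category_a((3.0, 3.0), examples_a, examples_b)

print(f"P(A) for an item near the A examples: {near_a:.3f}")
print(f"P(A) for an item midway between A and B: {midpoint:.3f}")
```

Neither test item was stored during training, which illustrates how the model generalizes to new stimuli: an item close to the category $A$ examples is assigned to $A$ with high probability, while an item equidistant from both categories is classified at chance.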

The choice of mathematics is derived from 'deeper' principles such as "the universal law of generalization" (Shepard 1987) and theoretical approaches developed by Luce (1963).

<br>

## 1.3 Potential Problems: Scope and Falsifiability

***

<br>

Theories should be falsifiable (testable): there must be some possible outcomes that are incompatible with the theory's predictions.

<br>

**Outcome space**: When the 'outcome space' of a theory is small, a match between predictions and data is much less likely to occur by chance. For this reason, when data match the predictions of a theory with a small 'outcome space', the support for the theory is stronger than if it had a larger outcome space (see Dunn, 2000, for a more formalized view).<br>

**Quality of data**: When all data fall within the predicted outcome space, they provide stronger support for a theory than when data vary outside of it.

<br>

## 1.4 Modeling as a “Cognitive Aid” for the Scientist

***

<br>
    
A discussion of the importance of the replicability of studies.

<br>

Communication between researchers is leaky and incomplete. Ideas can morph over time and with each degree of separation from the 'source'. The analogy chosen can have a significant impact on people's understanding of an idea. Sometimes only some features of a model are shared in people's understanding. When a person believes they are testing a model, the originator of the theory may reject the test and believe their theory predicts something different.

<br>

The specificity of computational models reduces the ambiguity in communication.

<br>

Computational models determine whether intuitions about a theorized system match its actual consequences. They clarify theories. Implementation exposes where decisions must be made about methods and mechanisms.

<br>

## 1.5 In Vivo Modeling: “Cognitive Aid” or “Cognitive Burden”?

***

<br>

Start simple. Get experience with simple examples before moving to complex examples.

Multinomial processing tree (MPT) models have wide applications in cognitive research. They can be used on aggregate data or per individual. Often there is not enough data on individuals, so a hierarchical structure (e.g., Matzke et al., 2015; Smith and Batchelder, 2010) can allow adjustments to be made for individuals (parameters drawn from a distribution) while still using aggregate data.
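
To illustrate the hierarchical idea only, here is a minimal simulation sketch. It assumes a one-high-threshold recognition model as the MPT and a Beta distribution at the group level. The model choice, the function names, and all parameter values are illustrative assumptions, not taken from the source or the cited papers:

```python
import random

def p_old_response(r, g, item_is_old):
    """One-high-threshold tree: an old item is detected with probability r;
    otherwise the participant guesses "old" with probability g."""
    return r + (1 - r) * g if item_is_old else g

def simulate_participant(rng, group_a, group_b, g=0.3, n_old=20):
    """Draw an individual's r from a group-level Beta(group_a, group_b)
    distribution, then simulate "old" responses to n_old old items."""
    r = rng.betavariate(group_a, group_b)
    p_hit = p_old_response(r, g, item_is_old=True)
    hits = sum(rng.random() < p_hit for _ in range(n_old))
    return r, hits

rng = random.Random(1)  # fixed seed so the sketch is reproducible
participants = [simulate_participant(rng, group_a=6, group_b=4) for _ in range(5)]

for r, hits in participants:
    print(f"individual r = {r:.2f}, hits = {hits}/20")
```

Each simulated participant has their own detection parameter, but all of these parameters come from one shared group distribution. That sharing is what lets sparse individual data borrow strength from the group.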



