In [1]:
library(brms)
theme_set(theme_default())

Loading required package: Rcpp
Loading required package: ggplot2
Loading 'brms' package (version 2.4.0). Useful instructions
can be found by typing help('brms'). A more detailed introduction
to the package is available through vignette('brms_overview').
Run theme_set(theme_default()) to use the default bayesplot theme.


# Load data

In [2]:
# load RData into environment object ("ex")
load("./prepped_data.RData", ex <- new.env())
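
The `load()` call above passes a fresh environment as its second (`envir`) argument, so the objects stored in the file do not overwrite anything in the global workspace. A minimal, self-contained sketch of the same pattern follows; the temporary file and the object `demo_df` are made up for illustration and are not part of the prepped data.

```r
# Hypothetical stand-in for prepped_data.RData: save one object to a temp file.
demo_df <- data.frame(sub = factor(c(1, 2)), RT = c(1.76, 2.56))
tmp <- tempfile(fileext = ".RData")
save(demo_df, file = tmp)
rm(demo_df)

# Load into a dedicated environment instead of the global workspace.
demo_env <- new.env()
loaded_names <- load(tmp, envir = demo_env)

# load() returns the names of the restored objects.
print(loaded_names)
print(ls(demo_env))
str(demo_env$demo_df)
```

Calling `ls(ex)` in the same way would list what `prepped_data.RData` actually contains.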

In [3]:
# Load data from environment.
# For testing, we're using df_slice, which has only subjects 1 and 2.
df <- ex$df_slice
#df <- ex$df_total

In [4]:
head(df)

Answer,Block,CD,Choice,ED,RT,Trlnum,conNT,conNT_cent,item_id,newNT,newNT_cent,sub,task
1,0,1.1409781,0,15.32376,1.76202,0,8.711341,-0.02945527,16.0,6.996014,0.002536885,1,disc
0,1,1.1409781,1,15.32376,2.242939,33,8.711341,-0.02945527,16.0,6.996014,0.002536885,1,disc
1,2,1.1409781,0,15.32376,1.766368,39,8.711341,-0.02945527,16.0,6.996014,0.002536885,1,disc
1,0,1.1409781,0,15.32376,2.55949,38,8.711341,-0.02945527,16.0,6.996014,0.002536885,2,disc
1,2,1.1409781,0,15.32376,2.417423,51,8.711341,-0.02945527,16.0,6.996014,0.002536885,2,disc
0,1,0.7320226,3,14.63461,2.49244,11,8.747425,0.00662891,62.0,6.990727,-0.002750086,1,disc


In [5]:
str(df)

'data.frame':	623 obs. of  14 variables:
 $ Answer    : Factor w/ 2 levels "0","1": 2 1 2 2 2 1 2 2 1 1 ...
 $ Block     : Factor w/ 3 levels "0","1","2": 1 2 3 1 3 2 3 3 1 2 ...
 $ CD        : num  1.14 1.14 1.14 1.14 1.14 ...
 $ Choice    : int  0 1 0 0 0 3 0 0 1 2 ...
 $ ED        : num  15.3 15.3 15.3 15.3 15.3 ...
 $ RT        : num  1.76 2.24 1.77 2.56 2.42 ...
 $ Trlnum    : int  0 33 39 38 51 11 63 40 2 31 ...
 $ conNT     : num  8.71 8.71 8.71 8.71 8.71 ...
 $ conNT_cent: num  -0.0295 -0.0295 -0.0295 -0.0295 -0.0295 ...
 $ item_id   : Factor w/ 64 levels "0.0","1.0","10.0",..: 9 9 9 9 9 60 60 60 63 63 ...
 $ newNT     : num  7 7 7 7 7 ...
 $ newNT_cent: num  0.00254 0.00254 0.00254 0.00254 0.00254 ...
 $ sub       : Factor w/ 2 levels "1","2": 1 1 1 2 2 1 1 2 1 1 ...
 $ task      : Factor w/ 2 levels "disc","name": 1 1 1 1 1 1 1 1 1 1 ...


# Modelling

## Overview

We want to predict accuracy ("Answer") from the following:
- Random effects structure:
    - $(Block*NT|sub) + (1|item) + (1|task)$
    - We might also want to compare this with a model that adds a varying slope for task over subjects
- Fixed effects structure:
    - $NT*Block$
    - NT can be conNT, newNT, or both conNT and newNT (incremental prediction)
    - $H_0$: model without NT
    - We use the centered versions of the neural typicality measures for interpretability
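
To make the $NT*Block$ notation concrete, the sketch below expands the fixed-effects part of the formula with base R's `model.matrix()`. The data frame `toy` is made up; only its column names mirror `df`. With R's default treatment coding, the resulting column names are the same terms that appear in the model summaries (for example `Block1` and `newNT_cent:Block1`).

```r
# Toy data: column names mirror df, values are made up for illustration.
toy <- data.frame(
  newNT_cent = c(-0.02, 0.01, 0.00, 0.03, -0.01, 0.02),
  Block      = factor(c(0, 1, 2, 0, 1, 2))
)

# `a * b` is shorthand for `a + b + a:b`.
X <- model.matrix(~ newNT_cent * Block, data = toy)
print(colnames(X))
```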

## Model building

### Null model

In [6]:
# Build a null model, which assumes full random effects structure and a fixed effect for learning across blocks,
# but no influence of any neural typicality measure.
answer_nullmodel <- brm(Answer ~ Block + (Block|sub) + (1|item_id) + (1|task),
                        data = df,
                        family = bernoulli,
                        file = "answer_nullmodel",
                        chains = 2, cores = 2)

In [7]:
#summary(answer_nullmodel)

### Newfound neural typicality

In [8]:
# Add the newfound neural typicality measure to the model
answer_newNT <- brm(Answer ~ newNT_cent*Block + (newNT_cent*Block|sub) + (1|item_id) + (1|task),
                    data = df,
                    family = bernoulli,
                    file = "answer_newNT",
                    chains = 2, cores = 2)

In [9]:
summary(answer_newNT)

Warning: There were 19 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.

 Family: bernoulli 
  Links: mu = logit 
Formula: Answer ~ newNT_cent * Block + (newNT_cent * Block | sub) + (1 | item_id) + (1 | task) 
   Data: df_slice (Number of observations: 623) 
Samples: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 2000

Group-Level Effects: 
~item_id (Number of levels: 64) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     1.32      0.21     0.99     1.81        248 1.00

~sub (Number of levels: 2) 
                                         Estimate Est.Error l-95% CI u-95% CI
sd(Intercept)                                3.54      5.02     0.05    20.46
sd(newNT_cent)                               9.90      9.97     0.34    36.38
sd(Block1)                                   3.54      5.01     0.03    17.24
sd(Block2)                                   3.11      4.49     0.04    14.52
sd(newNT_cent:Block1)                       10.43     11.35     0.26    38.06
sd(newNT_cent:Bloc
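
The warning above suggests raising `adapt_delta`. The following is a hedged sketch of what such a refit could look like. The value 0.95, the helper `refit_newNT`, and the file name `answer_newNT_ad95` are assumptions, not choices from the original analysis. A new `file` name is used because `brm()` loads a previously saved fit when the file already exists instead of sampling again. The call is wrapped in a function so that nothing is sampled until it is invoked.

```r
# Assumed sampler setting: any value above the default of 0.8 follows the warning's advice.
sampler_control <- list(adapt_delta = 0.95)

# Hypothetical helper: refit the newNT model with the stricter sampler setting.
refit_newNT <- function(data, control = sampler_control) {
  brms::brm(Answer ~ newNT_cent * Block + (newNT_cent * Block | sub) +
              (1 | item_id) + (1 | task),
            data = data,
            family = brms::bernoulli(),
            file = "answer_newNT_ad95",  # hypothetical name, avoids reusing the cached fit
            control = control,
            chains = 2, cores = 2)
}

# answer_newNT_ad95 <- refit_newNT(df)
```

The very wide intervals for the `sub` standard deviations may also reflect that `df_slice` contains only two subjects, in which case a higher `adapt_delta` alone might not remove the divergences.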

### Conserved neural typicality

In [10]:
answer_conNT <- brm(Answer ~ conNT_cent*Block + (conNT_cent*Block|sub) + (1|item_id) + (1|task),
                    data = df,
                    family = bernoulli,
                    file = "answer_conNT",
                    chains = 2, cores = 2)

In [11]:
summary(answer_conNT)

Warning: There were 18 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.

 Family: bernoulli 
  Links: mu = logit 
Formula: Answer ~ conNT_cent * Block + (conNT_cent * Block | sub) + (1 | item_id) + (1 | task) 
   Data: df (Number of observations: 623) 
Samples: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 2000

Group-Level Effects: 
~item_id (Number of levels: 64) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     1.37      0.21     0.99     1.84        448 1.00

~sub (Number of levels: 2) 
                                         Estimate Est.Error l-95% CI u-95% CI
sd(Intercept)                                2.59      3.48     0.04    12.90
sd(conNT_cent)                               6.16      6.39     0.18    23.22
sd(Block1)                                   2.88      4.03     0.04    13.61
sd(Block2)                                   2.99      4.48     0.04    14.80
sd(conNT_cent:Block1)                        8.03      7.09     0.38    26.32
sd(conNT_cent:Block2)   

### Newfound *AND* conserved typicality

In [12]:
answer_bothNT <- brm(Answer ~ conNT_cent*newNT_cent*Block + (conNT_cent*newNT_cent*Block|sub) + (1|item_id) + (1|task),
                    data = df,
                    family = bernoulli,
                    file = "answer_bothNT",
                    chains = 2, cores = 2)

In [60]:
# summary(answer_bothNT)

# Model comparison

In [59]:
# via leave-one-out cross-validation
comparison <- loo(answer_conNT, answer_newNT, answer_bothNT, answer_nullmodel, file="loo_modelcomp")

# WAIC
#comparison <- waic(answer_conNT, answer_newNT, answer_bothNT, answer_nullmodel)

comparison

                                  LOOIC    SE
answer_conNT                     711.16 24.20
answer_newNT                     712.88 24.63
answer_bothNT                    723.96 26.55
answer_nullmodel                 704.35 23.56
answer_conNT - answer_newNT       -1.72  4.18
answer_conNT - answer_bothNT     -12.81  5.80
answer_conNT - answer_nullmodel    6.81  2.87
answer_newNT - answer_bothNT     -11.09  4.82
answer_newNT - answer_nullmodel    8.53  4.02
answer_bothNT - answer_nullmodel  19.61  6.94
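
One way to read this table: lower LOOIC means better estimated out-of-sample fit, and each difference row is the first model's LOOIC minus the second's (for example, 711.16 - 704.35 = 6.81). The sketch below copies the printed numbers into base R and expresses each difference from the null model in units of its standard error. Treating that ratio as a guide is a rough heuristic and an assumption here, and the numbers come from the two-subject test slice only.

```r
# LOOIC values copied from the table above.
looic <- c(answer_conNT = 711.16, answer_newNT = 712.88,
           answer_bothNT = 723.96, answer_nullmodel = 704.35)

# Differences against the null model and their SEs, also copied from the table.
vs_null <- data.frame(
  model = c("answer_conNT", "answer_newNT", "answer_bothNT"),
  diff  = c(6.81, 8.53, 19.61),
  se    = c(2.87, 4.02, 6.94)
)

# Lower LOOIC indicates better estimated out-of-sample fit.
best_model <- names(which.min(looic))

# Difference expressed in units of its standard error (rough heuristic only).
vs_null$diff_in_se <- vs_null$diff / vs_null$se

print(best_model)
print(vs_null)
```

On this slice, the null model has the lowest LOOIC, and every NT model is worse than it by more than two standard errors of the difference.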

In [64]:
# retrieve individual fit measures

#comparison$answer_conNT
#comparison$answer_conNT$looic

# Visualization

### Posteriors

In [17]:
#plot(answer_newNT, ask = FALSE)

In [None]:
# Only plot posteriors of the group-level standard deviation parameters
#plot(discmodel1, pars="^sd")

### Model predictions

In [19]:
# check distribution of predictions
#pp_check(answer_newNT)

In [21]:
# Plot model predictions via marginal effects
#plot(marginal_effects(answer_newNT), ask = FALSE)

**Conclusions:**
- The posterior standard deviations of the interaction parameters have most of their mass close to zero:
            sd_sub__Block.Q:ED
            sd_sub__Block.L:ED
- This could indicate that letting them vary freely is unnecessary, or it may even indicate overfitting. We should set up one or more models with these factors as fixed effects only and compare them via LOO.
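
A minimal sketch of such a reduced model, applied to the newNT model from this notebook. Which model to reduce, the helper `fit_reduced_newNT`, and the file name `answer_newNT_reduced` are assumptions. The formulas are plain R objects, so they can be inspected without brms. The fit itself is wrapped in a function and not run here.

```r
# Full random-effects structure, as fitted above.
full_formula <- Answer ~ newNT_cent * Block +
  (newNT_cent * Block | sub) + (1 | item_id) + (1 | task)

# Assumed reduced variant: the interaction stays as a fixed effect only,
# while the main-effect slopes still vary over subjects.
reduced_formula <- Answer ~ newNT_cent * Block +
  (newNT_cent + Block | sub) + (1 | item_id) + (1 | task)

# Hypothetical helper; fitting requires brms and is not run here.
fit_reduced_newNT <- function(data) {
  brms::brm(reduced_formula, data = data, family = brms::bernoulli(),
            file = "answer_newNT_reduced",  # hypothetical file name
            chains = 2, cores = 2)
}

# answer_newNT_reduced <- fit_reduced_newNT(df)
# loo(answer_newNT, answer_newNT_reduced)
```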

# TODO

## Interpretability

- Effect coding for categorical predictors (Block)
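
As a sketch of what effect coding for `Block` could look like, base R's `contr.sum()` replaces the default treatment coding with sum-to-zero contrasts. The factor below is a toy stand-in with the same three levels as `Block`. Applying this to `df$Block` before refitting is an assumption about how the TODO would be carried out.

```r
# Toy factor with the same three levels as Block in df.
Block <- factor(c(0, 1, 2, 0, 1, 2))

# Default treatment (dummy) coding: level "0" is the reference.
print(contrasts(Block))

# Sum-to-zero (effect) coding: each coefficient compares a level
# with the average across levels.
contrasts(Block) <- contr.sum(nlevels(Block))
print(contrasts(Block))
```

With this coding, the main effect of NT in an `NT*Block` model would describe the slope averaged over blocks rather than the slope in block 0.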

## Hypothesis testing

- Assess contrasts
- Reduce model complexity and compare with LOO

## Outlier evaluation

- Compare models based on data including and excluding outliers.
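
The notebook does not define what counts as an outlier. Purely as an illustration, the sketch below flags response times that lie more than three scaled median absolute deviations from the median. Both the rule and the choice of `RT` as the variable are assumptions, and the values are made up.

```r
# Made-up response times in seconds; the last value is an artificial outlier.
rt <- c(1.8, 2.2, 1.8, 2.6, 2.4, 2.5, 9.0)

# Assumed rule: flag values more than 3 scaled MADs from the median.
center <- median(rt)
spread <- mad(rt)
is_outlier <- abs(rt - center) > 3 * spread

rt_clean <- rt[!is_outlier]
print(is_outlier)
```

The same logical index could be used to fit each model once on `df` and once on `df[!is_outlier, ]`, and the two sets of estimates could then be compared.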