##### Basic analysis of fit correlation parameters from the correlation task.


Produces analyses associated with Fig 7.

Normative evidence weighting and accumulation in correlated environments. Tardiff et al. (2024).

Nathan Tardiff 05/28/24

In [1]:
#clear memory
rm(list=ls())

## loading data/libraries ##

#load libraries
library(lme4)
library(dplyr)
library(lmerTest)

se <- function(x) sqrt(var(x) / length(x))

switch(Sys.info()[['sysname']],
       Windows = PROJECT_DIR <- paste0('C:/Users/',Sys.getenv('USERNAME'),
                              '/Dropbox/Goldlab/correlated/'),
       Darwin = PROJECT_DIR <- '~/Dropbox/Goldlab/correlated/'
)

DATA_DIR = paste0(PROJECT_DIR,'data/')
setwd(PROJECT_DIR)


DATA_FILE = 'rho_params_best_2023-12-08.csv'

Loading required package: Matrix


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



Attaching package: ‘lmerTest’


The following object is masked from ‘package:lme4’:

    lmer


The following object is masked from ‘package:stats’:

    step




In [2]:
# functions we need
fisherz <- function(r){
    .5*log((1+r)/(1-r))
}

# inverse Fisher transform: maps z back to a correlation r
fishezr <- function(z){
    (exp(2*z)-1)/(exp(2*z)+1)
}
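
The Fisher z-transform and its inverse are the same functions as base R's `atanh()` and `tanh()`. A minimal sketch of that equivalence follows; the helper names `z_from_r` and `r_from_z` are illustrative and not part of the original notebook.

```r
# Illustrative helpers (hypothetical names, not from the original analysis)
z_from_r <- function(r) 0.5 * log((1 + r) / (1 - r))        # Fisher z
r_from_z <- function(z) (exp(2 * z) - 1) / (exp(2 * z) + 1) # inverse

# The eight objective correlations used in the task
r <- c(-0.8, -0.6, -0.4, -0.2, 0.2, 0.4, 0.6, 0.8)
z <- z_from_r(r)

# Closed forms agree with base R's atanh()/tanh()
stopifnot(isTRUE(all.equal(z, atanh(r))))
stopifnot(isTRUE(all.equal(r_from_z(z), tanh(z))))

# Round trip recovers the original correlations
stopifnot(isTRUE(all.equal(r_from_z(z), r)))
```

The transform is odd, so z(-r) = -z(r); the sign-flipping steps later in the analysis rely on this symmetry.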

In [3]:
rho_df <- read.table(paste0(DATA_DIR,DATA_FILE),sep=',', header=TRUE, 
                    stringsAsFactors=FALSE,na.strings = 'NaN')

if (any(is.na(rho_df))) {
    stop('Missing data detected!')
}

head(rho_df)

,rho_cond,param,subject,value,rho,scale_dev
,<dbl>,<chr>,<chr>,<dbl>,<dbl>,<dbl>
1,0.2,Rn,5bd781291fd7c80001bb1fad,-0.2734775,-0.2,0.9529707
2,0.2,Rn,5d645bf6912c630018e269e3,-0.118976,-0.2,1.0494189
3,0.2,Rn,5e705a1be6c65a62c56a3143,-0.107345,-0.2,1.0563232
4,0.2,Rn,5ef37588bfe86a0ca12ba515,-0.1656104,-0.2,1.0212673
5,0.2,Rn,5f2dae50cee5310bdd43d160,-0.3064545,-0.2,0.9310918
6,0.2,Rn,5f2de3a7b874e712ac7bc516,-0.1751329,-0.2,1.015423


In [4]:
# convert correlations to fisher-z scale
rho_df$rhoz <- fisherz(rho_df$rho)
rho_df$valuez <- fisherz(rho_df$value)

rho_df$devz <- rho_df$valuez-rho_df$rhoz

## What is the slope of subjective z(rho) ~ objective z(rho)?

In [5]:
#test slope
rho.lm.0 <- lmer(valuez~rhoz + (1|subject), 
                 data=rho_df, 
                 control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5)))

In [6]:
#slope < 1: subjects underestimate correlation magnitude
#(note: the printed t-test for rhoz is against a slope of 0, not 1)
summary(rho.lm.0)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: valuez ~ rhoz + (1 | subject)
   Data: rho_df
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

REML criterion at convergence: -100.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.5565 -0.4864  0.0487  0.4701  3.6620 

Random effects:
 Groups   Name        Variance Std.Dev.
 subject  (Intercept) 0.005547 0.07448 
 Residual             0.028495 0.16881 
Number of obs: 200, groups:  subject, 100

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept) -0.03656    0.01407 99.00000  -2.599   0.0108 *  
rhoz         0.70744    0.01728 99.00000  40.933   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
rhoz 0.000 
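
The printed t-test for `rhoz` tests the slope against 0, whereas underestimation corresponds to a slope below 1. The sketch below recomputes an approximate test against 1 from the estimate, standard error, and Satterthwaite degrees of freedom printed above. It assumes those degrees of freedom carry over to the shifted null, and it is not a step in the original analysis.

```r
# Values copied from the printed summary of rho.lm.0
slope_est <- 0.70744
slope_se  <- 0.01728
slope_df  <- 99

# t statistic and two-sided p-value for H0: slope = 1
t_vs_1 <- (slope_est - 1) / slope_se
p_vs_1 <- 2 * pt(-abs(t_vs_1), slope_df)

cat(sprintf("t(%d) = %.2f, p = %.3g\n", slope_df, t_vs_1, p_vs_1))
```

The resulting t statistic is about -16.9, so the slope is reliably below 1 under this approximation.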

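## Test deviation of subjective rho from objective rho
Is subjective z(rho) - objective z(rho) significantly different from zero?

Underestimating the magnitude of a correlation yields a negative difference for positive correlations but a positive difference for negative correlations. We therefore flip the sign of the differences for negative correlations, so that negative values always indicate underestimation.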

In [7]:
rho_df$devz_ab <- rho_df$devz
rho_df$devz_ab[rho_df$rho<0] <- -rho_df$devz_ab[rho_df$rho<0]
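
To illustrate the sign flip, here is a toy example with made-up values (not taken from the data): two hypothetical subjects who both underestimate a correlation of magnitude 0.8 as 0.6, one for a negative and one for a positive correlation.

```r
# Hypothetical values for illustration only
toy <- data.frame(rho = c(-0.8, 0.8), value = c(-0.6, 0.6))

# Deviation on the Fisher-z scale (atanh is the Fisher z-transform)
toy$devz <- atanh(toy$value) - atanh(toy$rho)

# Flip the sign for negative objective correlations
toy$devz_ab <- ifelse(toy$rho < 0, -toy$devz, toy$devz)

print(toy)
```

Before the flip the two deviations have opposite signs and would cancel in a mean. After the flip both are negative and equal, so underestimation is expressed in a common direction.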

In [8]:
rho_df$rho.fs <- as.factor(rho_df$rho)
contrasts(rho_df$rho.fs) <- contr.sum(8)
contrasts(rho_df$rho.fs)

rho_df$param.fs <- as.factor(rho_df$param)
contrasts(rho_df$param.fs) <- contr.sum(2)
contrasts(rho_df$param.fs)

my_simple<-contr.treatment(2)-matrix(rep(1/2, 2), ncol=1)
rho_df$param.f <- as.factor(rho_df$param)
contrasts(rho_df$param.f) <- my_simple
contrasts(rho_df$param.f)
#rho_df[c('rho','rho.fs')]

,1,2,3,4,5,6,7
-0.8,1,0,0,0,0,0,0
-0.6,0,1,0,0,0,0,0
-0.4,0,0,1,0,0,0,0
-0.2,0,0,0,1,0,0,0
0.2,0,0,0,0,1,0,0
0.4,0,0,0,0,0,1,0
0.6,0,0,0,0,0,0,1
0.8,-1,-1,-1,-1,-1,-1,-1


,1
Rn,1
Rp,-1


,2
Rn,-0.5
Rp,0.5
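
The -0.5/0.5 coding used for `param.f` has a convenient interpretation: the intercept is the mean of the two cell means, and the coefficient is the Rp minus Rn difference. A small sketch on simulated data (the values and the `toy` data frame are made up for illustration) checks this with an ordinary linear model:

```r
set.seed(1)

# Simulated two-level factor with different cell means (hypothetical values)
toy <- data.frame(
  param = factor(rep(c("Rn", "Rp"), each = 50)),
  y = c(rnorm(50, mean = -0.05, sd = 0.03),
        rnorm(50, mean = -0.03, sd = 0.03))
)

# Same "simple" coding as in the notebook: Rn = -0.5, Rp = 0.5
simple_coding <- contr.treatment(2) - matrix(rep(1/2, 2), ncol = 1)
contrasts(toy$param) <- simple_coding

fit <- lm(y ~ param, data = toy)
cell_means <- tapply(toy$y, toy$param, mean)

# Intercept equals the mean of the cell means
stopifnot(isTRUE(all.equal(unname(coef(fit)[1]), mean(cell_means))))
# Slope equals the Rp - Rn difference in cell means
stopifnot(isTRUE(all.equal(unname(coef(fit)[2]),
                           unname(cell_means["Rp"] - cell_means["Rn"]))))
```

Under the sum coding (`contr.sum(2)`), by contrast, the coefficient would be half of that difference, with Rn coded as +1.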


In [9]:
#NOT separated by pos/neg corr
rhof.lm.0 <- lmer(devz_ab~1 + (1|subject), 
                 data=rho_df, 
                 control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5)))

In [10]:
summary(rhof.lm.0)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: devz_ab ~ 1 + (1 | subject)
   Data: rho_df
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

REML criterion at convergence: -55.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2905 -0.4180  0.1159  0.5979  3.2886 

Random effects:
 Groups   Name        Variance Std.Dev.
 subject  (Intercept) 0.00127  0.03564 
 Residual             0.04187  0.20461 
Number of obs: 200, groups:  subject, 100

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)  -0.1815     0.0149 99.0000  -12.18   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In [11]:
mean(rho_df$devz_ab)
mean(rho_df$devz_ab[rho_df$param=='Rn'])
mean(rho_df$devz_ab[rho_df$param=='Rp'])

## What is the effect of this deviation in the bound scaling domain?
Because a given bound scaling error has different consequences for positive and negative correlations, we can ask whether, given the subjective correlations, bound scaling errors are worse for positive or for negative correlations.

To do so, we log-transform the bound scaling values since they are ratios, and we again flip the signs for the negative correlations so deviations are in the same "direction."

In [12]:
#what about in terms of bound scaling?
rho_df$scale_dev_log <- log10(rho_df$scale_dev)
rho_df$scale_dev_logab <- rho_df$scale_dev_log
rho_df$scale_dev_logab[rho_df$rho < 0] <- -rho_df$scale_dev_logab[rho_df$rho < 0]
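
A brief sketch of why the log transform suits ratios, and of what the sign flip does here. The `toy` values are hypothetical. The direction assumed (ratios above 1 for underestimated negative correlations, below 1 for underestimated positive ones) is read off the first rows of the data shown earlier, not stated by the original notebook.

```r
# On the log scale, scaling by a factor and by its reciprocal are symmetric
stopifnot(isTRUE(all.equal(log10(2), -log10(0.5))))
stopifnot(log10(1) == 0)  # a ratio of 1 (no deviation) maps to 0

# Hypothetical bound scaling ratios for one negative and one positive correlation
toy <- data.frame(rho = c(-0.4, 0.4), scale_dev = c(1.05, 0.95))
toy$scale_dev_log <- log10(toy$scale_dev)

# Flip the sign for negative correlations so deviations share a direction
toy$scale_dev_logab <- ifelse(toy$rho < 0, -toy$scale_dev_log, toy$scale_dev_log)

print(toy)
```

After the flip both toy deviations are negative, so a single intercept can summarize the direction of the error across positive and negative correlations.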

In [13]:
#now let's test for scale factor deviation
rho_sc.lm.0 <- lmer(scale_dev_logab~1 + param.f + (1|subject), 
                 data=rho_df, 
                 control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5)))

boundary (singular) fit: see help('isSingular')



In [14]:
summary(rho_sc.lm.0)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: scale_dev_logab ~ 1 + param.f + (1 | subject)
   Data: rho_df
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

REML criterion at convergence: -750.9

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.6515 -0.5969  0.1464  0.7075  2.2235 

Random effects:
 Groups   Name        Variance Std.Dev. 
 subject  (Intercept) 1.00e-21 3.162e-11
 Residual             1.26e-03 3.549e-02
Number of obs: 200, groups:  subject, 100

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)  -0.037222   0.002510 198.000000 -14.831  < 2e-16 ***
param.f2      0.021378   0.005019 198.000000   4.259 3.17e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
         (Intr)
param.f2 0.000 
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')


### gut checks because of singularity
Because the random-intercept variance was estimated as zero (a boundary fit), we verify the results in two ways:

1) placing a weakly informative prior on the random effects
2) repeating with multiple optimizers

See: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#singular-models-random-effect-variances-estimated-as-zero-or-correlations-estimated-as---1 and the help page for lme4 isSingular().
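
As an additional back-of-the-envelope check (not part of the original analysis): when the estimated subject variance is zero, the mixed model reduces to an ordinary linear model, so the fixed-effect standard errors should be reproducible from the residual SD alone. The sketch below uses the values printed in `summary(rho_sc.lm.0)` and assumes a balanced design with 100 observations per level of `param.f`.

```r
# Values copied from the printed summary of rho_sc.lm.0
resid_sd    <- 0.03549  # residual Std.Dev.
n_total     <- 200      # number of observations
n_per_level <- 100      # assumption: one Rn and one Rp estimate per subject

# With -0.5/0.5 coding and a balanced design, the intercept is the grand mean
se_intercept <- resid_sd / sqrt(n_total)

# The param.f2 coefficient is a difference between two independent group means
se_param <- resid_sd * sqrt(1 / n_per_level + 1 / n_per_level)

round(c(intercept = se_intercept, param.f2 = se_param), 6)
```

These come out at about 0.00251 and 0.00502, matching the standard errors in the printed summary. The match is what one would expect if the singular fit is effectively an ordinary least-squares fit.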

In [15]:
#do we gain any insight into singularity by using a prior on covariance?
library(blme)
rho_sc.lm.b <- blmer(scale_dev_logab~1 + param.f + (1|subject), 
                 data=rho_df, 
                 control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5)))

In [16]:
summary(rho_sc.lm.b)

Cov prior  : subject ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
Prior dev  : 4.0467

Linear mixed model fit by REML ['blmerMod']
Formula: scale_dev_logab ~ 1 + param.f + (1 | subject)
   Data: rho_df
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))

REML criterion at convergence: -749.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.4534 -0.5847  0.1473  0.7025  2.1803 

Random effects:
 Groups   Name        Variance  Std.Dev.
 subject  (Intercept) 8.014e-05 0.008952
 Residual             1.190e-03 0.034494
Number of obs: 200, groups:  subject, 100

Fixed effects:
             Estimate Std. Error t value
(Intercept) -0.037222   0.002598 -14.326
param.f2     0.021378   0.004878   4.382

Correlation of Fixed Effects:
         (Intr)
param.f2 0.000 

In [17]:
rho_sc.lm.all <- allFit(rho_sc.lm.0)

bobyqa : 

boundary (singular) fit: see help('isSingular')



[OK]
Nelder_Mead : 

boundary (singular) fit: see help('isSingular')



[OK]
nlminbwrap : 

boundary (singular) fit: see help('isSingular')



[OK]
nloptwrap.NLOPT_LN_NELDERMEAD : 

boundary (singular) fit: see help('isSingular')



[OK]
nloptwrap.NLOPT_LN_BOBYQA : 

boundary (singular) fit: see help('isSingular')



[OK]


In [18]:
summary(rho_sc.lm.all)

$which.OK
                       bobyqa                   Nelder_Mead 
                         TRUE                          TRUE 
                   nlminbwrap nloptwrap.NLOPT_LN_NELDERMEAD 
                         TRUE                          TRUE 
    nloptwrap.NLOPT_LN_BOBYQA 
                         TRUE 

$msgs
$msgs$bobyqa
[1] "boundary (singular) fit: see help('isSingular')"

$msgs$Nelder_Mead
[1] "boundary (singular) fit: see help('isSingular')"

$msgs$nlminbwrap
[1] "boundary (singular) fit: see help('isSingular')"

$msgs$nloptwrap.NLOPT_LN_NELDERMEAD
[1] "boundary (singular) fit: see help('isSingular')"

$msgs$nloptwrap.NLOPT_LN_BOBYQA
[1] "boundary (singular) fit: see help('isSingular')"


$fixef
                              (Intercept)   param.f2
bobyqa                        -0.03722226 0.02137831
Nelder_Mead                   -0.03722226 0.02137831
nlminbwrap                    -0.03722226 0.02137831
nloptwrap.NLOPT_LN_NELDERMEAD -0.03722226 0.02137831
nloptwrap.NLO

### finally compute sds for rho deviations

In [19]:
rho_sd_rho <- summarise(group_by(rho_df,rho,param),rho_sd=sd(valuez))
rho_sd_rho

summarise(group_by(rho_sd_rho,param),rho_sd_mean=mean(rho_sd))

[1m[22m`summarise()` has grouped output by 'rho'. You can override using the `.groups`
argument.


rho,param,rho_sd
<dbl>,<chr>,<dbl>
-0.8,Rn,0.11073425
-0.6,Rn,0.10754602
-0.4,Rn,0.08345925
-0.2,Rn,0.08785124
0.2,Rp,0.13926349
0.4,Rp,0.17234297
0.6,Rp,0.28840498
0.8,Rp,0.2950518


param,rho_sd_mean
<chr>,<dbl>
Rn,0.09739769
Rp,0.22376581
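
The code above takes the SD of `valuez`, not of `devz`. Within a single level of `rho` the two are interchangeable, because `devz` differs from `valuez` only by the constant `rhoz`. A short sketch with simulated values (hypothetical, for illustration only), which also recomputes the per-parameter means from the printed table:

```r
set.seed(2)

# Simulated subjective z-values at one objective correlation (hypothetical)
rhoz   <- atanh(0.6)
valuez <- rnorm(100, mean = 0.7 * rhoz, sd = 0.2)

# Subtracting a constant leaves the standard deviation unchanged
stopifnot(isTRUE(all.equal(sd(valuez), sd(valuez - rhoz))))

# Per-parameter means recomputed from the SDs printed in the table above
sd_Rn <- c(0.11073425, 0.10754602, 0.08345925, 0.08785124)
sd_Rp <- c(0.13926349, 0.17234297, 0.28840498, 0.29505180)
round(c(Rn = mean(sd_Rn), Rp = mean(sd_Rp)), 8)
```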
