# Accented Prediction Analysis Log/Report
## Preps

In [4]:
### Load Libraries
suppressWarnings(suppressMessages(library(dplyr)))
suppressWarnings(suppressMessages(library(tidyr)))
suppressWarnings(suppressMessages(library(lmSupport)))
suppressWarnings(suppressMessages(library(lmerTest)))

### Load Data
dat1=read.table('accPred_PMN_MeanAmps.txt',header=T)
dat1['comp']="PMN"
dat2=read.table('accPred_N4_MeanAmps.txt',header=T)
dat2['comp']="N4"

## Create new variables for later modelling - string split of condition
dat=rbind(dat1,dat2)
dat['cond']=varRecode(dat$bini,c(1,2,3,4,5),c('Native Expected','Native Unexpected',
                                          'Non-native Expected','Non-native Unexpected','Control'))
#clean up var names, factorise etc
dat=rename(dat,meanAmp=value)
dat=rename(dat,ppt=ERPset)
dat=separate(data=dat, col=cond,into=c('nativeness','expectedness'),sep=' ',remove=FALSE)
dat$nativeness=as.factor(dat$nativeness)
dat$expectedness=as.factor(dat$expectedness)
dat$comp=as.factor(dat$comp)
dat$cond=as.factor(dat$cond)
dat$chindex=as.factor(dat$chindex)

: Too few values at 360 locations: 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 109, 110, 111, 112, 113, 114, 115, 116, ...
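
The warning above is most likely caused by the `Control` label: it contains no space, so `separate()` has only one piece to put into the two new columns, and `expectedness` ends up as `NA` for those rows. A small base-R illustration of the piece counts (the `pieces` name is introduced here for the example):

```r
# Condition labels as defined in the recode step
cond <- c("Native Expected", "Native Unexpected",
          "Non-native Expected", "Non-native Unexpected", "Control")

# Number of space-separated pieces in each label
pieces <- lengths(strsplit(cond, " ", fixed = TRUE))
setNames(pieces, cond)
```

Since the Control rows are filtered out before modelling, the warning should be harmless for the analyses that follow.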

## Analysis
### PMN

In [5]:
##Subset Data
pmn=filter(dat, comp=="PMN")
pmn=filter(pmn,cond!="Control")
pmnROI=filter(pmn,chindex%in%c(12,48,11,47,19,32))

## Centre categorical variables for the linear models (helps convergence)
pmnROI['nat.Cent']=varRecode(pmnROI$nativeness, c('Native','Non-native'),c(.5,-.5))
pmnROI['exp.Cent']=varRecode(pmnROI$expectedness, c('Expected','Unexpected'),c(.5,-.5))
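
One thing worth checking at this step: the coefficient names in the model summaries (`nat.Cent0.5`, `exp.Cent0.5`) suggest that the recoded columns ended up as factors, in which case `lmer` would apply its default treatment coding rather than a centred ±0.5 contrast. A minimal sketch of a numeric recode and the corresponding check in base R, on a small made-up data frame (`toy` and its contents are illustrative, not the real data):

```r
# Toy stand-in for the ROI data (hypothetical rows)
toy <- data.frame(
  nativeness   = factor(c("Native", "Native", "Non-native", "Non-native")),
  expectedness = factor(c("Expected", "Unexpected", "Expected", "Unexpected"))
)

# Numeric +/-0.5 contrasts; with balanced cells each column has mean 0
toy$nat.Cent <- ifelse(toy$nativeness == "Native", 0.5, -0.5)
toy$exp.Cent <- ifelse(toy$expectedness == "Expected", 0.5, -0.5)

# Both columns should be numeric, not factor, before going into lmer()
str(toy[c("nat.Cent", "exp.Cent")])
colMeans(toy[c("nat.Cent", "exp.Cent")])
```

With numeric predictors the fixed effects would be labelled `nat.Cent` and `exp.Cent`, and each main effect would be estimated at the average of the other factor rather than at its reference level.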

Several models are built for the data and compared using information criteria to choose the one that fits best. The models go from simplest to most complex.
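
For reference, the most complex of these models, with by-participant random slopes for both main effects, can be written as follows. The symbols are notation introduced here, with $i$ indexing observations and $j$ participants, and $\text{nat}$ and $\text{exp}$ standing for the two coded predictors:

$$
\text{meanAmp}_{ij} = \beta_0 + \beta_1\,\text{nat}_{ij} + \beta_2\,\text{exp}_{ij} + \beta_3\,\text{nat}_{ij}\,\text{exp}_{ij} + u_{0j} + u_{1j}\,\text{nat}_{ij} + u_{2j}\,\text{exp}_{ij} + \varepsilon_{ij}
$$

with $(u_{0j}, u_{1j}, u_{2j})^\top \sim \mathcal{N}(\mathbf{0}, \Sigma)$ and $\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$. The simpler models drop $u_{1j}$, $u_{2j}$, or both, while keeping the same fixed effects.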

In [6]:
pm1=lmer(meanAmp~nat.Cent*exp.Cent+(1|ppt),data=pmnROI,REML=F)
pm2=lmer(meanAmp~nat.Cent*exp.Cent+(1+exp.Cent|ppt),data=pmnROI,REML=F)
pm3=lmer(meanAmp~nat.Cent*exp.Cent+(1+nat.Cent|ppt),data=pmnROI,REML=F)
pm4=lmer(meanAmp~nat.Cent*exp.Cent+(1+nat.Cent+exp.Cent|ppt),data=pmnROI,REML=F)
anova(pm1,pm2,pm3,pm4)

|  | Df | AIC | BIC | logLik | deviance | Chisq | Chi Df | Pr(>Chisq) |
|---|---|---|---|---|---|---|---|---|
| object | 6 | 1189.125 | 1212.442 | -588.5626 | 1177.125 |  |  |  |
| ..1 | 8 | 1159.421 | 1190.51 | -571.7104 | 1143.421 | 33.70422 | 2.0 | 4.79979e-08 |
| ..2 | 8 | 1134.55 | 1165.638 | -559.2748 | 1118.55 | 24.87136 | 0.0 | 0.0 |
| ..3 | 11 | 1083.368 | 1126.115 | -530.6838 | 1061.368 | 57.18198 | 3.0 | 2.349987e-12 |


The model allowing by-participant slopes for the main effects of expectedness and nativeness (`pm4`) fits the data best: it has the lowest AIC (1083.4) and BIC (1126.1).
This makes a lot of sense: individual participants are likely to differ in how they "react" to nativeness and expectedness, and that variability would muddy the data if not properly modelled. Slopes for both main effects *should* therefore be included.
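
As a sanity check on the comparison table, the criteria can be recomputed by hand from the reported log-likelihoods and parameter counts. The sketch below does this in base R for the first two models, with the values typed in from the table and n = 360 taken from the number of observations in the model summary. Note that the two 8-parameter models are not nested in each other, so the table row with 0 Chi Df is not a meaningful likelihood-ratio test; for that pair AIC and BIC are the relevant comparison.

```r
# Values copied from the first two rows of the anova() output
k  <- c(6, 8)                    # number of parameters (Df)
ll <- c(-588.5626, -571.7104)    # log-likelihoods
n  <- 360                        # number of observations

aic <- 2 * k - 2 * ll            # AIC = 2k - 2 logLik
bic <- log(n) * k - 2 * ll       # BIC = k log(n) - 2 logLik

# Likelihood-ratio test of the second model against the first
chisq <- 2 * (ll[2] - ll[1])
p     <- pchisq(chisq, df = k[2] - k[1], lower.tail = FALSE)

round(aic, 3)
round(bic, 3)
c(chisq = chisq, p = p)
```

Small differences in the last decimals relative to the table are expected, because the log-likelihoods used here are already rounded.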

In [7]:
summary(pm4)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
  approximations to degrees of freedom [lmerMod]
Formula: meanAmp ~ nat.Cent * exp.Cent + (1 + nat.Cent + exp.Cent | ppt)
   Data: pmnROI

     AIC      BIC   logLik deviance df.resid 
  1083.4   1126.1   -530.7   1061.4      349 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.5534 -0.6483 -0.0306  0.5641  3.2836 

Random effects:
 Groups   Name        Variance Std.Dev. Corr       
 ppt      (Intercept) 0.9435   0.9713              
          nat.Cent0.5 1.2896   1.1356   -0.85      
          exp.Cent0.5 0.8795   0.9378   -0.53  0.48
 Residual             0.8629   0.9289              
Number of obs: 360, groups:  ppt, 15

Fixed effects:
                         Estimate Std. Error        df t value Pr(>|t|)  
(Intercept)              -0.07449    0.26923  16.04000  -0.277   0.7856  
nat.Cent0.5               0.33286    0.32426  18.15000   1.027   0.3181  
exp.Cent0.5               0.53350    0.27894

### PMN Interp
- The main effect of *expectedness* is barely significant (probably because part of it is absorbed by the interaction).
    - Not so surprising: expectedness, independently of nativeness, *should* modulate the brain's computational demands.
- The interaction between nativeness and expectedness means that the magnitude of the expectedness effect was modulated by the nativeness of the speaker: the expectedness effect was **stronger** for a native speaker than for a non-native one. This can be understood as absent or weaker predictions when hearing non-native speech. The interpretation makes sense: because the non-native signal is somewhat less reliable, strong predictions would more often have to be overruled, and it is probably more costly to predict, cancel that prediction and then compute the actual signal than just to compute the actual signal.
- **The interaction needs to be unpacked** (run separate lmes)
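
One way to unpack the interaction is to fit the expectedness effect separately within each nativeness level. The sketch below assumes `lme4` is installed and runs on simulated data with the same layout as `pmnROI` (15 participants, 4 conditions, 6 channels); the `toy` data frame, the effect sizes and the `fits` name are made up for illustration, and the random-effects structure is reduced to random intercepts to keep the example small.

```r
library(lme4)

set.seed(1)

# Toy stand-in with the same layout as pmnROI:
# 15 participants x 2 x 2 conditions x 6 channels = 360 rows (simulated)
toy <- expand.grid(
  ppt          = factor(1:15),
  nativeness   = c("Native", "Non-native"),
  expectedness = c("Expected", "Unexpected"),
  chindex      = factor(c(12, 48, 11, 47, 19, 32))
)
toy$exp.Cent <- ifelse(toy$expectedness == "Expected", 0.5, -0.5)

# Simulate a larger expectedness effect for native speech (arbitrary sizes)
slope       <- ifelse(toy$nativeness == "Native", 1.0, 0.2)
ppt_shift   <- rnorm(15)[as.integer(toy$ppt)]
toy$meanAmp <- ppt_shift + slope * toy$exp.Cent + rnorm(nrow(toy))

# One model per nativeness level: expectedness effect within each level
fits <- lapply(split(toy, toy$nativeness), function(d) {
  lmer(meanAmp ~ exp.Cent + (1 | ppt), data = d, REML = FALSE)
})

# Simple effect of expectedness for native and non-native speech
sapply(fits, function(m) fixef(m)[["exp.Cent"]])
```

On the real data the same pattern would be applied to `pmnROI`, probably keeping a by-participant slope for expectedness in each sub-model to stay close to the full model.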

### N4 window

In [8]:
n4=filter(dat, comp=="N4")
n4=filter(n4,cond!="Control")
n4ROI=filter(n4,chindex%in%c(12,48,11,47,19,32))

n4ROI['nat.Cent']=varRecode(n4ROI$nativeness, c('Native','Non-native'),c(.5,-.5))
n4ROI['exp.Cent']=varRecode(n4ROI$expectedness, c('Expected','Unexpected'),c(.5,-.5))

In [9]:
n4m1=lmer(meanAmp~nat.Cent*exp.Cent+(1|ppt),data=n4ROI,REML=F)
n4m2=lmer(meanAmp~nat.Cent*exp.Cent+(1+exp.Cent|ppt),data=n4ROI,REML=F)
n4m3=lmer(meanAmp~nat.Cent*exp.Cent+(1+nat.Cent|ppt),data=n4ROI,REML=F)
n4m4=lmer(meanAmp~nat.Cent*exp.Cent+(1+nat.Cent+exp.Cent|ppt),data=n4ROI,REML=F)

In [10]:
anova(n4m1,n4m2,n4m3,n4m4)

|  | Df | AIC | BIC | logLik | deviance | Chisq | Chi Df | Pr(>Chisq) |
|---|---|---|---|---|---|---|---|---|
| object | 6 | 1340.798 | 1364.114 | -664.3989 | 1328.798 |  |  |  |
| ..1 | 8 | 1310.406 | 1341.495 | -647.2031 | 1294.406 | 34.39161 | 2.0 | 3.403743e-08 |
| ..2 | 8 | 1317.485 | 1348.574 | -650.7425 | 1301.485 | 0.0 | 0.0 | 1.0 |
| ..3 | 11 | 1278.14 | 1320.887 | -628.0701 | 1256.14 | 45.34482 | 3.0 | 7.81592e-10 |


Again, the model with by-participant slopes for both main effects seems to fit the data best. We will therefore go with it here as well, since it parallels the PMN model and allows for a more direct comparison.

In [11]:
summary(n4m4)

Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
  approximations to degrees of freedom [lmerMod]
Formula: meanAmp ~ nat.Cent * exp.Cent + (1 + nat.Cent + exp.Cent | ppt)
   Data: n4ROI

     AIC      BIC   logLik deviance df.resid 
  1278.1   1320.9   -628.1   1256.1      349 

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.83375 -0.58329 -0.02255  0.68161  2.53709 

Random effects:
 Groups   Name        Variance Std.Dev. Corr       
 ppt      (Intercept) 1.490    1.221               
          nat.Cent0.5 1.147    1.071    -0.26      
          exp.Cent0.5 1.332    1.154    -0.43 -0.09
 Residual             1.461    1.209               
Number of obs: 360, groups:  ppt, 15

Fixed effects:
                        Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)              -1.6027     0.3399  16.1100  -4.715  0.00023 ***
nat.Cent0.5               0.3314     0.3300  20.6800   1.004  0.32690    
exp.Cent0.5               0.7315   

### N4 Interp
- **main effect of expectedness** alone.
In this time window there was more work only when the item was unexpected. Both conditions are significant. This suggests that at this point in time the brain works to interpret the word that was heard and to reconcile it with the expectation it did not match.

In the non-native context, the violation is probably not processed as a failed expectation but rather when the word actually comes up.

### Overall message with the two effects
The fact that the N4 time window is not affected by nativeness also sheds some light on the timing of the effect. In all cases the brain has to process the incongruence with the context, but in one case it is predicting strongly, while in the other it is not really predicting (at least not as strongly) and deals with the mismatch later (a more passive mode).

This bears on the theory of the brain as a passive vs. predictive machine: **again**, things may not be black and white, and context strongly changes how the brain generates its predictions and how it adapts to the computational needs of a situation. Great stuff, hey?