# CVC Analysis: Mixed Models and Multiple Regression

This notebook fits mixed and linear models for the CVC neighborhood measures in separate models (i.e., one model per neighborhood measure, each crossed with age). It writes each model's coefficient table and the confidence intervals of its estimates to files for further processing in a Python notebook.
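Because every mixed model in this notebook has the same shape (one neighborhood measure crossed with `age`, plus a random intercept per `phonological` item), the per-measure cells can be summarized as a loop. The following is a minimal sketch rather than the notebook's actual code: the helper names `build_formula`, `table_paths`, and `fit_and_write` are hypothetical, and it assumes that `cvc` has been loaded, that `lme4` and `lmerTest` are installed, and that the `cvc_tables/` directory exists.

```r
# Neighborhood measures modeled in this notebook (column names in `cvc`).
measures <- c("SOND", "PLD20", "PFEAT20", "ND")

# Hypothetical helper: build the model formula for one measure.
build_formula <- function(measure) {
  as.formula(sprintf("PACT ~ age * %s + (1 | phonological)", measure))
}

# Hypothetical helper: output paths that follow the file names used in the
# write.csv() cells of this notebook.
table_paths <- function(measure, outdir = "cvc_tables") {
  stem <- file.path(outdir, paste0("cvc_mixed_", tolower(measure)))
  c(coefficients = paste0(stem, ".cvc"),
    confint      = paste0(stem, "_confint.csv"))
}

# Hypothetical helper: fit one model and write its two tables.
fit_and_write <- function(measure, data) {
  fit   <- lmerTest::lmer(build_formula(measure), data = data)
  paths <- table_paths(measure)
  write.csv(summary(fit)$coefficients, paths[["coefficients"]])
  write.csv(confint(fit), paths[["confint"]])  # profile intervals (default)
  fit
}

# Usage (assumes `cvc` is loaded and cvc_tables/ exists):
# fits <- lapply(setNames(measures, measures), fit_and_write, data = cvc)
```

Keeping the formula and file-name logic in small functions makes it easy to check them without refitting any model.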

## Data Loading

In [1]:
# Requires lme4 and lmerTest to be installed already (memisc is optional and commented out below); use install.packages("<name>") as needed
setwd("~/Dropbox/experiments/python/current_projects/jcl_multisyllabic_neighborhoods_2021/tables_and_figures/main_article/")
library("lme4")
library("lmerTest")
#library("memisc")

ERROR: Error in setwd("~/Dropbox/experiments/python/current_projects/jcl_multisyllabic_neighborhoods_2021/tables_and_figures/main_article/"): cannot change working directory


Load the CVC data.

In [2]:
# Directory holding the combined validation data
datadir_valid <- "~/Dropbox/experiments/python/current_projects/jcl_multisyllabic_neighborhoods_2021/data/validation/combined/"
cvc <- read.csv(paste0(datadir_valid, "cvc_mixed_nomelt.csv"))

# CVC Mixed Models

## SOND

In [3]:
cvc.mixed.sond <- lmer(PACT ~ age * SOND + (1|phonological),data = cvc)

In [4]:
summary(cvc.mixed.sond)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: PACT ~ age * SOND + (1 | phonological)
   Data: cvc

REML criterion at convergence: 2288.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.0602 -0.4389  0.0184  0.4716  3.4821 

Random effects:
 Groups       Name        Variance Std.Dev.
 phonological (Intercept) 0.276    0.5254  
 Residual                 0.189    0.4348  
Number of obs: 1262, groups:  phonological, 539

Fixed effects:
                Estimate Std. Error         df t value Pr(>|t|)   
(Intercept)   -1.148e-01  5.395e-02  1.164e+03  -2.128  0.03354 * 
agesix         1.721e-01  5.369e-02  8.085e+02   3.205  0.00141 **
agethree       1.090e-01  5.327e-02  7.816e+02   2.047  0.04099 * 
SOND           1.952e-02  7.879e-03  1.225e+03   2.478  0.01336 * 
agesix:SOND   -1.839e-02  7.311e-03  8.680e+02  -2.515  0.01207 * 
agethree:SOND -1.148e-02  8.007e-03  7.810e+02  -1.434  0.15193   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
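The coefficient names (`agesix`, `agethree`) are consistent with R's default treatment coding for the `age` factor. Under that coding, the `SOND` row is the SOND slope at the reference age level, and each `age:SOND` row is the difference from that slope, so the per-age slopes are recovered by addition. The sketch below hard-codes the estimates printed above; treatment coding is an assumption inferred from the coefficient names, and the reference level is labeled `reference` because its name does not appear in the output.

```r
# Fixed-effect estimates copied from the SOND model summary above.
coefs <- c(
  "SOND"          =  1.952e-02,
  "agesix:SOND"   = -1.839e-02,
  "agethree:SOND" = -1.148e-02
)

# Per-age SOND slopes, assuming treatment (dummy) coding of `age`:
# slope at a non-reference level = reference slope + interaction term.
sond_slopes <- c(
  reference = coefs[["SOND"]],
  six       = coefs[["SOND"]] + coefs[["agesix:SOND"]],
  three     = coefs[["SOND"]] + coefs[["agethree:SOND"]]
)

print(sond_slopes)
```

These are point estimates only; standard errors for the non-reference slopes depend on the covariance between coefficients, which the summary table alone does not provide.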

In [5]:
anova(cvc.mixed.sond)

              Sum Sq    Mean Sq  NumDF     DenDF   F value      Pr(>F)
age        1.9780808  0.9890404      2  804.5955  5.232153  0.00552528
SOND       0.5533219  0.5533219      1  827.9177  2.927145  0.08747597
age:SOND   1.1961289  0.5980645      2  840.8101  3.163839  0.04276678


In [49]:
confint(cvc.mixed.sond)

Computing profile confidence intervals ...


                       2.5 %        97.5 %
.sig01           0.484133496   0.568036105
.sigma           0.412404166   0.456609587
(Intercept)     -0.220464819  -0.009244458
agesix           0.06689185    0.277121545
agethree         0.004808537   0.213312292
SOND             0.004094482   0.034939081
agesix:SOND     -0.032698112  -0.004084807
agethree:SOND   -0.027154775   0.004186053
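`confint()` on an `lmer` fit computes profile-likelihood intervals by default, as the "Computing profile confidence intervals ..." message indicates. As a quick sanity check, the fixed-effect rows can be compared with approximate Wald intervals (estimate ± z × standard error) built from the summary table. The sketch below hard-codes two rows from the SOND model output; the variable names are illustrative, and the Wald interval is only an approximation, not the method used to produce the table above.

```r
# Estimates and standard errors copied from the SOND model summary.
est <- c("SOND" = 1.952e-02, "agesix:SOND" = -1.839e-02)
se  <- c("SOND" = 7.879e-03, "agesix:SOND" =  7.311e-03)

# Approximate 95% Wald intervals: estimate +/- z * SE.
z    <- qnorm(0.975)
wald <- cbind(lower = est - z * se, upper = est + z * se)

# Profile intervals copied from the confint() output for the same rows.
profile_ci <- rbind(
  "SOND"        = c(lower =  0.004094482, upper =  0.034939081),
  "agesix:SOND" = c(lower = -0.032698112, upper = -0.004084807)
)

# Largest absolute difference between the two sets of bounds.
max_gap <- max(abs(wald - profile_ci))

print(round(wald, 5))
print(max_gap)

# With the fitted model available, lme4 can compute Wald intervals directly:
# confint(cvc.mixed.sond, method = "Wald")
```

For these two rows the Wald and profile bounds agree closely, which is expected when the likelihood is nearly quadratic around the estimates.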


In [50]:
write.csv( summary(cvc.mixed.sond)$coefficients,"cvc_tables/cvc_mixed_sond.cvc")

In [51]:
write.csv(confint(cvc.mixed.sond),"cvc_tables/cvc_mixed_sond_confint.csv")

Computing profile confidence intervals ...



## PLD20

In [56]:
cvc.mixed.pld20 <- lmer(PACT ~ age * PLD20 + (1|phonological),data = cvc)

In [57]:
summary(cvc.mixed.pld20)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: PACT ~ age * PLD20 + (1 | phonological)
   Data: cvc

REML criterion at convergence: 2267.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.0545 -0.4501  0.0313  0.4749  3.5936 

Random effects:
 Groups       Name        Variance Std.Dev.
 phonological (Intercept) 0.2701   0.5197  
 Residual                 0.1904   0.4364  
Number of obs: 1262, groups:  phonological, 539

Fixed effects:
                 Estimate Std. Error         df t value Pr(>|t|)  
(Intercept)       0.55424    0.26010 1218.79214   2.131   0.0333 *
agesix           -0.04620    0.24505  829.70160  -0.189   0.8505  
agethree          0.25601    0.27696  776.11419   0.924   0.3556  
PLD20            -0.35000    0.16105 1223.45752  -2.173   0.0300 *
agesix:PLD20      0.06068    0.15429  821.12611   0.393   0.6942  
agethree:PLD20   -0.12757    0.17113  777.30831  -0.745   0.4562  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In [58]:
write.csv( summary(cvc.mixed.pld20)$coefficients,"cvc_tables/cvc_mixed_pld20.cvc")

In [59]:
write.csv(confint(cvc.mixed.pld20),"cvc_tables/cvc_mixed_pld20_confint.csv")

Computing profile confidence intervals ...


## PFEAT20

In [60]:
cvc.mixed.pfeat20 <- lmer(PACT ~ age * PFEAT20 + (1|phonological),data = cvc)

In [61]:
summary(cvc.mixed.pfeat20)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: PACT ~ age * PFEAT20 + (1 | phonological)
   Data: cvc

REML criterion at convergence: 2270.9

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.0570 -0.4492  0.0293  0.4900  3.6025 

Random effects:
 Groups       Name        Variance Std.Dev.
 phonological (Intercept) 0.2731   0.5226  
 Residual                 0.1904   0.4364  
Number of obs: 1262, groups:  phonological, 539

Fixed effects:
                   Estimate Std. Error         df t value Pr(>|t|)  
(Intercept)         0.28516    0.16986 1146.30245   1.679   0.0935 .
agesix              0.04817    0.16834  817.19077   0.286   0.7748  
agethree            0.08928    0.17146  778.29561   0.521   0.6027  
PFEAT20            -0.31573    0.18000 1158.06750  -1.754   0.0797 .
agesix:PFEAT20      0.01320    0.18313  823.78343   0.072   0.9425  
agethree:PFEAT20   -0.04659    0.18196  778.97356  -0.256   0.7980  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In [62]:
write.csv( summary(cvc.mixed.pfeat20)$coefficients,"cvc_tables/cvc_mixed_pfeat20.cvc")

In [63]:
write.csv(confint(cvc.mixed.pfeat20),"cvc_tables/cvc_mixed_pfeat20_confint.csv")

Computing profile confidence intervals ...


## ND

In [64]:
cvc.mixed.nd <- lmer(PACT ~ age * ND + (1|phonological),data = cvc)

In [65]:
summary(cvc.mixed.nd)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: PACT ~ age * ND + (1 | phonological)
   Data: cvc

REML criterion at convergence: 2285.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.0524 -0.4491  0.0318  0.4747  3.5945 

Random effects:
 Groups       Name        Variance Std.Dev.
 phonological (Intercept) 0.2703   0.5199  
 Residual                 0.1904   0.4364  
Number of obs: 1262, groups:  phonological, 539

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)  
(Intercept) -1.437e-01  7.154e-02  1.203e+03  -2.009   0.0448 *
agesix       7.523e-02  7.295e-02  8.168e+02   1.031   0.3027  
agethree     4.393e-03  7.425e-02  7.830e+02   0.059   0.9528  
ND           1.725e-02  8.020e-03  1.222e+03   2.151   0.0317 *
agesix:ND   -3.003e-03  7.671e-03  8.200e+02  -0.391   0.6955  
agethree:ND  5.935e-03  8.513e-03  7.766e+02   0.697   0.4859  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In [66]:
write.csv( summary(cvc.mixed.nd)$coefficients,"cvc_tables/cvc_mixed_nd.cvc")

In [67]:
write.csv(confint(cvc.mixed.nd),"cvc_tables/cvc_mixed_nd_confint.csv")

Computing profile confidence intervals ...
