# Does your dorm matter for your well-being?

We fit three models to predict spring well-being:
1. From fall well-being alone.
1. From fall well-being, demographic items (age, family income, financial aid, parental education, race, gender), and ambient empathy.
1. From the same predictors as model 2, plus a random intercept for each dorm.

# Results:
- Demographics and ambient empathy do not significantly improve the model (F-test, p ≈ 0.26).
- The random-intercept model does not improve fit (likelihood-ratio test, p = 1), and the estimated dorm-level variance is zero.


## Configuration

In [1]:
DATA_FILE = 'data/postprocessed/final_for_analysis_R.csv'

IMPUTE_MISSING = TRUE
INCLUDE_FALL_WB_AS_PREDICTOR = TRUE
INCLUDE_DEMOS_AS_PREDICTOR = TRUE
# DV = 'Wellbeing_fall'
DV = 'Wellbeing_spring'

if (INCLUDE_FALL_WB_AS_PREDICTOR) {
    stopifnot(DV == 'Wellbeing_spring')
}

## Import and load

In [2]:
library(car)
library(plyr)
library(tidyverse)
library(hexbin)
library(mice)
library(nlme)
library(lme4)
library(lmerTest)

options(width=200)

Loading required package: carData

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

✔ ggplot2 3.2.1     ✔ purrr   0.3.3
✔ tibble  2.1.3     ✔ dplyr   0.8.3
✔ tidyr   1.0.0     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.4.0

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::arrange()   masks plyr::arrange()
✖ purrr::compact()   masks plyr::compact()
✖ dplyr::count()     masks plyr::count()
✖ dplyr::failwith()  masks plyr::failwith()
✖ dplyr::filter()    masks stats::filter()
✖ dplyr::id()        masks ply

In [3]:
df = read.csv(DATA_FILE, na.strings=c("", " ", "NA"), row.names=1)
keep_cols = c(
    'NID', 'Age', 'ParentEducationMax',
    'FinclAid', 'FmlyIncome', 'Gender', 'Race',
    'Ambient_empathy',
    'Wellbeing_fall', 'Wellbeing_spring')
for (name in names(df)) {
    if (endsWith(name, '_dorm')) {
        keep_cols = c(keep_cols, name)
    }
}
df = df[,keep_cols]
dim(df)
head(df)

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,Wellbeing_fall_dorm,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
vgxlTMkQs5,7,18,4.0,0,87500.0,M,white,-0.7154534,-2.06354788,-0.76535414,0.16596155,3.25,131052.6,4.340909,4.863636,5.477273,2.598485
M9obKkDvc0,11,18,3.5,1,,F,south_asian,-0.8199099,-0.01143413,-0.04997158,0.15868155,3.142857,107142.9,3.642857,4.642857,5.428571,2.642857
RdS4vMvQjo,9,18,4.0,1,125000.0,M,white,-0.8994971,0.919656,0.66541099,-0.30092105,3.3,124375.0,2.1,4.9,6.2,2.6
n08loMfJH7,4,18,4.0,0,200000.0,F,east_asian,,0.65342017,0.48656535,-0.06457484,3.316667,115434.8,3.177419,4.951613,4.83871,2.639785
8rsekwqjFy,5,18,2.5,1,125000.0,M,south_asian,-0.4873343,0.6983916,-0.04997158,0.01290587,3.40625,119230.8,3.5625,5.15625,5.125,2.789583
FjTWohEryS,13,18,4.0,1,45000.0,F,east_asian,,0.04290417,-0.1393944,-0.32241874,3.157895,108815.8,3.263158,4.5,5.210526,2.5


In [4]:
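if (IMPUTE_MISSING) {
    print("Imputing missing values")
    # Seed the imputation so the filled-in values (and every later fit) are reproducible.
    # The seed value is arbitrary; the run shown below was unseeded.
    imp = mice(df, seed=1)
    # complete() returns only the first of the imputed datasets (single imputation).
    df = complete(imp)
    head(df)
} else {
    # Listwise deletion: drop every row with any missing value.
    df = na.omit(df)
}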

[1] "Imputing missing values"

 iter imp variable
  1   1  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  1   2  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  1   3  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  1   4  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  1   5  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  2   1  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  2   2  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  2   3  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  2   4  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  2   5  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  3   1  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  3   2  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
  3   3  ParentEducationMax  FinclAid  FmlyIncome  Race  Ambient_empathy
 

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,Wellbeing_fall_dorm,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,7,18,4.0,0,87500,M,white,-0.7154534,-2.06354788,-0.76535414,0.16596155,3.25,131052.6,4.340909,4.863636,5.477273,2.598485
2,11,18,3.5,1,62500,F,south_asian,-0.8199099,-0.01143413,-0.04997158,0.15868155,3.142857,107142.9,3.642857,4.642857,5.428571,2.642857
3,9,18,4.0,1,125000,M,white,-0.8994971,0.919656,0.66541099,-0.30092105,3.3,124375.0,2.1,4.9,6.2,2.6
4,4,18,4.0,0,200000,F,east_asian,-0.9155707,0.65342017,0.48656535,-0.06457484,3.316667,115434.8,3.177419,4.951613,4.83871,2.639785
5,5,18,2.5,1,125000,M,south_asian,-0.4873343,0.6983916,-0.04997158,0.01290587,3.40625,119230.8,3.5625,5.15625,5.125,2.789583
6,13,18,4.0,1,45000,F,east_asian,-1.2311373,0.04290417,-0.1393944,-0.32241874,3.157895,108815.8,3.263158,4.5,5.210526,2.5
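
The imputation cell keeps only the first completed dataset, so later standard errors do not reflect imputation uncertainty. A minimal sketch of one alternative is to fit the model on every imputed dataset and pool the estimates with `with()` and `pool()` from `mice`. It runs on `nhanes`, the small example dataset shipped with `mice`, not on this study's data, and the seed and model formula are assumptions chosen for illustration.

```r
library(mice)

# nhanes is a small example dataset bundled with mice (not the dorm data).
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)

# Fit the same linear model on each of the m completed datasets.
fits <- with(imp, lm(bmi ~ age + chl))

# Combine the m sets of estimates using Rubin's rules.
pooled <- pool(fits)
pooled_summary <- summary(pooled)
print(pooled_summary)
```

The same pattern should carry over to the models in this notebook by swapping in the real data frame and formula, at the cost of working with a pooled object rather than a single `lm` fit.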


## Quick summary of whole-dorm well-beings

In [5]:
df %>% group_by(NID) %>%
    summarize(wb_fall = mean(Wellbeing_fall),
              wb_spring = mean(Wellbeing_spring))

NID,wb_fall,wb_spring
<dbl>,<dbl>,<dbl>
1,0.17958135,0.43067609
2,0.10039586,0.23049091
4,-0.0421375,0.00871215
5,0.05322856,0.01841058
7,0.06902635,0.1716415
8,-0.15094904,-0.45876161
9,-0.09749154,-0.34804765
10,-0.35117851,-0.30147326
11,0.14734051,0.15272015
13,-0.30415259,-0.49261454


## Standard regression models (not mixed)

### Base model, minimal predictors

In [6]:
equation = paste(DV, ' ~  1')
if (INCLUDE_FALL_WB_AS_PREDICTOR) {
    equation = paste(equation, ' + Wellbeing_fall')
}
print(equation)
model1 = lm(as.formula(equation), df)
summary(model1)

[1] "Wellbeing_spring  ~  1  + Wellbeing_fall"



Call:
lm(formula = as.formula(equation), data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.69297 -0.48662  0.06646  0.49158  2.76843 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)    -3.286e-16  5.373e-02    0.00        1    
Wellbeing_fall  6.433e-01  5.387e-02   11.94   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7675 on 202 degrees of freedom
Multiple R-squared:  0.4139,	Adjusted R-squared:  0.411 
F-statistic: 142.6 on 1 and 202 DF,  p-value: < 2.2e-16


### Add demographic covariates

In [7]:
names(df)

In [8]:
if (INCLUDE_DEMOS_AS_PREDICTOR) {
    equation = paste(equation, '+ Age + ParentEducationMax + FinclAid + FmlyIncome + Gender + Race + Ambient_empathy')
    print(equation)
    model2 = lm(as.formula(equation), df)
    summary(model2)
} else {
    model2 = model1
}

[1] "Wellbeing_spring  ~  1  + Wellbeing_fall + Age + ParentEducationMax + FinclAid + FmlyIncome + Gender + Race + Ambient_empathy"



Call:
lm(formula = as.formula(equation), data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.48098 -0.42960  0.04799  0.50004  2.50902 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -3.539e-01  1.062e+00  -0.333    0.739    
Wellbeing_fall      6.154e-01  5.628e-02  10.935   <2e-16 ***
Age                -5.463e-03  5.308e-02  -0.103    0.918    
ParentEducationMax  4.548e-02  9.819e-02   0.463    0.644    
FinclAid            5.063e-03  1.429e-01   0.035    0.972    
FmlyIncome          1.900e-06  1.079e-06   1.760    0.080 .  
GenderM             1.731e-01  1.145e-01   1.512    0.132    
Genderother         3.470e-02  3.641e-01   0.095    0.924    
Raceeast_asian      6.184e-02  2.275e-01   0.272    0.786    
Racehispanic        2.715e-01  2.820e-01   0.963    0.337    
Raceother_or_mixed -5.099e-02  2.277e-01  -0.224    0.823    
Racesouth_asian    -2.108e-01  3.005e-01  -0.701    0.484    
Racewhite           1.

### Is this a significant improvement? (No)

In [9]:
anova(model1, model2)

,Res.Df,RSS,Df,Sum of Sq,F,Pr(>F)
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,202,118.9815,,,,
2,190,110.3275,12.0,8.653916,1.241941,0.2570723
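
As a check on the table above, the F statistic for nested linear models can be recomputed by hand from the two residual sums of squares. The numbers below are copied from the `anova(model1, model2)` output; the variable names are only for illustration.

```r
# Values copied from the anova(model1, model2) table.
rss1 <- 118.9815; res_df1 <- 202   # base model (fall well-being only)
rss2 <- 110.3275; res_df2 <- 190   # model with demographics and ambient empathy

# Number of extra parameters in the larger model.
df_diff <- res_df1 - res_df2

# F = (drop in RSS per extra parameter) / (residual variance of the larger model).
f_stat <- ((rss1 - rss2) / df_diff) / (rss2 / res_df2)
p_value <- pf(f_stat, df_diff, res_df2, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```

Up to rounding of the copied values, this reproduces the F and Pr(>F) columns: twelve extra parameters reduce the residual sum of squares by less than would be needed for a significant improvement.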


## Mixed effect models

### REML model to accurately determine variance apportioned to dorm (zero)

In [10]:
model3 = lmer(as.formula(paste(equation, '+ (1|NID)')), data=df, REML=TRUE)
summary(model3)

“Some predictor variables are on very different scales: consider rescaling”
boundary (singular) fit: see ?isSingular

“Some predictor variables are on very different scales: consider rescaling”

Correlation matrix not shown by default, as p = 14 > 12.
Use print(obj, correlation=TRUE)  or
    vcov(obj)        if you need it




Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(paste(equation, "+ (1|NID)"))
   Data: df

REML criterion at convergence: 511.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2558 -0.5638  0.0630  0.6562  3.2926 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.0000   0.000   
 Residual             0.5807   0.762   
Number of obs: 204, groups:  NID, 11

Fixed effects:
                     Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)        -3.539e-01  1.062e+00  1.900e+02  -0.333    0.739    
Wellbeing_fall      6.154e-01  5.628e-02  1.900e+02  10.935   <2e-16 ***
Age                -5.463e-03  5.308e-02  1.900e+02  -0.103    0.918    
ParentEducationMax  4.548e-02  9.819e-02  1.900e+02   0.463    0.644    
FinclAid            5.063e-03  1.429e-01  1.900e+02   0.035    0.972    
FmlyIncome          1.900e-06  1.079e-06  1.900e+02   1.760    0.080 .  
GenderM   
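
A dorm variance of exactly zero, together with the `boundary (singular) fit` message, is what one would expect when group means differ no more than individual-level noise predicts. A rough sketch of that logic follows, using the classical one-way ANOVA (method-of-moments) estimator in base R on simulated data with no true group effect. The group sizes and seed are assumptions for illustration, and this is not the algorithm `lmer` uses.

```r
# Simulated data: 11 groups of 18, with NO true group effect (illustrative assumption).
set.seed(42)
n_groups <- 11
n_per <- 18
sim <- data.frame(
  group = factor(rep(seq_len(n_groups), each = n_per)),
  y = rnorm(n_groups * n_per)
)

# One-way ANOVA table: between-group and within-group mean squares.
aov_tab <- anova(lm(y ~ group, data = sim))
ms_between <- aov_tab["group", "Mean Sq"]
ms_within <- aov_tab["Residuals", "Mean Sq"]

# Method-of-moments estimate of the between-group variance; it can be negative,
# in which case it is truncated at the boundary value of zero.
var_between_raw <- (ms_between - ms_within) / n_per
var_between <- max(0, var_between_raw)

# Intraclass correlation: share of total variance attributed to groups.
icc <- var_between / (var_between + ms_within)
print(c(raw = var_between_raw, truncated = var_between, icc = icc))
```

Whenever the between-group mean square falls below the within-group mean square, the raw estimate is negative and the reported variance sits at zero, which is the situation a singular fit signals.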

### REML=false model to maximize predictive value. Is this a significant improvement over the non-mixed model? (No)

In [11]:
model4 = lmer(as.formula(paste(equation, '+ (1|NID)')), data=df, REML=FALSE)
# Likelihood-ratio test of the ML-fitted mixed model against the non-mixed model2.
anova(model4, model2)

“Some predictor variables are on very different scales: consider rescaling”
boundary (singular) fit: see ?isSingular

“Some predictor variables are on very different scales: consider rescaling”


,Df,AIC,BIC,logLik,deviance,Chisq,Chi Df,Pr(>Chisq)
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
model2,15,483.535,533.3068,-226.7675,453.535,,,
model4,16,485.535,538.6249,-226.7675,453.535,0.0,1.0,1.0
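
The comparison table can be reproduced from the log-likelihoods alone. The sketch below copies `logLik` and `Df` from the output above and takes the sample size from the `Number of obs: 204` line of the model summary; the variable names are illustrative.

```r
n_obs <- 204                              # from "Number of obs: 204"
loglik_lm <- -226.7675;    k_lm <- 15     # model2 (no random effect)
loglik_mixed <- -226.7675; k_mixed <- 16  # model4 (adds the dorm variance)

# Likelihood-ratio statistic and its chi-square p-value.
chisq <- 2 * (loglik_mixed - loglik_lm)
p_value <- pchisq(chisq, df = k_mixed - k_lm, lower.tail = FALSE)

# Information criteria: -2 * logLik plus a penalty per parameter.
aic_lm <- -2 * loglik_lm + 2 * k_lm
aic_mixed <- -2 * loglik_mixed + 2 * k_mixed
bic_lm <- -2 * loglik_lm + log(n_obs) * k_lm
bic_mixed <- -2 * loglik_mixed + log(n_obs) * k_mixed

print(c(chisq = chisq, p = p_value))
print(c(aic_lm = aic_lm, aic_mixed = aic_mixed, bic_lm = bic_lm, bic_mixed = bic_mixed))
```

Because the two log-likelihoods are identical, the extra variance parameter only adds penalty: AIC rises by exactly 2 and the test statistic is 0. A chi-square test of a variance at its boundary value of zero is generally considered conservative, but with a statistic of 0 that caveat does not change the conclusion here.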


## Bring in network density to the mixed model

### Prepare the data

In [12]:
density = read.csv('data/NetworkDensity2018.csv')
density = density[,2:ncol(density)]  # First column is a meaningless row number
head(density)

,Dorm,Network,Density
,<fct>,<fct>,<dbl>
1,FroSoCo,SpendTime,0.0008166282
2,Norcliffe&Adelfa,SocAdvice,0.0002515091
3,Meier&Naranja,EmpSupp,0.0002639293
4,FroSoCo,EmpSupp,0.0006054848
5,Okada,Persuasive,0.0001800929
6,JRo,NegAffPres,0.0001459374


In [13]:
table(density$Dorm)


         Alondra            Cedro          FroSoCo              JRo           Larkin    Meier&Naranja Norcliffe&Adelfa            Okada            Twain           Ujamaa        WestFloMo 
              12               12               12               12               12               12               12               12               12               12               12 

In [14]:
density$NID <- mapvalues(
    density$Dorm, 
    from=c("Alondra", "Cedro", "EAST", "FroSoCo", "JRo", "Kimball", "Larkin", "Okada", "Twain", "Ujamaa", "Meier&Naranja", "Norcliffe&Adelfa", "WestFloMo"), 
    to=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "13", "15"))

The following `from` values were not present in `x`: EAST, Kimball
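
An alternative to `mapvalues()` is a named-vector lookup in base R, which returns numeric IDs directly and yields `NA` for any dorm name missing from the table. The sketch below copies the name-to-ID pairs from the cell above; the input vector and the name `NoSuchDorm` are hypothetical.

```r
# Dorm-name to NID lookup, copied from the mapvalues() call.
dorm_to_nid <- c(
  "Alondra" = 1, "Cedro" = 2, "EAST" = 3, "FroSoCo" = 4, "JRo" = 5,
  "Kimball" = 6, "Larkin" = 7, "Okada" = 8, "Twain" = 9, "Ujamaa" = 10,
  "Meier&Naranja" = 11, "Norcliffe&Adelfa" = 13, "WestFloMo" = 15
)

# Hypothetical input; "NoSuchDorm" is deliberately absent from the lookup.
dorms <- c("FroSoCo", "Okada", "WestFloMo", "NoSuchDorm")

# Index by name; unmatched names come back as NA instead of passing through.
nid <- unname(dorm_to_nid[dorms])
print(nid)
```

The trade-off is that `mapvalues()` leaves unmatched values unchanged, while this lookup makes them `NA`, which is easier to detect before a merge.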



In [15]:
table(density$NID)


 1  2  4  5  7 11 13  8  9 10 15 
12 12 12 12 12 12 12 12 12 12 12 

In [16]:
table(density$Network)


 CloseFrds    EmpSupp     Gossip      Liked NegAffPres NegEmoSupp Persuasive PosAffPres PosEmoSupp Responsive  SocAdvice  SpendTime 
        11         11         11         11         11         11         11         11         11         11         11         11 

In [17]:
density_close_friends = density %>%
    filter(Network == 'CloseFrds') %>%
    select(NID, Density) %>%
    arrange(NID)
names(density_close_friends) = c('NID', 'DensityCloseFriends')
head(density_close_friends)

,NID,DensityCloseFriends
,<fct>,<dbl>
1,1,0.0005430616
2,2,0.0008209813
3,4,0.0009838725
4,5,0.0007545272
5,7,0.000622665
6,11,0.0003512068


In [18]:
density_bad_news = density %>%
    filter(Network == 'NegEmoSupp') %>%
    select(NID, Density) %>%
    arrange(NID)
names(density_bad_news) = c('NID', 'DensityBadNews')
head(density_bad_news)

,NID,DensityBadNews
,<fct>,<dbl>
1,1,0.0003078994
2,2,0.0007001956
3,4,0.0005728361
4,5,0.0005513034
5,7,0.0003902446
6,11,0.0002327147


In [19]:
head(df)

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,Wellbeing_fall_dorm,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,7,18,4.0,0,87500,M,white,-0.7154534,-2.06354788,-0.76535414,0.16596155,3.25,131052.6,4.340909,4.863636,5.477273,2.598485
2,11,18,3.5,1,62500,F,south_asian,-0.8199099,-0.01143413,-0.04997158,0.15868155,3.142857,107142.9,3.642857,4.642857,5.428571,2.642857
3,9,18,4.0,1,125000,M,white,-0.8994971,0.919656,0.66541099,-0.30092105,3.3,124375.0,2.1,4.9,6.2,2.6
4,4,18,4.0,0,200000,F,east_asian,-0.9155707,0.65342017,0.48656535,-0.06457484,3.316667,115434.8,3.177419,4.951613,4.83871,2.639785
5,5,18,2.5,1,125000,M,south_asian,-0.4873343,0.6983916,-0.04997158,0.01290587,3.40625,119230.8,3.5625,5.15625,5.125,2.789583
6,13,18,4.0,1,45000,F,east_asian,-1.2311373,0.04290417,-0.1393944,-0.32241874,3.157895,108815.8,3.263158,4.5,5.210526,2.5


In [20]:
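# R's merge() takes `by`, not `on`; an `on` argument is ignored and the join
# falls back to the shared column name (NID), so name the key explicitly.
df = merge(df, density_close_friends, by="NID", all.x=TRUE)
df = merge(df, density_bad_news, by="NID", all.x=TRUE)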
df[sample(nrow(df), 5), ]

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,Wellbeing_fall_dorm,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm,DensityCloseFriends,DensityBadNews
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
53,4,18,3.5,1,125000,F,white,-0.5370759,-0.4973256,-0.31824004,-0.02745401,3.333333,118695.7,3.16129,5.064516,4.870968,2.655914,0.0009838725,0.0005728361
121,8,21,3.5,1,125000,other,east_asian,-1.1006794,-0.3548819,-0.76535414,-0.1407524,3.25,106184.2,3.425,5.225,4.725,2.925,0.0006009766,0.0002314677
125,8,19,3.0,1,5000,F,east_asian,0.2686717,-1.1895478,-0.04997158,-0.0990191,3.275,112500.0,3.35,5.125,4.725,2.941667,0.0006009766,0.0002314677
76,5,18,2.0,1,62500,M,hispanic,-0.6772456,0.601827,-0.04997158,0.01894115,3.4375,124038.5,3.6875,5.1875,5.1875,2.76875,0.0007545272,0.0005513034
160,11,18,3.0,1,125000,F,hispanic,-1.0778002,0.1947062,-0.1393944,0.14395724,3.178571,105769.2,3.714286,4.678571,5.357143,2.630952,0.0003512068,0.0002327147
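
When attaching dorm-level values to student rows, it helps to confirm that the left join neither drops nor duplicates students and that every student receives a density. A minimal sketch on toy data follows; the `people` rows are invented, while the three density values are copied from the `density_close_friends` output above.

```r
# Toy student-level data (hypothetical rows); NID is numeric, as in df.
people <- data.frame(
  NID = c(1, 2, 2, 4),
  score = c(0.1, -0.3, 0.5, 0.2)
)

# Dorm-level data with NID stored as a factor, as mapvalues() produces.
dorm_density <- data.frame(
  NID = factor(c("1", "2", "4")),
  DensityCloseFriends = c(0.0005430616, 0.0008209813, 0.0009838725)
)

# Convert the factor key to numeric through its labels, not its integer codes.
dorm_density$NID <- as.numeric(as.character(dorm_density$NID))

# Left join on an explicitly named key.
merged <- merge(people, dorm_density, by = "NID", all.x = TRUE)

# Sanity checks: same number of students, and no student left without a density.
stopifnot(nrow(merged) == nrow(people))
stopifnot(!anyNA(merged$DensityCloseFriends))
print(merged)
```

Going through `as.character()` matters because calling `as.numeric()` directly on a factor returns the internal level codes rather than the dorm IDs.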


### Mixed models - REML

In [21]:
model5 = lmer(as.formula(paste(equation, '+ DensityCloseFriends + DensityBadNews + (1|NID)')), data=df, REML=TRUE)
summary(model5)

“Some predictor variables are on very different scales: consider rescaling”
“Some predictor variables are on very different scales: consider rescaling”

Correlation matrix not shown by default, as p = 16 > 12.
Use print(obj, correlation=TRUE)  or
    vcov(obj)        if you need it




Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(paste(equation, "+ DensityCloseFriends + DensityBadNews + (1|NID)"))
   Data: df

REML criterion at convergence: 482.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1304 -0.5033  0.0450  0.6496  3.4442 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.002771 0.05264 
 Residual             0.579595 0.76131 
Number of obs: 204, groups:  NID, 11

Fixed effects:
                      Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)         -8.989e-01  1.180e+00  9.374e+01  -0.762    0.448    
Wellbeing_fall       6.049e-01  5.669e-02  1.851e+02  10.671   <2e-16 ***
Age                  1.812e-02  5.707e-02  9.831e+01   0.318    0.752    
ParentEducationMax   5.706e-02  9.869e-02  1.878e+02   0.578    0.564    
FinclAid            -1.057e-02  1.477e-01  1.875e+02  -0.072    0.943    
FmlyIncome           1.361e-06  1.107e
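
The repeated warning about predictors on very different scales arises because income is in dollars while the densities are around 1e-4. Rescaling a predictor changes the units of its coefficient but not the fit. A small base-R sketch with simulated data follows; the income range, effect size, and seed are assumptions for illustration, and `lm` stands in for the mixed model.

```r
set.seed(1)
n <- 200

# Simulated income in dollars, roughly the range seen in FmlyIncome.
income <- runif(n, 5000, 200000)
y <- 1e-6 * income + rnorm(n)

# Fit once on raw dollars and once on units of $100,000.
fit_raw <- lm(y ~ income)
income_100k <- income / 1e5
fit_scaled <- lm(y ~ income_100k)

coef_raw <- coef(fit_raw)[["income"]]
coef_scaled <- coef(fit_scaled)[["income_100k"]]
r2_raw <- summary(fit_raw)$r.squared
r2_scaled <- summary(fit_scaled)$r.squared

# The slope is multiplied by the rescaling factor; R-squared is unchanged.
print(c(coef_raw = coef_raw, coef_scaled = coef_scaled))
print(c(r2_raw = r2_raw, r2_scaled = r2_scaled))
```

Applying the same idea here (for example dividing `FmlyIncome` by 1e5, or passing the density columns through `scale()`) would likely silence the warning and make coefficients such as 1.9e-06 easier to read.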

In [22]:
model6 = lmer(as.formula(paste(equation, '+ DensityCloseFriends + DensityBadNews + (1|NID)')), data=df, REML=FALSE)
summary(model6)

“Some predictor variables are on very different scales: consider rescaling”
boundary (singular) fit: see ?isSingular

“Some predictor variables are on very different scales: consider rescaling”

Correlation matrix not shown by default, as p = 16 > 12.
Use print(obj, correlation=TRUE)  or
    vcov(obj)        if you need it




Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(paste(equation, "+ DensityCloseFriends + DensityBadNews + (1|NID)"))
   Data: df

     AIC      BIC   logLik deviance df.resid 
   487.6    547.4   -225.8    451.6      186 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2754 -0.5241  0.0410  0.6739  3.5998 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.0000   0.000   
 Residual             0.5358   0.732   
Number of obs: 204, groups:  NID, 11

Fixed effects:
                      Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)         -8.306e-01  1.125e+00  2.040e+02  -0.738   0.4613    
Wellbeing_fall       6.059e-01  5.446e-02  2.040e+02  11.125   <2e-16 ***
Age                  1.473e-02  5.444e-02  2.040e+02   0.271   0.7870    
ParentEducationMax   5.734e-02  9.476e-02  2.040e+02   0.605   0.5458    
FinclAid            -9.473e-03  1.417e-01 

In [23]:
anova(model6, model2)

,Df,AIC,BIC,logLik,deviance,Chisq,Chi Df,Pr(>Chisq)
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
model2,15,482.8305,532.6023,-226.4152,452.8305,,,
model6,18,487.6339,547.3601,-225.817,451.6339,1.196574,3.0,0.7538261


### Add even-more dorm-level covariates

In [22]:
eq = equation
for (col in names(df)) {
    if (endsWith(col, '_dorm')) {
        eq = paste(eq, '+', col)
    }
}
eq = paste(eq, '+ DensityCloseFriends + DensityBadNews + (1|NID)')
print(eq)
model7 = lmer(as.formula(eq), data=df, REML=TRUE)
summary(model7)

[1] "Wellbeing_spring  ~  1  + Wellbeing_fall + Age + ParentEducationMax + FinclAid + FmlyIncome + Gender + Race + Ambient_empathy + Wellbeing_fall_dorm + ParentEducationMax_dorm + FmlyIncome_dorm + Extraversion_dorm + Agreeableness_dorm + Openness_dorm + Empathic_Concern_dorm + DensityCloseFriends + DensityBadNews + (1|NID)"


“Some predictor variables are on very different scales: consider rescaling”
“Some predictor variables are on very different scales: consider rescaling”

Correlation matrix not shown by default, as p = 23 > 12.
Use print(obj, correlation=TRUE)  or
    vcov(obj)        if you need it




Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(eq)
   Data: df

REML criterion at convergence: 493.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1231 -0.5639  0.1040  0.6389  3.3161 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.03329  0.1824  
 Residual             0.56514  0.7518  
Number of obs: 204, groups:  NID, 11

Fixed effects:
                          Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)              3.676e+00  4.981e+00  2.111e+00   0.738    0.534    
Wellbeing_fall           6.452e-01  6.151e-02  1.938e+01  10.491 1.96e-09 ***
Age                      3.231e-02  6.183e-02  1.485e+02   0.523    0.602    
ParentEducationMax       6.282e-02  1.270e-01  8.604e+00   0.495    0.633    
FinclAid                -3.186e-02  1.463e-01  1.744e+02  -0.218    0.828    
FmlyIncome               1.177e-06  1.524e-06  3.119e+01   0.772    0.446    
G

In [27]:
# cor.test(df$DensityCloseFriends, df$Wellbeing_fall)