# Does your dorm matter for your well-being?

We build models to predict:
1. Spring well-being from fall well-being
1. Spring well-being from fall well-being, demographic items (age, family income, family education, race, gender), and ambient empathy
1. Same, plus random effects by dorm.

# Results:
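- Demographics and ambient empathy, added together, do not significantly improve the model (nested F-test, p = 0.35). Ambient empathy on its own gives a marginal improvement (p = 0.044).
- The random-effect model does not improve fit (likelihood-ratio test, p = 0.48), and almost no variance is apportioned to the dorm level (0.016, against a residual variance of 0.575)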

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Configuration" data-toc-modified-id="Configuration-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Configuration</a></span></li><li><span><a href="#Import-and-load" data-toc-modified-id="Import-and-load-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Import and load</a></span></li><li><span><a href="#Impute-missing-values" data-toc-modified-id="Impute-missing-values-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Impute missing values</a></span></li><li><span><a href="#Quick-summary-of-whole-dorm-well-beings" data-toc-modified-id="Quick-summary-of-whole-dorm-well-beings-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Quick summary of whole-dorm well-beings</a></span></li><li><span><a href="#Standard-regression-models-(not-mixed)" data-toc-modified-id="Standard-regression-models-(not-mixed)-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Standard regression models (not mixed)</a></span><ul class="toc-item"><li><span><a href="#Base-model,-minimal-predictors" data-toc-modified-id="Base-model,-minimal-predictors-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Base model, minimal predictors</a></span></li><li><span><a href="#Add-demographic-covariates" data-toc-modified-id="Add-demographic-covariates-5.2"><span class="toc-item-num">5.2&nbsp;&nbsp;</span>Add demographic covariates</a></span></li><li><span><a href="#Is-this-a-significant-improvement?-(No)" data-toc-modified-id="Is-this-a-significant-improvement?-(No)-5.3"><span class="toc-item-num">5.3&nbsp;&nbsp;</span>Is this a significant improvement? 
(No)</a></span></li><li><span><a href="#Try-just-fall-well-being-and-ambient-empathy,-get-rid-of-noisy-covariates" data-toc-modified-id="Try-just-fall-well-being-and-ambient-empathy,-get-rid-of-noisy-covariates-5.4"><span class="toc-item-num">5.4&nbsp;&nbsp;</span>Try just fall well-being and ambient empathy, get rid of noisy covariates</a></span></li></ul></li><li><span><a href="#Mixed-effect-models" data-toc-modified-id="Mixed-effect-models-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Mixed effect models</a></span><ul class="toc-item"><li><span><a href="#REML-model-to-accurately-determine-variance-apportioned-to-dorm-(zero)" data-toc-modified-id="REML-model-to-accurately-determine-variance-apportioned-to-dorm-(zero)-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>REML model to accurately determine variance apportioned to dorm (zero)</a></span></li><li><span><a href="#REML=false-model-to-maximize-predictive-value.-Is-this-a-significant-improvement-over-the-non-mixed-model?-(No)" data-toc-modified-id="REML=false-model-to-maximize-predictive-value.-Is-this-a-significant-improvement-over-the-non-mixed-model?-(No)-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>REML=false model to maximize predictive value. Is this a significant improvement over the non-mixed model? 
(No)</a></span></li></ul></li><li><span><a href="#Bring-in-network-density-to-the-mixed-model" data-toc-modified-id="Bring-in-network-density-to-the-mixed-model-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Bring in network density to the mixed model</a></span><ul class="toc-item"><li><span><a href="#Prepare-the-data" data-toc-modified-id="Prepare-the-data-7.1"><span class="toc-item-num">7.1&nbsp;&nbsp;</span>Prepare the data</a></span></li><li><span><a href="#Mixed-models---using-dorm-level-network-densities" data-toc-modified-id="Mixed-models---using-dorm-level-network-densities-7.2"><span class="toc-item-num">7.2&nbsp;&nbsp;</span>Mixed models - using dorm-level network densities</a></span></li><li><span><a href="#Add-even-more-dorm-level-covariates" data-toc-modified-id="Add-even-more-dorm-level-covariates-7.3"><span class="toc-item-num">7.3&nbsp;&nbsp;</span>Add even-more dorm-level covariates</a></span></li></ul></li></ul></div>

## Configuration

In [1]:
DATA_FILE = 'data/postprocessed/final_for_analysis_R.csv'

IMPUTE_MISSING = TRUE
INCLUDE_FALL_WB_AS_PREDICTOR = TRUE
INCLUDE_DEMOS_AS_PREDICTOR = TRUE
# DV = 'Wellbeing_fall'
DV = 'Wellbeing_spring'

if (INCLUDE_FALL_WB_AS_PREDICTOR) {
    stopifnot(DV == 'Wellbeing_spring')
}

## Import and load

In [2]:
library(car)
library(plyr)
library(tidyverse)
library(hexbin)
library(mice)
library(nlme)
library(lme4)
library(lmerTest)

options(width=200)

Loading required package: carData

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

✔ ggplot2 3.2.1     ✔ purrr   0.3.3
✔ tibble  2.1.3     ✔ dplyr   0.8.3
✔ tidyr   1.0.0     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.4.0

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::arrange()   masks plyr::arrange()
✖ purrr::compact()   masks plyr::compact()
✖ dplyr::count()     masks plyr::count()
✖ dplyr::failwith()  masks plyr::failwith()
✖ dplyr::filter()    masks stats::filter()
✖ dplyr::id()        masks ply

In [3]:
df = read.csv(DATA_FILE, na.strings=c("", " ", "NA"), row.names=1)
keep_cols = c(
    'NID', 'Age', 'ParentEducationMax',
    'FinclAid', 'FmlyIncome', 'Gender', 'Race',
    'Ambient_empathy',
    'Wellbeing_fall', 'Wellbeing_spring')
for (name in names(df)) {
    if (endsWith(name, '_dorm')) {
        keep_cols = c(keep_cols, name)
    }
}
df = df[,keep_cols]
dim(df)
head(df)

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm,Wellbeing_fall_dorm
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
vgxlTMkQs5,7,18,4.0,0,87500.0,M,white,-0.7154534,-2.06354788,-0.76535414,3.282609,128875.0,4.326087,4.826087,5.478261,2.57971,0.06902635
M9obKkDvc0,11,18,3.5,1,,F,south_asian,-0.8199099,-0.01143413,-0.04997158,3.166667,107142.9,3.8,4.633333,5.4,2.655556,0.14734051
RdS4vMvQjo,9,18,4.0,1,125000.0,M,white,-0.8994971,0.919656,0.66541099,3.416667,124500.0,2.25,5.083333,6.25,2.583333,-0.09749154
n08loMfJH7,4,18,4.0,0,200000.0,F,east_asian,,0.65342017,0.48656535,3.33871,118958.3,3.203125,4.96875,4.890625,2.645833,-0.0421375
8rsekwqjFy,5,18,2.5,1,125000.0,M,south_asian,-0.4873343,0.6983916,-0.04997158,3.352941,119642.9,3.617647,5.147059,5.176471,2.782353,0.05322856
FjTWohEryS,13,18,4.0,1,45000.0,F,east_asian,,0.04290417,-0.1393944,3.2,105625.0,3.3,4.475,5.25,2.516667,-0.30415259


## Impute missing values

In [4]:
print(nrow(df))
if (IMPUTE_MISSING) {
    print("Imputing missing values")
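    # mice() draws random imputations; pass seed= for a reproducible run
    imp = mice(df)
    # complete() returns only the first of the imputed datasets by default
    df = complete(imp)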
} else {
    print("Dropping rows with any missing values")
    df = na.omit(df)
}
print(nrow(df))
# df = na.omit(df, cols='Ambient_empathy')

[1] 204
[1] "Imputing missing values"

 iter imp variable
  1   1  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  1   2  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  1   3  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  1   4  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  1   5  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  2   1  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  2   2  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  2   3  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  2   4  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  2   5  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  3   1  ParentEducationMax  FinclAid  FmlyIncome  Gender  Race  Ambient_empathy
  3   2  ParentEducationMax  FinclAid  FmlyIncome  
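
The run above uses a single completed dataset and no fixed seed, so the numbers further down can shift between runs. A minimal sketch of a seeded, pooled alternative, shown on the `nhanes` example data that ships with `mice` rather than on the study data. The seed, the model, and the `*_demo` names are illustrative assumptions, not what this notebook does:

```r
library(mice)

# Illustrative only: nhanes is a small example dataset bundled with mice.
# seed= makes the random imputations reproducible; printFlag=FALSE hides the log.
imp_demo <- mice(nhanes, m = 5, seed = 2020, printFlag = FALSE)

# Fit the same regression in each of the 5 completed datasets ...
fits_demo <- with(imp_demo, lm(bmi ~ age + chl))

# ... and combine the estimates across imputations (Rubin's rules).
pooled_demo <- pool(fits_demo)
summary(pooled_demo)
```

Pooling carries the imputation uncertainty into the standard errors, which a single completed dataset does not.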

## Quick summary of whole-dorm well-beings

In [5]:
df %>% group_by(NID) %>%
    summarize(wb_fall = mean(Wellbeing_fall),
              wb_spring = mean(Wellbeing_spring))

NID,wb_fall,wb_spring
<dbl>,<dbl>,<dbl>
1,0.17958135,0.43067609
2,0.10039586,0.23049091
4,-0.0421375,0.00871215
5,0.05322856,0.01841058
7,0.06902635,0.1716415
8,-0.15094904,-0.45876161
9,-0.09749154,-0.34804765
10,-0.35117851,-0.30147326
11,0.14734051,0.15272015
13,-0.30415259,-0.49261454


## Standard regression models (not mixed)

In [6]:
base_equation = paste(DV, ' ~ Wellbeing_fall')
demo_equation = paste(base_equation, '+ Age + ParentEducationMax + FinclAid + FmlyIncome + Gender + Race + Ambient_empathy')

### Base model, minimal predictors

In [7]:
print(base_equation)
model_base = lm(as.formula(base_equation), df)
summary(model_base)

[1] "Wellbeing_spring  ~ Wellbeing_fall"



Call:
lm(formula = as.formula(base_equation), data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.69297 -0.48662  0.06646  0.49158  2.76843 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)    -3.286e-16  5.373e-02    0.00        1    
Wellbeing_fall  6.433e-01  5.387e-02   11.94   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7675 on 202 degrees of freedom
Multiple R-squared:  0.4139,	Adjusted R-squared:  0.411 
F-statistic: 142.6 on 1 and 202 DF,  p-value: < 2.2e-16


In [8]:
eq = paste(base_equation, ' + as.factor(NID)')
print(eq)
model_base_with_fixed_dorm = lm(as.formula(eq), df)
Anova(model_base_with_fixed_dorm)

[1] "Wellbeing_spring  ~ Wellbeing_fall  + as.factor(NID)"


,Sum Sq,Df,F value,Pr(>F)
,<dbl>,<dbl>,<dbl>,<dbl>
Wellbeing_fall,72.3389,1,125.785142,8.906693e-23
as.factor(NID),8.562465,10,1.488868,0.1459475
Residuals,110.418994,192,,


### Add demographic covariates

In [9]:
names(df)

In [10]:
print(demo_equation)
model_demos = lm(as.formula(demo_equation), df)
summary(model_demos)

[1] "Wellbeing_spring  ~ Wellbeing_fall + Age + ParentEducationMax + FinclAid + FmlyIncome + Gender + Race + Ambient_empathy"



Call:
lm(formula = as.formula(demo_equation), data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.42588 -0.40063  0.04666  0.47630  2.71539 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -2.542e-01  1.067e+00  -0.238   0.8118    
Wellbeing_fall      6.130e-01  5.535e-02  11.074   <2e-16 ***
Age                -4.962e-03  5.308e-02  -0.093   0.9256    
ParentEducationMax  7.475e-02  9.822e-02   0.761   0.4475    
FinclAid           -4.128e-02  1.425e-01  -0.290   0.7725    
FmlyIncome          7.470e-07  1.053e-06   0.709   0.4790    
GenderM             1.645e-01  1.135e-01   1.449   0.1491    
Raceeast_asian      4.673e-02  2.283e-01   0.205   0.8380    
Racehispanic        2.308e-01  2.814e-01   0.820   0.4132    
Raceother_or_mixed -6.051e-02  2.283e-01  -0.265   0.7913    
Racesouth_asian    -1.957e-01  3.029e-01  -0.646   0.5189    
Racewhite           1.708e-01  2.357e-01   0.725   0.4695    
Ambient_empathy  

In [11]:
Anova(model_demos)

,Sum Sq,Df,F value,Pr(>F)
,<dbl>,<dbl>,<dbl>,<dbl>
Wellbeing_fall,71.79871,1,122.6394,2.459692e-22
Age,0.005114533,1,0.008736135,0.9256303
ParentEducationMax,0.3391333,1,0.5792738,0.4475358
FinclAid,0.04908829,1,0.08384772,0.7724634
FmlyIncome,0.2945508,1,0.5031222,0.4789964
Gender,1.22872,1,2.098777,0.1490567
Race,2.469263,5,0.8435496,0.5203703
Ambient_empathy,2.119583,1,3.620461,0.05857735
Residuals,111.8201,191,,


### Is this a significant improvement? (No)

In [12]:
anova(model_base, model_demos)

,Res.Df,RSS,Df,Sum of Sq,F,Pr(>F)
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,202,118.9815,,,,
2,191,111.8201,11.0,7.161327,1.112024,0.3537387
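
The F statistic in this table can be recomputed by hand from the two residual sums of squares, which makes explicit what `anova()` is comparing. A small check using the values printed above:

```r
# Values copied from the anova() table above
rss_base  <- 118.9815; df_base  <- 202   # base model
rss_demos <- 111.8201; df_demos <- 191   # base + demographics + ambient empathy

df_extra <- df_base - df_demos           # 11 extra parameters

# Drop in RSS per extra parameter, relative to the larger model's residual variance
f_stat  <- ((rss_base - rss_demos) / df_extra) / (rss_demos / df_demos)
p_value <- pf(f_stat, df_extra, df_demos, lower.tail = FALSE)

round(c(F = f_stat, p = p_value), 4)
```

Up to rounding, this should match the `F` and `Pr(>F)` columns: eleven extra parameters buy a reduction in residual error no larger than chance would give.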


### Try just fall well-being and ambient empathy, get rid of noisy covariates

In [13]:
eq = paste(base_equation, ' + Ambient_empathy')
print(eq)
model_base_with_ambient_empathy = lm(as.formula(eq), df)
summary(model_base_with_ambient_empathy)

[1] "Wellbeing_spring  ~ Wellbeing_fall  + Ambient_empathy"



Call:
lm(formula = as.formula(eq), data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.58067 -0.45194  0.02294  0.49858  2.73649 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      0.06895    0.06324   1.090   0.2769    
Wellbeing_fall   0.63506    0.05361  11.846   <2e-16 ***
Ambient_empathy  0.14400    0.07100   2.028   0.0439 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7616 on 201 degrees of freedom
Multiple R-squared:  0.4256,	Adjusted R-squared:  0.4199 
F-statistic: 74.48 on 2 and 201 DF,  p-value: < 2.2e-16


In [14]:
anova(model_base, model_base_with_ambient_empathy)

,Res.Df,RSS,Df,Sum of Sq,F,Pr(>F)
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,202,118.9815,,,,
2,201,116.5957,1.0,2.385801,4.112898,0.04387707


## Mixed effect models

### REML model to accurately determine variance apportioned to dorm (zero)

In [15]:
eq = paste(base_equation, '+ (1|NID)')
print(eq)
model_mlm_nid_only_reml = lmer(as.formula(eq), data=df, REML=TRUE)
summary(model_mlm_nid_only_reml)

[1] "Wellbeing_spring  ~ Wellbeing_fall + (1|NID)"


Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(eq)
   Data: df

REML criterion at convergence: 476.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.4175 -0.5976  0.0881  0.6421  3.4976 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.01552  0.1246  
 Residual             0.57520  0.7584  
Number of obs: 204, groups:  NID, 11

Fixed effects:
                 Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)     -0.001463   0.065863   8.912989  -0.022    0.983    
Wellbeing_fall   0.632831   0.053597 200.252358  11.807   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr)
Wellbng_fll -0.002
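
One way to read the random-effects block is through the intraclass correlation (ICC), the share of total variance that sits between dorms. A quick calculation from the variances printed above (these come from this particular imputation run):

```r
# Variance components copied from the REML summary above
var_dorm     <- 0.01552   # NID (Intercept)
var_residual <- 0.57520   # Residual

# ICC: between-dorm variance as a fraction of total variance
icc <- var_dorm / (var_dorm + var_residual)
round(icc, 3)
```

In this run, dorm membership accounts for under 3% of the variance in spring well-being after adjusting for fall well-being: small, though not exactly zero.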

### REML=false model to maximize predictive value. Is this a significant improvement over the non-mixed model? (No)

In [16]:
model_mlm_nid_only_noreml = lmer(as.formula(paste(base_equation, '+ (1|NID)')), data=df, REML=FALSE)
anova(model_mlm_nid_only_noreml, model_base)#, refit=FALSE)

,Df,AIC,BIC,logLik,deviance,Chisq,Chi Df,Pr(>Chisq)
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
model_base,3,474.9398,484.8942,-234.4699,468.9398,,,
model_mlm_nid_only_noreml,4,476.4421,489.7145,-234.221,468.4421,0.4977896,1.0,0.480473
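
The `Chisq` column is a likelihood-ratio statistic: twice the difference in log-likelihood between the two fits, referred to a chi-square distribution with one degree of freedom for the single extra variance parameter. Recomputing it, and the AIC values, from the table:

```r
# Log-likelihoods copied from the anova() table above
loglik_base  <- -234.4699   # lm, 3 parameters
loglik_mixed <- -234.2210   # lmer with (1|NID), 4 parameters

chisq   <- 2 * (loglik_mixed - loglik_base)
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)

# AIC = -2 * logLik + 2 * (number of parameters)
aic_base  <- -2 * loglik_base  + 2 * 3
aic_mixed <- -2 * loglik_mixed + 2 * 4

round(c(Chisq = chisq, p = p_value, AIC_base = aic_base, AIC_mixed = aic_mixed), 4)
```

Because a variance cannot be negative, the null value lies on the boundary of the parameter space and this p-value is conservative. A common rule of thumb halves it, which would not change the conclusion here.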


## Bring in network density to the mixed model

### Prepare the data

In [17]:
density = read.csv('data/NetworkDensity2018.csv')
density = density[,2:ncol(density)]  # First column is a meaningless row number
head(density)

,Dorm,Network,Density
,<fct>,<fct>,<dbl>
1,FroSoCo,SpendTime,0.0008166282
2,Norcliffe&Adelfa,SocAdvice,0.0002515091
3,Meier&Naranja,EmpSupp,0.0002639293
4,FroSoCo,EmpSupp,0.0006054848
5,Okada,Persuasive,0.0001800929
6,JRo,NegAffPres,0.0001459374


In [18]:
table(density$Dorm)


         Alondra            Cedro          FroSoCo              JRo           Larkin    Meier&Naranja Norcliffe&Adelfa            Okada            Twain           Ujamaa        WestFloMo 
              12               12               12               12               12               12               12               12               12               12               12 

In [19]:
density$NID <- mapvalues(
    density$Dorm, 
    from=c("Alondra", "Cedro", "EAST", "FroSoCo", "JRo", "Kimball", "Larkin", "Okada", "Twain", "Ujamaa", "Meier&Naranja", "Norcliffe&Adelfa", "WestFloMo"), 
    to=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "13", "15"))

The following `from` values were not present in `x`: EAST, Kimball
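
A base-R sketch of the same recoding, without `plyr`, using a named lookup vector built from the dorm-to-NID pairs in the call above. The `dorm_to_nid` and `dorms_demo` names are made up for illustration:

```r
# Dorm name -> NID, copied from the mapvalues() call above
dorm_to_nid <- c(
    "Alondra" = 1, "Cedro" = 2, "EAST" = 3, "FroSoCo" = 4, "JRo" = 5,
    "Kimball" = 6, "Larkin" = 7, "Okada" = 8, "Twain" = 9, "Ujamaa" = 10,
    "Meier&Naranja" = 11, "Norcliffe&Adelfa" = 13, "WestFloMo" = 15)

# A few example dorm names standing in for density$Dorm
# (wrap a factor column in as.character() before indexing)
dorms_demo <- c("FroSoCo", "Okada", "Meier&Naranja", "WestFloMo")
nid_demo <- unname(dorm_to_nid[dorms_demo])
nid_demo

# Lookup entries that never occur in the data, analogous to the message above
setdiff(names(dorm_to_nid), dorms_demo)
```

Storing NID as a number also keeps its type in line with the numeric `NID` column in `df`.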



In [20]:
table(density$NID)


 1  2  4  5  7 11 13  8  9 10 15 
12 12 12 12 12 12 12 12 12 12 12 

In [21]:
table(density$Network)


 CloseFrds    EmpSupp     Gossip      Liked NegAffPres NegEmoSupp Persuasive PosAffPres PosEmoSupp Responsive  SocAdvice  SpendTime 
        11         11         11         11         11         11         11         11         11         11         11         11 

In [22]:
density_close_friends = density %>%
    filter(Network == 'CloseFrds') %>%
    select(NID, Density) %>%
    arrange(NID)
names(density_close_friends) = c('NID', 'DensityCloseFriends')
head(density_close_friends)

,NID,DensityCloseFriends
,<fct>,<dbl>
1,1,0.0005430616
2,2,0.0008209813
3,4,0.0009838725
4,5,0.0007545272
5,7,0.000622665
6,11,0.0003512068


In [23]:
density_bad_news = density %>%
    filter(Network == 'NegEmoSupp') %>%
    select(NID, Density) %>%
    arrange(NID)
names(density_bad_news) = c('NID', 'DensityBadNews')
head(density_bad_news)

,NID,DensityBadNews
,<fct>,<dbl>
1,1,0.0003078994
2,2,0.0007001956
3,4,0.0005728361
4,5,0.0005513034
5,7,0.0003902446
6,11,0.0002327147


In [24]:
head(df)

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm,Wellbeing_fall_dorm
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
1,7,18,4.0,0,87500,M,white,-0.7154534,-2.06354788,-0.76535414,3.282609,128875.0,4.326087,4.826087,5.478261,2.57971,0.06902635
2,11,18,3.5,1,125000,F,south_asian,-0.8199099,-0.01143413,-0.04997158,3.166667,107142.9,3.8,4.633333,5.4,2.655556,0.14734051
3,9,18,4.0,1,125000,M,white,-0.8994971,0.919656,0.66541099,3.416667,124500.0,2.25,5.083333,6.25,2.583333,-0.09749154
4,4,18,4.0,0,200000,F,east_asian,-1.1070464,0.65342017,0.48656535,3.33871,118958.3,3.203125,4.96875,4.890625,2.645833,-0.0421375
5,5,18,2.5,1,125000,M,south_asian,-0.4873343,0.6983916,-0.04997158,3.352941,119642.9,3.617647,5.147059,5.176471,2.782353,0.05322856
6,13,18,4.0,1,45000,F,east_asian,-0.9818706,0.04290417,-0.1393944,3.2,105625.0,3.3,4.475,5.25,2.516667,-0.30415259


In [25]:
# Base R merge() takes by=, not on=; an on= argument is silently ignored
df = merge(df, density_close_friends, by="NID", all.x=TRUE)
df = merge(df, density_bad_news, by="NID", all.x=TRUE)
df[sample(nrow(df), 5), ]

,NID,Age,ParentEducationMax,FinclAid,FmlyIncome,Gender,Race,Ambient_empathy,Wellbeing_fall,Wellbeing_spring,ParentEducationMax_dorm,FmlyIncome_dorm,Extraversion_dorm,Agreeableness_dorm,Openness_dorm,Empathic_Concern_dorm,Wellbeing_fall_dorm,DensityCloseFriends,DensityBadNews
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<fct>,<fct>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
182,13,19,4.0,0,200000,M,south_asian,-1.129396,0.1701663,-0.6759313,3.2,105625.0,3.3,4.475,5.25,2.516667,-0.3041526,0.0003007478,0.0002094053
187,13,21,2.0,1,15000,F,white,-0.9155707,0.7005555,1.4702164,3.2,105625.0,3.3,4.475,5.25,2.516667,-0.3041526,0.0003007478,0.0002094053
67,4,19,3.0,1,125000,M,other_or_mixed,-1.4032829,1.4900067,0.665411,3.33871,118958.3,3.203125,4.96875,4.890625,2.645833,-0.0421375,0.0009838725,0.0005728361
196,15,18,3.0,0,200000,F,other_or_mixed,-0.6247803,0.49248,0.1288741,3.625,139166.7,4.0625,5.125,5.21875,2.697917,0.4387861,0.0009849339,0.0002895259
40,4,18,2.5,0,200000,M,other_or_mixed,-0.2582699,0.4014799,1.2913707,3.33871,118958.3,3.203125,4.96875,4.890625,2.645833,-0.0421375,0.0009838725,0.0005728361
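
A left join like this is worth a quick sanity check: the row count should be unchanged, and no student should come back without a density. A toy sketch with made-up data frames (`students_demo` and `density_demo` are illustrative, not the study data):

```r
students_demo <- data.frame(NID = c(1, 1, 2, 4), wb = c(0.2, -0.1, 0.5, 0.0))
density_demo  <- data.frame(NID = c(1, 2, 4),
                            DensityCloseFriends = c(5e-4, 8e-4, 1e-3))

# by= names the join key explicitly; all.x=TRUE keeps every student row
merged_demo <- merge(students_demo, density_demo, by = "NID", all.x = TRUE)

# Checks: no rows gained or lost, and every row found a matching dorm
stopifnot(nrow(merged_demo) == nrow(students_demo))
stopifnot(!anyNA(merged_demo$DensityCloseFriends))
merged_demo
```

The same two `stopifnot()` checks could be run on `df` right after the merges.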


In [26]:
eq = paste(base_equation, '+ DensityCloseFriends + DensityBadNews')
print(eq)
Anova(lm(as.formula(eq), data=df))

[1] "Wellbeing_spring  ~ Wellbeing_fall + DensityCloseFriends + DensityBadNews"


,Sum Sq,Df,F value,Pr(>F)
,<dbl>,<dbl>,<dbl>,<dbl>
Wellbeing_fall,80.8884733,1,138.2354347,1.328048e-24
DensityCloseFriends,0.2824642,1,0.4827209,0.4879988
DensityBadNews,0.2814723,1,0.4810258,0.4887628
Residuals,117.0300126,200,,


### Mixed models - using dorm-level network densities

In [27]:
eq = paste(base_equation, '+ DensityCloseFriends + DensityBadNews + (1|NID)')
print(eq)
model5 = lmer(as.formula(eq), data=df, REML=TRUE)
summary(model5)

[1] "Wellbeing_spring  ~ Wellbeing_fall + DensityCloseFriends + DensityBadNews + (1|NID)"


“Some predictor variables are on very different scales: consider rescaling”
“Some predictor variables are on very different scales: consider rescaling”


Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(eq)
   Data: df

REML criterion at convergence: 446.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.3057 -0.5628  0.0565  0.6419  3.5304 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.01497  0.1223  
 Residual             0.57457  0.7580  
Number of obs: 204, groups:  NID, 11

Fixed effects:
                     Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)          -0.27675    0.19344   7.79851  -1.431    0.191    
Wellbeing_fall        0.62695    0.05381 198.42100  11.650   <2e-16 ***
DensityCloseFriends 260.03515  386.71276   8.03601   0.672    0.520    
DensityBadNews      282.51068  547.82028   7.33256   0.516    0.621    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) Wllbn_ DnstCF
Wellbng_fll  0.077              
DnstyClsFrn -0.530 -0.0
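
The scale warning arises because the densities are of order 1e-4 while well-being is of order 1, which is also why the density coefficients are in the hundreds. Standardizing a predictor changes the units of its coefficient but not its t statistic. A self-contained sketch on simulated data (all names and numbers here are made up):

```r
set.seed(1)
n <- 50
x_raw <- runif(n, min = 1e-4, max = 1e-3)   # tiny-scale predictor, like a density
y <- 300 * x_raw + rnorm(n)

# Standardize to mean 0, sd 1; as.numeric() drops the matrix attributes from scale()
x_z <- as.numeric(scale(x_raw))

fit_raw <- lm(y ~ x_raw)
fit_z   <- lm(y ~ x_z)

# Estimates differ by a factor of sd(x_raw); the t statistics are identical
coef(summary(fit_raw))["x_raw", c("Estimate", "t value")]
coef(summary(fit_z))["x_z", c("Estimate", "t value")]
```

Applying the same idea here would mean standardizing `DensityCloseFriends` and `DensityBadNews` before fitting.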

### Add even more dorm-level covariates

In [28]:
eq = base_equation
for (col in names(df)) {
    if (endsWith(col, '_dorm')) {
        eq = paste(eq, '+', col)
    }
}
eq = paste(eq, '+ DensityCloseFriends + DensityBadNews + (1|NID)')
print(eq)
model7 = lmer(as.formula(eq), data=df, REML=TRUE)
summary(model7)

[1] "Wellbeing_spring  ~ Wellbeing_fall + ParentEducationMax_dorm + FmlyIncome_dorm + Extraversion_dorm + Agreeableness_dorm + Openness_dorm + Empathic_Concern_dorm + Wellbeing_fall_dorm + DensityCloseFriends + DensityBadNews + (1|NID)"


“Some predictor variables are on very different scales: consider rescaling”
boundary (singular) fit: see ?isSingular

“Some predictor variables are on very different scales: consider rescaling”


Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.formula(eq)
   Data: df

REML criterion at convergence: 455

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2460 -0.5452  0.0890  0.6379  3.2261 

Random effects:
 Groups   Name        Variance Std.Dev.
 NID      (Intercept) 0.0000   0.0000  
 Residual             0.5744   0.7579  
Number of obs: 204, groups:  NID, 11

Fixed effects:
                          Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)              9.346e-01  3.993e+00  1.930e+02   0.234    0.815    
Wellbeing_fall           6.102e-01  5.437e-02  1.930e+02  11.223   <2e-16 ***
ParentEducationMax_dorm -2.883e-01  1.906e+00  1.930e+02  -0.151    0.880    
FmlyIncome_dorm          2.812e-05  4.189e-05  1.930e+02   0.671    0.503    
Extraversion_dorm       -2.083e-01  4.924e-01  1.930e+02  -0.423    0.673    
Agreeableness_dorm      -1.626e+00  1.791e+00  1.930e+02  -0.908    0.365    
Ope
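
A possible reading of the singular fit: with only 11 dorms, the dorm-level fixed effects use up nearly all of the between-dorm information, leaving little for the random intercept to explain. A rough count of degrees of freedom at the dorm level, using the predictor names from the formula above (a heuristic, not a formal test):

```r
n_dorms <- 11   # "groups: NID, 11" in the summary above

# Predictors that take a single value per dorm
dorm_level_predictors <- c(
    "ParentEducationMax_dorm", "FmlyIncome_dorm", "Extraversion_dorm",
    "Agreeableness_dorm", "Openness_dorm", "Empathic_Concern_dorm",
    "Wellbeing_fall_dorm", "DensityCloseFriends", "DensityBadNews")

# Dorm-level degrees of freedom left after the intercept and these predictors
df_left <- n_dorms - 1 - length(dorm_level_predictors)
df_left
```

With so little left over, an estimated dorm variance of zero is unsurprising and says little about whether dorms differ.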