powerlmm R package for power calculations for two- and three-level longitudinal multilevel/linear mixed models.
HTML R
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
R support new ES Aug 13, 2018
dev cache sim Aug 14, 2018
inst re-export Aug 14, 2018
man re-export Aug 12, 2018
tests tweaks May 3, 2018
vignettes cache sim Aug 14, 2018
.Rbuildignore NEWS.md now supported by CRAN Dec 20, 2017
.gitattributes html fix Sep 8, 2017
.gitignore more dev Apr 14, 2018
.travis.yml ignore vignettes Apr 29, 2018
DESCRIPTION dev Aug 14, 2018
NAMESPACE fix all tests May 2, 2018
NEWS.md update May 4, 2018
README.Rmd bump version Aug 14, 2018
README.md bump version Aug 14, 2018
cran-comments.md release Aug 13, 2018
powerlmm.Rproj cache sim Aug 14, 2018

README.md

powerlmm

Travis-CI Build Status CRAN_Status_Badge

Power Analysis for Longitudinal Multilevel/Linear Mixed-Effects Models.

Overview

The purpose of powerlmm is to help design longitudinal treatment studies (parallel groups), with or without higher-level clustering (e.g. longitudinally clustered by therapists, groups, or physician), and missing data. The main features of the package are:

  • Longitudinal two- and three-level (nested) linear mixed-effects models, and partially nested designs.
  • Random slopes at the subject- and cluster-level.
  • Missing data.
  • Unbalanced designs (both unequal cluster sizes, and treatment groups).
  • Design effect, and estimated type I error when the third-level is ignored.
  • Fast analytical power calculations for all designs.
  • Power for small samples sizes using Satterthwaite’s degrees of freedom approximation.
  • Explore bias, Type I errors, model misspecification, and LRT model selection using convenient simulation methods.

Installation

powerlmm can be installed from CRAN and GitHub.

# CRAN, version 0.4.0
install.packages("powerlmm")

# GitHub, dev version
devtools::install_github("rpsychologist/powerlmm")

Example usage

This is an example of setting up a three-level longitudinal model with random slopes at both the subject- and cluster-level, with different missing data patterns per treatment arm. Relative standardized inputs are used, but unstandardized raw parameters values can also be used.

library(powerlmm)
d <- per_treatment(control = dropout_weibull(0.3, 2),
               treatment = dropout_weibull(0.2, 2))
p <- study_parameters(n1 = 11,
                      n2 = 10,
                      n3 = 5,
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      dropout = d,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

p
#> 
#>      Study setup (three-level) 
#> 
#>               n1 = 11
#>               n2 = 10 x 5 (treatment)
#>                    10 x 5 (control)
#>               n3 = 5      (treatment)
#>                    5      (control)
#>                    10     (total)
#>          total_n = 50     (treatment)
#>                    50     (control)
#>                    100    (total)
#>          dropout =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10 (time)
#>                     0,  0,  1,  3,  6,  9, 12, 16, 20, 25, 30 (%, control)
#>                     0,  0,  1,  2,  4,  5,  8, 10, 13, 17, 20 (%, treatment)
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
plot(p)

get_power(p, df = "satterthwaite")
#> 
#>      Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)
#>                   with missing data and unbalanced designs 
#> 
#>               n1 = 11
#>               n2 = 10 x 5 (treatment)
#>                    10 x 5 (control)
#>               n3 = 5      (treatment)
#>                    5      (control)
#>                    10     (total)
#>          total_n = 50     (control)
#>                    50     (treatment)
#>                    100    (total)
#>          dropout =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10 (time)
#>                     0,  0,  1,  3,  6,  9, 12, 16, 20, 25, 30 (%, control)
#>                     0,  0,  1,  2,  4,  5,  8, 10, 13, 17, 20 (%, treatment)
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
#>               df = 7.971459
#>            alpha = 0.05
#>            power = 68%

Unequal cluster sizes

Unequal cluster sizes is also supported, the cluster sizes can either be random (sampled), or the marginal distribution can be specified.

p <- study_parameters(n1 = 11,
                      n2 = unequal_clusters(2, 3, 5, 20),
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

get_power(p)
#> 
#>      Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)
#>                   with missing data and unbalanced designs 
#> 
#>               n1 = 11
#>               n2 = 2, 3, 5, 20 (treatment)
#>                    2, 3, 5, 20 (control)
#>               n3 = 4           (treatment)
#>                    4           (control)
#>                    8           (total)
#>          total_n = 30          (control)
#>                    30          (treatment)
#>                    60          (total)
#>          dropout = No missing data
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
#>               df = 6
#>            alpha = 0.05
#>            power = 44%

Cluster sizes follow a Poisson distribution

p <- study_parameters(n1 = 11,
                      n2 = unequal_clusters(func = rpois(5, 5)), # sample from Poisson
                      icc_pre_subject = 0.5,
                      icc_pre_cluster = 0,
                      icc_slope = 0.05,
                      var_ratio = 0.02,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

get_power(p, R = 100, progress = FALSE) # expected power by averaging over R realizations
#> 
#>      Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)
#>                   with missing data and unbalanced designs 
#> 
#>               n1 = 11
#>               n2 = rpois(5, 5) (treatment)
#>                    -           (control)
#>               n3 = 5           (treatment)
#>                    5           (control)
#>                    10          (total)
#>          total_n = 24.96       (control)
#>                    24.96       (treatment)
#>                    49.92       (total)
#>          dropout = No missing data
#> icc_pre_subjects = 0.5
#> icc_pre_clusters = 0
#>        icc_slope = 0.05
#>        var_ratio = 0.02
#>      effect_size = -0.8 (Cohen's d [SD: pretest_SD])
#>               df = 8
#>            alpha = 0.05
#>            power = 48% (MCSE: 1%)
#> 
#> NOTE: n2 is randomly sampled. Values are the mean from R = 100 realizations.

Convenience functions

Several convenience functions are also included, e.g. for creating power curves.

x <- get_power_table(p, 
                     n2 = 5:10, 
                     n3 = c(4, 8, 12), 
                     effect_size = cohend(c(0.5, 0.8), standardizer = "pretest_SD"))
plot(x)

Simulation

The package includes a flexible simulation method that makes it easy to investigate the performance of different models. As an example, let’s compare the power difference between the 2-level LMM with 11 repeated measures, to doing an ANCOVA at posttest. Using sim_formula different models can be fit to the same data set during the simulation.

p <- study_parameters(n1 = 11,
                      n2 = 40, 
                      icc_pre_subject = 0.5,
                      cor_subject = -0.4,
                      var_ratio = 0.02,
                      effect_size = cohend(-0.8, 
                                           standardizer = "pretest_SD"))

# 2-lvl LMM
f0 <- sim_formula("y ~ time + time:treatment + (1 + time | subject)")

# ANCOVA, formulas with no random effects are with using lm()
f1 <- sim_formula("y ~ treatment + pretest", 
                  data_transform = transform_to_posttest, 
                  test = "treatment")

f <- sim_formula_compare("LMM" = f0, 
                         "ANCOVA" = f1)

res <- simulate(p, 
                nsim = 2000, 
                formula = f, 
                cores = parallel::detectCores(logical = FALSE))

We then summarize the results using summary. Let’s look specifically at the treatment effects.

summary(res, para = list("LMM" = "time:treatment",
                         "ANCOVA" = "treatment"))
#> Model:  summary 
#> 
#> Fixed effects: 'time:treatment', 'treatment'
#> 
#>   model M_est theta M_se SD_est Power Power_bw Power_satt
#>     LMM  -1.1  -1.1 0.32   0.32  0.93     0.93          .
#>  ANCOVA -11.0   0.0 3.70   3.70  0.85     0.84       0.84
#> ---
#> Number of simulations: 2000  | alpha:  0.05
#> Time points (n1):  11
#> Subjects per cluster (n2 x n3):  40 (treatment)
#>                                  40 (control)
#> Total number of subjects:  80 
#> ---
#> At least one of the models applied a data transformation during simulation,
#> summaries that depend on the true parameter values will no longer be correct,
#> see 'help(summary.plcp_sim)'

We can also look at a specific model, here’s the results for the 2-lvl LMM.

summary(res, model = "LMM")
#> Model:  LMM 
#> 
#> Random effects 
#> 
#>          parameter  M_est theta est_rel_bias prop_zero is_NA
#>  subject_intercept  99.00 100.0     -0.00560         0     0
#>      subject_slope   2.00   2.0     -0.00310         0     0
#>              error 100.00 100.0      0.00086         0     0
#>        cor_subject  -0.39  -0.4     -0.03200         0     0
#> 
#> Fixed effects 
#> 
#>       parameter   M_est theta M_se SD_est Power Power_bw Power_satt
#>     (Intercept)  0.0048   0.0 1.30   1.30 0.050        .          .
#>            time  0.0011   0.0 0.25   0.25 0.054        .          .
#>  time:treatment -1.1000  -1.1 0.32   0.32 0.930     0.93          .
#> ---
#> Number of simulations: 2000  | alpha:  0.05
#> Time points (n1):  11
#> Subjects per cluster (n2 x n3):  40 (treatment)
#>                                  40 (control)
#> Total number of subjects:  80 
#> ---
#> At least one of the models applied a data transformation during simulation,
#> summaries that depend on the true parameter values will no longer be correct,
#> see 'help(summary.plcp_sim)'

Launch interactive web application

The package’s basic functionality is also implemented in a Shiny web application, aimed at users that are less familiar with R. Launch the application by typing

library(powerlmm)
shiny_powerlmm()