Update package references and reference to brokenstick 2.5.0
stefvanbuuren committed Mar 23, 2023
1 parent 684d94c commit febfd96
Showing 2 changed files with 48 additions and 32 deletions.
11 changes: 6 additions & 5 deletions vignettes/manual/manual.Rmd
@@ -13,10 +13,11 @@ vignette: >
%\VignetteEncoding{UTF-8}
%\VignetteEngine{knitr::rmarkdown}
bibliography: ref.bib
link-citations: true
---

```{r setup, include=FALSE}
if (packageVersion("brokenstick") < "2.4.0") stop("brokenstick 2.4.0 needed")
if (packageVersion("brokenstick") < "2.5.0") stop("brokenstick 2.5.0 needed")
require("brokenstick")
require("dplyr")
require("tidyr")
@@ -32,7 +33,7 @@ Published as: Stef van Buuren (2023), Broken Stick Model for Irregular Longitudi

# Abstract {-}

Many longitudinal studies collect data that have irregular observation times, often requiring the application of linear mixed models with time-varying outcomes. This paper presents an alternative that splits the quantitative analysis into two steps. The first step converts irregularly observed data into a set of repeated measures through the broken stick model. The second step estimates the parameters of scientific interest from the repeated measurements at the subject level. The broken stick model approximates each subject's trajectory by a series of connected straight lines. The breakpoints, specified by the user, divide the time axis into consecutive intervals common to all subjects. Specification of the model requires just three variables: time, measurement and subject. The model is a special case of the linear mixed model, with time as a linear $B$-spline and subject as the grouping factor. The main assumptions are: subjects are exchangeable, trajectories between consecutive breakpoints are straight, random effects follow a multivariate normal distribution, and unobserved data are missing at random. The **R** package **brokenstick** v2.4.0 offers tools to calculate, predict, impute and visualise broken stick estimates. The package supports two optimisation methods, including options to constrain the variance-covariance matrix of the random effects. We demonstrate six applications of the model: detection of critical periods, estimation of the time-to-time correlations, profile analysis, curve interpolation, multiple imputation and personalised prediction of future outcomes by curve matching.
Many longitudinal studies collect data that have irregular observation times, often requiring the application of linear mixed models with time-varying outcomes. This paper presents an alternative that splits the quantitative analysis into two steps. The first step converts irregularly observed data into a set of repeated measures through the broken stick model. The second step estimates the parameters of scientific interest from the repeated measurements at the subject level. The broken stick model approximates each subject's trajectory by a series of connected straight lines. The breakpoints, specified by the user, divide the time axis into consecutive intervals common to all subjects. Specification of the model requires just three variables: time, measurement and subject. The model is a special case of the linear mixed model, with time as a linear $B$-spline and subject as the grouping factor. The main assumptions are: subjects are exchangeable, trajectories between consecutive breakpoints are straight, random effects follow a multivariate normal distribution, and unobserved data are missing at random. The **R** package **brokenstick** v2.5.0 offers tools to calculate, predict, impute and visualise broken stick estimates. The package supports two optimisation methods, including options to constrain the variance-covariance matrix of the random effects. We demonstrate six applications of the model: detection of critical periods, estimation of the time-to-time correlations, profile analysis, curve interpolation, multiple imputation and personalised prediction of future outcomes by curve matching.

# Introduction

@@ -46,7 +47,7 @@ The linear mixed model for longitudinal data [@laird1982; @fitzmaurice2011] is t

This paper explores the use of the *broken stick model* to transform irregularly observed data into *repeated measures*. The broken stick model describes a curve by a series of connected straight lines. The model has a long history and is known under many other names, among others, *segmented straight lines* [@bellman1969], *piece-wise regression* [@toms2003], *structural change models* [@bai2003], *broken line smoothing* [@koutsoyiannis2000] and *segmented regression* [@lerman1980]. The term *broken stick* goes back to at least @macarthur1957, who used it in an analogy to indicate the abundance of species. Most of the literature on the broken stick model concentrates on the problem of finding optimal times at which the lines should connect. Instead, the present paper will focus on the problem of summarizing irregular individual trajectories by estimates made at a *pre-specified time grid*. This time grid is identical for all individuals, but it need not be equidistant. Our model formulation is a special case of the linear mixed model, with time modeled as a set of random effects coded as a linear $B$-spline and subjects as the grouping factor. The output of the transformation is a set of repeated measures, where every subject obtains a score on every time point.

Many **R** packages offer tools for interpolation. The **splines** package [@r2020] and the **akima** package [@akima2021] contains classic interpolation methods for one- and two-dimensional smoothing. Most contributed packages concentrate on time series or spatial interpolation. See @li2014 and @lepot2017 for overviews of the different concepts and methodologies. Most interpolation techniques rely on neighboring information, in time, space or both. The broken stick model addresses the problem where many independent replications provide short irregular multivariate time series, say of 5-30 time points. The scientific interest is to dynamically predict and update future observations. The model applies the linear mixed model to increase stability for such series by borrowing information across replicates. As there are no satisfactory solutions to this problem, the **brokenstick** package intends to fill this gap. Package **brokenstick** is available from the Comprehensive **R** Archive Network at <https://CRAN.R-project.org/package=brokenstick>.
Many **R** packages offer tools for interpolation. The **splines** package [@r2020] and the **akima** package [@akima2021] contains classic interpolation methods for one- and two-dimensional smoothing. Most contributed packages concentrate on time series or spatial interpolation. See @li2014 and @lepot2017 for overviews of the different concepts and methodologies. Most interpolation techniques rely on neighboring information, in time, space or both. The broken stick model addresses the problem where many independent replications provide short irregular multivariate time series, say of 5-30 time points. The scientific interest is to dynamically predict and update future observations. The model applies the linear mixed model to increase stability for such series by borrowing information across replicates. As there are no satisfactory solutions to this problem, the **brokenstick** package [@pkg:brokenstick] intends to fill this gap. Package **brokenstick** is available from the Comprehensive **R** Archive Network at <https://CRAN.R-project.org/package=brokenstick>.

Substantive researchers often favor repeated measures over the use of linear mixed models because of their simplicity. For example, we can easily fit a subject-level model to predict future outcomes conditional on earlier data with repeated measures data. While such simple regression models may be less efficient than modelling the complete data [@diggle2002, Section 6.1], increased insight may be more valuable than increased precision.

@@ -62,7 +63,7 @@ Some applications of the broken stick model are:

The original motivation for developing the broken stick model was to facilitate the statistical analysis and testing of critical ages in the onset of childhood obesity [@dekroon2010], with extensions to multiple imputation [@vanbuuren2018]. There is good support in **R** for fitting child growth data. We mention some related approaches. Methods for estimating growth references with parametric models are `gamlss()` from **gamlss** [@stasinopoulos2007] and its Bayesian incarnation `bamlss()` from **bamlss** [@umlauf2021]. Nonparametric alternatives that estimate quantiles directly are `rq()` from **quantreg** [@koenker2018] and `expectreg.ls()` from **expectreg** [@otto-sobotka2021]. Methods for modelling and smoothing growth curves fit trajectories per child include `smooth.basisPar()` from **fda** [@ramsay2021], `gam()` from **mgcv** [@wood2011], `loess()` and `smooth.spline()` from base **stats** [@r2020]. Models that smooth by borrowing strength across children are `face.sparse()` from **face** [@xiao2021], `lmer()` from **lme4** [@bates2015], and `sitar()` from **sitar** [@cole2021]. The broken stick model fits in the latter tradition, and features an intuitive parametrization of each individual growth curve as a series of connected straight lines. See @anderson2019 for an overview and comparison of these methods.

The present paper highlights various computational tools from the **brokenstick** v2.4.0 package. The package contains tools to fit the broken stick model to data, export the fitted model's parameters, create imputed values of the model, and predict broken stick estimates for new data. Also, the text illustrates how the tool helps to solve various analytic problems.
The present paper highlights various computational tools from the **brokenstick** v2.5.0 package. The package contains tools to fit the broken stick model to data, export the fitted model's parameters, create imputed values of the model, and predict broken stick estimates for new data. Also, the text illustrates how the tool helps to solve various analytic problems.

# Illustration of broken stick model

@@ -1203,7 +1204,7 @@ This paper has highlighted various applications of the broken stick model: Criti

# Computational setup {-}

I am running a Mac Studio, MacOS Venture, V 13.1, 32GB RAM with **R** version 4.2.2 (2022-10-31) and **brokenstick** version 2.4.0 (2022-10-30).
I am running a Mac Studio, MacOS Venture, V 13.1, 32GB RAM with **R** version 4.2.2 (2022-10-31) and **brokenstick** version 2.5.0 (2023-03-22).

# Acknowledgment {-}

69 changes: 42 additions & 27 deletions vignettes/manual/ref.bib
@@ -1,8 +1,9 @@
@Manual{akima2021,
author = {H. Akima and A. Gebhardt and T. Petzold and M. Mächler},
title = {\pkg{akima}: Interpolation of Irregularly and Regularly Spaced Data},
title = {{akima}: Interpolation of Irregularly and Regularly Spaced Data},
year = {2021},
note = {\proglang{R} package version 0.6-3.4},
edition = {R package version 0.6-3.4},
url = {https://CRAN.R-project.org/package=akima},
}

@@ -40,7 +41,7 @@ @Article{bai2003
}

@Article{bates2015,
title = {Fitting Linear Mixed-Effects Models Using \pkg{lme4}},
title = {Fitting Linear Mixed-Effects Models Using {lme4}},
author = {D. Bates and M. Mächler and B. Bolker and S. Walker},
journal = {Journal of Statistical Software},
year = {2015},
@@ -74,9 +75,10 @@ @Article{cole1995

@Manual{cole2021,
author = {T. J. Cole},
title = {\pkg{sitar}: Super Imposition by Translation and Rotation Growth Curve Analysis},
title = {{sitar}: Super Imposition by Translation and Rotation Growth Curve Analysis},
year = {2022},
note = {\proglang{R} package version 1.3.0},
edition = {R package version 1.3.0},
url = {https://CRAN.R-project.org/package=sitar},
}

@@ -143,17 +145,19 @@ @Book{gelman2007

@Manual{hafen2021,
author = {R. Hafen},
title = {\pkg{growthstandards}: Anthropometric Growth Standard Calculations},
title = {{growthstandards}: Anthropometric Growth Standard Calculations},
year = {2021},
note = {\proglang{R} package version 0.1.5},
edition = {R package version 0.1.5},
url = {https://github.com/ki-tools/growthstandards},
}

@Manual{hafen2020,
author = {R. Hafen and B. Schloerke},
title = {\pkg{trelliscopejs}: Create Interactive Trelliscope Displays},
title = {{trelliscopejs}: Create Interactive Trelliscope Displays},
year = {2020},
note = {\proglang{R} package version 0.2.6},
edition = {R package version 0.2.6},
url = {https://CRAN.R-project.org/package=trelliscopejs},
}

@@ -204,9 +208,10 @@ @Article{kenward1987

@Manual{koenker2018,
author = {R. Koenker and S. Portnoy and P. T. Ng and A. Zeileis and P. Grosjean and B. D. Ripley},
title = {\pkg{quantreg}: Quantile Regression},
title = {{quantreg}: Quantile Regression},
year = {2018},
note = {\proglang{R} package version 5.35},
edition = {R package version 5.35},
url = {https://CRAN.R-project.org/package=quantreg},
}

@@ -307,9 +312,10 @@ @Book{morrison1976

@Manual{myatt2019,
author = {M. Myatt and E. Guevarra},
title = {\pkg{zscorer}: Child Anthropometry Z-Score Calculator},
title = {{zscorer}: Child Anthropometry Z-Score Calculator},
year = {2019},
note = {\proglang{R} package version 0.3.1},
edition = {R package version 0.3.1},
url = {https://CRAN.R-project.org/package=zscorer},
}

@@ -325,16 +331,17 @@ @Article{naumova2001

@Manual{otto-sobotka2021,
author = {F. Otto-Sobotka and E. Spiegel and S. Schnabel and L. {Schulze Waltrup} and P. Eilers and T. Kneib and G. Kauermann},
title = {\pkg{expectreg}: Expectile and Quantile Regression},
title = {{expectreg}: Expectile and Quantile Regression},
year = {2021},
note = {\proglang{R} package version 0.52},
edition = {R package version 0.52},
url = {https://CRAN.R-project.org/package=expectreg},
}

@Article{plummer2006,
title = {\pkg{coda}: Convergence Diagnosis and Output Analysis for MCMC},
title = {{coda}: Convergence Diagnosis and Output Analysis for MCMC},
author = {M. Plummer and N. Best and K. Cowles and K. Vines},
journal = {\proglang{R} News},
journal = {R News},
year = {2006},
volume = {6},
number = {1},
@@ -356,16 +363,17 @@ @Article{pullenayegum2016

@Manual{ramsay2021,
author = {J. O. Ramsay and S. Graves and G. Hooker},
title = {\pkg{fda}: Functional Data Analysis},
title = {{fda}: Functional Data Analysis},
year = {2021},
note = {\proglang{R} package version 5.6.0},
edition = {R package version 5.6.0},
url = {https://CRAN.R-project.org/package=fda},
}

@Manual{r2020,
title = {\proglang{R}: A Language and Environment for Statistical Computing},
author = {{\proglang{R} Core Team}},
organization = {\proglang{R} Foundation for Statistical Computing},
title = {{R}: A Language and Environment for Statistical Computing},
author = {{R} Core Team}},
organization = {{R} Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2020},
url = {https://www.R-project.org/},
@@ -395,10 +403,11 @@ @Book{ruppert2003
keywords = {}}

@Manual{schumacher2020,
title = {\pkg{anthro}: Computation of the WHO Child Growth Standards},
title = {{anthro}: Computation of the WHO Child Growth Standards},
author = {D. Schumacher and E. Borghi and J. Polonsky},
year = {2020},
note = {\proglang{R} package version 1.0.0},
edition = {R package version 1.0.0},
url = {https://CRAN.R-project.org/package=anthro},
}

@@ -415,7 +424,7 @@ @Article{skrondal2009

@Article{stasinopoulos2007,
author = {D. M. Stasinopoulos and R. A. Rigby},
title = {Generalized Additive Models for Location Scale and Shape (GAMLSS) in \proglang{R}},
title = {Generalized Additive Models for Location Scale and Shape (GAMLSS) in {R}},
journal = {Journal of Statistical Software},
volume = {23},
number = {7},
@@ -457,7 +466,7 @@ @Article{towers2014
}

@Article{umlauf2021,
title = {\pkg{bamlss}: A Lego Toolbox for Flexible Bayesian Regression (and Beyond)},
title = {{bamlss}: A Lego Toolbox for Flexible Bayesian Regression (and Beyond)},
volume = {100},
doi = {10.18637/jss.v100.i04},
number = {4},
@@ -490,34 +499,38 @@ @Article{vanbuuren2014d
}

@Manual{vanbuuren2018b,
title = {\pkg{AGD}: Analysis of Growth Data},
title = {{AGD}: Analysis of Growth Data},
author = {S. {van Buuren}},
year = {2018},
note = {\proglang{R} package version 0.39},
edition = {R package version 0.39},
url = {https://CRAN.R-project.org/package=AGD},
}

@Manual{vanbuuren2021b,
author = {S. {van Buuren}},
title = {\pkg{chartplotter}: Analysing and Plotting Growth Curves},
title = {{chartplotter}: Analysing and Plotting Growth Curves},
year = {2021},
note = {\proglang{R} package version 0.31.1},
edition = {R package version 0.31.1},
url = {https://github.com/growthcharts/chartplotter},
}

@Manual{vanbuuren2021,
author = {S. {van Buuren}},
title = {\pkg{nlreferences}: Growth References for Children Living in the Netherlands},
title = {{nlreferences}: Growth References for Children Living in the Netherlands},
year = {2021},
note = {\proglang{R} package version 0.15.0},
edition = {R package version 0.15.0},
url = {https://github.com/growthcharts/nlreferences},
}

@Manual{vogel2020,
title = {\pkg{childsds}: Data and Methods Around Reference Values in Pediatrics},
title = {{childsds}: Data and Methods Around Reference Values in Pediatrics},
author = {M. Vogel},
year = {2020},
note = {\proglang{R} package version 0.8.0},
edition = {R package version 0.8.0},
url = {https://CRAN.R-project.org/package=childsds},
}

@@ -534,15 +547,16 @@ @Article{wood2011

@Manual{xiao2021,
author = {L. Xiao and C. Li and W. Checkley and C. Crainiceanu},
title = {\pkg{face}: Fast Covariance Estimation for Sparse Functional Data},
title = {{face}: Fast Covariance Estimation for Sparse Functional Data},
year = {2021},
note = {\proglang{R} package version 0.1-6},
edition = {R package version 0.1-6},
url = {https://CRAN.R-project.org/package=face},
}

@Article{ziyatdinov2018,
author = {A. Ziyatdinov and M. Vázquez-Santiago and H. Brunel and A. Martinez-Perez and H. Aschard and J. M. Soria},
title = {\pkg{lme4qtl}: Linear Mixed Models with Flexible Covariance Structure for Genetic Studies of Related Individuals},
title = {{lme4qtl}: Linear Mixed Models with Flexible Covariance Structure for Genetic Studies of Related Individuals},
journal = {BMC Bioinformatics},
volume = {19},
number = {1},
@@ -552,10 +566,11 @@ @Article{ziyatdinov2018
}

@Manual{pkg:brokenstick,
author = {Stef {van Buuren}},
title = {\pkg{brokenstick}: Broken Stick Model for Irregular Longitudinal Data},
year = {2022},
note = {\proglang{R} package version 2.4.0},
author = {S. {van Buuren}},
title = {{brokenstick}: Broken Stick Model for Irregular Longitudinal Data},
year = {2023},
note = {\proglang{R} package version 2.5.0},
edition = {R package version 2.5.0},
url = {https://CRAN.R-project.org/package=brokenstick},
}
