The 'LCTMC.simulate' package

This R package provides an intuitive interface to simulate censored longitudinal multistate processes. The censored data are assumed to be observed at irregular time intervals (i.e., non-discrete time). The data generation is based on the continuous-time Markov/Semi-Markov chain framework.

Overview

The continuous-time Markov models have been studied for more than half of a century as of 2023. It has found many use cases in epidemiological studies, clinical trials, and public health surveillance. It is a useful tool to study the dynamic of a multistate process. It can also be extended in several directions. For example, the msm package has implementations of continous-time hidden Markov models (HMM), or the distribution assumptions can be relaxed to encompass more complex processes (e.g., semi-Markov models).

This package allows for data simulation from our extension of the CTMC model, latent class CTMC. In our study, we assumed individuals belong to one of $K$ mutually exclusive unobservable clusters, where each cluster is characterized by their differences in the disease dynamic (i.e., progressive, regressive, or mixture of both).

Installation

# use the 'devtools' package to directly install from GitHub
devtools::install_github("j-kuo/LCTMC.simulate")

Usage

Here we demonstrate the simulation of a two-state process:

# set seed
set.seed(456)

# simulate
d = LCTMC.simulate::simulate_LCTMC(
  N.indiv = 5,
  N.obs_times = 5,

  max.obs_times = 10,
  fix.obs_times = FALSE,

  true_param = LCTMC.simulate::gen_true_param(K_class = 3, M_state = 2),
  alpha.include = TRUE,
  beta.include = TRUE,

  K = 3,
  M = 2,
  p1 = 2,
  p2 = 2,

  initS_p = c(0.5, 0.5),
  death = NULL,
  sojourn = list(dist = "gamma", gamma.shape = 2)
)

This code simulates a two-state process with 3 latent clusters. Using the N.indiv = 5 argument, 5 independent CTMC processes were generated (one per person). Excluding the baseline, each person was observed at 5 irregular times (N.obs_times = 5).

The as.data.frame() function extracts/tidies the simulated data and returns the data as data.frame objects. In this example, we specify id = "BA000" to extract only this person's simulated data.

# coerce the "observed" data into a data frame
as.data.frame(sim_data, type = "obs", id = "BA000")
     id  obsTime state_at_obsTime         x1 x2         w1 w2 latent_class
1 BA000 0.000000                1 0.04352429  1 -0.2349686  1            1
2 BA000 5.470254                2 0.04352429  1 -0.2349686  1            1
3 BA000 7.793661                1 0.04352429  1 -0.2349686  1            1
4 BA000 7.915950                1 0.04352429  1 -0.2349686  1            1
5 BA000 7.986747                1 0.04352429  1 -0.2349686  1            1
6 BA000 9.930836                1 0.04352429  1 -0.2349686  1            1

# coerce the "exact transition" data into a data frame
as.data.frame(sim_data, type = "exact", id = "BA000")
     id  transTime state_at_transTime         x1 x2         w1 w2 latent_class
1 BA000  0.0000000                  1 0.04352429  1 -0.2349686  1            1
2 BA000  0.6124992                  2 0.04352429  1 -0.2349686  1            1
3 BA000  5.8285964                  1 0.04352429  1 -0.2349686  1            1
4 BA000 12.6018292                  2 0.04352429  1 -0.2349686  1            1

Visualizing CTMC

To visualize the longitudinal process, we plot person BA000's data over time. The figure on the left shows the actual process.

# S3 method for plotting
plot(x = d, id = "BA000")

The left figure's interpretation is the following:

This person begins by being in state 1 at time = 0

The state then transitions to state 2 at approximately time = 0.6

At roughly time = 5.8, the state transitions back to state 1

At last, the state then jumps back to state 2 at around time = 12.6

The red dots in the figures indicate the times at which the data are being collected on this person (e.g., at a clinic visit). If the model does not properly adjust for censoring, we would be assuming the longitudinal process is the figure on the right.

However, in time-to-event analyses, making this assumption will almost surely lead to biased estimates. Since any changes that occur between observations are not accounted for. The CTMC model or other Markov-based models handle these unobserved in-between-observation changes by making some assumptions on the sojourn time (what is sojourn time). For more info see the Wiki section at the bottom of this page.

More Info

Authors

Jacky Kuo - author, maintainer
Wenyaw Chan, PhD - dissertation advisor

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.github		.github
R		R
inst/examples		inst/examples
man		man
visuals		visuals
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LCTMC.simulate.Rproj		LCTMC.simulate.Rproj
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

The 'LCTMC.simulate' package

Overview

Installation

Usage

Visualizing CTMC

More Info

Authors

Wiki

About

Licenses found

Releases

Packages

Languages

License

Licenses found

j-kuo/LCTMC.simulate

Folders and files

Latest commit

History

Repository files navigation

The 'LCTMC.simulate' package

Overview

Installation

Usage

Visualizing CTMC

More Info

Authors

Wiki

About

Topics

Resources

License

Licenses found

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages