CRAN Submission Accepted.

LukeDuttweiler · May 19, 2024 · 74aafaf · 74aafaf
1 parent 44645d1
commit 74aafaf
Show file tree

Hide file tree

Showing 7 changed files with 39 additions and 206 deletions.
diff --git a/CRAN-SUBMISSION b/CRAN-SUBMISSION
@@ -1,3 +1,3 @@
 Version: 0.1.0
-Date: 2024-05-14 18:08:32 UTC
-SHA: f08fb551a01360bd27af0acf43204e4c3ccddcbf
+Date: 2024-05-15 16:24:33 UTC
+SHA: 44645d11654bf12abad21d865b368be83c010451
diff --git a/R/fit.R b/R/fit.R
@@ -62,7 +62,7 @@ skipTrack.fit <- function(Y,cluster,
     par <- tryCatch(liInference(Y = Y, cluster = cluster,
                                 S = numSkips, startingParams = liHyperparams),
                     error = function(e){
-                      print(e)
+                      warning(e)
                       return(liHyperparams)
                     })
   }

diff --git a/README.Rmd b/README.Rmd
@@ -22,58 +22,23 @@ knitr::opts_chunk$set(
 <!-- badges: start -->
 [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
 [![R-CMD-check](https://github.com/LukeDuttweiler/skipTrack/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/LukeDuttweiler/skipTrack/actions/workflows/R-CMD-check.yaml)
+[![CRAN status](https://www.r-pkg.org/badges/version/skipTrack)](https://CRAN.R-project.org/package=skipTrack)
 <!-- badges: end -->
 
 Welcome to the SkipTrack Package! 
 
-SkipTrack is a Bayesian hierarchical model for self-reported menstrual cycle length data on mobile health apps. The model is an <!-- significant --> extension of the hierarchical model presented in @li2022predictive that focuses on predicting an individual's next menstrual cycle start date while accounting for cycle length inaccuracies introduced by non-adherence in user self-tracked data.
+SkipTrack is a Bayesian hierarchical model for self-reported menstrual cycle length data on mobile health apps. The model is an extension of the hierarchical model presented in @li2022predictive that focuses on predicting an individual's next menstrual cycle start date while accounting for cycle length inaccuracies introduced by non-adherence in user self-tracked data. Check out the 'Getting Started' vignette to see an overview of the SkipTrack Model!
 
 ## Installation
 
-```{r}
+``` r
 #Install from CRAN
 install.packages('skipTrack')
 
 #Install Development Version
 devtools::install_github("LukeDuttweiler/skipTrack")
 ```
 
-## Model
-
-@li2022predictive notes that apps designed to help users track their menstrual cycles "are subject to adherence artifacts that may obscure health-related conclusions: if a user forgets to track their period, their cycle length computations are inflated." This is visualized in the image below in which the numbers represent days after the initial bleeding day is recorded in the app, $\color{red}{\text{red}}$ days are bleeding days recorded by the user, and $\color{blue}{\text{blue}}$ days are bleeding days not recorded by the user. 
-
-$$\overbrace{\underbrace{\color{red}{1, 2, 3, 4}, 5,  \dots, 29}_\text{True Cycle, 29 Days}}^\text{Recorded Cycle, 29 Days}, \overbrace{\underbrace{\color{red}{30, 31, 32, 33}, 34,  \dots, 61}_\text{True Cycle, 32 Days}, \underbrace{\color{blue}{62, 63, 64, 65}, 66,  \dots, 90}_\text{True Cycle, 29 Days}}^\text{Recorded Cycle, 61 Days}$$
-
-The SkipTrack model extends the model given by @li2022predictive by specifying parameters for each individuals for cycle length regularity, as well as their cycle length mean, and weakening assumptions made by Li et al. on the probability of failing to track a cycle.
-
-<!--and by allowing for other sources of data to help identify associations between covariates and cycle length mean or regularity, while still accounting for skips in self-tracking adherence. -->
-
-In short, the modeling framework assumed by SkipTrack is as follows. The observed cycle lengths are represented with $y_{ij}$ where $1 \leq i \leq n$ represents an individual who has contributed $n_i$ observations, with $1 \leq j \leq n_i$. We assume that 
-
-$$
-y_{ij} \sim \text{LogNormal}\big(\mu_i + \log(c_{ij}), \tau_i\big),
-$$
-where $\mu_i$ is an individual level mean parameter, $\tau_i$ is an individual level precision parameter, and $c_{ij}$ is an integer-valued parameter representing the number of true cycles present in the observed cycle $y_{ij}$. That is, if $c_{ij} = 1$ then $y_{ij}$ is a true cycle, if $c_{ij} = 2$ then $y_{ij}$ gives the length of two true cycles added together, and so on. 
-
-We then assume 
-
-$$
-\mu_i \sim \text{Normal}(\mu, \rho) \mspace{100mu}\tau_i \sim \text{Gamma}(\theta, \phi)
-$$
-
-where $\rho$ is a precision parameter, and the Gamma distribution above is parameterized by mean ($\theta$) and rate $\phi$.
-
-<!--We then include covariates from two matrices $X$ and $Z$ (which may be, but are not necessarily, equal) by 
-
-$$
-\mu_i \sim \text{Normal}\big(X_i^T\beta, \rho\big) \mspace{100mu}\tau_i \sim \text{Gamma}\big(\exp(Z_i^T\Gamma), \phi\big) 
-$$
-where $\rho$ is a precision parameter, the Gamma distribution above is parameterized by mean and rate, and $X_i$ and $Z_i$ are the $i$th rows of $X$ and $Z$ respectively. -->
-
-This is a fully interpretable model that allows for the identification of skipping in cycle tracking, while allowing for different individual's regularities, and accounting for uncertainty in the model. A paper discussing the full model details will be published soon. 
-
-## Example Usage
-
 # Package Usage
 
 The SkipTrack package provides functions for fitting the SkipTrack model, evaluating model run diagnostics, retrieving and visualizing model results, and simulating related data. We begin our tutorial by examining some simulated data. 
@@ -87,30 +52,6 @@ First, we simulate data on 100 individuals from the SkipTrack model where each o
 ```{r}
 #Simulate data
 dat <- skipTrack.simulate(n = 100, model = 'skipTrack', skipProb = c(.75, .2, .05))
-
-names(dat)
-```
-
-The result of the simulation function is simply a named list with various components. The (currently) important components are
-
-
-  * `Y`: the $y_{ij}$ values, observed outcomes
-  * `cluster`: the $i$ values, individual markers 
-  * `NumTrue`: the $c_{ij}$ values, number of true cycles in an observed cycle
-  * `Underlying`: underlying parameters pertaining to the specific model used for data simulation
-
-<!--
-* `X`: the matrix $X$, covariates for cycle length mean
-  * `Z`: the matrix $Z$, covariates for cycle length regularity
-  * `Beta`: the true values of $\beta$, parameters for cycle length mean
-  * `Gamma`: the true values of $\Gamma$, parameters for cycle length regularity
--->
-
-Looking at the histogram of `dat$Y`, we can see a clear mixture of at least two distributions, one centered around 30 days, and another centered near 60 days (corresponding to the true cycles and observed cycles containing two true cycles respectively), which is what we expect based on our generation. 
-
-```{r, fig.align='center', fig.width = 7}
-#Histogram of observed outcomes
-hist(dat$Y, breaks = 10:150)
 ```
 
 Fitting the SkipTrack model using this simulated data requires a call to the function `skipTrack.fit`. Note that because this is a Bayesian model and is fit with an MCMC algorithm, it can take some time with large datasets and a high number of MCMC reps and chains. 
@@ -154,7 +95,7 @@ summary(ft)
 summary(ft, burnIn = 500)
 ```
 
-This introduction provides enough information to start fitting the SkipTrack model. For further information regarding different methods of simulating data, additional model fitting, and tuning parameters for fitting the model, please see the help pages. Additional vignettes are forthcoming.
+This introduction provides enough information to start fitting the SkipTrack model. For further information regarding different methods of simulating data, additional model fitting, and tuning parameters for fitting the model, please see the help pages and the 'Getting Started' vignette. Additional vignettes are forthcoming.
 
 \newpage
 

diff --git a/README.md b/README.md
@@ -12,110 +12,30 @@
 [![Lifecycle:
 experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
 [![R-CMD-check](https://github.com/LukeDuttweiler/skipTrack/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/LukeDuttweiler/skipTrack/actions/workflows/R-CMD-check.yaml)
+[![CRAN
+status](https://www.r-pkg.org/badges/version/skipTrack)](https://CRAN.R-project.org/package=skipTrack)
 <!-- badges: end -->
 
 Welcome to the SkipTrack Package!
 
 SkipTrack is a Bayesian hierarchical model for self-reported menstrual
-cycle length data on mobile health apps. The model is an
-<!-- significant --> extension of the hierarchical model presented in Li
-et al. (2022) that focuses on predicting an individual’s next menstrual
-cycle start date while accounting for cycle length inaccuracies
-introduced by non-adherence in user self-tracked data.
+cycle length data on mobile health apps. The model is an extension of
+the hierarchical model presented in Li et al. (2022) that focuses on
+predicting an individual’s next menstrual cycle start date while
+accounting for cycle length inaccuracies introduced by non-adherence in
+user self-tracked data. Check out the ‘Getting Started’ vignette to see
+an overview of the SkipTrack Model!
 
 ## Installation
 
 ``` r
 #Install from CRAN
 install.packages('skipTrack')
-#> Installing package into '/private/var/folders/9h/055tc3cs7ql0r89g2lrc5j1h0000gn/T/Rtmp7dAERQ/temp_libpath52ae786c035f'
-#> (as 'lib' is unspecified)
-#> Warning: package 'skipTrack' is not available for this version of R
-#> 
-#> A version of this package for your version of R might be available elsewhere,
-#> see the ideas at
-#> https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
 
 #Install Development Version
 devtools::install_github("LukeDuttweiler/skipTrack")
-#> Using GitHub PAT from the git credential store.
-#> Downloading GitHub repo LukeDuttweiler/skipTrack@HEAD
-#> farver       (2.1.1      -> 2.1.2     ) [CRAN]
-#> RcppArmad... (0.12.8.2.1 -> 0.12.8.3.0) [CRAN]
-#> Installing 2 packages: farver, RcppArmadillo
-#> Installing packages into '/private/var/folders/9h/055tc3cs7ql0r89g2lrc5j1h0000gn/T/Rtmp7dAERQ/temp_libpath52ae786c035f'
-#> (as 'lib' is unspecified)
-#> 
-#> The downloaded binary packages are in
-#>  /var/folders/9h/055tc3cs7ql0r89g2lrc5j1h0000gn/T//RtmplTNwy8/downloaded_packages
-#> ── R CMD build ─────────────────────────────────────────────────────────────────
-#> * checking for file ‘/private/var/folders/9h/055tc3cs7ql0r89g2lrc5j1h0000gn/T/RtmplTNwy8/remotes654466a9445e/LukeDuttweiler-skipTrack-bc836ab/DESCRIPTION’ ... OK
-#> * preparing ‘skipTrack’:
-#> * checking DESCRIPTION meta-information ... OK
-#> * checking for LF line-endings in source and make files and shell scripts
-#> * checking for empty or unneeded directories
-#> * building ‘skipTrack_0.1.0.tar.gz’
-#> Installing package into '/private/var/folders/9h/055tc3cs7ql0r89g2lrc5j1h0000gn/T/Rtmp7dAERQ/temp_libpath52ae786c035f'
-#> (as 'lib' is unspecified)
-#> Adding 'skipTrack_0.1.0.tgz' to the cache
 ```
 
-## Model
-
-Li et al. (2022) notes that apps designed to help users track their
-menstrual cycles “are subject to adherence artifacts that may obscure
-health-related conclusions: if a user forgets to track their period,
-their cycle length computations are inflated.” This is visualized in the
-image below in which the numbers represent days after the initial
-bleeding day is recorded in the app, $\color{red}{\text{red}}$ days are
-bleeding days recorded by the user, and $\color{blue}{\text{blue}}$ days
-are bleeding days not recorded by the user.
-
-$$\overbrace{\underbrace{\color{red}{1, 2, 3, 4}, 5,  \dots, 29}_\text{True Cycle, 29 Days}}^\text{Recorded Cycle, 29 Days}, \overbrace{\underbrace{\color{red}{30, 31, 32, 33}, 34,  \dots, 61}_\text{True Cycle, 32 Days}, \underbrace{\color{blue}{62, 63, 64, 65}, 66,  \dots, 90}_\text{True Cycle, 29 Days}}^\text{Recorded Cycle, 61 Days}$$
-
-The SkipTrack model extends the model given by Li et al. (2022) by
-specifying parameters for each individuals for cycle length regularity,
-as well as their cycle length mean, and weakening assumptions made by Li
-et al. on the probability of failing to track a cycle.
-
-<!--and by allowing for other sources of data to help identify associations between covariates and cycle length mean or regularity, while still accounting for skips in self-tracking adherence. -->
-
-In short, the modeling framework assumed by SkipTrack is as follows. The
-observed cycle lengths are represented with $y_{ij}$ where
-$1 \leq i \leq n$ represents an individual who has contributed $n_i$
-observations, with $1 \leq j \leq n_i$. We assume that
-
-$$
-y_{ij} \sim \text{LogNormal}\big(\mu_i + \log(c_{ij}), \tau_i\big),
-$$ where $\mu_i$ is an individual level mean parameter, $\tau_i$ is an
-individual level precision parameter, and $c_{ij}$ is an integer-valued
-parameter representing the number of true cycles present in the observed
-cycle $y_{ij}$. That is, if $c_{ij} = 1$ then $y_{ij}$ is a true cycle,
-if $c_{ij} = 2$ then $y_{ij}$ gives the length of two true cycles added
-together, and so on.
-
-We then assume
-
-$$
-\mu_i \sim \text{Normal}(\mu, \rho) \mspace{100mu}\tau_i \sim \text{Gamma}(\theta, \phi)
-$$
-
-where $\rho$ is a precision parameter, and the Gamma distribution above
-is parameterized by mean ($\theta$) and rate $\phi$.
-
-<!--We then include covariates from two matrices $X$ and $Z$ (which may be, but are not necessarily, equal) by 
-&#10;$$
-\mu_i \sim \text{Normal}\big(X_i^T\beta, \rho\big) \mspace{100mu}\tau_i \sim \text{Gamma}\big(\exp(Z_i^T\Gamma), \phi\big) 
-$$
-where $\rho$ is a precision parameter, the Gamma distribution above is parameterized by mean and rate, and $X_i$ and $Z_i$ are the $i$th rows of $X$ and $Z$ respectively. -->
-
-This is a fully interpretable model that allows for the identification
-of skipping in cycle tracking, while allowing for different individual’s
-regularities, and accounting for uncertainty in the model. A paper
-discussing the full model details will be published soon.
-
-## Example Usage
-
 # Package Usage
 
 The SkipTrack package provides functions for fitting the SkipTrack
@@ -135,42 +55,8 @@ cycle, a 20% probability of being two true cycles recorded as one, and a
 ``` r
 #Simulate data
 dat <- skipTrack.simulate(n = 100, model = 'skipTrack', skipProb = c(.75, .2, .05))
-
-names(dat)
-#> [1] "Y"          "cluster"    "X"          "Z"          "Beta"      
-#> [6] "Gamma"      "NumTrue"    "Underlying"
 ```
 
-The result of the simulation function is simply a named list with
-various components. The (currently) important components are
-
-- `Y`: the $y_{ij}$ values, observed outcomes
-- `cluster`: the $i$ values, individual markers
-- `NumTrue`: the $c_{ij}$ values, number of true cycles in an observed
-  cycle
-- `Underlying`: underlying parameters pertaining to the specific model
-  used for data simulation
-
-<!--
-* `X`: the matrix $X$, covariates for cycle length mean
-  * `Z`: the matrix $Z$, covariates for cycle length regularity
-  * `Beta`: the true values of $\beta$, parameters for cycle length mean
-  * `Gamma`: the true values of $\Gamma$, parameters for cycle length regularity
--->
-
-Looking at the histogram of `dat$Y`, we can see a clear mixture of at
-least two distributions, one centered around 30 days, and another
-centered near 60 days (corresponding to the true cycles and observed
-cycles containing two true cycles respectively), which is what we expect
-based on our generation.
-
-``` r
-#Histogram of observed outcomes
-hist(dat$Y, breaks = 10:150)
-```
-
-<img src="man/figures/README-unnamed-chunk-5-1.png" width="100%" style="display: block; margin: auto;" />
-
 Fitting the SkipTrack model using this simulated data requires a call to
 the function `skipTrack.fit`. Note that because this is a Bayesian model
 and is fit with an MCMC algorithm, it can take some time with large
@@ -208,23 +94,23 @@ longer).
 skipTrack.diagnostics(ft, param = 'cijs')
 ```
 
-<img src="man/figures/README-unnamed-chunk-7-1.png" width="100%" style="display: block; margin: auto;" />
+<img src="man/figures/README-unnamed-chunk-5-1.png" width="100%" style="display: block; margin: auto;" />
 
     #> ----------------------------------------------------
     #> Generalized MCMC Diagnostics using lanfear Method 
     #> ----------------------------------------------------
     #> 
     #> |Effective Sample Size: 
     #> |---------------------------
-    #> | Chain 1| Chain 2| Chain 3| Chain 4|     Sum|
-    #> |-------:|-------:|-------:|-------:|-------:|
-    #> |  85.146|  88.431|  56.475|  96.856| 326.909|
+    #> | Chain 1| Chain 2| Chain 3| Chain 4|    Sum|
+    #> |-------:|-------:|-------:|-------:|------:|
+    #> | 369.984| 151.771|  99.466|  139.51| 760.73|
     #> 
     #> |Gelman-Rubin Diagnostic: 
     #> |---------------------------
     #> | Point est.| Upper C.I.|
     #> |----------:|----------:|
-    #> |      1.009|      1.012|
+    #> |      1.001|      1.002|
 
 ### Visualization
 
@@ -236,7 +122,7 @@ can simply use `plot(ft)`, and the plots are directly accessible using
 plot(ft)
 ```
 
-<img src="man/figures/README-unnamed-chunk-8-1.png" width="100%" style="display: block; margin: auto;" />
+<img src="man/figures/README-unnamed-chunk-6-1.png" width="100%" style="display: block; margin: auto;" />
 
 ### Summary
 
@@ -254,21 +140,21 @@ summary(ft)
 #> Mean Coefficients: 
 #> 
 #>             Estimate       95% CI Lower 95% CI Upper
-#> (Intercept)     3.41              3.381        3.439
+#> (Intercept)      3.4              3.376        3.423
 #> 
 #> ----------------------------------------------------
 #> Precision Coefficients: 
 #> 
 #>             Estimate       95% CI Lower 95% CI Upper
-#> (Intercept)    5.507              5.341        5.656
+#> (Intercept)    5.593              5.423        5.755
 #> 
 #> ----------------------------------------------------
 #> Diagnostics: 
 #> 
 #>        Effective Sample Size       Gelman-Rubin
-#> Betas                4004.00               1.00
-#> Gammas                 21.71               1.00
-#> cijs                  370.96               1.01
+#> Betas                4004.00                  1
+#> Gammas                 21.74                  1
+#> cijs                  462.34                  1
 #> 
 #> ----------------------------------------------------
 
@@ -279,30 +165,30 @@ summary(ft, burnIn = 500)
 #> Mean Coefficients: 
 #> 
 #>             Estimate       95% CI Lower 95% CI Upper
-#> (Intercept)     3.41              3.381        3.439
+#> (Intercept)    3.399              3.375        3.423
 #> 
 #> ----------------------------------------------------
 #> Precision Coefficients: 
 #> 
 #>             Estimate       95% CI Lower 95% CI Upper
-#> (Intercept)    5.481              5.256        5.648
+#> (Intercept)    5.593              5.414        5.782
 #> 
 #> ----------------------------------------------------
 #> Diagnostics: 
 #> 
 #>        Effective Sample Size       Gelman-Rubin
-#> Betas                4004.00               1.00
-#> Gammas                 21.76               1.01
-#> cijs                  354.69               1.01
+#> Betas                4004.00                  1
+#> Gammas                 21.78                  1
+#> cijs                  460.43                  1
 #> 
 #> ----------------------------------------------------
 ```
 
 This introduction provides enough information to start fitting the
 SkipTrack model. For further information regarding different methods of
 simulating data, additional model fitting, and tuning parameters for
-fitting the model, please see the help pages. Additional vignettes are
-forthcoming.
+fitting the model, please see the help pages and the ‘Getting Started’
+vignette. Additional vignettes are forthcoming.
 
 ## Bibliography