diff --git a/DESCRIPTION b/DESCRIPTION new file mode 100644 index 0000000..6b33997 --- /dev/null +++ b/DESCRIPTION @@ -0,0 +1,43 @@ +Package: jlmerclusterperm +Title: Cluster-Based Permutation Analysis for Densely Sampled Time Data +Version: 1.0.0 +Authors@R: + person("June", "Choe", , "jchoe001@gmail.com", role = c("aut", "cre", "cph"), + comment = c(ORCID = "0000-0002-0701-921X")) +Description: An implementation of fast cluster-based permutation analysis + (CPA) for densely-sampled time data developed in Maris & Oostenveld, + 2007 . Supports (generalized, + mixed-effects) regression models for the calculation of timewise + statistics. Provides both a wholesale and a piecemeal interface to the + CPA procedure with an emphasis on interpretability and diagnostics. + Integrates 'Julia' libraries 'MixedModels.JL' and 'GLM.JL' for + performance improvements, with additional functionalities for + interfacing with 'Julia' from 'R' powered by the 'JuliaConnectoR' + package. +License: MIT + file LICENSE +URL: https://github.com/yjunechoe/jlmerclusterperm, + https://yjunechoe.github.io/jlmerclusterperm/ +BugReports: https://github.com/yjunechoe/jlmerclusterperm/issues +Depends: R (>= 3.5) +Imports: cli, generics, JuliaConnectoR, lme4, parallel, stats, utils +Suggests: broom, broom.mixed, covr, dplyr, eyetrackingR, forcats, + future, ggplot2, knitr, MASS, patchwork, readr, rmarkdown, + scales, testthat (>= 3.0.0), tibble +VignetteBuilder: knitr +Config/testthat/edition: 3 +Encoding: UTF-8 +RoxygenNote: 7.2.3 +SystemRequirements: Julia (>= 1.8) +Collate: 'jlmerclusterperm-package.R' 'aaa.R' 'utils.R' + 'interop-utils.R' 'interop-utils-unexported.R' 'julia_rng.R' + 'jlmer_spec.R' 'jlmer.R' 'compute_timewise_statistics.R' + 'permute.R' 'permute_timewise_statistics.R' + 'clusters_methods.R' 'extract_clusters.R' 'calculate_pvalue.R' + 'clusterpermute.R' 'threshold_search.R' 'tidy.R' 'zzz.R' + 'srr-stats-standards.R' +NeedsCompilation: no +Packaged: 2023-06-26 13:50:44 UTC; jchoe +Author: June Choe [aut, cre, cph] () +Maintainer: June Choe +Repository: CRAN +Date/Publication: 2023-06-27 17:00:02 UTC diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..18f6e7f --- /dev/null +++ b/LICENSE @@ -0,0 +1,2 @@ +YEAR: 2023 +COPYRIGHT HOLDER: June Choe diff --git a/MD5 b/MD5 new file mode 100644 index 0000000..f831b13 --- /dev/null +++ b/MD5 @@ -0,0 +1,79 @@ +99d065d3766d0bd95ccbbefd6ff86652 *DESCRIPTION +d37c99895d661b1cbb1434dba60e554f *LICENSE +ae8ed6d1e62154d0b44af768c246591c *NAMESPACE +4970923336423f93a9132d3810cf5d1b *NEWS.md +be49ecf1db6646c104427d1d426a03dd *R/aaa.R +dded6ecb8f14127b62d5cc9f73bef825 *R/calculate_pvalue.R +4d0a6389674d0dbc27f54be5c92a9095 *R/clusterpermute.R +d7d7d78d410138ec1626903f2237fd15 *R/clusters_methods.R +4867888cfb27a9cd98871505c524deb0 *R/compute_timewise_statistics.R +08484587b9e29e9bbf90ced3d90863c0 *R/extract_clusters.R +9ee9ded7d3f71dd858888ccb00fd86b1 *R/interop-utils-unexported.R +dc50d0ccb8d23819a95f446d18c380e1 *R/interop-utils.R +253cc215491df9bdea3ccbd790c1c275 *R/jlmer.R +5ce0a812aaa2ac392b554292e2da9059 *R/jlmer_spec.R +7e418f869c12eff57a7a089cde1d8327 *R/jlmerclusterperm-package.R +94e4d066934fac9ef9db48b4eff084b8 *R/julia_rng.R +543078391edd55b6f0b0f8daf84e7f88 *R/permute.R +40d4dc2f70fca23e5945b6eabb0c6ece *R/permute_timewise_statistics.R +1709a3f2998bb778fdb579b7bef162d6 *R/srr-stats-standards.R +04dbb3190c9e1fa236853de886d1d4d6 *R/threshold_search.R +28eadc273b5c00bfb4fd0d01b0cacd8d *R/tidy.R +41792612dc2971527fa11a6cf7e9a47a *R/utils.R +c87c2d9619a96ec469efd6c2e01e34ee *R/zzz.R +990403547621e4727898f19b5c4910e1 *README.md +f01682ffd9d5582c0d3b8a226f3c2b16 *build/partial.rdb +adb9de191af250848e0dc03c194aeb6f *build/vignette.rds +e8c98e39c7a9fcb4b53cb90eb37e61cc *inst/doc/jlmerclusterperm.R +2eeeaca67c6a13b0217035742cecf79e *inst/doc/jlmerclusterperm.Rmd +715c28b5a1198d707dcc3a0f4f302057 *inst/doc/jlmerclusterperm.html +9ed999be829541c377dd352b9088af7a *inst/julia/01-utils.jl +c881972c1846a7f2733b9ffd56b8666b *inst/julia/02-jlmer.jl +f4bbbc7613bc8d374c333b05e4a27ef3 *inst/julia/03-compute_timewise_statistics.jl +dfa1f3efae35458bf2a0f171ea8d3a74 *inst/julia/04-extract_clusters.jl +8a6fb995580ca2b59f2b64b76d6dc428 *inst/julia/05-permute.jl +465e79b47e376ff369e2528b0ec6c080 *inst/julia/06-permute_timewise_statistics.jl +11370675fd52536178ae9ce38146c9a7 *inst/julia/Manifest.toml +40c09ab1d240715bffa775b03686ea4e *inst/julia/Project.toml +cc2bf4ca77c78822aaf3e9039c65c3fd *inst/julia/load-pkgs.jl +16d7f1d4183d6f9ac89ab3b5472e662c *man/calculate_clusters_pvalues.Rd +78b8eed1cb045b77ad0994fd71a4c633 *man/cluster_permutation_tidiers.Rd +f391f90b43b90031a7bd47ec847e3bb0 *man/clusterpermute.Rd +f869639236e67e253bf8b689001eb550 *man/compute_timewise_statistics.Rd +d90a37c13fdd54bba8821253c3c910fc *man/extract_empirical_clusters.Rd +82b8dd2ae5b21aed081c026dffda631a *man/extract_null_cluster_dists.Rd +185842332c9c3cf20e477e28ec50f3e9 *man/figures/README-chickweight-1.png +34eb6e9a2cd58064a651364eeb7ba524 *man/figures/README-empirical_statistics-1.png +34eb6e9a2cd58064a651364eeb7ba524 *man/figures/README-plot-empirical_statistics-1.png +37b23af74265d7a1a439baefb5bb2c1d *man/figures/clusterpermute_animation.gif +568eb5f08bc3bc9853ed25d4473ff086 *man/figures/clusterpermute_slice.png +ac2234ede8656d52d22a679fcad79b99 *man/figures/deCarvalho_fig7.jpg +da15c86688a5645b0a4162997a3e9c63 *man/figures/desktop.ini +6523b4232feb1248139e0347b693835b *man/figures/jlmerclusterperm_fn_design.png +15549a03a42c7dd0ff5b8eb78f1c9c33 *man/figures/jlmerclusterperm_logo_plot.png +9ddeb882f552b0d88e63dfee434e2a72 *man/figures/logo.png +2e1a60a70886a0bb890351085236920f *man/jlmer.Rd +92b8f3fb941d7c233447c845f4cb77ab *man/jlmerclusterperm-package.Rd +9cfd94a83e341a8240c97b371a561613 *man/jlmerclusterperm_setup.Rd +cb5fd7110a0f6cfc3ded1ae54c1a75de *man/julia_model_tidiers.Rd +30de1302d68fddc2cc303514d6cabcd0 *man/julia_progress.Rd +1e375a7369ffe57045421faaa498e95d *man/julia_rng.Rd +4131b937b7bf17a5c574ce029f9348a6 *man/make_jlmer_spec.Rd +6b344f72f28d84837c831ec2714d2c07 *man/permute_by_predictor.Rd +de4ba66fc4a2fb339ac15914fe754ee3 *man/permute_timewise_statistics.Rd +e6bee05432e598a82c88b34229fe1d39 *man/reexports.Rd +f8a486bd6e9ce06479b0e9db3799b09f *man/to_jlmer.Rd +fd59a0e73dc78e189c0ddd00ab298b32 *man/walk_threshold_steps.Rd +be9f777cc0c27c3e38da655f9d5b8a3b *tests/testthat.R +0a8ac524aa68f037650976f6da70f475 *tests/testthat/helper-skip.R +ffc32b75d42ee62a5029e05a62e3bcea *tests/testthat/test-aaa.R +cce80002cf27b7242cd6016dd7f37487 *tests/testthat/test-clusterpermute.R +0673653e4f6c707ea9a7f9d952939d3e *tests/testthat/test-jlmer.R +0d186c205ba8dd25d3f5e169fba286de *tests/testthat/test-jlmer_spec.R +e615f953a14780626768f54f1ee9c38b *tests/testthat/test-julia_rng.R +f80ff490f1cbf4a792508d5b48a0cfef *tests/testthat/test-permute.R +5767147d5fd0e5e7f0d30c2fe3e62123 *tests/testthat/test-progress.R +af6c7b09b65113c6554129be6631192b *tests/testthat/test-singularity.R +5a0c129f5d65a9a286e1b99f6102bd18 *tests/testthat/test-threshold_search.R +2499641f7105e7c1b41ac40c9a913318 *tests/testthat/test-timewise_statistics.R +2eeeaca67c6a13b0217035742cecf79e *vignettes/jlmerclusterperm.Rmd diff --git a/NAMESPACE b/NAMESPACE new file mode 100644 index 0000000..cee8dae --- /dev/null +++ b/NAMESPACE @@ -0,0 +1,41 @@ +# Generated by roxygen2: do not edit by hand + +S3method(format,empirical_clusters) +S3method(format,jlmer_mod) +S3method(format,jlmer_spec) +S3method(format,null_cluster_dists) +S3method(glance,jlmer_mod) +S3method(print,CPA_out) +S3method(print,empirical_clusters) +S3method(print,jlmer_mod) +S3method(print,jlmer_spec) +S3method(print,null_cluster_dists) +S3method(print,timewise_statistics) +S3method(tidy,empirical_clusters) +S3method(tidy,jlmer_mod) +S3method(tidy,null_cluster_dists) +S3method(tidy,timewise_statistics) +export(calculate_clusters_pvalues) +export(clusterpermute) +export(clusters_are_comparable) +export(compute_timewise_statistics) +export(extract_empirical_clusters) +export(extract_null_cluster_dists) +export(get_rng_seed) +export(get_rng_state) +export(glance) +export(jlmer) +export(jlmerclusterperm_setup) +export(julia_progress) +export(make_jlmer_spec) +export(permute_by_predictor) +export(permute_timewise_statistics) +export(reset_rng_state) +export(set_rng_seed) +export(set_rng_state) +export(tidy) +export(to_jlmer) +export(walk_threshold_steps) +importFrom(generics,glance) +importFrom(generics,tidy) +importFrom(stats,setNames) diff --git a/NEWS.md b/NEWS.md new file mode 100644 index 0000000..a70b69e --- /dev/null +++ b/NEWS.md @@ -0,0 +1,21 @@ +# jlmerclusterperm 1.0.0 + +### Breaking changes + +- The `threshold_steps` argument of `walk_threshold_steps()` is renamed to `steps`. + +### New features + +- New functions to interface with Julia RNG seed: `get_rng_seed()` and `set_rng_seed()` + +### Other improvements + +- `jlmerclusterperm_setup()` now echos `Pkg.instantiate()` to print precompilation information upon the first setup call + +# jlmerclusterperm 0.2.0 + +Added vignettes. Significant usability improvements + +# jlmerclusterperm 0.1.0 + +Initial release diff --git a/R/aaa.R b/R/aaa.R new file mode 100644 index 0000000..bbe4acb --- /dev/null +++ b/R/aaa.R @@ -0,0 +1,115 @@ +#' @keywords internal +.jlmerclusterperm <- new.env(parent = emptyenv()) +.jlmerclusterperm$cli_theme <- list( + h1 = list( + `font-weight` = "regular", + `margin-top` = 0, + fmt = function(x) cli::rule(x, line_col = "white") + ), + span.lemph = list(color = "grey", `font-style` = "italic"), + span.el = list(color = "green"), + span.fm = list() +) + +is_setup <- function() isTRUE(.jlmerclusterperm$is_setup) + +#' Initial setup for the jlmerclusterperm package +#' +#' @param ... Ignored. +#' @param restart Whether to set up a fresh Julia session, given that one is already running. +#' If `FALSE` and `jlmerclusterperm_setup()` has already been called, nothing happens. +#' @param verbose Print progress and messages from Julia in the console +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @export +#' @return TRUE +jlmerclusterperm_setup <- function(..., restart = TRUE, verbose = TRUE) { + if (!JuliaConnectoR::juliaSetupOk()) cli::cli_abort("Cannot set up Julia for {.pkg jlmerclusterperm}.") + if (restart || !is_setup()) { + JuliaConnectoR::stopJulia() + setup_with_progress(verbose = verbose) + .jlmerclusterperm$is_setup <- TRUE + } else { + if (verbose) cli::cli_inform("Julia instance already running - skipping setup.") + } + invisible(TRUE) +} + +setup_with_progress <- function(..., verbose = TRUE) { + start_with_threads(verbose = verbose) + set_projenv(verbose = verbose) + source_jl(verbose = verbose) +} + +start_with_threads <- function(..., max_threads = 7, verbose = TRUE) { + JULIA_NUM_THREADS <- Sys.getenv("JULIA_NUM_THREADS") + nthreads <- getOption("jlmerclusterperm.nthreads", JULIA_NUM_THREADS) %|0|% + (min(max_threads, parallel::detectCores() - 1)) + if (verbose) cli::cli_progress_step("Starting Julia with {nthreads} thread{?s}") + if (nthreads > 1) { + Sys.setenv("JULIA_NUM_THREADS" = nthreads) + suppressMessages(JuliaConnectoR::startJuliaServer()) + Sys.setenv("JULIA_NUM_THREADS" = JULIA_NUM_THREADS) + } else { + suppressMessages(JuliaConnectoR::startJuliaServer()) + } + .jlmerclusterperm$opts$nthreads <- nthreads +} + +set_projenv <- function(..., verbose = TRUE) { + if (verbose) cli::cli_progress_step("Activating package environment") + pkgdir <- system.file("julia/", package = "jlmerclusterperm") + seed <- as.integer(getOption("jlmerclusterperm.seed", 1L)) + JuliaConnectoR::juliaCall("cd", pkgdir) + JuliaConnectoR::juliaEval("using Pkg") + JuliaConnectoR::juliaEval('Pkg.activate(".", io = devnull)') + JuliaConnectoR::juliaEval("Pkg.instantiate()") # io = devnull + JuliaConnectoR::juliaEval("Pkg.resolve(io = devnull)") + JuliaConnectoR::juliaCall("cd", getwd()) + JuliaConnectoR::juliaEval(paste0("pg_width = ", max(1L, cli::console_width() - 44L))) + JuliaConnectoR::juliaEval("pg_io = stderr") + JuliaConnectoR::juliaEval(paste0("using Random123; const rng = Threefry2x((", seed, ", 20))")) + .jlmerclusterperm$opts$seed <- seed + .jlmerclusterperm$opts$pkgdir <- pkgdir +} + +source_jl <- function(..., verbose = TRUE) { + jl_pkgs <- readLines(file.path(.jlmerclusterperm$opts$pkgdir, "load-pkgs.jl")) + jl_scripts <- list.files(.jlmerclusterperm$opts$pkgdir, pattern = "\\d{2}-.*\\.jl$", full.names = TRUE) + load_steps <- c(jl_pkgs, jl_scripts) + i <- 1L + if (verbose) cli::cli_progress_step("Running package setup scripts ({i}/{length(load_steps)})") + for (i in seq_along(load_steps)) { + if (verbose) cli::cli_progress_update() + jl_load <- load_steps[i] + if (grepl("^using ", jl_load)) { + JuliaConnectoR::juliaEval(jl_load) + } else { + JuliaConnectoR::juliaCall("include", jl_load) + } + } + exported_fns <- c( + "jlmer", "compute_timewise_statistics", + "extract_clusters", "permute_timewise_statistics", + "guess_shuffle_as", "permute_by_predictor" + ) + for (jl_fn in exported_fns) { + .jlmerclusterperm[[jl_fn]] <- JuliaConnectoR::juliaFun(jl_fn) + } + .jlmerclusterperm$exported_fns <- exported_fns +} + +#' @keywords internal +dev_source <- function() { # nocov start + .jlmerclusterperm$opts$pkgdir <- system.file("julia/", package = "jlmerclusterperm") + source_jl() +} # nocov end diff --git a/R/calculate_pvalue.R b/R/calculate_pvalue.R new file mode 100644 index 0000000..cf1ac77 --- /dev/null +++ b/R/calculate_pvalue.R @@ -0,0 +1,111 @@ +#' Calculate bootstrapped p-values of cluster-mass statistics +#' +#' @param empirical_clusters A `empirical_clusters` object +#' @param null_cluster_dists A `null_cluster_dists` object +#' @param add1 Whether to add 1 to the numerator and denominator when calculating the p-value. +#' Use `TRUE` to effectively count the observed statistic as part of the permuted +#' null distribution (recommended with larger `nsim` prior to publishing results). +#' +#' @seealso [extract_empirical_clusters()], [extract_null_cluster_dists()] +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Make empirical clusters +#' empirical_statistics <- compute_timewise_statistics(spec) +#' empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +#' empirical_clusters +#' +#' # Make null cluster-mass distribution +#' reset_rng_state() +#' null_statistics <- permute_timewise_statistics(spec, nsim = 100) +#' null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) +#' +#' # Significance test the empirical cluster(s) from each predictor against the simulated null +#' calculate_clusters_pvalues(empirical_clusters, null_cluster_dists) +#' +#' # Set `add1 = TRUE` to normalize by adding 1 to numerator and denominator +#' calculate_clusters_pvalues(empirical_clusters, null_cluster_dists, add1 = TRUE) +#' +#' # This sequence of procedures is equivalent to `clusterpermute()` +#' reset_rng_state() +#' clusterpermute(spec, threshold = 2, nsim = 100, progress = FALSE) +#' +#' # The empirical clusters and the null cluster-mass distribution must be comparable +#' empirical_clusters2 <- extract_empirical_clusters(empirical_statistics, threshold = 3) +#' # For example, below code errors because thresholds are different (2 vs. 3) +#' try( calculate_clusters_pvalues(empirical_clusters2, null_cluster_dists) ) +#' +#' # Check for compatibility with `clusters_are_comparable()` +#' clusters_are_comparable(empirical_clusters, null_cluster_dists) +#' clusters_are_comparable(empirical_clusters2, null_cluster_dists) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @export +#' @return An `empirical_clusters` object augmented with p-values. +calculate_clusters_pvalues <- function(empirical_clusters, null_cluster_dists, add1 = FALSE) { + clusters_are_comparable(empirical_clusters, null_cluster_dists, error = TRUE) + empirical <- lapply(empirical_clusters, `[[`, "statistic") + empirical <- Filter(function(x) !near_zero(x[1]), empirical) + null <- lapply(null_cluster_dists, `[[`, "statistic") + predictors_to_test <- setNames(nm = names(empirical)[names(empirical) %in% names(null)]) + pvalues <- lapply(predictors_to_test, function(x) { + vapply(empirical[[x]], function(cluster_statistic) { + mean(c(abs(null[[x]]) >= abs(cluster_statistic), if (add1) TRUE)) + }, numeric(1)) + }) + augmented <- empirical_clusters + attr(augmented, "pvalues") <- pvalues + augmented +} + +#' @param error Whether to throw an error if incompatible +#' @rdname calculate_clusters_pvalues +#' @export +clusters_are_comparable <- function(empirical_clusters, null_cluster_dists, error = FALSE) { + if (!inherits(empirical_clusters, "empirical_clusters") || !inherits(null_cluster_dists, "null_cluster_dists")) { + cli::cli_abort("Can only compare object of class {.cls empirical_clusters} against object of class {.cls null_cluster_dists}") + } + empirical_attrs <- attributes(empirical_clusters) + null_attrs <- attributes(null_cluster_dists) + if (!identical(empirical_attrs[c("statistic", "threshold")], null_attrs[c("statistic", "threshold")])) { + cluster_properties <- sapply(list(empirical_attrs, null_attrs), `[`, c("statistic", "threshold")) + mismatch_info <- apply(cluster_properties, 1L, function(x) { + if (x[[1]] != x[[2]]) { + x <- unlist(x, use.names = FALSE) + if (is.numeric(x)) x <- paste0("{", x, "}") + paste0("empirical uses {.val ", x[1], "} but null uses {.val ", x[2], "}.") + } + }) + mismatch_info <- Filter(Negate(is.null), mismatch_info) + mismatch_info <- setNames(paste0("{.strong ", names(mismatch_info), "}: ", mismatch_info), rep("x", length(mismatch_info))) + if (error) { + cli::cli_abort(c( + "Cluster-mass statistics between empirical and null are not comparable.", + "i" = "{.arg empirical_clusters} and {.arg null_cluster_dists} must share the same {.code statistic} and {.code threshold}.", + mismatch_info + )) + } + return(FALSE) + } else { + return(TRUE) + } +} diff --git a/R/clusterpermute.R b/R/clusterpermute.R new file mode 100644 index 0000000..c94028d --- /dev/null +++ b/R/clusterpermute.R @@ -0,0 +1,85 @@ +#' Conduct a cluster-based permutation test +#' +#' @inheritParams permute_timewise_statistics +#' @inheritParams extract_empirical_clusters +#' @inheritParams calculate_clusters_pvalues +#' @param progress Defaults to `TRUE`, which prints progress on each step of the cluster permutation test. +#' +#' @seealso [compute_timewise_statistics()], [permute_timewise_statistics()], +#' [extract_empirical_clusters()], [extract_null_cluster_dists()], +#' [calculate_clusters_pvalues()] +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Should minimally provide `threshold` and `nsim`, in addition to the spec object +#' reset_rng_state() +#' CPA <- clusterpermute(spec, threshold = 2, nsim = 100, progress = FALSE) +#' CPA +#' +#' # CPA is a list of `` and `` objects +#' sapply(CPA, class) +#' +#' # You can extract the individual components for further inspection +#' CPA$null_cluster_dists +#' CPA$empirical_clusters +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @export +#' @return A list of `null_cluster_dists` and `empirical_clusters` with p-values +clusterpermute <- function(jlmer_spec, + family = c("gaussian", "binomial"), + statistic = c("t", "chisq"), + threshold, + nsim = 100L, + predictors = NULL, + binned = FALSE, + top_n = Inf, + add1 = TRUE, + ..., + progress = TRUE) { + family <- match.arg(family) + statistic <- match.arg(statistic) + jlmer_spec$.backdoor$prepped_for_jlmer <- prep_for_jlmer(jlmer_spec, family, ...) + jlmer_spec$.backdoor$augmented_term_groups <- augment_term_groups(jlmer_spec, statistic) + if (progress) cli::cli_progress_step("Detecting empirical clusters and calculating cluster-mass statistics.") + old_opts <- julia_progress(show = FALSE) + empirical_statistics <- suppressMessages(compute_timewise_statistics(jlmer_spec, family, statistic, ...)) + julia_progress(show = old_opts$show) + empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold, binned, top_n) + if (progress) cli::cli_progress_step("Sampling cluster-mass statistics from a bootstrapped null distribution.") + null_statistics <- permute_timewise_statistics(jlmer_spec, family, statistic, nsim, predictors, ...) + null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold, binned) + if (progress) cli::cli_progress_step("Calculating the probability of the observed cluster-mass statistics.") + empirical_clusters_p <- calculate_clusters_pvalues(empirical_clusters, null_cluster_dists, add1) + structure(list(null_cluster_dists = null_cluster_dists, empirical_clusters = empirical_clusters_p), class = "CPA_out") +} + +#' @export +print.CPA_out <- function(x, ...) { + cat( + "$null_cluster_dists", + format(x$null_cluster_dists, ...), + "\n$empirical_clusters", + format(x$empirical_clusters, ...), + sep = "\n" + ) +} diff --git a/R/clusters_methods.R b/R/clusters_methods.R new file mode 100644 index 0000000..c6077b9 --- /dev/null +++ b/R/clusters_methods.R @@ -0,0 +1,103 @@ +#' @export +print.empirical_clusters <- function(x, ...) { + cat(format(x, ...), sep = "\n") +} + +#' @export +format.empirical_clusters <- function(x, ...) { + pvalues <- attr(x, "pvalues") + missing_clusters <- attr(x, "missing_clusters") + term_groups <- attr(x, "term_groups") + predictor_dfs <- attr(term_groups, "dfs") + time <- attr(x, "time") + statistic <- attr(x, "statistic") + threshold <- attr(x, "threshold") + binned <- attr(x, "binned") + zero_clusters <- x[missing_clusters] + valid_clusters <- x[!missing_clusters] + if (!is.null(pvalues)) valid_clusters <- valid_clusters[names(pvalues)] + cli::cli_format_method( + { + cli::cli_rule(left = paste("{.strong Empirical clusters}", format_threshold(statistic)), right = "{.cls empirical_clusters}") + for (i in seq_along(valid_clusters)) { + predictor <- names(valid_clusters)[[i]] + cli::cli_text("{.el {predictor}}", if (statistic == "chisq") " ({.emph df = {predictor_dfs[[predictor]]}}){?*}") + cluster_df <- valid_clusters[[i]] + clusters <- split(cluster_df, seq_len(nrow(cluster_df))) + names(clusters) <- paste0("[", time[cluster_df$cluster_start], ", ", time[cluster_df$cluster_end], "]") + clusters <- lapply(clusters, function(cluster) sprintf("%0.3f", cluster$statistic)) + if (clusters[[1]] != "0.00") { + if (!is.null(pvalues)) { + clusters[] <- lapply(seq_along(clusters), function(cluster_ind) { + pval <- pvalues[[i]][[cluster_ind]] + paste(clusters[[cluster_ind]], sprintf(paste0("{.", ifelse(pval < 0.05, "emph", "lemph"), " (p=%0.4f)}"), round(pval, 4))) + }) + } + cli::cli_ul() + cli::cli_dl(clusters) + cli::cli_end() + } + } + cli::cli_rule() + if (length(zero_clusters) > 0) { + cli::cli_alert_warning("No clusters found for {.el {names(zero_clusters)}}") + } + if (statistic == "chisq" && any(predictor_dfs > 1)) { + cli::cli_alert_info(c("* The {.val chisq} statistic for multi-level factors are unsigned. Use {.arg statistic = {.val t}} for a more interpretable result.")) + } + }, + theme = .jlmerclusterperm$cli_theme + ) +} + +#' @export +print.null_cluster_dists <- function(x, levels = 0.95, ...) { + cat(format(x, levels, ...), sep = "\n") +} + +#' @export +format.null_cluster_dists <- function(x, levels = 0.95, ...) { + term_groups <- attr(x, "term_groups") + predictor_dfs <- attr(term_groups, "dfs") + statistic <- attr(x, "statistic") + threshold <- attr(x, "threshold") + binned <- attr(x, "binned") + cluster_stats <- lapply(x, extract_null_cluster_stats, levels) + cli::cli_format_method( + { + cli::cli_rule(left = paste("{.strong Null cluster-mass distribution}", format_threshold(statistic)), right = "{.cls null_cluster_dists}") + for (i in seq_along(cluster_stats)) { + predictor <- names(x)[[i]] + cli::cli_text("{.el {predictor}} (n = {cluster_stats[[i]]$n}", if (statistic == "chisq") ", {.emph df = {predictor_dfs[[predictor]]}}{?*}", ")") + cli::cli_ul() + cli::cli_dl(cluster_stats[[i]][1:2]) + cli::cli_end() + } + cli::cli_rule() + if (statistic == "chisq" && any(predictor_dfs > 1)) { + cli::cli_alert_info(c("* The {.val chisq} statistic for multi-level factors are unsigned. Use {.arg statistic = {.val t}} for a more interpretable result.")) + } + }, + theme = .jlmerclusterperm$cli_theme + ) +} + +extract_null_cluster_stats <- function(x, levels) { + statistics <- x$statistic + mean_se <- do.call(sprintf, c("%0.3f (%0.2f)", lapply(list(mean, stats::sd), function(f) f(statistics)))) + cis <- paste(vapply(levels, function(prob) { + percent <- paste0(prob * 100, "%") + bounds <- ((1 - prob) / 2) + c(0, prob) + interval <- stats::quantile(statistics, bounds) + paste(percent, sprintf("[%0.3f, %0.3f]", interval[1], interval[2])) + }, character(1)), collapse = ", ") + list("Mean (SD)" = mean_se, "Coverage intervals" = cis, n = nrow(x)) +} + +format_threshold <- function(statistic) { + if (statistic == "t") { + "(t > {.val {threshold}})" + } else if (statistic == "chisq") { + "(chisq p < {.val {threshold}})" + } +} diff --git a/R/compute_timewise_statistics.R b/R/compute_timewise_statistics.R new file mode 100644 index 0000000..007d3de --- /dev/null +++ b/R/compute_timewise_statistics.R @@ -0,0 +1,117 @@ +#' Fit Julia regression models to each time point of a time series data +#' +#' @inheritParams jlmer +#' @param statistic Test statistic for calculating cluster mass. +#' Can be one of `"t"` (default) from the regression model output or +#' `"chisq"` from a likelihood ratio test (takes about twice as long to calculate). +#' @param ... Optional arguments passed to Julia for model fitting. +#' Defaults to `fast = TRUE` (when `family = "binomial"`) and `progress = FALSE`. +#' +#' @seealso [jlmer()], [make_jlmer_spec()] +#' +#' @srrstats {RE3.0} Issues singularity messages and excludes runs with convergence failures in permutation testing (and informs of this) +#' @srrstats {RE3.1} Convergence failures can be retrieved from function outputs, but users are encouraged to watch out for warnings and messages. +#' These can be suppressed via the `suppress*()` functions. +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Predictor x Time matrix of t-statistics from regression output +#' empirical_statistics <- compute_timewise_statistics(spec) +#' round(empirical_statistics, 2) +#' +#' # Collect as dataframe with `tidy()` +#' empirical_statistics_df <- tidy(empirical_statistics) +#' empirical_statistics_df +#' +#' # Timewise statistics are from regression models fitted to each time point +#' # - Note the identical statistics at `Time == 0` +#' empirical_statistics_df %>% +#' filter(time == 0) +#' to_jlmer(weight ~ 1 + Diet, filter(ChickWeight, Time == 0)) %>% +#' tidy() %>% +#' select(term, statistic) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A predictor-by-time matrix of cluster statistics, of class `timewise_statistics`. +#' @export +compute_timewise_statistics <- function(jlmer_spec, family = c("gaussian", "binomial"), statistic = c("t", "chisq"), ...) { + check_arg_class(jlmer_spec, "jlmer_spec") + family <- match.arg(family) + statistic <- match.arg(statistic) + is_mem <- jlmer_spec$meta$is_mem + term_groups <- augment_term_groups(jlmer_spec, statistic) + args <- prep_for_jlmer(jlmer_spec, family = family, ...) + + opts <- list(...) + opts <- utils::modifyList(list(progress = FALSE), opts) + if (family == "binomial") { + opts <- utils::modifyList(list(fast = TRUE), opts) + } + + + out <- JuliaConnectoR::juliaGet(do.call( + .jlmerclusterperm$compute_timewise_statistics, + c(args, term_groups$jl, statistic, is_mem, opts) + )) + + alert_diagnostics(jlmer_spec, out) + + if (statistic == "t") { + dimnames(out$t_matrix) <- out[c("Predictor", "Time")] + out$t_matrix <- out$t_matrix[out$Predictor != "1", , drop = FALSE] + } else { + Predictors <- names(term_groups$r) + dimnames(out$t_matrix) <- c(list(Predictor = Predictors[Predictors != "1"]), out["Time"]) + } + + structure(out$t_matrix, + class = "timewise_statistics", + statistic = statistic, term_groups = term_groups$r + ) +} + +alert_diagnostics <- function(jlmer_spec, out) { + if (any(out$convergence_failures)) { + cli::cli_alert_danger(c( + "{.val {sum(out$convergence_failures)}} convergence failure{?s} at the following timepoint{?s}: ", + "{.val {out$Time[out$convergence_failures]}}." + )) + } + if (jlmer_spec$meta$is_mem) { + singular_fits <- out$singular_fits + if (any(singular_fits)) { + cli::cli_alert_info("{.val {sum(singular_fits)}} singular fit{?s} ({round(mean(singular_fits) * 100, 2)}%).") + } + re_n_terms <- sapply(lme4::findbars(jlmer_spec$formula$jl), function(x) { + setNames(length(x[[2]]), deparse1(x[[3]])) + }) + if (mean(singular_fits) > .2 && any(re_n_terms > 1)) { + cli::cli_alert_info("Average number of components estimated to capture 95% of RE variance:") + rePCs <- rowMeans(out$rePCA_95_matrix) + cli::cli_ul() + lapply(seq_along(out$Grouping), function(i) { + if (re_n_terms[out$Grouping[i]] > 1) cli::cli_li("{out$Grouping[i]}: {sprintf('%.01f', rePCs[i])}") + }) + cli::cli_end() + } + } +} diff --git a/R/extract_clusters.R b/R/extract_clusters.R new file mode 100644 index 0000000..cfe6cfc --- /dev/null +++ b/R/extract_clusters.R @@ -0,0 +1,164 @@ +#' Detect largest clusters from a time sequence of predictor statistics +#' +#' @param empirical_statistics A predictor-by-time matrix of empirical timewise statistics. +#' @param threshold The threshold value that the statistic must pass to contribute to cluster mass. +#' Interpretation differs on the choice of statistic (more below): +#' * If `statistic = "t"`, the threshold for t-value (beta/std.err) from the regression model. +#' * If `statistic = "chisq"`, the threshold for the p-value of chi-squared statistics from likelihood ratio tests. +#' @param binned Whether the data has been aggregated/collapsed into time bins. Defaults to `FALSE`, +#' which requires a cluster to span at least two time points. If `TRUE`, allows length-1 clusters to exist. +#' @param top_n How many clusters to return, in the order of the size of the cluster-mass statistic. +#' Defaults to `Inf` which return all detected clusters. +#' +#' @seealso [compute_timewise_statistics()] +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Empirical clusters are derived from the timewise statistics +#' empirical_statistics <- compute_timewise_statistics(spec) +#' empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +#' empirical_clusters +#' +#' # Collect as dataframe with `tidy()` +#' empirical_clusters_df <- tidy(empirical_clusters) +#' empirical_clusters_df +#' +#' # Changing the `threshold` value identifies different clusters +#' extract_empirical_clusters(empirical_statistics, threshold = 1) +#' +#' # A predictor can have zero or multiple clusters associated with it +#' extract_empirical_clusters(empirical_statistics, threshold = 3) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return An `empirical_clusters` object. +#' @export +extract_empirical_clusters <- function(empirical_statistics, threshold, binned = FALSE, top_n = Inf) { + time <- dimnames(empirical_statistics)$Time + statistic <- attr(empirical_statistics, "statistic") + empirical_statistics <- apply_threshold(empirical_statistics, statistic, threshold) + predictors <- rownames(empirical_statistics) + n <- as.integer(max(min(top_n, ncol(empirical_statistics)), 1)) + largest_clusters <- .jlmerclusterperm$extract_clusters(empirical_statistics, binned, n) + cluster_dfs <- df_from_DF(largest_clusters) + empirical_clusters <- split(cluster_dfs[, -5], predictors[cluster_dfs$id])[predictors] + empirical_clusters <- lapply(empirical_clusters, function(cluster_df) { + cluster_df[order(cluster_df$cluster_id), ] + }) + missing_clusters <- vapply(empirical_clusters, function(x) all(near_zero(x$statistic)), logical(1)) + structure(empirical_clusters, + class = "empirical_clusters", + missing_clusters = missing_clusters, statistic = statistic, threshold = threshold, + binned = binned, time = time, + term_groups = attr(empirical_statistics, "term_groups") + ) +} + +#' Construct a null distribution of cluster-mass statistics +#' +#' @param null_statistics A simulation-by-time-by-predictor 3D array of null (permuted) timewise statistics. +#' @inheritParams extract_empirical_clusters +#' +#' @seealso [permute_timewise_statistics()] +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Null cluster-mass distributions are derived from the permuted timewise statistics +#' reset_rng_state() +#' null_statistics <- permute_timewise_statistics(spec, nsim = 100) +#' null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) +#' null_cluster_dists +#' +#' # Collect as dataframe with `tidy()` +#' # - Each simulation contributes one (largest) cluster-mass statistic to the null +#' # - When no clusters are found, the `sum_statistic` value is zero +#' null_cluster_dists_df <- tidy(null_cluster_dists) +#' null_cluster_dists_df +#' +#' # Changing the `threshold` value changes the shape of the null +#' extract_null_cluster_dists(null_statistics, threshold = 1) +#' extract_null_cluster_dists(null_statistics, threshold = 3) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A `null_cluster_dists` object. +#' @export +extract_null_cluster_dists <- function(null_statistics, threshold, binned = FALSE) { + time <- dimnames(null_statistics)$Time + statistic <- attr(null_statistics, "statistic") + null_statistics <- apply_threshold(null_statistics, statistic, threshold) + null_cluster_dists <- apply(null_statistics, 3, function(t_matrix) { + t_matrix <- t_matrix[!is.nan(rowSums(t_matrix)), ] + largest_clusters <- df_from_DF(.jlmerclusterperm$extract_clusters(t_matrix, binned, 1L)) + }, simplify = FALSE) + structure(null_cluster_dists, + class = "null_cluster_dists", + statistic = statistic, threshold = threshold, binned = binned, time = time, + term_groups = attr(null_statistics, "term_groups") + ) +} + +#' @keywords internal +apply_threshold <- function(timewise_statistics, statistic, threshold) { + if (statistic == "t") { + timewise_statistics[abs(timewise_statistics) <= abs(threshold)] <- 0 + } else if (statistic == "chisq") { + threshold_dict <- chisq_threshold_dict(attr(timewise_statistics, "term_groups"), threshold) + predictors <- dimnames(timewise_statistics)$Predictor + for (predictor in predictors) { + predictor_ind <- slice.index(timewise_statistics, "Predictor") == match(predictor, predictors) + predictor_statistics <- timewise_statistics[predictor_ind] + predictor_statistics[abs(predictor_statistics) <= abs(threshold_dict[threshold_dict$term == predictor, "threshold"])] <- 0 + timewise_statistics[predictor_ind] <- predictor_statistics + } + } + timewise_statistics +} + +#' @keywords internal +chisq_threshold_dict <- function(term_groups, threshold) { + if (threshold < 0 || threshold > 1) { + cli::cli_abort("{.arg threshold} must be between {.val {0}} and {.val {1}} when {.arg statistic = {.val chisq}}") + } + threshold_dict <- utils::stack(attr(term_groups, "dfs")) + colnames(threshold_dict) <- c("df", "term") + threshold_dict$df <- as.integer(threshold_dict$df) + threshold_dict$threshold <- stats::qchisq(p = 1 - threshold, df = threshold_dict$df) + threshold_dict +} diff --git a/R/interop-utils-unexported.R b/R/interop-utils-unexported.R new file mode 100644 index 0000000..a4b9832 --- /dev/null +++ b/R/interop-utils-unexported.R @@ -0,0 +1,41 @@ +strip_JLTYPE <- function(x) { + attr(x, "JLTYPE") <- NULL + x +} + +df_from_DF <- function(DF) { + df_str <- JuliaConnectoR::juliaGet(DF) + as.data.frame(df_str$columns, col.names = unlist(df_str$colindex$names, use.names = FALSE)) +} + +prep_for_jlmer <- function(jlmer_spec, family, ...) { + if (!is.null(jlmer_spec$.backdoor$prepped_for_jlmer)) { + return(jlmer_spec$.backdoor$prepped_for_jlmer) + } + + julia_formula <- jlmer_spec$formula$jl + data <- jlmer_spec$data + time <- jlmer_spec$meta$time + + opts <- list(...) + if (is.null(opts) || any(names(opts) == "")) { + cli::cli_abort("All optional arguments to fit in {.arg ...} must be named.") + } + + jlmer_fm <- JuliaConnectoR::juliaEval(paste0("@formula(", deparse1(julia_formula), ")")) + jlmer_df <- JuliaConnectoR::juliaCall("DataFrame", data) + + jmler_family <- JuliaConnectoR::juliaCall(switch(family, + gaussian = "Normal", + binomial = "Bernoulli" + )) + + jlmer_groupings <- if (!is.null(lme4::findbars(julia_formula))) { + grouping_vars <- lapply(lme4::findbars(julia_formula), `[[`, 3) + jlmer_groupings <- JuliaConnectoR::juliaLet("Dict(x .=> [Grouping()])", x = grouping_vars) + } else { + list() + } + + list(jlmer_fm, jlmer_df, time, jmler_family, jlmer_groupings) +} diff --git a/R/interop-utils.R b/R/interop-utils.R new file mode 100644 index 0000000..d4e692e --- /dev/null +++ b/R/interop-utils.R @@ -0,0 +1,59 @@ +#' Set/get options for Julia progress bar +#' +#' @param show Whether to show the progress bar. You may also pass in a list of `"show"` and `"width"`. +#' @param width Width of the progress bar. If `"auto"`, adjusts the progress bar width to fit the console. +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' } +#' +#' # Show current progress options +#' julia_progress() +#' +#' # Set options and save previous options +#' old_progress_opts <- julia_progress(show = FALSE, width = 100) +#' julia_progress() +#' +#' # Restoring progress settings by passing a list of old options +#' old_progress_opts +#' julia_progress(old_progress_opts) +#' identical(julia_progress(), old_progress_opts) +#' +#' # Alternatively, reset to default settings using this syntax: +#' julia_progress(show = TRUE, width = "auto") +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return Previous values for `show` and `width` +#' @export +julia_progress <- function(show, width) { + show_missing <- missing(show) + both_missing <- show_missing && missing(width) + opts_is_list <- !show_missing && is.list(show) && identical(names(show), c("show", "width")) + old_opts <- JuliaConnectoR::juliaGet(JuliaConnectoR::juliaEval("(show = !(pg_io isa Base.DevNull), width = pg_width)")) + if (!show_missing) { + if (opts_is_list) { + width <- show$width + show <- show$show + } + JuliaConnectoR::juliaEval(paste0("pg_io = ", if (show) "stderr" else "devnull")) + } + if (!missing(width)) { + if (width == "auto") { + width <- max(1L, cli::console_width() - 44L) + } + JuliaConnectoR::juliaEval(paste0("pg_width = ", width)) + } + out <- strip_JLTYPE(old_opts) + if (both_missing) { + out + } else { + invisible(out) + } +} diff --git a/R/jlmer.R b/R/jlmer.R new file mode 100644 index 0000000..9343663 --- /dev/null +++ b/R/jlmer.R @@ -0,0 +1,117 @@ +#' Fit a Julia regression model using lme4 syntax +#' +#' @param jlmer_spec_opts List of options passed to `make_jlmer_spec()` +#' @inheritParams jlmer +#' @inheritParams make_jlmer_spec +#' +#' @seealso [jlmer()], [make_jlmer_spec()] +#' +#' @srrstats {G2.16} Undefined values are caught during model fitting in Julia. +#' @srrstats {RE1.0} Uses R formula interface +#' @srrstats {RE4.0} `jlmer()` and `to_jlmer()` return pointers to Julia model objects. +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' jlmerclusterperm_setup(verbose = FALSE) +#' +#' # Fitting a regression model with R formula syntax +#' to_jlmer(weight ~ 1 + Diet, ChickWeight) +#' +#' # `lm()` equivalent +#' summary(lm(weight ~ 1 + Diet, ChickWeight))$coef +#' +#' # Fitting a mixed model with {lme4} syntax +#' to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight) +#' +#' # Passing MixedModels.jl fit options to the `...` +#' to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight, REML = TRUE) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A `jlmer_mod` object. +#' @export +to_jlmer <- function(formula, data, family = c("gaussian", "binomial"), jlmer_spec_opts = list(), ..., progress = FALSE) { + jlmer_spec <- do.call(make_jlmer_spec, utils::modifyList(jlmer_spec_opts, list(formula = formula, data = data))) + jlmer(jlmer_spec, family, ...) +} + +#' Fit a Julia regression model using jlmer specifications +#' +#' @param jlmer_spec Data prepped for jlmer from `make_jlmer_spec()` +#' @param family A GLM family. Currently supports "gaussian" and "binomial". +#' @param ... Optional arguments passed to Julia for model fitting. +#' @param progress If `TRUE`, prints the timing of iterations. +#' +#' @seealso [make_jlmer_spec()] +#' +#' @srrstats {RE3.3} Convergence thresholds can be explicitly set by passing the appropriate argument to the `...` of functions that call GLM/MixedModels +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' jlmerclusterperm_setup(verbose = FALSE) +#' +#' # Fitting a regression model with a specification object +#' spec <- make_jlmer_spec(weight ~ 1 + Diet, ChickWeight) +#' jlmer(spec) +#' +#' # `lm()` equivalent +#' summary(lm(weight ~ 1 + Diet, ChickWeight))$coef +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A `jlmer_mod` object. +#' @export +jlmer <- function(jlmer_spec, family = c("gaussian", "binomial"), ..., progress = FALSE) { + check_arg_class(jlmer_spec, "jlmer_spec") + family <- match.arg(family) + args <- prep_for_jlmer(jlmer_spec, family = family, ...)[-3] + + mod <- do.call(.jlmerclusterperm$jlmer, c(args, jlmer_spec$meta$is_mem, progress = progress, ...)) + structure(mod, class = c("jlmer_mod", class(mod))) +} + +#' @srrstats {RE4.3} Confidence intervals printed in `print.jlmer_mod()` for LMs, not implemented for LMEMs. +#' @srrstats {RE4.4} Formula printed via `print.jlmer_mod()` or evaluating Julia code on model object via JuliaConnectoR +#' @srrstats {RE4.6} The variance-covariance matrix of the model parameters via `print.jlmer_mod()` or evaluating Julia code on model object via JuliaConnectoR +#' @srrstats {RE4.8} Response variable and metadata printed via `print.jlmer_mod()` or evaluating Julia code on model object via JuliaConnectoR +#' @srrstats {RE4.13} Predictor variables and metadata printed via `print.jlmer_mod()` or evaluating Julia code on model object via JuliaConnectoR +#' @srrstats {RE4.17} Print method defined for all custom S3 objects. +#' @export +print.jlmer_mod <- function(x, ...) { + cat(format(x, ...)) +} + +#' @export +format.jlmer_mod <- function(x, ...) { + cat("\n", sep = "") + if (JuliaConnectoR::juliaLet("x isa MixedModel", x = x)) { + re <- gsub("\n\n$", "\n", showobj_reformat(JuliaConnectoR::juliaCall("VarCorr", x))) + fe <- showobj_reformat(JuliaConnectoR::juliaCall("coeftable", x)) + out <- c(re, fe) + } else { + out <- showobj_reformat(JuliaConnectoR::juliaCall("coeftable", x)) + } + out +} + +showobj_reformat <- function(x) { + paste0(trimws(utils::capture.output(print(x))[-1], whitespace = "[\n]"), collapse = "\n") +} diff --git a/R/jlmer_spec.R b/R/jlmer_spec.R new file mode 100644 index 0000000..791a8d6 --- /dev/null +++ b/R/jlmer_spec.R @@ -0,0 +1,285 @@ +#' Create a specifications object for fitting regression models in Julia +#' +#' @param formula Model formula in R syntax +#' @param data A data frame +#' @param subject Column for subjects in the data. +#' @param trial Column for trials in the data. Must uniquely identify a time series within subject +#' (for example, the column for items in a counterbalanced design where each subject sees exactly one item). +#' @param time Column for time in the data. +#' @param drop_terms (Optional) any terms to drop from the reconstructed model formula +#' @param ... Unused, for extensibility. +#' +#' @srrstats {G2.5} Inputs are never strictly expected as factor, but users are expected to be aware of the concept of contrast coding, +#' such that they can further control the numerical coding of their categorical variables with `contrasts()`. +#' @srrstats {G2.7} Package accepts any tabular structure that is `model.matrix()`-able. +#' @srrstats {G2.8} Package applies a strict gateway via `make_jlmer_spec()` where all CPA-related functions require a pre-processed `` object. +#' @srrstats {G2.9} Messages dropped rows and invalid names on `model.matrix()` conversion. +#' @srrstats {G2.10} Handled via `model.matrix()` +#' @srrstats {G2.11} Handled via `model.matrix()` +#' @srrstats {G2.12} Handled via `model.matrix()` +#' @srrstats {G2.13} Handled via `model.matrix()` and informs on dropping NA rows. +#' @srrstats {G2.14} `make_jlmer_spec()` informs about NA rows. Imputation not implemented as it does not make sense for this package. +#' @srrstats {G5.8} Edge conditions on data are caught firstly by `model.matrix()` (inside `make_jlmer_spec()`) and also in Julia when handing the data off to GLM/MixedModels +#' @srrstats {RE1.1} Handled standardly via `model.matrix()` +#' @srrstats {RE1.3} Handled standardly via `model.matrix()`. Users can see the model matrix with contrasts spelled-out upon printing a `` object +#' +#' @examples +#' # Bare specification object (minimal spec for fitting a global model) +#' spec <- make_jlmer_spec(weight ~ 1 + Diet, ChickWeight) +#' spec +#' +#' # Constraints on specification for CPA: +#' # 1) The combination of `subject`, `trial`, and `time` must uniquely identify rows in the data +#' # 2) `time` must have constant sampling rate (i.e., evenly spaced values) +#' spec_wrong <- make_jlmer_spec( +#' weight ~ 1 + Diet, ChickWeight, +#' time = "Time" +#' ) +#' unique(ChickWeight$Time) +#' +#' # Corrected specification for the above +#' spec_correct <- make_jlmer_spec( +#' weight ~ 1 + Diet, subset(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec_correct +#' +#' @return An object of class `jlmer_spec`. +#' @export +make_jlmer_spec <- function(formula, data, subject = NULL, trial = NULL, time = NULL, drop_terms = NULL, ...) { + # Old names + fm <- formula + df <- data + + # Validation + check_arg_class(formula, "formula") + check_arg_class(data, "data.frame", "data") + special_cols <- c(subject, trial, time) + # Validate grouping structure + if (!is.null(special_cols)) { + if (!all(special_cols %in% colnames(data))) { + cli::cli_abort("Column{?s} {.val {special_cols[!special_cols %in% colnames(data)]}} not found in {.arg data}") + } + if (nrow(data) != nrow(unique(data[, special_cols, drop = FALSE]))) { + cli::cli_alert_warning(c( + "Grouping column{?s} {.val {special_cols}} do{?es/} not uniquely identify rows in the data" + )) + } + } + if (!is.null(time)) { + time_diffs <- diff(sort(unique(data[[time]]))) + if (!all(time_diffs == time_diffs[1])) { + cli::cli_alert_warning(c( + "Sampling rate for the {.arg time} column {.val {time}} is not constant - ", + "may affect interpretability of results." + )) + } + } + + # Formula reconstruction + fm_env <- attr(fm, ".Environment") + fm_response <- deparse1(fm[[2]]) + fm_terms <- stats::terms(lme4::subbars(fm), keep.order = TRUE) + attr(fm_terms, "intercept") <- as.logical(attr(stats::terms(lme4::nobars(fm)), "intercept")) + fm_cols <- vapply(as.list(attr(fm_terms, "variables")[-1]), deparse1, character(1)) + df_subset <- df[, fm_cols, drop = FALSE] + na_rows <- !stats::complete.cases(df_subset) + if (any(na_rows)) { + df_subset <- df_subset[!na_rows, ] + cli::cli_alert_warning("Dropping {.val {sum(na_rows)}} row{?s} with missing values.") + } + + re <- lme4::findbars(fm) + has_re <- !is.null(re) + if (has_re) { + lfm <- lme4::lFormula(fm, df) + fe_term_labels <- attr(stats::terms(lme4::nobars(fm)), "term.labels") + model_matrix <- lfm$X + } else { + fe_term_labels <- attr(fm_terms, "term.labels") + model_matrix <- stats::model.matrix(fm_terms, df_subset) + } + + terms_compact <- fe_term_labels + terms_expanded <- colnames(model_matrix) + terms_grouping <- setNames(attr(model_matrix, "assign"), colnames(model_matrix)) + has_intercept <- 0 %in% terms_grouping + if (has_intercept) { + terms_compact <- c(1, terms_compact) + } + terms_dict <- split(names(terms_grouping), terms_grouping) + names(terms_dict) <- terms_compact + + # Renaming + # split_fct_term <- function(x, y) { + # ifelse(x == y, x, vapply(x, gsub, character(1), pattern = paste0(y, ""), replacement = paste0(y, "."), fixed = TRUE)) + # } + renamed_terms_dict <- lapply(seq_along(terms_dict), function(i) { + nm <- names(terms_dict)[i] + x <- terms_dict[[i]] + if (grepl(":", nm)) { + renamed <- vapply(x, gsub, character(1), pattern = ":", replacement = "__", fixed = TRUE) + # if (length(x) > 1) { + # nm_split <- strsplit(nm, ":", fixed = TRUE)[[1]] + # renamed_split <- sapply(renamed, function(x) strsplit(x, "__", fixed = TRUE)[[1]]) + # renamed_terms_split <- sapply(seq_along(nm_split), function(j) { + # split_fct_term(renamed_split[j,], nm_split[j]) + # }) + # renamed <- apply(renamed_terms_split, 1, paste0, collapse = "__") + # } + } else if (length(x) > 1) { + renamed <- x + # renamed <- split_fct_term(x, nm) + } else { + renamed <- x + } + out <- unname(renamed) + out[!out %in% drop_terms] + }) + names(renamed_terms_dict) <- names(terms_dict) + colnames(model_matrix) <- gsub(":", "__", colnames(model_matrix), fixed = TRUE) + model_matrix_df <- as.data.frame(model_matrix)[setdiff(colnames(model_matrix), "(Intercept)")] + model_matrix_df <- cbind(df_subset[, fm_response, drop = FALSE], model_matrix_df) + + if (!has_re) { + fe_terms_renamed <- unlist(renamed_terms_dict, recursive = TRUE, use.names = FALSE) + fe_terms_renamed <- fe_terms_renamed[fe_terms_renamed != "(Intercept)"] + if (!is.null(drop_terms)) fe_terms_renamed <- fe_terms_renamed[!fe_terms_renamed %in% drop_terms] + fe_terms_renamed <- as.character(c(as.integer(has_intercept), fe_terms_renamed)) + fe_fm <- stats::reformulate(fe_terms_renamed, fm_response, env = fm_env) + r_fm <- jl_fm <- fe_fm + re_groups <- NULL + } else { + fe <- lme4::nobars(fm) + fe_expanded <- stats::terms(fe, keep.order = TRUE) + fe_terms <- attr(fe_expanded, "term.labels") + renamed_fe_terms_dict <- renamed_terms_dict[fe_terms] + fe_terms_renamed <- unlist(renamed_fe_terms_dict, use.names = FALSE) + if (!is.null(drop_terms)) fe_terms_renamed <- fe_terms_renamed[!fe_terms_renamed %in% drop_terms] + fe_terms_renamed <- as.character(c(as.integer(has_intercept), fe_terms_renamed)) + fe_fm <- stats::reformulate(fe_terms_renamed)[[2]] + re_terms_renamed <- lapply(re, function(x) { + terms <- stats::terms.formula(call("~", x[[2]])) + renamed <- unlist(renamed_terms_dict[attr(terms, "term.labels")], use.names = FALSE) + if ("0" %in% fe_terms_renamed) renamed <- renamed[renamed != "1"] + if (!is.null(drop_terms)) renamed <- renamed[!renamed %in% drop_terms] + unique(c(attr(terms, "intercept"), renamed)) + }) + re_rhs_deparsed <- vapply(re, function(x) deparse1(x[[3]]), character(1)) + re_terms_regrouped <- lapply(split(re_terms_renamed, re_rhs_deparsed), unlist, use.names = FALSE) + re_bars <- ifelse(table(re_rhs_deparsed) > 1, "||", "|") + re_groups <- lapply(names(re_terms_regrouped), function(x) { + bar <- re_bars[[x]] + lhs <- re_terms_regrouped[[x]] + lhs <- gsub(":", "__", lhs, fixed = TRUE) + if (all(c("1", "0") %in% lhs)) lhs <- lhs[lhs != "0"] + lhs <- stats::reformulate(lhs)[[2]] + call(bar, lhs, as.symbol(x)) + }) + re_fm_r <- lapply(re_groups, function(x) call("(", x)) + re_fm_jl <- lapply(re_groups, function(x) { + if (identical(x[[1]], quote(`||`))) { + x[[1]] <- quote(`|`) + call("zerocorr", x) + } else { + call("(", x) + } + }) + combine_fm <- function(re_str) { + expanded <- Reduce(function(x, y) call("+", x, y), c(fe_fm, re_str)) + stats::as.formula(call("~", as.symbol(fm_response), expanded), fm_env) + } + re_groups <- names(re_bars) + r_fm <- combine_fm(re_fm_r) + jl_fm <- combine_fm(re_fm_jl) + re_cols <- as.data.frame.list(lapply(df[!na_rows, names(re_bars), drop = FALSE], as.character)) + model_matrix_df <- cbind(model_matrix_df, re_cols) + } + + cols_keep <- c(subject, trial, time) + + model_matrix_df <- cbind( + model_matrix_df, + df[!na_rows, setdiff(cols_keep, colnames(model_matrix_df)), drop = FALSE] + ) + + model_matrix_df <- maybe_as_tibble(model_matrix_df) + + out <- list( + formula = list(r = r_fm, jl = jl_fm, original = fm), + data = model_matrix_df, + meta = list( + term_groups = renamed_terms_dict, re_groups = re_groups, + subject = subject, trial = trial, time = time, + is_mem = has_re + ) + ) + + structure(out, class = "jlmer_spec") +} + +#' @export +print.jlmer_spec <- function(x, ...) { + cat(format(x, ...), sep = "\n") +} + +#' @export +format.jlmer_spec <- function(x, ...) { + cli::cli_format_method( + { + cli::cli_rule(left = "{.strong jlmer specification}", right = "{.cls jlmer_spec}") + # Formula + cli::cli_text("{.el Formula}: {.fm {deparse1(x$formula$jl)}}") + # Terms + cli::cli_text("{.el Predictors}:") + terms <- Filter(function(term) !identical(term, "(Intercept)"), x$meta$term_groups) + cli::cli_ul() + cli::cli_dl(lapply(terms, paste, collapse = ", "), paste0("{.emph ", names(terms), "}")) + cli::cli_end() + # Grouping + groupings <- x$meta[c("subject", "trial", "time")] + if (!is.null(unlist(groupings))) { + cli::cli_text("{.el Groupings}:") + cli::cli_ul() + cli::cli_dl(groupings, paste0("{.emph ", c("Subject", "Trial", "Time"), "}")) + cli::cli_end() + } + # Data + cli::cli_text("{.el Data}:") + if (inherits(x$data, "tbl_df")) { + old_pillar.advice <- options(pillar.advice = FALSE) + print(x$data, n = 3) + options(old_pillar.advice) + } else { + print(x$data, max = 3 * ncol(x$data)) + } + cli::cli_rule() + }, + theme = .jlmerclusterperm$cli_theme + ) +} + +# update_spec_data <- function(jlmer_spec, data, reform = FALSE) { +# .jlmer_spec <- jlmer_spec +# if (is.function(data)) { +# .data <- data(jlmer_spec$data) +# } else { +# .data <- data +# } +# used_cols <- get_spec_used_cols(jlmer_spec) +# if (!all(used_cols %in% colnames(.data))) { +# missing_cols <- used_cols[!used_cols %in% colnames(.data)] +# cli::cli_abort("Necessary column{?s} {.val {missing_cols}} missing from {.arg data}") +# } +# .jlmer_spec$data <- .data +# .jlmer_spec +# } +# +# get_spec_used_cols <- function(jlmer_spec) { +# all_cols <- unlist(c( +# jlmer_spec$meta$term_groups, +# jlmer_spec$meta$re_groups, +# jlmer_spec$meta[c("subject", "trial", "time")] +# ), use.names = FALSE) +# all_cols[all_cols != "(Intercept)"] +# } diff --git a/R/jlmerclusterperm-package.R b/R/jlmerclusterperm-package.R new file mode 100644 index 0000000..7b3a38d --- /dev/null +++ b/R/jlmerclusterperm-package.R @@ -0,0 +1,7 @@ +#' @keywords internal +"_PACKAGE" + +## usethis namespace: start +#' @importFrom stats setNames +## usethis namespace: end +NULL diff --git a/R/julia_rng.R b/R/julia_rng.R new file mode 100644 index 0000000..adf26fb --- /dev/null +++ b/R/julia_rng.R @@ -0,0 +1,80 @@ +#' Interface to the Julia RNG +#' +#' @name julia_rng +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' # RNG initializes to seed=1 counter=0 +#' get_rng_seed() +#' get_rng_state() +#' +#' # setter/getter for RNG counter +#' set_rng_state(123) +#' get_rng_state() +#' +#' # setter/getter for RNG seed +#' set_rng_seed(2) +#' get_rng_seed() +#' +#' # restore to initial setting (seed=1, counter=0) +#' set_rng_seed(1) +#' set_rng_state(0) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return The current seed or counter +NULL + +#' @param i Counter number +#' +#' @rdname julia_rng +#' @export +set_rng_state <- function(i) { + JuliaConnectoR::juliaLet("set_counter!(rng, Int(i))", i = i) + invisible(i) +} + +#' @rdname julia_rng +#' @export +reset_rng_state <- function() { + JuliaConnectoR::juliaEval("set_counter!(rng, 0)") + invisible(0) +} + +#' @rdname julia_rng +#' @export +get_rng_state <- function() { + as.double(JuliaConnectoR::juliaEval("Int(rng.ctr1)")) +} + +#' @rdname julia_rng +#' @param seed Seed +#' @export +set_rng_seed <- function(seed) { + seed_missing <- missing(seed) + if (seed_missing) { + seed <- as.double(JuliaConnectoR::juliaEval("Int(Random123.gen_seed(UInt32, 1)[1])")) + } + JuliaConnectoR::juliaEval(paste0("Random123.seed!(rng, (Int(", seed, "), 20))")) + .jlmerclusterperm$opts$seed <- seed + if (seed_missing) { + cli::cli_alert_info("Using randomly generated seed") + seed + } else { + invisible(seed) + } +} + +#' @rdname julia_rng +#' @export +get_rng_seed <- function() { + .jlmerclusterperm$opts$seed +} diff --git a/R/permute.R b/R/permute.R new file mode 100644 index 0000000..b025be1 --- /dev/null +++ b/R/permute.R @@ -0,0 +1,87 @@ +#' Permute data while respecting grouping structure(s) of observations +#' +#' @inheritParams jlmer +#' @param predictors A vector of terms from the model. If multiple, the must form the levels of one predictor. +#' @param predictor_type Whether the predictor is `"between_participant"` or `"within_participant"`. Defaults to `"guess"`. +#' @param n Number of permuted samples of the data to generate. Defaults to `1L`. +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' # Example data setup +#' chickweights_df <- ChickWeight +#' chickweights_df <- chickweights_df[chickweights_df$Time <= 20, ] +#' chickweights_df$DietInt <- as.integer(chickweights_df$Diet) +#' head(chickweights_df) +#' +#' # Example 1: Spec object using the continuous `DietInt` predictor +#' chickweights_spec1 <- make_jlmer_spec( +#' formula = weight ~ 1 + DietInt, +#' data = chickweights_df, +#' subject = "Chick", time = "Time" +#' ) +#' chickweights_spec1 +#' +#' # Shuffling `DietInt` values guesses `predictor_type = "between_participant"` +#' reset_rng_state() +#' spec1_perm1 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt") +#' # This calls the same shuffling algorithm for CPA in Julia, so counter is incremented +#' get_rng_state() +#' +#' # Shuffling under shared RNG state reproduces the same permutation of the data +#' reset_rng_state() +#' spec1_perm2 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt") +#' identical(spec1_perm1, spec1_perm2) +#' +#' # Example 2: Spec object using the multilevel `Diet` predictor +#' chickweights_spec2 <- make_jlmer_spec( +#' formula = weight ~ 1 + Diet, +#' data = chickweights_df, +#' subject = "Chick", time = "Time" +#' ) +#' chickweights_spec2 +#' +#' # Levels of a category are automatically shuffled together +#' reset_rng_state() +#' spec2_perm1 <- permute_by_predictor(chickweights_spec2, predictors = "Diet2") +#' reset_rng_state() +#' spec2_perm2 <- permute_by_predictor(chickweights_spec2, predictors = c("Diet2", "Diet3", "Diet4")) +#' identical(spec2_perm1, spec2_perm2) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A long dataframe of permuted re-samples with `.id` column representing replication IDs. +#' @export +permute_by_predictor <- function(jlmer_spec, predictors, predictor_type = c("guess", "between_participant", "within_participant"), n = 1L) { + df <- jlmer_spec$data + df_jl <- JuliaConnectoR::juliaCall("DataFrame", as.data.frame(df)) + subject <- jlmer_spec$meta$subject + trial <- jlmer_spec$meta$trial + time <- jlmer_spec$meta$time + predictor_type <- match.arg(predictor_type) + if (predictor_type == "guess") { + predictor_type <- .jlmerclusterperm$guess_shuffle_as(df_jl, predictors, subject, trial) + cli::cli_alert_info("Guessed {.arg predictor_type} to be {.val {predictor_type}}") + } + predictor_group <- Filter(function(x) any(predictors %in% x), jlmer_spec$meta$term_groups) + if (length(predictor_group) > 1) { + cli::cli_abort("You can only shuffle levels of one factor at a time.") + } else { + predictor_group <- predictor_group[[1]] + if (!all(predictor_group %in% predictors)) { + cli::cli_alert_info("Shuffling all levels of the factor together ({.val {predictor_group}})") + } + predictors <- predictor_group + } + shuffled <- df_from_DF(.jlmerclusterperm$permute_by_predictor(df_jl, predictor_type, predictors, subject, trial, as.integer(n))) + class(shuffled) <- class(df) + shuffled +} diff --git a/R/permute_timewise_statistics.R b/R/permute_timewise_statistics.R new file mode 100644 index 0000000..ba7ad1f --- /dev/null +++ b/R/permute_timewise_statistics.R @@ -0,0 +1,176 @@ +#' Simulate cluster-mass statistics via bootstrapped permutations +#' +#' @param nsim Number of simulations description +#' @param predictors (Optional) a subset of predictors to test. Defaults to `NULL` which tests all predictors. +#' @inheritParams compute_timewise_statistics +#' +#' @seealso [make_jlmer_spec()] +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Simulation x Time x Predictor array of t-statistics from regression output +#' reset_rng_state() +#' null_statistics <- permute_timewise_statistics(spec, nsim = 3) +#' round(null_statistics, 2) +#' +#' # Collect as dataframe with `tidy()` +#' permuted_timewise_stats_df <- tidy(null_statistics) +#' permuted_timewise_stats_df +#' +#' # Permutations ran under the same RNG state are identical +#' reset_rng_state() +#' null_statistics2 <- permute_timewise_statistics(spec, nsim = 3) +#' identical(null_statistics, null_statistics2) +#' +#' get_rng_state() +#' null_statistics3 <- permute_timewise_statistics(spec, nsim = 3) +#' identical(null_statistics, null_statistics3) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A simulation-by-time-by-predictor 3D array of cluster statistics, of class `timewise_statistics`. +#' @export +permute_timewise_statistics <- function(jlmer_spec, family = c("gaussian", "binomial"), + statistic = c("t", "chisq"), + nsim = 100L, predictors = NULL, ...) { + check_arg_class(jlmer_spec, "jlmer_spec") + statistic <- match.arg(statistic) + is_mem <- jlmer_spec$meta$is_mem + participant_col <- jlmer_spec$meta$subject + trial_col <- jlmer_spec$meta$trial %|0|% "" + term_groups <- augment_term_groups(jlmer_spec, statistic) + predictors_subset <- validate_predictors_subset(predictors, term_groups$r) + + family <- match.arg(family) + args <- prep_for_jlmer(jlmer_spec, family = family, ...) + nsim <- as.integer(nsim) + + opts <- list(...) + opts <- utils::modifyList(list(progress = FALSE), opts) + if (family == "binomial") { + opts <- utils::modifyList(list(fast = TRUE), opts) + } + + out <- JuliaConnectoR::juliaGet(do.call( + .jlmerclusterperm$permute_timewise_statistics, + c(args, nsim, participant_col, trial_col, term_groups$jl, predictors_subset, statistic, is_mem, opts) + )) + + # shuffle_predictor_groups <- Filter(function(x) !identical(x, "(Intercept)"), jlmer_spec$meta$term_groups) + # counter_states <- split(out$counter_states, rep(names(shuffle_predictor_groups), each = nsim)) + # counter_states[] <- lapply(names(counter_states), function(x) { + # list(predictors = shuffle_predictor_groups[[x]], counter = counter_states[[x]]) + # }) + + dimnames(out$z_array) <- list( + Sim = as.factor(zero_pad(1:nsim)), + Time = sort(unique(jlmer_spec$data[[jlmer_spec$meta$time]])), + Predictor = unlist(out$predictors) + ) + if (statistic == "t") { + # if (!is.null(predictors)) { + # predictors_keep <- names(Filter(function(x) predictors %in% x, jlmer_spec$meta$term_groups)) + # out$z_array <- out$z_array[, , predictors_keep, drop = FALSE] + # } + } else if (statistic == "chisq") { + Predictors <- Filter(function(x) any(dimnames(out$z_array)$Predictor %in% x), jlmer_spec$meta$term_groups) + pruned <- which(!duplicated(rep(names(Predictors), lengths(Predictors)))) + out$z_array <- out$z_array[, , pruned, drop = FALSE] + dimnames(out$z_array)$Predictor <- names(Predictors) + } + + if (is.null(dimnames(out$z_array)$Predictor)) { + cli::cli_abort(c( + "No predictors to permute.", + i = if (!is.null(predictors)) "{.val {predictors}} {?is/are} not among model terms." + )) + } + + convergence_failures <- check_convergence_failures(out$z_array) + + structure(out$z_array, + class = "timewise_statistics", + statistic = statistic, term_groups = term_groups$r, convergence_failures = convergence_failures + ) +} + +check_convergence_failures <- function(z_array) { + convergence_failures_pos <- is.nan(z_array) + if (any(convergence_failures_pos)) { + convergence_failures <- unique(which(convergence_failures_pos, arr.ind = TRUE)[, c("Predictor", "Sim")]) + convergence_failure_table <- table(convergence_failures[, "Predictor"]) + names(convergence_failure_table) <- dimnames(z_array)$Predictor[as.integer(names(convergence_failure_table))] + cli::cli_alert_info("Convergence errors encountered (out of {.arg nsim = {.val {nrow(z_array)}}}) while bootstrapping the following {cli::qty(names(convergence_failure_table))}predictor{?s}:") + cli::cli_div(theme = .jlmerclusterperm$cli_theme) + cli::cli_ul() + cli::cli_dl(as.list(convergence_failure_table), labels = paste0("{.el ", names(convergence_failure_table), "}")) + cli::cli_end() + cli::cli_end() + convergence_failures[do.call(order, asplit(convergence_failures, 2)), ] + } +} + +validate_predictors_subset <- function(predictors, r_term_groups) { + predictors_set <- unlist(r_term_groups, use.names = FALSE) + if (!is.null(predictors) && !any(predictors %in% predictors_set)) { + cli::cli_abort(c( + "Invalid selection passed to the {.arg predictor} argument.", + "x" = "Must choose among {.val {predictors_set}}." + )) + } else { + list(as.list(predictors)) + } +} + +augment_term_groups <- function(jlmer_spec, statistic) { + if (!is.null(jlmer_spec$.backdoor$augmented_term_groups)) { + return(jlmer_spec$.backdoor$augmented_term_groups) + } + + term_groups <- jlmer_spec$meta$term_groups + term_levels <- lengths(term_groups) + grp_idx <- split(seq_len(sum(lengths(term_groups))), rep(seq_along(term_groups), times = term_levels)) + term_groups_jl <- JuliaConnectoR::juliaLet("Tuple(x)", x = lapply(seq_along(term_groups), function(i) { + JuliaConnectoR::juliaLet("(P = P, p = p, i = i)", P = names(term_groups)[i], p = as.list(term_groups[[i]]), i = as.list(grp_idx[[i]])) + })) + term_groups <- term_groups[names(term_groups) != "1"] + if (statistic == "chisq") { + # cli::cli_abort(c( + # "Using {.val chisq} statistic for multi-level categorical variables is not supported.", + # x = "Predictor{?s} {.val {names(term_groups)[term_levels != 1]}} {?is/are} expressed by more than one model term.", + # i = "Try {.arg statistic = {.val t}} instead for a more interpretable result over individual terms." + # )) + term_dfs <- setNames(lengths(term_groups), names(term_groups)) + term_groups <- setNames(nm = names(term_groups)) + attr(term_groups, "dfs") <- term_dfs + } + list(r = term_groups, jl = term_groups_jl) +} + +#' @export +print.timewise_statistics <- function(x, ...) { + x_dim <- dim(x) + x_dimnames <- dimnames(x) + attributes(x) <- NULL + dim(x) <- x_dim + dimnames(x) <- x_dimnames + print(x, ...) +} diff --git a/R/srr-stats-standards.R b/R/srr-stats-standards.R new file mode 100644 index 0000000..64ebbad --- /dev/null +++ b/R/srr-stats-standards.R @@ -0,0 +1,98 @@ +#' srr_stats +#' +#' All of the following standards initially have `@srrstats` tags. +#' These may be moved at any time to any other locations in your code. +#' Once addressed, please modify the tag from `@srrstats` to `@srrstats`, +#' or `@srrstatsNA`, ensuring that references to every one of the following +#' standards remain somewhere within your code. +#' (These comments may be deleted at any time.) +#' +#' @srrstatsVerbose TRUE +#' +#' @srrstats {G1.4a} +#' @srrstats {G1.6} Some performance comparisons are available in the case study vignettes upon following the links to original tutorials they replicate. +#' @srrstats {G2.0} Function arguments passed to Julia are appropriately handled in Julia. Outputs from Julia to R are checked on type and length. +#' @srrstats {G2.0a} +#' @srrstats {G2.1} Function arguments passed to Julia are appropriately handled in Julia. Type conversion is applied where R and Julia disagree. +#' @srrstats {G2.1a} +#' @srrstats {G2.2} Package checks length of parameters where appropriate. +#' @srrstats {G2.3} Functions use `match.arg()` where appropriate. Character parameters are documented as lower-case and functions do not implement implicit `tolower()` conversion. +#' @srrstats {G2.3a} +#' @srrstats {G2.3b} +#' @srrstats {G2.4} The occasional type conversions happen in transporting data between R and Julia and these are handled appropriately. +#' @srrstats {G2.4a} +#' @srrstats {G2.4b} +#' @srrstats {G2.4c} +#' @srrstats {G2.4d} +#' @srrstats {G2.4e} +#' @srrstats {G2.6} Values are appropriately pre-processed regardless of class structures where appropriate +#' @srrstats {G2.14a} +#' @srrstats {G2.14b} +#' @srrstats {G2.14c} +#' @srrstats {G2.15} Functions check for missinginess in output where appropriate. +#' @srrstats {G5.2a} +#' @srrstats {G5.2b} +#' @srrstats {G5.4a} +#' @srrstats {G5.4b} +#' @srrstats {G5.4c} +#' @srrstats {G5.8a} +#' @srrstats {G5.8b} +#' @srrstats {G5.8c} +#' @srrstats {G5.8d} +#' @srrstats {G5.9a} +#' @srrstats {G5.9b} +#' @srrstats {G5.10} Tests ran on codecov. +#' @srrstats {RE1.2} See G2.5 +#' @srrstats {RE1.3a} +#' @srrstats {RE2.1} See G2.14 +#' @srrstats {RE4.7} Convergence statistics can be retrieved from evaluating Julia code on model object via JuliaConnectoR +#' @noRd +NULL + +#' NA_standards +#' +#' Any non-applicable standards can have their tags changed from `@srrstats` +#' to `@srrstatsNA`, and placed together in this block, along with explanations +#' for why each of these standards have been deemed not applicable. +#' (These comments may also be deleted at any time.) +#' +#' @srrstatsNA {G1.5} There are no associated publications with the package. +#' @srrstatsNA {G3.1} Package does not rely on covariance calculations +#' @srrstatsNA {G3.1a} +#' @srrstatsNA {G4.0} Package does not enable outputs to be written to local files +#' @srrstatsNA {G5.1} Package does not create any data sets to export. +#' @srrstatsNA {G5.6} No parameter recovery tests, but handled in GLM.jl and MixedModels.jl +#' @srrstatsNA {G5.6a} +#' @srrstatsNA {G5.6b} +#' @srrstatsNA {G5.7} No Algorithm performance tests on regression itself, but handled in GLM.jl and MixedModels.jl +#' @srrstatsNA {G5.11} Tests only require built-in datasets +#' @srrstatsNA {G5.11a} +#' @srrstatsNA {G5.12} All tests are trivial checks on quality and correctness. +#' @srrstatsNA {RE1.4} No data transformations applied beyond numerical coding of categorical variables via `model.matrix()` +#' @srrstatsNA {RE2.0} No data transformations applied beyond numerical coding of categorical variables via `model.matrix()` +#' @srrstatsNA {RE2.2} Missing data in response variable not allowed. +#' @srrstatsNA {RE2.3} Centering data not within scope of the package. +#' @srrstatsNA {RE2.4} Colinearity checks not within scope of the package. +#' @srrstatsNA {RE2.4a} +#' @srrstatsNA {RE2.4b} +#' @srrstatsNA {RE3.2} Handled in GLM.jl and MixedModels.jl +#' @srrstatsNA {RE4.1} Generating a model object without actually fitting values not within the scope of the package. +#' @srrstatsNA {RE4.9} Modelled values of response variables not within scope of the package +#' @srrstatsNA {RE4.12} Transformation functions not within the scope of the package. +#' @srrstatsNA {RE4.14} Extrapolation or forecast *errors* capability not within the scope of the package. +#' @srrstatsNA {RE4.15} Forecase-related functionalities not within the scope of the package. +#' @srrstatsNA {RE4.16} No feature for modelling distinct responses for different categorical groups. +#' @srrstatsNA {RE4.18} No `summary()` method implemented for Julia model objects but see `tidy.jlmer_mod()` and `glance.jlmer_mod()` +#' @srrstatsNA {RE5.0} Scaling relationships on speed not checked, but handled in GLM.jl and MixedModels.jl +#' @srrstatsNA {RE6.0} No default `plot()` methods but users may interface with Julia-specific visualization tools via JuliaConnectoR. +#' @srrstatsNA {RE6.1} See RE6.0 +#' @srrstatsNA {RE6.2} See RE6.0 +#' @srrstatsNA {RE6.3} See RE4.15 +#' @srrstatsNA {RE7.0} Tests with noiseless, exact relationships between predictor (independent) data handled in GLM.jl and MixedModels.jl +#' @srrstatsNA {RE7.0a} +#' @srrstatsNA {RE7.1} Tests with noiseless, exact relationships between predictor (independent) and response (dependent) data handled in GLM.jl and MixedModels.jl +#' @srrstatsNA {RE7.1a} +#' @srrstatsNA {RE7.2} Output objects are designed to strip meta-data attributes input data for R-Julia interoperability +#' @srrstatsNA {RE7.4} See RE4.15 +#' @noRd +NULL diff --git a/R/threshold_search.R b/R/threshold_search.R new file mode 100644 index 0000000..3d3e9cd --- /dev/null +++ b/R/threshold_search.R @@ -0,0 +1,65 @@ +#' Test the probability of cluster-mass statistics over a range of threshold values +#' +#' @param steps A vector of threshold values to test +#' @inheritParams extract_empirical_clusters +#' @inheritParams extract_null_cluster_dists +#' @inheritParams calculate_clusters_pvalues +#' @param progress Whether to display a progress bar +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Compute timewise statistics for the observed and permuted data +#' empirical_statistics <- compute_timewise_statistics(spec) +#' reset_rng_state() +#' null_statistics <- permute_timewise_statistics(spec, nsim = 100) +#' +#' # Test cluster mass/probability under different threshold values +#' walk_threshold_steps(empirical_statistics, null_statistics, steps = 1:3) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @return A data frame of predictor clusters-mass statistics by threshold. +#' @export +walk_threshold_steps <- function(empirical_statistics, null_statistics, steps, + top_n = Inf, binned = FALSE, add1 = TRUE, progress = TRUE) { + test_threshold <- function(threshold) { + empirical <- extract_empirical_clusters(empirical_statistics, threshold = threshold, binned = binned, top_n = top_n) + if (all(attr(empirical, "missing_clusters"))) { + return(NULL) + } + null <- extract_null_cluster_dists(null_statistics, threshold = threshold, binned = binned) + out <- tidy(calculate_clusters_pvalues(empirical, null, add1 = TRUE)) + out[!is.na(out$pvalue), ] + } + if (progress) { + i_vec <- cli::cli_progress_along(steps) + } else { + i_vec <- seq_along(steps) + } + res <- lapply(i_vec, function(i) { + out <- test_threshold(steps[i]) + if (!is.null(out)) out$threshold <- steps[i] + out + }) + + res_df <- do.call(rbind.data.frame, res) + res_df[, c("threshold", "predictor", "id", "start", "end", "length", "sum_statistic", "pvalue")] +} diff --git a/R/tidy.R b/R/tidy.R new file mode 100644 index 0000000..bb936bc --- /dev/null +++ b/R/tidy.R @@ -0,0 +1,240 @@ +#' @importFrom generics tidy +#' @export +generics::tidy + +#' Tidier methods for Julia regression models +#' +#' @param x An object of class `jlmer_mod` +#' @param effects One of "var_model", "ran_pars", or "fixed" +#' @param ... Unused +#' +#' @srrstats {RE4.2} Model coefficients via `tidy()` +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' # Fixed-effects only model +#' mod1 <- to_jlmer(weight ~ 1 + Diet, ChickWeight) +#' tidy(mod1) +#' glance(mod1) +#' +#' # Mixed model +#' mod2 <- to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight) +#' tidy(mod2) +#' glance(mod2) +#' +#' # Select which of fixed/random effects to return +#' tidy(mod2, effects = "fixed") +#' tidy(mod2, effects = "ran_pars") +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @name julia_model_tidiers +#' @return A data frame +NULL + +#' @rdname julia_model_tidiers +#' @method tidy jlmer_mod +#' @export +tidy.jlmer_mod <- function(x, effects = c("var_model", "ran_pars", "fixed"), ...) { + out <- df_from_DF(JuliaConnectoR::juliaLet("DataFrame(coeftable(x))", x = x))[, 1:5] + colnames(out) <- c("term", "estimate", "std.error", "statistic", "p.value") + out$term <- backtrans_interaction(out$term) + is_mem <- JuliaConnectoR::juliaLet("typeof(x) <: MixedModel", x = x) + effects <- match.arg(effects) + if (is_mem && effects != "fixed") { + vc <- JuliaConnectoR::juliaGet(JuliaConnectoR::juliaCall("VarCorr", x))[[1]] + re_flatten <- lapply(vc, function(g) lapply(g, unlist)) + re_sd <- lapply(re_flatten, function(g) { + setNames(g[[1]], paste0("sd__", backtrans_interaction(names(g[[1]])))) + }) + re_cor <- lapply(re_flatten, function(g) { + re_terms <- backtrans_interaction(names(g[[1]])) + re_term_matrix <- outer(re_terms, re_terms, function(i, j) { + vapply(seq_along(i), function(ind) { + paste0(sort(c(i[ind], j[ind]))[1], ".", sort(c(i[ind], j[ind]))[2]) + }, character(1)) + }) + re_cor_terms <- re_term_matrix[lower.tri(re_term_matrix)] + if (!is.null(g[[2]])) { + setNames(g[[2]], paste0("cor__", re_cor_terms)) + } + }) + re <- Filter(Negate(is.null), c(re_sd, re_cor)) + re_dfs <- lapply(seq_along(re), function(i) { + data.frame(group = names(re)[i], term = names(re[[i]]), estimate = unname(re[[i]])) + }) + re_df <- do.call(rbind, re_dfs) + sigma <- JuliaConnectoR::juliaLet("x.sigma", x = x) + if (!is.na(sigma)) { + re_df <- rbind(re_df, data.frame(group = "Residual", term = "sd__Observation", estimate = sigma)) + } + if (effects == "ran_pars") { + out <- re_df + } else { + out$effect <- "fixed" + out$group <- NA_character_ + re_df$effect <- "ran_pars" + re_df[, setdiff(names(out), names(re_df))] <- NA + out <- rbind(out, re_df)[, c("effect", "group", "term", "estimate", "std.error", "statistic", "p.value")] + } + } + maybe_as_tibble(out) +} + +#' Tidiers for cluster permutation test objects +#' +#' @param x An object of class ``, ``, or `` +#' @param ... Unused +#' +#' @examplesIf JuliaConnectoR::juliaSetupOk() +#' \donttest{ +#' \dontshow{ +#' options("jlmerclusterperm.nthreads" = 2) +#' jlmerclusterperm_setup(verbose = FALSE) +#' julia_progress(show = FALSE) +#' } +#' +#' library(dplyr, warn.conflicts = FALSE) +#' +#' # Specification object +#' spec <- make_jlmer_spec( +#' weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), +#' subject = "Chick", time = "Time" +#' ) +#' spec +#' +#' # Method for `` +#' empirical_statistics <- compute_timewise_statistics(spec) +#' class(empirical_statistics) +#' tidy(empirical_statistics) +#' +#' reset_rng_state() +#' null_statistics <- permute_timewise_statistics(spec, nsim = 100) +#' class(null_statistics) +#' tidy(null_statistics) +#' +#' # Method for `` +#' empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +#' class(empirical_clusters) +#' tidy(empirical_clusters) +#' +#' # Method for `` +#' null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) +#' class(null_cluster_dists) +#' tidy(null_cluster_dists) +#' +#' \dontshow{ +#' JuliaConnectoR::stopJulia() +#' } +#' } +#' +#' @name cluster_permutation_tidiers + +#' @rdname cluster_permutation_tidiers +#' @method tidy timewise_statistics +#' @export +#' @return A data frame +tidy.timewise_statistics <- function(x, ...) { + stacked <- as.data.frame.table(x, responseName = "statistic") + colnames(stacked) <- tolower(colnames(stacked)) + stacked$time <- as.numeric(levels(stacked$time))[stacked$time] + stacked$predictor <- as.character(stacked$predictor) + stacked <- stacked[do.call(order, stacked[c("time", "predictor")]), intersect(c("predictor", "time", "statistic", "sim"), colnames(stacked))] + maybe_as_tibble(stacked) +} + +#' @rdname cluster_permutation_tidiers +#' @method tidy empirical_clusters +#' @export +tidy.empirical_clusters <- function(x, ...) { + pvalues <- attr(x, "pvalues") + cluster_dfs <- lapply(seq_along(x), function(i) { + predictor <- names(x)[i] + cluster_df <- cbind(predictor = predictor, x[[i]]) + if (!is.null(pvalues[[predictor]])) { + cluster_df$pvalue <- pvalues[[predictor]] + } else if (!is.null(pvalues)) { + cluster_df$pvalue <- NA + } + cluster_df + }) + clusters_df <- do.call(rbind, cluster_dfs) + clusters_df <- clusters_df[clusters_df$cluster_id != 0, ] + time <- as.numeric(attr(x, "time")) + clusters_df$id <- factor(zero_pad(clusters_df$cluster_id)) + clusters_df$length <- clusters_df$cluster_end - clusters_df$cluster_start + 1 + clusters_df$start <- time[clusters_df$cluster_start] + clusters_df$end <- time[clusters_df$cluster_end] + clusters_df$sum_statistic <- clusters_df$statistic + clusters_df <- clusters_df[intersect(c("predictor", "id", "start", "end", "length", "sum_statistic", "pvalue"), names(clusters_df))] + maybe_as_tibble(clusters_df) +} + +#' @rdname cluster_permutation_tidiers +#' @method tidy null_cluster_dists +#' @export +tidy.null_cluster_dists <- function(x, ...) { + cluster_dfs <- lapply(seq_along(x), function(i) { + cbind(predictor = names(x)[i], x[[i]]) + }) + clusters_df <- do.call(rbind, cluster_dfs) + time <- as.numeric(attr(x, "time")) + clusters_df$length <- clusters_df$cluster_end - clusters_df$cluster_start + 1 + clusters_df$start <- time[replace_as_na(clusters_df$cluster_start, 0)] + clusters_df$end <- time[replace_as_na(clusters_df$cluster_end, 0)] + clusters_df$length[is.na(clusters_df$start)] <- NA + clusters_df$sim <- as.factor(zero_pad(clusters_df$id)) + clusters_df$sum_statistic <- clusters_df$statistic + clusters_df <- clusters_df[c("predictor", "start", "end", "length", "sum_statistic", "sim")] + maybe_as_tibble(clusters_df) +} + +#' @importFrom generics glance +#' @export +generics::glance + +#' @rdname julia_model_tidiers +#' +#' @srrstats {RE4.5} Numbers of observations via `glance()` +#' @srrstats {RE4.10} Model Residuals via `glance()`. Users are assumed to be familiar. +#' @srrstats {RE4.11} Goodness-of-fit and other statistics via `glance()` +#' +#' @method glance jlmer_mod +#' @export +glance.jlmer_mod <- function(x, ...) { + is_mixed <- JuliaConnectoR::juliaLet("x isa MixedModel", x = x) + is_REML <- is_mixed && JuliaConnectoR::juliaLet("x.optsum.REML", x = x) + nobs <- JuliaConnectoR::juliaCall("nobs", x) + sigma <- if (is_mixed) { + JuliaConnectoR::juliaLet("x.sigma", x = x) %|0|% NA + } else { + has_dispersion <- JuliaConnectoR::juliaCall("GLM.dispersion_parameter", x) + if (has_dispersion) JuliaConnectoR::juliaLet("dispersion(x.model)", x = x) else NA + } + ll <- if (is_REML) { + list(logLik = NA, AIC = NA, BIC = NA) + } else { + list( + logLik = JuliaConnectoR::juliaCall("loglikelihood", x), + AIC = JuliaConnectoR::juliaCall("aic", x), + BIC = JuliaConnectoR::juliaCall("bic", x) + ) + } + deviance <- JuliaConnectoR::juliaCall("deviance", x) + dof <- JuliaConnectoR::juliaCall("dof", x) + out <- data.frame( + nobs = nobs, df = dof, sigma = sigma, + logLik = ll$logLik, AIC = ll$AIC, BIC = ll$BIC, + deviance = deviance, df.residual = nobs - dof + ) + maybe_as_tibble(out) +} diff --git a/R/utils.R b/R/utils.R new file mode 100644 index 0000000..e58191d --- /dev/null +++ b/R/utils.R @@ -0,0 +1,35 @@ +`%|0|%` <- function(lhs, rhs) if (is.null(lhs) || identical(lhs, "") || length(lhs) == 0 || is.na(lhs)) rhs else lhs + +#' @srrstats {G3.0} Equality comparison between floats use tolerance (e.g., the internal function `near_zero()`) +near_zero <- function(x) { + abs(x) < .Machine$double.eps^0.5 +} + +maybe_as_tibble <- function(x) { + if ("tibble" %in% loadedNamespaces()) { + rownames(x) <- NULL + class(x) <- c("tbl_df", "tbl", class(x)) + } + x +} + +backtrans_interaction <- function(x) { + gsub("__", ":", x, fixed = TRUE) +} + +replace_as_na <- function(x, y) { + x[x == y] <- NA + x +} + +zero_pad <- function(x, y) { + if (missing(y)) y <- max(x) + sprintf(paste0("%0", floor(log10(y)) + 1, "d"), x) +} + +check_arg_class <- function(x, x_class, x_arg = x_class) { + if (!inherits(x, x_class)) { + cli::cli_abort("{.arg {x_arg}} must be a {.cls {x_class}} object, not a {.cls {class(x)}}") + } + invisible(TRUE) +} diff --git a/R/zzz.R b/R/zzz.R new file mode 100644 index 0000000..aeddc8e --- /dev/null +++ b/R/zzz.R @@ -0,0 +1,3 @@ +.onLoad <- function(...) { + # jlmerclusterperm_setup() +} diff --git a/README.md b/README.md new file mode 100644 index 0000000..fbf8faf --- /dev/null +++ b/README.md @@ -0,0 +1,264 @@ + + + +# jlmerclusterperm + + + +[![Project Status: Active – The project has reached a stable, usable +state and is being actively +developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) +[![Development +Version](https://img.shields.io/badge/devel%20version-1.0.0-check.svg)](https://github.com/yjunechoe/jlmerclusterperm) +[![R-CMD-check](https://github.com/yjunechoe/jlmerclusterperm/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/yjunechoe/jlmerclusterperm/actions/workflows/R-CMD-check.yaml) +[![pkgcheck](https://github.com/yjunechoe/jlmerclusterperm/workflows/pkgcheck/badge.svg)](https://github.com/yjunechoe/jlmerclusterperm/actions?query=workflow%3Apkgcheck) +[![Codecov test +coverage](https://codecov.io/gh/yjunechoe/jlmerclusterperm/branch/main/graph/badge.svg)](https://app.codecov.io/gh/yjunechoe/jlmerclusterperm?branch=main) + + +Julia [GLM.jl](https://github.com/JuliaStats/GLM.jl) and +[MixedModels.jl](https://github.com/JuliaStats/MixedModels.jl) based +implementation of cluster-based permutation test for time series data, +powered by +[JuliaConnectoR](https://github.com/stefan-m-lenz/JuliaConnectoR). + +![](man/figures/clusterpermute_slice.png) + +## Installation and usage + +You can install the development version of `jlmerclusterperm` from +[GitHub](https://github.com/yjunechoe/jlmerclusterperm) with: + +``` r +# install.packages("remotes") +remotes::install_github("yjunechoe/jlmerclusterperm") +``` + +Using `jlmerclusterperm` requires a prior installation of the Julia +programming language, which can be downloaded from either the [official +website](https://julialang.org/) or using the command line utility +[juliaup](https://github.com/JuliaLang/juliaup). Julia version \>=1.8 is +required and +[1.9](https://julialang.org/blog/2023/04/julia-1.9-highlights/#caching_of_native_code) +is preferred for its substantial speed improvements. + +Before using functions from `jlmerclusterperm`, an initial setup is +required via calling `jlmerclusterperm_setup()`. The very first call on +a system will install necessary dependencies (this only happens once and +takes around 10-15 minutes). + +Subsequent calls to `jlmerclusterperm_setup()` incur a small overhead of +around 30 seconds, plus slight delays for first-time function calls. You +pay up front for start-up and warm-up costs and get blazingly-fast +functions from the package. + +``` r +# Both lines must be run at the start of each new session +library(jlmerclusterperm) +system.time(jlmerclusterperm_setup(verbose = FALSE)) +#> user system elapsed +#> 0.02 0.01 15.99 +``` + +See the [Get +Started](https://yjunechoe.github.io/jlmerclusterperm/articles/jlmerclusterperm.html) +page on the [package +website](https://yjunechoe.github.io/jlmerclusterperm/) for background +and tutorials. + +## Quick tour of package functionalities + +### Wholesale CPA with `clusterpermute()` + +A time series data: + +``` r +chickweights <- ChickWeight +chickweights$Time <- as.integer(factor(chickweights$Time)) +matplot( + tapply(chickweights$weight, chickweights[c("Time", "Diet")], mean), + type = "b", lwd = 3, ylab = "Weight", xlab = "Time" +) +``` + + + +Preparing a specification object: + +``` r +chickweights_spec <- make_jlmer_spec( + formula = weight ~ 1 + Diet, + data = chickweights, + subject = "Chick", time = "Time" +) +chickweights_spec +#> ── jlmer specification ───────────────────────────────────────── ── +#> Formula: weight ~ 1 + Diet2 + Diet3 + Diet4 +#> Predictors: +#> Diet: Diet2, Diet3, Diet4 +#> Groupings: +#> Subject: Chick +#> Trial: +#> Time: Time +#> Data: +#> weight Diet2 Diet3 Diet4 Chick Time +#> 1 42 0 0 0 1 1 +#> 2 51 0 0 0 1 2 +#> 3 59 0 0 0 1 3 +#> [ reached 'max' / getOption("max.print") -- omitted 575 rows ] +#> ──────────────────────────────────────────────────────────────────────────────── +``` + +Cluster-based permutation test: + +``` r +set_rng_state(123L) +clusterpermute( + chickweights_spec, + threshold = 2.5, + nsim = 100, + progress = FALSE +) +#> $null_cluster_dists +#> ── Null cluster-mass distribution (t > 2.5) ──────────── ── +#> Diet2 (n = 100) +#> Mean (SD): -0.039 (1.89) +#> Coverage intervals: 95% [-2.862, 0.000] +#> Diet3 (n = 100) +#> Mean (SD): -0.129 (2.02) +#> Coverage intervals: 95% [0.000, 0.000] +#> Diet4 (n = 100) +#> Mean (SD): 0.296 (3.21) +#> Coverage intervals: 95% [0.000, 5.797] +#> ──────────────────────────────────────────────────────────────────────────────── +#> +#> $empirical_clusters +#> ── Empirical clusters (t > 2.5) ──────────────────────── ── +#> Diet2 +#> [3, 4]: 6.121 (p=0.0495) +#> Diet3 +#> [3, 12]: 35.769 (p=0.0099) +#> Diet4 +#> [2, 8]: 32.442 (p=0.0099) +#> ──────────────────────────────────────────────────────────────────────────────── +``` + +With random effects: + +``` r +chickweights_re_spec <- make_jlmer_spec( + formula = weight ~ 1 + Diet + (1 | Chick), + data = chickweights, + subject = "Chick", time = "Time" +) +set_rng_state(123L) +clusterpermute( + chickweights_re_spec, + threshold = 2.5, + nsim = 100, + progress = FALSE +)$empirical_clusters +#> ── Empirical clusters (t > 2.5) ──────────────────────── ── +#> Diet2 +#> [3, 4]: 6.387 (p=0.0594) +#> Diet3 +#> [2, 12]: 39.919 (p=0.0099) +#> Diet4 +#> [2, 8]: 33.853 (p=0.0099) +#> ──────────────────────────────────────────────────────────────────────────────── +``` + +### Piecemeal approach to CPA + +Computing time-wise statistics of the observed data: + +``` r +empirical_statistics <- compute_timewise_statistics(chickweights_spec) +matplot(t(empirical_statistics), type = "b", pch = 1, lwd = 3, ylab = "t-statistic") +abline(h = 2.5, lty = 3) +``` + + + +Identifying empirical clusters: + +``` r +empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2.5) +empirical_clusters +#> ── Empirical clusters (t > 2.5) ──────────────────────── ── +#> Diet2 +#> [3, 4]: 6.121 +#> Diet3 +#> [3, 12]: 35.769 +#> Diet4 +#> [2, 8]: 32.442 +#> ──────────────────────────────────────────────────────────────────────────────── +``` + +Simulating the null distribution: + +``` r +set_rng_state(123L) +null_statistics <- permute_timewise_statistics(chickweights_spec, nsim = 100) +null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2.5) +null_cluster_dists +#> ── Null cluster-mass distribution (t > 2.5) ──────────── ── +#> Diet2 (n = 100) +#> Mean (SD): -0.039 (1.89) +#> Coverage intervals: 95% [-2.862, 0.000] +#> Diet3 (n = 100) +#> Mean (SD): -0.129 (2.02) +#> Coverage intervals: 95% [0.000, 0.000] +#> Diet4 (n = 100) +#> Mean (SD): 0.296 (3.21) +#> Coverage intervals: 95% [0.000, 5.797] +#> ──────────────────────────────────────────────────────────────────────────────── +``` + +Significance testing the cluster-mass statistic: + +``` r +calculate_clusters_pvalues(empirical_clusters, null_cluster_dists, add1 = TRUE) +#> ── Empirical clusters (t > 2.5) ──────────────────────── ── +#> Diet2 +#> [3, 4]: 6.121 (p=0.0495) +#> Diet3 +#> [3, 12]: 35.769 (p=0.0099) +#> Diet4 +#> [2, 8]: 32.442 (p=0.0099) +#> ──────────────────────────────────────────────────────────────────────────────── +``` + +Iterating over a range of threshold values: + +``` r +walk_threshold_steps(empirical_statistics, null_statistics, steps = c(2, 2.5, 3)) +#> threshold predictor id start end length sum_statistic pvalue +#> 1 2.0 Diet2 1 3 5 3 8.496376 0.07920792 +#> 2 2.0 Diet3 1 2 12 11 38.216035 0.00990099 +#> 3 2.0 Diet4 1 2 12 11 41.651468 0.00990099 +#> 4 2.5 Diet2 1 3 4 2 6.121141 0.04950495 +#> 5 2.5 Diet3 1 3 12 10 35.768957 0.00990099 +#> 6 2.5 Diet4 1 2 8 7 32.442352 0.00990099 +#> 31 3.0 Diet3 1 3 5 3 12.719231 0.00990099 +#> 21 3.0 Diet3 2 9 12 4 14.037622 0.00990099 +#> 41 3.0 Diet4 1 2 7 6 29.659402 0.00990099 +``` + +## Acknowledgments + +- The paper [Maris & Oostenveld + (2007)](https://doi.org/10.1016/j.jneumeth.2007.03.024) which + originally proposed the cluster-based permutation analysis. + +- The [JuliaConnectoR](https://github.com/stefan-m-lenz/JuliaConnectoR) + package for powering the R interface to Julia. + +- The Julia packages [GLM.jl](https://github.com/JuliaStats/GLM.jl) and + [MixedModels.jl](https://github.com/JuliaStats/MixedModels.jl) for + fast implementations (mixed effects) regression models. + +- Existing implementations of CPA in R + ([permuco](https://jaromilfrossard.github.io/permuco/), + [permutes](https://github.com/cvoeten/permutes), etc.) whose designs + inspired the CPA interface in jlmerclusterperm. diff --git a/build/partial.rdb b/build/partial.rdb new file mode 100644 index 0000000..2784684 Binary files /dev/null and b/build/partial.rdb differ diff --git a/build/vignette.rds b/build/vignette.rds new file mode 100644 index 0000000..677cf20 Binary files /dev/null and b/build/vignette.rds differ diff --git a/inst/doc/jlmerclusterperm.R b/inst/doc/jlmerclusterperm.R new file mode 100644 index 0000000..f899f3d --- /dev/null +++ b/inst/doc/jlmerclusterperm.R @@ -0,0 +1,6 @@ +## ---- include = FALSE--------------------------------------------------------- +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) + diff --git a/inst/doc/jlmerclusterperm.Rmd b/inst/doc/jlmerclusterperm.Rmd new file mode 100644 index 0000000..c987ab1 --- /dev/null +++ b/inst/doc/jlmerclusterperm.Rmd @@ -0,0 +1,77 @@ +--- +title: "Introduction to jlmerclusterperm" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Introduction to jlmerclusterperm} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) +``` + +`jlmerclusterperm` is an interface to regression-based CPAs which supports both wholesale and piecemeal approaches to conducting the test. + +## Overview of CPA + +**Cluster-based permutation analysis (CPA)** is a simulation-based, non-parametric statistical test of difference between groups in a time series. It is suitable for analyzing **densely-sampled time data** (such as in EEG and eye-tracking research) where the research hypothesis is often specified up to the existence of an effect (e.g., as predicted by higher-order cognitive mechanisms) but agnostic to the temporal details of the effect (such as the precise moment of its emergence). + +CPA formalizes two intuitions about what it means for there to be a difference between groups: + +1) The countable unit of difference (i.e., a **cluster**) is a contiguous, uninterrupted span of _sufficiently large_ differences at each time point (determined via a _threshold_). + +2) The degree of extremity of a cluster (i.e., the **cluster-mass statistic**) is a measure that is sensitive to the magnitude of the difference, its variability, and the sample size (e.g., the t-statistic from a regression). + +The CPA procedure identifies empirical clusters in a time series data and tests the significance of the observed cluster-mass statistics against **bootstrapped permutations** of the data. The bootstrapping procedure samples from the "null" (via random shuffling of condition labels), yielding a distribution of cluster-mass statistics emerging from chance. The statistical significance of a cluster is the probability of observing a cluster-mass statistic as extreme as the cluster's against the simulated null distribution. + +## Package design + +![jlmerclusterperm package function design](https://raw.githubusercontent.com/yjunechoe/jlmerclusterperm/main/man/figures/jlmerclusterperm_fn_design.png) + +`jlmerclusterperm` provides both a wholesale and a piecemeal approach to conducting a CPA. The main workhorse function in the package is `clusterpermute()`, which is composed of five smaller functions that are called internally in succession. The smaller functions representing the algorithmic steps of a CPA are also exported, to allow more control over the procedure (e.g., for debugging and diagnosing a CPA run). + +See the [function documentation](https://yjunechoe.github.io/jlmerclusterperm/reference/index.html#cluster-based-permutation-analysis) for more. + +## Organization of vignettes + +The package [vignettes](https://yjunechoe.github.io/jlmerclusterperm/articles/) are roughly divided into two groups: **topics** and **case studies**. If you are a researcher with data ready for analysis, it is recommended to go through the case studies first and pluck out the relevant bits for your own desired analysis + +In the order of increasing complexity, the **case studies** are: + +1) [Garrison et al. 2020](https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html), which introduces the package's functions for running a CPA, demonstrating both wholesale and piecemeal approaches. It also compares the use of linear vs. logistic regression for the calculation of timewise statistics. + +2) [Geller et al. 2020](https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html), which demonstrates the use of random effects and regression contrasts in the CPA specification. It also compares the use of t vs. chisq timewise statistics. + +3) [de Carvalho et al. 2021](https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html), which showcases using custom regression contrasts to test the relationship between multiple levels of a factor. It also discusses issues surrounding the complexity of the timewise regression model. + +The **topics** cover general package features: + +- [Tidying output](https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html): `tidy()` and other ways of collecting `jlmerclusterperm` objects as tidy data. + +- [Julia interface](https://yjunechoe.github.io/jlmerclusterperm/articles/julia-interface.html): interacting with Julia objects using the [JuliaConnectoR](https://github.com/stefan-m-lenz/JuliaConnectoR) package. + +- [Reproducibility](https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html): using the Julia RNG to make CPA runs reproducible. + +- [Asynchronous CPA](https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html): running a CPA in a background process using the [future](https://future.futureverse.org/) package. + +## Compared to other implementations + +There are other R packages for cluster-based permutation analysis, such as [clusterperm](https://github.com/dalejbarr/clusterperm/), [eyetrackingR](https://github.com/jwdink/eyetrackingR), [permuco](https://github.com/jaromilfrossard/permuco/), and [permutes](https://github.com/cvoeten/permutes). + +Compared to existing implementations, `jlmerclusterperm` is the only package optimized for CPAs based on **mixed-effects regression models**, suitable for typical experimental research data with multiple, crossed grouping structures (e.g., subjects and items). It is also the only package with an interface to the individual algorithmic steps of a CPA. Lastly, thanks to the Julia back-end, bootstrapping is fast even for mixed models. + +## Further readings + +- The original method paper for CPA by [Maris & Oostenveld 2007](https://doi.org/10.1016/j.jneumeth.2007.03.024). + +- A more recent paper [Sassenhagen & Draschkow 2019](https://doi.org/10.1111/psyp.13335) on the DOs and DON'Ts of of interpreting and reporting CPA results. + +- A review paper for eyetracking analysis methods [Ito & Knoeferle 2022](https://doi.org/10.3758/s13428-022-01969-3) which compares CPA to other statistical techniques for testing difference between groups in time series data. + +- The JOSS paper for the [permuco](https://jaromilfrossard.github.io/permuco/) package ([Frossard & Renaud 2021](https://www.jstatsoft.org/article/view/v099i15)) and references therein. + +- [Benedikt V. Ehinger's blog post](https://benediktehinger.de/blog/science/statistics-cluster-permutation-test/), which includes a detailed graphical explainer for the CPA procedure. diff --git a/inst/doc/jlmerclusterperm.html b/inst/doc/jlmerclusterperm.html new file mode 100644 index 0000000..6f77a8d --- /dev/null +++ b/inst/doc/jlmerclusterperm.html @@ -0,0 +1,379 @@ + + + + + + + + + + + + + + +Introduction to jlmerclusterperm + + + + + + + + + + + + + + + + + + + + + + + +

Introduction to jlmerclusterperm

+ + + +

jlmerclusterperm is an interface to regression-based +CPAs which supports both wholesale and piecemeal approaches to +conducting the test.

+
+

Overview of CPA

+

Cluster-based permutation analysis (CPA) is a +simulation-based, non-parametric statistical test of difference between +groups in a time series. It is suitable for analyzing +densely-sampled time data (such as in EEG and +eye-tracking research) where the research hypothesis is often specified +up to the existence of an effect (e.g., as predicted by higher-order +cognitive mechanisms) but agnostic to the temporal details of the effect +(such as the precise moment of its emergence).

+

CPA formalizes two intuitions about what it means for there to be a +difference between groups:

+
    +
  1. The countable unit of difference (i.e., a +cluster) is a contiguous, uninterrupted span of +sufficiently large differences at each time point (determined +via a threshold).

  2. +
  3. The degree of extremity of a cluster (i.e., the +cluster-mass statistic) is a measure that is sensitive +to the magnitude of the difference, its variability, and the sample size +(e.g., the t-statistic from a regression).

  4. +
+

The CPA procedure identifies empirical clusters in a time series data +and tests the significance of the observed cluster-mass statistics +against bootstrapped permutations of the data. The +bootstrapping procedure samples from the “null” (via random shuffling of +condition labels), yielding a distribution of cluster-mass statistics +emerging from chance. The statistical significance of a cluster is the +probability of observing a cluster-mass statistic as extreme as the +cluster’s against the simulated null distribution.

+
+
+

Package design

+
+jlmerclusterperm package function design +
jlmerclusterperm package function design
+
+

jlmerclusterperm provides both a wholesale and a +piecemeal approach to conducting a CPA. The main workhorse function in +the package is clusterpermute(), which is composed of five +smaller functions that are called internally in succession. The smaller +functions representing the algorithmic steps of a CPA are also exported, +to allow more control over the procedure (e.g., for debugging and +diagnosing a CPA run).

+

See the function +documentation for more.

+
+
+

Organization of vignettes

+

The package vignettes +are roughly divided into two groups: topics and +case studies. If you are a researcher with data ready +for analysis, it is recommended to go through the case studies first and +pluck out the relevant bits for your own desired analysis

+

In the order of increasing complexity, the case +studies are:

+
    +
  1. Garrison +et al. 2020, which introduces the package’s functions for running a +CPA, demonstrating both wholesale and piecemeal approaches. It also +compares the use of linear vs. logistic regression for the calculation +of timewise statistics.

  2. +
  3. Geller +et al. 2020, which demonstrates the use of random effects and +regression contrasts in the CPA specification. It also compares the use +of t vs. chisq timewise statistics.

  4. +
  5. de +Carvalho et al. 2021, which showcases using custom regression +contrasts to test the relationship between multiple levels of a factor. +It also discusses issues surrounding the complexity of the timewise +regression model.

  6. +
+

The topics cover general package features:

+ +
+
+

Compared to other implementations

+

There are other R packages for cluster-based permutation analysis, +such as clusterperm, eyetrackingR, permuco, and permutes.

+

Compared to existing implementations, jlmerclusterperm +is the only package optimized for CPAs based on mixed-effects +regression models, suitable for typical experimental research +data with multiple, crossed grouping structures (e.g., subjects and +items). It is also the only package with an interface to the individual +algorithmic steps of a CPA. Lastly, thanks to the Julia back-end, +bootstrapping is fast even for mixed models.

+
+
+

Further readings

+ +
+ + + + + + + + + + + diff --git a/inst/julia/01-utils.jl b/inst/julia/01-utils.jl new file mode 100644 index 0000000..4bc4a25 --- /dev/null +++ b/inst/julia/01-utils.jl @@ -0,0 +1,27 @@ +function t_value(mod) + coef(mod) ./ stderror(mod) +end + +function get_rng_counter() + Int(rng.ctr1) +end + +reduce_formula = function (to_remove, enriched_formula, is_mem) + rhs = enriched_formula.rhs + if is_mem + is_fe = [map(x -> x isa MatrixTerm, rhs)...] + fe_terms = rhs[findfirst(is_fe)].terms + fe_to_keep = findall(!in(to_remove), Symbol.(fe_terms)) + fe_keep = MixedModels.collect_matrix_terms(fe_terms[fe_to_keep]) + new_rhs = (fe_keep, rhs[.!is_fe]...) + else + fe_terms = rhs.terms + fe_to_keep = findall(!in(to_remove), Symbol.(fe_terms)) + new_rhs = StatsModels.collect_matrix_terms(fe_terms[fe_to_keep]) + end + FormulaTerm(enriched_formula.lhs, new_rhs) +end + +chisq_value = function (lrt) + abs(2 * (lrt.loglikelihood[2] - lrt.loglikelihood[1])) +end diff --git a/inst/julia/02-jlmer.jl b/inst/julia/02-jlmer.jl new file mode 100644 index 0000000..19143c5 --- /dev/null +++ b/inst/julia/02-jlmer.jl @@ -0,0 +1,7 @@ +function jlmer(formula, data, family, contrasts, is_mem; opts...) + if is_mem + fit(MixedModel, formula, data, family; contrasts = contrasts, opts...) + else + glm(formula, data, family) + end +end diff --git a/inst/julia/03-compute_timewise_statistics.jl b/inst/julia/03-compute_timewise_statistics.jl new file mode 100644 index 0000000..103ac14 --- /dev/null +++ b/inst/julia/03-compute_timewise_statistics.jl @@ -0,0 +1,289 @@ +function compute_timewise_statistics( + formula, + data, + time, + family, + contrasts, + term_groups, + statistic, + is_mem; + opts..., +) + + response_var = formula.lhs.sym + times = sort(unique(data[!, time])) + n_times = length(times) + + if is_mem + fm_schema = MixedModels.schema(formula, data, contrasts) + form = MixedModels.apply_schema(formula, fm_schema, MixedModel) + re_term = [isa(x, MixedModels.AbstractReTerm) for x in form.rhs] + fixed = String.(Symbol.(form.rhs[.!re_term][1].terms)) + grouping_vars = [String(Symbol(x.rhs)) for x in form.rhs[re_term]] + else + fm_schema = StatsModels.schema(formula, data) + form = StatsModels.apply_schema(formula, fm_schema) + fixed = String.(Symbol.(form.rhs.terms)) + end + + if statistic == "chisq" + drop_terms = filter(x -> x.p != ["(Intercept)"], term_groups) + reduced_formula = + [(fm = reduce_formula(Symbol.(x.p), form, is_mem), i = x.i) for x in drop_terms] + test_opts = (reduced_formula = reduced_formula,) + elseif statistic == "t" + test_opts = Nothing + end + + if is_mem + timewise_stats = timewise_lme( + formula, + data, + time, + family, + contrasts, + statistic, + test_opts, + response_var, + fixed, + grouping_vars, + times, + n_times, + true; + opts..., + ) + timewise_stats + else + timewise_stats = timewise_lm( + formula, + data, + time, + family, + statistic, + test_opts, + response_var, + fixed, + times, + n_times, + ) + ( + t_matrix = timewise_stats.t_matrix, + convergence_failures = timewise_stats.convergence_failures, + Predictor = fixed, + Time = times, + ) + end + +end + +function timewise_lme( + formula, + data, + time, + family, + contrasts, + statistic, + test_opts, + response_var, + fixed, + grouping_vars, + times, + n_times, + diagnose; + opts..., +) + + if statistic == "t" + t_matrix = zeros(length(fixed), n_times) + elseif statistic == "chisq" + drop_formula = test_opts.reduced_formula + if drop_formula isa Vector + t_matrix = zeros(length(drop_formula), n_times) + else + t_matrix = zeros(length(fixed), n_times) + end + end + + convergence_failures = zeros(Bool, n_times) + if diagnose + singular_fits = zeros(Bool, n_times) + rePCA_95_matrix = zeros(length(grouping_vars), n_times) + end + + if diagnose + pg = Progress(n_times, output = pg_io, barlen = pg_width, showspeed = true) + end + + @suppress begin + Threads.@threads for i = 1:n_times + + data_at_time = filter(time => ==(times[i]), data) + response = data_at_time[!, response_var] + + if all(==(response[1]), response) + t_matrix[:, i] .= response[1] == 1 ? Inf : -Inf + convergence_failures[i] = missing + if diagnose + singular_fits[i] = missing + rePCA_95_matrix[:, i] .= missing + end + else + try + time_mod = fit( + MixedModel, + formula, + data_at_time, + family; + contrasts = contrasts, + opts..., + ) + # test statistic + if statistic == "chisq" + if drop_formula isa Vector + drop1_mods = map( + x -> fit( + MixedModel, + x.fm, + data_at_time, + family; + contrasts = contrasts, + opts..., + ), + drop_formula, + ) + betas = map(x -> coef(time_mod)[x.i], drop_formula) + chisq_vals = map( + x -> MixedModels.likelihoodratiotest( + time_mod, + x, + ).tests.deviancediff[1], + drop1_mods, + ) + signed_chisq = [ + chisq_vals[i] * + (length(betas[i]) == 1 ? sign(betas[i][1]) : 1) for + i = 1:length(chisq_vals) + ] + t_matrix[:, i] = signed_chisq + else + drop1_mod = fit( + MixedModel, + drop_formula.fm, + data_at_time, + family; + contrasts = contrasts, + opts..., + ) + chisq_val = MixedModels.likelihoodratiotest( + time_mod, + drop1_mod, + ).tests.deviancediff[1] + betas = coef(time_mod)[drop_formula.i] + signed_chisq = + chisq_val * (length(betas) == 1 ? sign(betas[1]) : 1) + t_matrix[drop_formula.i, i] .= signed_chisq + end + elseif statistic == "t" + t_matrix[:, i] = t_value(time_mod) + end + + if diagnose + singular_fits[i] = issingular(time_mod) + rePCA_95_matrix[:, i] = [ + all(isnan, x) ? 0 : findfirst(>(0.95), x) for + x in time_mod.rePCA + ] + end + catch e + convergence_failures[i] = true + t_matrix[:, i] .= NaN + end + end + + if diagnose + next!(pg) + end + end + end + + if diagnose + ( + singular_fits = singular_fits, + convergence_failures = convergence_failures, + t_matrix = t_matrix, + Predictor = fixed, + Time = times, + rePCA_95_matrix = rePCA_95_matrix, + Grouping = grouping_vars, + ) + else + (t_matrix = t_matrix, convergence_failures = convergence_failures) + end + +end + +function timewise_lm( + formula, + data, + time, + family, + statistic, + test_opts, + response_var, + fixed, + times, + n_times, +) + + if statistic == "t" + t_matrix = zeros(length(fixed), n_times) + elseif statistic == "chisq" + drop_formula = test_opts.reduced_formula + if drop_formula isa Vector + t_matrix = zeros(length(drop_formula), n_times) + else + t_matrix = zeros(length(fixed), n_times) + end + end + + convergence_failures = zeros(Bool, n_times) + + Threads.@threads for i = 1:n_times + data_at_time = filter(time => ==(times[i]), data) + response = data_at_time[!, response_var] + if all(==(response[1]), response) + constant = response[1] == 1 ? Inf : -Inf + t_matrix[:, i] .= constant + else + try + time_mod = glm(formula, data_at_time, family) + if statistic == "chisq" + if drop_formula isa Vector + drop1_mods = map(x -> glm(x.fm, data_at_time, family), drop_formula) + betas = map(x -> coef(time_mod)[x.i], drop_formula) + chisq_vals = map(x -> chisq_value(lrtest(time_mod, x)), drop1_mods) + signed_chisq = [ + chisq_vals[i] * (length(betas[i]) == 1 ? sign(betas[i][1]) : 1) for + i = 1:length(chisq_vals) + ] + t_matrix[:, i] = signed_chisq + else + chisq_val = chisq_value( + lrtest(time_mod, glm(drop_formula.fm, data_at_time, family)), + ) + betas = coef(time_mod)[drop_formula.i] + signed_chisq = chisq_val * (length(betas) == 1 ? sign(betas[1]) : 1) + t_matrix[drop_formula.i, i] .= signed_chisq + end + elseif statistic == "t" + t_matrix[:, i] = t_value(time_mod) + end + catch e + convergence_failures[i] = true + t_matrix[:, i] .= NaN + continue + end + end + end + (t_matrix = t_matrix, convergence_failures = convergence_failures) +end diff --git a/inst/julia/04-extract_clusters.jl b/inst/julia/04-extract_clusters.jl new file mode 100644 index 0000000..d21894b --- /dev/null +++ b/inst/julia/04-extract_clusters.jl @@ -0,0 +1,40 @@ +function _extract_clusters(t_vec, binned, n, id) + runs = rle(sign.(t_vec)) + run_inds = vcat(0, cumsum(runs[2])) + clusters = (:).(run_inds[1:end-1] .+ 1, run_inds[2:end]) + run_groups = getindex.(Ref(t_vec), clusters) + sum_t = (sum.(x -> isinf(x) ? 0 : x, run_groups)) + cluster_ranges = extrema.(clusters) + clusters_df = DataFrame(cluster_ranges) + rename!(clusters_df, :1 => :cluster_start, :2 => :cluster_end) + clusters_df.statistic = sum_t + filter!(:statistic => !≈(0), clusters_df) + transform!(clusters_df, :statistic => ByRow(abs) => :abs_stat) + if !binned + filter!([:cluster_end, :cluster_start] => !=, clusters_df) + end + transform!(clusters_df, eachindex => :cluster_id) + sort!(clusters_df, :abs_stat, rev = true) + select!(clusters_df, [:cluster_id, :cluster_start, :cluster_end, :statistic]) + if nrow(clusters_df) == 0 + out = DataFrame( + cluster_start = 0, + cluster_end = 0, + statistic = 0, + cluster_id = 0, + id = id, + ) + else + out = first(clusters_df, n) + out.id .= id + end + out +end + +function extract_clusters(t_matrix, binned, n) + out = Vector{DataFrame}(undef, size(t_matrix, 1)) + for i = 1:length(out) + out[i] = _extract_clusters(t_matrix[i, :], binned, n, i) + end + vcat(out...) +end diff --git a/inst/julia/05-permute.jl b/inst/julia/05-permute.jl new file mode 100644 index 0000000..ce81494 --- /dev/null +++ b/inst/julia/05-permute.jl @@ -0,0 +1,68 @@ +function guess_and_shuffle_as!(df, predictor_cols, participant_col, trial_col) + shuffle_type = guess_shuffle_as(df, predictor_cols, participant_col, trial_col) + shuffle_as!(df, predictor_cols, participant_col, trial_col, shuffle_type) +end + +function guess_shuffle_as(df, predictor_cols, participant_col, trial_col) + subj_pred_pair = unique(df[!, vcat(participant_col, predictor_cols)]) + unique_combinations = length(unique(df[!, participant_col])) == nrow(subj_pred_pair) + if unique_combinations + "between_participant" + else + if (ismissing(trial_col) || isnothing(trial_col)) + throw("""Guessed "within_participant" but no column for `trial` supplied.""") + else + "within_participant" + end + end +end + +function shuffle_as!(df, shuffle_type, predictor_cols, participant_col, trial_col) + if shuffle_type == "between_participant" + subj_pred_pair = unique(df[!, vcat(participant_col, predictor_cols)]) + shuffle!(rng, subj_pred_pair[!, participant_col]) + select!(df, Not(predictor_cols)) + leftjoin!(df, subj_pred_pair, on = participant_col) + elseif shuffle_type == "within_participant" + trial_pred_pair = unique(df[!, vcat(participant_col, trial_col, predictor_cols)]) + combine( + groupby(trial_pred_pair, participant_col), + sdf -> shuffle!(rng, sdf[!, trial_col]), + ) + select!(df, Not(predictor_cols)) + leftjoin!(df, trial_pred_pair, on = [participant_col, trial_col]) + end +end + +function permute_by_predictor( + df, + shuffle_type, + predictor_cols, + participant_col, + trial_col, + n, +) + _df = copy(df) + out = insertcols( + shuffle_as!(_df, shuffle_type, predictor_cols, participant_col, trial_col), + :id => 1, + ) + if (n > 1) + for i = 2:n + append!( + out, + insertcols( + shuffle_as!( + _df, + shuffle_type, + predictor_cols, + participant_col, + trial_col, + ), + :id => i, + ), + ) + end + end + select!(out, :id, Not(:id)) +end diff --git a/inst/julia/06-permute_timewise_statistics.jl b/inst/julia/06-permute_timewise_statistics.jl new file mode 100644 index 0000000..a089fb5 --- /dev/null +++ b/inst/julia/06-permute_timewise_statistics.jl @@ -0,0 +1,114 @@ +function permute_timewise_statistics( + formula, + data, + time, + family, + contrasts, + nsim, + participant_col, + trial_col, + term_groups, + predictors_subset, + statistic, + is_mem; + opts..., +) + + response_var = formula.lhs.sym + times = sort(unique(data[!, time])) + n_times = length(times) + + predictors_exclude = ["(Intercept)"] + if length(predictors_subset) > 0 + term_groups_est = + filter(grp -> any(in(predictors_subset), vcat(grp.p, grp.P)), term_groups) + else + term_groups_est = filter(grp -> !all(in(predictors_exclude), grp.p), term_groups) + end + + nsims = nsim * length(term_groups_est) + counter_states = zeros(nsims) + pg = Progress(nsims, output = pg_io, barlen = pg_width, showspeed = true) + + if is_mem + fm_schema = MixedModels.schema(formula, data, contrasts) + form = MixedModels.apply_schema(formula, fm_schema, MixedModel) + re_term = [isa(x, MixedModels.AbstractReTerm) for x in form.rhs] + fixed = String.(Symbol.(form.rhs[.!re_term][1].terms)) + grouping_vars = [String(Symbol(x.rhs)) for x in form.rhs[re_term]] + else + fm_schema = StatsModels.schema(formula, data) + form = StatsModels.apply_schema(formula, fm_schema) + fixed = String.(Symbol.(form.rhs.terms)) + end + + n_fixed = length(fixed) + res = zeros(nsim, n_times, n_fixed) + + for term_groups in term_groups_est + predictors = term_groups.p + permute_data = copy(data) + shuffle_type = guess_shuffle_as( + permute_data, + predictors, + participant_col, + trial_col == "" ? missing : 3, + ) + + if statistic == "chisq" + reduced_formula = reduce_formula(Symbol.(predictors), form, is_mem) + test_opts = (reduced_formula = (fm = reduced_formula, i = term_groups.i),) + elseif statistic == "t" + test_opts = Nothing + end + + for i = 1:nsim + counter_states[i] = get_rng_counter() + shuffle_as!(permute_data, shuffle_type, predictors, participant_col, trial_col) + if is_mem + timewise_stats = timewise_lme( + formula, + permute_data, + time, + family, + contrasts, + statistic, + test_opts, + response_var, + fixed, + grouping_vars, + times, + n_times, + false; + opts..., + ) + zs = timewise_stats.t_matrix + else + timewise_stats = timewise_lm( + formula, + permute_data, + time, + family, + statistic, + test_opts, + response_var, + fixed, + times, + n_times, + ) + zs = timewise_stats.t_matrix + end + for term_ind in term_groups.i + res[i, :, term_ind] = zs[term_ind, :] + end + next!(pg) + end + end + + + predictors = vcat(map(terms -> terms.p, term_groups_est)...) + res = res[:, :, vcat(map(terms -> terms.i, term_groups_est)...)] + + (z_array = res, predictors = predictors, counter_states = counter_states) + +end diff --git a/inst/julia/Manifest.toml b/inst/julia/Manifest.toml new file mode 100644 index 0000000..eb3bf29 --- /dev/null +++ b/inst/julia/Manifest.toml @@ -0,0 +1,776 @@ +# This file is machine-generated - editing it directly is not advised + +julia_version = "1.9.1" +manifest_format = "2.0" +project_hash = "a13d0e98276016de2b7ee63b5853697496f13590" + +[[deps.Adapt]] +deps = ["LinearAlgebra", "Requires"] +git-tree-sha1 = "cc37d689f599e8df4f464b2fa3870ff7db7492ef" +uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" +version = "3.6.1" +weakdeps = ["StaticArrays"] + + [deps.Adapt.extensions] + AdaptStaticArraysExt = "StaticArrays" + +[[deps.ArgTools]] +uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f" +version = "1.1.1" + +[[deps.Arrow]] +deps = ["ArrowTypes", "BitIntegers", "CodecLz4", "CodecZstd", "DataAPI", "Dates", "LoggingExtras", "Mmap", "PooledArrays", "SentinelArrays", "Tables", "TimeZones", "UUIDs", "WorkerUtilities"] +git-tree-sha1 = "551fd7902ebdac5e346500d9a7a6afb3f3502c83" +uuid = "69666777-d1a9-59fb-9406-91d4454c9d45" +version = "2.5.0" + +[[deps.ArrowTypes]] +deps = ["UUIDs"] +git-tree-sha1 = "563d60f89fcb730668bd568ba3e752ee71dde023" +uuid = "31f734f8-188a-4ce0-8406-c8a06bd891cd" +version = "2.0.2" + +[[deps.Artifacts]] +uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33" + +[[deps.Base64]] +uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" + +[[deps.BenchmarkTools]] +deps = ["JSON", "Logging", "Printf", "Profile", "Statistics", "UUIDs"] +git-tree-sha1 = "d9a9701b899b30332bbcb3e1679c41cce81fb0e8" +uuid = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf" +version = "1.3.2" + +[[deps.BitIntegers]] +deps = ["Random"] +git-tree-sha1 = "fc54d5837033a170f3bad307f993e156eefc345f" +uuid = "c3b6d118-76ef-56ca-8cc7-ebb389d030a1" +version = "0.2.7" + +[[deps.Bzip2_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "19a35467a82e236ff51bc17a3a44b69ef35185a2" +uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0" +version = "1.0.8+0" + +[[deps.CEnum]] +git-tree-sha1 = "eb4cb44a499229b3b8426dcfb5dd85333951ff90" +uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82" +version = "0.4.2" + +[[deps.Calculus]] +deps = ["LinearAlgebra"] +git-tree-sha1 = "f641eb0a4f00c343bbc32346e1217b86f3ce9dad" +uuid = "49dc2e85-a5d0-5ad3-a950-438e2897f1b9" +version = "0.5.1" + +[[deps.CodecBzip2]] +deps = ["Bzip2_jll", "Libdl", "TranscodingStreams"] +git-tree-sha1 = "2e62a725210ce3c3c2e1a3080190e7ca491f18d7" +uuid = "523fee87-0ab8-5b00-afb7-3ecf72e48cfd" +version = "0.7.2" + +[[deps.CodecLz4]] +deps = ["Lz4_jll", "TranscodingStreams"] +git-tree-sha1 = "59fe0cb37784288d6b9f1baebddbf75457395d40" +uuid = "5ba52731-8f18-5e0d-9241-30f10d1ec561" +version = "0.4.0" + +[[deps.CodecZlib]] +deps = ["TranscodingStreams", "Zlib_jll"] +git-tree-sha1 = "9c209fb7536406834aa938fb149964b985de6c83" +uuid = "944b1d66-785c-5afd-91f1-9de20f533193" +version = "0.7.1" + +[[deps.CodecZstd]] +deps = ["CEnum", "TranscodingStreams", "Zstd_jll"] +git-tree-sha1 = "849470b337d0fa8449c21061de922386f32949d9" +uuid = "6b39b394-51ab-5f42-8807-6242bab2b4c2" +version = "0.7.2" + +[[deps.CommonSubexpressions]] +deps = ["MacroTools", "Test"] +git-tree-sha1 = "7b8a93dba8af7e3b42fecabf646260105ac373f7" +uuid = "bbf7d656-a473-5ed7-a52c-81e309532950" +version = "0.3.0" + +[[deps.Compat]] +deps = ["UUIDs"] +git-tree-sha1 = "7a60c856b9fa189eb34f5f8a6f6b5529b7942957" +uuid = "34da2185-b29b-5c13-b0c7-acf172513d20" +version = "4.6.1" +weakdeps = ["Dates", "LinearAlgebra"] + + [deps.Compat.extensions] + CompatLinearAlgebraExt = "LinearAlgebra" + +[[deps.CompilerSupportLibraries_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" +version = "1.0.2+0" + +[[deps.Crayons]] +git-tree-sha1 = "249fe38abf76d48563e2f4556bebd215aa317e15" +uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f" +version = "4.1.1" + +[[deps.DataAPI]] +git-tree-sha1 = "e8119c1a33d267e16108be441a287a6981ba1630" +uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a" +version = "1.14.0" + +[[deps.DataFrames]] +deps = ["Compat", "DataAPI", "Future", "InlineStrings", "InvertedIndices", "IteratorInterfaceExtensions", "LinearAlgebra", "Markdown", "Missings", "PooledArrays", "PrettyTables", "Printf", "REPL", "Random", "Reexport", "SentinelArrays", "SnoopPrecompile", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"] +git-tree-sha1 = "aa51303df86f8626a962fccb878430cdb0a97eee" +uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" +version = "1.5.0" + +[[deps.DataStructures]] +deps = ["Compat", "InteractiveUtils", "OrderedCollections"] +git-tree-sha1 = "d1fff3a548102f48987a52a2e0d114fa97d730f0" +uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8" +version = "0.18.13" + +[[deps.DataValueInterfaces]] +git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6" +uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464" +version = "1.0.0" + +[[deps.Dates]] +deps = ["Printf"] +uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" + +[[deps.Dictionaries]] +deps = ["Indexing", "Random", "Serialization"] +git-tree-sha1 = "e82c3c97b5b4ec111f3c1b55228cebc7510525a2" +uuid = "85a47980-9c8c-11e8-2b9f-f7ca1fa99fb4" +version = "0.3.25" + +[[deps.DiffResults]] +deps = ["StaticArraysCore"] +git-tree-sha1 = "782dd5f4561f5d267313f23853baaaa4c52ea621" +uuid = "163ba53b-c6d8-5494-b064-1a9d43ac40c5" +version = "1.1.0" + +[[deps.DiffRules]] +deps = ["IrrationalConstants", "LogExpFunctions", "NaNMath", "Random", "SpecialFunctions"] +git-tree-sha1 = "a4ad7ef19d2cdc2eff57abbbe68032b1cd0bd8f8" +uuid = "b552c78f-8df3-52c6-915a-8e097449b14b" +version = "1.13.0" + +[[deps.Distributed]] +deps = ["Random", "Serialization", "Sockets"] +uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b" + +[[deps.Distributions]] +deps = ["FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns", "Test"] +git-tree-sha1 = "13027f188d26206b9e7b863036f87d2f2e7d013a" +uuid = "31c24e10-a181-5473-b8eb-7969acd0382f" +version = "0.25.87" + + [deps.Distributions.extensions] + DistributionsChainRulesCoreExt = "ChainRulesCore" + DistributionsDensityInterfaceExt = "DensityInterface" + + [deps.Distributions.weakdeps] + ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" + DensityInterface = "b429d917-457f-4dbc-8f4c-0cc954292b1d" + +[[deps.DocStringExtensions]] +deps = ["LibGit2"] +git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d" +uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae" +version = "0.9.3" + +[[deps.Downloads]] +deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"] +uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6" +version = "1.6.0" + +[[deps.DualNumbers]] +deps = ["Calculus", "NaNMath", "SpecialFunctions"] +git-tree-sha1 = "5837a837389fccf076445fce071c8ddaea35a566" +uuid = "fa6b7ba4-c1ee-5f82-b5fc-ecf0adba8f74" +version = "0.6.8" + +[[deps.ExprTools]] +git-tree-sha1 = "c1d06d129da9f55715c6c212866f5b1bddc5fa00" +uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" +version = "0.1.9" + +[[deps.FileWatching]] +uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee" + +[[deps.FillArrays]] +deps = ["LinearAlgebra", "Random", "SparseArrays", "Statistics"] +git-tree-sha1 = "fc86b4fd3eff76c3ce4f5e96e2fdfa6282722885" +uuid = "1a297f60-69ca-5386-bcde-b61e274b549b" +version = "1.0.0" + +[[deps.Formatting]] +deps = ["Printf"] +git-tree-sha1 = "8339d61043228fdd3eb658d86c926cb282ae72a8" +uuid = "59287772-0a20-5a39-b81b-1366585eb4c0" +version = "0.4.2" + +[[deps.ForwardDiff]] +deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "LinearAlgebra", "LogExpFunctions", "NaNMath", "Preferences", "Printf", "Random", "SpecialFunctions"] +git-tree-sha1 = "00e252f4d706b3d55a8863432e742bf5717b498d" +uuid = "f6369f11-7733-5829-9624-2563aa707210" +version = "0.10.35" +weakdeps = ["StaticArrays"] + + [deps.ForwardDiff.extensions] + ForwardDiffStaticArraysExt = "StaticArrays" + +[[deps.Future]] +deps = ["Random"] +uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820" + +[[deps.GLM]] +deps = ["Distributions", "LinearAlgebra", "Printf", "Reexport", "SparseArrays", "SpecialFunctions", "Statistics", "StatsAPI", "StatsBase", "StatsFuns", "StatsModels"] +git-tree-sha1 = "cd3e314957dc11c4c905d54d1f5a65c979e4748a" +uuid = "38e38edf-8417-5370-95a0-9cbb8c7f171a" +version = "1.8.2" + +[[deps.HypergeometricFunctions]] +deps = ["DualNumbers", "LinearAlgebra", "OpenLibm_jll", "SpecialFunctions"] +git-tree-sha1 = "432b5b03176f8182bd6841fbfc42c718506a2d5f" +uuid = "34004b35-14d8-5ef3-9330-4cdb6864b03a" +version = "0.3.15" + +[[deps.Indexing]] +git-tree-sha1 = "ce1566720fd6b19ff3411404d4b977acd4814f9f" +uuid = "313cdc1a-70c2-5d6a-ae34-0150d3930a38" +version = "1.1.1" + +[[deps.InlineStrings]] +deps = ["Parsers"] +git-tree-sha1 = "9cc2baf75c6d09f9da536ddf58eb2f29dedaf461" +uuid = "842dd82b-1e85-43dc-bf29-5d0ee9dffc48" +version = "1.4.0" + +[[deps.InteractiveUtils]] +deps = ["Markdown"] +uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240" + +[[deps.InvertedIndices]] +git-tree-sha1 = "0dc7b50b8d436461be01300fd8cd45aa0274b038" +uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f" +version = "1.3.0" + +[[deps.IrrationalConstants]] +git-tree-sha1 = "630b497eafcc20001bba38a4651b327dcfc491d2" +uuid = "92d709cd-6900-40b7-9082-c6be49f344b6" +version = "0.2.2" + +[[deps.IteratorInterfaceExtensions]] +git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856" +uuid = "82899510-4779-5014-852e-03e436cf321d" +version = "1.0.0" + +[[deps.JLLWrappers]] +deps = ["Preferences"] +git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1" +uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210" +version = "1.4.1" + +[[deps.JSON]] +deps = ["Dates", "Mmap", "Parsers", "Unicode"] +git-tree-sha1 = "31e996f0a15c7b280ba9f76636b3ff9e2ae58c9a" +uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6" +version = "0.21.4" + +[[deps.JSON3]] +deps = ["Dates", "Mmap", "Parsers", "SnoopPrecompile", "StructTypes", "UUIDs"] +git-tree-sha1 = "84b10656a41ef564c39d2d477d7236966d2b5683" +uuid = "0f8b85d8-7281-11e9-16c2-39a750bddbf1" +version = "1.12.0" + +[[deps.LaTeXStrings]] +git-tree-sha1 = "f2355693d6778a178ade15952b7ac47a4ff97996" +uuid = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f" +version = "1.3.0" + +[[deps.LazyArtifacts]] +deps = ["Artifacts", "Pkg"] +uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3" + +[[deps.LibCURL]] +deps = ["LibCURL_jll", "MozillaCACerts_jll"] +uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21" +version = "0.6.3" + +[[deps.LibCURL_jll]] +deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"] +uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0" +version = "7.84.0+0" + +[[deps.LibGit2]] +deps = ["Base64", "NetworkOptions", "Printf", "SHA"] +uuid = "76f85450-5226-5b5a-8eaa-529ad045b433" + +[[deps.LibSSH2_jll]] +deps = ["Artifacts", "Libdl", "MbedTLS_jll"] +uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8" +version = "1.10.2+0" + +[[deps.Libdl]] +uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" + +[[deps.LinearAlgebra]] +deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"] +uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" + +[[deps.LogExpFunctions]] +deps = ["DocStringExtensions", "IrrationalConstants", "LinearAlgebra"] +git-tree-sha1 = "0a1b7c2863e44523180fdb3146534e265a91870b" +uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688" +version = "0.3.23" + + [deps.LogExpFunctions.extensions] + LogExpFunctionsChainRulesCoreExt = "ChainRulesCore" + LogExpFunctionsChangesOfVariablesExt = "ChangesOfVariables" + LogExpFunctionsInverseFunctionsExt = "InverseFunctions" + + [deps.LogExpFunctions.weakdeps] + ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" + ChangesOfVariables = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0" + InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112" + +[[deps.Logging]] +uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" + +[[deps.LoggingExtras]] +deps = ["Dates", "Logging"] +git-tree-sha1 = "cedb76b37bc5a6c702ade66be44f831fa23c681e" +uuid = "e6f89c97-d47a-5376-807f-9c37f3926c36" +version = "1.0.0" + +[[deps.Lz4_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "5d494bc6e85c4c9b626ee0cab05daa4085486ab1" +uuid = "5ced341a-0733-55b8-9ab6-a4889d929147" +version = "1.9.3+0" + +[[deps.MacroTools]] +deps = ["Markdown", "Random"] +git-tree-sha1 = "42324d08725e200c23d4dfb549e0d5d89dede2d2" +uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09" +version = "0.5.10" + +[[deps.Markdown]] +deps = ["Base64"] +uuid = "d6f4376e-aef5-505a-96c1-9c027394607a" + +[[deps.MathOptInterface]] +deps = ["BenchmarkTools", "CodecBzip2", "CodecZlib", "DataStructures", "ForwardDiff", "JSON", "LinearAlgebra", "MutableArithmetics", "NaNMath", "OrderedCollections", "Printf", "SnoopPrecompile", "SparseArrays", "SpecialFunctions", "Test", "Unicode"] +git-tree-sha1 = "58a367388e1b068104fa421cb34f0e6ee6316a26" +uuid = "b8f27783-ece8-5eb3-8dc8-9495eed66fee" +version = "1.14.1" + +[[deps.MathProgBase]] +deps = ["LinearAlgebra", "SparseArrays"] +git-tree-sha1 = "9abbe463a1e9fc507f12a69e7f29346c2cdc472c" +uuid = "fdba3010-5040-5b88-9595-932c9decdf73" +version = "0.7.8" + +[[deps.MbedTLS_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" +version = "2.28.2+0" + +[[deps.Missings]] +deps = ["DataAPI"] +git-tree-sha1 = "f66bdc5de519e8f8ae43bdc598782d35a25b1272" +uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28" +version = "1.1.0" + +[[deps.MixedModels]] +deps = ["Arrow", "DataAPI", "Distributions", "GLM", "JSON3", "LazyArtifacts", "LinearAlgebra", "Markdown", "NLopt", "PooledArrays", "ProgressMeter", "Random", "SnoopPrecompile", "SparseArrays", "StaticArrays", "Statistics", "StatsAPI", "StatsBase", "StatsFuns", "StatsModels", "StructTypes", "Tables", "TypedTables"] +git-tree-sha1 = "6d21808e96593d3a7f36ace8a255ae04497a0a66" +uuid = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316" +version = "4.12.0" + +[[deps.Mmap]] +uuid = "a63ad114-7e13-5084-954f-fe012c677804" + +[[deps.Mocking]] +deps = ["Compat", "ExprTools"] +git-tree-sha1 = "782e258e80d68a73d8c916e55f8ced1de00c2cea" +uuid = "78c3b35d-d492-501b-9361-3d52fe80e533" +version = "0.7.6" + +[[deps.MozillaCACerts_jll]] +uuid = "14a3606d-f60d-562e-9121-12d972cd8159" +version = "2022.10.11" + +[[deps.MutableArithmetics]] +deps = ["LinearAlgebra", "SparseArrays", "Test"] +git-tree-sha1 = "3295d296288ab1a0a2528feb424b854418acff57" +uuid = "d8a4904e-b15c-11e9-3269-09a3773c0cb0" +version = "1.2.3" + +[[deps.NLopt]] +deps = ["MathOptInterface", "MathProgBase", "NLopt_jll"] +git-tree-sha1 = "5a7e32c569200a8a03c3d55d286254b0321cd262" +uuid = "76087f3c-5699-56af-9a33-bf431cd00edd" +version = "0.6.5" + +[[deps.NLopt_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "9b1f15a08f9d00cdb2761dcfa6f453f5d0d6f973" +uuid = "079eb43e-fd8e-5478-9966-2cf3e3edb778" +version = "2.7.1+0" + +[[deps.NaNMath]] +deps = ["OpenLibm_jll"] +git-tree-sha1 = "0877504529a3e5c3343c6f8b4c0381e57e4387e4" +uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3" +version = "1.0.2" + +[[deps.NetworkOptions]] +uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908" +version = "1.2.0" + +[[deps.OpenBLAS_jll]] +deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"] +uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" +version = "0.3.21+4" + +[[deps.OpenLibm_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "05823500-19ac-5b8b-9628-191a04bc5112" +version = "0.8.1+0" + +[[deps.OpenSpecFun_jll]] +deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1" +uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e" +version = "0.5.5+0" + +[[deps.OrderedCollections]] +git-tree-sha1 = "d321bf2de576bf25ec4d3e4360faca399afca282" +uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d" +version = "1.6.0" + +[[deps.PDMats]] +deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"] +git-tree-sha1 = "67eae2738d63117a196f497d7db789821bce61d1" +uuid = "90014a1f-27ba-587c-ab20-58faa44d9150" +version = "0.11.17" + +[[deps.Parsers]] +deps = ["Dates", "SnoopPrecompile"] +git-tree-sha1 = "478ac6c952fddd4399e71d4779797c538d0ff2bf" +uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0" +version = "2.5.8" + +[[deps.Pkg]] +deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"] +uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" +version = "1.9.0" + +[[deps.PooledArrays]] +deps = ["DataAPI", "Future"] +git-tree-sha1 = "a6062fe4063cdafe78f4a0a81cfffb89721b30e7" +uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720" +version = "1.4.2" + +[[deps.Preferences]] +deps = ["TOML"] +git-tree-sha1 = "47e5f437cc0e7ef2ce8406ce1e7e24d44915f88d" +uuid = "21216c6a-2e73-6563-6e65-726566657250" +version = "1.3.0" + +[[deps.PrettyTables]] +deps = ["Crayons", "Formatting", "LaTeXStrings", "Markdown", "Reexport", "StringManipulation", "Tables"] +git-tree-sha1 = "548793c7859e28ef026dba514752275ee871169f" +uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d" +version = "2.2.3" + +[[deps.Printf]] +deps = ["Unicode"] +uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7" + +[[deps.Profile]] +deps = ["Printf"] +uuid = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79" + +[[deps.ProgressMeter]] +deps = ["Distributed", "Printf"] +git-tree-sha1 = "d7a7aef8f8f2d537104f170139553b14dfe39fe9" +uuid = "92933f4c-e287-5a05-a399-4b506db050ca" +version = "1.7.2" + +[[deps.QuadGK]] +deps = ["DataStructures", "LinearAlgebra"] +git-tree-sha1 = "6ec7ac8412e83d57e313393220879ede1740f9ee" +uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" +version = "2.8.2" + +[[deps.REPL]] +deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"] +uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb" + +[[deps.Random]] +deps = ["SHA", "Serialization"] +uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" + +[[deps.Random123]] +deps = ["Random", "RandomNumbers"] +git-tree-sha1 = "7a1a306b72cfa60634f03a911405f4e64d1b718b" +uuid = "74087812-796a-5b5d-8853-05524746bad3" +version = "1.6.0" + +[[deps.RandomNumbers]] +deps = ["Random", "Requires"] +git-tree-sha1 = "043da614cc7e95c703498a491e2c21f58a2b8111" +uuid = "e6cf234a-135c-5ec9-84dd-332b85af5143" +version = "1.5.3" + +[[deps.RecipesBase]] +deps = ["SnoopPrecompile"] +git-tree-sha1 = "261dddd3b862bd2c940cf6ca4d1c8fe593e457c8" +uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01" +version = "1.3.3" + +[[deps.Reexport]] +git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b" +uuid = "189a3867-3050-52da-a836-e630ba90ab69" +version = "1.2.2" + +[[deps.Requires]] +deps = ["UUIDs"] +git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7" +uuid = "ae029012-a4dd-5104-9daa-d747884805df" +version = "1.3.0" + +[[deps.Rmath]] +deps = ["Random", "Rmath_jll"] +git-tree-sha1 = "f65dcb5fa46aee0cf9ed6274ccbd597adc49aa7b" +uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa" +version = "0.7.1" + +[[deps.Rmath_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "6ed52fdd3382cf21947b15e8870ac0ddbff736da" +uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f" +version = "0.4.0+0" + +[[deps.SHA]] +uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce" +version = "0.7.0" + +[[deps.Scratch]] +deps = ["Dates"] +git-tree-sha1 = "30449ee12237627992a99d5e30ae63e4d78cd24a" +uuid = "6c6a2e73-6563-6170-7368-637461726353" +version = "1.2.0" + +[[deps.SentinelArrays]] +deps = ["Dates", "Random"] +git-tree-sha1 = "77d3c4726515dca71f6d80fbb5e251088defe305" +uuid = "91c51154-3ec4-41a3-a24f-3f23e20d615c" +version = "1.3.18" + +[[deps.Serialization]] +uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b" + +[[deps.ShiftedArrays]] +git-tree-sha1 = "503688b59397b3307443af35cd953a13e8005c16" +uuid = "1277b4bf-5013-50f5-be3d-901d8477a67a" +version = "2.0.0" + +[[deps.SnoopPrecompile]] +deps = ["Preferences"] +git-tree-sha1 = "e760a70afdcd461cf01a575947738d359234665c" +uuid = "66db9d55-30c0-4569-8b51-7e840670fc0c" +version = "1.0.3" + +[[deps.Sockets]] +uuid = "6462fe0b-24de-5631-8697-dd941f90decc" + +[[deps.SortingAlgorithms]] +deps = ["DataStructures"] +git-tree-sha1 = "a4ada03f999bd01b3a25dcaa30b2d929fe537e00" +uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c" +version = "1.1.0" + +[[deps.SparseArrays]] +deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"] +uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" + +[[deps.SpecialFunctions]] +deps = ["IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"] +git-tree-sha1 = "ef28127915f4229c971eb43f3fc075dd3fe91880" +uuid = "276daf66-3868-5448-9aa4-cd146d93841b" +version = "2.2.0" + + [deps.SpecialFunctions.extensions] + SpecialFunctionsChainRulesCoreExt = "ChainRulesCore" + + [deps.SpecialFunctions.weakdeps] + ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" + +[[deps.SplitApplyCombine]] +deps = ["Dictionaries", "Indexing"] +git-tree-sha1 = "48f393b0231516850e39f6c756970e7ca8b77045" +uuid = "03a91e81-4c3e-53e1-a0a4-9c0c8f19dd66" +version = "1.2.2" + +[[deps.StaticArrays]] +deps = ["LinearAlgebra", "Random", "StaticArraysCore", "Statistics"] +git-tree-sha1 = "70e0cc0c0f9ef7ea76b3d7a50ada18c8c52e69a2" +uuid = "90137ffa-7385-5640-81b9-e52037218182" +version = "1.5.20" + +[[deps.StaticArraysCore]] +git-tree-sha1 = "6b7ba252635a5eff6a0b0664a41ee140a1c9e72a" +uuid = "1e83bf80-4336-4d27-bf5d-d5a4f845583c" +version = "1.4.0" + +[[deps.Statistics]] +deps = ["LinearAlgebra", "SparseArrays"] +uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" +version = "1.9.0" + +[[deps.StatsAPI]] +deps = ["LinearAlgebra"] +git-tree-sha1 = "45a7769a04a3cf80da1c1c7c60caf932e6f4c9f7" +uuid = "82ae8749-77ed-4fe6-ae5f-f523153014b0" +version = "1.6.0" + +[[deps.StatsBase]] +deps = ["DataAPI", "DataStructures", "LinearAlgebra", "LogExpFunctions", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics", "StatsAPI"] +git-tree-sha1 = "d1bf48bfcc554a3761a133fe3a9bb01488e06916" +uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" +version = "0.33.21" + +[[deps.StatsFuns]] +deps = ["HypergeometricFunctions", "IrrationalConstants", "LogExpFunctions", "Reexport", "Rmath", "SpecialFunctions"] +git-tree-sha1 = "f625d686d5a88bcd2b15cd81f18f98186fdc0c9a" +uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c" +version = "1.3.0" + + [deps.StatsFuns.extensions] + StatsFunsChainRulesCoreExt = "ChainRulesCore" + StatsFunsInverseFunctionsExt = "InverseFunctions" + + [deps.StatsFuns.weakdeps] + ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" + InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112" + +[[deps.StatsModels]] +deps = ["DataAPI", "DataStructures", "LinearAlgebra", "Printf", "REPL", "ShiftedArrays", "SparseArrays", "StatsBase", "StatsFuns", "Tables"] +git-tree-sha1 = "51cdf1afd9d78552e7a08536930d7abc3b288a5c" +uuid = "3eaba693-59b7-5ba5-a881-562e759f1c8d" +version = "0.7.1" + +[[deps.StringManipulation]] +git-tree-sha1 = "46da2434b41f41ac3594ee9816ce5541c6096123" +uuid = "892a3eda-7b42-436c-8928-eab12a02cf0e" +version = "0.3.0" + +[[deps.StructTypes]] +deps = ["Dates", "UUIDs"] +git-tree-sha1 = "ca4bccb03acf9faaf4137a9abc1881ed1841aa70" +uuid = "856f2bd8-1eba-4b0a-8007-ebc267875bd4" +version = "1.10.0" + +[[deps.SuiteSparse]] +deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"] +uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9" + +[[deps.SuiteSparse_jll]] +deps = ["Artifacts", "Libdl", "Pkg", "libblastrampoline_jll"] +uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c" +version = "5.10.1+6" + +[[deps.Suppressor]] +git-tree-sha1 = "c6ed566db2fe3931292865b966d6d140b7ef32a9" +uuid = "fd094767-a336-5f1f-9728-57cf17d0bbfb" +version = "0.2.1" + +[[deps.TOML]] +deps = ["Dates"] +uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76" +version = "1.0.3" + +[[deps.TableTraits]] +deps = ["IteratorInterfaceExtensions"] +git-tree-sha1 = "c06b2f539df1c6efa794486abfb6ed2022561a39" +uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c" +version = "1.0.1" + +[[deps.Tables]] +deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "OrderedCollections", "TableTraits", "Test"] +git-tree-sha1 = "1544b926975372da01227b382066ab70e574a3ec" +uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" +version = "1.10.1" + +[[deps.Tar]] +deps = ["ArgTools", "SHA"] +uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e" +version = "1.10.0" + +[[deps.Test]] +deps = ["InteractiveUtils", "Logging", "Random", "Serialization"] +uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" + +[[deps.TimeZones]] +deps = ["Dates", "Downloads", "InlineStrings", "LazyArtifacts", "Mocking", "Printf", "RecipesBase", "Scratch", "Unicode"] +git-tree-sha1 = "a92ec4466fc6e3dd704e2668b5e7f24add36d242" +uuid = "f269a46b-ccf7-5d73-abea-4c690281aa53" +version = "1.9.1" + +[[deps.TranscodingStreams]] +deps = ["Random", "Test"] +git-tree-sha1 = "0b829474fed270a4b0ab07117dce9b9a2fa7581a" +uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa" +version = "0.9.12" + +[[deps.TypedTables]] +deps = ["Adapt", "Dictionaries", "Indexing", "SplitApplyCombine", "Tables", "Unicode"] +git-tree-sha1 = "d911ae4e642cf7d56b1165d29ef0a96ba3444ca9" +uuid = "9d95f2ec-7b3d-5a63-8d20-e2491e220bb9" +version = "1.4.3" + +[[deps.UUIDs]] +deps = ["Random", "SHA"] +uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" + +[[deps.Unicode]] +uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" + +[[deps.WorkerUtilities]] +git-tree-sha1 = "cd1659ba0d57b71a464a29e64dbc67cfe83d54e7" +uuid = "76eceee3-57b5-4d4a-8e66-0e911cebbf60" +version = "1.6.1" + +[[deps.Zlib_jll]] +deps = ["Libdl"] +uuid = "83775a58-1f1d-513f-b197-d71354ab007a" +version = "1.2.13+0" + +[[deps.Zstd_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl"] +git-tree-sha1 = "49ce682769cd5de6c72dcf1b94ed7790cd08974c" +uuid = "3161d3a3-bdf6-5164-811a-617609db77b4" +version = "1.5.5+0" + +[[deps.libblastrampoline_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "8e850b90-86db-534c-a0d3-1478176c7d93" +version = "5.8.0+0" + +[[deps.nghttp2_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d" +version = "1.48.0+0" + +[[deps.p7zip_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0" +version = "17.4.0+0" diff --git a/inst/julia/Project.toml b/inst/julia/Project.toml new file mode 100644 index 0000000..32d0cec --- /dev/null +++ b/inst/julia/Project.toml @@ -0,0 +1,21 @@ +[deps] +DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" +GLM = "38e38edf-8417-5370-95a0-9cbb8c7f171a" +MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316" +ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca" +Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" +Random123 = "74087812-796a-5b5d-8853-05524746bad3" +StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" +StatsModels = "3eaba693-59b7-5ba5-a881-562e759f1c8d" +Suppressor = "fd094767-a336-5f1f-9728-57cf17d0bbfb" + +[compat] +DataFrames = "1.5" +GLM = "1.8.2" +MixedModels = "4.12" +ProgressMeter = "1.7" +Random123 = "1.6" +StatsBase = "0.33" +StatsModels = "0.7" +Suppressor = "0.2.1" +julia = "1.8" diff --git a/inst/julia/load-pkgs.jl b/inst/julia/load-pkgs.jl new file mode 100644 index 0000000..b23d279 --- /dev/null +++ b/inst/julia/load-pkgs.jl @@ -0,0 +1,9 @@ +using Suppressor +using Random +using Random123 +using StatsBase +using StatsModels +using GLM +using MixedModels +using DataFrames +using ProgressMeter diff --git a/man/calculate_clusters_pvalues.Rd b/man/calculate_clusters_pvalues.Rd new file mode 100644 index 0000000..f11c183 --- /dev/null +++ b/man/calculate_clusters_pvalues.Rd @@ -0,0 +1,88 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/calculate_pvalue.R +\name{calculate_clusters_pvalues} +\alias{calculate_clusters_pvalues} +\alias{clusters_are_comparable} +\title{Calculate bootstrapped p-values of cluster-mass statistics} +\usage{ +calculate_clusters_pvalues( + empirical_clusters, + null_cluster_dists, + add1 = FALSE +) + +clusters_are_comparable(empirical_clusters, null_cluster_dists, error = FALSE) +} +\arguments{ +\item{empirical_clusters}{A \code{empirical_clusters} object} + +\item{null_cluster_dists}{A \code{null_cluster_dists} object} + +\item{add1}{Whether to add 1 to the numerator and denominator when calculating the p-value. +Use \code{TRUE} to effectively count the observed statistic as part of the permuted +null distribution (recommended with larger \code{nsim} prior to publishing results).} + +\item{error}{Whether to throw an error if incompatible} +} +\value{ +An \code{empirical_clusters} object augmented with p-values. +} +\description{ +Calculate bootstrapped p-values of cluster-mass statistics +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Make empirical clusters +empirical_statistics <- compute_timewise_statistics(spec) +empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +empirical_clusters + +# Make null cluster-mass distribution +reset_rng_state() +null_statistics <- permute_timewise_statistics(spec, nsim = 100) +null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) + +# Significance test the empirical cluster(s) from each predictor against the simulated null +calculate_clusters_pvalues(empirical_clusters, null_cluster_dists) + +# Set `add1 = TRUE` to normalize by adding 1 to numerator and denominator +calculate_clusters_pvalues(empirical_clusters, null_cluster_dists, add1 = TRUE) + +# This sequence of procedures is equivalent to `clusterpermute()` +reset_rng_state() +clusterpermute(spec, threshold = 2, nsim = 100, progress = FALSE) + +# The empirical clusters and the null cluster-mass distribution must be comparable +empirical_clusters2 <- extract_empirical_clusters(empirical_statistics, threshold = 3) +# For example, below code errors because thresholds are different (2 vs. 3) +try( calculate_clusters_pvalues(empirical_clusters2, null_cluster_dists) ) + +# Check for compatibility with `clusters_are_comparable()` +clusters_are_comparable(empirical_clusters, null_cluster_dists) +clusters_are_comparable(empirical_clusters2, null_cluster_dists) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=extract_empirical_clusters]{extract_empirical_clusters()}}, \code{\link[=extract_null_cluster_dists]{extract_null_cluster_dists()}} +} diff --git a/man/cluster_permutation_tidiers.Rd b/man/cluster_permutation_tidiers.Rd new file mode 100644 index 0000000..1685600 --- /dev/null +++ b/man/cluster_permutation_tidiers.Rd @@ -0,0 +1,70 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/tidy.R +\name{cluster_permutation_tidiers} +\alias{cluster_permutation_tidiers} +\alias{tidy.timewise_statistics} +\alias{tidy.empirical_clusters} +\alias{tidy.null_cluster_dists} +\title{Tidiers for cluster permutation test objects} +\usage{ +\method{tidy}{timewise_statistics}(x, ...) + +\method{tidy}{empirical_clusters}(x, ...) + +\method{tidy}{null_cluster_dists}(x, ...) +} +\arguments{ +\item{x}{An object of class \verb{}, \verb{}, or \verb{}} + +\item{...}{Unused} +} +\value{ +A data frame +} +\description{ +Tidiers for cluster permutation test objects +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Method for `` +empirical_statistics <- compute_timewise_statistics(spec) +class(empirical_statistics) +tidy(empirical_statistics) + +reset_rng_state() +null_statistics <- permute_timewise_statistics(spec, nsim = 100) +class(null_statistics) +tidy(null_statistics) + +# Method for `` +empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +class(empirical_clusters) +tidy(empirical_clusters) + +# Method for `` +null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) +class(null_cluster_dists) +tidy(null_cluster_dists) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/man/clusterpermute.Rd b/man/clusterpermute.Rd new file mode 100644 index 0000000..ddee0bd --- /dev/null +++ b/man/clusterpermute.Rd @@ -0,0 +1,102 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/clusterpermute.R +\name{clusterpermute} +\alias{clusterpermute} +\title{Conduct a cluster-based permutation test} +\usage{ +clusterpermute( + jlmer_spec, + family = c("gaussian", "binomial"), + statistic = c("t", "chisq"), + threshold, + nsim = 100L, + predictors = NULL, + binned = FALSE, + top_n = Inf, + add1 = TRUE, + ..., + progress = TRUE +) +} +\arguments{ +\item{jlmer_spec}{Data prepped for jlmer from \code{make_jlmer_spec()}} + +\item{family}{A GLM family. Currently supports "gaussian" and "binomial".} + +\item{statistic}{Test statistic for calculating cluster mass. +Can be one of \code{"t"} (default) from the regression model output or +\code{"chisq"} from a likelihood ratio test (takes about twice as long to calculate).} + +\item{threshold}{The threshold value that the statistic must pass to contribute to cluster mass. +Interpretation differs on the choice of statistic (more below): +\itemize{ +\item If \code{statistic = "t"}, the threshold for t-value (beta/std.err) from the regression model. +\item If \code{statistic = "chisq"}, the threshold for the p-value of chi-squared statistics from likelihood ratio tests. +}} + +\item{nsim}{Number of simulations description} + +\item{predictors}{(Optional) a subset of predictors to test. Defaults to \code{NULL} which tests all predictors.} + +\item{binned}{Whether the data has been aggregated/collapsed into time bins. Defaults to \code{FALSE}, +which requires a cluster to span at least two time points. If \code{TRUE}, allows length-1 clusters to exist.} + +\item{top_n}{How many clusters to return, in the order of the size of the cluster-mass statistic. +Defaults to \code{Inf} which return all detected clusters.} + +\item{add1}{Whether to add 1 to the numerator and denominator when calculating the p-value. +Use \code{TRUE} to effectively count the observed statistic as part of the permuted +null distribution (recommended with larger \code{nsim} prior to publishing results).} + +\item{...}{Optional arguments passed to Julia for model fitting. +Defaults to \code{fast = TRUE} (when \code{family = "binomial"}) and \code{progress = FALSE}.} + +\item{progress}{Defaults to \code{TRUE}, which prints progress on each step of the cluster permutation test.} +} +\value{ +A list of \code{null_cluster_dists} and \code{empirical_clusters} with p-values +} +\description{ +Conduct a cluster-based permutation test +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Should minimally provide `threshold` and `nsim`, in addition to the spec object +reset_rng_state() +CPA <- clusterpermute(spec, threshold = 2, nsim = 100, progress = FALSE) +CPA + +# CPA is a list of `` and `` objects +sapply(CPA, class) + +# You can extract the individual components for further inspection +CPA$null_cluster_dists +CPA$empirical_clusters + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=compute_timewise_statistics]{compute_timewise_statistics()}}, \code{\link[=permute_timewise_statistics]{permute_timewise_statistics()}}, +\code{\link[=extract_empirical_clusters]{extract_empirical_clusters()}}, \code{\link[=extract_null_cluster_dists]{extract_null_cluster_dists()}}, +\code{\link[=calculate_clusters_pvalues]{calculate_clusters_pvalues()}} +} diff --git a/man/compute_timewise_statistics.Rd b/man/compute_timewise_statistics.Rd new file mode 100644 index 0000000..2dd471f --- /dev/null +++ b/man/compute_timewise_statistics.Rd @@ -0,0 +1,74 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/compute_timewise_statistics.R +\name{compute_timewise_statistics} +\alias{compute_timewise_statistics} +\title{Fit Julia regression models to each time point of a time series data} +\usage{ +compute_timewise_statistics( + jlmer_spec, + family = c("gaussian", "binomial"), + statistic = c("t", "chisq"), + ... +) +} +\arguments{ +\item{jlmer_spec}{Data prepped for jlmer from \code{make_jlmer_spec()}} + +\item{family}{A GLM family. Currently supports "gaussian" and "binomial".} + +\item{statistic}{Test statistic for calculating cluster mass. +Can be one of \code{"t"} (default) from the regression model output or +\code{"chisq"} from a likelihood ratio test (takes about twice as long to calculate).} + +\item{...}{Optional arguments passed to Julia for model fitting. +Defaults to \code{fast = TRUE} (when \code{family = "binomial"}) and \code{progress = FALSE}.} +} +\value{ +A predictor-by-time matrix of cluster statistics, of class \code{timewise_statistics}. +} +\description{ +Fit Julia regression models to each time point of a time series data +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Predictor x Time matrix of t-statistics from regression output +empirical_statistics <- compute_timewise_statistics(spec) +round(empirical_statistics, 2) + +# Collect as dataframe with `tidy()` +empirical_statistics_df <- tidy(empirical_statistics) +empirical_statistics_df + +# Timewise statistics are from regression models fitted to each time point +# - Note the identical statistics at `Time == 0` +empirical_statistics_df \%>\% + filter(time == 0) +to_jlmer(weight ~ 1 + Diet, filter(ChickWeight, Time == 0)) \%>\% + tidy() \%>\% + select(term, statistic) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=jlmer]{jlmer()}}, \code{\link[=make_jlmer_spec]{make_jlmer_spec()}} +} diff --git a/man/extract_empirical_clusters.Rd b/man/extract_empirical_clusters.Rd new file mode 100644 index 0000000..db02b92 --- /dev/null +++ b/man/extract_empirical_clusters.Rd @@ -0,0 +1,77 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/extract_clusters.R +\name{extract_empirical_clusters} +\alias{extract_empirical_clusters} +\title{Detect largest clusters from a time sequence of predictor statistics} +\usage{ +extract_empirical_clusters( + empirical_statistics, + threshold, + binned = FALSE, + top_n = Inf +) +} +\arguments{ +\item{empirical_statistics}{A predictor-by-time matrix of empirical timewise statistics.} + +\item{threshold}{The threshold value that the statistic must pass to contribute to cluster mass. +Interpretation differs on the choice of statistic (more below): +\itemize{ +\item If \code{statistic = "t"}, the threshold for t-value (beta/std.err) from the regression model. +\item If \code{statistic = "chisq"}, the threshold for the p-value of chi-squared statistics from likelihood ratio tests. +}} + +\item{binned}{Whether the data has been aggregated/collapsed into time bins. Defaults to \code{FALSE}, +which requires a cluster to span at least two time points. If \code{TRUE}, allows length-1 clusters to exist.} + +\item{top_n}{How many clusters to return, in the order of the size of the cluster-mass statistic. +Defaults to \code{Inf} which return all detected clusters.} +} +\value{ +An \code{empirical_clusters} object. +} +\description{ +Detect largest clusters from a time sequence of predictor statistics +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Empirical clusters are derived from the timewise statistics +empirical_statistics <- compute_timewise_statistics(spec) +empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +empirical_clusters + +# Collect as dataframe with `tidy()` +empirical_clusters_df <- tidy(empirical_clusters) +empirical_clusters_df + +# Changing the `threshold` value identifies different clusters +extract_empirical_clusters(empirical_statistics, threshold = 1) + +# A predictor can have zero or multiple clusters associated with it +extract_empirical_clusters(empirical_statistics, threshold = 3) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=compute_timewise_statistics]{compute_timewise_statistics()}} +} diff --git a/man/extract_null_cluster_dists.Rd b/man/extract_null_cluster_dists.Rd new file mode 100644 index 0000000..c3c41e6 --- /dev/null +++ b/man/extract_null_cluster_dists.Rd @@ -0,0 +1,70 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/extract_clusters.R +\name{extract_null_cluster_dists} +\alias{extract_null_cluster_dists} +\title{Construct a null distribution of cluster-mass statistics} +\usage{ +extract_null_cluster_dists(null_statistics, threshold, binned = FALSE) +} +\arguments{ +\item{null_statistics}{A simulation-by-time-by-predictor 3D array of null (permuted) timewise statistics.} + +\item{threshold}{The threshold value that the statistic must pass to contribute to cluster mass. +Interpretation differs on the choice of statistic (more below): +\itemize{ +\item If \code{statistic = "t"}, the threshold for t-value (beta/std.err) from the regression model. +\item If \code{statistic = "chisq"}, the threshold for the p-value of chi-squared statistics from likelihood ratio tests. +}} + +\item{binned}{Whether the data has been aggregated/collapsed into time bins. Defaults to \code{FALSE}, +which requires a cluster to span at least two time points. If \code{TRUE}, allows length-1 clusters to exist.} +} +\value{ +A \code{null_cluster_dists} object. +} +\description{ +Construct a null distribution of cluster-mass statistics +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Null cluster-mass distributions are derived from the permuted timewise statistics +reset_rng_state() +null_statistics <- permute_timewise_statistics(spec, nsim = 100) +null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) +null_cluster_dists + +# Collect as dataframe with `tidy()` +# - Each simulation contributes one (largest) cluster-mass statistic to the null +# - When no clusters are found, the `sum_statistic` value is zero +null_cluster_dists_df <- tidy(null_cluster_dists) +null_cluster_dists_df + +# Changing the `threshold` value changes the shape of the null +extract_null_cluster_dists(null_statistics, threshold = 1) +extract_null_cluster_dists(null_statistics, threshold = 3) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=permute_timewise_statistics]{permute_timewise_statistics()}} +} diff --git a/man/figures/README-chickweight-1.png b/man/figures/README-chickweight-1.png new file mode 100644 index 0000000..474c5d5 Binary files /dev/null and b/man/figures/README-chickweight-1.png differ diff --git a/man/figures/README-empirical_statistics-1.png b/man/figures/README-empirical_statistics-1.png new file mode 100644 index 0000000..b3b084a Binary files /dev/null and b/man/figures/README-empirical_statistics-1.png differ diff --git a/man/figures/README-plot-empirical_statistics-1.png b/man/figures/README-plot-empirical_statistics-1.png new file mode 100644 index 0000000..b3b084a Binary files /dev/null and b/man/figures/README-plot-empirical_statistics-1.png differ diff --git a/man/figures/clusterpermute_animation.gif b/man/figures/clusterpermute_animation.gif new file mode 100644 index 0000000..1bc841f Binary files /dev/null and b/man/figures/clusterpermute_animation.gif differ diff --git a/man/figures/clusterpermute_slice.png b/man/figures/clusterpermute_slice.png new file mode 100644 index 0000000..d6fc16d Binary files /dev/null and b/man/figures/clusterpermute_slice.png differ diff --git a/man/figures/deCarvalho_fig7.jpg b/man/figures/deCarvalho_fig7.jpg new file mode 100644 index 0000000..82c2310 Binary files /dev/null and b/man/figures/deCarvalho_fig7.jpg differ diff --git a/man/figures/desktop.ini b/man/figures/desktop.ini new file mode 100644 index 0000000..f76a748 --- /dev/null +++ b/man/figures/desktop.ini @@ -0,0 +1,3 @@ +[LocalizedFileNames] +clusterperm_anim.gif=@clusterperm_anim,0 +clusterperm_slice.png=@clusterperm_slice,0 diff --git a/man/figures/jlmerclusterperm_fn_design.png b/man/figures/jlmerclusterperm_fn_design.png new file mode 100644 index 0000000..9f2d1b4 Binary files /dev/null and b/man/figures/jlmerclusterperm_fn_design.png differ diff --git a/man/figures/jlmerclusterperm_logo_plot.png b/man/figures/jlmerclusterperm_logo_plot.png new file mode 100644 index 0000000..c5b85e1 Binary files /dev/null and b/man/figures/jlmerclusterperm_logo_plot.png differ diff --git a/man/figures/logo.png b/man/figures/logo.png new file mode 100644 index 0000000..cb3f67d Binary files /dev/null and b/man/figures/logo.png differ diff --git a/man/jlmer.Rd b/man/jlmer.Rd new file mode 100644 index 0000000..7692d41 --- /dev/null +++ b/man/jlmer.Rd @@ -0,0 +1,50 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/jlmer.R +\name{jlmer} +\alias{jlmer} +\title{Fit a Julia regression model using jlmer specifications} +\usage{ +jlmer(jlmer_spec, family = c("gaussian", "binomial"), ..., progress = FALSE) +} +\arguments{ +\item{jlmer_spec}{Data prepped for jlmer from \code{make_jlmer_spec()}} + +\item{family}{A GLM family. Currently supports "gaussian" and "binomial".} + +\item{...}{Optional arguments passed to Julia for model fitting.} + +\item{progress}{If \code{TRUE}, prints the timing of iterations.} +} +\value{ +A \code{jlmer_mod} object. +} +\description{ +Fit a Julia regression model using jlmer specifications +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +jlmerclusterperm_setup(verbose = FALSE) + +# Fitting a regression model with a specification object +spec <- make_jlmer_spec(weight ~ 1 + Diet, ChickWeight) +jlmer(spec) + +# `lm()` equivalent +summary(lm(weight ~ 1 + Diet, ChickWeight))$coef + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=make_jlmer_spec]{make_jlmer_spec()}} +} diff --git a/man/jlmerclusterperm-package.Rd b/man/jlmerclusterperm-package.Rd new file mode 100644 index 0000000..f353873 --- /dev/null +++ b/man/jlmerclusterperm-package.Rd @@ -0,0 +1,26 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/jlmerclusterperm-package.R +\docType{package} +\name{jlmerclusterperm-package} +\alias{jlmerclusterperm} +\alias{jlmerclusterperm-package} +\title{jlmerclusterperm: Cluster-Based Permutation Analysis for Densely Sampled Time Data} +\description{ +\if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}} + +An implementation of fast cluster-based permutation analysis (CPA) for densely-sampled time data developed in Maris & Oostenveld, 2007 \doi{10.1016/j.jneumeth.2007.03.024}. Supports (generalized, mixed-effects) regression models for the calculation of timewise statistics. Provides both a wholesale and a piecemeal interface to the CPA procedure with an emphasis on interpretability and diagnostics. Integrates 'Julia' libraries 'MixedModels.JL' and 'GLM.JL' for performance improvements, with additional functionalities for interfacing with 'Julia' from 'R' powered by the 'JuliaConnectoR' package. +} +\seealso{ +Useful links: +\itemize{ + \item \url{https://github.com/yjunechoe/jlmerclusterperm} + \item \url{https://yjunechoe.github.io/jlmerclusterperm/} + \item Report bugs at \url{https://github.com/yjunechoe/jlmerclusterperm/issues} +} + +} +\author{ +\strong{Maintainer}: June Choe \email{jchoe001@gmail.com} (\href{https://orcid.org/0000-0002-0701-921X}{ORCID}) [copyright holder] + +} +\keyword{internal} diff --git a/man/jlmerclusterperm_setup.Rd b/man/jlmerclusterperm_setup.Rd new file mode 100644 index 0000000..38edb94 --- /dev/null +++ b/man/jlmerclusterperm_setup.Rd @@ -0,0 +1,34 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/aaa.R +\name{jlmerclusterperm_setup} +\alias{jlmerclusterperm_setup} +\title{Initial setup for the jlmerclusterperm package} +\usage{ +jlmerclusterperm_setup(..., restart = TRUE, verbose = TRUE) +} +\arguments{ +\item{...}{Ignored.} + +\item{restart}{Whether to set up a fresh Julia session, given that one is already running. +If \code{FALSE} and \code{jlmerclusterperm_setup()} has already been called, nothing happens.} + +\item{verbose}{Print progress and messages from Julia in the console} +} +\value{ +TRUE +} +\description{ +Initial setup for the jlmerclusterperm package +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/man/julia_model_tidiers.Rd b/man/julia_model_tidiers.Rd new file mode 100644 index 0000000..d0b68b4 --- /dev/null +++ b/man/julia_model_tidiers.Rd @@ -0,0 +1,54 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/tidy.R +\name{julia_model_tidiers} +\alias{julia_model_tidiers} +\alias{tidy.jlmer_mod} +\alias{glance.jlmer_mod} +\title{Tidier methods for Julia regression models} +\usage{ +\method{tidy}{jlmer_mod}(x, effects = c("var_model", "ran_pars", "fixed"), ...) + +\method{glance}{jlmer_mod}(x, ...) +} +\arguments{ +\item{x}{An object of class \code{jlmer_mod}} + +\item{effects}{One of "var_model", "ran_pars", or "fixed"} + +\item{...}{Unused} +} +\value{ +A data frame +} +\description{ +Tidier methods for Julia regression models +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +# Fixed-effects only model +mod1 <- to_jlmer(weight ~ 1 + Diet, ChickWeight) +tidy(mod1) +glance(mod1) + +# Mixed model +mod2 <- to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight) +tidy(mod2) +glance(mod2) + +# Select which of fixed/random effects to return +tidy(mod2, effects = "fixed") +tidy(mod2, effects = "ran_pars") + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/man/julia_progress.Rd b/man/julia_progress.Rd new file mode 100644 index 0000000..042a21c --- /dev/null +++ b/man/julia_progress.Rd @@ -0,0 +1,48 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/interop-utils.R +\name{julia_progress} +\alias{julia_progress} +\title{Set/get options for Julia progress bar} +\usage{ +julia_progress(show, width) +} +\arguments{ +\item{show}{Whether to show the progress bar. You may also pass in a list of \code{"show"} and \code{"width"}.} + +\item{width}{Width of the progress bar. If \code{"auto"}, adjusts the progress bar width to fit the console.} +} +\value{ +Previous values for \code{show} and \code{width} +} +\description{ +Set/get options for Julia progress bar +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +} + +# Show current progress options +julia_progress() + +# Set options and save previous options +old_progress_opts <- julia_progress(show = FALSE, width = 100) +julia_progress() + +# Restoring progress settings by passing a list of old options +old_progress_opts +julia_progress(old_progress_opts) +identical(julia_progress(), old_progress_opts) + +# Alternatively, reset to default settings using this syntax: +julia_progress(show = TRUE, width = "auto") + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/man/julia_rng.Rd b/man/julia_rng.Rd new file mode 100644 index 0000000..4a85704 --- /dev/null +++ b/man/julia_rng.Rd @@ -0,0 +1,63 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/julia_rng.R +\name{julia_rng} +\alias{julia_rng} +\alias{set_rng_state} +\alias{reset_rng_state} +\alias{get_rng_state} +\alias{set_rng_seed} +\alias{get_rng_seed} +\title{Interface to the Julia RNG} +\usage{ +set_rng_state(i) + +reset_rng_state() + +get_rng_state() + +set_rng_seed(seed) + +get_rng_seed() +} +\arguments{ +\item{i}{Counter number} + +\item{seed}{Seed} +} +\value{ +The current seed or counter +} +\description{ +Interface to the Julia RNG +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +# RNG initializes to seed=1 counter=0 +get_rng_seed() +get_rng_state() + +# setter/getter for RNG counter +set_rng_state(123) +get_rng_state() + +# setter/getter for RNG seed +set_rng_seed(2) +get_rng_seed() + +# restore to initial setting (seed=1, counter=0) +set_rng_seed(1) +set_rng_state(0) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/man/make_jlmer_spec.Rd b/man/make_jlmer_spec.Rd new file mode 100644 index 0000000..9225b1a --- /dev/null +++ b/man/make_jlmer_spec.Rd @@ -0,0 +1,60 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/jlmer_spec.R +\name{make_jlmer_spec} +\alias{make_jlmer_spec} +\title{Create a specifications object for fitting regression models in Julia} +\usage{ +make_jlmer_spec( + formula, + data, + subject = NULL, + trial = NULL, + time = NULL, + drop_terms = NULL, + ... +) +} +\arguments{ +\item{formula}{Model formula in R syntax} + +\item{data}{A data frame} + +\item{subject}{Column for subjects in the data.} + +\item{trial}{Column for trials in the data. Must uniquely identify a time series within subject +(for example, the column for items in a counterbalanced design where each subject sees exactly one item).} + +\item{time}{Column for time in the data.} + +\item{drop_terms}{(Optional) any terms to drop from the reconstructed model formula} + +\item{...}{Unused, for extensibility.} +} +\value{ +An object of class \code{jlmer_spec}. +} +\description{ +Create a specifications object for fitting regression models in Julia +} +\examples{ +# Bare specification object (minimal spec for fitting a global model) +spec <- make_jlmer_spec(weight ~ 1 + Diet, ChickWeight) +spec + +# Constraints on specification for CPA: +# 1) The combination of `subject`, `trial`, and `time` must uniquely identify rows in the data +# 2) `time` must have constant sampling rate (i.e., evenly spaced values) +spec_wrong <- make_jlmer_spec( + weight ~ 1 + Diet, ChickWeight, + time = "Time" +) +unique(ChickWeight$Time) + +# Corrected specification for the above +spec_correct <- make_jlmer_spec( + weight ~ 1 + Diet, subset(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec_correct + +} diff --git a/man/permute_by_predictor.Rd b/man/permute_by_predictor.Rd new file mode 100644 index 0000000..bf67b41 --- /dev/null +++ b/man/permute_by_predictor.Rd @@ -0,0 +1,83 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/permute.R +\name{permute_by_predictor} +\alias{permute_by_predictor} +\title{Permute data while respecting grouping structure(s) of observations} +\usage{ +permute_by_predictor( + jlmer_spec, + predictors, + predictor_type = c("guess", "between_participant", "within_participant"), + n = 1L +) +} +\arguments{ +\item{jlmer_spec}{Data prepped for jlmer from \code{make_jlmer_spec()}} + +\item{predictors}{A vector of terms from the model. If multiple, the must form the levels of one predictor.} + +\item{predictor_type}{Whether the predictor is \code{"between_participant"} or \code{"within_participant"}. Defaults to \code{"guess"}.} + +\item{n}{Number of permuted samples of the data to generate. Defaults to \code{1L}.} +} +\value{ +A long dataframe of permuted re-samples with \code{.id} column representing replication IDs. +} +\description{ +Permute data while respecting grouping structure(s) of observations +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +# Example data setup +chickweights_df <- ChickWeight +chickweights_df <- chickweights_df[chickweights_df$Time <= 20, ] +chickweights_df$DietInt <- as.integer(chickweights_df$Diet) +head(chickweights_df) + +# Example 1: Spec object using the continuous `DietInt` predictor +chickweights_spec1 <- make_jlmer_spec( + formula = weight ~ 1 + DietInt, + data = chickweights_df, + subject = "Chick", time = "Time" +) +chickweights_spec1 + +# Shuffling `DietInt` values guesses `predictor_type = "between_participant"` +reset_rng_state() +spec1_perm1 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt") +# This calls the same shuffling algorithm for CPA in Julia, so counter is incremented +get_rng_state() + +# Shuffling under shared RNG state reproduces the same permutation of the data +reset_rng_state() +spec1_perm2 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt") +identical(spec1_perm1, spec1_perm2) + +# Example 2: Spec object using the multilevel `Diet` predictor +chickweights_spec2 <- make_jlmer_spec( + formula = weight ~ 1 + Diet, + data = chickweights_df, + subject = "Chick", time = "Time" +) +chickweights_spec2 + +# Levels of a category are automatically shuffled together +reset_rng_state() +spec2_perm1 <- permute_by_predictor(chickweights_spec2, predictors = "Diet2") +reset_rng_state() +spec2_perm2 <- permute_by_predictor(chickweights_spec2, predictors = c("Diet2", "Diet3", "Diet4")) +identical(spec2_perm1, spec2_perm2) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/man/permute_timewise_statistics.Rd b/man/permute_timewise_statistics.Rd new file mode 100644 index 0000000..fadbfef --- /dev/null +++ b/man/permute_timewise_statistics.Rd @@ -0,0 +1,82 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/permute_timewise_statistics.R +\name{permute_timewise_statistics} +\alias{permute_timewise_statistics} +\title{Simulate cluster-mass statistics via bootstrapped permutations} +\usage{ +permute_timewise_statistics( + jlmer_spec, + family = c("gaussian", "binomial"), + statistic = c("t", "chisq"), + nsim = 100L, + predictors = NULL, + ... +) +} +\arguments{ +\item{jlmer_spec}{Data prepped for jlmer from \code{make_jlmer_spec()}} + +\item{family}{A GLM family. Currently supports "gaussian" and "binomial".} + +\item{statistic}{Test statistic for calculating cluster mass. +Can be one of \code{"t"} (default) from the regression model output or +\code{"chisq"} from a likelihood ratio test (takes about twice as long to calculate).} + +\item{nsim}{Number of simulations description} + +\item{predictors}{(Optional) a subset of predictors to test. Defaults to \code{NULL} which tests all predictors.} + +\item{...}{Optional arguments passed to Julia for model fitting. +Defaults to \code{fast = TRUE} (when \code{family = "binomial"}) and \code{progress = FALSE}.} +} +\value{ +A simulation-by-time-by-predictor 3D array of cluster statistics, of class \code{timewise_statistics}. +} +\description{ +Simulate cluster-mass statistics via bootstrapped permutations +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Simulation x Time x Predictor array of t-statistics from regression output +reset_rng_state() +null_statistics <- permute_timewise_statistics(spec, nsim = 3) +round(null_statistics, 2) + +# Collect as dataframe with `tidy()` +permuted_timewise_stats_df <- tidy(null_statistics) +permuted_timewise_stats_df + +# Permutations ran under the same RNG state are identical +reset_rng_state() +null_statistics2 <- permute_timewise_statistics(spec, nsim = 3) +identical(null_statistics, null_statistics2) + +get_rng_state() +null_statistics3 <- permute_timewise_statistics(spec, nsim = 3) +identical(null_statistics, null_statistics3) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=make_jlmer_spec]{make_jlmer_spec()}} +} diff --git a/man/reexports.Rd b/man/reexports.Rd new file mode 100644 index 0000000..c7fc9ed --- /dev/null +++ b/man/reexports.Rd @@ -0,0 +1,17 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/tidy.R +\docType{import} +\name{reexports} +\alias{reexports} +\alias{tidy} +\alias{glance} +\title{Objects exported from other packages} +\keyword{internal} +\description{ +These objects are imported from other packages. Follow the links +below to see their documentation. + +\describe{ + \item{generics}{\code{\link[generics]{glance}}, \code{\link[generics]{tidy}}} +}} + diff --git a/man/to_jlmer.Rd b/man/to_jlmer.Rd new file mode 100644 index 0000000..788a0f0 --- /dev/null +++ b/man/to_jlmer.Rd @@ -0,0 +1,66 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/jlmer.R +\name{to_jlmer} +\alias{to_jlmer} +\title{Fit a Julia regression model using lme4 syntax} +\usage{ +to_jlmer( + formula, + data, + family = c("gaussian", "binomial"), + jlmer_spec_opts = list(), + ..., + progress = FALSE +) +} +\arguments{ +\item{formula}{Model formula in R syntax} + +\item{data}{A data frame} + +\item{family}{A GLM family. Currently supports "gaussian" and "binomial".} + +\item{jlmer_spec_opts}{List of options passed to \code{make_jlmer_spec()}} + +\item{...}{Optional arguments passed to Julia for model fitting.} + +\item{progress}{If \code{TRUE}, prints the timing of iterations.} +} +\value{ +A \code{jlmer_mod} object. +} +\description{ +Fit a Julia regression model using lme4 syntax +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +jlmerclusterperm_setup(verbose = FALSE) + +# Fitting a regression model with R formula syntax +to_jlmer(weight ~ 1 + Diet, ChickWeight) + +# `lm()` equivalent +summary(lm(weight ~ 1 + Diet, ChickWeight))$coef + +# Fitting a mixed model with {lme4} syntax +to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight) + +# Passing MixedModels.jl fit options to the `...` +to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight, REML = TRUE) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} +\seealso{ +\code{\link[=jlmer]{jlmer()}}, \code{\link[=make_jlmer_spec]{make_jlmer_spec()}} +} diff --git a/man/walk_threshold_steps.Rd b/man/walk_threshold_steps.Rd new file mode 100644 index 0000000..9adbe8a --- /dev/null +++ b/man/walk_threshold_steps.Rd @@ -0,0 +1,73 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/threshold_search.R +\name{walk_threshold_steps} +\alias{walk_threshold_steps} +\title{Test the probability of cluster-mass statistics over a range of threshold values} +\usage{ +walk_threshold_steps( + empirical_statistics, + null_statistics, + steps, + top_n = Inf, + binned = FALSE, + add1 = TRUE, + progress = TRUE +) +} +\arguments{ +\item{empirical_statistics}{A predictor-by-time matrix of empirical timewise statistics.} + +\item{null_statistics}{A simulation-by-time-by-predictor 3D array of null (permuted) timewise statistics.} + +\item{steps}{A vector of threshold values to test} + +\item{top_n}{How many clusters to return, in the order of the size of the cluster-mass statistic. +Defaults to \code{Inf} which return all detected clusters.} + +\item{binned}{Whether the data has been aggregated/collapsed into time bins. Defaults to \code{FALSE}, +which requires a cluster to span at least two time points. If \code{TRUE}, allows length-1 clusters to exist.} + +\item{add1}{Whether to add 1 to the numerator and denominator when calculating the p-value. +Use \code{TRUE} to effectively count the observed statistic as part of the permuted +null distribution (recommended with larger \code{nsim} prior to publishing results).} + +\item{progress}{Whether to display a progress bar} +} +\value{ +A data frame of predictor clusters-mass statistics by threshold. +} +\description{ +Test the probability of cluster-mass statistics over a range of threshold values +} +\examples{ +\dontshow{if (JuliaConnectoR::juliaSetupOk()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} +\donttest{ +\dontshow{ +options("jlmerclusterperm.nthreads" = 2) +jlmerclusterperm_setup(verbose = FALSE) +julia_progress(show = FALSE) +} + +library(dplyr, warn.conflicts = FALSE) + +# Specification object +spec <- make_jlmer_spec( + weight ~ 1 + Diet, filter(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +spec + +# Compute timewise statistics for the observed and permuted data +empirical_statistics <- compute_timewise_statistics(spec) +reset_rng_state() +null_statistics <- permute_timewise_statistics(spec, nsim = 100) + +# Test cluster mass/probability under different threshold values +walk_threshold_steps(empirical_statistics, null_statistics, steps = 1:3) + +\dontshow{ +JuliaConnectoR::stopJulia() +} +} +\dontshow{\}) # examplesIf} +} diff --git a/tests/testthat.R b/tests/testthat.R new file mode 100644 index 0000000..83ea3c4 --- /dev/null +++ b/tests/testthat.R @@ -0,0 +1,12 @@ +# This file is part of the standard setup for testthat. +# It is recommended that you do not modify it. +# +# Where should you do additional test configuration? +# Learn more about the roles of various files in: +# * https://r-pkgs.org/tests.html +# * https://testthat.r-lib.org/reference/test_package.html#special-files + +library(testthat) +library(jlmerclusterperm) + +test_check("jlmerclusterperm") diff --git a/tests/testthat/helper-skip.R b/tests/testthat/helper-skip.R new file mode 100644 index 0000000..6877201 --- /dev/null +++ b/tests/testthat/helper-skip.R @@ -0,0 +1,8 @@ +no_julia <- !JuliaConnectoR::juliaSetupOk() +skip_conditionally <- function() { + # skip_on_cran() + if (no_julia) { + skip("Julia setup not okay.") + } + invisible() +} diff --git a/tests/testthat/test-aaa.R b/tests/testthat/test-aaa.R new file mode 100644 index 0000000..c984f30 --- /dev/null +++ b/tests/testthat/test-aaa.R @@ -0,0 +1,26 @@ +skip_conditionally() + +#' @srrstats {G5.2} Appropriate message/warning/error tests are in `/tests/testthat` + +# test_that("Setup works", { +# expect_true(jlmerclusterperm_setup()) +# }) + +test_that("Setup with seed works (use 2 for testing)", { + options("jlmerclusterperm.nthreads" = 2) + expect_true(jlmerclusterperm_setup(verbose = FALSE)) + expect_equal(JuliaConnectoR::juliaEval("Threads.nthreads()"), 2) +}) +# +# test_that("Restart as default", { +# expect_true(jlmerclusterperm_setup(verbose = FALSE)) +# }) + +test_that("Don't restart if FALSE", { + expect_message(jlmerclusterperm_setup(restart = FALSE), "already running") +}) + +test_that("RNG initializes to seed=1 counter=0", { + expect_equal(get_rng_seed(), 1) + expect_equal(get_rng_state(), 0) +}) diff --git a/tests/testthat/test-clusterpermute.R b/tests/testthat/test-clusterpermute.R new file mode 100644 index 0000000..8b8ccac --- /dev/null +++ b/tests/testthat/test-clusterpermute.R @@ -0,0 +1,57 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +#' @srrstats {G5.0} Uses the built-in `ChickWeight` dataset for tests +spec <- make_jlmer_spec( + weight ~ 1 + Diet, subset(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) + +#' @srrstats {G5.5} Uses shared Julia RNG state to test correctness +#' @srrstats {G5.9} Tests for stochastic nature of the CPA under different RNG states +test_that("CPAs under the same RNG state are identical", { + reset_rng_state() + a <- clusterpermute(spec, threshold = 2, progress = FALSE) + reset_rng_state() + b <- clusterpermute(spec, threshold = 2, progress = FALSE) + expect_equal(a, b) +}) + +# piecemeal vs. wholesale +reset_rng_state() +wholesale <- clusterpermute(spec, threshold = 2, progress = FALSE) +reset_rng_state() +empirical_statistics <- compute_timewise_statistics(spec) +empirical_clusters <- extract_empirical_clusters(empirical_statistics, threshold = 2) +null_statistics <- permute_timewise_statistics(spec) +null_cluster_dists <- extract_null_cluster_dists(null_statistics, threshold = 2) +empirical_clusters_tested <- calculate_clusters_pvalues(empirical_clusters, null_cluster_dists, add1 = TRUE) +reset_rng_state() + +test_that("Piecemeal and wholesale CPAs are identical", { + expect_equal(wholesale$null_cluster_dists, null_cluster_dists) + expect_equal(tidy(wholesale$null_cluster_dists), tidy(null_cluster_dists)) + expect_equal(wholesale$empirical_clusters, empirical_clusters_tested) + expect_equal(tidy(wholesale$empirical_clusters), tidy(empirical_clusters_tested)) +}) + +test_that("Errors on incompatible clusters", { + expect_error(calculate_clusters_pvalues(extract_empirical_clusters(empirical_statistics, threshold = 3), null_cluster_dists)) + expect_error(calculate_clusters_pvalues(extract_empirical_clusters(compute_timewise_statistics(spec, statistic = "chisq"), threshold = .05), null_cluster_dists)) + expect_error(calculate_clusters_pvalues(empirical_clusters, extract_null_cluster_dists(null_statistics, threshold = 3))) +}) + +test_that("Errors on no predictors", { + spec_intercept <- make_jlmer_spec( + weight ~ 1, subset(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" + ) + expect_error(clusterpermute(spec_intercept, threshold = 1.5, nsim = 1), "No predictors to permute") + expect_error(permute_timewise_statistics(spec_intercept, threshold = 1.5, nsim = 1), "No predictors to permute") +}) + +# Coverage +lapply(list(empirical_statistics, empirical_clusters, null_statistics, null_cluster_dists, empirical_clusters_tested), tidy) +lapply(list(spec, empirical_statistics, empirical_clusters, null_statistics, null_cluster_dists, empirical_clusters_tested), print) +print(wholesale) diff --git a/tests/testthat/test-jlmer.R b/tests/testthat/test-jlmer.R new file mode 100644 index 0000000..a6cb6a5 --- /dev/null +++ b/tests/testthat/test-jlmer.R @@ -0,0 +1,33 @@ +skip_conditionally() + +#' @srrstats {G5.3} Tests for correctness but no explicit test for NA +#' @srrstats {G5.4} Tests compare to R `lm()` output. + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +spec_lm <- make_jlmer_spec(weight ~ 1 + Diet, ChickWeight) +jlm1 <- to_jlmer(weight ~ 1 + Diet, ChickWeight) +jlm2 <- jlmer(spec_lm) + +#' @srrstats {RE7.3} Tests for `print()`, `tidy()`, and `glance()` +test_that("direct and indirect fits identical", { + expect_equal(tidy(jlm1), tidy(jlm2)) + expect_equal(glance(jlm1), glance(jlm2)) +}) + +test_that("returns julia object", { + expect_s3_class(jlm1, "jlmer_mod") + expect_s3_class(jlm2, "jlmer_mod") +}) + +spec_lme <- make_jlmer_spec(weight ~ 1 + Diet + (1 | Chick), ChickWeight) +jlme1 <- to_jlmer(weight ~ 1 + Diet + (1 | Chick), ChickWeight) +jlme2 <- jlmer(spec_lme) + +test_that("direct and indirect fits identical - mixed", { + expect_equal(tidy(jlme1), tidy(jlme2)) + expect_equal(glance(jlme1), glance(jlme2)) +}) + +print(jlm1) +print(jlme1) diff --git a/tests/testthat/test-jlmer_spec.R b/tests/testthat/test-jlmer_spec.R new file mode 100644 index 0000000..4fbecc1 --- /dev/null +++ b/tests/testthat/test-jlmer_spec.R @@ -0,0 +1,13 @@ +test_that("Minimal specification without grouping structures allowed", { + expect_equal(names(make_jlmer_spec(weight ~ 1 + Diet, ChickWeight)), c("formula", "data", "meta")) + expect_equal(names(make_jlmer_spec(weight ~ 1 + Diet + (1 | Chick), ChickWeight)), c("formula", "data", "meta")) +}) + +test_that("Inform misspecifications in grouping structure", { + expect_message(make_jlmer_spec(weight ~ 1 + Diet, subset(ChickWeight, Time <= 20), time = "Time")) + expect_message(make_jlmer_spec(weight ~ 1 + Diet, ChickWeight, subject = "Chick", time = "Time")) +}) + +test_that("Warn uneven time sampling rate", { + expect_message(make_jlmer_spec(weight ~ 1 + Diet, ChickWeight, subject = "Chick", time = "Time"), "Sampling rate") +}) diff --git a/tests/testthat/test-julia_rng.R b/tests/testthat/test-julia_rng.R new file mode 100644 index 0000000..9753d3f --- /dev/null +++ b/tests/testthat/test-julia_rng.R @@ -0,0 +1,28 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +test_that("RNG counter setter/getter", { + expect_equal(set_rng_state(123), 123) + expect_equal(get_rng_state(), 123) +}) + +test_that("RNG seed setter/getter", { + expect_equal(set_rng_seed(2), 2) + expect_equal(get_rng_seed(), 2) +}) + +test_that("RNG restore", { + set_rng_seed(1) + expect_equal(reset_rng_state(), 0) + expect_equal(get_rng_seed(), 1) + expect_equal(get_rng_state(), 0) +}) + +test_that("RNG generate random seed", { + expect_message(rand_seed <- set_rng_seed(), "Using randomly generated seed") + expect_equal(rand_seed, get_rng_seed()) +}) + +set_rng_seed(1) +reset_rng_state() diff --git a/tests/testthat/test-permute.R b/tests/testthat/test-permute.R new file mode 100644 index 0000000..b4fccf4 --- /dev/null +++ b/tests/testthat/test-permute.R @@ -0,0 +1,58 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +# Example 1 +chickweights_df <- ChickWeight +chickweights_df <- chickweights_df[chickweights_df$Time <= 20, ] +chickweights_df$DietInt <- as.integer(chickweights_df$Diet) +chickweights_spec1 <- make_jlmer_spec( + formula = weight ~ 1 + DietInt, + data = chickweights_df, + subject = "Chick", time = "Time" +) +chickweights_spec1 + +test_that("preserves participant structure", { + p_str_original <- table(unique(chickweights_spec1$data[, c("Chick", "DietInt")])$DietInt) + permuted <- permute_by_predictor(chickweights_spec1, predictors = "DietInt", predictor_type = "between_participant") + p_str_permuted <- table(unique(permuted[, c("Chick", "DietInt")])$DietInt) + expect_equal(p_str_original, p_str_permuted) +}) + +test_that("preserves temporal structure", { + t_str_original <- unname(sort(tapply(chickweights_spec1$data$weight, chickweights_spec1$data$Chick, toString))) + permuted <- permute_by_predictor(chickweights_spec1, predictors = "DietInt", predictor_type = "between_participant") + t_str_permuted <- unname(sort(tapply(permuted$weight, permuted$Chick, toString))) + expect_equal(t_str_original, t_str_permuted) +}) + +test_that("guesses type", { + reset_rng_state() + expect_message(spec1_perm1 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt"), "between_participant") +}) + +test_that("increments counter", { + expect_true(get_rng_state() > 0) +}) + +test_that("shuffling reproducibility", { + reset_rng_state() + spec1_perm1 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt", predictor_type = "between_participant") + reset_rng_state() + spec1_perm2 <- permute_by_predictor(chickweights_spec1, predictors = "DietInt", predictor_type = "between_participant") + expect_equal(spec1_perm1, spec1_perm2) +}) + +test_that("levels of a category shuffled together", { + chickweights_spec2 <- make_jlmer_spec( + formula = weight ~ 1 + Diet, + data = chickweights_df, + subject = "Chick", time = "Time" + ) + reset_rng_state() + expect_message(spec2_perm1 <- permute_by_predictor(chickweights_spec2, predictors = "Diet2", predictor_type = "between_participant"), "Diet3") + reset_rng_state() + spec2_perm2 <- permute_by_predictor(chickweights_spec2, predictors = c("Diet2", "Diet3", "Diet4"), predictor_type = "between_participant") + expect_equal(spec2_perm1, spec2_perm2) +}) diff --git a/tests/testthat/test-progress.R b/tests/testthat/test-progress.R new file mode 100644 index 0000000..3e0d6dd --- /dev/null +++ b/tests/testthat/test-progress.R @@ -0,0 +1,23 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +test_that("No side effects when called empty", { + expect_equal(julia_progress(), julia_progress()) +}) + +old_opts <- julia_progress() + +test_that("Side effect by show, width, or both", { + julia_progress(show = FALSE) + expect_equal(julia_progress()$show, FALSE) + julia_progress(width = 100) + expect_equal(julia_progress()$width, 100) + julia_progress(old_opts) + expect_equal(julia_progress(), old_opts) +}) + +test_that("Restore startup option", { + julia_progress(show = TRUE, width = "auto") + expect_equal(old_opts, julia_progress()) +}) diff --git a/tests/testthat/test-singularity.R b/tests/testthat/test-singularity.R new file mode 100644 index 0000000..ebfd5ad --- /dev/null +++ b/tests/testthat/test-singularity.R @@ -0,0 +1,16 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +singularity_spec <- make_jlmer_spec( + formula = weight ~ 1 + Diet + (1 + Diet | Chick), + data = subset(transform(ChickWeight, Diet = as.integer(Chick)), Chick %in% 1:5 & Time <= 20), + subject = "Chick", time = "Time" +) + +test_that("Informs on singularity", { + expect_message(expect_message(expect_message( + compute_timewise_statistics(singularity_spec), + "singular fits" + ), "number of components"), "Chick") +}) diff --git a/tests/testthat/test-threshold_search.R b/tests/testthat/test-threshold_search.R new file mode 100644 index 0000000..117c0e5 --- /dev/null +++ b/tests/testthat/test-threshold_search.R @@ -0,0 +1,16 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) +reset_rng_state() + +spec <- make_jlmer_spec( + weight ~ 1 + Diet, subset(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) +empirical_statistics <- compute_timewise_statistics(spec) +null_statistics <- permute_timewise_statistics(spec, nsim = 100) + +test_that("tests all steps", { + out <- walk_threshold_steps(empirical_statistics, null_statistics, steps = 1:3) + expect_equal(sort(unique(out$threshold)), 1:3) +}) diff --git a/tests/testthat/test-timewise_statistics.R b/tests/testthat/test-timewise_statistics.R new file mode 100644 index 0000000..5210377 --- /dev/null +++ b/tests/testthat/test-timewise_statistics.R @@ -0,0 +1,38 @@ +skip_conditionally() + +jlmerclusterperm_setup(restart = FALSE, verbose = FALSE) + +spec <- make_jlmer_spec( + weight ~ 1 + Diet, subset(ChickWeight, Time <= 20), + subject = "Chick", time = "Time" +) + +empirical_ts <- compute_timewise_statistics(spec, statistic = "t") +empirical_chisqs <- compute_timewise_statistics(spec, statistic = "chisq") + +test_that("chisq bound by 0-1", { + expect_error(extract_empirical_clusters(empirical_chisqs, threshold = 2)) +}) + +test_that("dims of empirical stats", { + expect_equal(dim(empirical_ts), c(3, 11)) + expect_equal(dim(empirical_chisqs), c(1, 11)) +}) + +test_that("tidy dims of empirical stats", { + expect_equal(dim(tidy(empirical_ts)), c(33, 3)) + expect_equal(dim(tidy(empirical_chisqs)), c(11, 3)) +}) + +null_ts <- permute_timewise_statistics(spec, statistic = "t") +null_chisqs <- permute_timewise_statistics(spec, statistic = "chisq") + +test_that("dims of null stats", { + expect_equal(dim(null_ts), c(100, rev(dim(empirical_ts)))) + expect_equal(dim(null_chisqs), c(100, rev(dim(empirical_chisqs)))) +}) + +test_that("tidy dims of null stats", { + expect_equal(dim(tidy(null_ts)), c(nrow(tidy(empirical_ts)) * 100, 4)) + expect_equal(dim(tidy(null_chisqs)), c(nrow(tidy(empirical_chisqs)) * 100, 4)) +}) diff --git a/vignettes/jlmerclusterperm.Rmd b/vignettes/jlmerclusterperm.Rmd new file mode 100644 index 0000000..c987ab1 --- /dev/null +++ b/vignettes/jlmerclusterperm.Rmd @@ -0,0 +1,77 @@ +--- +title: "Introduction to jlmerclusterperm" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Introduction to jlmerclusterperm} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) +``` + +`jlmerclusterperm` is an interface to regression-based CPAs which supports both wholesale and piecemeal approaches to conducting the test. + +## Overview of CPA + +**Cluster-based permutation analysis (CPA)** is a simulation-based, non-parametric statistical test of difference between groups in a time series. It is suitable for analyzing **densely-sampled time data** (such as in EEG and eye-tracking research) where the research hypothesis is often specified up to the existence of an effect (e.g., as predicted by higher-order cognitive mechanisms) but agnostic to the temporal details of the effect (such as the precise moment of its emergence). + +CPA formalizes two intuitions about what it means for there to be a difference between groups: + +1) The countable unit of difference (i.e., a **cluster**) is a contiguous, uninterrupted span of _sufficiently large_ differences at each time point (determined via a _threshold_). + +2) The degree of extremity of a cluster (i.e., the **cluster-mass statistic**) is a measure that is sensitive to the magnitude of the difference, its variability, and the sample size (e.g., the t-statistic from a regression). + +The CPA procedure identifies empirical clusters in a time series data and tests the significance of the observed cluster-mass statistics against **bootstrapped permutations** of the data. The bootstrapping procedure samples from the "null" (via random shuffling of condition labels), yielding a distribution of cluster-mass statistics emerging from chance. The statistical significance of a cluster is the probability of observing a cluster-mass statistic as extreme as the cluster's against the simulated null distribution. + +## Package design + +![jlmerclusterperm package function design](https://raw.githubusercontent.com/yjunechoe/jlmerclusterperm/main/man/figures/jlmerclusterperm_fn_design.png) + +`jlmerclusterperm` provides both a wholesale and a piecemeal approach to conducting a CPA. The main workhorse function in the package is `clusterpermute()`, which is composed of five smaller functions that are called internally in succession. The smaller functions representing the algorithmic steps of a CPA are also exported, to allow more control over the procedure (e.g., for debugging and diagnosing a CPA run). + +See the [function documentation](https://yjunechoe.github.io/jlmerclusterperm/reference/index.html#cluster-based-permutation-analysis) for more. + +## Organization of vignettes + +The package [vignettes](https://yjunechoe.github.io/jlmerclusterperm/articles/) are roughly divided into two groups: **topics** and **case studies**. If you are a researcher with data ready for analysis, it is recommended to go through the case studies first and pluck out the relevant bits for your own desired analysis + +In the order of increasing complexity, the **case studies** are: + +1) [Garrison et al. 2020](https://yjunechoe.github.io/jlmerclusterperm/articles/Garrison-et-al-2020.html), which introduces the package's functions for running a CPA, demonstrating both wholesale and piecemeal approaches. It also compares the use of linear vs. logistic regression for the calculation of timewise statistics. + +2) [Geller et al. 2020](https://yjunechoe.github.io/jlmerclusterperm/articles/Geller-et-al-2020.html), which demonstrates the use of random effects and regression contrasts in the CPA specification. It also compares the use of t vs. chisq timewise statistics. + +3) [de Carvalho et al. 2021](https://yjunechoe.github.io/jlmerclusterperm/articles/deCarvalho-et-al-2021.html), which showcases using custom regression contrasts to test the relationship between multiple levels of a factor. It also discusses issues surrounding the complexity of the timewise regression model. + +The **topics** cover general package features: + +- [Tidying output](https://yjunechoe.github.io/jlmerclusterperm/articles/tidying-output.html): `tidy()` and other ways of collecting `jlmerclusterperm` objects as tidy data. + +- [Julia interface](https://yjunechoe.github.io/jlmerclusterperm/articles/julia-interface.html): interacting with Julia objects using the [JuliaConnectoR](https://github.com/stefan-m-lenz/JuliaConnectoR) package. + +- [Reproducibility](https://yjunechoe.github.io/jlmerclusterperm/articles/reproducibility.html): using the Julia RNG to make CPA runs reproducible. + +- [Asynchronous CPA](https://yjunechoe.github.io/jlmerclusterperm/articles/asynchronous-cpa.html): running a CPA in a background process using the [future](https://future.futureverse.org/) package. + +## Compared to other implementations + +There are other R packages for cluster-based permutation analysis, such as [clusterperm](https://github.com/dalejbarr/clusterperm/), [eyetrackingR](https://github.com/jwdink/eyetrackingR), [permuco](https://github.com/jaromilfrossard/permuco/), and [permutes](https://github.com/cvoeten/permutes). + +Compared to existing implementations, `jlmerclusterperm` is the only package optimized for CPAs based on **mixed-effects regression models**, suitable for typical experimental research data with multiple, crossed grouping structures (e.g., subjects and items). It is also the only package with an interface to the individual algorithmic steps of a CPA. Lastly, thanks to the Julia back-end, bootstrapping is fast even for mixed models. + +## Further readings + +- The original method paper for CPA by [Maris & Oostenveld 2007](https://doi.org/10.1016/j.jneumeth.2007.03.024). + +- A more recent paper [Sassenhagen & Draschkow 2019](https://doi.org/10.1111/psyp.13335) on the DOs and DON'Ts of of interpreting and reporting CPA results. + +- A review paper for eyetracking analysis methods [Ito & Knoeferle 2022](https://doi.org/10.3758/s13428-022-01969-3) which compares CPA to other statistical techniques for testing difference between groups in time series data. + +- The JOSS paper for the [permuco](https://jaromilfrossard.github.io/permuco/) package ([Frossard & Renaud 2021](https://www.jstatsoft.org/article/view/v099i15)) and references therein. + +- [Benedikt V. Ehinger's blog post](https://benediktehinger.de/blog/science/statistics-cluster-permutation-test/), which includes a detailed graphical explainer for the CPA procedure.