From 6ca9327bac685b9236b9a2711e36f46b3591bf5a Mon Sep 17 00:00:00 2001 From: Henrik Singmann Date: Tue, 27 Aug 2019 22:30:29 +0000 Subject: [PATCH] version 0.3-3 --- DESCRIPTION | 8 +- MD5 | 32 +- NAMESPACE | 3 + R/check_results.R | 294 ++++++++++---- R/fit_mpt.R | 94 ++--- R/get_info.R | 161 ++++++++ R/mpt_options.R | 72 ++-- R/mptinr.R | 362 +++++++++--------- R/plot_multiverseMPT.R | 359 +++++++++-------- R/treebugs.R | 129 ++++--- build/vignette.rds | Bin 265 -> 266 bytes .../doc/introduction-bayen_kuhlmann_2011.html | 70 ++-- man/check_results.Rd | 46 ++- man/fit_mpt.Rd | 32 +- man/get_info.Rd | 50 +++ man/write_check_results.Rd | 16 - tests/testthat/test-data-structures.R | 43 ++- tests/testthat/test-mpt_options.R | 31 ++ tests/testthat/test-mptinr.R | 164 ++++---- 19 files changed, 1194 insertions(+), 772 deletions(-) create mode 100644 R/get_info.R create mode 100644 man/get_info.Rd delete mode 100644 man/write_check_results.Rd create mode 100644 tests/testthat/test-mpt_options.R diff --git a/DESCRIPTION b/DESCRIPTION index e0c3dff..2a83b27 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: MPTmultiverse Title: Multiverse Analysis of Multinomial Processing Tree Models -Version: 0.2-0 +Version: 0.3-3 Description: Statistical or cognitive modeling usually requires a number of more or less arbitrary choices creating one specific path through a 'garden of forking paths'. @@ -31,18 +31,18 @@ BugReports: https://github.com/mpt-network/MPTmultiverse/issues Depends: R (>= 2.11.1), Imports: parallel, magrittr, tidyr, dplyr, tibble, rlang, reshape2, ggplot2, MPTinR, TreeBUGS, runjags, coda, purrr, readr, - limSolve + limSolve, utils Suggests: knitr, rmarkdown, testthat LazyData: yes VignetteBuilder: knitr RoxygenNote: 6.1.1 License: GPL-2 NeedsCompilation: no -Packaged: 2019-03-11 21:24:50 UTC; henrik +Packaged: 2019-08-27 18:50:43 UTC; henrik Author: Henrik Singmann [aut, cre] (), Daniel W. 
Heck [aut], Marius Barth [aut], Frederik Aust [ctb] () Maintainer: Henrik Singmann Repository: CRAN -Date/Publication: 2019-03-11 23:52:42 UTC +Date/Publication: 2019-08-27 23:30:29 UTC diff --git a/MD5 b/MD5 index b53b536..dd6169e 100644 --- a/MD5 +++ b/MD5 @@ -1,20 +1,21 @@ -0aef9496de27a3a45758b4d3da803151 *DESCRIPTION -358c7c50c5ff4ec4854c1f7003fd0d78 *NAMESPACE -c58b5b86bc3755c9cdaa14f15e36b8ec *R/check_results.R -762033a6f12bdd6e72ce504f4a153dd3 *R/fit_mpt.R +77a85bb84b0325bbfeaa0d2abd7bf3b1 *DESCRIPTION +843cf2d97f2f95cdd3b7e728fc64fc68 *NAMESPACE +d939cb647fa698aa12adc75a499907f1 *R/check_results.R +91c72a08fdee06ace9b059741d90413a *R/fit_mpt.R a2de0d9664ddbf8e4fa1e7af30b05f3b *R/get_eqn_categories.R +6a0f6bab5550eb15b4b0395fa899042b *R/get_info.R d8604ad9aa9ca4ecfbb5af1f6747c42a *R/make_results_row.R -ffc2311b2c3894ea2fedd163f2bb996b *R/mpt_options.R -8e8fac68ec6a93596de3cfee0a4d0125 *R/mptinr.R +d75725ae664ca9b0860d575356c7e9fe *R/mpt_options.R +57177037d82936fd63d1fff358cc0ea4 *R/mptinr.R c0860dc3bb02743f1ff358a5c9afed03 *R/onload.R -5bdbe313d4f00a95ac9e6359030c6836 *R/plot_multiverseMPT.R +66b4d99a3e9ec352045eb182827c5766 *R/plot_multiverseMPT.R faf7d231083e8038c413991ca81d3b2c *R/prep_data_fitting.R -9dcca834565312ea9744c6dc927241ad *R/treebugs.R +946377fa7e32a9810e3d558556a8ccac *R/treebugs.R 7ba90e1c6fef8537e2ed9d137a801f96 *R/write_results.R cf1d73fdd11829c6652ca98c8def0c50 *README.md -6dd1b953436a21b77ba666b4c357e285 *build/vignette.rds +ca94b68a17030b6e8eaa2d868bfaea95 *build/vignette.rds 5ddc745fc0e241ffe1a2801a5fdf854a *inst/doc/introduction-bayen_kuhlmann_2011.R -8aa59c2a33a86224ab4fe840fedd51bb *inst/doc/introduction-bayen_kuhlmann_2011.html +4b71a5c606597665434286519a504557 *inst/doc/introduction-bayen_kuhlmann_2011.html 8fb0a3da73fb660d23e6e7fb0b8f1dd8 *inst/doc/introduction-bayen_kuhlmann_2011.rmd 378c931503b45c257da3b87e4c5e7490 *inst/extdata/2HTSM_Submodel4.eqn 31de4fabf5cb90e987b2d418a49fed02 *inst/extdata/Kuhlmann_dl7.csv @@ -22,18 +23,19 @@ d9ebea39a1adb10102dbe3cc044395dc *inst/extdata/prospective_memory.eqn 0493e5bef231f7a55d9f2cd2a185d011 *inst/extdata/prospective_memory_example.rda 3acb8d138734233c096e54e6df2c8aa5 *inst/extdata/results_bayen_kuhlmann.RData 89b7767efbae5229542389ae6bd07922 *inst/extdata/smith_et_al_2011.csv -5af14ab49f814b8838fdae66fd8a8c0d *man/check_results.Rd -a7e9b12163013673deb6c74abeb4cd55 *man/fit_mpt.Rd +b9b0e597e96a6905eb91b6f644f74b95 *man/check_results.Rd +0697cba09eacca908321aaadacbb044b *man/fit_mpt.Rd +78ad955cff0898b16a938ba99927ce61 *man/get_info.Rd 9a3023bfc66e5b420f7df718499fb2fa *man/get_pb_output.Rd 8f3811a251eb5a19d2bbfeea686f1d85 *man/make_results_row.Rd 91e20cb185f30a2236eeb41c8108fb7d *man/mpt_options.Rd 70d444459d24aac210c2b85acb17d520 *man/plot.multiverseMPT.Rd -a859ce4d222c2a86d9a3cc7a04364d4a *man/write_check_results.Rd 65b8904cc88c4eea226b24f46b9be466 *man/write_results.Rd dd4bfb25654cdb367a4c4485a712e332 *tests/testthat.R -9c4f3d2af60ba67b298f959e6e07735f *tests/testthat/test-data-structures.R +910e290d3cec7cc844f69da4a542eb04 *tests/testthat/test-data-structures.R 16a5cbce2a4d7eceb5a9ba2d0cdac561 *tests/testthat/test-identifiability_checks.R -bd1e0f4d37b5f4d79272abc5d149d85d *tests/testthat/test-mptinr.R +e72d9c5908546dac0b22cc760890a110 *tests/testthat/test-mpt_options.R +5ffa69e5d65eafdb821e38afd37c70e3 *tests/testthat/test-mptinr.R 15b9a91654c2a73e11ff9b74ca2b83e8 *tests/testthat/test_treebugs.R 7e3b16937fce40bab6396676351497c7 *vignettes/NOT_USED/1_bayen_kuhlmann_2011.html 378c931503b45c257da3b87e4c5e7490 
*vignettes/NOT_USED/2HTSM_Submodel4.eqn diff --git a/NAMESPACE b/NAMESPACE index fb5eb06..c1b0d07 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -2,7 +2,9 @@ S3method(plot,multiverseMPT) export(check_results) +export(check_set) export(fit_mpt) +export(get_info) export(mpt_options) export(write_check_results) export(write_results) @@ -13,3 +15,4 @@ importFrom(rlang,sym) importFrom(stats,qnorm) importFrom(tibble,as_tibble) importFrom(tibble,tibble) +importFrom(utils,read.table) diff --git a/R/check_results.R b/R/check_results.R index 3d4b53b..d64d01f 100644 --- a/R/check_results.R +++ b/R/check_results.R @@ -1,93 +1,116 @@ -#' Check results from a multiverse analysis -#' -#' This is a helper function to see if the model estimation worked as intended. -#' +#' Check results from MPTmultiverse +#' +#' @description Set of helper functions that allow checking if model estimation +#' worked as intended. Depending on the method and function, these functions +#' return slightly different information. +#' +#' @details \code{check_results} prints relatively verbose output detailing +#' diagnostic information for each method to the console. For the frequentist +#' methods, this is based on either the rank of the observed Fischer (or +#' Hessian) matrix of the MLE estimate or based on empirical identifiability +#' (based either on repeated re-runs or the width of the bootstrapped +#' parameter distribution). For the Bayesian methods, this is convergence +#' statistics R-hat and number of effective samples. +#' \code{write_check_results} writes the results of \code{check_results} to a +#' specififed file (instead of printing it to the console). +#' +#' \code{check_set} returns a \code{tibble} with one row, where each expected +#' method corresponds to a column with a boolean (\code{TRUE}/\code{FALSE}) +#' value. Entries \code{TRUE} correspond to no problem and \code{FALSE} +#' correspond to problems. \code{FALSE} means the method is either missing +#' from the results file or (for the Bayesian methods) there are core +#' parameters for which the convergence criteria defined in +#' \code{getOption("MPTmultiverse")} are not met. +#' #' @param results An object of class multiverseMPT. 
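A minimal usage sketch of the checking helpers documented above (assuming `results` is the multiverseMPT tibble returned by fit_mpt(); the "my_study" file prefix is only illustrative):

# print verbose convergence/identifiability diagnostics to the console
check_results(results)
# write the same diagnostics to "my_study_check_results.txt" instead of the console
write_check_results("my_study", results, append = FALSE)
# one-row tibble with a TRUE/FALSE column per expected estimation method
# (FALSE = method missing from results or convergence criteria for core parameters not met)
ok <- check_set(results)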
+ +#' @example examples/examples.check_results.R #' #' @importFrom rlang .data #' @importFrom magrittr %>% #' @export - check_results <- function(results) { #browser() expected <- structure(list( - pooling = c("no", "no", "no", "complete", "no", "complete", "partial", - "partial", "partial", "partial"), - package = c("MPTinR", "MPTinR", "MPTinR", "MPTinR", "TreeBUGS", "TreeBUGS", - "TreeBUGS", "TreeBUGS", "TreeBUGS", "TreeBUGS"), - method = c("NPB/MLE", "PB/MLE", "asymptotic", "asymptotic", "simple", - "simple", "trait", "trait_uncorrelated", "beta", "betacpp")), - .Names = c("pooling", "package", "method"), + pooling = c("no", "no", "no", "complete", "no", "complete", "partial", + "partial", "partial", "partial"), + package = c("MPTinR", "MPTinR", "MPTinR", "MPTinR", "TreeBUGS", "TreeBUGS", + "TreeBUGS", "TreeBUGS", "TreeBUGS", "TreeBUGS"), + method = c("NPB/MLE", "PB/MLE", "asymptotic", "asymptotic", "simple", + "simple", "trait", "trait_uncorrelated", "beta", "betacpp")), + .Names = c("pooling", "package", "method"), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -10L)) missing <- dplyr::anti_join(expected, results[, 3:5], by = c("pooling", "package", "method")) if (nrow(missing) > 0) { - cat("## Following analysis approaches missing from results:\n", - paste(apply(missing, 1, paste, collapse = ", "), collapse = "\n"), + cat("## Following analysis approaches missing from results:\n", + paste(apply(missing, 1, paste, collapse = ", "), collapse = "\n"), "\n\n\n") } - + ### MPTinR: no pooling ### - + cat("## MPTinR: no pooling\n") - + mpt_no_pool <- c("asymptotic", "PB/MLE", "NPB/MLE") mpt_no_pool <- mpt_no_pool[mpt_no_pool %in% results$method] tryCatch({ for(meth in mpt_no_pool){ - - # conv_mptinr_no <- results %>% - # dplyr::filter(.data$package == "MPTinR" & .data$pooling == "no" & .data$method == meth) %>% - # dplyr::select("convergence") %>% + + # conv_mptinr_no <- results %>% + # dplyr::filter(.data$package == "MPTinR" & .data$pooling == "no" & .data$method == meth) %>% + # dplyr::select("convergence") %>% # tidyr::unnest() - - not_id <- results %>% - dplyr::filter(.data$package == "MPTinR" & .data$pooling == "no" & .data$method == meth) %>% - dplyr::select("est_indiv") %>% - tidyr::unnest() %>% - dplyr::group_by(.data$condition, .data$core) %>% - dplyr::summarise(proportion = mean(!.data$identifiable)) - - not_id2 <- results %>% - dplyr::filter(.data$package == "MPTinR" & .data$pooling == "no" & .data$method == meth) %>% - dplyr::select("est_indiv") %>% + + not_id <- results %>% + dplyr::filter(.data$package == "MPTinR" & .data$pooling == "no" & .data$method == meth) %>% + dplyr::select("est_indiv") %>% + tidyr::unnest() %>% + dplyr::group_by(.data$condition, .data$core) %>% + dplyr::summarise(proportion = mean(!.data$identifiable | + is.na(.data$identifiable))) %>% + dplyr::ungroup() + + not_id2 <- results %>% + dplyr::filter(.data$package == "MPTinR" & .data$pooling == "no" & .data$method == meth) %>% + dplyr::select("est_indiv") %>% tidyr::unnest() %>% - dplyr::filter(!.data$identifiable) %>% - dplyr::group_by(.data$condition, .data$core, .data$parameter) %>% - dplyr::count() %>% + dplyr::filter(!.data$identifiable) %>% + dplyr::group_by(.data$condition, .data$core, .data$parameter) %>% + dplyr::count() %>% dplyr::ungroup() if (any(not_id$proportion > 0)) { cat("Based on", meth, "method, proportion of participants with non-identified parameters:\n") cat(format(not_id, n = Inf)[-c(1,3)], "", sep = "\n") - + cat("Based on", meth, "CIs, table of non-identified 
parameters:\n") cat(format(not_id2, n = Inf)[-c(1,3)], sep = "\n") - + } else { cat("Based on", meth, "CIs, all parameters of all participants seem to be identifiable.\n") } cat("\n") } - }, error = function(e) + }, error = function(e) cat("Convergence checks failed for unkown reason.\n")) - + cat("\n") - - + + ### MPTinR: complete pooling ### - + cat("## MPTinR: complete pooling\n") - + tryCatch({ conv_mptinr_comp <- results %>% dplyr::filter(.data$package == "MPTinR" & .data$pooling == "complete") %>% dplyr::select("convergence") %>% tidyr::unnest() - - comp_prob <- (conv_mptinr_comp$convergence != 0) | + + comp_prob <- (conv_mptinr_comp$convergence != 0) | (conv_mptinr_comp$rank.fisher != conv_mptinr_comp$n.parameters) - + if (any(comp_prob, na.rm = TRUE)) { cat("Convergence problems:\n") cat(format(conv_mptinr_comp[comp_prob,])[-c(1,3)], "", sep = "\n") @@ -97,56 +120,76 @@ check_results <- function(results) { } else { cat("No convergence problems.\n") } - }, error = function(e) + }, error = function(e) cat("Convergence checks failed for unkown reason.\n")) - + cat("\n\n") - + ### TreeBUGS - res_tree <- results %>% - dplyr::filter(.data$package == "TreeBUGS") %>% - dplyr::select(!!c("pooling", "package", "method", "convergence")) - + res_tree <- results %>% + dplyr::filter(.data$package == "TreeBUGS") %>% + dplyr::select(!!c("model", "dataset", "pooling", "package", "method", "convergence", "est_group")) + for (i in seq_len(nrow(res_tree))) { - cat("## ", paste(res_tree[i, c(2,1,3)], collapse = ", "), ":\n", sep = "") - - tmp_convergence <- res_tree[i, ]$convergence[[1]] %>% - dplyr::filter(.data$Rhat > getOption("MPTmultiverse")$treebugs$Rhat_max) - - if(nrow(tmp_convergence) > 0) { - cat(nrow(tmp_convergence), "parameters with Rhat >", getOption("MPTmultiverse")$treebugs$Rhat_max, ":\n") - cat(paste(tmp_convergence$parameter, collapse = ", ")) + cat("## ", paste(res_tree[i, 1:5], collapse = " // "), ":\n", sep = "") + + params <- res_tree[i,] %>% + tidyr::unnest(.data$est_group) %>% + dplyr::select(.data$parameter, .data$core) + + tmp_convergence <- res_tree[i, ]$convergence[[1]] %>% + dplyr::filter(.data$Rhat > getOption("MPTmultiverse")$treebugs$Rhat_max) %>% + dplyr::mutate(parameter = label_parameter(.data$parameter, params), + core = grepl("COREPARAMETER", .data$parameter), + parameter = gsub("COREPARAMETER", "", x = .data$parameter)) + + if (nrow(tmp_convergence) > 0) { + cat(" ", sum(tmp_convergence$core), " core parameters with Rhat >", + getOption("MPTmultiverse")$treebugs$Rhat_max, ":\n") + + # cat(format(not_id, n = Inf)[-c(1,3)], "", sep = "\n") + + cat(paste(tmp_convergence$parameter[tmp_convergence$core], collapse = ", "), "\n") + cat(" ", sum(!tmp_convergence$core), " auxiliary parameters with Rhat >", + getOption("MPTmultiverse")$treebugs$Rhat_max, ":\n") + cat(paste(tmp_convergence$parameter[!tmp_convergence$core], collapse = ", "), "\n") } else { - cat("All Rhat <", getOption("MPTmultiverse")$treebugs$Rhat_max, ".\n") + cat(" All Rhat <", getOption("MPTmultiverse")$treebugs$Rhat_max, ".\n") } - - tmp_neff <- res_tree[i,]$convergence[[1]] %>% - dplyr::filter(!is.na(.data$Rhat), .data$n.eff < getOption("MPTmultiverse")$treebugs$Neff_min) - + + tmp_neff <- res_tree[i,]$convergence[[1]] %>% + dplyr::filter(!is.na(.data$Rhat), .data$n.eff < getOption("MPTmultiverse")$treebugs$Neff_min) %>% + dplyr::mutate(parameter = label_parameter(.data$parameter, params), + core = grepl("COREPARAMETER", .data$parameter), + parameter = gsub("COREPARAMETER", "", x = .data$parameter)) + 
if(nrow(tmp_neff) > 0) { - cat(nrow(tmp_neff), "parameters with effect sample size n.eff <", + cat(" ", sum(tmp_neff$core), " core parameters with effect sample size n.eff <", + getOption("MPTmultiverse")$treebugs$Neff_min, ":\n") + cat(paste(tmp_neff$parameter[tmp_neff$core], collapse = ", "), "\n") + cat(" ", sum(!tmp_neff$core), " auxiliary parameters with effect sample size n.eff <", getOption("MPTmultiverse")$treebugs$Neff_min, ":\n") - cat(paste(tmp_neff$parameter, collapse = ", ")) + cat(paste(tmp_neff$parameter[!tmp_neff$core], collapse = ", "), "\n") } else { - cat("All effect sample sizes >", getOption("MPTmultiverse")$treebugs$Neff_min, ".\n") + cat(" All effect sample sizes >", getOption("MPTmultiverse")$treebugs$Neff_min, ".\n") } - + cat("\n\n") } - - + + } -#' Write check_results -#' -#' Helper function to write the output from check_results() to a file. -#' -#' @param DATA_FILE Character. File name to use. -#' @param results An object of class multiverseMPT. -#' @export -write_check_results <- function(DATA_FILE, results){ - sink(paste0(DATA_FILE, "_check_results.txt")) +#' @param DATA_FILE character string. File name to use. +#' @param append logical. If \code{TRUE}, output will be appended to +#' \code{DATA_FILE}; otherwise, it will overwrite the contents of +#' \code{DATA_FILE}. +#' @export +#' @rdname check_results +## @describeIn check_results writes check results to file. +write_check_results <- function(DATA_FILE, results, append = FALSE){ + sink(paste0(DATA_FILE, "_check_results.txt"), append = append) cat("################ OPTIONS ################\n\n") cat("TreeBUGS:\n") ; print(getOption("MPTmultiverse")$treebugs) cat("\nMPTinR:\n") ; print(getOption("MPTmultiverse")$mptinr) @@ -156,3 +199,92 @@ write_check_results <- function(DATA_FILE, results){ print(check_results(results)) sink() } + + + +#' @export +#' @rdname check_results +check_set <- function(results) { + + if (!all(results$model[1] == results$model)) + stop("All results need to use same model") + if (!all(results$dataset[1] == results$dataset)) + stop("All results need to use same dataset") + + expected <- structure(list( + pooling = c("no", "no", "no", "complete", "no", "complete", "partial", + "partial", "partial", "partial"), + package = c("MPTinR", "MPTinR", "MPTinR", "MPTinR", "TreeBUGS", "TreeBUGS", + "TreeBUGS", "TreeBUGS", "TreeBUGS", "TreeBUGS"), + method = c("NPB/MLE", "PB/MLE", "asymptotic", "asymptotic", "simple", + "simple", "trait", "trait_uncorrelated", "beta", "betacpp")), + .Names = c("pooling", "package", "method"), + class = c("tbl_df", "tbl", "data.frame" + ), row.names = c(NA, -10L)) + missing <- dplyr::anti_join(expected, results[, 3:5], by = c("pooling", "package", "method")) + miss <- tidyr::unite(missing, "meth") + + out <- tidyr::unite(expected, "meth") + out$problem <- TRUE + + for (i in seq_len(nrow(miss))) { + out[ out$meth == miss$meth[i] , "problem" ] <- FALSE + } + + ### TreeBUGS + res_tree <- results %>% + dplyr::filter(.data$package == "TreeBUGS") %>% + dplyr::select(!!c("model", "dataset", "pooling", "package", "method", "convergence", "est_group")) + + for (i in seq_len(nrow(res_tree))) { + + cur_meth <- tidyr::unite(res_tree[i,3:5], "meth")$meth + + params <- res_tree[i,] %>% + tidyr::unnest(.data$est_group) %>% + dplyr::select(.data$parameter, .data$core) + + tmp_convergence <- res_tree[i, ]$convergence[[1]] %>% + dplyr::filter(.data$Rhat > getOption("MPTmultiverse")$treebugs$Rhat_max) %>% + dplyr::mutate(parameter = label_parameter(.data$parameter, params), + core 
= grepl("COREPARAMETER", .data$parameter), + parameter = gsub("COREPARAMETER", "", x = .data$parameter)) + + if (nrow(dplyr::filter(tmp_convergence, .data$core)) > 0) { + out[ out$meth == cur_meth , "problem" ] <- FALSE + } + + tmp_neff <- res_tree[i,]$convergence[[1]] %>% + dplyr::filter(!is.na(.data$Rhat), .data$n.eff < getOption("MPTmultiverse")$treebugs$Neff_min) %>% + dplyr::mutate(parameter = label_parameter(.data$parameter, params), + core = grepl("COREPARAMETER", .data$parameter), + parameter = gsub("COREPARAMETER", "", x = .data$parameter)) + if (nrow(dplyr::filter(tmp_neff, .data$core)) > 0) { + out[ out$meth == cur_meth , "problem" ] <- FALSE + } + } + dplyr::bind_cols( + results[1, 1:2] + , + tidyr::spread(out, "meth", "problem") + ) +} + +label_parameter <- function(mcmc_output, params){ + for (j in seq_len(nrow(params))){ + type <- ifelse(params$core[j], "COREPARAMETER", "") + mcmc_output <- gsub(paste0("([sigma,mean,sd,mu,alph,bet]+)\\[", j, "\\]"), + paste0("\\1[", params$parameter[j], "]", type), + mcmc_output) + mcmc_output <- gsub(paste0("theta\\[",j,",([0-9]+)\\]"), + paste0(params$parameter[j], '[\\1]', type), + mcmc_output) + mcmc_output <- gsub(paste0("rho\\[",j,",([0-9,a-z,A-Z,_]+)\\]"), + paste0("rho[", params$parameter[j], ',\\1]', type), + mcmc_output) + mcmc_output <- gsub(paste0("rho\\[([^,]+),", j, "\\]"), + paste0("rho[\\1,", params$parameter[j], ']', type), + mcmc_output) + } + mcmc_output +} diff --git a/R/fit_mpt.R b/R/fit_mpt.R index 2c0190b..d63b579 100644 --- a/R/fit_mpt.R +++ b/R/fit_mpt.R @@ -37,7 +37,7 @@ #' e.g., \code{core = c("Dn", "Do")}. All other parameters are treated as #' auxiliary parameters. #' @example examples/examples.fit_mpt.R -#' +#' #' @details This functions is a fancy wrapper for packages \pkg{MPTinR} and #' \pkg{TreeBUGS} applying various frequentist and Bayesian estimation methods #' to the same data set with different levels of pooling/aggregation using a @@ -46,7 +46,7 @@ #' (e.g., equating different parameters or fixing them to a constant) need to #' be part of the model (i.e., the \code{.eqn} file) and cannot be passed as #' an argument. -#' +#' #' The settings for the various methods are specified via function #' \code{\link{mpt_options}}. The default settings use all available cores for #' calculating the boostrap distribution as well as independent MCMC chains @@ -61,14 +61,14 @@ #' ordering of the \code{factor} levels in the output cannot be guaranteed. If #' the data has more than one between-subjects condition, these need to be #' combined into one condition for this function. -#' +#' #' Parameter differences or other support for within-subject conditions is not #' provided. The best course of action for within-subjects conditions is to #' simply include separate trees and separate sets of parameters for each #' within-subjects condition. This allows to at least compare the estimates #' for each within-subjects condition across levels of pooling and estimation #' methods. -#' +#' #' \subsection{Pooling}{ #' The following pooling levels are provided (not all by all estimation approaches, see below). #' \itemize{ @@ -95,7 +95,7 @@ #' the individual-level parameters. 
} #' } #' } -#' +#' #' \subsection{Implemented Estimation Methods}{ #' Maximum-likelihood estimation with \pkg{MPTinR} via #' \code{\link[MPTinR]{fit.mpt}}: @@ -106,21 +106,21 @@ #' \item{\code{"pb_no"}: }{Parametric bootstrap, no pooling} #' \item{\code{"npb_no"}: }{Nonparametric bootstrap, no pooling} #' } -#' +#' #' Bayesian estimation with \pkg{TreeBUGS} #' \itemize{ #' \item{\code{"simple"}: }{Bayesian estimation, no pooling (C++, #' \link[TreeBUGS]{simpleMPT})} -#' \item{\code{"simple_pooling"}: }{Bayesian estimation, complete pooling +#' \item{\code{"simple_pooling"}: }{Bayesian estimation, complete pooling #' (C++, \link[TreeBUGS]{simpleMPT})} -#' \item{\code{"trait"}: }{latent-trait model, partial pooling (JAGS, +#' \item{\code{"trait"}: }{latent-trait model, partial pooling (JAGS, #' \link[TreeBUGS]{traitMPT})} #' \item{\code{"trait_uncorrelated"}: }{latent-trait model without #' correlation parameters, partial pooling (JAGS, #' \link[TreeBUGS]{traitMPT})} -#' \item{\code{"beta"}: }{beta-MPT model, partial pooling (JAGS, +#' \item{\code{"beta"}: }{beta-MPT model, partial pooling (JAGS, #' \link[TreeBUGS]{betaMPT})} -#' \item{\code{"betacpp"}: }{beta-MPT model, partial pooling (C++, +#' \item{\code{"betacpp"}: }{beta-MPT model, partial pooling (C++, #' \link[TreeBUGS]{betaMPTcpp})} #' } #' } @@ -132,7 +132,7 @@ #' the pooled standard error of the individual parameters. The overall fit #' (column \code{gof}) is based on an additional fit to the completely #' aggregated data. -#' +#' #' For the \emph{no pooling asymptotic approach}, the individual-level #' maximum-likelihood estimates are reported in column \code{est_indiv} and #' \code{gof_indiv} and provide the basis for the other results. Whether or @@ -154,7 +154,7 @@ #' (again, after excluding non-identifiable estimates). The CIs of the #' difference are based on the SEs (which are derived from a linear model #' equivalent to the t-test). -#' +#' #' The individual-level estimates of the \code{bootstrap based no-pooling} #' approaches are identical to the asymptotic ones. However, the SE is the #' SD of the bootstrapped distribution of parameter estimates, the CIs are @@ -190,7 +190,7 @@ #' not estimate this correlation matrix (i.e., parameters can still be #' correlated across individuals, but this is not accounted for in the #' model). -#' +#' #' For all Bayesian methods, the posterior distribution of the parameters is #' summarized by the posterior mean (in the column \code{est}), posterior #' standard deviation (\code{se}), and credbility intervals (\code{ci_*}). @@ -202,7 +202,7 @@ #' "mean"}) and the T2 statistic (observed vs. posterior-predicted #' covariance of frequencies, \code{focus = "cov"}). #' } -#' +#' #' @return A \code{tibble} with one row per estimation \code{method} and the #' following columns: #' \enumerate{ @@ -219,16 +219,16 @@ #' "simple", "trait", "trait_uncorrelated", "beta", "betacpp")} #' \item \code{est_group}: Group-level parameter estimates per condition/group. #' \item \code{est_indiv}: Individual-level parameter estimates (if provided -#' by method). +#' by method). #' \item \code{est_rho}: Estimated correlation of individual-level parameters -#' on the probit scale (only in \code{method="trait"}). +#' on the probit scale (only in \code{method="trait"}). #' \item \code{test_between}: Parameter differences between the levels of the #' between-subjects condition (if specified). #' \item \code{gof}: Overall goodness of fit across all individuals. 
#' \item \code{gof_group}: Group-level goodness of fit. #' \item \code{gof_indiv}: Individual-level goodness of fit. -#' \item \code{fungibility}: Posterior correlation of the group-level means -#' \code{pnorm(mu)} (only in \code{method="trait"}). +#' \item \code{fungibility}: Posterior correlation of the group-level means +#' \code{pnorm(mu)} (only in \code{method="trait"}). #' \item \code{test_homogeneity}: Chi-square based test of participant #' homogeneity proposed by Smith and Batchelder (2008). This test is the same #' for each estimation method. @@ -248,13 +248,13 @@ #' \item \code{options}: Options used for estimation. Obtained by running #' \code{\link{mpt_options}()} #' } -#' +#' #' With the exception of the first five columns (i.e., after \code{method}) all #' columns are \code{list} columns typically holding one \code{tibble} per cell. #' The simplest way to analyze the results is separately per column using #' \code{\link[tidyr]{unnest}}. Examples for this are given below. #' -#' @references +#' @references #' Smith, J. B., & Batchelder, W. H. (2008). Assessing individual differences #' in categorical data. \emph{Psychonomic Bulletin & Review}, 15(4), 713-731. #' \url{https://doi.org/10.3758/PBR.15.4.713} @@ -271,7 +271,7 @@ fit_mpt <- function( , core = NULL , method ) { - + available_methods <- c( # MPTinR ---- "asymptotic_complete" @@ -286,81 +286,81 @@ fit_mpt <- function( , "beta" , "betacpp" ) - + # catch the function call that was used, # and other stuff that should be save along the results matched_call <- match.call() used_model <- utils::read.table(model, skip = 1, stringsAsFactors = FALSE) - + if(missing(method)) { method <- available_methods } - + method <- match.arg( arg = method , choices = available_methods , several.ok = TRUE ) - + # set options ---- silent_jags <- getOption("MPTmultiverse")$silent_jags - runjags::runjags.options(silent.jags = silent_jags, + runjags::runjags.options(silent.jags = silent_jags, silent.runjags = silent_jags) - + # prepare data ---- if (missing(data)) { data <- as.data.frame(readr::read_csv(dataset)) } - + if(is.null(condition)) { data$ExpCond <- "no_condition" condition <- "ExpCond" } - + if(is.null(id)) { data$Subject <- 1:nrow(data) id <- "Subject" } - + # Ensure that all variables are character data$ExpCond <- as.character(data[[condition]]) data$Subject <- as.character(data[[id]]) - - + + # check MPT file mpt_model <- TreeBUGS::readEQN(model) - + if(!is.data.frame(mpt_model)) { "I can't comprehend your .eqn file." 
} - + # remove extraneous colums and check if all specified columns are present # in data freq_cols <- get_eqn_categories(model) valid_cols <- c(id, condition, freq_cols) check_cols <- valid_cols %in% colnames(data) - + if(!all(check_cols)) { stop("Variable \"", paste(valid_cols[!check_cols], collapse = ", "), "\" not found in data.frame.") } - + data <- data[, valid_cols] - - + + # Check NAs ---- nas_found <- unlist(lapply(X = data, FUN = anyNA)) - + if(any(nas_found)) { stop("Variable \"", paste(valid_cols[nas_found], collapse = ", "), "\" contains missing values.") } - + # Check whether freqencies are integer ---- not_integer <- unlist(lapply(X = data[, freq_cols], FUN = function(x) { any(as.integer(x)!=x) } )) - + if(any(not_integer)) { stop("Variable \"", paste(freq_cols[not_integer], collapse = ", "), "\" contains non-integer values.") } @@ -376,15 +376,15 @@ fit_mpt <- function( warning("With your current mpt_options(), it is not possible to obtain the specified number of effective samples.") } } - - + + # Ensure that id and condition are character, also drops unused levels data[[id]] <- as.character(data[[id]]) data[[condition]] <- as.character(data[[condition]]) - + res <- list() - + # MPTinR part ---- if (any(method %in% c("asymptotic_complete", "asymptotic_no", "pb_no", "npb_no"))) { res[["mptinr"]] <- mpt_mptinr( @@ -397,7 +397,7 @@ fit_mpt <- function( , core = core ) } - + # TreeBUGS part ---- res[["treebugs"]] <- dplyr::bind_rows( purrr::map( @@ -423,4 +423,4 @@ fit_mpt <- function( attr(y, "condition") <- condition attr(y, "core") <- core y -} \ No newline at end of file +} diff --git a/R/get_info.R b/R/get_info.R new file mode 100644 index 0000000..7514840 --- /dev/null +++ b/R/get_info.R @@ -0,0 +1,161 @@ +#' Collect Model Equations and Data per Tree +#' +#' @description Helper function that collects model equation and data per tree +#' for further analysis. +#' +#' @return A list. If \code{autosave = TRUE}, the list is also saved in the +#' current working directory. +#' +#' @inheritParams fit_mpt +#' @param include_data If \code{FALSE} (the default) the response frequencies +#' are not part of the output, but only the number of observations per tree. +#' If \code{TRUE}, the full data is part of the output. +#' @param autosave If \code{TRUE} (the default) the results are automatically +#' saved in the current working directory in a file with name derived from +#' both model and data. +#' +#' @importFrom utils read.table +#' @export +get_info <- function( + model + , dataset + , data + , id = NULL + , condition = NULL + , include_data = FALSE + , core = NULL + , autosave = TRUE +) { + + # catch the function call that was used, + # and other stuff that should be save along the results + matched_call <- match.call() + ##used_model <- utils::read.table(model, skip = 1, stringsAsFactors = FALSE) + + # prepare data ---- + if (missing(data)) { + data <- as.data.frame(readr::read_csv(dataset)) + } + + if(is.null(condition)) { + data$ExpCond <- "no_condition" + condition <- "ExpCond" + } + + if (is.null(data[[id]])) { + stop("id column is not in data.", call. = FALSE) + } + + if(is.null(id)) { + data$Subject <- 1:nrow(data) + id <- "Subject" + } + + # Ensure that all variables are character + data$ExpCond <- as.character(data[[condition]]) + data$Subject <- as.character(data[[id]]) + + + # check MPT file + mpt_model <- TreeBUGS::readEQN(model) + + if(!is.data.frame(mpt_model)) { + "I can't comprehend your .eqn file." 
+ } + + + # remove extraneous colums and check if all specified columns are present + # in data + freq_cols <- get_eqn_categories(model) + valid_cols <- c(id, condition, freq_cols) + check_cols <- valid_cols %in% colnames(data) + + if(!all(check_cols)) { + stop("Variable \"", paste(valid_cols[!check_cols], collapse = ", "), "\" not found in data.frame.") + } + + data <- data[, valid_cols] + + + # Check NAs ---- + nas_found <- unlist(lapply(X = data, FUN = anyNA)) + + if(any(nas_found)) { + stop("Variable \"", paste(valid_cols[nas_found], collapse = ", "), "\" contains missing values.") + } + + # Check whether freqencies are integer ---- + not_integer <- unlist(lapply(X = data[, freq_cols], FUN = function(x) { + any(as.integer(x)!=x) + } + )) + + if(any(not_integer)) { + stop("Variable \"", paste(freq_cols[not_integer], collapse = ", "), "\" contains non-integer values.") + } + + # Ensure that id and condition are character, also drops unused levels + data[[id]] <- as.character(data[[id]]) + data[[condition]] <- as.character(data[[condition]]) + + # summarize the design of your study + participants <- table(data[[condition]]) + trees <- get_eqn_trees(model) + names(trees) <- freq_cols + n_per_tree <- vector(mode = "list", length = length(unique(trees))) + names(n_per_tree) <- unname(unique(trees)) + + model_branches <- try(read.EQN.model(model)) + try(names(model_branches) <- unname(unique(trees))) + + for(i in trees) { + n_per_tree[[i]] <- rowSums(data[, names(trees[trees == i])]) + } + if (include_data) { + data_out <- data + } else { + data_out <- NULL + } + + data_tree <- cbind(data[,c(id, condition)], n_per_tree) + + out <- list( + model = basename(model) + , model_eqn = mpt_model + , model_branches = model_branches + , dataset = basename(dataset) + , data = data_out + , data_tree = data_tree + , core = core + ) + if (autosave) { + outname <- make.names(paste(basename(dataset), basename(model), "info", sep = "_")) + assign(outname, out) + save(list = outname, file = paste0(outname, ".RData")) + } + return(out) + +} + +read.EQN.model <- function (model.filename) +{ + parse.eqn <- function(x) { + branches <- unique(x[, 2]) + l.tree <- length(branches) + tree <- vector("expression", l.tree) + for (branch in 1:l.tree) { + tree[branch] <- parse(text = paste(x[x$V2 == branches[branch], + "V3"], collapse = " + ")) + } + tree + } + tmp.in <- read.table(model.filename, skip = 1, stringsAsFactors = FALSE) + tmp.ordered <- tmp.in[order(tmp.in$V1), ] + tmp.spl <- split(tmp.ordered, factor(tmp.ordered$V1)) + tmp.spl <- lapply(tmp.spl, function(d.f) d.f[order(d.f[, + 2]), ]) + model <- lapply(tmp.spl, parse.eqn) + names(model) <- NULL + model +} + diff --git a/R/mpt_options.R b/R/mpt_options.R index 5df7de3..3e3b633 100644 --- a/R/mpt_options.R +++ b/R/mpt_options.R @@ -1,8 +1,8 @@ #' Options Settings for MPT Comparison -#' +#' #' Set and examine a variety of \emph{options} which affect the way MPT models #' are estimated. -#' +#' #' @param ... Named parameters to set. Possible values are: #' \itemize{ #' \item{\code{bootstrap_samples}: }{Numeric. The number of bootstrap samples to be drawn for the calculation parametric bootstrap confidence intervals.} @@ -23,50 +23,54 @@ # ' TODO \item{\code{catch_warnings}: }{Logical. Whether to store warnings and errors as additional columns in the output.} #' \item{\code{save_models}: }{Logical. Default is \code{FALSE} which does not save the individual MCMC samples in \code{.RData} files. 
Instead only summairzes are retained in \code{results} object.} #' } -#' -#' -#' @examples +#' +#' +#' @examples #' # Examine options: #' mpt_options() -#' +#' #' # Set number of MCMC chains to 20: #' mpt_options(n.chains = 20) #' mpt_options() -#' +#' #' @export - + mpt_options <- function(...){ - - fetched <- getOption("MPTmultiverse") - args <- c(...) - - if(length(args)==0L) return(fetched) - + + opts <- getOption("MPTmultiverse") + args <- list(...) + + if(length(args)==0L) return(opts) + # Provide some shorthand terms: - if(args[[1]][[1]] %in% c("test", "default")){ - changed <- switch( + if ((args[[1]][[1]] %in% c("test", "default"))[[1]]){ + opts <- switch( args[[1]] , test = set_test_options() , default = set_default_options() ) } else { - changed <- lapply( - X = fetched - , FUN = function(x, args){ - if(is.list(x)) { - sub_args <- args[names(args)%in%names(x)] - x[names(sub_args)] <- sub_args + if (is.list(args[[1]])) { + args <- args[[1]] + } + for (i in names(args)) { + if(i %in% names(opts$mptinr)) { + opts$mptinr[[i]] <- unname(args[[i]]) + } + if(i %in% names(opts$treebugs)) { + opts$treebugs[[i]] <- unname(args[[i]]) + } + if (i %in% names(opts)) { + if (is.list(args[[i]])) { + opts[[i]][names(opts[[i]])[names(opts[[i]]) %in% names(args[[i]])]] <- + args[[i]][names(opts[[i]])[names(opts[[i]]) %in% names(args[[i]])]] + } else { + opts[[i]] <- unname(args[[i]]) } - x } - , args = args - ) - - sub_args <- args[names(args)%in%names(fetched)] - changed[names(sub_args)] <- sub_args - + } } - options(list(MPTmultiverse = changed)) + options(list(MPTmultiverse = opts)) } @@ -75,7 +79,7 @@ mpt_options <- function(...){ set_test_options <- function() { # nocov start cat("Setting options for a quick test run.\nDo not interpret results!") - + list( mptinr = list( bootstrap_samples = 1e1 @@ -91,6 +95,7 @@ set_test_options <- function() { # nocov start , Neff_min = 2 , extend_max = 1 , n.PPP = 4e1 + , prior.beta = "dgamma(1,.1)" ) , silent_jags = FALSE # , catch_warnings = TRUE @@ -105,7 +110,7 @@ set_test_options <- function() { # nocov start #' @keywords internal set_default_options <- function() { - + list( mptinr = list( bootstrap_samples = 1e3 @@ -121,6 +126,7 @@ set_default_options <- function() { , Neff_min = 2e3 , extend_max = 2e1 , n.PPP = 5e3 + , prior.beta = "dgamma(1,.1)" ) , silent_jags = TRUE # , catch_warnings = TRUE @@ -144,4 +150,4 @@ tidy_options <- function(x) { y$max_ci_indiv <- x$max_ci_indiv # y$save_models <- x$save_models tibble::as_tibble(y) -} \ No newline at end of file +} diff --git a/R/mptinr.R b/R/mptinr.R index ef8e973..eb1a3fa 100644 --- a/R/mptinr.R +++ b/R/mptinr.R @@ -17,7 +17,7 @@ mpt_mptinr <- function( , id = id , condition = condition ) - + # Homogeneity tests homogeneity_tests <- dplyr::bind_rows( lapply(X = prepared$freq_list, FUN = function(x) { @@ -30,9 +30,9 @@ mpt_mptinr <- function( }) , .id = "condition" ) - + res <- list() - + if(any(method %in% c("asymptotic_complete"))) { res[["complete_pooling"]] <- mpt_mptinr_complete( dataset = dataset @@ -44,7 +44,7 @@ mpt_mptinr <- function( , core = core ) } - + if(any(method %in% c("asymptotic_no", "pb_no", "npb_no"))) { res[["no_pooling"]] <- mpt_mptinr_no( dataset = dataset @@ -56,12 +56,12 @@ mpt_mptinr <- function( , core = core ) } - + for(i in seq_len(length(res))) { - if (nrow(res[[i]]) > 0) + if (nrow(res[[i]]) > 0) res[[i]]$test_homogeneity[[1]] <- homogeneity_tests } - + # return dplyr::bind_rows(res) } @@ -73,36 +73,36 @@ test_between_no <- function(test_between, indiv_pars, CI_SIZE, 
cols_ci) { tmp_par <- test_between$parameter[i] tmp_c1 <- test_between$condition1[i] tmp_c2 <- test_between$condition2[i] - + tmp_df <- indiv_pars %>% dplyr::filter( .data$identifiable, .data$parameter == tmp_par, - .data$condition %in% + .data$condition %in% c(tmp_c1, tmp_c2) ) - + ## skip calculation in case of only one observation per group try({ - - tmp_t <- stats::t.test(tmp_df[ tmp_df$condition == tmp_c1, ]$est, + + tmp_t <- stats::t.test(tmp_df[ tmp_df$condition == tmp_c1, ]$est, tmp_df[ tmp_df$condition == tmp_c2, ]$est) - + tmp_lm <- stats::lm(est ~ condition, tmp_df) - + tmp_se <- stats::coef(stats::summary.lm(tmp_lm))[2,"Std. Error"] - - test_between[ i , c("est_diff" , "se", "p") ] <- + + test_between[ i , c("est_diff" , "se", "p") ] <- c(diff(rev(tmp_t$estimate)), tmp_se, tmp_t$p.value) - - test_between[i, cols_ci] <- test_between[i, ]$est_diff + + + test_between[i, cols_ci] <- test_between[i, ]$est_diff + stats::qnorm(CI_SIZE)* test_between[i, ]$se }, silent = TRUE) } return(test_between) } - + ################ ## no pooling ## ################ @@ -127,40 +127,40 @@ mpt_mptinr_no <- function( MPTINR_OPTIONS <- OPTIONS$mptinr CI_SIZE <- OPTIONS$ci_size MAX_CI_INDIV <- OPTIONS$max_ci_indiv - + bootstrap <- c() - + if ("pb_no" %in% method) { bootstrap <- c(bootstrap, "pb") } if ("npb_no" %in% method) { bootstrap <- c(bootstrap, "npb") } - + t0 <- Sys.time() fit_mptinr <- MPTinR::fit.mpt(prepared$data[,prepared$col_freq], model.filename = model, n.optim = MPTINR_OPTIONS$n.optim, fit.aggregated = FALSE, - show.messages = FALSE, output = "full", + show.messages = FALSE, output = "full", ci = (1 - stats::pnorm(1))*2*100) t1 <- Sys.time() - additional_time <- t1 - t0 + additional_time <- difftime(t1, t0, units = "secs") convergence <- tibble::tibble( id = prepared$data[[id]] , condition = as.character(prepared$data[[condition]]) , rank.fisher = fit_mptinr$model.info$individual$rank.fisher , n.parameters = fit_mptinr$model.info$individual$n.parameters - , convergence = vapply(fit_mptinr$best.fits$individual, + , convergence = vapply(fit_mptinr$best.fits$individual, FUN = function(x) x$convergence, FUN.VALUE = 0) ) - + res <- vector("list", length(method)) names(res) <- method - + if ("asymptotic_no" %in% method) { - + res[["asymptotic_no"]] <- make_results_row(model = model, dataset = dataset, pooling = "no", @@ -171,22 +171,22 @@ mpt_mptinr_no <- function( id = id, condition = condition, core = core) - - - + + + res[["asymptotic_no"]]$gof_indiv[[1]]$type <- "G2" res[["asymptotic_no"]]$gof_indiv[[1]]$focus <- "mean" - res[["asymptotic_no"]]$gof_indiv[[1]]$stat_obs <- + res[["asymptotic_no"]]$gof_indiv[[1]]$stat_obs <- fit_mptinr$goodness.of.fit$individual$G.Squared - res[["asymptotic_no"]]$gof_indiv[[1]]$stat_df <- + res[["asymptotic_no"]]$gof_indiv[[1]]$stat_df <- fit_mptinr$goodness.of.fit$individual$df - res[["asymptotic_no"]]$gof_indiv[[1]]$p <- + res[["asymptotic_no"]]$gof_indiv[[1]]$p <- fit_mptinr$goodness.of.fit$individual$p.value - - + + ## make est_indiv and gof_indiv for (i in seq_len(nrow(prepared$data))) { - + ## get results from fitting runs to check if parameters are identified all_pars <- do.call( "rbind" @@ -194,22 +194,22 @@ mpt_mptinr_no <- function( ) colnames(all_pars) <- dimnames(fit_mptinr$parameters$individual)[[1]] all_pars <- as.data.frame(all_pars) - all_pars$objective <- purrr::map_dbl(fit_mptinr$optim.runs$individual[[i]], "objective") + all_pars$objective <- purrr::map_dbl(fit_mptinr$optim.runs$individual[[i]], "objective") all_pars$minimum <- 
abs(all_pars$objective - min(all_pars$objective)) < 0.01 - + for (p in prepared$parameters) { - + res[["asymptotic_no"]]$est_indiv[[1]][ res[["asymptotic_no"]]$est_indiv[[1]]$id == prepared$data[i,"id"] & res[["asymptotic_no"]]$est_indiv[[1]]$parameter == p, "est" ] <- fit_mptinr$parameters$individual[p,"estimates",i] - + res[["asymptotic_no"]]$est_indiv[[1]][ res[["asymptotic_no"]]$est_indiv[[1]]$id == prepared$data[i,"id"] & res[["asymptotic_no"]]$est_indiv[[1]]$parameter == p, "se" ] <- - fit_mptinr$parameters$individual[p, "upper.conf",i] - + fit_mptinr$parameters$individual[p, "upper.conf",i] - fit_mptinr$parameters$individual[p,"estimates",i] - + ## check if individual parameter is identifiable and flag otherwise pars_at_optimum <- all_pars[all_pars$minimum, p] if (length(pars_at_optimum) > 1) { @@ -225,45 +225,45 @@ mpt_mptinr_no <- function( FALSE } } - + } } - + for (i in seq_along(CI_SIZE)) { res[["asymptotic_no"]]$est_indiv[[1]][, prepared$cols_ci[i]] <- res[["asymptotic_no"]]$est_indiv[[1]][,"est"] + stats::qnorm(CI_SIZE[i])*res[["asymptotic_no"]]$est_indiv[[1]][,"se"] } - + #### make est_group #### - + est_group2 <- res[["asymptotic_no"]]$est_indiv[[1]] %>% - dplyr::filter(.data$identifiable | is.na(.data$identifiable)) %>% + dplyr::filter(.data$identifiable | is.na(.data$identifiable)) %>% dplyr::group_by(.data$condition, .data$parameter, .data$core) %>% dplyr::summarise(estN = mean(.data$est), - se = stats::sd(.data$est) / + se = stats::sd(.data$est) / sqrt(sum(!is.na(.data$est)))) %>% dplyr::ungroup() %>% dplyr::rename(est = .data$estN) for (i in seq_along(CI_SIZE)) { - est_group2[, prepared$cols_ci[i]] <- est_group2[,"est"] + + est_group2[, prepared$cols_ci[i]] <- est_group2[,"est"] + stats::qnorm(CI_SIZE[i])*est_group2[,"se"] } - + res[["asymptotic_no"]]$est_group[[1]] <- dplyr::right_join( - est_group2, res[["asymptotic_no"]]$est_group[[1]][,c("condition", + est_group2, res[["asymptotic_no"]]$est_group[[1]][,c("condition", "parameter")], by = c("condition", "parameter")) - - + + # ---------------------------------------------------------------------------- # make gof_group for asymptotic approach - + res[["asymptotic_no"]]$gof_group[[1]]$type <- "G2" res[["asymptotic_no"]]$gof_group[[1]]$focus <- "mean" - + tmp <- fit_mptinr$goodness.of.fit$individual - tmp$condition <- factor(as.character(prepared$data$condition), + tmp$condition <- factor(as.character(prepared$data$condition), levels = prepared$conditions) ## creation of factor above (which is reverted below) ensures that output ## has same ordering as gof_group for other messages @@ -272,7 +272,7 @@ mpt_mptinr_no <- function( dplyr::summarise(stat_obs = sum(.data$G.Squared), stat_pred = NA_real_, stat_df = sum(.data$df)) - + gof_group2$p <- stats::pchisq( q = gof_group2$stat_obs , df = gof_group2$stat_df @@ -284,16 +284,16 @@ mpt_mptinr_no <- function( # gof_group2$condition # , levels = levels(res[["asymptotic_no"]]$gof_group[[1]]$condition) # ) - - res[["asymptotic_no"]]$gof_group[[1]] <- + + res[["asymptotic_no"]]$gof_group[[1]] <- dplyr::right_join(res[["asymptotic_no"]]$gof_group[[1]][, c("condition", "type", "focus")], gof_group2, by = c("condition")) - - + + # ---------------------------------------------------------------------------- # make gof - + gof <- tibble::tibble( type = "G2" , focus = "mean" @@ -302,7 +302,7 @@ mpt_mptinr_no <- function( , stat_df = fit_mptinr$goodness.of.fit$sum$df , p = NA_real_ ) - + # Calculate *p* value from Chi-squared distribution res[["asymptotic_no"]]$gof[[1]] <- gof 
res[["asymptotic_no"]]$gof[[1]]$p <- stats::pchisq( @@ -310,29 +310,29 @@ mpt_mptinr_no <- function( , df = res[["asymptotic_no"]]$gof[[1]]$stat_df , lower.tail = FALSE ) - - # ---------------------------------------------------------------------------- + + # ---------------------------------------------------------------------------- # make test_between - - res[["asymptotic_no"]]$test_between[[1]] <- - test_between_no(test_between = res[["asymptotic_no"]]$test_between[[1]], - indiv_pars = res[["asymptotic_no"]]$est_indiv[[1]], + + res[["asymptotic_no"]]$test_between[[1]] <- + test_between_no(test_between = res[["asymptotic_no"]]$test_between[[1]], + indiv_pars = res[["asymptotic_no"]]$est_indiv[[1]], CI_SIZE = CI_SIZE, cols_ci = prepared$cols_ci) - + ### copy information that is same ---- - + res[["asymptotic_no"]]$convergence <- list(convergence) # res[["asymptotic_no"]]$test_between <- res[["pb_no"]]$test_between # bugfix--MB--2018-10-14 - + # write estimation time to results_row ---- res[["asymptotic_no"]]$estimation[[1]] <- tibble::tibble( condition = "individual" , time_difference = additional_time ) - + } - - + + if ("pb" %in% bootstrap) { try(res[["pb_no"]] <- get_pb_results(dataset = dataset , prepared = prepared @@ -371,16 +371,16 @@ get_pb_results <- function(dataset , additional_time , convergence , core = core) { - + OPTIONS <- getOption("MPTmultiverse") MPTINR_OPTIONS <- OPTIONS$mptinr CI_SIZE <- OPTIONS$ci_size MAX_CI_INDIV <- OPTIONS$max_ci_indiv - + cl <- parallel::makeCluster(rep("localhost", OPTIONS$n.CPU)) parallel::clusterEvalQ(cl, library("MPTinR")) parallel::clusterSetRNGStream(cl, iseed = sample.int(.Machine$integer.max, 1)) - + if (bootstrap == "pb") { res <- make_results_row( model = model @@ -433,30 +433,30 @@ get_pb_results <- function(dataset ) t2 <- Sys.time() } - + res$gof_indiv[[1]]$type <- "G2" res$gof_indiv[[1]]$focus <- "mean" res$gof_indiv[[1]]$stat_obs <- fit_mptinr$goodness.of.fit$individual$G.Squared res$gof_indiv[[1]]$stat_df <- fit_mptinr$goodness.of.fit$individual$df - + ## make est_indiv and gof_indiv for (i in seq_len(nrow(prepared$data))) { - + for (p in prepared$parameters) { res$est_indiv[[1]][ res$est_indiv[[1]]$id == prepared$data[i,"id"] & res$est_indiv[[1]]$parameter == p, "est" ] <- fit_mptinr$parameters$individual[p,"estimates",i] - + res$est_indiv[[1]][ res$est_indiv[[1]]$id == prepared$data[i,"id"] & res$est_indiv[[1]]$parameter == p, prepared$cols_ci ] <- stats::quantile(fit_pb[[i]]$parameters$individual[p,"estimates",], probs = CI_SIZE) - + res$est_indiv[[1]][ res$est_indiv[[1]]$id == prepared$data[i,"id"] & res$est_indiv[[1]]$parameter == p, "se" ] <- - stats::sd(fit_pb[[i]]$parameters$individual[p,"estimates",]) + stats::sd(fit_pb[[i]]$parameters$individual[p,"estimates",]) } # gof_indiv res$gof_indiv[[1]][ @@ -464,98 +464,98 @@ get_pb_results <- function(dataset (sum(fit_pb[[i]]$goodness.of.fit$individual$G.Squared > fit_mptinr$goodness.of.fit$individual[i,"G.Squared"]) + 1) / (MPTINR_OPTIONS$bootstrap_samples + 1) - + } - + ## check for identifiability - - tmp_range_ci <- res$est_indiv[[1]][, prepared$cols_ci[length(prepared$cols_ci)]][[1]] - + + tmp_range_ci <- res$est_indiv[[1]][, prepared$cols_ci[length(prepared$cols_ci)]][[1]] - res$est_indiv[[1]][, prepared$cols_ci[1]][[1]] res$est_indiv[[1]]$identifiable <- tmp_range_ci < MAX_CI_INDIV - + #### make est_group #### - - non_identified_pars <- res$est_indiv[[1]] %>% - dplyr::filter(!.data$identifiable) %>% - dplyr::group_by(.data$id) %>% - dplyr::summarise(parameter = 
paste0(.data$parameter, collapse = ", ")) %>% + + non_identified_pars <- res$est_indiv[[1]] %>% + dplyr::filter(!.data$identifiable) %>% + dplyr::group_by(.data$id) %>% + dplyr::summarise(parameter = paste0(.data$parameter, collapse = ", ")) %>% dplyr::ungroup() - - res$convergence <- + + res$convergence <- list(tibble::as_tibble(dplyr::left_join(convergence, non_identified_pars, by = "id"))) - + if (nrow(non_identified_pars) > 0) { warning("MPTinR-no: IDs and parameters with ", bootstrap, "-CIs > ", - MAX_CI_INDIV, " (i.e., non-identified):\n", - apply(non_identified_pars, + MAX_CI_INDIV, " (i.e., non-identified):\n", + apply(non_identified_pars, 1, function(x) paste0(x[["id"]], ": ", x[["parameter"]], "\n") ), - call. = FALSE) + call. = FALSE) } - + est_group <- res$est_indiv[[1]] %>% dplyr::filter(.data$identifiable) %>% ## exclude non-identified pars. dplyr::group_by(.data$condition, .data$parameter, .data$core) %>% dplyr::summarise(estN = mean(.data$est), - se = stats::sd(.data$est) / + se = stats::sd(.data$est) / sqrt(sum(!is.na(.data$est)))) %>% dplyr::ungroup() %>% dplyr::rename(est = .data$estN) for (i in seq_along(CI_SIZE)) { - est_group[, prepared$cols_ci[i]] <- est_group[,"est"] + + est_group[, prepared$cols_ci[i]] <- est_group[,"est"] + stats::qnorm(CI_SIZE[i])*est_group[,"se"] } - + res$est_group[[1]] <- dplyr::right_join(est_group, res$est_group[[1]][, c("condition", "parameter")], by = c("condition", "parameter")) - - # ---------------------------------------------------------------------------- + + # ---------------------------------------------------------------------------- # make test_between - - res$test_between[[1]] <- - test_between_no(test_between = res$test_between[[1]], - indiv_pars = res$est_indiv[[1]], + + res$test_between[[1]] <- + test_between_no(test_between = res$test_between[[1]], + indiv_pars = res$est_indiv[[1]], CI_SIZE = CI_SIZE, cols_ci = prepared$cols_ci) - - # ---------------------------------------------------------------------------- + + # ---------------------------------------------------------------------------- # make gof_group for parametric-bootstrap approach - + res$gof_group[[1]]$type <- paste0(bootstrap, "-G2") res$gof_group[[1]]$focus <- "mean" - + tmp <- fit_mptinr$goodness.of.fit$individual tmp$condition <- as.character(prepared$data$condition) gof_group <- tmp %>% - dplyr::group_by(.data$condition) %>% + dplyr::group_by(.data$condition) %>% dplyr::summarise(stat_obs = sum(.data$G.Squared), stat_df = sum(.data$df)) gof_group$p <- NA_real_ - + g2_all <- vapply(fit_pb, function(x) x$goodness.of.fit$individual$G.Squared, rep(0, MPTINR_OPTIONS$bootstrap_samples)) - + g2_cond <- vector("list", length(prepared$conditions)) - + for (i in seq_along(prepared$conditions)) { - g2_cond[[i]] <- apply(g2_all[ , - prepared$data$condition == + g2_cond[[i]] <- apply(g2_all[ , + prepared$data$condition == prepared$conditions[i]], 1, sum) - res$gof_group[[1]][ - res$gof_group[[1]]$condition == - prepared$conditions[i], "stat_obs" ] <- + res$gof_group[[1]][ + res$gof_group[[1]]$condition == + prepared$conditions[i], "stat_obs" ] <- gof_group[ gof_group$condition == prepared$conditions[i], "stat_obs"] - res$gof_group[[1]][ res$gof_group[[1]]$condition == - prepared$conditions[i], "stat_df" ] <- + res$gof_group[[1]][ res$gof_group[[1]]$condition == + prepared$conditions[i], "stat_df" ] <- gof_group[ gof_group$condition == prepared$conditions[i], "stat_df"] - res$gof_group[[1]][ res$gof_group[[1]]$condition == + res$gof_group[[1]][ 
res$gof_group[[1]]$condition == prepared$conditions[i], "p" ] <- - (sum(gof_group[ gof_group$condition == + (sum(gof_group[ gof_group$condition == prepared$conditions[i], "stat_obs"][[1]] < g2_cond[[i]]) + 1) / (MPTINR_OPTIONS$bootstrap_samples + 1) } - + gof <- tibble::tibble( type = "G2" , focus = "mean" @@ -564,10 +564,10 @@ get_pb_results <- function(dataset , stat_df = fit_mptinr$goodness.of.fit$sum$df , p = NA_real_ ) - + # Calculate *p* value from distribution of bootstrapped G2 values res$gof[[1]] <- gof - + g2_all <- vapply( X = fit_pb , FUN = function(x) x$goodness.of.fit$individual$G.Squared @@ -578,22 +578,22 @@ get_pb_results <- function(dataset , MARGIN = 1 , FUN = sum ) - + res$gof[[1]]$p <- (sum(res$gof[[1]]$stat_obs < g2_cond) + 1) / (MPTINR_OPTIONS$bootstrap_samples + 1) - + # write estimation time to results_row ---- res$estimation[[1]] <- tibble::tibble( condition = "individual" , time_difference = (t2 - t1) + additional_time ) - + parallel::stopCluster(cl) - + return(res) - + } @@ -611,7 +611,7 @@ get_pb_output <- function( , model_file , col_freq , MPTINR_OPTIONS) { - + gen_data <- MPTinR::gen.data( fit_mptinr$parameters$individual[,"estimates", i] , samples = MPTINR_OPTIONS$bootstrap_samples @@ -632,7 +632,7 @@ get_npb_output <- function( , model_file , col_freq , MPTINR_OPTIONS) { - + gen_data <- MPTinR::sample.data( data = unlist(data[i, col_freq]) , samples = MPTINR_OPTIONS$bootstrap_samples @@ -653,19 +653,19 @@ get_npb_output <- function( #' @importFrom magrittr %>% #' @keywords internal -mpt_mptinr_complete <- function(dataset, - prepared, +mpt_mptinr_complete <- function(dataset, + prepared, model, method, - id, + id, condition, core = NULL) { OPTIONS <- getOption("MPTmultiverse") MPTINR_OPTIONS <- OPTIONS$mptinr CI_SIZE <- OPTIONS$ci_size MAX_CI_INDIV <- OPTIONS$max_ci_indiv - - + + res <- make_results_row( model = model , dataset = dataset @@ -678,36 +678,36 @@ mpt_mptinr_complete <- function(dataset, , condition = condition , core = core ) - + res$est_indiv <- list(tibble::tibble()) res$gof_indiv <- list(tibble::tibble()) - + #### fully aggregated: - + t0 <- Sys.time() fit_mptinr_agg <- MPTinR::fit.mpt(colSums(prepared$data[,prepared$col_freq]), model.filename = model, n.optim = MPTINR_OPTIONS$n.optim, show.messages = FALSE, output = "full") - + res$estimation[[1]]$time_difference[ res$estimation[[1]]$condition == "complete_data" - ] <- Sys.time() - t0 + ] <- difftime(Sys.time(), t0, units = "secs") ## gof - + res$gof[[1]][1,"type"] <- "G2" res$gof[[1]][1,"focus"] <- "mean" - + res$gof[[1]][1, c("stat_obs", "stat_df", "p")] <- fit_mptinr_agg$goodness.of.fit[, c("G.Squared", "df", "p.value")] - + #### aggregated by condition - + res$gof_group[[1]][, "type"] <- "G2" res$gof_group[[1]][, "focus"] <- "mean" - + convergence <- vector("list", 1 + length(prepared$conditions)) names(convergence) <- c("aggregated", prepared$conditions) @@ -716,7 +716,7 @@ mpt_mptinr_complete <- function(dataset, , n.parameters = fit_mptinr_agg$model.info$n.parameters , convergence = fit_mptinr_agg$best.fits[[1]]$convergence ) - + for (i in seq_along(prepared$conditions)) { t0 <- Sys.time() @@ -726,98 +726,98 @@ mpt_mptinr_complete <- function(dataset, n.optim = MPTINR_OPTIONS$n.optim, show.messages = FALSE, output = "full") - + res$estimation[[1]]$time_difference[ res$estimation[[1]]$condition == prepared$conditions[i] - ] <- Sys.time() - t0 - + ] <- difftime(Sys.time(), t0, units = "secs") + res$gof_group[[1]][ - res$gof_group[[1]]$condition == + res$gof_group[[1]]$condition == 
prepared$conditions[i] , c("stat_obs", "stat_df", "p")] <- fit_mptinr_tmp$goodness.of.fit[, c("G.Squared", "df", "p.value")] - + res$est_group[[1]][ - res$est_group[[1]]$condition == prepared$conditions[i], "est"] <- + res$est_group[[1]]$condition == prepared$conditions[i], "est"] <- fit_mptinr_tmp$parameters[ res$est_group[[1]][res$est_group[[1]]$condition == prepared$conditions[i], ]$parameter, "estimates"] par_se <- rep(NA_real_, length(rownames(fit_mptinr_tmp$parameters))) par_se <- tryCatch(sqrt(diag(solve(fit_mptinr_tmp$hessian[[1]]))), - error = function(x) + error = function(x) sqrt(diag(limSolve::Solve(fit_mptinr_tmp$hessian[[1]])))) names(par_se) <- rownames(fit_mptinr_tmp$parameters) - + res$est_group[[1]][ - res$est_group[[1]]$condition == + res$est_group[[1]]$condition == prepared$conditions[i], "se" ] <- par_se[ res$est_group[[1]][ - res$est_group[[1]]$condition == + res$est_group[[1]]$condition == prepared$conditions[i], ]$parameter] - + convergence[[prepared$conditions[i]]] <- tibble::tibble( rank.fisher = fit_mptinr_tmp$model.info$rank.fisher , n.parameters = fit_mptinr_tmp$model.info$n.parameters , convergence = fit_mptinr_tmp$best.fits[[1]]$convergence ) } - + for (i in seq_along(CI_SIZE)) { res$est_group[[1]][, prepared$cols_ci[i]] <- res$est_group[[1]][,"est"] + stats::qnorm(CI_SIZE[i])*res$est_group[[1]][,"se"] } - + # ---------------------------------------------------------------------------- # test_between est_group <- res$est_group[[1]] test_between <- res$test_between[[1]] - + for (i in seq_len(nrow(test_between))) { - + p <- test_between$parameter[i] c1 <- test_between$condition1[i] c2 <- test_between$condition2[i] - test_between[i, "est_diff"] <- + test_between[i, "est_diff"] <- est_group[est_group$condition == c1 & est_group$parameter == p, ]$est - est_group[est_group$condition == c2 & est_group$parameter == p, ]$est - + # standard errors test_between[i, "se"] <- sqrt( (est_group[est_group$condition == c1 & est_group$parameter == p, ]$se)^2 + (est_group[est_group$condition == c2 & est_group$parameter == p, ]$se)^2 ) } - + # CIs from standard errors for(k in CI_SIZE) { - test_between[[paste0("ci_", k)]] <- test_between$est_diff + + test_between[[paste0("ci_", k)]] <- test_between$est_diff + test_between$se * qnorm(p = k) } - + res$test_between[[1]] <- test_between - + # ---------------------------------------------------------------------------- # convergence - + tmp <- names(convergence) convergence <- do.call("rbind", convergence) convergence <- dplyr::bind_cols(condition = tmp, convergence) - + res$convergence <- list(convergence) warn_conv <- convergence$convergence != 0 if (any(warn_conv)) { - warning("MPTinR-complete: Convergence code != 0 for: ", - paste0(names(warn_conv)[warn_conv], collapse = ", "), + warning("MPTinR-complete: Convergence code != 0 for: ", + paste0(names(warn_conv)[warn_conv], collapse = ", "), call. 
= FALSE) } - + # res$estimation[[1]] <- tibble::tibble( # condition = c("complete_data", names(t_cond)) # , time_difference = as.difftime(c(t_complete_data, unlist(t_cond))) # ) - + return(res) } diff --git a/R/plot_multiverseMPT.R b/R/plot_multiverseMPT.R index d886df8..2d88141 100644 --- a/R/plot_multiverseMPT.R +++ b/R/plot_multiverseMPT.R @@ -1,36 +1,36 @@ # plot_results <- function (results, save = TRUE, write.csv = TRUE){ -# +# # shapes <- c(16, 18, 15, 1, 0, 8, 11, 12) -# -# prefix <- paste0(gsub("\\.eqn", "", results$model[1]), "_", +# +# prefix <- paste0(gsub("\\.eqn", "", results$model[1]), "_", # gsub("\\.", "_", paste0(results$dataset[1],"_"))) -# +# # if (write.csv){ # readr::write_csv(tidyr::unnest(results, est_group), paste0(prefix,"estimates.csv")) # readr::write_csv(tidyr::unnest(results, gof), paste0(prefix,"gof.csv")) # } -# -# +# +# # dd <- ggplot2::position_dodge(w = .75) # gg_est1 <- tidyr::unnest(results, est_group) %>% -# ggplot2::ggplot(ggplot2::aes(y = est, x = parameter, +# ggplot2::ggplot(ggplot2::aes(y = est, x = parameter, # col=interaction(method, pooling,package), # shape=interaction(method, pooling, package))) + # ggplot2::facet_grid(.~condition) + -# ggplot2::geom_errorbar(ggplot2::aes(ymin = est-se, ymax = est+se), position = dd, +# ggplot2::geom_errorbar(ggplot2::aes(ymin = est-se, ymax = est+se), position = dd, # width = 0.4)+ -# ggplot2::geom_point(position = dd) + ylim(0,1) + +# ggplot2::geom_point(position = dd) + ylim(0,1) + # ggplot2::scale_shape_manual(values=shapes) + # ggplot2::theme_bw() # plot(gg_est1) # if(save) ggplot2::ggsave(paste0(prefix,"estimates.pdf"), gg_est1, h = 4.5, w = 10) -# -# -# res_between <- tidyr::unnest(results, test_between) +# +# +# res_between <- tidyr::unnest(results, test_between) # if (nrow(res_between) > 0){ -# +# # if (write.csv) readr::write_csv(tidyr::unnest(results, test_between), paste0(prefix,"test_between.csv")) -# +# # gg_est2 <- ggplot2::ggplot( # res_between # , ggplot2::aes( @@ -41,38 +41,38 @@ # ) # ) + # ggplot2::facet_grid(condition2~condition1) + -# ggplot2::geom_errorbar(ggplot2::aes(ymin = ci_0.025, ymax = ci_0.975), +# ggplot2::geom_errorbar(ggplot2::aes(ymin = ci_0.025, ymax = ci_0.975), # position = dd, width = 0.5) + -# ggplot2::geom_point(position = dd) + ylim(-1,1) + +# ggplot2::geom_point(position = dd) + ylim(-1,1) + # ggplot2::scale_shape_manual(values=shapes) + -# ggplot2::theme_bw() + +# ggplot2::theme_bw() + # ggplot2::geom_hline(yintercept = 0, lty = 2) -# +# # if(save) ggplot2::ggsave(paste0(prefix,"test_between.pdf"), gg_est2, h = 4.5, w = 8) # } -# -# +# +# # gg_gof1 <- tidyr::unnest(results, gof) %>% # # filter(focus == "mean") %>% -# ggplot2::ggplot(ggplot2::aes(y = p, -# x = interaction(method, pooling, package))) + -# ggplot2::geom_point() + ylim(0, 1) + +# ggplot2::ggplot(ggplot2::aes(y = p, +# x = interaction(method, pooling, package))) + +# ggplot2::geom_point() + ylim(0, 1) + # ggplot2::geom_hline(yintercept = .05, lty = 2)+ # ggplot2::theme_bw() + ggplot2::coord_flip() + # ggplot2::facet_wrap(~focus) + # ggplot2::ggtitle("Goodness of fit") # plot(gg_gof1) # if(save) ggplot2::ggsave(paste0(prefix,"gof.pdf"), gg_gof1, h = 4, w = 6) -# -# +# +# # if (nrow(res_between) > 0){ # if (write.csv) readr::write_csv(tidyr::unnest(results, gof_group), paste0(prefix,"gof_group.csv")) -# +# # gg_gof2 <- tidyr::unnest(results, gof_group) %>% -# ggplot(ggplot2::aes(y = p, -# x = interaction(method, pooling, package), -# col = condition)) + -# ggplot2::geom_point() + ylim(0, 1) + +# 
ggplot(ggplot2::aes(y = p, +# x = interaction(method, pooling, package), +# col = condition)) + +# ggplot2::geom_point() + ylim(0, 1) + # ggplot2::geom_hline(yintercept = .05, lty = 2)+ # ggplot2::theme_bw() + # ggplot2::coord_flip() + @@ -85,9 +85,9 @@ #' Plot multiverseMPT -#' +#' #' Plot the results from a multiverse MPT analysis. -#' +#' #' @param x An object of class \code{multiverseMPT}. #' @param which Character. Which information should be plotted? Possible #' values are @@ -97,189 +97,174 @@ #' \code{"gof2"} for group-wise goodness-of-fit statistics. #' @param save Logical. Indicates whether the plot should also be saved as a .pdf file. #' @param ... ignored. -#' +#' #' @importFrom rlang .data #' @importFrom magrittr %>% #' @importFrom graphics plot #' @export plot.multiverseMPT <- function(x, which = "est", save = FALSE, ...){ - + args <- list(...) - + if(is.null(args$shapes)) { shapes <- seq_len(nrow(x)) } else { shapes <- args$shapes } - - - results <- x - prefix <- paste0(gsub("\\.eqn", "", results$model[1]), "_", - gsub("\\.", "_", paste0(results$dataset[1],"_"))) - - # if (write.csv){ - # readr::write_csv(tidyr::unnest(results, .data$est_group), paste0(prefix,"estimates.csv")) - # readr::write_csv(tidyr::unnest(results, .data$gof), paste0(prefix,"gof.csv")) - # } - - + + prefix <- paste0(gsub("\\.eqn", "", x$model[1]), "_", + gsub("\\.", "_", paste0(x$dataset[1],"_"))) + + if("est" %in% which) { + gg_est1 <- plot_est(x, shapes = shapes) + if(save) ggplot2::ggsave(paste0(prefix, "estimates.pdf"), gg_est1, h = 4.5, w = 10) + return(gg_est1) + } + + if("test_between" %in% which){ + gg_est2 <- plot_test_between(x, shapes = shapes) + if(save) ggplot2::ggsave(paste0(prefix,"test_between.pdf"), gg_est2, h = 4.5, w = 8) + return(gg_est2) + } + + if("gof1" %in% which) { + gg_gof1 <- plot_gof1(x) + if(save) ggplot2::ggsave(paste0(prefix,"gof.pdf"), gg_gof1, h = 4, w = 6) + return(gg_gof1) + } + + if("gof2" %in% which) { + gg_gof2 <-plot_gof2(x) + if(save) ggplot2::ggsave(paste0(prefix, "gof_group.pdf"), gg_gof2, h = 4, w = 8) + return(gg_gof2) + } + if("gof" %in% which) { + gg_gof <- plot_gof(x) + if(save) ggplot2::ggsave(paste0(prefix, "gof_group.pdf"), gg_gof, h = 4, w = 7) + return(gg_gof) + } +} + + + +#' @keywords internal + +plot_est <- function(x, shapes, ...) { + + args <- list(...) + dd <- ggplot2::position_dodge(width = .75) - - est_group <- tidyr::unnest(data = results, .data$est_group) - est_group$approach <- interaction(est_group$method, est_group$pooling, est_group$package) - - gg_est1 <- - ggplot2::ggplot(est_group) + - ggplot2::aes_( - y = ~ est - , x = ~ parameter - , col = ~ approach - # , shape = shapes - ) + + + x %>% + tidyr::unnest(.data$est_group) %>% + dplyr::mutate(approach = interaction(.data$method, .data$pooling, .data$package, sep = " ")) %>% + ggplot2::ggplot() + + ggplot2::aes_(y = ~ est, x = ~ parameter, col = ~ approach, shape = ~ approach) + ggplot2::facet_grid(facets = ". 
~ condition") + - ggplot2::geom_errorbar(ggplot2::aes_(ymin = ~ci_0.025, ymax = ~ci_0.975), position = dd, width = 0.4) + - ggplot2::geom_point(position = dd) + + ggplot2::geom_errorbar( + ggplot2::aes_(ymin = ~ci_0.025, ymax = ~ci_0.975) + , position = dd + , width = 0.4 + ) + + ggplot2::geom_point(position = dd) + ggplot2::coord_cartesian(ylim = c(0, 1)) + - # ggplot2::ylim(0, 1) + + # ggplot2::ylim(0, 1) + ggplot2::scale_shape_manual(values = shapes) +} - if(save) ggplot2::ggsave(paste0(prefix, "estimates.pdf"), gg_est1, h = 4.5, w = 10) - - if("est" %in% which) - return(gg_est1) - - res_between <- tidyr::unnest(results, .data$test_between) - res_between$approach <- interaction(res_between$method, res_between$pooling, res_between$package) - - if (nrow(res_between) > 0){ - # if (write.csv) readr::write_csv(res_between, paste0(prefix,"test_between.csv")) - - - gg_est2 <- ggplot2::ggplot( - res_between - , ggplot2::aes_( - y = ~ est_diff, x = ~ parameter - , col = ~ approach - , shape = ~ approach - ) - ) + + + +#' @keywords internal + +plot_test_between <- function(x, shapes, ...) { + + test_between <- x %>% + tidyr::unnest(.data$test_between) + + if(nrow(test_between) == 0) { + warning( + "No between-subjects conditions present in your results. Therefore, no between-subjects comparisons were plotted." + , call. = FALSE + ) + return(ggplot2::ggplot()) + } + + dd <- ggplot2::position_dodge(width = .75) + + test_between %>% + dplyr::mutate(approach = interaction(.data$method, .data$pooling, .data$package, sep = " ")) %>% + ggplot2::ggplot() + + ggplot2::aes_(y = ~ est_diff, x = ~ parameter, col = ~ approach, shape = ~ approach) + ggplot2::facet_grid("condition2 ~ condition1") + ggplot2::geom_errorbar( - ggplot2::aes_( - ymin = ~ ci_0.025 - , ymax = ~ ci_0.975 - ) - , position = dd, width = 0.5 + ggplot2::aes_(ymin = ~ ci_0.025, ymax = ~ ci_0.975) + , position = dd + , width = 0.4 ) + - ggplot2::geom_point(position = dd) + ggplot2::ylim(-1, 1) + - ggplot2::scale_shape_manual(values=shapes) + + ggplot2::geom_point(position = dd) + ggplot2::ylim(-1, 1) + + ggplot2::scale_shape_manual(values = shapes) + ggplot2::geom_hline(yintercept = 0, lty = 2) +} - if(save) ggplot2::ggsave(paste0(prefix,"test_between.pdf"), gg_est2, h = 4.5, w = 8) - - if("test_between" %in% which){ - return(gg_est2) - } - } - - gof <- tidyr::unnest(results, .data$gof) - gof$approach <- interaction(gof$method, gof$pooling, gof$package) - - gg_gof1 <- - # filter(focus == "mean") %>% - ggplot2::ggplot( - gof, - ggplot2::aes_(y = ~ p, x = ~ approach) - ) + - ggplot2::geom_point() + - ggplot2::ylim(0, 1) + + + +#' @keywords internal + +plot_gof1 <- function(x, ...) 
{ + + x %>% + tidyr::unnest(.data$gof) %>% + dplyr::mutate(approach = interaction(.data$method, .data$pooling, .data$package, sep = " ")) %>% + ggplot2::ggplot() + + ggplot2::aes_(y = ~ p, x = ~ approach) + + ggplot2::geom_point() + + ggplot2::ylim(0, 1) + ggplot2::geom_hline(yintercept = .05, lty = 2)+ ggplot2::coord_flip() + ggplot2::facet_wrap( ~ focus) + ggplot2::ggtitle("Goodness of fit") - - if(save) ggplot2::ggsave(paste0(prefix,"gof.pdf"), gg_gof1, h = 4, w = 6) - - if("gof1" %in% which) - return(gg_gof1) - - - if (nrow(res_between) > 0){ - # if (write.csv) readr::write_csv(tidyr::unnest(results, .data$gof_group), paste0(prefix,"gof_group.csv")) - - gof_group <- tidyr::unnest(results, .data$gof_group) - gof_group$approach <- interaction(gof_group$method, gof_group$pooling, gof_group$package) - - gg_gof2 <- - ggplot2::ggplot( - gof_group - , ggplot2::aes_( - y = ~ p - , x = ~ approach - , col = ~ condition) - ) + - ggplot2::geom_point() + ggplot2::ylim(0, 1) + - ggplot2::geom_hline(yintercept = .05, lty = 2)+ - ggplot2::coord_flip() + - ggplot2::facet_wrap(~ focus) + - ggplot2::ggtitle("Goodness of fit") - if(save) ggplot2::ggsave(paste0(prefix, "gof_group.pdf"), gg_gof2, h = 4, w = 8) - - if("gof2" %in% which) - return(gg_gof2) +} + + + +#' @keywords internal + +plot_gof2 <- function(x, ...) { + + gof_group <- x %>% + tidyr::unnest(.data$gof_group) + + if(nrow(gof_group) == 0) stop("No between-subjects conditions present in your results.") + + gof_group %>% + dplyr::mutate(approach = interaction(.data$method, .data$pooling, .data$package, sep = " ")) %>% + ggplot2::ggplot() + + ggplot2::aes_(y = ~ p, x = ~ approach, col = ~ condition) + + ggplot2::geom_point() + + ggplot2::ylim(0, 1) + + ggplot2::geom_hline(yintercept = .05, lty = 2)+ + ggplot2::coord_flip() + + ggplot2::facet_wrap(~ focus) + + ggplot2::ggtitle("Goodness of fit") +} + + + +#' @keywords internal + +plot_gof <- function(x, ...) 
{ + gof_group <- x %>% + tidyr::unnest(.data$gof_group) + + n_conditions <- length(unique(gof_group$condition)) + + if(n_conditions > 1) { + plot_gof2(x) + } else { + plot_gof1(x) } } -# x <- results -# -# -# x <- tidyr::tidyr::unnest(x, est_group) -# -# y.values <- data.frame( -# condition = x$condition -# , method = paste0(x$method, "_", x$pooling, "_pooling") -# , parameter = x$parameter -# , tendency = x$est -# , dispersion = x$se -# , lower_limit = x$ci_0.025 -# , upper_limit = x$ci_0.025 -# ) -# -# # re-order factor -# y.values$method <- factor( -# y.values$method -# , levels = c( -# "asymptotic_complete_pooling" -# , "simple_complete_pooling" -# , "asymptotic_no_pooling" -# , "PB/MLE_no_pooling" -# , "simple_no_pooling" -# , "beta_partial_pooling" -# , "trait_uncorrelated_partial_pooling" -# , "trait_partial_pooling" -# ) -# ) -# -# x$method <- paste0(x$method, "_", x$pooling, "_pooling") -# -# par(mar = c(5, 4, 4, 14), las = 1) -# plot_options <- papaja:::apa_factorial_plot_single( -# aggregated = x -# , y.values = y.values -# , id = "id" -# , dv = "est" -# , factors = c("parameter", "method", "condition") -# , ylim = c(0, 1) -# , plot = c("points", "error_bars") -# , args_points = list( -# cex = .8 -# , pch = c(rep(21, 2), rep(22, 3), rep(23, 3)) -# , bg = c("seagreen4", "indianred3", "skyblue") -# ) -# , args_error_bars = list(length = .01) -# , jit = .5 -# , args_legend = list(x = "right", inset = -.5, xpd = NA) -# ) diff --git a/R/treebugs.R b/R/treebugs.R index a10b5d3..4575da1 100644 --- a/R/treebugs.R +++ b/R/treebugs.R @@ -13,7 +13,7 @@ aggregate_ppp <- function(ppp_list, stat = "T1"){ #' @importFrom rlang .data #' @importFrom magrittr %>% - + mpt_treebugs <- function ( method , dataset @@ -24,10 +24,10 @@ mpt_treebugs <- function ( , core = NULL ){ all_options <- getOption("MPTmultiverse") - + TREEBUGS_MCMC <- all_options$treebugs CI_SIZE <- all_options$ci_size - + # dlist <- prepare_data(model, data, id = "id", condition = "condition") conditions <- unique(data[[condition]]) parameters <- as.character(MPTinR::check.mpt(model)$parameters) @@ -35,13 +35,13 @@ mpt_treebugs <- function ( data$id <- data[[id]] data$condition <- data[[condition]] - + freq_list <- split(data[, col_freq, drop = FALSE], f = data[[condition]]) - pooling <- switch(method, - "simple" = "no", + pooling <- switch(method, + "simple" = "no", "simple_pooling" = "complete", "partial") - + result_row <- make_results_row(model = model, dataset = dataset, pooling = pooling, @@ -52,7 +52,7 @@ mpt_treebugs <- function ( id = id, condition = condition, core = core) - + # Homogeneity tests result_row$test_homogeneity[[1]] <- dplyr::bind_rows( lapply(X = freq_list, FUN = function(x) { @@ -65,68 +65,88 @@ mpt_treebugs <- function ( }) , .id = "condition" ) - - + + if (method == "simple_pooling"){ method <- "simple" - + # pooling: aggregate across participants data <- stats::aggregate(data[, col_freq], list(condition = data$condition), sum) data[[condition]] <- data$condition data[[id]] <- data$id <- as.character(1:nrow(data)) - if(condition!="condition"){ + if(condition != "condition"){ data$condition <- NULL } - + freq_list <- lapply(freq_list, function(x) as.matrix(colSums(x))) - } + } + + # ---------------------------------------------------------------------------- + # Customize prior if necessary if (method == "trait_uncorrelated"){ method <- "trait" prior_args <- list(df = 1, V = NA, xi = "dnorm(0,1)") } else { prior_args <- NULL } - + + if (method == "beta") { + prior_args <- list( + alpha = 
TREEBUGS_MCMC$prior.beta, + beta = TREEBUGS_MCMC$prior.beta + ) + } + + # ---------------------------------------------------------------------------- + # Run MCMC function from TreeBUGS gof_group <- list() treebugs_fit <- list() - + for (i in seq_along(conditions)){ cond <- conditions[i] sel_condition <- data[[condition]] == conditions[i] data_group <- data[sel_condition, col_freq] #freq_list[[i]] rownames(data_group) <- data[[id]][sel_condition] - - fit_args <- list(eqnfile=model, + + fit_args <- list(eqnfile = model, data = data_group, n.chains = TREEBUGS_MCMC$n.chains, n.iter = TREEBUGS_MCMC$n.iter, n.adapt = TREEBUGS_MCMC$n.adapt, n.burnin = TREEBUGS_MCMC$n.burnin, n.thin = TREEBUGS_MCMC$n.thin) + if (method == "betacpp" && gsub(TREEBUGS_MCMC$prior.beta, pattern = " |\t", replacement = "") != "dgamma(1,.1)") + stop( + 'betacpp not compatible with custom priors as defined via mpt_options(prior.beta = "', + TREEBUGS_MCMC$prior.beta, + '")' + ) if (method %in% c("simple", "betacpp")){ - fit_args["n.adapt"] <- NULL + fit_args$n.adapt <- fit_args$alpha <- fit_args$beta <- NULL fit_args <- c(fit_args, cores = unname(all_options$n.CPU)) } - # print(c(fit_args, prior_args)) + + t0 <- Sys.time() - treebugs_function <- ifelse(method == "betacpp", + treebugs_function <- ifelse(method == "betacpp", "TreeBUGS::betaMPTcpp", paste0("TreeBUGS::", method, "MPT")) - treebugs_fit[[i]] <- do.call(eval(parse(text = treebugs_function)), + treebugs_fit[[i]] <- do.call(eval(parse(text = treebugs_function)), args = c(fit_args, prior_args)) summ <- treebugs_fit[[i]]$mcmc.summ - - # continue MCMC sampling (only for betaMPT and traitMPT) + + # -------------------------------------------------------------------------- + # adaptively continue MCMC sampling (only available for betaMPT and traitMPT) ext_cnt <- 0 try({ while ( ext_cnt < TREEBUGS_MCMC$extend_max && method %in% c("beta", "trait") && (any(stats::na.omit(summ[,"Rhat"]) > TREEBUGS_MCMC$Rhat_max) || any(summ[summ[,"n.eff"] > 0,"n.eff"] < TREEBUGS_MCMC$Neff_min, na.rm = TRUE)) ){ - cat("Drawing additional samples for method = ", method, + cat("Drawing additional samples for method = ", method, ". 
max(Rhat) = ", round(max(stats::na.omit(summ[summ[,"Rhat"] > 0,"Rhat"])), 2), " ; min(n.eff) = ", round(min(summ[summ[,"n.eff"] > 0,"n.eff"], na.rm = TRUE), 1), "\n") - + treebugs_fit[[i]] <- TreeBUGS::extendMPT(treebugs_fit[[i]], n.iter = TREEBUGS_MCMC$n.iter, n.adapt = TREEBUGS_MCMC$n.adapt) @@ -134,28 +154,29 @@ mpt_treebugs <- function ( ext_cnt <- ext_cnt + 1 } }) - + result_row$estimation[[1]]$time_difference[ result_row$estimation[[1]]$condition == cond - ] <- Sys.time() - t0 + ] <- difftime(Sys.time(), t0, units = "secs") # convergence summary (n.eff / Rhat / all estimates) - tsum <- tibble::as_tibble(summ) %>% + tsum <- tibble::as_tibble(summ) %>% dplyr::mutate(parameter = rownames(summ), - condition = as.character(cond)) %>% + condition = as.character(cond)) %>% dplyr::select(.data$condition, .data$parameter, .data$Mean : .data$Rhat) + result_row$convergence[[1]] <- dplyr::bind_rows(result_row$convergence[[1]], tsum) - + # parameter estimates summMPT <- TreeBUGS::summarizeMPT(treebugs_fit[[i]]$runjags$mcmc, mptInfo = treebugs_fit[[i]]$mptInfo, probs = CI_SIZE, summ = treebugs_fit[[i]]$mcmc.summ) - + sel_group <- result_row$est_group[[1]]$condition == conditions[i] result_row$est_group[[1]][sel_group,-(1:3)] <- summMPT$groupParameters$mean[paste0("mean_", parameters),1:6] - + if (pooling != "complete"){ # # old: array filled into data frame # result_row$est_indiv[[1]][sel_ind,-(1:4)] <- @@ -163,7 +184,7 @@ mpt_treebugs <- function ( sel_ind <- result_row$est_indiv[[1]]$condition == conditions[i] dimnames(summMPT$individParameters)$ID <- rownames(data_group) tmp <- summMPT$individParameters[parameters,,1:(2+length(CI_SIZE)), drop = FALSE] %>% - reshape2::melt() %>% + reshape2::melt() %>% tidyr::spread("Statistic", "value") tmp$identifiable <- NA @@ -177,7 +198,7 @@ mpt_treebugs <- function ( dplyr::select("id", "condition", "parameter", "core"), tmp, by = c("parameter", "id", condition = condition)) } - + gof_group[[i]] <- TreeBUGS::PPP(treebugs_fit[[i]], M = TREEBUGS_MCMC$n.PPP, type = "G2", T2 = pooling != "complete", nCPU = all_options$n.CPU) @@ -193,7 +214,7 @@ mpt_treebugs <- function ( stat_pred = mean(gof_group[[i]]$T1.pred), p = gof_group[[i]]$T1.p ) - + if (pooling != "complete"){ result_row$gof_group[[1]] <- tibble::add_row(result_row$gof_group[[1]], condition = cond, @@ -201,7 +222,7 @@ mpt_treebugs <- function ( stat_obs = mean(gof_group[[i]]$T2.obs), stat_pred = mean(gof_group[[i]]$T2.pred), p = gof_group[[i]]$T2.p) - + sel_fog_ind <- result_row$gof_indiv[[1]]$condition == conditions[i] result_row$gof_indiv[[1]][sel_fog_ind,] <- result_row$gof_indiv[[1]][sel_fog_ind,] %>% @@ -214,33 +235,33 @@ mpt_treebugs <- function ( p = gof_group[[i]]$ind.T1.p) } } - + # between subject comparisons if (length(conditions) > 1){ for (i in 1:(length(conditions) - 1)){ for (j in 2:length(conditions)){ for(p in parameters){ - test_between <- TreeBUGS::betweenSubjectMPT(treebugs_fit[[i]], treebugs_fit[[j]], + test_between <- TreeBUGS::betweenSubjectMPT(treebugs_fit[[i]], treebugs_fit[[j]], par1 = p, stat = "x-y") - test_summ <- TreeBUGS::summarizeMCMC(test_between$mcmc, - probs = CI_SIZE, + test_summ <- TreeBUGS::summarizeMCMC(test_between$mcmc, + probs = CI_SIZE, batchSize = 2) bayesp <- mean(do.call("rbind", test_between$mcmc) <= 0) - - sel_row <- + + sel_row <- result_row$test_between[[1]]$parameter == p & result_row$test_between[[1]]$condition1 == conditions[i] & result_row$test_between[[1]]$condition2 == conditions[j] - - result_row$test_between[[1]][sel_row,-(1:4)] <- - 
c(test_summ[,c("Mean", "SD")], + + result_row$test_between[[1]][sel_row,-(1:4)] <- + c(test_summ[,c("Mean", "SD")], p = ifelse(bayesp > .5, 1 - bayesp, bayesp) * 2, # two-sided Bayesian p values test_summ[,2 + seq_along(CI_SIZE)]) } } } } - + # don't save T2 if complete pooling was used ---- # Why? I think it would be worthwhile # Daniel: T2 refers to the covariance matrix, which is not defined for aggregated frequencies. @@ -250,9 +271,9 @@ mpt_treebugs <- function ( } result_row$gof[[1]]$type <- c("T1", if(pooling!="complete"){"T2"}) result_row$gof[[1]]$focus <- c("mean", if(pooling!="complete"){"cov"}) - + result_row$gof[[1]][1,-(1:2)] <- aggregate_ppp(gof_group) - + # estimation_time <- unlist(estimation_time) # result_row$estimation[[1]] <- tibble::tibble( @@ -264,7 +285,7 @@ mpt_treebugs <- function ( parnames <- coda::varnames(treebugs_fit[[1]]$runjags$mcmc) par_mat <- expand.grid("parameter1" = parameters, "parameter2" = parameters, stringsAsFactors = FALSE) - + # Parameter correlations & fungibility sel_rho <- grep("rho[", parnames, fixed = TRUE, value = TRUE) sel_mean <- grep("mean[", parnames, fixed = TRUE, value = TRUE) @@ -275,11 +296,11 @@ mpt_treebugs <- function ( batchSize = 2) bayesp <- colMeans(do.call("rbind", mcmc) <= 0) res <- data.frame(par_mat, - rho_summ[,c("Mean", "SD")], + rho_summ[,c("Mean", "SD")], p = ifelse(bayesp > .5, 1 - bayesp, bayesp) * 2, # two-sided Bayesian p values rho_summ[,2 + seq_along(CI_SIZE)]) colnames(res)[3:9] <- colnames(result_row$est_rho[[1]])[6:12] - + samples <- do.call("rbind", treebugs_fit[[i]]$runjags$mcmc[,sel_mean]) colnames(samples) <- parameters rmat <- stats::cor(samples) @@ -291,13 +312,13 @@ mpt_treebugs <- function ( result_row$est_rho[[1]]$condition == conditions[i] if (sum(sel_row) > 0){ result_row$est_rho[[1]][sel_row,-(1:5)] <- res[j,-(1:2)] - result_row$fungibility[[1]][sel_row, "correlation"] <- rmat[par_mat$parameter1[j], + result_row$fungibility[[1]][sel_row, "correlation"] <- rmat[par_mat$parameter1[j], par_mat$parameter2[j]] } } } } - + # save model objects to the working directory if requested by user ---- if(all_options$save_models){ dataset_name <- gsub("^.*[/\\]","", dataset) diff --git a/build/vignette.rds b/build/vignette.rds index ccc7bab217141d9005fc67f19a409005276b617b..d1d1f5cba599f27185f8832c2eedafe7ebfc29e3 100644 GIT binary patch delta 254 zcmVajOLy@)8{p*Kl)M-0u!lFhn$^Ovi$)-5$H zI53k*nCG1t^3fuM_#}ut;s=-p9Sjlr6S_x2(!{gNl{Gpo63exE{bYMktFK}sBvWeD zf9v&PBU6Xz&$eK;97?L!bUI(sX(24fw;bN-K+(~T$xHwpWOte139FPHg_pZsH&zM< zdcF%SaNo{__$JXp|ChppViIykwlRbZR9ZbB|L47vgFH@PUush`H#O0Aho zKbK%W9A8s!;m+_!|7QNC(Pdok(5`;>fZ>GZE}96IyNk9`W?8&8$X9cp0q8^3qaOkQ E0AZ_ns{jB1 delta 253 zcmVux&yQ0P*A<5li7k!a$z~XW$;P+iVpWorUGa`O?Cy3S*h$Oyg%%^u~InD z>qBUc`?kM{9}+F}Zxm{ZNyr@8rvap(lJfcVKkuC# + - + Overview of MPT Multiverse: An Example Application @@ -70,7 +71,7 @@ if (rule.type !== rule.STYLE_RULE || rule.selectorText !== "div.sourceCode") continue; var style = rule.style.cssText; // check if color or background-color is set - if (rule.style.color === '' || rule.style.backgroundColor === '') continue; + if (rule.style.color === '' && rule.style.backgroundColor === '') continue; // replace div.sourceCode by a pre.sourceCode rule sheets[i].deleteRule(j); sheets[i].insertRule('pre.sourceCode{' + style + '}', j); @@ -83,6 +84,9 @@ + + + @@ -91,8 +95,8 @@

Overview of MPT Multiverse: An Example Application

-

Marius Barth and Henrik Singmann

-

2019-03-11

+

Marius Barth and Henrik Singmann

+

2019-08-27

@@ -161,7 +165,7 @@

Example Data: Bayen & Kuhlmann (2011)

## We then plot the response frequencies using plotFreq from the TreeBUGS package TreeBUGS::plotFreq(data, boxplot = FALSE, eqn = EQN_FILE)
-

+

A look at the data.frame tells us which columns contain the subject identifier and the variable encoding group membership. We need to record these variables for use in fit_mpt.
The plot of the individual response frequencies shows considerable individual variability, but nothing concerning.
Next, we prepare the data for fitting.
@@ -230,8 +234,7 @@

Checking the Fit

check_results(results)
 #> ## MPTinR: no pooling
 #> Based on asymptotic method, proportion of participants with non-identified parameters:
-#> # Groups:   condition [2]
-#>   <chr>     <lgl>      <dbl>
+#>   condition core  proportion
 #> 1 load      FALSE     0     
 #> 2 load      TRUE      0.0417
 #> 3 no_load   FALSE     0.0208
@@ -244,8 +247,7 @@ 

Checking the Fit

#> 3 no_load TRUE d 1 #> #> Based on PB/MLE method, proportion of participants with non-identified parameters: -#> # Groups: condition [2] -#> <chr> <lgl> <dbl> +#> condition core proportion #> 1 load FALSE 0 #> 2 load TRUE 0.354 #> 3 no_load FALSE 0.0625 @@ -257,8 +259,7 @@

Checking the Fit

#> 2 no_load FALSE g 3 #> #> Based on NPB/MLE method, proportion of participants with non-identified parameters: -#> # Groups: condition [2] -#> <chr> <lgl> <dbl> +#> condition core proportion #> 1 load FALSE 0 #> 2 load TRUE 0.333 #> 3 no_load FALSE 0.0625 @@ -274,34 +275,34 @@

Checking the Fit

#> No convergence problems. #> #> -#> ## TreeBUGS, no, simple: -#> All Rhat < 10 . -#> All effect sample sizes > 2 . +#> ## 2HTSM_Submodel4.eqn // Kuhlmann_dl7.csv // no // TreeBUGS // simple: +#> All Rhat < 10 . +#> All effect sample sizes > 2 . #> #> -#> ## TreeBUGS, complete, simple: -#> All Rhat < 10 . -#> All effect sample sizes > 2 . +#> ## 2HTSM_Submodel4.eqn // Kuhlmann_dl7.csv // complete // TreeBUGS // simple: +#> All Rhat < 10 . +#> All effect sample sizes > 2 . #> #> -#> ## TreeBUGS, partial, trait: -#> All Rhat < 10 . -#> All effect sample sizes > 2 . +#> ## 2HTSM_Submodel4.eqn // Kuhlmann_dl7.csv // partial // TreeBUGS // trait: +#> All Rhat < 10 . +#> All effect sample sizes > 2 . #> #> -#> ## TreeBUGS, partial, trait_uncorrelated: -#> All Rhat < 10 . -#> All effect sample sizes > 2 . +#> ## 2HTSM_Submodel4.eqn // Kuhlmann_dl7.csv // partial // TreeBUGS // trait_uncorrelated: +#> All Rhat < 10 . +#> All effect sample sizes > 2 . #> #> -#> ## TreeBUGS, partial, beta: -#> All Rhat < 10 . -#> All effect sample sizes > 2 . +#> ## 2HTSM_Submodel4.eqn // Kuhlmann_dl7.csv // partial // TreeBUGS // beta: +#> All Rhat < 10 . +#> All effect sample sizes > 2 . #> #> -#> ## TreeBUGS, partial, betacpp: -#> All Rhat < 10 . -#> All effect sample sizes > 2 .
+#> ## 2HTSM_Submodel4.eqn // Kuhlmann_dl7.csv // partial // TreeBUGS // betacpp: +#> All Rhat < 10 . +#> All effect sample sizes > 2 .

In this example, for the no-pooling asymptotic approaches the rate of participants with non-identified parameters is very low. For the bootstrap-based approaches the pattern of results is different: the rate of participants with non-identified parameters is considerably higher in the load condition, around .17, versus .03 in the no_load condition. In particular, the \(d\) parameter shows problematic behavior.
For the Bayesian approaches, the betacpp method did not reach an effective sample size of \(\mathit{ESS} > 2{,}000\). Increasing the number of iterations via mpt_options(n.iter = 2e5) and re-running the analysis should solve this problem.
@@ -368,16 +369,16 @@

Plotting Results

The analysis output results is an object of class multiverseMPT, that has its own plot() method. The plot method returns ggplot2 objects, which allows further customization (such as using themes). Type ?plot.multiverseMPT to see the documentation of possible arguments to this method.

To plot group-level parameter estimates use:

plot(results, save = FALSE, "est")
-

+

To plot between-subjects comparisons across all parameters:

plot(results, save = FALSE, "test_between")
-

+

To plot overall goodness-of-fit use:

plot(results, save = FALSE, "gof1")
-

+

To plot group-wise goodness-of-fit use:

plot(results, save = FALSE, "gof2")
-

+

@@ -413,6 +414,9 @@

References

+ + +