-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/check up to date #338
Changes from all commits
76d0bd6
4fdbbdc
cca4094
9ae5f7a
5b08369
e286aa7
80c51cb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
#' Check model and data up to date with outputs | ||
#' | ||
#' Functions for checking that the model outputs on disk match the md5 hashes | ||
#' stored in `bbi_config.json` at run time. In other words, checking that the | ||
#' model and data files have _not_ changed since the model was last run. See | ||
#' Details section for what specific files are checked. | ||
#' | ||
#' @details | ||
#' Different files are checked depending on what type of model is being checked. | ||
#' | ||
#' **For NONMEM models** | ||
#' | ||
#' * The model file (control stream) | ||
#' | ||
#' * The data file (referenced in `$DATA` within the control stream) | ||
#' | ||
#' **Currently only NONMEM implemented.** | ||
#' | ||
#' @param .bbi_object the object to check. Could be | ||
#' a `bbi_{.model_type}_model` object, | ||
#' a `bbi_{.model_type}_summary` object, | ||
#' or a `bbi_log_df` tibble. | ||
#' @param ... Arguments passed through (currently none). | ||
#' | ||
#' @return | ||
#' | ||
#' **`bbi_model`** method invisibly returns a logical vector of length 2. The | ||
#' first element (named `"model"`) refers to the model files mentioned above. | ||
#' The second element (named `"data"`) refers to the data files mentioned above. | ||
#' For both elements, they will be `TRUE` if nothing has changed, `FALSE` if | ||
#' anything has changed. Note: _if no file exists_ at the specified path, | ||
#' `FALSE` will be returned because that is technically a "change." The file | ||
#' used to exist and now it does not. | ||
#' | ||
#' **`bbi_log_df`** method invisibly returns a named list of lists, with one | ||
#' element for each row in the input tibble, with the name corresponding to the | ||
#' value in the `run` column for that row. Each element of the list will contain | ||
#' the two-element list returned from the `bbi_model` method (described above) | ||
#' for the relevant model. | ||
#' | ||
#' There is no `add_up_to_date()` function because **if you would like to add | ||
#' these columns to a `bbi_log_df` tibble** you can use [add_config()], which | ||
#' contains `model_has_changed` and `data_has_changed` columns. Please note: | ||
#' these contain the opposite boolean values (`check_up_to_date()` returns | ||
#' `TRUE` if up to date, `*_has_changed` returns `TRUE` if _changed_). | ||
#' | ||
#' **The returned value is invisible because a message is printed** alerting the | ||
#' user of the specific files that have changed, if any. This facilitates | ||
#' calling the function for this side effect without explicitly handling the | ||
#' returned value. | ||
#' | ||
#' @export | ||
check_up_to_date <- function(.bbi_object, ...) { | ||
UseMethod("check_up_to_date") | ||
} | ||
|
||
#' @export | ||
check_up_to_date.bbi_nonmem_model <- function(.bbi_object, ...) { | ||
check_up_to_date_nonmem(.bbi_object) | ||
} | ||
|
||
#' @export | ||
check_up_to_date.bbi_nonmem_summary <- function(.bbi_object, ...) { | ||
check_up_to_date_nonmem(.bbi_object) | ||
} | ||
|
||
#' @export | ||
check_up_to_date.bbi_log_df <- function(.bbi_object, ...) { | ||
|
||
check_list <- map(.bbi_object[[ABS_MOD_PATH]], function(.p) { | ||
tryCatch( | ||
check_up_to_date(read_model(.p)), | ||
error = function(.e) { | ||
.error_msg <- paste(as.character(.e$message), collapse = " -- ") | ||
if (grepl(CHECK_UP_TO_DATE_ERR_MSG, .error_msg, fixed = TRUE)) { | ||
message(.error_msg) | ||
return(as.logical(c(model = NA, data = NA))) | ||
} else { | ||
stop(.e) | ||
} | ||
} | ||
) | ||
}) | ||
|
||
names(check_list) <- .bbi_object[[RUN_ID_COL]] | ||
return(invisible(check_list)) | ||
} | ||
|
||
#################################### | ||
# PRIVATE implementation functions | ||
#################################### | ||
|
||
|
||
#' Private implementation to check that NONMEM model is up-to-date | ||
#' | ||
#' Specifically, check that control stream and data file on disk have not | ||
#' changed since the model was run. This is accomplished by taking their md5 | ||
#' hashes and comparing it to the hashes stored in `bbi_config.json`. | ||
#' | ||
#' @importFrom jsonlite fromJSON | ||
#' @importFrom fs file_exists | ||
#' @importFrom tidyr replace_na | ||
#' | ||
#' @inheritParams check_up_to_date | ||
#' | ||
#' @keywords internal | ||
check_up_to_date_nonmem <- function(.mod) { | ||
config_path <- file.path(get_output_dir(.mod, .check_exists = FALSE), "bbi_config.json") | ||
if (!fs::file_exists(config_path)) { | ||
stop(paste(glue("Model {get_model_id(.mod)}:"), CHECK_UP_TO_DATE_ERR_MSG)) | ||
} | ||
config <- jsonlite::fromJSON(config_path) | ||
|
||
# check necessary files for changes | ||
model_file <- get_model_path(.mod) | ||
data_file <- get_data_path(.mod) | ||
|
||
changed_files <- c( | ||
config[[CONFIG_MODEL_MD5]] != tools::md5sum(model_file), | ||
config[[CONFIG_DATA_MD5]] != tools::md5sum(data_file) | ||
) | ||
|
||
any_changes <- any(changed_files) | ||
if(isTRUE(any_changes)) { | ||
message(paste( | ||
glue("The following files have changed in {get_model_id(.mod)}"), | ||
paste("*", names(which(changed_files)), collapse = "\n"), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe this is just a me thing but it's kinda weird to mix
paste works so this isn't wrong I just don't see why we are using it if we have already imported the glue package There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yea, I see that. I always just use paste unless I explicitly need the |
||
sep = "\n" | ||
)) | ||
} | ||
|
||
na_files <- is.na(changed_files) | ||
if(isTRUE(any(na_files))) { | ||
message(paste( | ||
glue("The following files in {get_model_id(.mod)} ARE NO LONGER PRESENT"), | ||
paste("*", names(changed_files[na_files]), collapse = "\n"), | ||
sep = "\n" | ||
)) | ||
} | ||
|
||
# build return value | ||
res <- replace_na(!changed_files, FALSE) | ||
names(res) <- c("model", "data") | ||
|
||
return(invisible(res)) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -118,7 +118,6 @@ config_log_impl <- function(.mods) { | |
#' | ||
#' @param path A string giving the path to `bbi_config.json`. | ||
#' @param fields A character vector of fields to include. | ||
#' @param model_path_field A string giving the field name for the model path. | ||
#' | ||
#' @return A list whose elements include | ||
#' | ||
|
@@ -137,28 +136,27 @@ config_log_impl <- function(.mods) { | |
#' | ||
#' @keywords internal | ||
config_log_entry <- function(path, | ||
fields = CONFIG_KEEPERS, | ||
model_path_field = ABS_MOD_PATH) { | ||
fields = CONFIG_KEEPERS) { | ||
checkmate::assert_file_exists(path) | ||
checkmate::assert_character(fields) | ||
checkmate::assert_string(model_path_field) | ||
|
||
cfg_mod <- read_model_from_config(path) | ||
config <- jsonlite::fromJSON(path) | ||
|
||
if (!all(fields %in% names(config))) { | ||
msg <- glue::glue( | ||
glue::glue( | ||
msg <- paste( | ||
glue( | ||
"{path} is missing the required keys:", | ||
"`{paste(fields[!(fields %in% names(config))], collapse = ', ')}`", | ||
"and will be skipped.", | ||
.sep = " " | ||
), | ||
glue::glue( | ||
glue( | ||
"This is likely because it was run with an old version of bbi.", | ||
"Model was run on version {config[['bbi_version']]}", | ||
.sep = " " | ||
), | ||
glue::glue( | ||
glue( | ||
"User can call `bbi_current_release()` to see the most recent release", | ||
"version, and call `use_bbi(options('bbr.bbi_exe_path'))` to", | ||
"upgrade to the version.", | ||
|
@@ -171,32 +169,16 @@ config_log_entry <- function(path, | |
return(NULL) | ||
} | ||
|
||
# the model_path field may not exist, e.g., it could be the home directory of | ||
# the user who ran the model, so we cannot use it directly | ||
output_dir <- dirname(path) | ||
model_path <- fs::path_ext_set(output_dir, config[["model_extension"]]) | ||
matches <- suppressMessages(check_up_to_date(cfg_mod)) | ||
|
||
data_path <- fs::path_norm( | ||
file.path( | ||
output_dir, | ||
config[["data_path"]] | ||
) | ||
) | ||
|
||
matches <- purrr::map2_lgl( | ||
c(model_path, data_path), | ||
c(config[["model_md5"]], config[["data_md5"]]), | ||
file_matches | ||
) | ||
|
||
config[["model_has_changed"]] <- !matches[1] | ||
config[["data_has_changed"]] <- !matches[2] | ||
config[["nm_version"]] <- resolve_nonmem_version(config) | ||
config[[model_path_field]] <- output_dir | ||
config[["model_has_changed"]] <- as.logical(!matches["model"]) # use as.logical to strip off names | ||
config[["data_has_changed"]] <- as.logical(!matches["data"]) # use as.logical to strip off names | ||
config[["nm_version"]] <- resolve_nonmem_version(config) %||% NA_character_ | ||
config[[ABS_MOD_PATH]] <- cfg_mod[[ABS_MOD_PATH]] | ||
|
||
out_fields <- c( | ||
ABS_MOD_PATH, | ||
fields, | ||
model_path_field, | ||
"nm_version", | ||
"model_has_changed", | ||
"data_has_changed" | ||
|
@@ -205,27 +187,6 @@ config_log_entry <- function(path, | |
config[out_fields] | ||
} | ||
|
||
#' Compare a file to an MD5 sum | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this moved to utils.R |
||
#' | ||
#' @param path String giving the path to the file. | ||
#' @param md5 String giving expected MD5 sum. | ||
#' | ||
#' @return `TRUE` if `path` matches `md5`, otherwise `FALSE` (including if | ||
#' `path` doesn't exist). | ||
#' | ||
#' @keywords internal | ||
file_matches <- function(path, md5) { | ||
checkmate::assert_string(path) | ||
checkmate::assert_string(md5) | ||
|
||
if (file.exists(path)) { | ||
res <- tools::md5sum(path) == md5 | ||
} else { | ||
res <- FALSE | ||
} | ||
|
||
res | ||
} | ||
|
||
#' Determine the NONMEM version used | ||
#' | ||
|
@@ -247,3 +208,41 @@ resolve_nonmem_version <- function(x) { | |
} | ||
ver | ||
} | ||
|
||
#' Create a model object from a bbi_config.json path | ||
#' | ||
#' This is non-trivial because, while configs | ||
#' always sit in the ouput directory, for NONMEM models | ||
#' this is one dir deeper than the YAML and for Stan | ||
#' models this is two dirs deeper than the YAML. | ||
#' Hence, this is abstracted into an obnoxious private | ||
#' helper function. | ||
#' @importFrom fs path_norm path_ext_set file_exists | ||
#' @importFrom stringr str_detect | ||
#' @importFrom checkmate assert_file_exists assert_true | ||
#' @param .config_path The absolute path a `bbi_config.json` file | ||
#' @return a `bbi_{.model_type}_model` object | ||
#' @keywords internal | ||
read_model_from_config <- function(.config_path) { | ||
checkmate::assert_true(stringr::str_detect(.config_path, "bbi_config\\.json$")) | ||
checkmate::assert_file_exists(.config_path) | ||
|
||
potential_nm_path <- file.path(.config_path, "..") %>% | ||
fs::path_norm() %>% | ||
fs::path_ext_set("yaml") | ||
|
||
potential_stan_path <- file.path(.config_path, "..", "..") %>% | ||
fs::path_norm() %>% | ||
fs::path_ext_set("yaml") | ||
|
||
winner <- names(which(fs::file_exists(c(potential_nm_path, potential_stan_path)))) | ||
|
||
if (length(winner) != 1) { | ||
dev_error(glue("read_model_from_config() checked {potential_nm_path} and {potential_stan_path} and the following exist: {paste(winner, collapse = ', ')}")) | ||
} | ||
|
||
mod <- suppressMessages( | ||
read_model(tools::file_path_sans_ext(winner)) | ||
) | ||
return(mod) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, the bbi_nonmem_summary and bbi_nonmem_model methods are identical so maybe consider just setting this as the default method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the point of this is prepare for adding Stan stuff so, even though these are the same, they're not gonna be the default methods. I considered adding another parent class that they would share, but that seemed a little excessive to have something like
(
"bbi_model"
is already the parent class that they'll share with Stan and other kinds of models)Anyway, that would be a totally valid way to do it, but I opted for just duplicating a couple dispatches instead (and made them use the same private helper to avoid copy/paste).