
Commit

Merge pull request #106 from epiforecasts/feature-missing-reference-dates-preprocessing

Feature missing reference dates preprocessing
seabbs committed Jul 8, 2022
2 parents 3dc4ac0 + d314b7f commit 47eae4a
Showing 46 changed files with 716 additions and 214 deletions.
1 change: 1 addition & 0 deletions .devcontainer/Dockerfile
@@ -47,6 +47,7 @@ RUN apt-get update \
RUN Rscript -e 'devtools::install_github("mdlincoln/docthis")'
RUN Rscript -e 'devtools::install_github("lorenzwalthert/precommit")'
RUN Rscript -e 'devtools::install_github("milesmcbain/fnmate")'
RUN Rscript -e 'devtools::install_github("milesmcbain/datapasta")'

# add dependencies for logo making
RUN install2.r --error --skipinstalled --repos ${CRAN} --ncpus -1 \
3 changes: 3 additions & 0 deletions NAMESPACE
@@ -10,6 +10,7 @@ export(enw_add_metaobs_features)
export(enw_add_pooling_effect)
export(enw_as_data_list)
export(enw_assign_group)
export(enw_complete_dates)
export(enw_construct_data)
export(enw_dates_to_factors)
export(enw_delay_filter)
@@ -26,6 +27,7 @@ export(enw_inits)
export(enw_latest_data)
export(enw_manual_formula)
export(enw_metadata)
export(enw_missing_reference)
export(enw_model)
export(enw_new_reports)
export(enw_nowcast_samples)
@@ -59,6 +61,7 @@ importFrom(cmdstanr,cmdstan_model)
importFrom(data.table,":=")
importFrom(data.table,.N)
importFrom(data.table,.SD)
importFrom(data.table,CJ)
importFrom(data.table,as.data.table)
importFrom(data.table,copy)
importFrom(data.table,data.table)
6 changes: 4 additions & 2 deletions NEWS.md
@@ -9,8 +9,10 @@ This is a major release and contains multiple breaking changes. If needing the o
## Package

* A new helper function `enw_delay_metadata()` has been added. This produces metadata about the delay distribution vector that may be helpful in future modelling. This prepares the way for [#4](https://github.com/epiforecasts/epinowcast/issues/4) where this data frame will be combined with the reference metadata in order to build non-parametric hazard reference and delay based models. In addition to adding this function, it has also been added to the output of `enw_preprocess_data()` in order to make the metadata readily available to end-users. See [#80](https://github.com/epiforecasts/epinowcast/pull/80) by [@seabbs](https://github.com/seabbs).
* Two new helper functions `enw_filter_reference_dates()` and `enw_filter_report_dates()` have been added. These replace `enw_retrospective_data()` but allow users to similarly construct retrospective data. Splitting these functions out into components also allows for additional use cases that were not previously possible. See [#82](https://github.com/epiforecasts/epinowcast/pull/82) by [@sbfnk](https://github.com/sbfnk) and [@seabbs](https://github.com/seabbs).
* Two new helper functions `enw_filter_reference_dates()` and `enw_filter_report_dates()` have been added. These replace `enw_retrospective_data()` but allow users to similarly construct retrospective data. Splitting these functions out into components also allows for additional use cases that were not previously possible. Note that by definition it is assumed that a report date for a given reference date must be the equal or greater (i.e a report cannot happen before the event being reported occurs). See [#82](https://github.com/epiforecasts/epinowcast/pull/82) by [@sbfnk](https://github.com/sbfnk) and [@seabbs](https://github.com/seabbs).
* The internal grouping variables have been refactored to reduce the chance of clashes with columns in the data frames supplied by the user. There will also be an error thrown in case of a variable clash, making preprocessing safer. See [#102](https://github.com/epiforecasts/epinowcast/pull/102) by [@adrian-lison](https://github.com/adrian-lison) and [@seabbs](https://github.com/seabbs), which solves [#99](https://github.com/epiforecasts/epinowcast/issues/99).
* Support for preprocessing observations with missing reference dates has been added along with a new data object returned by `enw_preprocess_data()` that highlights this information to the user (alternatively can be accessed by users using `enw_missing_reference()`). In addition, these missing observations have been setup to be passed to stan in order to allow their use in modelling. This feature is in preparation of adding full support for missing observations (see [#43](https://github.com/epiforecasts/epinowcast/issues/43)). See
[#106](https://github.com/epiforecasts/epinowcast/pull/106) by [@adrian-lison](https://github.com/adrian-lison) and [@seabbs](https://github.com/seabbs).
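
The two data conditions described in these entries (observations whose reference date is missing, and the assumption that a report date is never earlier than its reference date) can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame and the `reference_date` column name are assumptions made for illustration, and only `report_date` and `confirm` appear in this commit's diff.

```r
# Hypothetical toy observations; `reference_date` is an assumed column name.
obs <- data.frame(
  reference_date = as.Date(c("2022-07-01", NA, "2022-07-03")),
  report_date = as.Date(c("2022-07-02", "2022-07-02", "2022-07-02")),
  confirm = c(4, 1, 2)
)

# Rows whose reference date is unknown (the case this feature starts to support)
missing_ref <- is.na(obs$reference_date)

# Rows that break the assumption report_date >= reference_date
# (only checked where the reference date is known)
invalid <- !missing_ref & obs$report_date < obs$reference_date
```

Here the second row is flagged as having a missing reference date and the third as reported before its reference date.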

## Model

@@ -19,7 +21,7 @@ This is a major release and contains multiple breaking changes. If needing the o

## Documentation

* The model descriptipn has been updated to reflect the currently implemented model and to improve readability. The use use of reference and report date nomenclature has also been standardised across the package. See [#71](https://github.com/epiforecasts/epinowcast/pull/71) by [@sbfnk](https://github.com/sbfnk) and [@seabbs](https://github.com/seabbs).
* The model description has been updated to reflect the currently implemented model and to improve readability. The use use of reference and report date nomenclature has also been standardised across the package. See [#71](https://github.com/epiforecasts/epinowcast/pull/71) by [@sbfnk](https://github.com/sbfnk) and [@seabbs](https://github.com/seabbs).

## Internals

2 changes: 1 addition & 1 deletion R/model-tools.R
@@ -48,7 +48,7 @@ enw_formula_as_data_list <- function(data, reference_effects, report_effects) {
#' @param distribution Character string indicating the type of distribution to
#' use for reference date effects. The default is to use a lognormal but other
#' options available include the exponential and gamma distributions. If "none"
#' is specfied then no parametric delay distribution is used.
#' is specified then no parametric delay distribution is used.
#'
#' @param nowcast Logical, defaults to `TRUE`. Should a nowcast be made using
#' posterior predictions of the unobserved future reported notifications.
28 changes: 24 additions & 4 deletions R/model.R
@@ -19,7 +19,9 @@ enw_priors <- function() {
"logmean_sd",
"logsd_sd",
"rd_eff_sd",
"sqrt_phi"
"sqrt_phi",
"alpha_int",
"alpha_sd"
),
description = c(
"Standard deviation for expected final observations",
@@ -28,7 +30,10 @@ enw_priors <- function() {
"Standard deviation of scaled pooled logmean effects",
"Standard deviation of scaled pooled logsd effects",
"Standard deviation of scaled pooled report date effects",
"One over the square of the reporting overdispersion"
"One over the square of the reporting overdispersion",
"Logit start value for share of cases with known reference date",
"Standard deviation of random walk for share of cases with known
reference date"
),
distribution = c(
"Zero truncated normal",
@@ -37,10 +42,12 @@ enw_priors <- function() {
"Zero truncated normal",
"Zero truncated normal",
"Zero truncated normal",
"Zero truncated normal",
"Normal",
"Zero truncated normal"
),
mean = c(0, 1, 0.5, rep(0, 4)),
sd = rep(1, 7)
mean = c(0, 1, 0.5, rep(0, 4), 0, 0),
sd = c(rep(1, 7), 1, 0.1)
)
}
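
As a rough check on the hunks above, the sketch below collects only the two prior rows that this commit adds and confirms that the extended `mean` and `sd` vectors still have matching lengths. The standalone `new_priors` table is hypothetical (the real function returns all rows, some of which are elided in this diff). The values are copied from the hunk.

```r
# Hypothetical standalone table with only the two rows added in this commit
new_priors <- data.frame(
  variable = c("alpha_int", "alpha_sd"),
  distribution = c("Normal", "Zero truncated normal"),
  mean = c(0, 0),
  sd = c(1, 0.1)
)

# The full-length vectors as written in the new version of the hunk
full_mean <- c(0, 1, 0.5, rep(0, 4), 0, 0)
full_sd <- c(rep(1, 7), 1, 0.1)

# alpha_int is described as a logit-scale start value, so its prior mean
# can be mapped to a share with the inverse logit
start_share <- plogis(new_priors$mean[new_priors$variable == "alpha_int"])
```

Both vectors have nine entries, and a logit-scale mean of 0 corresponds to a share of 0.5 for cases with a known reference date.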

@@ -105,6 +112,19 @@ enw_obs_as_data_list <- function(pobs) {
flat_obs = flat_obs,
latest_obs = latest_matrix
)
if (nrow(pobs$missing_reference[[1]]) > 0) {
# obs with missing reference date
missing_reference <- data.table::copy(pobs$missing_reference[[1]])
data.table::setorderv(missing_reference, c(".group", "report_date"))
missing_reference <- as.matrix(
data.table::dcast(
missing_reference, .group ~ report_date,
value.var = "confirm",
fill = 0
)[, -1]
)
data$missing_ref <- missing_reference
}
return(data)
}
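
The reshaping step added in this hunk can be tried in isolation. The sketch below applies the same `setorderv()`, `dcast()`, and `as.matrix()` sequence to a small made-up table that stands in for `pobs$missing_reference[[1]]`. The toy values are assumptions, and the column names follow the hunk.

```r
library(data.table)

# Hypothetical stand-in for pobs$missing_reference[[1]]
missing_reference <- data.table(
  .group = c(1, 1, 2),
  report_date = as.Date(c("2022-07-01", "2022-07-02", "2022-07-02")),
  confirm = c(3, 5, 2)
)

# Order by group then report date, as in the hunk
setorderv(missing_reference, c(".group", "report_date"))

# One row per group, one column per report date; absent combinations become 0
wide <- dcast(
  missing_reference, .group ~ report_date,
  value.var = "confirm",
  fill = 0
)

# Drop the .group column to leave a groups-by-report-dates matrix
missing_ref <- as.matrix(wide[, -1])
```

With this input the result is a 2 x 2 matrix in which group 2 has a 0 for the first report date, because that combination was not observed.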
