Merged
43 commits
59ca31a
Created layer_add_forecast_date and ..._target_date as well as tests …
Jun 18, 2022
cf9aaa9
recipes::step
Jun 18, 2022
6b82626
recipes:: to Rd file for added layers
Jun 18, 2022
bcfda7f
parsnip::fit
Jun 18, 2022
7c7e6e4
recipes::all_predictors()
Jun 18, 2022
85e5232
epipredict:::frosting()
Jun 18, 2022
26aa627
epipredict::add_frosting()
Jun 18, 2022
8780a36
Removed epipredict::: from doc
Jun 22, 2022
a64a549
Removed id from user facing
Jun 22, 2022
fcdddaf
Setting up for changes to make once able to access the preprocessor
Jun 22, 2022
0ae474a
Trying to see if recipe is accessible
Jun 25, 2022
ab795c1
testing forecast date layer
Jun 25, 2022
5e79dc3
Re-added forecast date
Jun 25, 2022
0c21a6d
testing
Jul 1, 2022
b8211fa
Updating this branch to reflect previous updates to frosting
Jul 1, 2022
c3491db
Put id back to where it was before
Jul 1, 2022
882a493
Add id
Jul 1, 2022
12351d3
Updated documentation & fixed fun
Jul 2, 2022
606fd89
removed object
Jul 2, 2022
af49a7e
Updated doc and ex
Jul 2, 2022
b05cff5
added import
Jul 2, 2022
33b8018
Got ahead from recipe
Jul 6, 2022
235cba8
some updates to forecast_date script
Jul 6, 2022
b608c2e
Fixed layer_add_forecast_date to remove parameter for newdata
Jul 6, 2022
ddf3d94
Added more details
Jul 6, 2022
8726dd9
Changed around spacing of doc.
Jul 7, 2022
2ebf283
Updates as per comments left
Jul 8, 2022
2792bc4
Enabled user to specify a target date
Jul 8, 2022
1037c79
Minor rewording
Jul 8, 2022
14d8344
Made suggested changes
Jul 9, 2022
0a053d3
Removed white space
Jul 9, 2022
0b2f048
Better way to access ahead
Jul 9, 2022
77ae185
Reformatted
Jul 9, 2022
3010355
is.null()
Jul 9, 2022
0084c5f
remove test
Jul 9, 2022
77e7e86
Merge branch 'frosting' of https://github.com/cmu-delphi/epipredict i…
Jul 9, 2022
78ecd9d
Had to call ahead another way after update from frosting
Jul 9, 2022
6997017
Took out test
Jul 9, 2022
9a304ed
Simplify code a little
Jul 9, 2022
09a7e94
take out test
Jul 9, 2022
b2ba576
To force update this branch to what is on frosting (used git pull ori…
Jul 18, 2022
cb1c753
Update to match frosting branch
Jul 19, 2022
2745927
extract_argument() to get ahead
Jul 19, 2022
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -38,6 +38,8 @@ S3method(print,step_epi_lag)
S3method(quantile,dist_quantiles)
S3method(refresh_blueprint,default_epi_recipe_blueprint)
S3method(run_mold,default_epi_recipe_blueprint)
S3method(slather,layer_add_forecast_date)
S3method(slather,layer_add_target_date)
S3method(slather,layer_naomit)
S3method(slather,layer_predict)
S3method(slather,layer_predictive_distn)
@@ -81,6 +83,8 @@ export(knn_iteraive_ar_forecaster)
export(knnarx_args_list)
export(knnarx_forecaster)
export(layer)
export(layer_add_forecast_date)
export(layer_add_target_date)
export(layer_naomit)
export(layer_predict)
export(layer_predictive_distn)
92 changes: 92 additions & 0 deletions R/layer_add_forecast_date.R
@@ -0,0 +1,92 @@
#' Postprocessing step to add the forecast date
#'
#' @param frosting a `frosting` postprocessor
#' @param forecast_date The forecast date to add as a column to the `epi_df`.

Review comment (Contributor):

Oops. Sorry, I missed this. To my mind the forecast_date is the date on which the forecast is made. So, by default, it should be max(time_value) from the training data. The target_date should be "the date the forecast is for". So that one should be max(time_value) + ahead by default.

It looks like they're both the same currently right?

Reply (Contributor Author, @rachlobay, Jul 8, 2022):

Almost… Current defaults: forecast_date = max_time_value + ahead (where max_time_value is from the test data), target_date = time_value + ahead (based on simple.forecasts.Rmd). But I will change to what you specified. A couple of questions about that...

For forecast_date, to get the max time value from the training data, should I use mold from components? Then I check whether forecast_date < as_of_date of the test data and throw a warning there (one that says "forecast_date is less than the most recent update date of the data."), yes?

For target_date, is the default the max time value in the test data + ahead, or not?

Reply (Contributor):

The difficulty here is that leading/lagging increments the time_value. So this could all be a bit dangerous. In an ideal setting:

  1. target_date would be the max time value in the test data + ahead, as you said.
  2. forecast_date would be the max time value in the test data (so, seemingly, the date of the most recent data available to you). If we act as though we have data up to (and including) today, and we produce a forecast today, then this should work.

However, lots of weird crud can happen that would screw these up. If data isn't available for today but only for yesterday, then that would throw things off. If we accidentally lead the test data, then it'll produce "future" time_values.
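
The ideal-case defaults described in this comment can be sketched in base R. This is only an illustration under stated assumptions: `test_time_values` and `ahead` are made-up stand-ins rather than objects from epipredict, where the layers would read these values from the processed test data and the recipe.

```r
# Hypothetical test-data dates; in epipredict these would come from the
# `time_value` column of the processed test data.
test_time_values <- as.Date("2021-12-17") + 0:14
ahead <- 7  # assumed forecast horizon, in days

# Ideal case: the data runs through "today", so the forecast is made on the
# last observed date and targets `ahead` days later.
forecast_date <- max(test_time_values)
target_date <- forecast_date + ahead

# Pitfall from the comment: if the time values were accidentally shifted
# forward by `ahead`, the same rule yields a forecast date in the "future".
led_time_values <- test_time_values + ahead
shifted_forecast_date <- max(led_time_values)
```

In this sketch the shifted forecast date coincides with the intended target date, which is the kind of silent mix-up the comment warns about.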

#' For most cases, this should be specified in the form "yyyy-mm-dd". Note that
#' when the forecast date is left unspecified, it is set to the maximum time
#' value in the test data after any processing (e.g. leads and lags) has been
#' applied.
#' @param id a random id string
#'
#' @return an updated `frosting` postprocessor
#'
#' @details To use this function, either specify a forecast date or leave the
#' forecast date unspecified here. In the latter case, the forecast date will
#' be set as the maximum time value in the processed test data. In any case,
#' when the forecast date is less than the most recent update date of the data
#' (i.e. the `as_of` value), an appropriate warning will be thrown.
#'
#' @export
#' @examples
#' jhu <- case_death_rate_subset %>%
#'   dplyr::filter(time_value > "2021-11-01", geo_value %in% c("ak", "ca", "ny"))
#' r <- epi_recipe(jhu) %>%
#'   step_epi_lag(death_rate, lag = c(0, 7, 14)) %>%
#'   step_epi_ahead(death_rate, ahead = 7) %>%
#'   recipes::step_naomit(recipes::all_predictors()) %>%
#'   recipes::step_naomit(recipes::all_outcomes(), skip = TRUE)
#' wf <- epi_workflow(r, parsnip::linear_reg()) %>% parsnip::fit(jhu)
#' latest <- jhu %>%
#'   dplyr::filter(time_value >= max(time_value) - 14)
#'
#' # Specify a `forecast_date` that is greater than or equal to `as_of` date
#' f <- frosting() %>% layer_predict() %>%
#'   layer_add_forecast_date(forecast_date = "2022-05-31") %>%
#'   layer_naomit(.pred)
#' wf1 <- wf %>% add_frosting(f)
#'
#' p1 <- predict(wf1, latest)
#' p1
#'
#' # Specify a `forecast_date` that is less than `as_of` date
#' f2 <- frosting() %>%
#'   layer_predict() %>%
#'   layer_add_forecast_date(forecast_date = "2021-12-31") %>%
#'   layer_naomit(.pred)
#' wf2 <- wf %>% add_frosting(f2)
#'
#' p2 <- predict(wf2, latest)
#' p2
#'
#' # Do not specify a forecast_date
#' f3 <- frosting() %>%
#'   layer_predict() %>%
#'   layer_add_forecast_date() %>%
#'   layer_naomit(.pred)
#' wf3 <- wf %>% add_frosting(f3)
#'
#' p3 <- predict(wf3, latest)
#' p3
layer_add_forecast_date <-
  function(frosting, forecast_date = NULL, id = rand_id("add_forecast_date")) {
    add_layer(
      frosting,
      layer_add_forecast_date_new(
        forecast_date = forecast_date,
        id = id
      )
    )
  }

layer_add_forecast_date_new <- function(forecast_date, id = id) {
  layer("add_forecast_date", forecast_date = forecast_date, id = id)
}

#' @export
slather.layer_add_forecast_date <- function(object, components, the_fit, the_recipe, ...) {

  if (is.null(object$forecast_date)) {
    max_time_value <- max(components$keys$time_value)
    object$forecast_date <- max_time_value
  }

  as_of_date <- as.Date(attributes(components$keys)$metadata$as_of)

  if (object$forecast_date < as_of_date) {
    warning("forecast_date is less than the most recent update date of the data.")
  }

  components$predictions <- dplyr::bind_cols(components$predictions,
                                             forecast_date = as.Date(object$forecast_date))
  components
}
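
The default-and-warn logic in `slather.layer_add_forecast_date` can be tried outside the package with a minimal base R sketch. The helper `resolve_forecast_date()` and the dates below are hypothetical and only mirror the checks shown in the diff; they are not part of epipredict.

```r
# Hypothetical helper (not part of epipredict): fall back to the latest
# time value when no forecast date is given, then warn if the result
# precedes the data's `as_of` date.
resolve_forecast_date <- function(forecast_date, time_values, as_of) {
  if (is.null(forecast_date)) {
    forecast_date <- max(time_values)
  }
  forecast_date <- as.Date(forecast_date)
  if (forecast_date < as.Date(as_of)) {
    warning("forecast_date is less than the most recent update date of the data.")
  }
  forecast_date
}

time_values <- as.Date("2021-12-17") + 0:14
as_of <- as.Date("2022-05-31")  # made-up snapshot date for illustration

# A user-supplied date on or after `as_of` passes silently.
explicit <- resolve_forecast_date("2022-05-31", time_values, as_of)

# The default (latest time value) is older than `as_of`, so a warning is
# raised; capture its message instead of letting it propagate.
msg <- tryCatch(
  resolve_forecast_date(NULL, time_values, as_of),
  warning = function(w) conditionMessage(w)
)
```

With a snapshot date later than the last observation, leaving `forecast_date` unspecified triggers the warning, which matches the behaviour the `@details` section describes.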
79 changes: 79 additions & 0 deletions R/layer_add_target_date.R
@@ -0,0 +1,79 @@
#' Postprocessing step to add the target date
#'
#' @param frosting a `frosting` postprocessor
#' @param target_date The target date to add as a column to the `epi_df`.
#' By default, this is the maximum `time_value` from the processed test
#' data plus `ahead`, where `ahead` has been specified in preprocessing
#' (most likely in `step_epi_ahead`). The user may override this with a
#' date of their own (that will usually be in the form "yyyy-mm-dd").
#' @param id a random id string
#'
#' @return an updated `frosting` postprocessor
#'
#' @details By default, this function assumes that a value for `ahead`
#' has been specified in a preprocessing step (most likely in
#' `step_epi_ahead`). Then, `ahead` is added to the maximum `time_value`
#' in the test data to get the target date.
#'
#' @export
#' @examples
#' jhu <- case_death_rate_subset %>%
#'   dplyr::filter(time_value > "2021-11-01", geo_value %in% c("ak", "ca", "ny"))
#' r <- epi_recipe(jhu) %>%
#'   step_epi_lag(death_rate, lag = c(0, 7, 14)) %>%
#'   step_epi_ahead(death_rate, ahead = 7) %>%
#'   recipes::step_naomit(recipes::all_predictors()) %>%
#'   recipes::step_naomit(recipes::all_outcomes(), skip = TRUE)
#' wf <- epi_workflow(r, parsnip::linear_reg()) %>% parsnip::fit(jhu)
#' latest <- jhu %>%
#'   dplyr::filter(time_value >= max(time_value) - 14)
#'
#' # Use ahead from preprocessing
#' f <- frosting() %>% layer_predict() %>%
#'   layer_add_target_date() %>% layer_naomit(.pred)
#' wf1 <- wf %>% add_frosting(f)
#'
#' p <- predict(wf1, latest)
#' p
#'
#' # Override default behaviour by specifying own target date
#' f2 <- frosting() %>% layer_predict() %>%
#'   layer_add_target_date(target_date = "2022-01-08") %>% layer_naomit(.pred)
#' wf2 <- wf %>% add_frosting(f2)
#'
#' p2 <- predict(wf2, latest)
#' p2
layer_add_target_date <-
  function(frosting, target_date = NULL, id = rand_id("add_target_date")) {
    add_layer(
      frosting,
      layer_add_target_date_new(
        target_date = target_date,
        id = id
      )
    )
  }

layer_add_target_date_new <- function(id = id, target_date = target_date) {
  layer("add_target_date", target_date = target_date, id = id)
}

#' @export
slather.layer_add_target_date <- function(object, components, the_fit, the_recipe, ...) {

  if (is.null(object$target_date)) {
    max_time_value <- max(components$keys$time_value)
    ahead <- extract_argument(the_recipe, "step_epi_ahead", "ahead")

    if (is.null(ahead)){
      stop("`ahead` must be specified in preprocessing.")
    }
    target_date = max_time_value + ahead
  } else{
    target_date = as.Date(object$target_date)
  }

  components$predictions <- dplyr::bind_cols(components$predictions,
                                             target_date = target_date)
  components
}
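
The branching in `slather.layer_add_target_date` (user override first, otherwise the latest time value plus `ahead`, with an error when `ahead` is missing) can be sketched in base R as follows. `resolve_target_date()` is a hypothetical stand-in that takes `ahead` as a plain argument; the package instead retrieves it from the recipe via `extract_argument()`.

```r
# Hypothetical helper (not part of epipredict) mirroring the three cases
# handled by the slather method above.
resolve_target_date <- function(target_date, time_values, ahead) {
  if (!is.null(target_date)) {
    # A user-supplied date overrides the default.
    return(as.Date(target_date))
  }
  if (is.null(ahead)) {
    stop("`ahead` must be specified in preprocessing.")
  }
  # Default: latest time value plus the forecast horizon.
  max(time_values) + ahead
}

time_values <- as.Date("2021-12-17") + 0:14

default_target <- resolve_target_date(NULL, time_values, ahead = 7)
override_target <- resolve_target_date("2022-01-08", time_values, ahead = 7)

# With neither a target date nor `ahead`, the helper stops with an error;
# capture the message for inspection.
err <- tryCatch(
  resolve_target_date(NULL, time_values, ahead = NULL),
  error = function(e) conditionMessage(e)
)
```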
2 changes: 1 addition & 1 deletion man/epi_workflow.Rd

77 changes: 77 additions & 0 deletions man/layer_add_forecast_date.Rd

63 changes: 63 additions & 0 deletions man/layer_add_target_date.Rd
