initial layer adjustments #334

dsweber2 · 2024-05-15T23:17:49Z

This is to keep the discussion on the layer additions a little bit separate from the other PR, because that is becoming unwieldy in length. The documentation is still a WIP.

dsweber2 · 2024-05-24T01:59:59Z

OK, I also added handling for inter-geo latency, based on @lcbrooks's discussion on the other PR. The option is called epi_keys_checked; it defaults to just "geo_value", but can group by any of the epi_keys.

dajmcdon · 2024-05-28T15:19:21Z

R/step_adjust_latency.R

+#'   amount to offset the ahead or lag by. If a single integer, this is used for
+#'   all columns; if a labeled vector, the labels must correspond to the base
+#'   column names (before lags/aheads).  If `NULL`, the latency is the distance
+#'   between the `epi_df`'s `max_time_value` and either the
+#'   `fixed_forecast_date` or the `epi_df`'s `as_of` field (the default for
+#'   `forecast_date`).
+#' @param fixed_forecast_date either a date of the same kind used in the
+#'   `epi_df`, or `NULL`. Exclusive with `fixed_latency`. If a date, it gives
+#'   the date from which the forecast is actually occurring. If `NULL`, the
+#'   `forecast_date` is determined either via the `fixed_latency`, or is set to


This (and the previous parameters) all seem like a lot of complexity and interaction with each other. I'm especially concerned with the "exclusive with" components. In any case, it's quite the decision tree (hard to test?).

Maybe this is the cleanest it gets, but it might be worth brainstorming.

fixed_* are mostly about giving the user easy options, depending on which feature of latency they actually know, rather than having to compute it by hand. My explanation may not be particularly good though, and if you have suggestions for a different interface I'd welcome that.

A lot of it shakes out to simple checking for NULL before actually setting a value, since I have to set both latency and forecast_date anyways. The option just replaces the calculated versions with a fixed one.

After thinking through the logic again, epi_keys_checked is always used, either as part of computing the forecast_date given a fixed_latency or as part of computing the latency given a fixed_forecast_date, so it doesn't actually get caught up in this.
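
The NULL-checking described in this reply can be sketched in base R roughly as follows. This is a minimal illustration under stated assumptions, not the code in R/step_adjust_latency.R: resolve_latency_sketch is a hypothetical helper, and it assumes latency is counted in days from max_time_value to the forecast date.

```r
# Hypothetical helper; the name and the returned list are assumptions
# made for illustration only.
resolve_latency_sketch <- function(max_time_value, as_of,
                                   fixed_latency = NULL,
                                   fixed_forecast_date = NULL) {
  # The two fixed_* options are mutually exclusive.
  if (!is.null(fixed_latency) && !is.null(fixed_forecast_date)) {
    stop("Specify at most one of `fixed_latency` and `fixed_forecast_date`.")
  }
  if (!is.null(fixed_latency)) {
    # Latency is known, so the forecast date is derived from it.
    latency <- fixed_latency
    forecast_date <- max_time_value + fixed_latency
  } else {
    # Use the fixed forecast date if given, otherwise fall back to `as_of`.
    forecast_date <- if (!is.null(fixed_forecast_date)) {
      fixed_forecast_date
    } else {
      as_of
    }
    latency <- as.integer(forecast_date - max_time_value)
  }
  list(latency = latency, forecast_date = forecast_date)
}

res <- resolve_latency_sketch(
  max_time_value = as.Date("2024-05-01"),
  as_of = as.Date("2024-05-04")
)
```

Either branch sets both latency and forecast_date, so downstream code never needs to know which of the two the user supplied.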

dajmcdon · 2024-05-28T15:23:26Z

R/utils-latency.R

+      # null and "" don't work in `group_by`
+      if (!is.null(epi_keys_checked) && epi_keys_checked != "") {
+        group_by(., get(epi_keys_checked))
+      } else {
+        .
+      }


This pattern (and same below) is pretty hard to parse. Just for the sake of clarity, maybe break the pipe sequence into a few lines?

sure, I was just following the convention in the function previously of "in-piping" the logic.

By "in the function previously", I actually mean in arx_forecaster:

epipredict/R/arx_forecaster.R

Lines 129 to 146 in de6e1db

r <- r %>%
  step_epi_ahead(!!outcome, ahead = args_list$ahead) %>%
  step_epi_naomit() %>%
  step_training_window(n_recent = args_list$n_training) %>%
  {
    if (!is.null(args_list$check_enough_data_n)) {
      check_enough_train_data(
        .,
        all_predictors(),
        !!outcome,
        n = args_list$check_enough_data_n,
        epi_keys = args_list$check_enough_data_epi_keys,
        drop_na = FALSE
      )
    } else {
      .
    }
  }

I guess I missed that when reviewing that PR...

gotcha, I'll switch them both then

You were going to switch these right?

forgot there were 3 instances instead of just 2
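
One way to pull the conditional out of the pipe, as requested in this thread, is a small named helper. The sketch below is an assumption about what that could look like, not the change that was committed: group_by_checked_keys and the toy data are hypothetical, and it uses dplyr's across(all_of()) in place of get().

```r
library(dplyr)

# Hypothetical helper: group by the checked keys only when some are given.
group_by_checked_keys <- function(data, epi_keys_checked = "geo_value") {
  # NULL and "" don't work in `group_by`, so return the data ungrouped.
  no_keys <- is.null(epi_keys_checked) || identical(epi_keys_checked, "")
  if (no_keys) {
    return(data)
  }
  group_by(data, across(all_of(epi_keys_checked)))
}

# Toy data standing in for an epi_df.
toy <- tibble(
  geo_value = c("ca", "ca", "ny"),
  time_value = as.Date("2024-05-01") + c(0, 1, 0)
)

# Latest observation per geo, with the conditional kept out of the pipe.
max_times <- toy %>%
  group_by_checked_keys("geo_value") %>%
  summarise(max_time_value = max(time_value))
```

Using identical() rather than != keeps the condition length one when several keys are passed, and across(all_of()) groups by every key in the vector rather than looking up a single name.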

dajmcdon · 2024-05-28T15:25:47Z

R/utils-latency.R

+        if (inherits(this_recipe$steps[[3]], "step_adjust_latency")) x$as_of
+      }
+    ) %>% Filter(Negate(is.null), .)
+    if (length(handpicked_as_of) > 0) {
+      max_time_value <- handpicked_as_of[[1]]
+    } else {


What's the significance of the list element by position here ([[3]] and [[1]])? This can be potentially dangerous.

The this_recipe$steps[[3]] was left over from development; I'm surprised it hasn't caused any errors yet.

handpicked_as_of should have only one value (otherwise they have multiple step_adjust_latencys, which shouldn't work). I suppose I should add a check during step creation that there's only one step_adjust_latency.
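
A position-free version of this lookup, including the uniqueness check, might look like the sketch below. It is only an illustration: get_latency_forecast_date is a hypothetical helper, and the steps are plain classed lists standing in for real recipe steps.

```r
# Hypothetical helper operating on a plain list of classed step objects.
get_latency_forecast_date <- function(steps) {
  # Find latency steps by class rather than by position in the list.
  is_latency <- vapply(
    steps, inherits, logical(1),
    what = "step_adjust_latency"
  )
  if (sum(is_latency) > 1) {
    stop("Only one `step_adjust_latency` is allowed per recipe.")
  }
  if (!any(is_latency)) {
    return(NULL)
  }
  steps[[which(is_latency)]]$forecast_date
}

# Toy stand-ins for recipe steps.
toy_steps <- list(
  structure(list(), class = c("step_epi_lag", "step")),
  structure(
    list(forecast_date = as.Date("2024-05-04")),
    class = c("step_adjust_latency", "step")
  )
)
found <- get_latency_forecast_date(toy_steps)
```

Because the match is on class, reordering or adding steps does not change which one is picked up.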

- drop multiline pipes - better docs - check exclusive parameters aren't used simultaneously - inherit typo - additional placeholders for future tests

dajmcdon · 2024-06-11T15:15:24Z

@dsweber2 Is this the PR I'm blocking? Is there something in particular I should focus on in review?

dsweber2 · 2024-06-13T17:10:47Z

This one and its parent adjustAhead are both more or less waiting on your review. I couldn't remember when you were back; I probably should've been a bit more explicit and given you a summary to work with when I thought it was ready.

As for where to focus: the arguments to layer_add_forecast_date, layer_add_target_date, and the arx_* functions, plus the tests, are where I would start if I were reviewing it, though that's pretty generic.

Rough summary of this PR is:

- layer_add_forecast_date and layer_add_target_date default to using the date specified by step_adjust_latency
- adding adjust_latency to arx_args_list, with a default of NULL, to preserve previous behavior.
- forecast_date and target_date both inherit values from step_adjust_latency's logic if adjust_latency is present. (Just realized I should move this logic to arx_args_list, since it's currently in arx_forecaster but not arx_classifier; will do when no longer traveling)
- added get_forecast_date_in_layer to get the right forecast date in the layers rather than the steps
- changes as_of to forecast_date for step_adjust_latency
- swap out some awkward piping that used braces as a quasi-function

In hindsight, there are some changes I made here that I probably should've made in the other PR and rebased onto, sorry about all the as_of -> forecast_date's in the diffs.

dajmcdon

I left a number of comments, but I suspect there are some missing tests. Or else I'm doing something wrong locally. Checks fail.

[I see, remote checks are only running styler]

R/arx_forecaster.R

R/step_adjust_latency.R

dajmcdon · 2024-06-14T15:03:03Z

R/step_adjust_latency.R

           fixed_latency = NULL,
-           fixed_asof = NULL,
+           fixed_forecast_date = NULL,
           default = NA,
           skip = FALSE,
           columns = NULL,


What does this do? Is it just populated by the tidy selection? You can inherit these from other step_* functions. (Same applies to skip and id)

The default for id should be rand_id("adjust_latency")

I was unsure which of these were necessary boilerplate for all steps, and which were args only the other step would need. Would gladly drop most of the generic steps if possible

I've been basing it off the instructions here (though they may have changed since I last made one):
https://www.tidymodels.org/learn/develop/recipes/

oh you mean inherit the documentation probably?

yes, I was wondering specifically if columns is necessary. The documentation makes it look like something set by the user, but it's actually ... not used at all?

hmm, yeah columns is definitely a cargo-cult argument, I'll drop it. The dots (...) are what actually restrict the terms used to specific columns (though I don't have a test for that; I'm going to add one to make sure it's working properly).

That's ok. In many example step_*(), columns is an argument. I'm not honestly sure why. Take a look at

?recipes::step_lag
View(recipes::step_lag)
View(recipes:::step_lag_new)
View(recipes:::prep.step_lag)

The columns argument gets populated by terms at prep time. I see why they use it in step_lag_new, but I don't understand why it is kept as an argument to step_lag.

Long story short, they say to use it. But it needs to inherit the documentation. You can use

#' @inheritParams recipes::step_lag
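
Putting the two suggestions from this thread together, a step signature might be documented along these lines. This is a hedged sketch only: step_adjust_latency_sketch is a hypothetical stand-in that just collects its arguments, and the argument list is copied from the diff hunk above rather than from the final code.

```r
library(recipes)

#' Sketch of a step signature that inherits shared argument docs
#'
#' @inheritParams recipes::step_lag
#' @param fixed_latency,fixed_forecast_date Hypothetical here; see the thread.
step_adjust_latency_sketch <- function(recipe,
                                       ...,
                                       fixed_latency = NULL,
                                       fixed_forecast_date = NULL,
                                       default = NA,
                                       skip = FALSE,
                                       columns = NULL,
                                       id = rand_id("adjust_latency")) {
  # A real step would call recipes::add_step(); this only collects arguments.
  list(
    fixed_latency = fixed_latency,
    fixed_forecast_date = fixed_forecast_date,
    default = default,
    skip = skip,
    columns = columns,
    id = id
  )
}

step_args <- step_adjust_latency_sketch(recipe = NULL)
```

With @inheritParams, the shared arguments (recipe, ..., default, skip, columns, id) pick up the wording from recipes::step_lag, so only the step-specific parameters need their own @param entries.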

dajmcdon · 2024-06-14T15:12:01Z

R/step_adjust_latency.R

@@ -267,9 +299,9 @@ print.step_adjust_latency <-
    } else {
      terms <- x$terms
    }
-    if (!is.null(x$as_of)) {
+    if (!is.null(x$forecast_date)) {


Printing looks very strange. This is the example recipe above:

oh I didn't have any tests for this and changed the format. Surprised printing wasn't throwing errors. I'll let you know when I think this is fixed, leaving open

dajmcdon · 2024-06-14T15:24:47Z

R/utils-latency.R

+      # null and "" don't work in `group_by`
+      if (!is.null(epi_keys_checked) && epi_keys_checked != "") {
+        group_by(., get(epi_keys_checked))
+      } else {
+        .
+      }


You were going to switch these right?

dsweber2 · 2024-06-14T20:58:32Z

Checks fail.

The tests at least were running locally for me at the last PR. I'm generally not in a habit of running the full checks locally since I expect the remote to handle that (and they really slow down the feedback loop); I guess b/c this is a PR on a PR it isn't running the full checks.

Doing so, it looks like it's mostly things not being in the namespace. Check now passes locally, with 4 notes (mostly some local files it should ignore, the ubiquitous global * definition, and some ::: quasi-imports).

I've marked as resolved the things I thought were straightforward in being addressed, and left open things I'm still confused on or working through.

dajmcdon · 2024-06-14T21:02:23Z

R/step_adjust_latency.R

           fixed_latency = NULL,
-           fixed_asof = NULL,
+           fixed_forecast_date = NULL,
           default = NA,
           skip = FALSE,
           columns = NULL,


yes, I was wondering specifically if columns is necessary. The documentation makes it look like something set by the user, but it's actually ... not used at all?

dajmcdon · 2024-06-14T21:04:23Z

vignettes/articles/symptom-surveys.Rmd

I suspect these were all OK because there's a library(dplyr) at the top.

dsweber2 requested a review from dajmcdon as a code owner May 15, 2024 23:17

dsweber2 requested review from dajmcdon and removed request for dajmcdon May 15, 2024 23:18

dsweber2 force-pushed the adjustAheadLayerAdditions branch from 5fa3b5f to 5eed24d Compare May 16, 2024 20:44

dsweber2 force-pushed the adjustAhead branch from 2902cc2 to a6b0d3f Compare May 17, 2024 16:16

dsweber2 added 7 commits May 17, 2024 12:08

tests for utils-latency and accompanying fixes

1b6a0af

adding stringr

b9bed37

nothing but rlang::abort -> cli::cli_aborts

9c914e8

moving shift detection earlier,dropping string*dep

6411c76

+purrr, styling

655b141

rec formatting things, dropping purrr

fea65c4

initial layer adjustments

1a84212

dsweber2 force-pushed the adjustAheadLayerAdditions branch from 2146526 to 1a84212 Compare May 17, 2024 17:25

dsweber2 added 7 commits May 17, 2024 12:32

namespace and doc fixes

b9189fb

full rebase fixes

5fdc3e4

adding latency adjusting to arx_forecaster

2b339a2

arx_classifier more or less free

7b6f933

formatting and snapshots

2808f6a

updated man pages

2becf68

group_by options to get the max_time_value

a170ec1

dajmcdon reviewed May 28, 2024

dsweber2 added 3 commits May 29, 2024 16:10

PR review recs

c5d3c9d

- drop multiline pipes - better docs - check exclusive parameters aren't used simultaneously - inherit typo - additional placeholders for future tests

typo in multiline pipe replacement

b0239e8

happy styler

240583c

dajmcdon requested changes Jun 14, 2024

dsweber2 added 2 commits June 14, 2024 15:37

various requested changes, check passes

795abeb

style fix

28db575

dajmcdon approved these changes Jun 14, 2024

dsweber2 added 2 commits June 14, 2024 17:47

inheritParams, correct print, test adjust subset

c5136b3

space

cf8fed6

dsweber2 merged commit f36f6fa into adjustAhead Jun 17, 2024
1 check passed

dshemetov deleted the adjustAheadLayerAdditions branch June 18, 2024 02:27
