Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comments on germany-age-stratified-nowcasting.Rmd #259

Open
seabbs opened this issue Apr 25, 2023 · 3 comments
Open

Comments on germany-age-stratified-nowcasting.Rmd #259

seabbs opened this issue Apr 25, 2023 · 3 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request question Further information is requested

Comments

@seabbs
Copy link
Collaborator

seabbs commented Apr 25, 2023

From @parksw3:

germany-age-stratified-nowcasting.Rmd

  • "reporting delays are fixed across age groups and time" Do you mean that the delay "distribution" is fixed?

  • I can't figure out where you're specifying the data when you're fitting the model here. Something missing? Need to be explained a bit more clearly.

fit <- enw_fit_opts(
  save_warmup = FALSE, output_loglik = TRUE, pp = TRUE,
  chains = 2, threads_per_chain = threads, 
  iter_sampling = 500, iter_warmup = 500,
  show_messages = FALSE, refresh = 0,
  adapt_delta = 0.98, max_treedepth = 15
)
nowcast <- epinowcast(pobs,
  fit = fit,
  model = multithread_model
)
  • It looks like data = pobs argument is missing from the multithread_model?

  • "To speed up model fitting we make use of posterior information from the previous model (with some inflation) for some parameters. Note that this is not a truly Bayesian approach and in some situations may be problematic." I think this is always problematic, right? You already have some sort of posterior from the data, and you're using the posterior to fit to the same data again (even though you're using the same model). So you're fitting to the same data twice right? I'm not saying you shouldn't do this but the limitation should be made clearer.

  • Reference day of the week effect section. It looks like the original fit also captures the day of the week effect. So I'm a bit confused how the original model captures the day of the week effect. Also not sure what the differences are between two fits (besides the fact that they're using different models). It looks like the fits are qualiatatively similar? Maybe something wrong with the first figure? The code shouldn't even run because you haven't specified the data yet...?

  • Posterior predictions figure is impossible to read... need to zoom in to show a concrete example?

  • "As noted using the posterior predictions from the simple model fit above there appears to be a day of the week effect for reported observations" this is really hard to tell... also need to explain the differences between the day of the week effect for reporting day vs reference day more clearly. I didn't get you were talking about the reference vs reporting day until I got here. I also wonder if it's easier to explain the reporting day effect first and reference day after? I feel like you meant to do this because the reference day of the week model is more complicated:

expectation_module <- enw_expectation(
  ~ 0 + (1 | day_of_week) + (1 | day:.group), data = pobs
)

than the reporting day of the week model:

report_module_dow <- enw_report(~ (1 | day_of_week), data = pobs)

I think it makes sense to introduce the more complicated model later?

Original: #215 (comment)

@seabbs seabbs added documentation Improvements or additions to documentation enhancement New feature or request question Further information is requested labels Apr 25, 2023
@seabbs
Copy link
Collaborator Author

seabbs commented Apr 26, 2023

  • "reporting delays are fixed across age groups and time" Do you mean that the delay "distribution" is fixed?

Yes

  • I can't figure out where you're specifying the data when you're fitting the model here. Something missing? Need to be explained a bit more clearly.

If you look at the docs for enw_fit_opts() you will see it is for setting fitting options and not for actually doing fitting. Data is passed here via the pobs preprocessed argument. I agree that could be made clearer. An easy change would be to actually write data = pobs here. Another easy change would be to write some of what I have said here explicitly in the vignette.

  • It looks like data = pobs argument is missing from the multithread_model?

This step precompiles the model and so doesn't need data (the actual fitting happens in epinowcast but combining all he user settings and the applying them to the pre-compiled object. There is some text around this but obviously it isn't clear enough.

  • "To speed up model fitting we make use of posterior information from the previous model (with some inflation) for some parameters. Note that this is not a truly Bayesian approach and in some situations may be problematic." I think this is always problematic, right? You already have some sort of posterior from the data, and you're using the posterior to fit to the same data again (even though you're using the same model). So you're fitting to the same data twice right? I'm not saying you shouldn't do this but the limitation should be made clearer.

The important point here is it isn't the same model. It is a more complex model for which we are using inflated posteriors from a few parameters to speed up fitting (so we get a better idea of what these are). This isn't strictly required but does improve performance a little. The actual use case where it helps more is when doing rolling nowcasting as you can use prior data to inform your current priors. It looks like above could do with a reword.

  • Posterior predictions figure is impossible to read... need to zoom in to show a concrete example?

The idea was that people could zoom in on there own time but agree could pull out a sub plot. In general plotting of posteior predictions needs some work. Ideally probably a summary plot and then individual-level plots.

  • "As noted using the posterior predictions from the simple model fit above there appears to be a day of the week effect for reported observations" this is really hard to tell...

This should come across by noting the periodicity in the reporting distribution. Not being able to tell this likely comes from above issue using the posterior predictions plot and more hand-holding text being needed. The other thing we really need here is improved visualisation of the raw data (which @FelixGuenther has been working on/off/on and we will hopefully have something for soon).

I think it makes sense to introduce the more complicated model later?

The reason to do this is I think it is a good idea to encourage users to define the model in the order of the process, so generation of reference day counts, reporting distribution by reference date, and then reporting day hazard effects. I agree that the expectation model formula does look a bit scary which isn't ideal. More text would probably help?

I didn't get you were talking about the reference vs reporting day until I got here. I also wonder if it's easier to explain the reporting day effect first and reference day after? I feel like you meant to do this because the reference day of the week model is more complicate

The idea behind leading with the reporting day model is that this effect is more obvious in the raw data and likely has the biggest impact on results (it is also likely much more common - you probably want this in nearly every model). I agree understanding the hazard stuff is more complex though. All modules could obviously do with more description.

U

@seabbs
Copy link
Collaborator Author

seabbs commented Apr 27, 2023

We need to decide if we are aiming to close this out for 0.2.1 or punting to 0.3.0

@seabbs
Copy link
Collaborator Author

seabbs commented Apr 28, 2023

Going to punt to 0.3.0 unless anyone has objections

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request question Further information is requested
Projects
Status: No status
Development

No branches or pull requests

1 participant