Add function to compute empirical share of cases with missing reference date #97

Closed
Tracked by #92
adrian-lison opened this issue Jul 5, 2022 · 4 comments · Fixed by #106
Labels: enhancement, high-priority

Comments

@adrian-lison
Collaborator

Add function enw_add_share_missing to preprocess.R that computes the share of cases with missing reference date.

There are two main options:

  • The easy, approximate one: compute the share by reporting date and shift it backwards by the mean empirical backward reporting delay, estimated over all dates.
  • The complicated, more exact one: compute the empirical backward reporting delay distribution for each reporting day, shift cases with a missing reference date backwards according to that distribution to assign them reference dates, then compute the empirical share by reference date.
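To make the easy option concrete, here is a minimal sketch. It is written in Python purely for illustration and is not the package's R implementation; the input layout (tuples of reference date, report date, count, with `None` marking a missing reference date) and all function names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date, timedelta


def share_missing_by_report(cases):
    """Share of cases with a missing reference date, per report date.

    `cases` is a list of (reference_date or None, report_date, count).
    """
    missing = defaultdict(int)
    total = defaultdict(int)
    for ref, rep, n in cases:
        total[rep] += n
        if ref is None:
            missing[rep] += n
    return {rep: missing[rep] / total[rep] for rep in sorted(total) if total[rep] > 0}


def mean_backward_delay(cases):
    """Mean delay in days (report date minus reference date), pooled over all
    dates, using only cases with a known reference date.
    Raises ZeroDivisionError if no reference date is known at all."""
    days = sum((rep - ref).days * n for ref, rep, n in cases if ref is not None)
    known = sum(n for ref, _, n in cases if ref is not None)
    return days / known


def shift_share(share_by_report, delay_days):
    """Re-index the share by an approximate reference date by shifting every
    report date backwards by the rounded mean delay."""
    shift = timedelta(days=round(delay_days))
    return {rep - shift: share for rep, share in share_by_report.items()}


# Hypothetical toy data, not taken from any real line list.
example_cases = [
    (date(2022, 7, 1), date(2022, 7, 2), 6),
    (None, date(2022, 7, 2), 2),
    (date(2022, 7, 1), date(2022, 7, 4), 2),
    (date(2022, 7, 3), date(2022, 7, 4), 4),
    (None, date(2022, 7, 4), 4),
]
share = share_missing_by_report(example_cases)
approx_by_reference = shift_share(share, mean_backward_delay(example_cases))
```

Because a single pooled delay is used for the shift, any change in the delay over time is ignored in this sketch.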

The complicated version could also be the more unstable one when case numbers are low, because then the empirical backward delays cannot be estimated accurately. On the other hand, the easy version will be biased in situations where the delay changes a lot.
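For comparison, a rough sketch of the more exact option, again in Python for illustration only and not the package's R code. It assumes the same hypothetical input layout (tuples of reference date, report date, count, with `None` for a missing reference date) and, as a simplification, allocates the missing cases of each report date deterministically in proportion to that date's empirical backward delay distribution rather than sampling from it.

```python
from collections import defaultdict
from datetime import date


def share_missing_by_reference(cases):
    """Approximate share of cases with a missing reference date, indexed by
    reference date.

    For each report date, cases with a missing reference date are spread over
    reference dates in proportion to the known-reference cases reported on
    that same date (the empirical backward delay distribution).
    """
    known = defaultdict(lambda: defaultdict(int))  # report date -> reference date -> count
    missing = defaultdict(int)                     # report date -> count
    for ref, rep, n in cases:
        if ref is None:
            missing[rep] += n
        else:
            known[rep][ref] += n

    known_by_ref = defaultdict(float)
    missing_by_ref = defaultdict(float)
    # Report dates that have no known-reference cases at all are skipped:
    # there is no empirical distribution to allocate their missing cases with.
    for rep, by_ref in known.items():
        total_known = sum(by_ref.values())
        for ref, n in by_ref.items():
            known_by_ref[ref] += n
            missing_by_ref[ref] += missing.get(rep, 0) * n / total_known

    return {
        ref: missing_by_ref[ref] / (known_by_ref[ref] + missing_by_ref[ref])
        for ref in sorted(known_by_ref)
    }


# Hypothetical toy data, not taken from any real line list.
example_cases = [
    (date(2022, 7, 1), date(2022, 7, 2), 6),
    (None, date(2022, 7, 2), 2),
    (date(2022, 7, 1), date(2022, 7, 4), 2),
    (date(2022, 7, 3), date(2022, 7, 4), 4),
    (None, date(2022, 7, 4), 4),
]
share_by_reference = share_missing_by_reference(example_cases)
```

With only a handful of known-reference cases per report date, the proportions `n / total_known` are noisy, which is the instability concern raised above.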

One further aspect to decide on is whether to add this to obs or somewhere else. Technically, the share of cases with missing reference date is only indexed by reference date, so it does not fit directly into the structure of obs...

@adrian-lison adrian-lison added the enhancement New feature or request label Jul 5, 2022
@adrian-lison adrian-lison self-assigned this Jul 5, 2022
@seabbs
Collaborator

seabbs commented Jul 5, 2022

So in my head, I thought we were talking about empirical missingness by report date, which would obviously be much simpler. Do you think there is a strong argument for trying to back out the approximate missingness by date of reference?

In terms of where this should go, I would have thought it belongs in the obs_miss part of the preprocessed data?

@seabbs seabbs mentioned this issue Jul 5, 2022
21 tasks
@adrian-lison
Collaborator Author

Okay, I agree. For downstream use in seeding/offset, we may need it by date of reference, but we can handle it in an offset helper function if we go for the simpler option mentioned above.

@seabbs seabbs linked a pull request Jul 7, 2022 that will close this issue
@seabbs
Collaborator

seabbs commented Jul 7, 2022

I've added a prototype of this to #106 (enw_missing_reference)

@seabbs
Collaborator

seabbs commented Jul 8, 2022

Done in #106

@seabbs seabbs closed this as completed Jul 8, 2022