Add a function `enw_add_share_missing` to `preprocess.R` that computes the share of cases with a missing reference date.
There are two main options:
1. The easy, approximate one: compute the share by reporting date and shift it backwards by the mean empirical backward reporting delay, estimated over all dates.
2. The complicated, more exact one: compute the empirical backward reporting delay distribution for each reporting day, shift cases with a missing reference date backwards according to that distribution to assign them reference dates, then compute the empirical share by reference date.
The complicated version could also be the more unstable one when case numbers are low, because then the empirical backward delays cannot be estimated accurately. On the other hand, the easy version will be biased in situations where the delay changes a lot.
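To make the more exact option concrete, here is a minimal base R sketch on toy data. The column names (`report_date`, `reference_date`, `new_confirm`) and the counts are assumptions for illustration only, not the package's actual data structure or implementation. It allocates the missing-reference-date cases of each report date to reference dates in proportion to the delays observed among that day's known cases.

```r
# Toy incremental counts (hypothetical): one row per report/reference date pair.
# reference_date is NA when the reference date is missing.
obs <- data.frame(
  report_date = as.Date("2022-01-03") + c(0, 0, 0, 1, 1, 1),
  reference_date = as.Date("2022-01-03") + c(0, -1, NA, 1, 0, NA),
  new_confirm = c(6, 2, 2, 3, 9, 4)
)

known <- obs[!is.na(obs$reference_date), ]
missing <- obs[is.na(obs$reference_date), c("report_date", "new_confirm")]
names(missing)[2] <- "n_missing"

# Empirical backward delay distribution for each report date:
# proportion of that day's known cases falling on each reference date.
totals <- aggregate(new_confirm ~ report_date, data = known, FUN = sum)
names(totals)[2] <- "total_known"
known <- merge(known, totals, by = "report_date")
known$prop <- known$new_confirm / known$total_known

# Allocate missing cases to reference dates according to that distribution.
# Note: report dates with missing cases but no known cases are dropped here.
alloc <- merge(known, missing, by = "report_date", all.x = TRUE)
alloc$n_missing[is.na(alloc$n_missing)] <- 0
alloc$alloc_missing <- alloc$prop * alloc$n_missing

# Empirical share of cases with missing reference date, by reference date.
by_ref <- aggregate(
  cbind(new_confirm, alloc_missing) ~ reference_date,
  data = alloc, FUN = sum
)
by_ref$share_missing <- by_ref$alloc_missing /
  (by_ref$new_confirm + by_ref$alloc_missing)
```

With only a handful of known cases per report date, the `prop` column becomes noisy, which is the instability concern raised above.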
One further aspect to decide on is whether to add this to `obs` or somewhere else. Technically, the share of cases with missing reference date is indexed only by reference date, so it does not fit directly into the structure of `obs`...
So in my head, I thought we were talking about empirical missingness by report date which would obviously be much simpler. Do you think there is a strong argument for trying to back out the approximate missingness by date of reference?
As for where this should go, I would have thought the `obs_miss` part of the preprocessed data?
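For reference, empirical missingness by report date could look roughly like the following base R sketch. The data frame and its column names are hypothetical toy inputs, not the real `obs_miss` structure.

```r
# Toy incremental counts (hypothetical); reference_date is NA when missing.
obs <- data.frame(
  report_date = as.Date("2022-01-03") + c(0, 0, 0, 1, 1, 1),
  reference_date = as.Date("2022-01-03") + c(0, -1, NA, 1, 0, NA),
  new_confirm = c(6, 2, 2, 3, 9, 4)
)

# Count of cases with missing reference date on each row (0 if known).
obs$missing_confirm <- ifelse(is.na(obs$reference_date), obs$new_confirm, 0)

# Sum all cases and missing cases per report date, then take the ratio.
share <- aggregate(
  cbind(new_confirm, missing_confirm) ~ report_date,
  data = obs[, c("report_date", "new_confirm", "missing_confirm")],
  FUN = sum
)
share$share_missing <- share$missing_confirm / share$new_confirm
```

This needs no delay estimate at all, which is why it is so much simpler than backing out the share by reference date.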
Okay, I agree. For downstream use in seeding/offset, we may need it by date of reference, but we can handle it in an offset helper function if we go for the simpler option mentioned above.
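A rough sketch of what such an offset helper might do, in base R: estimate the mean backward delay from cases with known reference date and shift the by-report-date share back by that many whole days. The helper name `shift_share_to_reference`, the column names, and the toy numbers are all hypothetical, and rounding the mean delay to whole days is an assumption of this sketch.

```r
# Toy incremental counts (hypothetical); reference_date is NA when missing.
obs <- data.frame(
  report_date = as.Date("2022-01-03") + c(0, 0, 0, 1, 1, 1),
  reference_date = as.Date("2022-01-03") + c(0, -1, NA, 1, 0, NA),
  new_confirm = c(6, 2, 2, 3, 9, 4)
)

# Mean empirical backward delay (in days) over all dates, weighted by counts.
known <- obs[!is.na(obs$reference_date), ]
delay_days <- as.numeric(known$report_date - known$reference_date)
mean_delay <- weighted.mean(delay_days, w = known$new_confirm)

# Hypothetical helper: re-index a by-report-date share to reference dates
# by shifting it back by the rounded mean delay.
shift_share_to_reference <- function(share_by_report, mean_delay) {
  shifted <- share_by_report
  shifted$reference_date <- shifted$report_date - round(mean_delay)
  shifted[, c("reference_date", "share_missing")]
}

# Share of cases with missing reference date by report date (toy values).
share_by_report <- data.frame(
  report_date = as.Date(c("2022-01-03", "2022-01-04")),
  share_missing = c(0.2, 0.25)
)
share_by_reference <- shift_share_to_reference(share_by_report, mean_delay)
```

A single shift like this is biased whenever the delay distribution changes over time, as noted in the issue description.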