CTIS standard errors should incorporate survey weights #1271

@capnrefsmmat

Actual behavior

The standard errors for binary proportion signals (i.e. most of them) do not use weights.

Expected behavior

The standard errors should incorporate the weights, as documented.
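To make the difference concrete, here is a minimal sketch comparing an unweighted binomial SE with one common weighted alternative (a linearization-style SE using normalized weights). This is an illustration only: the function names and the choice of estimator are assumptions, not necessarily the formula in the CTIS documentation or pipeline.

```python
import math

def unweighted_se(y):
    """Binomial SE of a proportion, ignoring survey weights."""
    n = len(y)
    p = sum(y) / n
    return math.sqrt(p * (1 - p) / n)

def weighted_se(y, w):
    """SE of the weighted proportion sum(w*y)/sum(w).

    Assumed estimator (hypothetical, for illustration):
    sqrt(sum(w_norm_i^2 * (y_i - p)^2)) with weights normalized to sum to 1.
    """
    total = sum(w)
    w_norm = [wi / total for wi in w]
    p = sum(wi * yi for wi, yi in zip(w_norm, y))
    return math.sqrt(sum(wi ** 2 * (yi - p) ** 2 for wi, yi in zip(w_norm, y)))

# Made-up binary responses and weights, purely for demonstration.
y = [1, 0, 0, 1, 0, 0, 0, 1]
equal = [1.0] * len(y)
skewed = [5.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]

print(unweighted_se(y), weighted_se(y, equal), weighted_se(y, skewed))
```

With equal weights the two functions agree; with unequal weights they diverge, which is the discrepancy this issue is about.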

Constraints

  • If we simply change the SE code, SEs issued before the changeover date would be unweighted and those issued afterwards would be weighted, which would be very confusing for users
  • We want to retain the issue history of our signals
  • Regenerating the issue history of fb-survey from scratch would be prohibitively time-intensive, both in computation and in the engineering effort needed to work out precisely how to do it

Proposal

  1. Update the code to produce weighted SEs
  2. For old SEs in our API's database, serve them to clients with the SE labeled "unweighted SE" or some similar name; make sure the documentation explains the difference. This would require an update to the API server in delphi-epidata
  3. To the extent possible, backfill past dates with the new SEs (as a new issue).
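Step 2 could look roughly like the sketch below: rows issued before the switchover are served with the SE under a different field name. Everything here is hypothetical (the `label_se` helper, the `unweighted_se` field name, the row layout, and the placeholder switch date); it is not the delphi-epidata server code.

```python
from datetime import date

# Placeholder for the switchover day; the real date is not decided here.
SWITCH_DATE = date(2021, 1, 1)

def label_se(row, switch_date=SWITCH_DATE):
    """Return a copy of `row` with its SE under a name reflecting how it was computed.

    Rows issued before `switch_date` get the SE as "unweighted_se";
    rows issued on or after it keep the usual "se" field.
    """
    out = {k: v for k, v in row.items() if k != "se"}
    key = "se" if row["issue"] >= switch_date else "unweighted_se"
    out[key] = row["se"]
    return out

# Made-up rows for demonstration.
old_row = {"time_value": date(2020, 12, 1), "issue": date(2020, 12, 2), "value": 10.0, "se": 0.5}
new_row = {"time_value": date(2021, 1, 5), "issue": date(2021, 1, 6), "value": 11.0, "se": 0.7}

print(label_se(old_row))
print(label_se(new_row))
```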

Suppose day S is the day we switch to producing weighted SEs. Then:

  • All estimates with an issue date on or after S would have weighted SEs
  • All estimates with an issue date before S would have unweighted SEs
  • But for time_values before S, we would create a new issue on S that contains the same estimate and corrected standard error

Hence the only people who would see unweighted SEs would be people who use the as_of argument to our API; anyone simply asking for estimates for a specific date would get the weighted SE.
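A toy model of the issue history illustrates this. The `lookup` function, the row layout, the switch date, and all numbers below are made up for illustration; the real as_of handling lives in the API server and may differ in detail.

```python
from datetime import date

S = date(2021, 1, 1)  # hypothetical switchover day

# Two issues of the same time_value: the original (unweighted SE) and the
# backfilled issue created on S (same estimate, weighted SE). Values are invented.
rows = [
    {"time_value": date(2020, 12, 1), "issue": date(2020, 12, 2), "value": 10.0, "se": 0.50},
    {"time_value": date(2020, 12, 1), "issue": S, "value": 10.0, "se": 0.65},
]

def lookup(rows, time_value, as_of=None):
    """Return the most recent issue for `time_value`, optionally as of a past date."""
    candidates = [
        r for r in rows
        if r["time_value"] == time_value and (as_of is None or r["issue"] <= as_of)
    ]
    return max(candidates, key=lambda r: r["issue"]) if candidates else None

tv = date(2020, 12, 1)
print(lookup(rows, tv))                            # latest issue: weighted SE
print(lookup(rows, tv, as_of=date(2020, 12, 15)))  # before S: unweighted SE
```

A plain query resolves to the issue created on S, so only callers who pass an as_of date earlier than S ever receive the unweighted SE.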

Open questions

  1. Is it plausible to report old SEs with a different column name? This would require some code changes to the API server; @krivard would such source-specific hacks be welcome in the code?
  2. How long will it take to rewrite the standard error code and test it?
  3. How long would it take to produce new estimates for all past time_values, and would that be an issue for our database? I'm guessing multiple days of code runtime alone; I'm not sure about database ingestion of a new issue for every geo_value, time_value, and signal. @krivard would this be prohibitively database-killing?

Labels

CTIS: Improvements and reporting for CTIS
