Add forecast versioning #273

Closed
matthewcornell opened this issue Oct 27, 2020 · 3 comments

@matthewcornell
Member

matthewcornell commented Oct 27, 2020

Google doc from discussion: https://docs.google.com/document/d/18BJaMxqEsNl1BzKwm_vx8RV0bVITJJHeG3hgTm61fWc/edit

Many changes required, e.g.,

  • Forecast versions are named/identified by a new Forecast.issue_date DateField, which defaults to date(Forecast.created_at) (i.e., the timestamp when Zoltar created the Forecast).
  • Migrating current data (adds Django migration complexity): set issue_date to date(Forecast.created_at) for existing Forecasts. Back up first!
  • Forecasts are versioned by allowing multiple Forecasts per TimeZero (i.e., we relax the current business rule of at most one Forecast/TimeZero). They are identified/named by their Forecast.issue_date.
  • Only special users are allowed to specify Forecast.issue_date, for reasons of scientific integrity. All other creation uses the default as above. (Q: What is our permission policy? Django staff users? Django admin?)
  • Forecast query functionality will have a new optional as_of key. If it is not passed, the query defaults to using the most recent version. In the future we may add a second new optional field named issue_date that causes the query to include only Forecasts with that date. This differs from as_of - as @nickreich puts it:

covidcast allows for querying either on issue_date (i.e. "give me the forecasts with a specific issue_date value(s)") and as_of (i.e. "give me the most recent forecasts as_of a particular date"). I think the as_of query will be the one we want to use more frequently.

  • Similarly, all web ui and library operations default to the most recent Forecast versions.
  • Scoring: Will use latest versions, but we may decide to not code scoring to work with this new forecast version feature because we will be moving scoring from Zoltar internal to external tools.
  • https://github.com/reichlab/covid19-forecast-hub : forecast upload scripts will need to be updated to pass forecast_date
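
A minimal sketch of the issue_date default and the "special users only" rule from the list above. It uses a plain dataclass rather than a Django model; the `Forecast` class, `create_forecast`, and the `user_may_set_issue_date` flag are hypothetical names for illustration only, and the flag stands in for whatever permission policy is eventually chosen.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, timezone
from typing import Optional


@dataclass
class Forecast:
    """Hypothetical stand-in for Zoltar's Forecast model (not the real Django class)."""
    time_zero: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    issue_date: Optional[date] = None

    def __post_init__(self):
        # Default rule from the list above: issue_date = date(created_at).
        if self.issue_date is None:
            self.issue_date = self.created_at.date()


def create_forecast(time_zero, user_may_set_issue_date, issue_date=None, created_at=None):
    """Create a Forecast version, rejecting an explicit issue_date from ordinary users.

    `user_may_set_issue_date` is a placeholder for the still-open permission policy.
    """
    if issue_date is not None and not user_may_set_issue_date:
        raise PermissionError("only privileged users may specify issue_date")
    kwargs = {"time_zero": time_zero, "issue_date": issue_date}
    if created_at is not None:
        kwargs["created_at"] = created_at
    return Forecast(**kwargs)
```

With this shape, an ordinary upload always gets the creation date as its version name, while a privileged caller can backdate a version by passing issue_date explicitly.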

For any as_of query discussions: Here's an example database with versions (header is timezeros, rows are forecast version dates):

+----+----+----+
|10/1|10/2|10/4|
|tz1 |tz2 |tz3 |...
+----+----+----+
|10/1|    |    |
|f1  | -  | -  |
+----+----+----+
|    |    |10/5|
|-   | -  |f2  |
+----+----+----+
|10/8|10/8|    |
|f3  | f4 | -  |
+----+----+----+

Here are some as_of examples (which forecast version would be used as of that date):

+-----+----+----+----+
|as_of|tz1 |tz2 |tz3 |
+-----+----+----+----+
|10/2 | f1 | -  | -  |
|10/6 | f1 | -  | f2 |
|10/8 | f3 | f4 | f2 |
+-----+----+----+----+
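
The as_of rule in the two tables above can be sketched in plain Python as follows. This is an illustration under assumptions, not Zoltar's query code: the `VERSIONS` list and `latest_versions` function are hypothetical names, the year 2020 is assumed (the tables give only month/day), and the cutoff is treated as inclusive because the 10/8 row selects versions issued on 10/8.

```python
from datetime import date

# (timezero, issue_date, forecast) rows from the example database above.
# The year 2020 is an assumption; the tables give only month/day.
VERSIONS = [
    ("tz1", date(2020, 10, 1), "f1"),
    ("tz3", date(2020, 10, 5), "f2"),
    ("tz1", date(2020, 10, 8), "f3"),
    ("tz2", date(2020, 10, 8), "f4"),
]


def latest_versions(versions, as_of=None):
    """Return {timezero: forecast}, keeping the newest issue_date <= as_of.

    as_of=None means no cutoff, i.e. the most recent version of each timezero.
    """
    chosen = {}  # timezero -> (issue_date, forecast)
    for time_zero, issue_date, forecast in versions:
        if as_of is not None and issue_date > as_of:
            continue  # this version did not exist yet as of the cutoff
        if time_zero not in chosen or issue_date > chosen[time_zero][0]:
            chosen[time_zero] = (issue_date, forecast)
    return {tz: fc for tz, (_, fc) in chosen.items()}
```

Calling `latest_versions(VERSIONS, as_of=date(2020, 10, 6))` selects f1 for tz1 and f2 for tz3, with no entry for tz2, matching the 10/6 row. An issue_date query, by contrast, would be a plain equality filter on the second column.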

@nickreich
Member

Looks great, Matt! I think it'd be nice for the web ui to show all versions where feasible (e.g. in tables) but maybe to choose the most recent where not feasible (e.g. in the forecast summary visualization)

@matthewcornell
Member Author

matthewcornell commented Oct 29, 2020

Agreed. I scoped out as many web ui changes like this as I could think of, including:

  • ForecastModel detail: show multiple rows for each timezero

I know we will notice others as I code along.

@nickreich
Member

perfect!
