Compare time series from data sources: JHU, Risklayer, RKI, ZEIT ONLINE, ... #58

Closed
jgehrcke opened this issue Mar 28, 2020 · 1 comment

jgehrcke commented Mar 28, 2020

With the current tooling in this repo, we're approaching the point where it's easy to compare time series obtained from different data sources, and to do so continuously (with automation). A simple plot showing the four time series named in the title will reveal a lot about the relationships, differences, and commonalities between the data sources.
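
Before plotting, the series have to be put on a common date axis. A minimal sketch of that step, assuming each source has already been loaded into a plain mapping from date to cumulative case count; the source names, the numbers, and the helpers `align` and `differences` are all hypothetical and not part of this repo's tooling:

```python
from datetime import date

# Hypothetical cumulative case counts per source, keyed by date.
# The numbers are made up for illustration; they are not real data.
series = {
    "source_a": {date(2020, 3, 25): 100, date(2020, 3, 26): 150, date(2020, 3, 27): 210},
    "source_b": {date(2020, 3, 26): 140, date(2020, 3, 27): 205, date(2020, 3, 28): 260},
}


def align(series_by_source):
    """Return (date, {source: value}) rows for dates present in every source."""
    common = set.intersection(*(set(s) for s in series_by_source.values()))
    return [
        (d, {name: s[d] for name, s in series_by_source.items()})
        for d in sorted(common)
    ]


def differences(rows, reference):
    """Per aligned date, subtract the reference source's value from each other source."""
    return [
        (d, {n: v - vals[reference] for n, v in vals.items() if n != reference})
        for d, vals in rows
    ]


rows = align(series)
diffs = differences(rows, "source_a")
for d, delta in diffs:
    print(d.isoformat(), delta)
```

Restricting the comparison to dates that every source covers avoids reading a missing day as a difference between sources.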

@jgehrcke jgehrcke added the task label Mar 28, 2020

jgehrcke commented Mar 28, 2020

A preliminary plot that I've just prepared:

[Figure: data-sources-comparison]

Some quick observations:

  • Qualitatively, all sources show the same picture.
  • "Meldeverzug" (reporting delay) is real and significant: data reaches the RKI with a multi-day, sometimes even multi-week delay.
  • Only the RKI properly accounts for Meldeverzug: it re-writes history every day, adding individual data points to its time series based on the original "Meldedatum" (reporting date), which is probably the date on which the local Gesundheitsamt (health authority) learns about a case.
  • Quantitatively, for a day that is more than two days in the past, the non-RKI sources record a case count that is 5,000 to 10,000 lower than what the RKI tracks for that day.
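
The effect of Meldeverzug on the two kinds of time series can be illustrated with a toy model. This is only a sketch: it assumes that a source which does not re-write history counts each case on the day it becomes known centrally, while a back-filled series counts it on its Meldedatum. The case list and the helper `cumulative` are hypothetical.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical cases as (Meldedatum day, day the case became known centrally).
# Day indices are made up for illustration; they are not real data.
cases = [(0, 0), (0, 2), (1, 1), (1, 3), (2, 2), (2, 4), (3, 3)]


def cumulative(day_of_each_case, n_days):
    """Cumulative case count per day index, given the day assigned to each case."""
    counts = Counter(day_of_each_case)
    return list(accumulate(counts.get(d, 0) for d in range(n_days)))


n_days = 5
# Back-filled series: every case is attributed to its Meldedatum.
by_meldedatum = cumulative((m for m, _ in cases), n_days)
# Series without back-filling: every case is attributed to its arrival day.
by_arrival = cumulative((a for _, a in cases), n_days)
# How far the non-back-filled series lags behind for each day.
gap = [m - a for m, a in zip(by_meldedatum, by_arrival)]

print(by_meldedatum)  # [2, 4, 6, 7, 7]
print(by_arrival)     # [1, 2, 4, 6, 7]
print(gap)            # [1, 2, 2, 1, 0]
```

In this toy model both series agree once all delayed cases have arrived, but for earlier days the series without back-filling stays below the back-filled one, which is the same direction as the gap observed above.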
