some reason why RKI data is not updated since at 2020-04-08? #93

Closed
avila opened this issue Apr 11, 2020 · 4 comments
Labels
feedback feedback and questions.

Comments


avila commented Apr 11, 2020

It seems that the RKI data has not been updated since 2020-04-08T17:00:00+0000. Is there a reason for that? I haven't found an issue on the topic, sorry if I missed it.
And thanks for the effort, by the way!

jgehrcke added the "feedback" label (feedback and questions) Apr 14, 2020
jgehrcke (Owner) commented Apr 14, 2020

Hey @avila thanks for asking.

First of all, the data are still being updated regularly :-).

But I think there is a deeper aspect to your question which certainly deserves attention.

You asked on April 11 and you saw the last data point from April 8.

I updated RKI data today (April 14) and yet the last data point is from April 12.

This is intended and good!

Sometimes the RKI's ArcGIS system doesn't yield any data at all for the last 1-2 days. When it does, these data points are known to significantly underestimate the actual count ("actual count" being the official RKI count when queried a couple of days later).

When one looks at the RKI time series data in their ArcGIS system today, it is reasonable to assume that only the data points up until (today - 2 days) reflect the actual count reasonably well.
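A minimal sketch of that rule of thumb, assuming the time series is available as a plain mapping from date to cumulative case count. The function name `drop_recent_days`, the `lag_days` parameter, and all numbers are made up for illustration; this is not how the project itself processes the data.

```python
from datetime import date, timedelta


def drop_recent_days(series, today, lag_days=2):
    """Keep only data points dated on or before today - lag_days.

    `series` maps datetime.date -> cumulative case count (assumed layout).
    """
    cutoff = today - timedelta(days=lag_days)
    return {day: count for day, count in series.items() if day <= cutoff}


# Made-up numbers, for illustration only.
series = {
    date(2020, 4, 10): 1000,
    date(2020, 4, 11): 1100,
    date(2020, 4, 12): 1150,
    date(2020, 4, 13): 1160,
    date(2020, 4, 14): 1161,
}

trusted = drop_recent_days(series, today=date(2020, 4, 14))
print(max(trusted))  # 2020-04-12
```

With `today` set to April 14, the last retained data point is from April 12, which mirrors the situation described above.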

When you want to see case count data, you should sort your motivation into one of two categories: i) I want a somewhat sensationalistic impression of the current state, or ii) I want to understand as well as possible how the case count numbers evolved historically up until 1-2 days ago.

A bit of a guideline:

  • For the "current state" (in whichever way we'd like to define that) it's "best" to look at https://covid19-germany.appspot.com/now (based on the most recent data obtained from Gesundheitsaemter (local health authorities) by Risklayer, ZEIT ONLINE). This is, however, a sensationalistic way of looking at things, because a couple of days later you'll find that the RKI data point for what used to be "now" is actually more credible (often a higher count, but not always).

  • For looking at the time evolution and spatial evolution of confirmed COVID-19 cases it's best to look at the RKI time series data, but here we simply have to accept that the data points of the last couple of days are significantly affected by Meldeverzug (reporting delay).

jgehrcke (Owner) commented Apr 14, 2020

A plot I created on March 31 to visualize this effect:

[Plot: data-sources-comparison-2020-03-31]

See how the red line (RKI data) is "above" the other lines (other data sources) for most of the past, but not for the most recent days. The decrease of the slope of the red line towards April 1 in that plot is due to processing delays and gets corrected later, as you can see in the same plot for today:

[Plot: data-sources-comparison-2020-04-14]

The low slope of the red curve in the first plot around March 31 is not visible anymore in the second plot. Notably, the red line stays "above" the other lines.

In the second plot, towards the right end of the plot, said effect is still there, though a little harder to see because of the scaling of the plot.
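The comparison of the two plots can also be put into numbers. The following is only a sketch under assumptions: two snapshots of the same time series, stored as mappings from ISO date string to cumulative count, with a hypothetical helper `revision_ratio`. All counts are invented and are not RKI data.

```python
def revision_ratio(early, late):
    """For each day present in both snapshots, return early count / late count.

    A ratio clearly below 1.0 means the early snapshot underestimated the
    count that was later reported for that same day.
    """
    return {
        day: early[day] / late[day]
        for day in sorted(early)
        if day in late and late[day] > 0
    }


# Made-up cumulative counts keyed by ISO date string, for illustration only.
snapshot_early = {
    "2020-03-28": 500,
    "2020-03-29": 560,
    "2020-03-30": 580,
    "2020-03-31": 585,
}
snapshot_late = {
    "2020-03-28": 500,
    "2020-03-29": 570,
    "2020-03-30": 640,
    "2020-03-31": 700,
}

for day, ratio in revision_ratio(snapshot_early, snapshot_late).items():
    print(day, round(ratio, 2))
```

In this invented example the ratio drops the closer a day is to the date of the early snapshot, which is the same pattern as the flattened red line at the right end of the first plot.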

jgehrcke (Owner) commented

Some quotes from RKI:

For the display of newly transmitted cases per day, the reporting date (Meldedatum) is used: the date on which the local health authority became aware of the case and recorded it electronically.

Several days can pass between the report by physicians and laboratories to the health authority (Gesundheitsamt) and the transmission of the cases to the responsible state authorities and the RKI (reporting and transmission delay, Melde- und Übermittlungsverzug). Every day, new cases are transmitted to the RKI that were reported to the health authority on the same day or already on earlier days. These cases are then added to the respective date in the chart "Neue COVID-19-Fälle/Tag" (new COVID-19 cases per day).

Because of the reporting delay (Meldeverzug), the data for the last few days in the chart are still incomplete and fill up with the data transmitted over the following days. A trend in the new infections currently occurring therefore cannot be read from the course of the transmitted data alone.
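To illustrate the mechanism the RKI describes, here is a small sketch, not RKI code: each case is modeled as an assumed pair of dates (its Meldedatum and the day it was transmitted onward), and the hypothetical helper `cases_per_day` counts cases per Meldedatum using only what had been transmitted by a given day. The cases are invented.

```python
from collections import Counter
from datetime import date, timedelta


def cases_per_day(cases, as_of):
    """Count cases per Meldedatum, using only cases transmitted by `as_of`.

    Each case is a (meldedatum, transmitted_on) pair of datetime.date values
    (an assumed data layout for this sketch).
    """
    return Counter(
        meldedatum
        for meldedatum, transmitted_on in cases
        if transmitted_on <= as_of
    )


d = date(2020, 4, 12)
# Made-up cases: all three have Meldedatum April 12, but two of them are
# transmitted only one or two days later.
cases = [
    (d, d),
    (d, d + timedelta(days=1)),
    (d, d + timedelta(days=2)),
]

print(cases_per_day(cases, as_of=d)[d])                      # 1
print(cases_per_day(cases, as_of=d + timedelta(days=2))[d])  # 3
```

The count for April 12 grows from 1 to 3 without any change to the Meldedatum of the cases, which is why the most recent days of the chart look too low at first and fill up afterwards.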

avila (Author) commented Apr 14, 2020

Wow! Thanks for the great response (and the work!).
I believe I can consider this issue very well closed! :)
