Landing page: plot new cases #81

Closed
joaopn opened this issue Apr 2, 2020 · 9 comments

Comments

@joaopn

joaopn commented Apr 2, 2020

Very cool repo!

It would be nice to also plot the daily new cases in the comparison plot on the landing page. That would help separate true divergence between sources from mere reporting delay.

Related: #58

@jgehrcke
Owner

jgehrcke commented Apr 2, 2020

Thanks for the kind feedback. You are absolutely right about the relevance of the derivatives here. Will certainly look into that.

@jgehrcke
Owner

@joaopn I spent a bit of time on that: https://covid19-germany.appspot.com/

[Image: case-rate-rw-2020-04-10]

@jgehrcke
Owner

jgehrcke commented Apr 11, 2020

This is a hopefully thorough rate calculation with a subsequent rolling window analysis.

Code:

def _build_case_rate(df):

I'll have to re-check the method and compare numbers to be sure, but I think this is already pretty meaningful for comparing sources.
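For readers following along, here is a minimal sketch of the general idea (difference the cumulative counts, then average over a rolling window). It is not the repository's `_build_case_rate` implementation; the helper names, the trailing 7-day window, and the numbers are assumptions made up for illustration.

```python
def daily_new_cases(cumulative):
    """Difference consecutive cumulative counts to get new cases per day."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def rolling_mean(values, window=7):
    """Mean over a trailing window; the first window-1 days yield no value."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Synthetic cumulative case counts for one source (not real data).
cumulative = [100, 120, 150, 190, 240, 250, 255, 300, 350, 410]

new_cases = daily_new_cases(cumulative)
smoothed = rolling_mean(new_cases, window=7)

print(new_cases)
print(smoothed)
```

A window of 7 days (or a multiple of it) averages over exactly one full week, so a weekly reporting rhythm cancels out within each window.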

@joaopn
Author

joaopn commented Apr 13, 2020

Nice comparison! Interesting that even with a pretty large rolling window the sources do not match, due to the heavy weekend effect in the RKI data. It seems likely that JHU logs the announcement date (i.e. they scrape what appears in the state press releases, with data that is presumably processed continuously), while RKI logs the true testing date, and thus the dip is due to fewer people getting tested on weekends. We should be able to confirm that with the daily number of tests performed, but afaik the RKI only gives weekly numbers.

@jgehrcke
Owner

jgehrcke commented Apr 14, 2020

the heavy weekend effect in the RKI data

Heavy, yes! That weekend effect alone is a really important observation. The entire world is going crazy with quantitative analysis of case count numbers. In Germany, this weekend effect can be used as a great example to corroborate "hey, these data need to be interpreted with great care!" :-).

I think that qualitative statements about the dynamics of the virus spread as well as about the nature of the disease are largely sufficient for estimating risk and for making decisions. But of course for our media that's not convincing enough and they have come up with quantitative analyses (which they had better leave to experts, like your research group).

For example, heute journal (ZDF) plots a "Verdopplungszeit" (doubling time) over time:

[Image: Screenshot from 2020-04-06 01-06-46]

This is IMO a pretty severe example of over-quantification... super dubious because the concept of the Verdopplungszeit (doubling time) makes most sense in the context of exponential growth. And then of course the weekend effect makes this look even more stupid. And what was most horrifying: the 9.6 (of course, determined to that precision) was "close" to the 10 that Merkel once declared to be a goal (ZDF claimed: we almost reached the goal, only 0.4 left to go!)
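To make the point about exponential growth concrete, here is a minimal sketch assuming the usual definition T = Δt · ln 2 / ln(N_end / N_start). How ZDF computed its value is not known here; the function name and the sample counts are hypothetical.

```python
import math


def doubling_time(count_start, count_end, days):
    """Doubling time in days, assuming exponential growth between two counts."""
    growth = math.log(count_end / count_start)
    if growth <= 0:
        raise ValueError("counts must grow for a doubling time to exist")
    return days * math.log(2) / growth


# Truly exponential series (doubling every 5 days): the estimate is constant.
exponential = [1000 * 2 ** (day / 5) for day in range(10)]
exp_estimates = [doubling_time(a, b, 1) for a, b in zip(exponential, exponential[1:])]

# Linear series (+100 cases per day): the "doubling time" keeps drifting upward,
# although the number of new cases per day never changes.
linear = [1000 + 100 * day for day in range(10)]
lin_estimates = [doubling_time(a, b, 1) for a, b in zip(linear, linear[1:])]

print([round(t, 2) for t in exp_estimates])
print([round(t, 2) for t in lin_estimates])
```

Under linear growth the estimate rises every day by construction, so a rising doubling time on its own says little about whether the spread has changed.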

In view of the weekend effect I think we should sarcastically apply a Fourier transform to "show" that the virus is actually a religious entity (because -- clearly -- it operates with a 7-day periodicity and rests on weekends, RIGHT?) :-).
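Joking aside, the 7-day periodicity is easy to make visible. A minimal sketch, assuming synthetic data (100 cases on weekdays, 40 on weekend days; not real RKI numbers) and a naive discrete Fourier transform written in pure Python; the function name is made up for illustration.

```python
import cmath


def dft_magnitudes(values):
    """Return (frequency index k, |DFT coefficient|) for k = 1 .. n/2.

    The mean is subtracted first so that the constant offset does not dominate.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    magnitudes = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        magnitudes.append((k, abs(coeff)))
    return magnitudes


# Eight synthetic weeks: days 5 and 6 of each week play the role of the weekend.
series = [40 if day % 7 in (5, 6) else 100 for day in range(56)]

k_peak, _ = max(dft_magnitudes(series), key=lambda pair: pair[1])
dominant_period = len(series) / k_peak

print(dominant_period)
```

The strongest component of this synthetic series sits at a period of one week; the weaker peaks at higher frequencies are harmonics of the same weekly pattern.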

@jgehrcke
Owner

jgehrcke commented Apr 14, 2020

while RKI logs the true testing date, and thus the dip is due to fewer people getting tested on weekends

It's actually the "Meldedatum" (reporting date), which sadly is not the actual testing date, but the day on which a Gesundheitsamt (local health authority) learns about a new case. The test might have been performed days before that. I am also rather sad that we are missing that part of the timeline for each individual case.

For displaying the newly transmitted cases per day, the Meldedatum (reporting date) is used: the date on which the local health authority learned of the case and recorded it electronically.

The exact time of infection of the reported cases can, as a rule, not be determined. The date of reporting to the health authority therefore best reflects the time at which the infection was established (date of diagnosis) and thus the current infection dynamics.

@joaopn
Author

joaopn commented Apr 14, 2020

By now I'm used to the weekly "spread has slowed down" news every Monday =P
I guess a bit of tea-leaf reading is to be expected in such a situation.

You are right, the Meldedatum is the reporting date. Fortunately, the RKI added a Refdatum field to the query system with what I assume is the self-reported start of symptoms: https://npgeo-corona-npgeo-de.hub.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0?selectedAttribute=Refdatum

Meldedatum: date on which the case became known to the health authority
Referenzdatum: date of illness onset or, if that is not known, the Meldedatum

It paints quite a different picture:

image

image

Refdatum (certain) plots only the data where Meldedatum > Refdatum, i.e. entries whose Refdatum cannot be the Meldedatum fallback. Unfortunately, we lose about 30% of the entries and 40% of the cases, and the undersampling is very much non-random. So the true start-of-symptoms time series lies somewhere between Refdatum (certain) and Refdatum.
It is also interesting to consider that in light of the (unfortunately only weekly) number of tests performed:

image

So one interpretation is that the ramping up in tests after 15/03 managed to catch the earlier unreported cases.
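The "Refdatum (certain)" selection described above can be sketched as follows. This is a minimal illustration on made-up records, not the code behind the plots; `Meldedatum` and `Refdatum` are the field names quoted above, while the `cases` key and the helper name are hypothetical.

```python
from collections import Counter
from datetime import date

# Synthetic records (not real RKI data); "cases" is a hypothetical count field.
records = [
    {"Meldedatum": date(2020, 3, 20), "Refdatum": date(2020, 3, 16), "cases": 3},
    {"Meldedatum": date(2020, 3, 20), "Refdatum": date(2020, 3, 20), "cases": 2},
    {"Meldedatum": date(2020, 3, 21), "Refdatum": date(2020, 3, 18), "cases": 1},
    {"Meldedatum": date(2020, 3, 21), "Refdatum": date(2020, 3, 21), "cases": 4},
]


def cases_per_day(rows, date_field):
    """Sum the case counts per calendar day of the given date field."""
    totals = Counter()
    for row in rows:
        totals[row[date_field]] += row["cases"]
    return dict(sorted(totals.items()))


# Keep only entries whose Refdatum is strictly earlier than the Meldedatum.
# Where both dates are equal, Refdatum may just be the Meldedatum fallback.
certain = [row for row in records if row["Meldedatum"] > row["Refdatum"]]

by_refdatum = cases_per_day(records, "Refdatum")
by_refdatum_certain = cases_per_day(certain, "Refdatum")

lost_entries = 1 - len(certain) / len(records)
lost_cases = 1 - sum(r["cases"] for r in certain) / sum(r["cases"] for r in records)

print(by_refdatum_certain)
print(lost_entries, lost_cases)
```

The two `lost_*` fractions correspond to the share of entries and of cases that drop out of the "certain" series.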

@jgehrcke
Owner

@joaopn just some quick meta feedback here: thanks for this exchange of ideas!

I've also had a quick look at the work of your research group and I truly appreciate it. Btw, I am a physicist myself and had close contact with MPI-PKS in Dresden during my PhD -- I truly appreciate the role of the institute you're working in :-).

@joaopn
Author

joaopn commented Apr 16, 2020

@jgehrcke Thanks for the kind words =)

@joaopn joaopn closed this as completed Apr 21, 2020