Landing page: plot new cases #81
Comments
Thanks for the kind feedback. You are absolutely right about the relevance of the derivatives here. Will certainly look into that. |
@joaopn I spent a bit of time on that: https://covid19-germany.appspot.com/ |
This is a hopefully thorough rate calculation with a subsequent rolling window analysis. Code:
I'll have to re-check the method and compare numbers to be sure, but I think this is already pretty meaningful for comparing sources. |
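The code referenced above is not included in this thread. As a rough illustration only, a daily rate with a subsequent rolling window could be computed like this; it is a minimal sketch in plain Python, assuming one cumulative count per day, and the sample numbers and function names are made up here rather than taken from the project:

```python
def daily_new_cases(cumulative):
    """Turn a cumulative case series into per-day increments."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def rolling_mean(values, window=7):
    """Trailing rolling mean; positions without a full window yield None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


# Hypothetical cumulative counts, one value per day (illustrative only).
cumulative = [100, 110, 125, 145, 150, 152, 170, 195, 225]

new_cases = daily_new_cases(cumulative)
smoothed = rolling_mean(new_cases, window=7)
```

A 7-day window is a natural choice here because it averages over exactly one weekly reporting cycle. A trailing window lags the data by a few days, while a centered one would not, so the two can look shifted against each other.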
Nice comparison! Interesting that even with a pretty large rolling window the sources do not match, due to the heavy weekend effect in the RKI data. It seems likely that JHU logs the announcement date (i.e. they scrape what appears in the state press releases, with data that is presumably processed continuously), while RKI logs the true testing date, and thus the dip is due to fewer people getting tested on weekends. We should be able to confirm that with the daily number of tests performed, but afaik the RKI only gives weekly numbers. |
Heavy, yes! That weekend effect alone is a really important observation. The entire world goes crazy with quantitative analysis of case count numbers. In Germany, this weekend effect can be used as a great example to corroborate "hey, these data need to be interpreted with great care!" :-). I think that qualitative statements about the dynamics of the virus spread as well as about the nature of the disease are largely sufficient for estimating risk and for making decisions. But of course for our media that's not convincing enough, and they have come up with quantitative analyses (which they had better leave to experts, like your research group).
For example, heute journal (ZDF) plots a "Verdopplungszeit" (doubling time) over time. This is IMO a pretty severe example of over-quantification... super dubious, because the concept of the Verdopplungszeit makes most sense in the context of exponential growth. And then of course the weekend effect makes this look even more stupid. And what was most horrifying: the 9.6 (of course, determined to that precision) was "close" to the 10 that Merkel once declared to be a goal (ZDF claimed: we almost reached the goal, only 0.4 left to go!).
In view of the weekend effect I think we should sarcastically apply a Fourier transform to "show" that the virus is actually a religious entity (because -- clearly -- it operates with a 7-day periodicity and rests on weekends, RIGHT?) :-). |
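To make the point about the Verdopplungszeit concrete: under exponential growth, going from N1 to N2 cases over d days implies a doubling time of d * ln(2) / ln(N2 / N1). The sketch below applies that formula to two synthetic series (both invented for illustration, not real case data). It stays constant for the exponential series and keeps drifting upward for the linear one, which is why a single number such as 9.6 says little once growth is no longer exponential.

```python
import math


def doubling_time(n_start, n_end, days):
    """Doubling time implied by growth from n_start to n_end over `days` days,
    assuming exponential growth across that interval."""
    return days * math.log(2) / math.log(n_end / n_start)


def series_doubling_times(series, days=7):
    """Doubling time for every point, looking back `days` steps."""
    return [
        doubling_time(series[i - days], series[i], days)
        for i in range(days, len(series))
    ]


# Synthetic cumulative series (illustrative only).
exponential = [100 * 2 ** (t / 5) for t in range(30)]  # doubles every 5 days
linear = [100 + 50 * t for t in range(30)]             # constant 50 new cases/day

exp_times = series_doubling_times(exponential)
lin_times = series_doubling_times(linear)
```

Here `exp_times` stays at 5 days throughout, while `lin_times` increases with every step even though the number of new cases per day never changes.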
It's actually the "Meldedatum", which sadly is not the actual testing date but the date on which a Gesundheitsamt learns about a new case. The test might have been performed days before that. I am also rather sad that we are missing that part of the timeline for each individual case.
|
By now I'm used to the weekly "spread has slowed down" news every Monday =P You are right, the Meldedatum is the reported date. Fortunately, the RKI added a Refdatum field to the query system with what I assume is the self-reported start of symptoms: https://npgeo-corona-npgeo-de.hub.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0?selectedAttribute=Refdatum
It paints quite a different picture: Refdatum (certain) plots only the data where Meldedatum > Refdatum. Unfortunately, we lose about 30% of the entries and 40% of the cases, and the undersampling is very much non-random. So the true start-of-symptoms time series lies somewhere between Refdatum (certain) and Refdatum. One interpretation is that the ramp-up in testing after 15/03 managed to catch the earlier unreported cases. |
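The "Refdatum (certain)" selection described above can be sketched as a simple filter. This is a minimal illustration with made-up rows: only the field names Meldedatum and Refdatum come from the thread, while the `cases` key and all values are assumptions for the example.

```python
from datetime import date

# Made-up rows (illustrative only). Each entry may cover several cases.
rows = [
    {"Meldedatum": date(2020, 3, 16), "Refdatum": date(2020, 3, 12), "cases": 3},
    {"Meldedatum": date(2020, 3, 17), "Refdatum": date(2020, 3, 17), "cases": 2},
    {"Meldedatum": date(2020, 3, 18), "Refdatum": date(2020, 3, 14), "cases": 1},
    {"Meldedatum": date(2020, 3, 18), "Refdatum": date(2020, 3, 18), "cases": 4},
]

# Keep only entries whose reference date lies strictly before the report date.
certain = [r for r in rows if r["Meldedatum"] > r["Refdatum"]]

# Share of entries and of cases dropped by the filter.
entries_lost = 1 - len(certain) / len(rows)
cases_lost = 1 - sum(r["cases"] for r in certain) / sum(r["cases"] for r in rows)
```

Tracking both fractions matters because entries carry different case counts, so the share of entries dropped and the share of cases dropped can differ, as with the roughly 30% and 40% quoted above.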
@joaopn just some quick meta feedback here: thanks for this exchange of ideas! I've also had a quick look at the work of your research group and I truly appreciate it. Btw, I am a physicist myself; I also had close contact with MPI-PKS in Dresden during my PhD -- truly appreciate the role of the institute you're working in :-). |
@jgehrcke Thanks for the kind words =) |
Very cool repo!
It would be nice to also plot the daily new cases for the comparison plot on the landing page. It would help separate true divergence from reporting delay between sources.
Related: #58