Don't display last two weeks of reproduction number and infectious people; they're inaccurate (underestimated) #410

Closed
EwoutH opened this issue Sep 22, 2020 · 4 comments

EwoutH commented Sep 22, 2020

On September 15th the dashboard was updated with RIVM data stating that the reproduction number for September 4th lay within a confidence interval of 0.92 to 1.07.
Screenshot_223

Today (September 22nd) the reproduction number for September 4th was updated to 1.33, with a confidence interval ranging from 1.26 to 1.40. That is nowhere near the interval published a week earlier.
Screenshot_224
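
To make the mismatch concrete, here is a minimal Python sketch that checks the figures quoted in this issue. The helper name `within_interval` is hypothetical and is not part of the dashboard or of any RIVM tooling.

```python
def within_interval(value: float, lower: float, upper: float) -> bool:
    """Return True if value lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper


# Figures quoted in this issue, all for September 4th.
earlier_lower, earlier_upper = 0.92, 1.07  # interval published on September 15th
revised_r = 1.33                           # point estimate published on September 22nd
revised_lower, revised_upper = 1.26, 1.40  # interval published on September 22nd

# The revised estimate falls outside the interval published a week earlier.
print(within_interval(revised_r, earlier_lower, earlier_upper))  # False

# The two intervals do not overlap at all: the revised lower bound
# is above the earlier upper bound.
print(revised_lower > earlier_upper)  # True
```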

A downward bias in the most recent estimates has been visible for several weeks. This is highly misleading, since it suggests R is declining across all dates for which an effective R has not been calculated yet.

Aside from removing it from the dashboard, I would also strongly recommend upstreaming this issue to the RIVM so they can improve their calculation, or at least be more honest about the large uncertainty of these estimates/predictions.

EwoutH commented Sep 25, 2020

@VWSCoronaDashboard Could you (assign someone to) review this issue?

@minvws minvws deleted a comment from VWSCoronaDashboard Sep 28, 2020
@VWSCoronaDashboard7
Contributor

@EwoutH We monitor all issues but cannot always reply immediately. This question is not a technical issue but rather a question about data. We have therefore reached out to the RIVM on your behalf to ask for an explanation of the deviation you mention. Here's the response:

The reproduction number (R value) is estimated from the trend in the number of reported cases. The faster the number of reports rises, the higher the estimated reproduction number. Because of the reporting delays and the time between successive infections, in the Netherlands we can only make reliable estimates of the reproduction number for dates more than 14 days ago.

The stated uncertainty margin is a 95% confidence interval. This means that the estimated reproduction number lies within that margin with 95% certainty. However, this is not 100% certain, so it can happen that the final estimate falls outside it.

Hopefully this answers some of your questions. More about the R-value and how it is calculated can be found in the epidemiological report of the RIVM ( https://www.rivm.nl/documenten/wekelijkse-update-epidemiologische-situatie-covid-19-in-nederland ) on pages 29 and 30.

There is currently no intention to remove the reproduction number from the dashboard. We do appreciate the feedback and the case example, and we will take them into consideration when thinking about how to explain R-value related data as clearly as possible.

OnixGH commented Sep 28, 2020

@VWSCoronaDashboard7 the RIVM response states that: "[..] we are only able to make reliable R estimates for 14 days ago and before".

In other words, the estimates for the past 14 days are not reliable, which I believe is the issue @EwoutH raised that IMHO still stands.

Technically speaking, the dashboard acknowledges this unreliability by not plotting R for this period (and mentions that in the legend), but at the same time the 95% CI around the R is still plotted, which effectively means you are still using the unreliable R.

I would also suggest removing the 95% CI visualization for the past 14 days, or at the very least changing it visually so it is clearly different from the 95% CI for >= 14 days ago, which IS based on a reliable estimate of R.
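
As a rough illustration of the "visually different" option, each data point could be labelled by age before plotting. This is only a sketch in Python: the `classify` function and its labels are hypothetical, and the 14-day threshold is taken from the RIVM response quoted above. Treating exactly 14 days as reliable is an assumption.

```python
from datetime import date, timedelta

# Threshold from the RIVM response; the exact boundary handling is an assumption.
RELIABILITY_LAG = timedelta(days=14)


def classify(point_date: date, today: date) -> str:
    """Label a point 'reliable' if it is at least 14 days old, else 'provisional'."""
    if today - point_date >= RELIABILITY_LAG:
        return "reliable"
    return "provisional"


today = date(2020, 9, 28)
print(classify(date(2020, 9, 4), today))   # reliable (24 days old)
print(classify(date(2020, 9, 20), today))  # provisional (8 days old)
```

A chart could then draw the provisional band in a different style, or skip it entirely.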

EwoutH commented Sep 28, 2020

@VWSCoronaDashboard7 Thank you for your response. @OnixGH is right: the issue is that the 95% interval estimates for the past 14 days are not reliable and should therefore not be displayed on the dashboard.

So my suggestion would be:

  • Do show the part of the graph for which an effective R is calculated.
  • Do not show the part of the graph for which only a 95% interval is given.
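
The two rules above amount to a simple filter. Below is a minimal Python sketch, assuming a hypothetical data shape (`RPoint`) in which `r` is `None` for dates that only have an interval. It does not reflect the dashboard's actual data model.

```python
from typing import List, NamedTuple, Optional


class RPoint(NamedTuple):
    """One day of reproduction-number data (hypothetical shape)."""
    day: str
    r: Optional[float]  # None when only an interval has been published
    lower: float
    upper: float


def displayable(points: List[RPoint]) -> List[RPoint]:
    """Keep only the points for which an effective R has been calculated."""
    return [p for p in points if p.r is not None]


series = [
    RPoint("2020-09-04", 1.33, 1.26, 1.40),  # figures quoted in this issue
    RPoint("2020-09-18", None, 0.90, 1.10),  # made-up interval, for illustration only
]

print([p.day for p in displayable(series)])  # ['2020-09-04']
```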

R-dashboard

@EwoutH changed the title from "Remove reproduction number estimate; it's inaccurate" to "Don't display last two weeks of reproduction number and infectious people; they're inaccurate (underestimated)" on Sep 30, 2020