Difference in trends for 7-day incidence and 7-day average #528
Comments
I think the cause for this is the following:
(Source: https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Fallzahlen.html). This would explain the difference, or? (pinging @MikeMcC399 since he has a great understanding of such things) |
@Ein-Tim Start first by tapping the ℹ️ icon in the app for the definitions. Then access the raw data through According to that Excel file in tab "BL_7-Tage-Inzidenz" the 7-Day Incidence on Feb 16, 2021 of confirmed new infections was 58.7 and 7 days before that on Feb 9, 2021 it was 72.8. So that was a downwards trend of 14.1 or -19% based on the Feb 9 data. Using the tab "BL_7-Tage-Fallzahlen" I couldn't find values which matched the ones in the app, so I used the tab "Fälle-Todesfälle-gesamt" instead. Based on that I don't understand why the app is showing Trend: Steady for the 7-Day Incidence when, according to the figure I quoted, the trend is 19% down and this is more than the 5% threshold to declare it as Trend: Downwards and mark it with a green arrow. This needs to be looked at. Thanks to @nilsalex for bringing this up! |
Thank you @MikeMcC399 for checking (I can Google too, but I have to admit that you are often better in explaining (such) things than me 😅) I assume this also affects Android, or? If yes, please move it to the documentation repo. |
@Ein-Tim I think it should be looked at urgently because the 7-Day Incidence value and its trend are the figures that everybody, including politicians, is looking at to influence the decision about the easing of lockdown. |
For ease of reference here are the RKI daily reports for Feb 16, 2021 and for 7 days previously on Feb 9, 2021. 2021-02-09-en.pdf These show the figures
which is a clear downwards trend (that I am sure we are all happy to be seeing 👏!) |
The value today, Feb 17, 2021, for 7-Day Incidence is 57.0 and the trend is down, which looks good.
The data for yesterday should still be investigated though. |
@dsarkar Could you take a look at this and transfer it to the correct repo? Thanks! |
The value today, Feb 18, 2021, for 7-Day Incidence is 57.1 with "Trend: Steady".
The incidence has decreased by 7.1 or 11% of 64.2, so why does it show "Trend: Steady" not "Trend: Downwards"? * Values from Fallzahlen_Kum_Tab.xlsx |
It looks like the trend indicator is just comparing to the value from the previous day, whereas the help text says "The trend compares the value from the previous day with the value from two days ago or, for the 7-day trends, the average value from the last 7 days with the average value from the 7 days prior to that." So the displayed comparison does not correspond to the method described in the help text. (Or I have misunderstood!)
The full help text from EN "Trend" "The arrow direction indicates whether the trend is increasing, decreasing, or remaining steady – that is, demonstrates a deviation of less than 1% compared to the previous day or 5% compared to the previous week. The color indicates this trend as positive (green), negative (red), or neutral (gray). The trend compares the value from the previous day with the value from two days ago or, for the 7-day trends, the average value from the last 7 days with the average value from the 7 days prior to that." DE "Die Pfeilrichtung zeigt an, ob der Trend nach oben oder nach unten geht oder relativ stabil ist, d.h. eine Abweichung von weniger als 1% im Vortagesvergleich bzw. 5% im Vorwochenvergleich aufweist. Die Farbe bewertet diesen Trend als positiv (grün), negativ (rot) oder neutral (grau). Der Trend vergleicht den Wert vom Vortag mit dem Wert von vor zwei Tagen bzw. für die 7-Tage-Trends den Mittelwert der letzten 7 Tage mit dem der vorausgegangenen 7 Tage." |
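To make the reading of the help text concrete, here is a minimal sketch of the week-over-week rule as the commenters above understand it: a change smaller than 5% counts as steady, anything larger is up or down. The class and method names are hypothetical and this is not CWA code; it only applies that reading to the incidence values quoted earlier in the thread (58.7 vs 72.8 for Feb 16, and 57.1 vs 64.2 for Feb 18).

```java
import java.util.Locale;

// Sketch only: TrendSketch, Trend, percentChange and classify are hypothetical names.
final class TrendSketch {
    enum Trend { UPWARDS, DOWNWARDS, STEADY }

    /** Relative change of current against reference, in percent. */
    static double percentChange(double current, double reference) {
        return (current - reference) / reference * 100.0;
    }

    /** STEADY if the absolute change is below the threshold, otherwise UPWARDS or DOWNWARDS. */
    static Trend classify(double current, double reference, double thresholdPercent) {
        double change = percentChange(current, reference);
        if (Math.abs(change) < thresholdPercent) {
            return Trend.STEADY;
        }
        return change > 0 ? Trend.UPWARDS : Trend.DOWNWARDS;
    }

    public static void main(String[] args) {
        // 7-Day Incidence on Feb 16, 2021 against the value seven days earlier.
        System.out.println(String.format(Locale.ROOT, "%.1f%% %s",
                percentChange(58.7, 72.8), classify(58.7, 72.8, 5.0)));
        // 7-Day Incidence on Feb 18, 2021 against the quoted comparison value.
        System.out.println(String.format(Locale.ROOT, "%.1f%% %s",
                percentChange(57.1, 64.2), classify(57.1, 64.2, 5.0)));
    }
}
```

Under this week-over-week reading both dates classify as DOWNWARDS, which is why the "Trend: Steady" display looked wrong to the commenters.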
@MikeMcC399 regarding your last comment:
|
Correct, yes, that is what I am saying. That is how I understand the explanation in the help text. Is that the way you understand the help text as well? |
@MikeMcC399 Yes, I think I can follow. For today and today-7 days I also get -11%; for yesterday and yesterday-7 I get -16%. Even taking averages of the averaged values (which I think would be wrong), comparing the average for 11-17 Feb (59.8) with the average for 4-10 Feb (75.8) gives a change of -21.1%. |
Agreed! 👍
From my hazy memory of statistics, averages of averages is not a good thing. I think you should discard those numbers and stick with the first line. Could you pass the issue on to the originators of the statistics? I assume that the statistics are calculated by RKI and transferred to the CWA infrastructure. I couldn't find any new documentation in https://github.com/corona-warn-app/cwa-documentation covering the statistics calculations and distribution. It looks to me like there is a binary file pulled from /version/v1/stats on the DOWNLOAD_CDN_URL which suggests that the app just has the job of displaying the data, not calculating it. So if there is an issue with what is displayed then something further upstream needs to be looked at. |
@MikeMcC399 indeed, I was told that the app only displays statistical data, it does not calculate it. I created an internal ticket 5225, and additionally, I will bring this up today in a meeting. |
All,
Therefore, we decided to start a new task of communication - it's not yet clear if it becomes a blog, an FAQ entry or any other kind of media. We'll try to "translate" the intention of the statistical metrics shown in the CWA and what are the key drivers for the "trend arrow" indicator. Believe me, this will not be an easy and fast task, as it challenges us to gain trust by "translating" the statistics into consumable portions of knowledge - how to read the tiles. So, I kindly ask you to stay patient. |
One more word to @MikeMcC399 and @nilsalex : I cannot comment on the full issue here. But I want to let you know (and hope you can adjust your viewpoint and accept): The 7-day-Incidence is not a |
Thank you for the response and information! It seems that the help text is difficult to interpret correctly concerning what falls under the category of a "7-day trend". Could you help us out so that we understand this better? For each of the four values which have a trend arrow:
... could you let us know if the arrow (Upwards, Downwards or Steady) is calculated based on comparing to the corresponding number displayed the previous day or the number displayed 7 days previously? For "7-Day Incidence" you told us in the previous post that the trend depends on the number displayed from the previous day. |
We are going to write that down, I promise. |
@GisoSchroederSAP Thanks for looking into this!
I don't think there is any confusion about the definition of the 7-day incidence. And because this metric is defined as above, I really don't get how it can follow a different trend than the 7-day average, which is also -- please correct me if I'm wrong -- based on the sum of nationwide infections of the last 7 days. So I guess my question really is: Is it not the case that both numbers are the same up to a constant relative factor (of about |
@nilsalex I would also like to understand the difference in the two trends. I agree that it is not intuitively obvious that they should be different, so I'll be waiting with interest for the details of the calculations. I take on the statement from @GisoSchroederSAP that the calculations are correct, so I expect the reasons for differences will be caused by the calculation methods used. |
@MikeMcC399 Yes, I agree. Such an effect can be an explanation for the discrepancy. However, it should not be the reason. Because, the expectation is clear:
This does not change for values calculated from regional values. Any objections to this basic fact by @GisoSchroederSAP are wrong on the merits. We cannot dispute proven mathematical facts. Now, if there are different data sources for both values, we should settle on one of them, absent a good reason not to. But which reason would that be? Edit: Sorry, I did not see your latest comment. So that is the explanation. Thanks for digging into this! I would suggest consolidating the metrics. The current state breaks the expectation of any reasonable user. |
Thanks for making this double-check, @MikeMcC399 . |
I cannot state this enough:
Is just false. Assuming both metrics refer to the same set, of course---that is, both or none are correct w.r.t. symptom onset. (To be perfectly clear: Yes, it is the weighted average of local incidences. But incidentally (pun intended), this translates into the nationwide incidence which is the ratio of nationwide totals. By multiplication with national population, you have the nationwide infections over the last 7 days.) The hostility towards me because you disagree with this basic fact has no place here. I am truly disappointed that people are treated this way in this community. Now, you say "won't fix" because you have a good reason for using different numbers (one corrected, one not corrected, whatever). That is kind of acceptable, although not optimal. But your entire argument and personal attacks did not revolve around this. |
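The parenthetical claim above can be written out as a short derivation. The symbols are introduced here only for illustration: $c_r$ is the number of cases in region $r$ over the last 7 days, $p_r$ its population, $C = \sum_r c_r$ and $P = \sum_r p_r$ the nationwide totals. It assumes that both metrics are built from the same case counts and the same population figures, which is exactly the point under dispute in this thread.

```latex
\begin{align*}
I_r &= 10^5 \cdot \frac{c_r}{p_r}
  && \text{(7-day incidence of region } r\text{)} \\
\bar{I} &= \sum_r \frac{p_r}{P}\, I_r
  = \frac{10^5}{P} \sum_r c_r
  = 10^5 \cdot \frac{C}{P}
  && \text{(population-weighted average)} \\
A &= \frac{C}{7} = \frac{P}{7 \cdot 10^5}\, \bar{I}
  && \text{(7-day average of new cases)}
\end{align*}
```

Under these assumptions the two values differ only by the constant factor $P / (7 \cdot 10^5)$, so their relative changes over the same period would be identical. Different trends then have to come from different input data or different comparison periods, not from the bottom-up calculation itself.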
Then just explain the difference between all these numbers. If you'll excuse me, I'm going to stop the discussion here. |
When the facts have been checked with the product owner, we should also consider updating the FAQ https://www.coronawarn.app/en/faq/#further_details including the point about how the data movements of 7-Day Average and 7-Day Incidence are only loosely coupled with an explanation of why this is so. Probably this has not been obvious before because the RKI daily situation reports do not show a trend for these two indicators. The press tends to use the 7-Day Incidence alone. This may be the first time that the two values have been displayed together closely and with trends. The display is likely to cause confusion to other people even though it is technically correct. |
Thanks, @MikeMcC399 , I can already state that also the FAQ is under review. We definitely will enhance this communication - over time. |
I'm curiously reading this, and I really don't understand anything about these numbers, etc, so I won't make any statement here. But I want to ask: What should we do now, IIUC @nilsalex does not consider this as solved, but @GisoSchroederSAP does?
Would that be a good solution for all parties involved here? |
I never accused you of hostility or insults. I never used the idioms mentioned above. Again: I offer support in getting you contacts at the source of the data and calculations. You may discuss and resolve this there. Good evening. |
|
@Ein-Tim
If there are errors in calculations (I mean, well, the excel screenshot above clearly contains rounding errors, as pointed out in my previous comment, but I trust that this is unrelated to the actual production calculation), fix them. So, discussing 1) or 2) may warrant getting RKI or similar involved for your discussions. For me, I don't see the need to discuss anything, as 1) or 2) really is your decision. That the expectation any reasonable user has, which is
for comparable datasets is right is a fact, for which I don't see the need for further clarification. Again, you may break with this expectation for a good reason (that is, consider this as solved). But I would be very curious about this reason. |
Just to make this clear, I'm neither a Developer/Community Manager nor related to the RKI/SAP/T-Systems in any way. The Corona-Warn-App is showing the official numbers published by the RKI, so if there is any problem regarding these numbers (or the trend indicators), I would speak to the RKI. So IMHO the best option for you would be to talk to the experts, as offered by @GisoSchroederSAP. |
Oops, sorry :-)
Well, I don't need to do that. It may be necessary for the decision the developers have to make. |
Oh, one more thing: Does anyone have the population data for federal states used by the RKI and by the App? I would very much like to know them. Is it verified that those numbers match? This may in fact be a proposal I would bring towards the RKI: Include the population data in the daily numbers or at least document the data at a prominent place. Also, is the code where the calculations are performed publicly available? I am not able to find it. |
No need to apologize, I should have made this clearer 🙂
Okay, since @GisoSchroederSAP is one of the Developers (at least he is inside of the Development Team of CWA) the decision seems to be already made... |
Sorry, I have not been a developer for decades. I am just working for the Community and with the Community, trying to answer questions, follow up on issues, provide additional input, and translate proposals into development requests. As mentioned earlier, I already involved other data analysts and product management in this issue. Although the CWA just presents the values coming from the servers, I tried to explain the way of calculation here. As we disagree here, @nilsalex , I invite you once more to convince the experts at the source of the data. So far, I don't see a calculation issue/bug here. However, multiple times I agreed:
So, if you want to question the trend indicators, feel free to ping me and I try to connect you to the experts. |
Thank you so much for this information, I did not know this 🙂 Everybody, have a good night. |
Oh, that is an important clarification. Of course, the mobile app does not perform any calculations. What I understand is:
@JsonProperty("infections_effective_7days_avg")
private Double infectionsReported7daysAvg;
@JsonProperty("infections_effective_7days_avg_growthrate")
private Double infectionsReported7daysGrowthrate;
@JsonProperty("infections_effective_7days_avg_trend_5percent")
private Integer infectionsReported7daysTrend5percent;
@JsonProperty("seven_day_incidence_1st_reported_daily")
private Double sevenDayIncidence;
@JsonProperty("seven_day_incidence_1st_reported_growthrate")
private Double sevenDayIncidenceGrowthrate;
@JsonProperty("seven_day_incidence_1st_reported_trend_1percent")
private Integer sevenDayIncidenceTrend1percent;
Now, I was under the assumption that the backend performs some calculations to provide these values---because @GisoSchroederSAP talked at great length about the bottom-up calculation, etc. My question: What is the exact source for each of these values? Does the CWA backend perform any calculations itself? I would be grateful to anyone who can answer this. |
I already mentioned in an earlier statement here, with a summary similar to the last one above, that I could reproduce all the numbers and trends from the publicly available data sources that we discussed here earlier. But to detach the discussion from my personal view, I just transferred your request to the product owner and to one of the T-Systems data analysts, @nilsalex. Let's see what we get out of that. Maybe they will forward this to the RKI directly. As soon as I get a response, I'll share it here. All, enjoy the weekend. |
Checking the values and the trends today, they are consistent with what we already found out. Using the historical data from https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Daten/Fallzahlen_Kum_Tab.xlsx the 7-Day Average of 7,420 can be confirmed. The value of the 7-Day Average 7 days before that on reporting day Feb 15, 2021 was 7,206 (50,442 / 7) - that is adding the values from Feb 9 to Feb 15, 2021 "Differenz Vortag Fälle" in "Fälle-Todesfälle-gesamt". So the 7-Day Average has gone up by 214 cases, or 3.0% of 7,206. The trend of 3% is less than the 5% hurdle, so it is categorized as a Steady trend. From the same Excel file the value of the 7-Day Incidence 60.2 from yesterday Feb 21, 2021 can be extracted. Today's value of 61.0 is an increase of 0.7 or 1.2% of yesterday's value of 60.2. The trend hurdle for comparisons with the previous day is 1%, so this trend of 1.2% is classed as Upwards. So the data and the display in the app agree with the base data from the Excel sheet published by RKI. 👍 Edit: Sorry about the decimal point and thousands separator in the screenshot. I had the locale on the device set to English (Germany) which produces strange results. I updated the text above to use comma as thousands separator and dot as decimal point, which is the usual way for English texts. |
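The arithmetic in this comment can be replayed in a few lines. The sketch below uses hypothetical names and is not CWA code; it assumes, as described above, that the 7-Day Average is compared with the value one week earlier against a 5% threshold, while the 7-Day Incidence is compared with the previous day against a 1% threshold. The incidence inputs are the rounded values 61.0 and 60.2, so the computed change can differ slightly from the 0.7 / 1.2% quoted above, which presumably comes from unrounded source values.

```java
import java.util.Locale;

// Sketch only: hypothetical names; thresholds and comparison bases as described in the comment.
final class TrendBasisSketch {
    static final double WEEKLY_THRESHOLD_PERCENT = 5.0;
    static final double DAILY_THRESHOLD_PERCENT = 1.0;

    /** Relative change of current against reference, in percent. */
    static double percentChange(double current, double reference) {
        return (current - reference) / reference * 100.0;
    }

    /** "Steady" below the threshold, otherwise "Upwards" or "Downwards". */
    static String trend(double changePercent, double thresholdPercent) {
        if (Math.abs(changePercent) < thresholdPercent) {
            return "Steady";
        }
        return changePercent > 0 ? "Upwards" : "Downwards";
    }

    public static void main(String[] args) {
        // 7-Day Average: today's value against the value one week earlier.
        double avgToday = 7420.0;
        double avgWeekAgo = 50442.0 / 7.0; // sum of daily differences, Feb 9 to Feb 15, 2021
        double avgChange = percentChange(avgToday, avgWeekAgo);

        // 7-Day Incidence: today's value against yesterday's value (rounded inputs).
        double incChange = percentChange(61.0, 60.2);

        System.out.println(String.format(Locale.ROOT, "7-Day Average:   %+.1f%% -> %s",
                avgChange, trend(avgChange, WEEKLY_THRESHOLD_PERCENT)));
        System.out.println(String.format(Locale.ROOT, "7-Day Incidence: %+.1f%% -> %s",
                incChange, trend(incChange, DAILY_THRESHOLD_PERCENT)));
    }
}
```

With these inputs the average comes out as Steady and the incidence as Upwards, matching the display: the two arrows answer different questions (change over a week vs change over a day), so they need not point the same way.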
I asked and received an answer in corona-warn-app/cwa-server#1223 (comment)
|
To summarize the findings:
There is a more detailed write-up in corona-warn-app/cwa-website#904 which is open for review. I hope that the information text regarding Trend will be acknowledged as a documentation bug and addressed through "Trend" "The arrow direction indicates whether the trend is increasing, decreasing, or remaining steady – that is, demonstrates a deviation of less than 1% compared to the previous day or 5% compared to the previous week. The color indicates this trend as positive (green), negative (red), or neutral (gray). The trend compares the value from the previous day with the value from two days ago or, for the 7-day trends, the average value from the last 7 days with the average value from the 7 days prior to that." |
Could we close this issue now? The trend for Confirmed New Infections is calculated based on a comparison to the value of the 7-Day Average one week previously whereas the trend for the 7-Day Incidence is calculated using the value one day previously. So that difference on its own is enough reason that the trends will not necessarily be the same on any one day. In your original post, you wrote under Expected Behaviour "Same trend for both indicators.". Through the research we did, we now know that it is not expected that trend will be the same, for all the reasons I gave in #528 (comment). I made a suggestion in the open issue #550 about changing the help text to explain better. Also there is a note in #535 (comment) that the FAQs will be updated. |
Sure. It is certainly not a bug because the behaviour is intended, as you explained. Let me, however, just note: I do not expect this behaviour as user as laid out in great detail and it's weird to tell the user what to expect :-) The question should really be: How does the user benefit from seeing different numbers and trends? But this is more an issue for the RKI as data source and the stakeholders as the ones who decide what information to present in the widgets. People have pointed out this inconsistency elsewhere (CWA is of course not the only medium where the data is published) but apparently it has been decided not to act on this. |
Hi @nilsalex , you are free to call it an "inconsistency" - this is your opinion, and I still don't agree here. I wanted to make this clear to avoid the impression that we agree with your point of view. I hope you understand and accept our standpoint as well. |
Thank you very much for raising this issue. I learned a lot trying to understand it myself! You should see a button at the bottom so you can close it yourself. I'm not a moderator, just a Contributor so I can't close it for you. |
Avoid duplicates
Technical details
Describe the bug
As of now (16.02.2021, 17:11 CET), CWA shows a 7-day average of 7,274 confirmed infections and a 7-day incidence of 58.7/100,000. For the 7-day average, an arrow pointing towards the lower right indicates a downward trend, while for the 7-day incidence, an arrow pointing to the right indicates a stable trend. Yesterday, the difference was even higher: a downward trend vs an upward trend.
My understanding is that both numbers are related by a factor like
and therefore, the trend should always be the same. Or is there more to it?
Steps to reproduce the issue
Open the app and swipe through the widgets.
Expected behaviour
Same trend for both indicators.
Internal Tracking ID: EXPOSUREAPP-5225