# COVID-19 is not the flu

> twitter: https://twitter.com/jperla/status/1247330008814632960

I love this plot. It's truly amazing. OK, actually I think I need to give you some context. At the beginning of the Covid-19 spread, there were many voices claiming that Covid-19 was no worse than the flu. Just to give you some examples:

- The Washington Post: [The flu is a much bigger threat than coronavirus, for now.](https://www.washingtonpost.com/health/time-for-a-reality-check-america-the-flu-is-a-much-bigger-threat-than-coronavirus-for-now/2020/01/31/46a15166-4444-11ea-b5fc-eefa848cde99_story.html) on February 1.
- Wired: [We Should Deescalate the War on the Coronavirus](https://www.wired.com/story/opinion-we-should-deescalate-the-war-on-the-coronavirus/) on January 29.
- National Post: [New Coronavirus may be no more dangerous than the flu despite worldwide alarm](https://nationalpost.com/health/new-coronavirus-may-be-no-more-dangerous-than-the-flu-despite-worldwide-alarm-experts) on February 4.


Even the president of the United States on March 9 wrote the following:

> twitter: https://twitter.com/realDonaldTrump/status/1237027356314869761

I think the problem was that people were looking at the numbers of confirmed cases and deaths, and those numbers were not that impressive. But as some smart people pointed out:

> twitter: https://twitter.com/MaxCRoser/status/1237392250637709313

So, on March 28, there were already many confirmed cases and confirmed deaths of Covid-19 in Italy. On that day I sent a message to my friend Patricio Reyes telling him that I wanted to make a plot comparing deaths from the flu and from Covid-19 in Italy. I had actually found an article stating "We estimated excess deaths of 7,027, 20,259, 15,801 and 24,981 attributable to influenza epidemics in the 2013/14, 2014/15, 2015/16 and 2016/17, respectively, using the Goldstein index":

- [Investigating the impact of influenza on excess mortality in all ages in Italy during recent seasons (2013/14–2016/17 seasons)](https://www.sciencedirect.com/science/article/pii/S1201971219303285)

So I looked at the data for Italy, and this is what it looked like:

In [1]:
#hide
import pandas as pd
import matplotlib.pyplot as plt
import altair as alt
import numpy as np

In [2]:
#hide
full_data = pd.read_csv("https://covid.ourworldindata.org/data/ecdc/full_data.csv", index_col='date')

In [3]:
#hide
# Window size deaths
WS_deaths = 7

In [4]:
#hide
new_deaths_IT = full_data[full_data['location'].str.contains('Italy')]['new_deaths']
total_deaths_IT = full_data[full_data['location'].str.contains('Italy')]['total_deaths']

In [5]:
#hide
new_deaths_IT = new_deaths_IT[new_deaths_IT>0]
total_deaths_IT = total_deaths_IT[total_deaths_IT>0]

In [25]:
#hide
new_deaths_IT = new_deaths_IT.loc[:'2020-03-28']
total_deaths_IT = total_deaths_IT.loc[:'2020-03-28']

In [28]:
#hide
data = new_deaths_IT.reset_index()
data['rolling_new_deaths'] = new_deaths_IT.rolling(window=WS_deaths).mean().values
data["New deaths"] = len(data) * ["New deaths"]
data["7-day rolling average"] = len(data) * ["7-day rolling average"]

In [29]:
#hide_input
base = alt.Chart(total_deaths_IT.reset_index()).encode(
    x=alt.X('date:N', axis=alt.Axis(title='Date')),
).properties(
    title='Covid-19 in Italy: Total number of confirmed deaths'
)

line = base.mark_line(color='firebrick').encode(
    y=alt.Y('total_deaths', axis=alt.Axis(title='Total number of confirmed deaths')),
    tooltip = ['date', 'total_deaths']
)

circles = base.mark_circle(color='firebrick', size=60).encode(
    x=alt.X('date:N', axis=alt.Axis(title='Date')),
    y=alt.Y('total_deaths', axis=alt.Axis(title='Total number of confirmed deaths')),
    tooltip = ['date', 'total_deaths']
)

line+circles

In [30]:
#hide_input
bars = alt.Chart(data.reset_index()).mark_bar(opacity=0.7).encode(
    x = alt.X('date:N', axis=alt.Axis(title='Date')),
    y = alt.Y('new_deaths:Q', axis=alt.Axis(title='Deaths')),
    tooltip = ['date', 'new_deaths'],
    opacity=alt.Opacity('New deaths', legend=alt.Legend(title=""))
)

line = alt.Chart(data.reset_index()).mark_line(point={
      "filled": True,
      "fill": "firebrick"
    }, color='firebrick').encode(
    x=alt.X('date:N', axis=alt.Axis(title='Date')),
    y = alt.Y('rolling_new_deaths:Q'),
    shape=alt.Shape('7-day rolling average', legend=alt.Legend(title=""))
)

(bars + line).properties(
    title='Covid-19 in Italy: Daily confirmed deaths'
)

So in around one month there had been more than 9,000 deaths, and daily deaths were above 900. I guess I could have made a point by comparing this to the average number of deaths per month in the worst flu season (2016/17, with 24,981 deaths, or around 2,100 deaths per month, or 68 per day), but the flu is highly seasonal, so I don't think it would have been convincing.
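The back-of-the-envelope averages above can be reproduced in a few lines. This is only a sketch: the dictionary name is mine, the figures are the ones quoted from the article, and dividing by 12 months and 365 days is a simplifying assumption that ignores seasonality.

```python
# Estimated excess deaths attributed to influenza in Italy, as quoted from the article.
flu_excess_deaths = {
    "2013/14": 7027,
    "2014/15": 20259,
    "2015/16": 15801,
    "2016/17": 24981,
}

# Pick the season with the highest estimate.
worst_season, worst_total = max(flu_excess_deaths.items(), key=lambda kv: kv[1])

# Naive averages (assumption: deaths spread evenly over 12 months / 365 days).
per_month = worst_total / 12
per_day = worst_total / 365

print(worst_season, round(per_month), round(per_day))
```

Even this flat average is far below the more than 900 daily Covid-19 deaths mentioned above, but since flu deaths cluster in winter, the flat average understates the flu's worst weeks.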

And then on April 7, I found the mentioned tweet:

> twitter: https://twitter.com/jperla/status/1247330008814632960

I think it illustrates well what was important, which was the growth rate. Some people told me that it didn't prove anything, since the number of weekly deaths was smaller than in the worst weeks of some flu seasons, but I don't agree.
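To make "growth rate" concrete, here is a minimal sketch of how a week-over-week growth factor and the implied doubling time could be computed. The weekly counts are made up for illustration and are not real data; the variable names are mine.

```python
import math

# Hypothetical weekly death counts, for illustration only (not real data).
weekly_deaths = [10, 30, 90, 270]

# Week-over-week growth factor: this week's count divided by last week's.
growth_factors = [
    later / earlier
    for earlier, later in zip(weekly_deaths, weekly_deaths[1:])
]

# Doubling time in weeks implied by each factor: ln(2) / ln(factor).
doubling_weeks = [math.log(2) / math.log(g) for g in growth_factors]

print(growth_factors)
print([round(d, 2) for d in doubling_weeks])
```

A curve whose factor stays well above 1 for several weeks will overtake any fixed level fairly quickly, which is why the absolute weekly count alone can be misleading.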

So here I recreated the plot:

In [64]:
#hide
data_raw = pd.read_csv('https://raw.githubusercontent.com/alonsosilvaallende/COVID-19/master/data/National_Custom_Data.csv', thousands=',')

In [65]:
#hide
data_season_2018_2019 = data_raw.query("SEASON == '2018-19'")
data_season_2017_2018 = data_raw.query("SEASON == '2017-18'")
data_season_2016_2017 = data_raw.query("SEASON == '2016-17'")
data_season_2015_2016 = data_raw.query("SEASON == '2015-16'")

In [66]:
#hide
# Exact match: str.contains would also match any other location whose name contains 'United States'
total_deaths_US = full_data[full_data['location'] == 'United States']['total_deaths']
total_deaths_US = total_deaths_US[total_deaths_US>0]

In [133]:
#hide
i = 0
# total_deaths_US is cumulative, so the deaths in a week are the difference
# between two cumulative totals taken 7 days apart
Week_09 = total_deaths_US.iloc[i]
Week_10 = total_deaths_US.iloc[i+7]  - total_deaths_US.iloc[i]
Week_11 = total_deaths_US.iloc[i+14] - total_deaths_US.iloc[i+7]
Week_12 = total_deaths_US.iloc[i+21] - total_deaths_US.iloc[i+14]
Week_13 = total_deaths_US.iloc[i+28] - total_deaths_US.iloc[i+21]
Week_14 = total_deaths_US.iloc[i+35] - total_deaths_US.iloc[i+28]
Week_15 = total_deaths_US.iloc[i+42] - total_deaths_US.iloc[i+35]
Week_16 = total_deaths_US.iloc[i+49] - total_deaths_US.iloc[i+42]

In [134]:
#hide
deaths_covid = [Week_09, Week_10, Week_11, Week_12, Week_13, Week_14, Week_15, Week_16]
deaths_covid

[1, 16, 40, 283, 1851, 6310, 12107, 18302]
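Weekly counts are just differences between cumulative totals taken seven days apart, so the same step can be written generically. The sketch below uses made-up cumulative totals (not the real series) and plain Python lists; with a pandas Series, `total.iloc[::7].diff()` would give the same differences.

```python
# Made-up cumulative totals, one value per day for 15 days (illustration only).
cumulative = [1, 1, 2, 3, 5, 8, 12, 17, 23, 30, 38, 47, 57, 68, 80]

# Keep every 7th day: day 0, day 7, day 14.
sampled = cumulative[::7]

# The first week is the first total itself; each later week is the
# difference between consecutive sampled totals.
weekly = [sampled[0]] + [later - earlier for earlier, later in zip(sampled, sampled[1:])]

print(weekly)
```

A quick sanity check is that the weekly counts sum back to the last sampled cumulative total.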

In [136]:
#hide
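# Week labels of a flu season: weeks 40 to 52, then weeks 1 to 39.
# Kept as a standalone list, since `data` is rebuilt in the next cell.
weeks = list(np.arange(40, 53)) + list(np.arange(1, 40))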

In [142]:
#hide
data = pd.DataFrame()
data["Season 2018-2019"] = (data_season_2018_2019['NUM INFLUENZA DEATHS']+data_season_2018_2019['NUM PNEUMONIA DEATHS']).reset_index(drop=True)
data["Season 2017-2018"] = (data_season_2017_2018['NUM INFLUENZA DEATHS']+data_season_2017_2018['NUM PNEUMONIA DEATHS']).reset_index(drop=True)
data["Season 2016-2017"] = (data_season_2016_2017['NUM INFLUENZA DEATHS']+data_season_2016_2017['NUM PNEUMONIA DEATHS']).reset_index(drop=True)
data["Season 2015-2016"] = (data_season_2015_2016['NUM INFLUENZA DEATHS']+data_season_2015_2016['NUM PNEUMONIA DEATHS']).reset_index(drop=True)

In [143]:
#hide
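data["Covid-19"] = np.nan
# Assign through .loc on the frame itself; a chained data["Covid-19"].iloc[...] = ... may write to a copy
data.loc[data.index[21:27], "Covid-19"] = deaths_covid[:-2]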
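The slice starting at row 21 lines the first Covid-19 value (week 9 of the calendar year) up with the flu seasons, whose rows start at week 40. A small sketch of that mapping, where `season_index` is a hypothetical helper of mine and 52-week years are assumed (no week 53):

```python
def season_index(week, first_week=40, weeks_per_year=52):
    """Return the 0-based row of a calendar week within a flu season.

    Assumes the season starts at `first_week` and every year has
    `weeks_per_year` weeks (no week 53).
    """
    return (week - first_week) % weeks_per_year


print(season_index(40))  # first row of the season
print(season_index(52))  # last week of the calendar year
print(season_index(1))   # first week of the new year
print(season_index(9))   # week of the first Covid-19 value
```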

In [146]:
#hide_input
line = alt.Chart(data.reset_index().melt('index')).mark_line().encode(
        x = alt.X('index:N', axis=alt.Axis(title='Week of Flu Season')),
        y = 'value',
        color='variable'
)
line.properties(width=800)

And of course, the growth rate was proof of what was going to happen. Here are the next two weeks:

In [147]:
#hide
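data["Covid-19"] = np.nan
# Assign through .loc on the frame itself; a chained data["Covid-19"].iloc[...] = ... may write to a copy
data.loc[data.index[21:28], "Covid-19"] = deaths_covid[:-1]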

In [148]:
#hide_input
line = alt.Chart(data.reset_index().melt('index')).mark_line().encode(
        x = alt.X('index:N', axis=alt.Axis(title='Week of Flu Season')),
        y = 'value',
        color='variable'
)
line.properties(width=800)

In [149]:
#hide
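data["Covid-19"] = np.nan
# Assign through .loc on the frame itself; a chained data["Covid-19"].iloc[...] = ... may write to a copy
data.loc[data.index[21:29], "Covid-19"] = deaths_covid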

In [150]:
#hide_input
line = alt.Chart(data.reset_index().melt('index')).mark_line().encode(
        x = alt.X('index:N', axis=alt.Axis(title='Week of Flu Season')),
        y = 'value',
        color='variable'
)
line.properties(width=800)

Q.E.D.