Confirming Peak of Epidemic Curve (CPEC)
========================================

An epidemic curve shows the number of new cases per day. A peak on that curve is the day on which the highest number of individuals were found to be infected, after which the daily count declines. Predicting the peak matters because it indicates the maximum number of infected patients requiring care at one time. It can help governments prepare the healthcare system, and it gives a rough estimate of when the epidemic might end. In this study, we look at two models that predict peaks and check whether their predictions are correct.

Recently, the Data Driven Innovation Lab (DDI) at the Singapore University of Technology and Design (SUTD) published a model that tries to provide peak dates for various countries: https://www.altaveu.com/documents/covid19predictionpaper20200426.pdf

Similarly, the Institute for Health Metrics and Evaluation (IHME) at the University of Washington (UW) has published models that declare peaks for some countries: https://covid19.healthdata.org/

We'll verify whether the peak dates provided by these two models are correct.
Since the peak is a local maximum, the slope of the curve should hit 0 there, changing sign from positive to negative.
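
As a minimal sketch of that idea, on synthetic data rather than the OWID series, the snippet below builds a bell-shaped curve of daily cases, estimates its slope with `np.gradient`, and finds the day on which the slope stops being positive. The curve shape, the variable names and the peak placed at day 30 are assumptions made purely for illustration.

```python
import numpy as np

# Synthetic bell-shaped epidemic curve (illustrative, not real data):
# daily new cases are constructed to peak on day 30.
days = np.arange(61)
new_cases = 1000 * np.exp(-((days - 30) ** 2) / (2 * 8 ** 2))

# First-order change (slope) of the curve, one value per day.
slope = np.gradient(new_cases)

# A peak is a day where the slope goes from positive to zero or negative.
sign = np.sign(slope)
crossings = np.nonzero((sign[:-1] > 0) & (sign[1:] <= 0))[0] + 1
peak_day = int(days[crossings[0]])
print(peak_day)
```

Real case counts are far noisier than this curve, which is why the notebook smooths the series before looking for the sign change.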

In [1]:
import pandas as pd
import numpy as np
from matplotlib.dates import date2num

For this study, we've downloaded country-wise data from "Our World In Data": 
https://ourworldindata.org/coronavirus

In [71]:
cols = ['location', 'date', 'total_cases', 'new_cases', 'total_deaths', 'new_deaths']
dates = ['date']
df = pd.read_csv("csv/owid-covid-data.csv", 
                 usecols=cols,
                 parse_dates=dates)
df.sample()

Unnamed: 0,location,date,total_cases,new_cases,total_deaths,new_deaths
11718,Russia,2020-04-11,11917,1786,94,18


For each country, the onset date is taken as the first day on which the total number of cases exceeds 100.

This is done because data from the initial days is usually noisy.

In [76]:
# consider days when total_cases exceeded 100
fdf = df.loc[df['total_cases']>100]

In [77]:
def calc_peak(df, country='India', col='new_cases', plot=True, save=False):
    # filter sort and prep data
    grp_df = df.loc[df['location']==country]
    grp_df = grp_df.sort_values(by='date')
    grp_df.reset_index(inplace=True, drop=True)
    
    # normalise daily values to pct of the country's maximum
    grp_df[col + '_pct'] = grp_df[col] / grp_df[col].max() * 100
    
    # smooth with two passes of a centred 6-day rolling mean, then take the
    # per-day change over a centred 6-day span (3 days ahead minus 3 days back)
    grp_df['avg_' + col + '_pct'] = (
        grp_df[col + '_pct']
        .rolling(6, min_periods=1, center=True).mean()
        .rolling(6, min_periods=1, center=True).mean()
    )
    grp_df['delta_' + col + '_pct'] = (
        grp_df['avg_' + col + '_pct'].shift(-3)
        - grp_df['avg_' + col + '_pct'].shift(3, fill_value=0)
    ) / 6
    
    # calc slope in degrees, assuming 5 pct per day growth = 45 deg slope. then smoothen over 6 days
    grp_df['slope'] = np.degrees(np.arctan(grp_df['delta_' + col + '_pct'] / 5))
    grp_df['slope'] = grp_df['slope'].rolling(6, min_periods=1, center=True).mean()

    # calc second order change
    grp_df['delta_slope'] = (grp_df['slope'].shift(-3) - grp_df['slope'].shift(3, fill_value=0))
    
    # calc candidate peaks, i.e. rows where the slope changes sign
    # (forward-fill NaNs first so that they are not counted as sign changes)
    sign_change = np.nonzero(np.abs(np.diff(np.sign(grp_df['slope'].ffill()))))[0]
    peaks = grp_df.loc[sign_change]
    
    # keep only maxima: the second order change must be negative
    peaks = peaks.loc[peaks['delta_slope'] < 0]
    
    # use highest peak in case multiple peaks found
    peak = peaks.loc[peaks['avg_' + col + '_pct'].idxmax()] if len(peaks) else None
    
    # return if plotting not required
    if not plot:
        return peak
    
    # plot normalised daily values and slope as two subplots
    ax_daily_pct, ax_slp = grp_df.plot(
        x = 'date',
        y = [col + '_pct', 'slope'],
        title = [country + suffix for suffix in [': % daily ' + col, ': Slope (in degrees)']],
        grid = True,
        figsize = (9,7),
        subplots = True,
        sharex = False
    )
    
    # add moving avg to chart
    grp_df.plot(x = 'date', y = 'avg_' + col + '_pct', grid = True, ax = ax_daily_pct)
    # add a horizontal zero line to the slope chart
    ax_slp.axhline(y=0, linewidth=2, color='r')
    
    # add vertical line on the peak date, if a peak was found
    if peak is not None:
        for ax in [ax_daily_pct, ax_slp]:
            ax.axvline(x=peak['date'], linewidth=2, color='r')
    
    # save charts to docs folder
    if save:
        ax_slp.get_figure().savefig("docs/assets/images/02/" + country.lower().replace(" ", "_") + ".png")
    
    return peak
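
The slope scaling used inside `calc_peak` can be checked in isolation. The sketch below applies the same `np.degrees(np.arctan(delta / 5))` mapping to a few hand-picked daily changes, expressed in percent of the maximum. The helper name `slope_degrees`, its parameter name and the sample values are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def slope_degrees(delta_pct, pct_per_day_at_45=5.0):
    """Map a daily change (in pct of the max) to an angle in degrees.

    A change of `pct_per_day_at_45` pct per day maps to 45 degrees.
    """
    delta_pct = np.asarray(delta_pct, dtype=float)
    return np.degrees(np.arctan(delta_pct / pct_per_day_at_45))

# Falling by 5 pct/day, flat, and rising by 5 pct/day.
deltas = np.array([-5.0, 0.0, 5.0])
angles = slope_degrees(deltas)
print(angles)  # approximately -45, 0 and 45 degrees
```

Because `arctan` saturates at ±90 degrees, very steep daily changes are compressed, which keeps the slope chart on a bounded scale.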

In [67]:
# Data from University of Washington and Singapore University of Technology and Design
COUNTRY_DATA = [
    {"location": "India", "uw_peak": "", "sutd_peak": "2020-04-20"},
    {"location": "China", "uw_peak": "", "sutd_peak": "2020-02-08"},
    # sutd and uw
    {"location": "United States", "uw_peak": "2020-04-15", "sutd_peak": "2020-04-10"},
    {"location": "United Kingdom", "uw_peak": "2020-04-10", "sutd_peak": "2020-04-12"},
    {"location": "Italy", "uw_peak": "2020-03-27", "sutd_peak": "2020-03-29"},
    {"location": "Spain", "uw_peak": "2020-04-01", "sutd_peak": "2020-04-02"},
    {"location": "Germany", "uw_peak": "2020-04-16", "sutd_peak": "2020-04-01"},
    {"location": "France", "uw_peak": "2020-04-05", "sutd_peak": "2020-04-03"},
    {"location": "Portugal", "uw_peak": "2020-04-03", "sutd_peak": "2020-04-06"},
    {"location": "Switzerland", "uw_peak": "2020-04-04", "sutd_peak": "2020-03-29"},
    {"location": "Slovenia", "uw_peak": "2020-04-07", "sutd_peak": "2020-03-28"},
    {"location": "Norway", "uw_peak": "2020-04-20", "sutd_peak": "2020-03-27"},
    {"location": "Netherlands", "uw_peak": "2020-04-07", "sutd_peak": "2020-04-08"},
    {"location": "Luxembourg", "uw_peak": "2020-04-11", "sutd_peak": "2020-03-27"},
    {"location": "Lithuania", "uw_peak": "2020-04-10", "sutd_peak": "2020-04-03"},
    {"location": "Latvia", "uw_peak": "2020-05-15", "sutd_peak": "2020-03-30"},
    {"location": "Iceland", "uw_peak": "2020-04-06", "sutd_peak": "2020-03-28"},
    {"location": "Hungary", "uw_peak": "2020-04-24", "sutd_peak": "2020-04-15"},
    {"location": "Greece", "uw_peak": "2020-04-03", "sutd_peak": "2020-03-30"},
    {"location": "Finland", "uw_peak": "2020-04-21", "sutd_peak": "2020-04-11"},
    {"location": "Estonia", "uw_peak": "2020-04-02", "sutd_peak": "2020-04-01"},
    {"location": "Denmark", "uw_peak": "2020-04-04", "sutd_peak": "2020-04-06"},
    {"location": "Czech Republic", "uw_peak": "2020-04-14", "sutd_peak":"2020-04-01"},
    {"location": "Cyprus", "uw_peak": "2020-03-30", "sutd_peak": "2020-04-05"},
    {"location": "Croatia", "uw_peak": "2020-04-19", "sutd_peak": "2020-04-02"},
    {"location": "Canada", "uw_peak": "2020-04-16", "sutd_peak": "2020-04-12"},
    {"location": "Belgium", "uw_peak": "2020-04-10", "sutd_peak": "2020-04-08"},
    {"location": "Austria", "uw_peak": "2020-04-08", "sutd_peak": "2020-03-26"},
    # sutd only
    {"location": "United Arab Emirates", "uw_peak": "", "sutd_peak": "2020-04-17"},
    {"location": "Ukraine", "uw_peak": "", "sutd_peak": "2020-04-21"},
    {"location": "Turkey", "uw_peak": "", "sutd_peak": "2020-04-14"},
    {"location": "Tunisia", "uw_peak": "", "sutd_peak": "2020-04-03"},
    {"location": "Thailand", "uw_peak": "", "sutd_peak": "2020-03-28"},
    {"location": "Tanzania", "uw_peak": "", "sutd_peak": "2020-04-23"},
    {"location": "Taiwan", "uw_peak": "", "sutd_peak": "2020-03-24"},
    {"location": "Sweden", "uw_peak": "", "sutd_peak": "2020-04-20"},
    {"location": "Sudan", "uw_peak": "", "sutd_peak": "2020-04-22"},
    {"location": "South Korea", "uw_peak": "", "sutd_peak": "2020-03-02"},
    {"location": "South Africa", "uw_peak": "", "sutd_peak": "2020-05-03"},
    {"location": "Somalia", "uw_peak": "", "sutd_peak": "2020-04-20"},
    {"location": "Slovakia", "uw_peak": "", "sutd_peak": "2020-04-23"},
    {"location": "Singapore", "uw_peak": "", "sutd_peak": "2020-05-05"},
    {"location": "Serbia", "uw_peak": "", "sutd_peak": "2020-04-15"},
    {"location": "Saudi Arabia", "uw_peak": "", "sutd_peak": "2020-04-27"},
    {"location": "Russia", "uw_peak": "", "sutd_peak": "2020-04-24"},
    {"location": "Romania", "uw_peak": "", "sutd_peak": "2020-04-13"},
    {"location": "Qatar", "uw_peak": "", "sutd_peak": "2020-05-27"},
    {"location": "Poland", "uw_peak": "", "sutd_peak": "2020-04-13"},
    {"location": "Philippines", "uw_peak": "", "sutd_peak": "2020-04-07"},
    {"location": "Peru", "uw_peak": "", "sutd_peak": "2020-04-18"},
    {"location": "Paraguay", "uw_peak": "", "sutd_peak": "2020-04-10"},
    {"location": "Panama", "uw_peak": "", "sutd_peak": "2020-04-12"},
    {"location": "Pakistan", "uw_peak": "", "sutd_peak": "2020-04-27"},
    {"location": "Oman", "uw_peak": "", "sutd_peak": "2020-04-19"},
    {"location": "Niger", "uw_peak": "", "sutd_peak": "2020-04-08"},
    {"location": "New Zealand", "uw_peak": "", "sutd_peak": "2020-03-29"},
    {"location": "Myanmar", "uw_peak": "", "sutd_peak": "2020-04-16"},
    {"location": "Morocco", "uw_peak": "", "sutd_peak": "2020-04-24"},
    {"location": "Montenegro", "uw_peak": "", "sutd_peak": "2020-04-03"},
    {"location": "Monaco", "uw_peak": "", "sutd_peak": "2020-03-30"},
    {"location": "Moldova", "uw_peak": "", "sutd_peak": "2020-04-13"},
    {"location": "Mexico", "uw_peak": "", "sutd_peak": "2020-05-01"},
    {"location": "Mauritius", "uw_peak": "", "sutd_peak": "2020-04-01"},
    {"location": "Malta", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Mali", "uw_peak": "", "sutd_peak": "2020-04-21"},
    {"location": "Malaysia", "uw_peak": "", "sutd_peak": "2020-03-31"},
    {"location": "Madagascar", "uw_peak": "", "sutd_peak": "2020-04-03"},
    {"location": "Macedonia", "uw_peak": "", "sutd_peak": "2020-04-12"},
    {"location": "Liechtenstein", "uw_peak": "", "sutd_peak": "2020-03-20"},
    {"location": "Liberia", "uw_peak": "", "sutd_peak": "2020-04-18"},
    {"location": "Lebanon", "uw_peak": "", "sutd_peak": "2020-03-25"},
    {"location": "Kyrgyzstan", "uw_peak": "", "sutd_peak": "2020-04-11"},
    {"location": "Kuwait", "uw_peak": "", "sutd_peak": "2020-04-22"},
    {"location": "Kosovo", "uw_peak": "", "sutd_peak": "2020-04-18"},
    {"location": "Kenya", "uw_peak": "", "sutd_peak": "2020-04-21"},
    {"location": "Kazakhstan", "uw_peak": "", "sutd_peak": "2020-05-01"},
    {"location": "Jordan", "uw_peak": "", "sutd_peak": "2020-03-26"},
    {"location": "Jersey", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Japan", "uw_peak": "", "sutd_peak": "2020-04-14"},
    {"location": "Jamaica", "uw_peak": "", "sutd_peak": "2020-04-20"},
    {"location": "Israel", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Isle of Man", "uw_peak": "", "sutd_peak": "2020-04-08"},
    {"location": "Ireland", "uw_peak": "", "sutd_peak": "2020-04-15"},
    {"location": "Iraq", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Iran", "uw_peak": "", "sutd_peak": "2020-04-01"},
    {"location": "Indonesia", "uw_peak": "", "sutd_peak": "2020-04-20"},
    {"location": "Honduras", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Guyana", "uw_peak": "", "sutd_peak": "2020-04-10"},
    {"location": "Guinea", "uw_peak": "", "sutd_peak": "2020-05-17"},
    {"location": "Guernsey", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Guatemala", "uw_peak": "", "sutd_peak": "2020-05-04"},
    {"location": "Guam", "uw_peak": "", "sutd_peak": "2020-04-01"},
    {"location": "Gibraltar", "uw_peak": "", "sutd_peak": "2020-03-30"},
    {"location": "Ghana", "uw_peak": "", "sutd_peak": "2020-05-01"},
    {"location": "Georgia", "uw_peak": "", "sutd_peak": "2020-04-13"},
    {"location": "Gabon", "uw_peak": "", "sutd_peak": "2020-04-24"},
    {"location": "Faeroe Islands", "uw_peak": "", "sutd_peak": "2020-03-16"},
    {"location": "Ethiopia", "uw_peak": "", "sutd_peak": "2020-04-12"},
    {"location": "El Salvador", "uw_peak": "", "sutd_peak": "2020-04-23"},
    {"location": "Egypt", "uw_peak": "", "sutd_peak": "2020-04-18"},
    {"location": "Ecuador", "uw_peak": "", "sutd_peak": "2020-04-12"},
    {"location": "Dominican Republic", "uw_peak": "", "sutd_peak": "2020-04-18"},
    {"location": "Djibouti", "uw_peak": "", "sutd_peak": "2020-04-16"},
    {"location": "Democratic Republic of Congo", "uw_peak": "", "sutd_peak": "2020-05-08"},
    {"location": "Cuba", "uw_peak": "", "sutd_peak": "2020-04-16"},
    {"location": "Cote D Ivoire", "uw_peak": "", "sutd_peak": "2020-04-24"},
    {"location": "Costa Rica", "uw_peak": "", "sutd_peak": "2020-03-30"},
    {"location": "Colombia", "uw_peak": "", "sutd_peak": "2020-04-25"},
    {"location": "Chile", "uw_peak": "", "sutd_peak": "2020-04-16"},
    {"location": "Cameroon", "uw_peak": "", "sutd_peak": "2020-04-08"},
    {"location": "Cambodia", "uw_peak": "", "sutd_peak": "2020-03-21"},
    {"location": "Burkina Faso", "uw_peak": "", "sutd_peak": "2020-04-04"},
    {"location": "Brunei", "uw_peak": "", "sutd_peak": "2020-03-16"},
    {"location": "Brazil", "uw_peak": "", "sutd_peak": "2020-04-21"},
    {"location": "Botswana", "uw_peak": "", "sutd_peak": "2020-04-10"},
    {"location": "Bosnia And Herzegovina", "uw_peak": "", "sutd_peak": "2020-04-07"},
    {"location": "Bolivia", "uw_peak": "", "sutd_peak": "2020-04-30"},
    {"location": "Belarus", "uw_peak": "", "sutd_peak": "2020-04-29"},
    {"location": "Barbados", "uw_peak": "", "sutd_peak": "2020-03-31"},
    {"location": "Bangladesh", "uw_peak": "", "sutd_peak": "2020-04-23"},
    {"location": "Bahrain", "uw_peak": "", "sutd_peak": "2020-05-16"},
    {"location": "Azerbaijan", "uw_peak": "", "sutd_peak": "2020-04-08"},
    {"location": "Australia", "uw_peak": "", "sutd_peak": "2020-03-27"},
    {"location": "Armenia", "uw_peak": "", "sutd_peak": "2020-04-16"},
    {"location": "Argentina", "uw_peak": "", "sutd_peak": "2020-04-11"},
    {"location": "Andorra", "uw_peak": "", "sutd_peak": "2020-03-29"},
    {"location": "Algeria", "uw_peak": "", "sutd_peak": "2020-04-10"},
    {"location": "Albania", "uw_peak": "", "sutd_peak": "2020-04-16"},
    {"location": "Afghanistan", "uw_peak": "", "sutd_peak": "2020-04-29"},
]
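
Where both models publish a date, the disagreement between them can be measured directly. The sketch below does this for three entries copied from `COUNTRY_DATA`, using only the standard library. The names `sample` and `gap_days` are hypothetical additions for illustration, and entries with an empty `uw_peak` would have to be skipped before calling the helper.

```python
from datetime import date

# Three entries copied from COUNTRY_DATA (both UW and SUTD dates present).
sample = [
    {"location": "Spain", "uw_peak": "2020-04-01", "sutd_peak": "2020-04-02"},
    {"location": "Germany", "uw_peak": "2020-04-16", "sutd_peak": "2020-04-01"},
    {"location": "Latvia", "uw_peak": "2020-05-15", "sutd_peak": "2020-03-30"},
]

def gap_days(entry):
    """Days by which the UW date is later than the SUTD date (negative if earlier)."""
    uw = date.fromisoformat(entry["uw_peak"])
    sutd = date.fromisoformat(entry["sutd_peak"])
    return (uw - sutd).days

gaps = {e["location"]: gap_days(e) for e in sample}
print(gaps)  # {'Spain': -1, 'Germany': 15, 'Latvia': 46}
```

The two models agree to within a day for Spain but differ by more than six weeks for Latvia, which is the kind of discrepancy an independent check on the observed curve can help resolve.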

In [84]:
country = []
cpec = []
cpec_death = []
uw = []
sutd = []

for c in COUNTRY_DATA:
    # peak of daily new cases as an ISO date string ('' when no peak is found)
    peak = calc_peak(fdf, c['location'], 'new_cases', plot=False, save=False)
    cpec.append(str(peak['date'].date()) if peak is not None else '')
    
    # peak of daily new deaths, same convention
    death_peak = calc_peak(fdf, c['location'], 'new_deaths', plot=False, save=False)
    cpec_death.append(str(death_peak['date'].date()) if death_peak is not None else '')

    country.append(c['location'])
    uw.append(c['uw_peak'])
    sutd.append(c['sutd_peak'])

results = pd.DataFrame({
    "country": country,
    "cpec": cpec,
    "cpec_death": cpec_death,
    "uw": uw,
    "sutd": sutd
})

print("BOTH PEAKS HIT:")
both_hit = results.loc[ (results['cpec']!='') & (results['cpec_death']!='') ]
print(both_hit.to_csv(index=False, sep='|'))

print("NO PEAK HIT:")
no_hit = results.loc[ (results['cpec']=='') & (results['cpec_death']=='') ]
print(no_hit.to_csv(index=False, sep='|'))

print("ONE PEAK HIT:")
one_hit = results.loc[ (results['cpec']!='') ^ (results['cpec_death']!='') ]
print(one_hit.to_csv(index=False, sep='|'))

BOTH PEAKS HIT:
country|cpec|cpec_death|uw|sutd
China|2020-02-13|2020-04-18||2020-02-08
United States|2020-04-10|2020-04-18|2020-04-15|2020-04-10
United Kingdom|2020-04-15|2020-04-13|2020-04-10|2020-04-12
Italy|2020-03-26|2020-03-31|2020-03-27|2020-03-29
Spain|2020-04-01|2020-04-03|2020-04-01|2020-04-02
Germany|2020-04-03|2020-04-18|2020-04-16|2020-04-01
France|2020-04-02|2020-04-09|2020-04-05|2020-04-03
Portugal|2020-04-04|2020-04-14|2020-04-03|2020-04-06
Switzerland|2020-03-30|2020-04-07|2020-04-04|2020-03-29
Slovenia|2020-03-30|2020-04-10|2020-04-07|2020-03-28
Norway|2020-03-29|2020-04-13|2020-04-20|2020-03-27
Netherlands|2020-04-12|2020-04-07|2020-04-07|2020-04-08
Luxembourg|2020-03-28|2020-04-11|2020-04-11|2020-03-27
Lithuania|2020-04-03|2020-04-16|2020-04-10|2020-04-03
Latvia|2020-03-31|2020-04-25|2020-05-15|2020-03-30
Iceland|2020-03-30|2020-04-08|2020-04-06|2020-03-28
Hungary|2020-04-12|2020-04-20|2020-04-24|2020-04-15
Greece|2020-04-02|2020-04-05|2020-04-03|2020-03-30
Finland|