# Florida COVID epicenter analysis
This analysis digs deeper into Florida's COVID data (updated daily) to assess the human cost and predict the trend of the virus. My insights have been published on [Towards Data Science](https://towardsdatascience.com/@nathanfreystaetter).

### Florida DOH COVID Case Data
This analysis uses publicly available data from the Florida Department of Health. Useful facts about the data:

1) Person-level record of all cases with a death indicator <br>
2) Includes specific demographic information related to the individual<br>
3) Data is collected from 2020-03-01 and onward<br>

### Questions
This analysis will seek to answer the following questions:<br>
1) How quickly has the virus spread in the state and which demographic drove the rate of spread?<br>
2) What is the true death rate by demographic?<br>
3) How large of an impact can we expect in the near future given current case volumes?<br>
4) Based on multiple plausible scenarios, what is the best course of action Florida can take to reopen the economy?

## Setup
### Import libraries and define global variables

In [222]:
# Import config and libraries
import config as c
import chart_studio
import chart_studio.plotly as py
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import chart_studio.tools as tls
import pandas as pd
import numpy as np
import urllib.request, json
import datetime as dt
from datetime import timedelta
from datetime import date
import matplotlib
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go

#API key to Chart studio
chart_studio.tools.set_credentials_file(username=c.cs_un, api_key=c.cs_key)

#define latest week and set global variable
today=date.today()
last_full_week = (today - dt.timedelta(days=today.weekday())).strftime('%Y-%m-%d')
death_baked_week = (today - dt.timedelta(days=today.weekday()+21)).strftime('%Y-%m-%d')
min_3w_window = (today - dt.timedelta(days=today.weekday()+42)).strftime('%Y-%m-%d')
max_3w_window = (today + dt.timedelta(days=today.weekday()+19)).strftime('%Y-%m-%d')

#Create bins based age groups
bins = [0, 25, 50, 60, 70, 80, 100]
bins_condensed = [0, 50, 70, 100]

# Customized DF output size
pd.set_option('display.max_rows', 1000)
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)
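
The week anchors defined above all start from the Monday of the current week and then step back in whole weeks. A minimal sketch of that arithmetic with a fixed date follows; the helper name `monday_of_week` and the sample date are assumptions for illustration and are not part of the notebook.

```python
from datetime import date, timedelta

def monday_of_week(d: date) -> date:
    """Return the Monday of the week containing d (weekday() is 0 for Monday)."""
    return d - timedelta(days=d.weekday())

sample_day = date(2020, 7, 29)  # a Wednesday, chosen only for illustration
week_start = monday_of_week(sample_day)

# Cases reported at least 21 days before the week start are treated
# as old enough for their death outcome to have been recorded.
baked_cutoff = week_start - timedelta(days=21)
print(week_start, baked_cutoff)
```

Using a fixed date instead of `date.today()` makes the cut-offs reproducible when the notebook is re-run later.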

### Import data from Florida DOH

In [2]:
#Florida Department of Health: Coronavirus source data
url = "https://opendata.arcgis.com/datasets/37abda537d17458bae6677b8ab75fcb9_0.geojson"

#Create pandas dataframe from Florida DOH dataset
json_url = urllib.request.urlopen(url)
data = json.loads(json_url.read())
L = []
for feature in data['features']:
    # Keep only the per-case attributes stored under 'properties'
    L.append(dict(feature.get('properties', {})))

df = pd.DataFrame(L)
df.head(5)

Unnamed: 0,County,Age,Age_group,Gender,Jurisdiction,Travel_related,Origin,EDvisit,Hospitalized,Died,Case_,Contact,Case1,EventDate,ChartDate,ObjectId
0,Broward,31,25-34 years,Female,FL resident,No,,UNKNOWN,UNKNOWN,,Yes,UNKNOWN,2020/07/01 05:00:00+00,2020/06/30 00:00:00+00,2020/07/01 05:00:00+00,1
1,Dade,45,45-54 years,Male,FL resident,Unknown,,UNKNOWN,UNKNOWN,,Yes,,2020/07/01 05:00:00+00,2020/07/01 00:00:00+00,2020/07/01 05:00:00+00,2
2,Broward,50,45-54 years,Male,FL resident,No,,YES,NO,,Yes,Yes,2020/07/01 05:00:00+00,2020/06/30 00:00:00+00,2020/07/01 05:00:00+00,3
3,Dade,81,75-84 years,Female,FL resident,Unknown,,UNKNOWN,UNKNOWN,Yes,Yes,,2020/07/01 05:00:00+00,2020/06/30 00:00:00+00,2020/07/01 05:00:00+00,4
4,Broward,56,55-64 years,Male,FL resident,No,,YES,YES,,Yes,UNKNOWN,2020/07/01 05:00:00+00,2020/06/25 00:00:00+00,2020/07/01 05:00:00+00,5
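
The flattening step can be tried offline on a tiny stand-in for the GeoJSON payload. This is a sketch only: the two records below are made up, and the `"NA"` placeholder for a missing death indicator is an assumption based on how the cleaning code tests that column.

```python
import pandas as pd

# Tiny stand-in for the GeoJSON payload; the real feed has one feature per case.
sample = {
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"County": "Broward", "Age": "31", "Died": "NA"}},
        {"type": "Feature", "geometry": None,
         "properties": {"County": "Dade", "Age": "81", "Died": "Yes"}},
    ]
}

# Each feature's "properties" dict becomes one row of the dataframe.
flat_df = pd.DataFrame([feature["properties"] for feature in sample["features"]])
print(flat_df)
```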


### Clean data

In [5]:
df.dtypes

County            object
Age               object
Age_group         object
Gender            object
Jurisdiction      object
Travel_related    object
Origin            object
EDvisit           object
Hospitalized      object
Died              object
Case_             object
Contact           object
Case1             object
EventDate         object
ChartDate         object
ObjectId           int64
dtype: object

In [14]:
df.Age.value_counts()

21     10203
30     10169
29     10118
22     10046
23      9994
25      9904
31      9847
28      9777
27      9771
24      9747
26      9675
32      9423
20      9280
33      9060
34      8949
35      8853
36      8729
37      8723
49      8457
48      8327
19      8285
39      8244
38      8236
50      8115
40      8092
47      8085
51      7930
43      7870
46      7829
41      7827
55      7812
52      7765
42      7729
45      7689
54      7613
44      7590
53      7584
56      7415
57      7348
18      7207
58      6885
59      6597
60      6313
61      5925
62      5527
63      5317
64      4944
17      4708
65      4563
66      4489
67      4263
68      3960
69      3845
16      3609
70      3521
71      3290
73      3286
72      3226
15      2937
74      2696
75      2650
76      2594
14      2590
77      2504
13      2333
78      2292
12      2194
0       2164
79      2080
80      1968
81      1941
11      1939
10      1886
82      1845
9       1823
83      1785
8       1678

In [6]:
desc_df = df.describe(include = 'all')
desc_df.loc['% null'] = df.isnull().mean()
desc_df

Unnamed: 0,County,Age,Age_group,Gender,Jurisdiction,Travel_related,Origin,EDvisit,Hospitalized,Died,Case_,Contact,Case1,EventDate,ChartDate,ObjectId
count,497330,497330.0,497330,497330,497330,497330,497330.0,485434,492858,497330.0,497330,497330.0,497330,497330,497330,497330.0
unique,68,112.0,11,3,3,3,1014.0,4,4,2.0,1,5.0,154,82637,154,
top,Dade,21.0,25-34 years,Female,FL resident,Unknown,,UNKNOWN,UNKNOWN,,Yes,,2020/07/11 05:00:00+00,2020/07/03 00:00:00+00,2020/07/11 05:00:00+00,
freq,124759,10203.0,96693,254694,491765,271276,488752.0,278047,266954,489804.0,497330,199293.0,15359,9860,15359,
mean,,,,,,,,,,,,,,,,248665.5
std,,,,,,,,,,,,,,,,143566.949025
min,,,,,,,,,,,,,,,,1.0
25%,,,,,,,,,,,,,,,,124333.25
50%,,,,,,,,,,,,,,,,248665.5
75%,,,,,,,,,,,,,,,,372997.75


In [17]:
def clean_data(df):
    '''
    INPUT:
    df - pandas dataframe that needs to be cleaned
        
    OUTPUT:
    df - a new dataframe cleaned up for analysis in the following ways:
            1. Create cap and floors to continuous columns; fill NA
            2. Fix data formats to be usable
            3. Create new columns needed for analysis
    '''
    #Fix data formats; impute missing age before casting, since a non-numeric value such as 'NA' cannot be cast to int
    df['Age'] = pd.to_numeric(df['Age'], errors='coerce').fillna(50).astype('int32') #impute missing age with an assumed median age of 50
    df['ChartDate'] = df['ChartDate'].apply( lambda x: pd.to_datetime(x)).dt.normalize()
    
    #Create cap and floors; fill NA
    df['Died_int']=np.where( df['Died'] == "NA",0, 1) #Convert Died indicator to a binary integer
    df['Age']=np.maximum(np.minimum(df['Age'],100),1) #Floor at 1 so that age 0 falls inside the first (0, ...] bin
    
    #Create new columns
    df['DateWeek'] = df.apply(lambda row: row['ChartDate'] - dt.timedelta(days=row['ChartDate'].weekday()), axis=1)
    df['binned'] = pd.cut(df['Age'], bins=bins)
    df['binned_condensed'] = pd.cut(df['Age'], bins=bins_condensed)
    df['DateMonth'] = df['ChartDate'].dt.to_period('M')
    df['avg_death_rate']=df['Died_int']
    df['tdr_date'] = df['ChartDate'] + timedelta(days=14) + timedelta(days=7)
    df['TDR_Week'] = df.apply(lambda row: row['tdr_date'] - dt.timedelta(days=row['tdr_date'].weekday()+1), axis=1)
    
    return df

In [18]:
df=clean_data(df)
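
Two of the cleaning steps are easy to check in isolation. The sketch below uses made-up ages and dates (assumptions for illustration) to show that `pd.cut` builds right-closed intervals, which is why an age of 0 would fall outside the first bin without the floor of 1, and that the week start can be computed without a row-wise `apply`.

```python
import pandas as pd

ages = pd.Series([0, 25, 26, 50, 51, 100])
bins_condensed = [0, 50, 70, 100]

# Intervals are right-closed by default: (0, 50] includes 50 but not 0.
binned = pd.cut(ages, bins=bins_condensed)
print(binned.isna().tolist())

# Vectorized Monday-of-week: subtract each date's weekday offset in days.
dates = pd.Series(pd.to_datetime(["2020-07-01", "2020-07-05", "2020-07-06"]))
week_start = dates - pd.to_timedelta(dates.dt.weekday, unit="D")
print(week_start.dt.strftime("%Y-%m-%d").tolist())
```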

### Q1. How quickly has the virus spread in the state and which demographic drove the rate of spread?

In [19]:
# Case volumes by age group and reported date

age = df[df['ChartDate']<=last_full_week].pivot_table(values=['Case1'], 
                      index=['DateWeek'],
                      columns=['binned_condensed'],
                      aggfunc='count')

# age = age.div(age.sum(1), axis=0)
age#.plot(kind='bar', stacked=True)

Unnamed: 0_level_0,Case1,Case1,Case1
binned_condensed,"(0, 50]","(50, 70]","(70, 100]"
DateWeek,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
2020-03-02 00:00:00+00:00,2,7,5
2020-03-09 00:00:00+00:00,52,52,19
2020-03-16 00:00:00+00:00,467,331,170
2020-03-23 00:00:00+00:00,1982,1248,662
2020-03-30 00:00:00+00:00,3772,2560,1157
2020-04-06 00:00:00+00:00,3628,2551,1245
2020-04-13 00:00:00+00:00,3136,2000,1113
2020-04-20 00:00:00+00:00,2891,1687,1104
2020-04-27 00:00:00+00:00,2273,1454,1040
2020-05-04 00:00:00+00:00,2082,1137,886


In [31]:
# Case volumes by age group and reported date - Plot
df2 = df[df['ChartDate']<last_full_week].groupby(['DateWeek','binned']).agg({'Case1':'count'}).reset_index()

fig = px.bar(df2,
             x="DateWeek",
             y="Case1",
             color='binned',
             barmode='stack',
             labels = {'DateWeek':'Date','Case1':'Confirmed Case Volume','binned':'Age Group'},
             title={
                'text': "Weekly COVID-19 Case Volumes by Age Group - Florida",
                'y':0.9,
                'x':0.5,
                'xanchor': 'center',
                'yanchor': 'top'})

fig.update_yaxes(tickformat = ',.0f')
fig.update_traces(hovertemplate=None)
fig.update_layout(hovermode="x")
config = {'responsive': False}

fig.show()
#py.plot(fig, filename = 'Weekly COVID-19 Case Volumes by Age Group - Florida', auto_open=False, config=config)
fig.write_image("chart_images/Weekly COVID-19 Case Volumes by Age Group - Florida.png")

### A1
Cases started to increase in late June, with similar growth across all age groups. The week of July 12th had the highest number of cases, followed by a slight decrease in the week of July 19th. This suggests that the rate of spread has slowed as authorities have tightened regulations to limit large gatherings and mandate mask wearing. However, the total number of cases is still very high.

### Q2. What is the true death rate by demographic?

In [21]:
#1D Death rates by age group over time (graph)
death_rate = df[df['ChartDate']<death_baked_week].pivot_table(values=['avg_death_rate'], 
                      index=['DateMonth'],
                      columns=['binned'],
                      aggfunc='mean')

# age = age.div(age.sum(1), axis=0)
death_rate#.plot(kind='line', legend=False)

Unnamed: 0_level_0,avg_death_rate,avg_death_rate,avg_death_rate,avg_death_rate,avg_death_rate,avg_death_rate
binned,"(0, 25]","(25, 50]","(50, 60]","(60, 70]","(70, 80]","(80, 100]"
DateMonth,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
2020-03,0.0,1.015965,2.971888,8.191808,18.005181,38.888889
2020-04,0.0,0.777202,2.2187,7.022824,17.406407,33.73339
2020-05,0.056497,0.321294,1.970009,6.041079,15.87934,29.631499
2020-06,0.021002,0.214671,1.064252,3.643349,9.174085,22.156476
2020-07,0.030108,0.146413,0.665075,2.54229,7.542001,18.23717


In [32]:
df2 = df[(df['ChartDate']<death_baked_week)&(df['ChartDate']>'2020-03-15')].groupby(['DateWeek','binned']).agg({'Died_int':'mean'}).reset_index()

fig = px.line(df2,
             x="DateWeek",
             y="Died_int",
             color='binned',
             #barmode='stack',
             labels = {'DateWeek':'Date','Died_int':'Death Rate','binned':'Age Group'},
             title={
                'text': "Average Weekly Death Rate by Case Reported Date - Florida",
                'y':0.9,
                'x':0.5,
                'xanchor': 'center',
                'yanchor': 'top'})

fig.update_yaxes(tickformat = ',.1%')
fig.update_traces(hovertemplate=None)
fig.update_layout(hovermode="x")

fig.show()
#py.plot(fig, filename = 'Average Weekly Death Rate by Case Reported Date - Florida', auto_open=False)
fig.write_image("chart_images/Average Weekly Death Rate by Case Reported Date - Florida.png")

### A2.
The death rate has decreased significantly across age groups over time. This may be driven by several factors, including an increase in positive test results, an adequate supply of ventilators and hospital beds, and improved treatment plans as doctors learn more about how to address the virus.

The spike in case volumes in late June seems to have caused the falling trend to plateau. Cases reported in the week of June 21st saw mixed results: age groups 60-70 and 80+ saw a slight increase in death rates, while ages 70-80 saw a slight decrease in death rates.

### Q3. How large of an impact can we expect in the near future given current case volumes?
Predictions of deaths will take actual case volumes and apply two key assumptions to get death volumes:<br><br>
1) Average time from case reported date to death date is equal to 21 days. This is based on empirical evidence that it takes approximately 2 weeks for infected people to succumb to the virus, plus an additional 7 days for the death to be reported. In reality, each case has some variability involved, but unfortunately the data isn't available to build a model on this assumption.<br>
2) Future death rates will be equal to past death rates in the month of June. While the death rate has decreased over time, there are many unique factors that may just as easily drive up death rates, such as higher case volumes that may potentially overwhelm existing hospital systems and yield worse outcomes.
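
The two assumptions combine into a simple rule: expected deaths equal reported cases multiplied by the assumed death rate, dated 21 days after the case was reported. A minimal sketch for a single age group follows; the case counts and the 2% rate are hypothetical values chosen only to show the mechanics.

```python
import pandas as pd

# Hypothetical weekly case counts for one age group.
cases = pd.Series(
    [1000, 2000, 4000, 3000],
    index=pd.to_datetime(["2020-06-29", "2020-07-06", "2020-07-13", "2020-07-20"]),
)

assumed_death_rate = 0.02        # assumption 2: held at a recent historical level
lag = pd.Timedelta(days=21)      # assumption 1: 14 days to death + 7 days to report

# Deaths are attributed to the week three weeks after the case was reported.
predicted_deaths = cases * assumed_death_rate
predicted_deaths.index = predicted_deaths.index + lag
print(predicted_deaths)
```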

In [48]:
df['avg_death_rate']=df['avg_death_rate']/100

In [49]:
#Create death rate assumptions by age group from the 3-week window ending at death_baked_week, merge to df
dr_df = pd.pivot_table(df[(df['ChartDate']>=(min_3w_window)) & (df['ChartDate']<death_baked_week) ],index=['binned'],values=["avg_death_rate"],aggfunc=np.mean, margins=True)
dr_df = dr_df.drop(index='All', axis=0).reset_index() #assign the result; drop() does not modify dr_df in place
df2 = df.merge(dr_df, on='binned', how='left',suffixes=('_l', '_r'))
df2 = df2.astype({'avg_death_rate_r':float})
df3 = df2.groupby(['TDR_Week','binned']).agg({'avg_death_rate_r':'sum'}).reset_index()
df4=df3[df3['TDR_Week']<max_3w_window]

In [51]:
fig = px.bar(df4,
             x="TDR_Week",
             y='avg_death_rate_r',
             color='binned',
             barmode='stack',
             labels = {'TDR_Week':'Date','avg_death_rate_r':'Predicted Death Count','binned':'Age Group'},
             title={
                'text': "Weekly Actual and Forecasted COVID-19 Death Volumes by Age Group - Florida",
                'y':0.9,
                'x':0.5,
                'xanchor': 'center',
                'yanchor': 'top'})

temp = {'TDR_Week': [last_full_week,last_full_week],
        'avg_death_rate_r': [-10000, 10000],
        'binned':[last_full_week,last_full_week]
        }
temp_df = pd.DataFrame(temp, columns = ['TDR_Week', 'avg_death_rate_r','binned'])


fig2 = px.line(temp_df
                ,x='TDR_Week'
                ,y='avg_death_rate_r'
               ,color='binned'
            ,color_discrete_sequence=["black"]
               ,hover_data={'avg_death_rate_r':False
                           ,'TDR_Week':False}
               ,labels = {'TDR_Week':'<--- Actuals | Predictions--->','avg_death_rate_r':'blah','binned':'Forecasted After Date'},
)

fig.add_trace(fig2.data[0])
fig.update_traces(hovertemplate=None)
fig.update_layout(hovermode="x")
fig.update_yaxes(tickformat = ',.0f', range = [0,1500])

fig.show()
figActuals=fig
#py.plot(figActuals, filename = 'Weekly Actual and Forecasted COVID-19 Death Volumes by Age Group - Florida', auto_open=False)
fig.write_image("chart_images/Weekly Actual and Forecasted COVID-19 Death Volumes by Age Group - Florida.png")

### A3.
Death volumes are going to continue to increase from approximately 750 in the week of July 19th to nearly 1100 in the week of July 26th and 1200 in the weeks thereafter. Since we are already seeing lower case volumes, we can expect total death volumes to start to fall after August 9th, but not by much.

### Q4. Based on multiple plausible scenarios, what is the best course of action Florida can take to reopen the economy?
There are 3 main scenarios that this analysis will focus on:<br><br>
**Scenario A** - Restart the “normal” economy: Florida started on this path. But since it lasted only 3 weeks before lockdowns resumed, this scenario has proven itself impractical. However, I have included it in my analysis to quantify the total costs this would incur if Florida were to power through.<br>
**Scenario B** - Transition to a “low-touch” economy: This is the current state of Florida after re-closing bars and some indoor venues, but it has yet to be proven as a viable solution. There isn’t data on this yet, but this scenario will assume the lost economic cost is half that of a full lockdown and case volumes fall to a midpoint between May lows and current highs.<br>
**Scenario C** - Re-instate April/May lockdowns: This option is extremely scary for the economy, especially given there is uncertainty on when it will be safe to reopen again. The vaccine is starting to look like the best way out of this pandemic, but it could take another 6-18 months to be developed and distributed, not to mention a recent survey suggests only 50% of Americans will take the vaccine.

In [223]:
df_agg=df2.groupby(['DateWeek','binned']).agg({'Case1':'count','avg_death_rate_r': 'mean'}).reset_index()
df_agg['weekly_growth_rate']=df_agg['Case1'].div(df_agg['Case1'].shift(6))
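
The `shift(6)` above compares each row with the same age group one week earlier, which only holds if every week has exactly six age-group rows in the same order. A sketch of an alternative that makes the pairing explicit with a grouped shift, using a toy frame with two made-up groups:

```python
import pandas as pd

# Toy data: three weeks, two age groups (labels and counts are hypothetical).
toy = pd.DataFrame({
    "DateWeek": ["w1", "w1", "w2", "w2", "w3", "w3"],
    "binned": ["young", "old"] * 3,
    "Case1": [100, 50, 200, 75, 300, 150],
})

# Shift within each age group, so the previous value is always the prior week
# for that same group, regardless of how many groups there are.
prev_week = toy.groupby("binned")["Case1"].shift(1)
toy["weekly_growth_rate"] = toy["Case1"] / prev_week
print(toy)
```

With two groups this matches a plain `shift(2)`; the grouped form would keep working if a week were missing a group.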

In [224]:
May_temp = df_agg[(df_agg['DateWeek']>='2020-05-03') & (df_agg['DateWeek']<'2020-05-31')]
May_temp=May_temp.groupby('binned').agg({'Case1':'mean','avg_death_rate_r': 'mean'}).reset_index()
May_temp['DateWeek']=last_full_week
May_temp['DateWeek'] = pd.to_datetime(May_temp['DateWeek'], utc = False) - timedelta(days=7)

May_temp2=May_temp.rename(columns={'Case1':'May_avg_cases'})
May_temp2=May_temp2[['DateWeek','binned','May_avg_cases']]
May_temp2['DateWeek'] = May_temp2['DateWeek'].apply( lambda x: pd.to_datetime(x)).dt.normalize()

df_agg['DateWeek']=df_agg['DateWeek'].dt.tz_convert(None)
temp = df_agg[(df_agg['DateWeek']+timedelta(days=7))==last_full_week].merge(May_temp2, on=['DateWeek','binned'], how='left')
temp['SceA_cases']=temp['Case1']

In [225]:
df_agg=df_agg[df_agg['DateWeek']<last_full_week]
for x in range(7):
    temp.loc[:,'DateWeek'] = temp['DateWeek'] + dt.timedelta(days=7)
    temp['weekly_growth_rate']=temp['weekly_growth_rate']*1 #Growth rate is held constant (factor of 1); use a factor such as 0.95 to taper it by 5% per week
    temp.loc[:,'SceA_cases'] = temp['SceA_cases']*temp['weekly_growth_rate']
    temp['SceA_deaths'] = 0
    temp.loc[:,'SceB_cases'] = ((temp['Case1']+temp['May_avg_cases'])/2*x+temp['Case1']*(3-x))/3 #Drop case volume to midpoint of May lows and current highs by August 1st
    temp['SceB_deaths'] = 0
    temp['SceC_cases'] = (temp['May_avg_cases']*x+temp['Case1']*(3-x))/3 #Return to May low case loads by August 1st
    temp['SceC_deaths'] = 0
    df_agg = pd.concat([df_agg, temp]) #DataFrame.append was removed in pandas 2.0

df_agg=df_agg.fillna({'SceA_cases': 0.0, 'SceB_cases': 0.0, 'SceC_cases': 0.0, 'SceA_deaths': 0.0,'SceB_deaths': 0.0,'SceC_deaths': 0.0})

df_agg.loc[:,'SceA_deaths'] = np.where(df_agg['DateWeek']<=max_3w_window,df_agg['avg_death_rate_r'].mul(df_agg['Case1'].shift(18)),df_agg['avg_death_rate_r'].mul(df_agg['SceA_cases'].shift(18)))
df_agg.loc[:,'SceB_deaths'] = np.where(df_agg['DateWeek']<=max_3w_window,df_agg['avg_death_rate_r'].mul(df_agg['Case1'].shift(18)),df_agg['avg_death_rate_r'].mul(df_agg['SceB_cases'].shift(18)))
df_agg.loc[:,'SceC_deaths'] = np.where(df_agg['DateWeek']<=max_3w_window,df_agg['avg_death_rate_r'].mul(df_agg['Case1'].shift(18)),df_agg['avg_death_rate_r'].mul(df_agg['SceC_cases'].shift(18)))
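
The case paths for scenarios B and C are linear ramps from the current weekly volume toward a target over three steps. The sketch below isolates that formula; the function name `ramp` and the volumes are hypothetical. Note that for steps beyond 3 the same expression keeps extrapolating past the target, so it may be worth capping the step if the loop runs longer than the ramp.

```python
def ramp(current: float, target: float, step: int, steps_to_target: int = 3) -> float:
    """Linear interpolation: equals current at step 0 and target at steps_to_target."""
    return (target * step + current * (steps_to_target - step)) / steps_to_target

current_cases, may_avg = 9000.0, 3000.0   # hypothetical weekly volumes
midpoint = (current_cases + may_avg) / 2  # scenario B target

sce_b = [ramp(current_cases, midpoint, x) for x in range(4)]
sce_c = [ramp(current_cases, may_avg, x) for x in range(4)]
print(sce_b)
print(sce_c)

# Past the ramp length the formula overshoots the target.
print(ramp(current_cases, may_avg, 6))
```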

In [226]:
#3A Death predictions by Scenario
pred_deaths = df_agg.pivot_table(values=['SceA_deaths','SceB_deaths','SceC_deaths'], 
                      index=['DateWeek'],
                      #columns=['SceA_deaths'],
                      aggfunc='sum')

# age = age.div(age.sum(1), axis=0)
pred_deaths#.plot(kind='line')

Unnamed: 0_level_0,SceA_deaths,SceB_deaths,SceC_deaths
DateWeek,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2020-03-02,0.0,0.0,0.0
2020-03-09,0.0,0.0,0.0
2020-03-16,0.0,0.0,0.0
2020-03-23,1.4,1.4,1.4
2020-03-30,3.4,3.4,3.4
2020-04-06,26.5,26.5,26.5
2020-04-13,99.8,99.8,99.8
2020-04-20,184.2,184.2,184.2
2020-04-27,208.2,208.2,208.2
2020-05-04,185.0,185.0,185.0


In [227]:
df5 = df_agg.groupby(['DateWeek']).agg({'SceA_deaths':'sum','SceB_deaths':'sum','SceC_deaths':'sum'}).reset_index()
df5 = df5.astype({'SceA_deaths':int,'SceB_deaths':int,'SceC_deaths':int}) 

df5['Scenario']="A"
figA = px.line(df5,
             x="DateWeek",
             y='SceA_deaths',
             labels = {'DateWeek':'Date','SceA_deaths':'Death Volume'},
            color='Scenario',
            color_discrete_sequence=["red"],
             title={
                'text': "Scenario Analysis - Actual and Forecasted Weekly Death Volumes",
                'y':0.9,
                'x':0.5,
                'xanchor': 'center',
                'yanchor': 'top'})
df5['Scenario']="B"
figB = px.line(df5,
             x="DateWeek",
             y='SceB_deaths',
             labels = {'DateWeek':'Date','SceB_deaths':'Death Volume'},
               color='Scenario',
               color_discrete_sequence=["green"],
             title={
                'text': "Scenario Analysis - Actual and Forecasted Weekly Death Volumes",
                'y':0.9,
                'x':0.5,
                'xanchor': 'center',
                'yanchor': 'top'})
df5['Scenario']="C"
figC = px.line(df5,
             x="DateWeek",
             y='SceC_deaths',
             labels = {'DateWeek':'Date','SceC_deaths':'Death Volume'},
               color='Scenario',
               color_discrete_sequence=["blue"],
             title={
                'text': "Scenario Analysis - Actual and Forecasted Weekly Death Volumes",
                'y':0.9,
                'x':0.5,
                'xanchor': 'center',
                'yanchor': 'top'})

temp = {'DateWeek': [last_full_week,last_full_week],
        'SceA_deaths': [-10000.00, 10000.00],
        'binned':[last_full_week,last_full_week]
                }
temp_df = pd.DataFrame(temp, columns = ['DateWeek', 'SceA_deaths','binned'])


fig_vertical = px.line(temp_df
                ,x='DateWeek'
                ,y='SceA_deaths'
                ,color='binned'
                ,color_discrete_sequence=["black"]
               ,labels = {'DateWeek':'Date','SceA_deaths':'blah','binned':'Forecasted After Date'},
)

figA.add_trace(figB.data[0])
figA.add_trace(figC.data[0])
figA.add_trace(fig_vertical.data[0])
figA.update_traces(hovertemplate=None)
figA.update_layout(hovermode="x")
figA.update_yaxes(tickformat = ',.0f', range = [0,3000])

#fig.add_bar(df2,x='TDR_Week', y='pred_deaths', barmode='stacked')


figA.show()
#py.plot(figA, filename = 'Scenario Analysis - Actual and Forecasted Weekly Death Volumes', auto_open=False)
figA.write_image("chart_images/Scenario Analysis - Actual and Forecasted Weekly Death Volumes.png")

In [228]:
#3B Economic costs
df_agg['DateRange'] = str(max_3w_window) + ' to ' + str(df_agg['DateWeek'].max().strftime('%Y-%m-%d'))
max_date = df_agg['DateWeek'].max().strftime('%Y-%m-%d')
df_agg2=df_agg[(df_agg['DateWeek']==max_date)].groupby('DateRange').agg({'SceA_deaths':'sum','SceB_deaths': 'sum','SceC_deaths': 'sum'}).reset_index()
#Pivot on DateRange, the only grouping column left in df_agg2
pt = df_agg2.pivot_table(values=['SceA_deaths','SceB_deaths','SceC_deaths'], 
                      index=['DateRange'],
                      aggfunc='sum')

pd.options.display.float_format = '{:.1f}'.format

pt2=pt.reset_index().rename(columns={'DateRange':'Scenario','SceA_deaths':'A: Restart the normal economy','SceB_deaths':'B: Transition to a low-touch economy','SceC_deaths':'C: Re-instate April/May lockdown'}).T.reset_index().drop(index=0)
pt2=pt2.rename(columns={'index':'Scenario',0:'Forecasted Death Count'})
pt2['Forecasted Death Count']=pt2['Forecasted Death Count']*4
pt2['Statistical Human Cost ($B)']=pt2['Forecasted Death Count']/100
pt2['Florida GDP ($B)']=950.76
pt2['GDP Cost ($B)'] = np.where(pt2['Scenario'].astype(str).str[0]=='C',pt2['Florida GDP ($B)']*.05,
                                np.where(pt2['Scenario'].astype(str).str[0]=='B',pt2['Florida GDP ($B)']*.025,pt2['Florida GDP ($B)']*.0125))
pt2['Difference']=pt2['Florida GDP ($B)']-pt2['Statistical Human Cost ($B)']
pt2['Breakeven Reduction in Economic Activity ($B)'] = pt2['Difference'] - pt2.iloc[0,4]
pt2['Total Economic Cost ($B)']=pt2['GDP Cost ($B)']+pt2['Statistical Human Cost ($B)']
pt2=pt2[['Scenario','Forecasted Death Count','Statistical Human Cost ($B)','Florida GDP ($B)','GDP Cost ($B)','Total Economic Cost ($B)']]
pt2.set_index('Scenario')

Unnamed: 0_level_0,Forecasted Death Count,Statistical Human Cost ($B),Florida GDP ($B),GDP Cost ($B),Total Economic Cost ($B)
Scenario,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
A: Restart the normal economy,1890.5,18.9,950.8,11.9,30.8
B: Transition to a low-touch economy,2276.2,22.8,950.8,23.8,46.5
C: Re-instate April/May lockdown,556.4,5.6,950.8,47.5,53.1
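
The totals in this table follow from two pieces of arithmetic: dividing the death count by 100 to get billions of dollars implies a statistical value of \$10M per life, and the GDP cost is a fixed share of Florida's GDP per scenario. A short sketch that reproduces the last column from the other columns (the function and constant names are illustrative only):

```python
FLORIDA_GDP_B = 950.76     # $B, as used in the notebook
VALUE_PER_LIFE_B = 0.01    # $B per death, implied by dividing death counts by 100

def total_cost_b(deaths: float, gdp_loss_share: float) -> float:
    """Statistical human cost plus lost GDP, both in $B."""
    human_cost = deaths * VALUE_PER_LIFE_B
    gdp_cost = FLORIDA_GDP_B * gdp_loss_share
    return human_cost + gdp_cost

# (forecasted deaths, share of GDP lost) per scenario, taken from the table and code above
scenarios = {"A": (1890.5, 0.0125), "B": (2276.2, 0.025), "C": (556.4, 0.05)}
totals = {name: round(total_cost_b(d, s), 1) for name, (d, s) in scenarios.items()}
print(totals)
```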


In [229]:
df6 = pt2

fig = go.Figure(data=[go.Table(
    header=dict(values=list(df6.columns),
                fill_color='paleturquoise',
                align='left'),
    cells=dict(values=[df6['Scenario'],df6['Forecasted Death Count'],df6['Statistical Human Cost ($B)'],df6['Florida GDP ($B)'],df6['GDP Cost ($B)'],df6['Total Economic Cost ($B)']],
               fill_color='lavender',
               align='left',
               format=[None,',.0f','$.2f']))
])

fig.update_layout(title_text="Total Economic Cost - Next 4 Week Forecast, Florida")

fig.show()
#py.plot(fig, filename = 'Breakeven Analysis by Scenario - August 2020 Forecasts', auto_open=False)
fig.write_image("chart_images/Breakeven Analysis by Scenario.png")

### A4.
Based on the sum of GDP cost of shutting down and the statistical human cost of deaths, it appears that the best course of action is to transition to a hybrid model where financial transactions take place in low-touch mediums and/or low-density environments.

Note: as Florida has implemented some stricter regulations in the past few weeks, scenario A now reflects some lost GDP value.

In [230]:
figFeatured=figActuals
figFeatured.add_trace(figA.data[0])
figFeatured.add_trace(figB.data[0])
figFeatured.add_trace(figC.data[0])
figFeatured.update_layout(title_text="Actual and Forecasted Weekly Death Volumes - Florida")
figFeatured.update_yaxes(tickformat = ',.0f', range = [0,1500])
figFeatured.update_traces(hovertemplate=None)


figFeatured.show()
#py.plot(figFeatured, filename = 'Actual and Forecasted Weekly Death Volumes - Florida', auto_open=True)
figFeatured.write_image("chart_images/Actual and Forecasted Weekly Death Volumes - Florida.png")

# Conclusion

Based on these assumptions, the economically optimal solution would be to transition to a **low-touch economy** for the month of August (see table below). A low-touch economy would yield the lowest sum of human and GDP costs of \\$57.56B. The least desirable outcome would be to restart the economy as normal, as this scenario would yield a total \\$111.28B in human costs alone.

The difference in total costs between these two scenarios is shocking, and it demands that the priority be to reduce the threat of the virus first and restart the economy second.