In this notebook we first create a world map showing the number of COVID-19 deaths in each country. We then forecast the number of deaths in the US using Prophet.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
#Loading the data to see number of deaths globally
death_cases_world = "/kaggle/input/ece657aw20asg4coronavirus/time_series_covid19_deaths_global.csv"
df4 = pd.read_csv(death_cases_world)
df4.head()

In [None]:
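# The date columns (fifth column onward) hold cumulative counts, so the latest date column is already the running total
df4["Total_Cases"]=df4[df4.columns[4:]].iloc[:,-1]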
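
The following toy sketch assumes the date columns of the time series file hold cumulative counts, as in the JHU CSSE format this file name matches. Under that assumption, the latest column is the total to date, while adding up all date columns counts earlier deaths repeatedly. The dates and numbers are made up for illustration.

```python
# Toy cumulative death counts for one region on three consecutive dates
# (illustrative values, not taken from the dataset).
cumulative = {"1/22/20": 1, "1/23/20": 3, "1/24/20": 6}
dates = list(cumulative)

# Total to date is simply the most recent cumulative value.
latest_total = cumulative[dates[-1]]

# Summing the cumulative columns adds every day's running total together.
sum_of_columns = sum(cumulative.values())

# Daily new deaths are the differences between consecutive cumulative values.
daily_new = [cumulative[dates[0]]] + [
    cumulative[b] - cumulative[a] for a, b in zip(dates, dates[1:])
]

print(latest_total, sum_of_columns, daily_new)
```

The daily differences add up to the latest cumulative value, which is a quick consistency check on the total.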

The next cell installs folium, which is used to build the map.

In [None]:
pip install folium

In [None]:
#Creating a map showing number of deaths globally
import folium
country=list(df4.iloc[:,1])
lat=list(df4.iloc[:,2])
long=list(df4.iloc[:,3])
total_cases=list(df4["Total_Cases"])
median1=df4["Total_Cases"].median()
def fill_color(total_cases):
    if total_cases>=median1:
        return "red"
    else:
        return "green"
# print(total_cases)
html = """
Country name:%s<br>
<a href ="https://www.google.com/search?q=%s coronavirus status" target="_blank">%s coronavirus status</a><br>
Total deaths: %s
"""
# Start zoomed out so the whole world is visible
map1=folium.Map(location=[0,0],zoom_start=2,tiles="Stamen Terrain")
fg=folium.FeatureGroup(name="My Map")

# Add one circle marker per row, coloured by whether the total is at or above the median
for i in range(len(lat)):
    iframe=folium.IFrame(html= html %(country[i],country[i],country[i],total_cases[i]),width=200,height=100)
    fg.add_child(folium.CircleMarker([lat[i],long[i]],radius=6,popup=folium.Popup(iframe),fill_color=fill_color(total_cases[i]),color="grey",fill_opacity=0.7))

map1.add_child(fg)
map1.save("Map1.html")

The code above creates an HTML file containing a world map that shows the number of deaths in each country. Next, we predict the number of deaths for one particular country, the US.

In [None]:
def deaths_US_case():
    """Return total US deaths per date as a DataFrame with columns Date and deaths_cases."""
    death_cases_US = "/kaggle/input/ece657aw20asg4coronavirus/time_series_covid19_deaths_global.csv"
    df = pd.read_csv(death_cases_US)
    df = df[df['Country/Region'] == "US"]
    # Reshape from one column per date (wide) to one row per date (long)
    df_new = df.melt(id_vars=['Province/State', 'Country/Region', 'Lat', 'Long'])
    df_new.rename(columns={"variable":"Date","value":"deaths_cases"},inplace=True)
    # Sum over all US rows for each date
    deaths_per_day = df_new.groupby("Date")['deaths_cases'].sum()
    deaths_per_day = deaths_per_day.reset_index()
    print(deaths_per_day)
    deaths_per_day = deaths_per_day[['Date','deaths_cases']]
    return deaths_per_day

deaths_cases = deaths_US_case()

In [None]:
#Prophet expects a datetime column named ds and a numeric column named y, so we rename the columns and convert the dates.

deaths_cases.rename(columns={"Date":"ds","deaths_cases":"y"},inplace=True)
deaths_cases['ds'] = pd.to_datetime(deaths_cases['ds'])
deaths_cases.sort_values(by='ds',inplace=True)

In [None]:
#Plotting deaths against the number of days since 22 January
plt_deaths = deaths_cases.reset_index()['y'].plot(title="#Death Cases Vs Day");
plt_deaths.set(xlabel="Days since 22 January", ylabel="#Death Cases");

In [None]:
#Splitting the data for the model: the last 4 days are held out as the test set
train = deaths_cases[:-4]
test = deaths_cases[-4:]

test = test.set_index("ds")
test = test['y']

In [None]:
# Model Initialize
from fbprophet import Prophet
m = Prophet()
m.fit(train)
future_dates = m.make_future_dataframe(periods=10)
# Prediction
forecast =  m.predict(future_dates)
pd.plotting.register_matplotlib_converters()
ax = forecast.plot(x='ds',y='yhat',label='Predicted death cases',legend=True,figsize=(12,8))
test.plot(y='y',label='Actual death counts',legend=True,ax=ax)


As the plot above shows, the predicted death count is not close to the actual death count. This could be because the time series changes trajectory frequently. To address this, we tune the parameters of the Prophet model. One of the key parameters is the number of changepoints (n_changepoints). Changepoints are the points in the data where the trend changes suddenly and abruptly. We set this to 4 (at least one changepoint for each month).

changepoint_prior_scale indicates how flexible the changepoints are allowed to be. A higher value gives more flexibility but can also lead to overfitting.

daily_seasonality indicates whether a daily seasonality component is included in the forecast. We set it to True for our model.
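
To make the idea of a changepoint concrete, here is a minimal sketch of a piecewise linear trend whose slope changes at given changepoints. It is a simplified illustration of the concept, not Prophet's internal code. The function name `piecewise_linear_trend` and all parameter values are hypothetical.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend value at time t for base slope k and offset m, where the slope
    changes by deltas[j] from changepoints[j] onward."""
    slope = k
    offset = m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta
            # Adjust the offset so the trend stays continuous at the changepoint
            offset -= s * delta
    return slope * t + offset


# Illustrative values: slope 1 until day 3, slope 3 until day 6, slope 6 afterwards
days = range(9)
trend = [
    piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[3, 6], deltas=[2.0, 3.0])
    for t in days
]
print(trend)
```

Larger slope changes let the trend bend more sharply, which is the kind of flexibility that changepoint_prior_scale controls in the model.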

In [None]:
from fbprophet import Prophet
m = Prophet(changepoint_range=0.95, changepoint_prior_scale=5, n_changepoints=4,daily_seasonality=True)
m.fit(train)

future_dates = m.make_future_dataframe(periods=10)
# Prediction
forecast =  m.predict(future_dates)

ax = forecast.plot(x='ds',y='yhat',label='Predicted death cases',legend=True,figsize=(12,8))
test.plot(y='y',label='Actual death counts',legend=True,ax=ax,xlim=['2020-02-01','2020-04-20'])
ax.set(xlabel="Date", ylabel="#Death Cases")

The tuned model now follows the actual data more closely. As the plot above shows, the number of deaths due to COVID-19 is expected to rise.
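
One way to back up a visual comparison between two forecasts is to compute an error metric on the held-out days. The sketch below uses the mean absolute error. The helper name and the numbers are hypothetical and are not results from this notebook.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values for four held-out days (illustration only)
actual = [100, 120, 150, 180]
default_pred = [80, 90, 100, 110]
tuned_pred = [95, 125, 140, 190]

mae_default = mean_absolute_error(actual, default_pred)
mae_tuned = mean_absolute_error(actual, tuned_pred)
print(mae_default, mae_tuned)

# With the notebook's objects, the inputs could be taken as (assumed usage):
# predicted = forecast.set_index('ds').loc[test.index, 'yhat'].values
# mean_absolute_error(test.values, predicted)
```

A lower value on the same held-out days indicates the closer fit. Keeping the forecast of each model under its own name allows both to be scored side by side.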

References used:

https://towardsdatascience.com/implementing-facebook-prophet-efficiently-c241305405a3