# Analysis and Prediction on the Coronavirus (Italy)

### by Vansh Jatana

> ## Related Work
*  For Analysis and Prediction on the Coronavirus (Iran), click [here](https://www.kaggle.com/vanshjatana/analysis-and-prediction-on-coronavirus-iran)
*  For Analysis and Prediction on the Coronavirus (South Korea), click [here](https://www.kaggle.com/vanshjatana/analysis-on-coronavirus)
*  For Machine Learning on the Coronavirus, click [here](https://www.kaggle.com/vanshjatana/machine-learning-on-coronavirus)
*  For a report on the Coronavirus, click [here](https://www.researchgate.net/publication/339738108_Analysis_On_Coronavirus)

## Current Scenario

At the time of this writing, there are 9,172 confirmed cases according to the WHO. Italy is a member state of the European Union and a popular tourist destination. Italy's first cases were confirmed on January 30th, when two Chinese tourists were found to be infected. Italy is the most affected country in Europe and the second most affected in the world after China. Many travelers from Italy were confirmed as infected with the coronavirus after visiting other countries, and many countries in Asia, the Americas, and Europe trace their local cases to Italy. At the very beginning of the outbreak, only northern Italy was affected, but within a very short time the virus had spread to the whole country.

## Libraries

In [None]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install numpy
!{sys.executable} -m pip install pandas
!{sys.executable} -m pip install seaborn
!{sys.executable} -m pip install plotly
!{sys.executable} -m pip install scikit-learn
!{sys.executable} -m pip install fbprophet
!{sys.executable} -m pip install statsmodels
!{sys.executable} -m pip install keras
!{sys.executable} -m pip install tensorflow
import numpy as np
import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import plotly.express as px
import datetime
from datetime import date, timedelta
from sklearn.cluster import KMeans
from fbprophet import Prophet
from fbprophet.plot import plot_plotly, add_changepoints_to_plot
import plotly.offline as py
# The older statsmodels.tsa.arima_model module was removed in statsmodels 0.13
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import statsmodels.api as sm
from keras.models import Sequential
from keras.layers import LSTM,Dense
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

## Reading Data

In [None]:
data=pd.read_csv("../input/covid19-in-total/covid19_italy_region.csv")

In [None]:
an_data = pd.read_csv("../input/novel-corona-virus-2019-dataset/COVID19_open_line_list.csv")

## Looking into the Data

In [None]:
an_data.head()

In [None]:
an_data = an_data[an_data['country']=='Italy']
an_data.shape


## Age Distribution of Confirmation

In [None]:
plt.figure(figsize=(10,6))
sns.set_style("darkgrid")
plt.title("Age distribution of Confirmation")
sns.kdeplot(data=an_data['age'], shade=True).set(xlim=(0))

## Age

The graph shows the age distribution of infected people; the breakdown by gender appears in a later section. Older people are more likely to become infected, especially those who have lung disease or other respiratory problems. Among men, the 40 to 50 year age group is the most likely to be infected; among women, it is the 50 to 70 year age group. As Dr. Steven Gambert, professor of medicine and director of geriatrics at the University of Maryland School of Medicine says, "Older people have higher risk of underlying health conditions; older people are already under physical stress, and their immune systems, even if not significantly compromised, simply do not have the same ability to fight viruses and bacteria." Italy has one of the oldest populations in the world and, according to EU statistics, the lowest percentage of young people in the EU.

## Gender Distribution of Confirmation

In [None]:
plt.figure(figsize=(15, 5))
plt.title('Gender')
an_data.sex.value_counts().plot.bar();

In [None]:
fig = px.pie( values=an_data.groupby(['sex']).size().values,names=an_data.groupby(['sex']).size().index)
fig.update_layout(
    font=dict(
        size=15,
        color="#242323"
    )
    )   
    
py.iplot(fig)

## Gender

The graphs show the gender distribution of confirmed cases. Some research has found that middle-aged and older men are more likely to be infected by the virus. Other researchers found that the infection rate among men and women is the same, but that the death rate among men is 2.8%, compared to 1.7% for women. Some of the factors thought to explain why men are more likely to die from the coronavirus include:

* Men lack the heightened immune response found in women
* In China, 50%-80% of men smoke, compared to only 2%-3% of women
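
The two death rates quoted above can be compared directly as a ratio. The following is a minimal sketch of that arithmetic; the helper name `rate_ratio` is hypothetical, and the inputs are simply the 2.8% and 1.7% figures cited in the text.

```python
def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Return how many times larger rate_a is than rate_b."""
    if rate_b <= 0:
        raise ValueError("rate_b must be positive")
    return rate_a / rate_b

male_death_rate = 2.8    # percent, as quoted in the text
female_death_rate = 1.7  # percent, as quoted in the text

ratio = rate_ratio(male_death_rate, female_death_rate)
print(f"Male death rate is about {ratio:.2f} times the female rate")
```

A ratio expresses the gap independently of the absolute level of either rate, which makes it easier to compare with figures from other countries.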

## Age Distribution of Confirmation by Gender

In [None]:
male_dead = an_data[an_data.sex=='male']
female_dead = an_data[an_data.sex=='female']

In [None]:
plt.figure(figsize=(10,6))
sns.set_style("darkgrid")
plt.title("Age distribution of the confirmation by gender")
sns.kdeplot(data=female_dead['age'], label="Female", shade=True).set(xlim=(0))
sns.kdeplot(data=male_dead['age'],label="Male" ,shade=True).set(xlim=(0))
plt.legend()


In [None]:
data.head()

## Checking for Null Values

In [None]:
data.isna().sum()

## Description of the Data

In [None]:
data.describe().T

## Tracking the Patient

In [None]:
data.shape

In [None]:
clus=data.loc[:,['SNo','Latitude','Longitude']]
clus.head()

## Checking for the Number of Clusters

In [None]:
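# Elbow method: fit KMeans for k = 1..14 and compare the scores
K_clusters = range(1,15)
kmeans = [KMeans(n_clusters=i) for i in K_clusters]
# Use both coordinates, matching the clustering step that follows
coords = data[['Latitude','Longitude']]
# score() returns the negative inertia, so values closer to zero are better
score = [kmeans[i].fit(coords).score(coords) for i in range(len(kmeans))]
plt.plot(K_clusters, score)
plt.xlabel('Number of Clusters')
plt.ylabel('Score')
plt.title('Score vs Cluster')
plt.show()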

**The score levels off after four clusters, so adding more clusters gives little further improvement. We therefore use k = 4.**
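
Reading the elbow off the plot by eye can be turned into a simple rule. The sketch below is one possible heuristic, not the method used in this notebook: the function `pick_elbow`, its `tol` threshold, and the score values are all made up for illustration. It returns the first k after which one more cluster adds less than 5% of the total improvement.

```python
def pick_elbow(ks, scores, tol=0.05):
    """Return the first k after which adding a cluster improves the score
    by less than `tol` of the total improvement seen across all k."""
    total_gain = scores[-1] - scores[0]
    if total_gain == 0:
        return ks[0]
    for i in range(len(ks) - 1):
        gain = scores[i + 1] - scores[i]
        if gain / total_gain < tol:
            return ks[i]
    return ks[-1]

# Made-up scores (negative inertia, as returned by KMeans.score) for k = 1..8
ks = list(range(1, 9))
scores = [-400.0, -150.0, -60.0, -20.0, -15.0, -12.0, -10.0, -9.0]

best_k = pick_elbow(ks, scores)
print(best_k)
```

The threshold is a judgment call: a smaller `tol` favors more clusters, a larger one favors fewer.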

In [None]:
# Fit once on the coordinate columns and keep the labels and centers for plotting
kmeans = KMeans(n_clusters = 4, init ='k-means++')
labels = kmeans.fit_predict(clus[['Latitude','Longitude']])
clus['cluster_label'] = labels
centers = kmeans.cluster_centers_

## Graphical Representation of Clusters

In [None]:
clus.plot.scatter(x = 'Latitude', y = 'Longitude', c=labels, s=50, cmap='viridis')
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=100, alpha=0.5)

**We will verify our clusters by plotting the regions on a map generated with the folium library.**

**Each marker on the map shows the hospitalized, confirmed, death, and recovery counts for the affected region.**

In [None]:
!{sys.executable} -m pip install folium
import folium
italy_map = folium.Map(location=[42.8719,12.5674 ], zoom_start=5,tiles='Stamen Toner')

for lat, lon,RegionName,TotalPositiveCases,Recovered,Deaths,TotalHospitalizedPatients in zip(data['Latitude'], data['Longitude'],data['RegionName'],data['TotalPositiveCases'],data['Recovered'],data['Deaths'],data['TotalHospitalizedPatients']):
    folium.CircleMarker([lat, lon],
                        radius=5,
                        color='red',
                      popup =('RegionName: ' + str(RegionName) + '<br>'
                    'TotalPositiveCases: ' + str(TotalPositiveCases) + '<br>'
                    'TotalHospitalizedPatients: ' + str(TotalHospitalizedPatients) + '<br>'
                      'Recovered: ' + str(Recovered) + '<br>'
                      'Deaths: ' + str(Deaths) + '<br>'),

                        fill_color='red',
                        fill_opacity=0.7 ).add_to(italy_map)
italy_map

**Early on, the most severely affected region in Italy was Lombardy, followed by Emilia-Romagna, Veneto, Marche, and Piemonte. Milan, the second most populous Italian city, is located in Lombardy. Other areas in Italy affected by the coronavirus include Toscana, Campania, Lazio, Liguria, Friuli Venezia Giulia, Sicilia, Puglia, Umbria, Abruzzo, Trento, Molise, Calabria, Sardegna, Valle d’Aosta, Basilicata, and Bolzano. Italy was the fourth most affected country as of February, but it now has the highest number of confirmed cases after China.**

## Grouping Data According to Region Name

In [None]:
data['Date'] = pd.to_datetime(data['Date']).dt.normalize()
daily = data.sort_values(['Date','Country','RegionName'])
latest = data[data.Date == daily.Date.max()]
latest.head()

In [None]:
data_groupby_region = latest.groupby("RegionName")[['TotalPositiveCases', 'Deaths', 'Recovered','TestsPerformed','HospitalizedPatients','TotalHospitalizedPatients']].sum().reset_index()
dgr = data_groupby_region 
dgr.head()

## Description of Grouped Data by Region

In [None]:
dgr.describe().T

## Tests Performed vs Region

In [None]:
fig = px.bar(dgr[['RegionName', 'TestsPerformed']].sort_values('TestsPerformed', ascending=False), 
             y="TestsPerformed", x="RegionName", color='RegionName', 
             log_y=True, template='ggplot2', title='Tests Performed vs Region')
fig.show()


**The graph shows the number of tests performed in each region of Italy. Lombardia has performed the most tests (over 25,000), as it is the most severely affected region. The next graph shows that it also has the highest number of positive coronavirus patients (7,280). Veneto is the second most affected region, followed by Emilia Romagna, Lazio, Marche, Toscana, Piemonte, Friuli V.G., Campania, Sicilia, Liguria, Puglia, P.A. Trento, Calabria, Umbria, Abruzzo, Sardegna, Molise, Basilicata, Valle d'Aosta, and P.A. Bolzano.**

## Confirmed Cases vs Region

In [None]:
fig = px.bar(dgr[['RegionName', 'TotalPositiveCases']].sort_values('TotalPositiveCases', ascending=False), 
             y="TotalPositiveCases", x="RegionName", color='RegionName', 
             log_y=True, template='ggplot2', title='Confirmed Cases vs Region')
fig.show()


**More than 10,000 people are infected with this virus. Italy is the most affected country in the world after China, with 827 deaths and 12,462 confirmed cases in almost three weeks. At the end of January, after two Chinese tourists came down with coronavirus during a trip to Italy, the government restricted all flights from China. At the time, this was hoped to be the best measure to block the spread of the disease.**

## Hospitalized Patients vs Region

In [None]:
fig = px.bar(dgr[['RegionName', 'TotalHospitalizedPatients']].sort_values('TotalHospitalizedPatients', ascending=False), 
             y="TotalHospitalizedPatients", x="RegionName", color='RegionName', 
             log_y=True, template='ggplot2', title='Hospitalised Patient vs Region')
fig.show()

## Recovery vs Region

In [None]:
fig = px.bar(dgr[['RegionName', 'Recovered']].sort_values('Recovered', ascending=False), 
             y="Recovered", x="RegionName", color='RegionName', 
             log_y=True, template='ggplot2', title='Recovery vs Region')
fig.show()


## Death vs Region Name

In [None]:
fig = px.bar(dgr[['RegionName', 'Deaths']].sort_values('Deaths', ascending=False), 
             y="Deaths", x="RegionName", color='RegionName', 
             log_y=True, template='ggplot2', title='Death vs Region')
fig.show()


In [None]:
dgrs_el = dgr.sort_values(by=['TotalPositiveCases'],ascending = False)
dgrs_el.head()

## Test and Confirm vs Region

In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgrs_el.RegionName, dgrs_el.TestsPerformed,label="Tests Performed")
plt.bar(dgrs_el.RegionName, dgrs_el.TotalPositiveCases,label="Confirm Cases")
plt.xlabel('Region')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Test and Confirm vs Region',fontsize = 35)

plt.show()

f, ax = plt.subplots(figsize=(80,30))
ax=sns.scatterplot(x="RegionName", y="TestsPerformed", data=dgrs_el,
             color="red",label = "Tests Performed")
ax=sns.scatterplot(x="RegionName", y="TotalPositiveCases", data=dgrs_el,
             color="blue",label = "Confirm Cases")
ax.xaxis.set_tick_params(labelsize=35)

plt.plot(dgrs_el.RegionName,dgrs_el.TestsPerformed,zorder=1,color="red")
plt.plot(dgrs_el.RegionName,dgrs_el.TotalPositiveCases,zorder=1,color="blue")

## Confirmed Cases vs People Hospitalized

In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgrs_el.RegionName, dgrs_el.TotalPositiveCases,label="Confirm Cases")
plt.bar(dgrs_el.RegionName, dgrs_el.TotalHospitalizedPatients,label="Hospitalized Patients")

plt.xlabel('Region')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Confirm Cases vs People Hospitalised',fontsize= 35)
plt.show()

f, ax = plt.subplots(figsize=(40,20))

ax=sns.scatterplot(x="RegionName", y="TotalPositiveCases", data=dgrs_el,
             color="blue",label = "Confirm Cases")
ax=sns.scatterplot(x="RegionName", y="TotalHospitalizedPatients", data=dgrs_el,
             color="red",label = "Hospitalized Patients")
ax.xaxis.set_tick_params(labelsize=18)
plt.plot(dgrs_el.RegionName,dgrs_el.TotalPositiveCases,zorder=1,color="blue")
plt.plot(dgrs_el.RegionName,dgrs_el.TotalHospitalizedPatients,zorder=1,color="red")


**The graph is based on statistical data from the WHO. In Lombardia, out of more than 7,000 confirmed cases, only approximately 4,500 people are hospitalized. The situation in Italy has become a crisis, and hospital conditions are worsening day by day. According to doctors, not every patient is getting proper and equal care, which they cite as the main cause of the rapid spread of the coronavirus. The whole country is locked down: because of the high number of deaths, the government has announced that there will be no gatherings, no sporting events, and no travel across the country.**

## Death and Recovery vs Region

In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgrs_el.RegionName, dgrs_el.Recovered,label="Recovery")
plt.bar(dgrs_el.RegionName, dgrs_el.Deaths,label="Death")
plt.xlabel('Region')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Death and Recovery vs Region', fontsize= 35)
plt.show()

f, ax = plt.subplots(figsize=(23,10))
ax=sns.scatterplot(x="RegionName", y="Recovered", data=dgrs_el,
             color="red",label = "Recovered")
ax=sns.scatterplot(x="RegionName", y="Deaths", data=dgrs_el,
             color="blue",label = "Deaths")
plt.plot(dgrs_el.RegionName,dgrs_el.Recovered,zorder=1,color="red")
plt.plot(dgrs_el.RegionName,dgrs_el.Deaths,zorder=1,color="blue")

**According to the graph, patients are recovering very slowly, while the number of people infected by the coronavirus is increasing rapidly. The number of hospitalized people is far lower than the number of people infected by the novel coronavirus. Cases have now been confirmed in every member state of the European Union. Italy will remain totally locked down as its healthcare system struggles to cope. Nearby countries such as Germany and France report alarming spikes in daily cases.**

In [None]:
data['Date'] = pd.to_datetime(data['Date']).dt.normalize()
latest = data[data.Date == daily.Date.max()]

In [None]:
temp = latest.loc[:,['Date','HospitalizedPatients','IntensiveCarePatients','TotalHospitalizedPatients','HomeConfinement','Recovered','Deaths','TotalPositiveCases','TestsPerformed']]
temp.head()

## Description of Data Grouped by Date

In [None]:
temp.describe().T

In [None]:
# 'Date' is the grouping key, so it must not also appear in the list of columns to sum
data_groupby_date = latest.groupby("Date")[['HospitalizedPatients','IntensiveCarePatients','TotalHospitalizedPatients','HomeConfinement','Recovered','Deaths','TotalPositiveCases','TestsPerformed']].sum().reset_index()
data_groupby_date

## Ratios and Percentages: Confirmations and Deaths per Test, Deaths and Recoveries after Confirmation

In [None]:
# data_groupby_date holds a single row (the latest date), so read scalar values from it
row = data_groupby_date.iloc[0]
ps_ts = float(row.TotalPositiveCases/row.TestsPerformed)
d_ts = float(row.Deaths/row.TestsPerformed)
r_ps = float(row.Recovered/row.TotalPositiveCases)
d_ps = float(row.Deaths/row.TotalPositiveCases)

In [None]:
print("Confirmed cases as a percentage of tests performed: "+ str(ps_ts*100) )
print("Deaths as a percentage of tests performed: "+ str(d_ts*100) )
print("Deaths as a percentage of confirmed cases: "+ str(d_ps*100) )
print("Recoveries as a percentage of confirmed cases: "+ str(r_ps*100) )
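
As a worked example of these percentages, the sketch below applies the same arithmetic to the national totals quoted earlier in this notebook (12,462 confirmed cases and 827 deaths). The helper name `pct` is hypothetical and is not part of the cells above.

```python
def pct(part: float, whole: float) -> float:
    """Return part as a percentage of whole; NaN if whole is zero."""
    if whole == 0:
        return float("nan")
    return 100.0 * part / whole

confirmed = 12_462  # confirmed cases quoted in this notebook
deaths = 827        # deaths quoted in this notebook

death_after_confirmation = pct(deaths, confirmed)
print(f"Deaths after confirmation: {death_after_confirmation:.2f}%")
```

Guarding against a zero denominator matters for regions or dates where no tests or cases have been recorded yet.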

In [None]:
data_groupby_date1 = data.groupby("Date")[['TotalPositiveCases', 'Deaths', 'Recovered','TestsPerformed','HospitalizedPatients','TotalHospitalizedPatients']].sum().reset_index()
dgd3 = data_groupby_date1
dgd3.head()

In [None]:
dgd2 = dgd3.copy()  # copy so that formatting the dates does not modify dgd3

In [None]:
dgd2["Date"]= dgd3["Date"].dt.strftime("%d-%m-%y") 
dgd2.head()

## Test vs Confirmed


In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgd2.Date, dgd2.TestsPerformed,label="Tests Performed")
plt.bar(dgd2.Date, dgd2.TotalPositiveCases,label="Confirm Cases")
plt.xlabel('Date')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Tests Performed vs Confirmed Cases',fontsize = 35)
plt.show()

f, ax = plt.subplots(figsize=(23,10))
ax=sns.scatterplot(x="Date", y="TestsPerformed", data=dgd2,
             color="red",label = "Tests Performed")
ax=sns.scatterplot(x="Date", y="TotalPositiveCases", data=dgd2,
             color="blue",label = "Confirm Cases")
plt.plot(dgd2.Date,dgd2.TestsPerformed,zorder=1,color="red")
plt.plot(dgd2.Date,dgd2.TotalPositiveCases,zorder=1,color="blue")

## Confirmed Cases vs People Hospitalized

In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgd2.Date, dgd2.TotalPositiveCases,label="Confirm Cases")
plt.bar(dgd2.Date, dgd2.TotalHospitalizedPatients,label="Hospitalized Patients")
plt.xlabel('Date')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Confirmed Cases vs Hospitalised Cases',fontsize= 35)
plt.show()

f, ax = plt.subplots(figsize=(23,10))
ax=sns.scatterplot(x="Date", y="TotalHospitalizedPatients", data=dgd2,
             color="red",label = "Hospitalized Patients")
ax=sns.scatterplot(x="Date", y="TotalPositiveCases", data=dgd2,
             color="blue",label = "Confirm Cases")
plt.plot(dgd2.Date,dgd2.TotalHospitalizedPatients,zorder=1,color="red")
plt.plot(dgd2.Date,dgd2.TotalPositiveCases,zorder=1,color="blue")

## Hospitalized vs Recovery and Death

In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgd2.Date, dgd2.TotalHospitalizedPatients,label="Hospitalized Patients")
plt.bar(dgd2.Date, dgd2.Recovered,label="Recovery")
plt.bar(dgd2.Date, dgd2.Deaths,label="Death")
plt.xlabel('Date')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Hospitalized vs Recovery vs Death',fontsize=30)
plt.show()

f, ax = plt.subplots(figsize=(23,10))
ax=sns.scatterplot(x="Date", y="TotalHospitalizedPatients", data=dgd2,
             color="black",label = "Hospitalized Patients")
ax=sns.scatterplot(x="Date", y="Recovered", data=dgd2,
             color="red",label = "Recovery")
ax=sns.scatterplot(x="Date", y="Deaths", data=dgd2,
             color="blue",label = "Death")
plt.plot(dgd2.Date,dgd2.TotalHospitalizedPatients,zorder=1,color="black")
plt.plot(dgd2.Date,dgd2.Recovered,zorder=1,color="red")
plt.plot(dgd2.Date,dgd2.Deaths,zorder=1,color="blue")

## Confirm vs Recovery vs Death

In [None]:
plt.figure(figsize=(23,10))
plt.bar(dgd2.Date, dgd2.TotalPositiveCases,label="Confirm")
plt.bar(dgd2.Date, dgd2.Recovered,label="Recovery")
plt.bar(dgd2.Date, dgd2.Deaths,label="Death")
plt.xlabel('Date')
plt.ylabel("Count")
plt.legend(frameon=True, fontsize=12)
plt.title('Confirm vs Recovery vs Death',fontsize=30)
plt.show()

f, ax = plt.subplots(figsize=(23,10))
ax=sns.scatterplot(x="Date", y="TotalPositiveCases", data=dgd2,
             color="black",label = "Confirm")
ax=sns.scatterplot(x="Date", y="Recovered", data=dgd2,
             color="red",label = "Recovery")
ax=sns.scatterplot(x="Date", y="Deaths", data=dgd2,
             color="blue",label = "Death")
plt.plot(dgd2.Date,dgd2.TotalPositiveCases,zorder=1,color="black")
plt.plot(dgd2.Date,dgd2.Recovered,zorder=1,color="red")
plt.plot(dgd2.Date,dgd2.Deaths,zorder=1,color="blue")

**This graph gives an overview of the current situation in Italy. There are now more than 12,000 confirmed cases. The numbers of deaths and recoveries are roughly equal. From the date the country confirmed its first case, the number has been increasing exponentially. On March 11, Italy became the second most infected country after China.**
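
One way to check a claim of exponential growth is to fit a straight line to the logarithm of the cumulative counts: if growth is exponential, the points fall on a line, and its slope gives the doubling time. The sketch below uses a made-up series that doubles every three days rather than the Italian data, and the function name `doubling_time` is hypothetical.

```python
import math

def doubling_time(counts):
    """Estimate doubling time (in days) from daily cumulative counts
    via a least-squares line through log(count) versus day index."""
    days = list(range(len(counts)))
    logs = [math.log(c) for c in counts]
    n = len(counts)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    growth_rate = sxy / sxx  # slope of log(count) per day
    return math.log(2) / growth_rate

# Made-up series that doubles every 3 days
counts = [100 * 2 ** (d / 3) for d in range(10)]
print(round(doubling_time(counts), 2))
```

Applied to a real series such as `TotalPositiveCases`, all counts would need to be positive before taking logarithms.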

In [None]:
data_groupby_date1 = data.groupby("Date")[['TotalPositiveCases', 'Deaths', 'Recovered','TestsPerformed','HospitalizedPatients','TotalHospitalizedPatients']].sum().reset_index()
dgd1 = data_groupby_date1
dgd1.head()

## Prophet Algorithm

For the Number of Tests (Screening)

In [None]:
pr_data_test = dgd1.loc[:,['Date','TestsPerformed']]
pr_data_test.columns = ['ds','y']
pr_data_test.head()

## Modelling

In [None]:
m = Prophet()
m.fit(pr_data_test)
future=m.make_future_dataframe(periods=365)
forecast_test=m.predict(future)
forecast_test

## Predicting

In [None]:
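# Keep only the trend column and drop non-positive values
test = forecast_test.loc[:,['ds','trend']]
test = test[test['trend']>0]
# Take rows 16 to 45 of what remains (a 30-day window)
test=test.head(45)
test=test.tail(30)
test.columns = ['Date','Screening']
test.head()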


## Graphical Representation of Predicted Screening

In [None]:
fig_test = plot_plotly(m, forecast_test)
py.iplot(fig_test) 

fig_test = m.plot(forecast_test,xlabel='Date',ylabel='Screening Count')

In [None]:
figure_test=m.plot_components(forecast_test)


**The graph forecasts the total number of screenings required to bring the situation in Italy under control. The model predicts more than 1.4 million screenings by March 2021, by which time the situation is expected to be under control. The forecast is linear, on the assumption that European governments keep testing at a steady pace.**
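
The linear shape of the forecast is consistent with Prophet's default `growth='linear'` setting, which fits a piecewise-linear trend. As a rough stand-in for that idea (this is not Prophet itself), the sketch below fits a single straight line with `numpy.polyfit` to made-up test counts and extrapolates it 30 days ahead.

```python
import numpy as np

# Made-up cumulative test counts for 10 consecutive days
days = np.arange(10)
tests = 1000.0 + 250.0 * days

# Fit a straight line: tests ~ slope * day + intercept
slope, intercept = np.polyfit(days, tests, deg=1)

# Extrapolate 30 days past the last observation
future_day = days[-1] + 30
forecast = slope * future_day + intercept
print(round(float(forecast)))
```

A straight line extended far beyond the observed window simply continues the recent slope, so the further the horizon, the more the result depends on that slope staying unchanged.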

# Confirmed Cases

## Making Data Ready for Algorithm

In [None]:
pr_data_cm = dgd1.loc[:,['Date','TotalPositiveCases']]
pr_data_cm.columns = ['ds','y']
pr_data_cm.head()

## Modelling

In [None]:
m=Prophet()
m.fit(pr_data_cm)
future=m.make_future_dataframe(periods=365)
forecast_cm=m.predict(future)
forecast_cm

## Predicting

In [None]:
cnfrm = forecast_cm.loc[:,['ds','trend']]
cnfrm = cnfrm[cnfrm['trend']>0]
cnfrm.head()
cnfrm=cnfrm.head(42)
cnfrm=cnfrm.tail(30)
cnfrm.columns = ['Date','Confirm']
cnfrm.head()

## Graphical Representation of Predicted Confirmation

In [None]:
fig_cm = plot_plotly(m, forecast_cm)
py.iplot(fig_cm) 

fig_cm = m.plot(forecast_cm,xlabel='Date',ylabel='Confirmed Count')

In [None]:
figure_cm=m.plot_components(forecast_cm)


**The model predicts that, by February 2021, more than 250,000 people in Italy will have been confirmed as infected with the coronavirus.**


# Recovery

## Making Data Ready for the Algorithm

In [None]:
pr_data_r = dgd1.loc[:,['Date','Recovered']]
pr_data_r.columns = ['ds','y']
pr_data_r.head()

## Modelling

In [None]:
m=Prophet()
m.fit(pr_data_r)
future=m.make_future_dataframe(periods=365)
forecast_r=m.predict(future)
forecast_r

## Predicting

In [None]:
rec = forecast_r.loc[:,['ds','trend']]
rec = rec[rec['trend']>0]
rec.head()
rec=rec.head(42)
rec=rec.tail(30)
rec.columns = ['Date','Recovery']
rec.head()

## Graphical Representation of Predicted Recovery

In [None]:
fig_r = plot_plotly(m, forecast_r)
py.iplot(fig_r) 

fig_r = m.plot(forecast_r,xlabel='Date',ylabel='Recovery Count')

In [None]:
figure_r=m.plot_components(forecast_r)


# For Deaths

## Making Data Ready for the Algorithm

In [None]:
pr_data_d = dgd1.loc[:,['Date','Deaths']]
pr_data_d.columns = ['ds','y']
pr_data_d.head()

## Modelling

In [None]:
m=Prophet()
m.fit(pr_data_d)
future=m.make_future_dataframe(periods=365)
forecast_d=m.predict(future)
forecast_d

## Predicting

In [None]:
dth = forecast_d.loc[:,['ds','trend']]
dth = dth[dth['trend']>0]
dth=dth.head(42)
dth=dth.tail(30)
dth.columns = ['Date','Death']
dth.head()


## Graphical Representation of Predicted Death

In [None]:
fig_d = plot_plotly(m, forecast_d)
py.iplot(fig_d) 

fig_d = m.plot(forecast_d,xlabel='Date',ylabel='Deaths Count')

In [None]:
figure_d=m.plot_components(forecast_d)


**The current situation is not under control. According to current data on confirmed patients, 53% will recover, whereas 47% will die. However, the model predicts that approximately 25,000 people will have recovered and around 14,000 people will have died by February 2021. The rest will be in isolation.**
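
A recovery and death split that sums to 100% only makes sense over resolved cases (recoveries plus deaths), so the sketch below assumes that reading. It applies the same arithmetic to the forecast figures quoted above (approximately 25,000 recoveries and 14,000 deaths); the function name `outcome_shares` is hypothetical.

```python
def outcome_shares(recovered: float, deaths: float):
    """Return (recovery %, death %) among resolved cases only."""
    resolved = recovered + deaths
    if resolved == 0:
        raise ValueError("no resolved cases")
    return 100.0 * recovered / resolved, 100.0 * deaths / resolved

# Forecast figures quoted in the text (approximate)
rec_share, death_share = outcome_shares(25_000, 14_000)
print(f"Recovered: {rec_share:.1f}%, died: {death_share:.1f}%")
```

Patients who are still hospitalized or in isolation are excluded from both shares, which is why these figures differ from percentages taken over all confirmed cases.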

# What the Future Looks Like

In [None]:
prediction = test
prediction['Confirm'] = cnfrm.Confirm
prediction['Recover'] = rec.Recovery
prediction['Death'] = dth.Death

In [None]:
prediction.head()
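
Assigning a column from one DataFrame to another aligns rows by index label, not by position. Frames that were sliced differently (for example with `head(45)` in one place and `head(42)` in another) can therefore end up with `NaN` wherever the labels do not overlap. The sketch below illustrates this with small made-up frames and shows merging on the shared `Date` column as one possible alternative.

```python
import pandas as pd

dates = pd.date_range("2020-03-01", periods=6)
screening = pd.DataFrame({"Date": dates, "Screening": range(100, 106)})
confirm = pd.DataFrame({"Date": dates, "Confirm": range(10, 16)})

a = screening.tail(3).copy()   # index labels 3, 4, 5
b = confirm.head(5).tail(3)    # index labels 2, 3, 4

# Column assignment aligns on index labels: label 5 has no match in b
a["Confirm"] = b["Confirm"]
n_missing = int(a["Confirm"].isna().sum())

# Merging on Date keeps only the dates present in both frames
merged = screening.tail(3).merge(b, on="Date", how="inner")
print(n_missing, len(merged))
```

Merging on the date makes the alignment explicit, so a mismatch shows up as a shorter table rather than as silent missing values.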

## Future Ratios

In [None]:
pps_pts = float(prediction.Confirm.sum()/prediction.Screening.sum())
pd_pts = float(prediction.Death.sum()/prediction.Screening.sum())
pr_pps = float(prediction.Recover.sum()/prediction.Confirm.sum())
pd_pps = float(prediction.Death.sum()/prediction.Confirm.sum())

In [None]:
print("The percentage of  Predicted Confirmation is "+ str(pps_pts*100) )
print("The percentage of Predicted Death is "+ str(pd_pts*100) )
print("The percentage of Predicted Death after confirmation is "+ str(pd_pps*100) )
print("The percentage of Predicted recovery after confirmation is "+ str(pr_pps*100) )

> ### Prevention
To avoid infection, people should do the following things:
*  Avoid contact with people who are sick.
*  Avoid touching your eyes, nose, and mouth.
*  Stay home when you are sick.
*  Cover your cough or sneeze with a tissue, then throw the tissue in the trash.
*  Clean and disinfect frequently touched objects and surfaces.
*  Wash your hands often with soap and water, especially after going to the bathroom; before eating; and after blowing your nose, coughing, or sneezing. If soap and water are not readily available, use an alcohol-based hand sanitizer.