## OBJECTIVE:
### Predict Covid-19 confirmed cases with the Facebook Prophet model, which is designed for univariate time-series forecasting.
### Prophet is an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data.
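
To make the additive structure concrete, the sketch below builds a toy daily series as trend plus weekly seasonality, with holiday and noise terms left out. The function names and constants are illustrative assumptions and are not Prophet internals.

```python
import math

def trend(t):
    # Hypothetical linear trend: 100 cases on day 0, growing by 5 per day
    return 100.0 + 5.0 * t

def weekly_seasonality(t):
    # Hypothetical weekly component: a sine wave with a 7-day period
    return 10.0 * math.sin(2.0 * math.pi * t / 7.0)

def additive_model(t):
    # Additive model: the components are summed, not multiplied
    return trend(t) + weekly_seasonality(t)

# Two weeks of toy daily values
series = [additive_model(t) for t in range(14)]
print(series[:3])
```

Because the components are summed, the seasonal swing keeps the same absolute size no matter how large the trend becomes.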

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
!pip install pystan

In [None]:
!pip install fbprophet

In [None]:
import fbprophet
from fbprophet import Prophet

In [None]:
## listing the methods and attributes of Prophet

dir(Prophet)

In [None]:
df = pd.read_csv(r"../input/corona-virus-report/covid_19_clean_complete.csv")

In [None]:
df.head()

In [None]:
df.dtypes

In [None]:
df['Date']=pd.to_datetime(df['Date'])

In [None]:
df.dtypes

In [None]:
df.shape

In [None]:
df['Date'].nunique()

#### The data has 188 unique dates, and each date is repeated across many rows (one per reported location), so we group by date and sum the counts to get one total per day.

In [None]:
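# Select several columns with a list (double brackets); a bare tuple of names fails on recent pandas versions
total = df.groupby("Date")[['Confirmed', 'Deaths', 'Recovered', 'Active']].sum().reset_index()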

In [None]:
total.head()
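
The aggregation step can be checked on a tiny frame. This is a minimal sketch on made-up rows (the values and the two-location layout are assumptions), showing how several rows per date collapse into one total per date.

```python
import pandas as pd

# Hypothetical miniature of the dataset: one row per location per date
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-22", "2020-01-22", "2020-01-23", "2020-01-23"]),
    "Country/Region": ["A", "B", "A", "B"],
    "Confirmed": [1, 2, 3, 4],
    "Deaths": [0, 0, 1, 0],
})

# Group by date and sum the selected numeric columns across locations
toy_total = toy.groupby("Date")[["Confirmed", "Deaths"]].sum().reset_index()
print(toy_total)
```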

### Applying Prophet on the data

In [None]:
## rename columns as per prophet

df_prophet=total.rename(columns={"Date":"ds","Confirmed":"y"})

In [None]:
df_prophet.head()

In [None]:
## initializing the prophet
m= Prophet()

In [None]:
## fitting the model

model = m.fit(df_prophet)

### Making future data

In [None]:
model.seasonalities

In [None]:
## data for next 30 days

future_global = model.make_future_dataframe(periods=30, freq="D")

### Validation

In [None]:
future_global.shape

In [None]:
df_prophet.shape

In [None]:
future_global.tail()

In [None]:
df_prophet["ds"].tail()

#### We have successfully added the next 30 days to predict the confirmed cases.
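
As a rough illustration of what the extended frame contains, the sketch below rebuilds the same shape with plain pandas: the historical dates followed by 30 new daily dates. The start date and the 188-day length are assumptions for the example, and this is not how Prophet builds the frame internally.

```python
import pandas as pd

# Hypothetical history: 188 consecutive daily dates (the start date is an assumption)
history = pd.date_range(start="2020-01-22", periods=188, freq="D")

# 30 further daily dates, beginning the day after the last observed date
future_dates = pd.date_range(start=history[-1] + pd.Timedelta(days=1), periods=30, freq="D")

# The extended frame holds the history first, then the forecast dates
extended = pd.DataFrame({"ds": history.append(future_dates)})
print(extended.shape)
print(extended["ds"].tail())
```

Comparing the row count of the extended frame with the row count of the history is the same check that the shape cells above perform.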

### Prediction on future data

In [None]:
prediction = model.predict(future_global)
prediction

In [None]:
## deriving required columns

prediction[["ds","yhat","yhat_lower","yhat_upper"]].tail()

### Visualizing the Results

In [None]:
fig = model.plot(prediction)

#### Note: Assigning the returned figure to a variable keeps the notebook from rendering the same plot twice.

### Trend and Weekly

In [None]:
fig_components = model.plot_components(prediction)

#### Note: Assigning the returned figure to a variable keeps the notebook from rendering the component plots twice.

### Change points

In [None]:
from fbprophet.plot import add_changepoints_to_plot

In [None]:
fig=model.plot(prediction)

a= add_changepoints_to_plot(fig.gca(), model, prediction)
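
For intuition about where candidate changepoints come from: with default settings Prophet considers 25 potential changepoints spread evenly over the first 80% of the history. The sketch below is a simplified approximation of that placement under those two defaults; the function name is hypothetical and the library's own code differs in detail.

```python
def candidate_changepoint_indexes(n_history, n_changepoints=25, changepoint_range=0.8):
    """Return evenly spaced row indexes within the first part of the history."""
    # Only the first `changepoint_range` share of rows is eligible
    hist_size = int(n_history * changepoint_range)
    # Never ask for more changepoints than there are eligible rows
    n_changepoints = min(n_changepoints, hist_size - 1)
    if n_changepoints <= 0:
        return []
    step = (hist_size - 1) / n_changepoints
    # Start at 1 so that the very first observation is skipped
    return [round(i * step) for i in range(1, n_changepoints + 1)]

# 188 rows matches the number of dates in this notebook
indexes = candidate_changepoint_indexes(188)
print(indexes)
```

Only some of these candidates end up with a noticeable change in slope, which is why the plot marks fewer lines than there are candidates.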

### Cross-Validation

In [None]:
from fbprophet.diagnostics import cross_validation

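#### Prophet's cross-validation takes three settings: horizon is how far ahead each forecast runs, period is the spacing between successive cutoff dates, and initial is the length of the first training window. By default period is half the horizon and initial is three times the horizon; the values used here follow that convention.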

In [None]:
df_cv= cross_validation(model, horizon="30 days", period="15 days", initial="90 days")
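
How the three settings interact can be sketched by listing the cutoff dates they imply. This is a simplified illustration with a hypothetical helper and an assumed date range of 188 daily observations; Prophet's own cutoff logic handles more edge cases.

```python
from datetime import date, timedelta

def sketch_cutoffs(first_day, last_day, horizon, period, initial):
    """Walk backwards from the latest possible cutoff in steps of `period`."""
    cutoffs = []
    cutoff = last_day - horizon  # latest cutoff that still leaves a full horizon
    # Keep a cutoff only if at least `initial` of training data precedes it
    while cutoff >= first_day + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

# Hypothetical date range: 188 daily observations
first_day = date(2020, 1, 22)
last_day = first_day + timedelta(days=187)

cutoffs = sketch_cutoffs(
    first_day,
    last_day,
    horizon=timedelta(days=30),
    period=timedelta(days=15),
    initial=timedelta(days=90),
)
for c in cutoffs:
    print(c)
```

Each cutoff produces one forecast of 30 days, so a short history yields only a handful of folds.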

In [None]:
df_cv.head()

In [None]:
df_cv.shape

### Performance metrics for the model

In [None]:
from fbprophet.diagnostics import performance_metrics

In [None]:
df_perf = performance_metrics(df_cv)

In [None]:
df_perf.head()

In [None]:
df_perf.shape
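
The error measures in that table can be reproduced by hand on a few numbers. The sketch below uses made-up actual and predicted values and computes each metric over a flat list, whereas `performance_metrics` aggregates errors by forecast horizon.

```python
import math

def mse(y_true, y_pred):
    # Mean of squared errors
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Square root of the MSE, in the same units as the data
    return math.sqrt(mse(y_true, y_pred))

def mape(y_true, y_pred):
    # Mean absolute error as a fraction of the actual value (actuals must be non-zero)
    return sum(abs((a - p) / a) for a, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical actual and predicted case counts
y_true = [100, 200, 400]
y_pred = [110, 190, 380]

print(mse(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

RMSE and MSE grow with the scale of the series, so for cumulative case counts MAPE is usually the easier number to interpret.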

### Visualizing the metrics

In [None]:
from fbprophet.plot import plot_cross_validation_metric

#### RMSE plot

In [None]:
# Store the figure under its own name so the df_perf metrics table is not overwritten
fig_rmse = plot_cross_validation_metric(df_cv, metric="rmse")

#### MSE plot

In [None]:
# Store the figure under its own name so the df_perf metrics table is not overwritten
fig_mse = plot_cross_validation_metric(df_cv, metric="mse")

#### MAPE plot

In [None]:
# Store the figure under its own name so the df_perf metrics table is not overwritten
fig_mape = plot_cross_validation_metric(df_cv, metric="mape")