In [1]:
# IMPORTANT: RUN THIS CELL IN ORDER TO IMPORT YOUR KAGGLE DATA SOURCES,
# THEN FEEL FREE TO DELETE THIS CELL.
# NOTE: THIS NOTEBOOK ENVIRONMENT DIFFERS FROM KAGGLE'S PYTHON
# ENVIRONMENT SO THERE MAY BE MISSING LIBRARIES USED BY YOUR
# NOTEBOOK.
import kagglehub
chirag19_air_passengers_path = kagglehub.dataset_download('chirag19/air-passengers')

print('Data source import complete.')







<a class="anchor" id="0"></a>
# **Tutorial : Time Series Forecasting with Prophet**



Hello friends,


In the previous notebooks, we discussed [Time Series Analysis with Python](https://www.kaggle.com/prashant111/complete-guide-on-time-series-analysis-in-python) and the [ARIMA Model for Time Series Forecasting](https://www.kaggle.com/prashant111/arima-model-for-time-series-forecasting). In this notebook, we will make a time series forecast using Facebook's open source forecasting library, [Prophet](https://facebook.github.io/prophet/). We first introduce Prophet and its advantages, and then use it to forecast a time series.


So, let's get started.

### **I hope you find this notebook useful and your <font color="red"><b>UPVOTES</b></font> keep me motivated.**



<a class="anchor" id="0.1"></a>
# **Table of Contents**


1.	[Introduction to Prophet](#1)
2.	[Advantages of Prophet](#2)
3.	[Installation of Prophet](#3)
4.	[Python API](#4)
5.	[Basic Setup](#5)
6.	[Time Series Forecasting with Prophet](#6)
7.	[Plotting the forecasted components](#7)
8.	[Adding ChangePoints to Prophet](#8)
9.	[Adjusting Trend](#9)
10.	[Conclusion](#10)
11.	[References](#11)



# **1. Introduction to Prophet** <a class="anchor" id="1"></a>

[Table of Contents](#0.1)


The official [Prophet](https://facebook.github.io/prophet/) homepage states:

   *Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.*

   *Prophet is open source software released by Facebook’s Core Data Science team. It is available for download on CRAN and PyPI.*


- So, [Prophet](https://facebook.github.io/prophet/) is Facebook's open source tool for making time series predictions.

- [Prophet](https://facebook.github.io/prophet/) decomposes time series data into trend, seasonality, and holiday effects.

- **Trend** models non-periodic changes in the time series data.

- **Seasonality** captures periodic changes, such as daily, weekly, or yearly patterns.

- **Holiday effects** occur on potentially irregular schedules over a day or a period of days.

- **Error term** is whatever the model does not explain.
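
The decomposition above corresponds to the additive model described in the Prophet paper listed in the references. As a brief sketch, using the paper's notation:

$$
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
$$

Here $g(t)$ is the trend, $s(t)$ is the seasonality, $h(t)$ is the holiday effect, and $\varepsilon_t$ is the error term.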



# **2. Advantages of Prophet** <a class="anchor" id="2"></a>

[Table of Contents](#0.1)



[Prophet](https://facebook.github.io/prophet/) has several advantages. These are given below:

- **1. Accurate and fast** - Prophet is accurate and fast. It is used in many applications across Facebook for producing reliable forecasts for planning and goal setting.


- **2. Fully automatic** - Prophet is fully automatic. We will get a reasonable forecast on messy data with no manual effort.


- **3. Tunable forecasts** - Prophet produces adjustable forecasts. It includes many possibilities for users to tweak and adjust forecasts. We can use human-interpretable parameters to improve the forecast by adding our domain knowledge.


- **4. Available in R or Python** - We can implement the Prophet procedure in R or Python.



- **5. Handles seasonal variations well** - Prophet accommodates seasonality with multiple periods.



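- **6. Robust to outliers** - Prophet typically handles outliers well. It does not remove them automatically; for extreme outliers, the Prophet documentation recommends removing them from the history (setting their values to missing), since Prophet copes well with missing data.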



- **7. Robust to missing data** - Prophet is resilient to missing data.
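
To make the last two points concrete, here is a minimal sketch of blanking an extreme value while keeping its date, so that the model sees a gap instead of a spike. The data are synthetic, and the outlier rule (5 times the median absolute deviation) is only an illustrative assumption, not something Prophet prescribes.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (illustrative data, not the AirPassengers file)
history = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=12, freq="MS"),
    "y": [100.0, 104.0, 98.0, 102.0, 950.0, 101.0,
          99.0, 103.0, 97.0, 105.0, 100.0, 102.0],
})

# Hypothetical rule: flag points far from the median,
# measured in units of the median absolute deviation (MAD)
median = history["y"].median()
mad = (history["y"] - median).abs().median()
is_outlier = (history["y"] - median).abs() > 5 * mad

# Keep the date row but blank the value, so the gap is treated as missing data
cleaned = history.copy()
cleaned.loc[is_outlier, "y"] = np.nan

print(cleaned[is_outlier])
```

The `cleaned` frame keeps all 12 rows, so the affected date can still receive a prediction later.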

# **3. Installation of Prophet** <a class="anchor" id="3"></a>


[Table of Contents](#0.1)



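- We can install Prophet with pip from either the command prompt or the Anaconda prompt. Since version 1.0 the package is published on PyPI as `prophet`; earlier releases used the name `fbprophet`.

In [None]:
!pip install prophet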

# **4. Python API** <a class="anchor" id="4"></a>


[Table of Contents](#0.1)


- [Prophet](https://facebook.github.io/prophet/docs/quick_start.html#python-api) follows the sklearn model API.

- First up, we create an instance of the Prophet class and then call its fit and predict methods.

- **The input to Prophet is always a dataframe with two columns** - **ds** and **y**.

- The **ds (datestamp)** column should be of a format expected by Pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp.

- The **y** column must be numeric, and represents the measurement we wish to forecast.
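
The input requirements above can be sketched with plain pandas. The helper name `to_prophet_frame` and the sample columns `day` and `sales` are hypothetical; they are not part of Prophet.

```python
import pandas as pd

def to_prophet_frame(df, date_col, value_col):
    """Return a two-column frame named ds / y (hypothetical helper, not part of Prophet)."""
    return pd.DataFrame({
        "ds": pd.to_datetime(df[date_col]),   # datestamps as datetime64
        "y": pd.to_numeric(df[value_col]),    # the numeric measurement to forecast
    })

# Illustrative raw data with string-typed columns
raw = pd.DataFrame({"day": ["2021-01-01", "2021-01-02", "2021-01-03"],
                    "sales": ["10", "12", "11"]})
frame = to_prophet_frame(raw, "day", "sales")
print(frame.dtypes)
```

A frame shaped like this is what we later pass to `fit`.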

# **5. Basic Setup** <a class="anchor" id="5"></a>


[Table of Contents](#0.1)


- Now we will dive right in and see how to make time series predictions using Prophet.

- We will fit a model, inspect the forecast components, and then explore changepoints and trend flexibility.

- First up, we will import the required libraries and the data.

### **Import libraries**

In [1]:
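try:  # Prophet >= 1.0 is published as `prophet`
    from prophet import Prophet
    from prophet.plot import plot_plotly
except ImportError:  # older releases used the `fbprophet` package name
    from fbprophet import Prophet
    from fbprophet.plot import plot_plotly
import plotly.offline as py
py.init_notebook_mode()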

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('fivethirtyeight')

### **Import data**

In [None]:
# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


In [None]:
# Assumes the first cell ran, so the kagglehub download directory is available
file = os.path.join(chirag19_air_passengers_path, 'AirPassengers.csv')

df = pd.read_csv(file)

### **Preview dataset**

In [None]:
df.head()

We should rename the column `#Passengers` to `AirPassengers`.

In [None]:
df.rename(columns = {'#Passengers':'AirPassengers'}, inplace = True)

### **Summary of dataset**



Next, we print a summary of the dataset, which shows the columns, their data types, and their non-null counts.

In [None]:
df.info()

- We can see that the dataset contains two columns, `Month` and `AirPassengers`.

- Their data types are `object` and `int64` respectively.

- The [Prophet](https://facebook.github.io/prophet/) library expects as input a dataframe with one column containing the time information, and another column containing the metric that we wish to forecast.

- The important thing to note is that the `Month` column must be of the datetime type. It is currently of the `object` data type, so we need to convert it.

In [None]:
df['Month'] = pd.DatetimeIndex(df['Month'])
df.dtypes

We can now see that our `Month` column is of the correct datetime type.
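
As a small aside, the conversion can also be written with an explicit format, which makes the parsing unambiguous. This sketch assumes the `Month` values are year-month strings such as `1949-01`; the three sample values are illustrative.

```python
import pandas as pd

# Assumption: Month values look like "1949-01" (year-month strings)
months = pd.Series(["1949-01", "1949-02", "1949-03"])
parsed = pd.to_datetime(months, format="%Y-%m")

print(parsed.dtype)
# Each month is mapped to the first day of that month
print(parsed.dt.is_month_start.all())
```

Each parsed value falls on the first day of its month, which is consistent with the month-start frequency (`MS`) used later when building the future dates.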

- [Prophet](https://facebook.github.io/prophet/) also imposes the strict condition that the input columns must be named as **ds (the time column)** and **y (the metric column)**.

- So, we must rename the columns in our dataframe.

In [None]:
df = df.rename(columns={'Month': 'ds',
                        'AirPassengers': 'y'})

df.head()

We can see that the column names are renamed accordingly.

### **Visualize the data**


It is good practice to visualize the data before modelling it. So let’s plot our time series data:

In [None]:
ax = df.set_index('ds').plot(figsize=(12, 8))
ax.set_ylabel('Monthly Number of Airline Passengers')
ax.set_xlabel('Date')

plt.show()

Now, our dataset is prepared and we are ready to use the Prophet library to produce forecasts of our time series.

# **6. Time Series Forecasting with Prophet** <a class="anchor" id="6"></a>


[Table of Contents](#0.1)


- Now, we will describe how to use the [Prophet](https://facebook.github.io/prophet/) library to predict future values of our time series data.

- The developers of [Prophet](https://facebook.github.io/prophet/) have made it more intuitive for analysts and developers alike to work with time series data.

- To begin, we must instantiate a new Prophet object. Prophet enables us to specify a number of arguments. For example, we can specify the desired range of our uncertainty interval by setting the `interval_width` parameter.


In [None]:
# set the uncertainty interval to 95% (the Prophet default is 80%)
my_model = Prophet(interval_width=0.95)

- Now that our Prophet model has been initialized, we can call its `fit` method with our DataFrame as input.

In [None]:
my_model.fit(df)

- In order to obtain forecasts of our time series, we must provide Prophet with a new DataFrame containing a `ds` column that holds the dates for which we want predictions.

- Conveniently, we do not have to concern ourselves with manually creating this DataFrame, as Prophet provides the `make_future_dataframe` helper function.

In [None]:
future_dates = my_model.make_future_dataframe(periods=36, freq='MS')
future_dates.head()
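
For intuition, the dates produced by `make_future_dataframe` can be approximated with plain pandas. The helper `future_frame` below is an illustrative stand-in written for this notebook, not Prophet's own code, and it assumes a monthly history running from 1949-01 to 1960-12.

```python
import pandas as pd

def future_frame(history_ds, periods, freq, include_history=True):
    """Approximate make_future_dataframe with plain pandas (illustrative helper)."""
    last_date = history_ds.max()
    # Generate one extra stamp, then keep only stamps strictly after the history
    dates = pd.date_range(start=last_date, periods=periods + 1, freq=freq)
    dates = dates[dates > last_date][:periods]
    if include_history:
        dates = pd.DatetimeIndex(history_ds).append(dates)
    return pd.DataFrame({"ds": dates})

# Assumption: monthly history from 1949-01 to 1960-12
history_ds = pd.Series(pd.date_range("1949-01-01", "1960-12-01", freq="MS"))
future = future_frame(history_ds, periods=36, freq="MS")

print(len(history_ds), len(future))
print(future["ds"].iloc[-1])
```

Because the history is included, the first rows of the resulting frame are historical dates, and the 36 new month-start stamps sit at the end.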

- In the code snippet above, we instructed Prophet to generate 36 datestamps in the future.

- When working with Prophet, it is important to consider the frequency of our time series.

- Because we are working with monthly data, we explicitly specified the desired frequency of the timestamps (in this case, `MS` is the start of the month).

- Therefore, `make_future_dataframe` generated 36 monthly timestamps for us. By default they are appended to the historical dates, which is why `head()` shows the start of the history.

- In other words, we are looking to predict future values of our time series 3 years into the future.

- The DataFrame of future dates is then used as input to the predict method of our fitted model.

In [None]:
forecast = my_model.predict(future_dates)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].head()

Prophet returns a large DataFrame with many interesting columns, but we subset our output to the columns most relevant to forecasting. These are:

  - **ds**: the datestamp of the forecasted value
  - **yhat**: the forecasted value of our metric (in Statistics, yhat is a notation traditionally used to represent the predicted values of a value y)
  - **yhat_lower**: the lower bound of our forecasts
  - **yhat_upper**: the upper bound of our forecasts
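
To show how `interval_width` relates to the lower and upper bounds, here is a minimal sketch that reads an interval off a set of simulated values as percentiles. As far as we understand, Prophet derives its bounds in a similar way from simulated forecasts; the normal draws below are synthetic stand-ins, and using the sample mean as the point forecast is a simplification made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for simulated forecasts: 1000 draws for each of 3 future dates
samples = rng.normal(loc=[500.0, 520.0, 540.0], scale=20.0, size=(1000, 3))

interval_width = 0.95
lower_q = 100 * (1 - interval_width) / 2   # 2.5th percentile
upper_q = 100 * (1 + interval_width) / 2   # 97.5th percentile

yhat = samples.mean(axis=0)                # simplification: mean of the draws
yhat_lower = np.percentile(samples, lower_q, axis=0)
yhat_upper = np.percentile(samples, upper_q, axis=0)

for point, lo, hi in zip(yhat, yhat_lower, yhat_upper):
    print(f"{point:.1f} in [{lo:.1f}, {hi:.1f}]")
```

A wider `interval_width` pushes the two percentiles further apart, so the shaded band in the forecast plot becomes wider.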

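- Some variation from the output shown here is to be expected. By default, Prophet fits the model by optimization (MAP estimation), so `yhat` is reproducible, but the uncertainty bounds are estimated from randomly simulated future trend paths.

- The simulation is stochastic, so `yhat_lower` and `yhat_upper` will be slightly different each time. Full **Markov chain Monte Carlo (MCMC)** sampling is used only when `mcmc_samples` is set to a value greater than zero.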

- Prophet also provides a convenient function to quickly plot the results of our forecasts as follows:

In [None]:
my_model.plot(forecast, uncertainty=True)

- Prophet plots the observed values of our time series (the black dots), the forecasted values (blue line) and the uncertainty intervals of our forecasts (the blue shaded regions).

- One other particularly strong feature of Prophet is its ability to return the components of our forecasts.

- This can help reveal how daily, weekly and yearly patterns of the time series contribute to the overall forecasted values.

In [None]:
my_model.plot_components(forecast)

- The components plot provides interesting insights.

- The trend panel shows that the monthly volume of airline passengers has been increasing roughly linearly over time.

- The yearly seasonality panel shows that the most traffic occurs during the holiday months of July and August.

- Because the data are monthly, recent Prophet versions do not fit weekly or daily seasonality by default, so no weekly panel is expected here. A day-of-week pattern cannot be identified from one observation per month.
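
The seasonal curves in the components plot are described in the Prophet paper as Fourier series. The sketch below builds the sine and cosine features for such a series with numpy. The helper `fourier_features` is illustrative, and the period of 365.25 days with order 10 reflects our understanding of the yearly defaults, so treat both as assumptions.

```python
import numpy as np

def fourier_features(t_days, period, order):
    """Build sin/cos columns for a seasonal component (illustrative helper)."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

# Days since the start of the series, one point roughly every month for three years
t_days = np.arange(0, 3 * 365, 30)
X = fourier_features(t_days, period=365.25, order=10)
print(X.shape)
```

The seasonal component is then a weighted sum of these columns, with the weights estimated during fitting.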

# **7. Plotting the forecasted components** <a class="anchor" id="7"></a>


[Table of Contents](#0.1)



- We can plot the trend and seasonality components of the forecast as follows:

In [None]:
fig1 = my_model.plot_components(forecast)

# **8. Adding ChangePoints to Prophet** <a class="anchor" id="8"></a>


[Table of Contents](#0.1)


- Changepoints are the points in time where the trajectory of the time series changes abruptly.

- By default, Prophet places 25 potential changepoints uniformly over the first 80% of the time series.

- Let’s mark the changepoints on the forecast plot with vertical lines.

In [None]:
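try:  # Prophet >= 1.0 is published as `prophet`
    from prophet.plot import add_changepoints_to_plot
except ImportError:  # older releases used the `fbprophet` package name
    from fbprophet.plot import add_changepoints_to_plot
fig = my_model.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), my_model, forecast)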

We can view the dates of the candidate changepoints.

In [None]:
my_model.changepoints
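
The default placement of candidate changepoints can be approximated with numpy and pandas. The helper `candidate_changepoints` below is an illustrative reconstruction based on our reading of the defaults (25 points spread evenly over the first 80% of the history), not Prophet's own code, and the monthly date range is an assumption matching this dataset.

```python
import numpy as np
import pandas as pd

def candidate_changepoints(ds, n_changepoints=25, changepoint_range=0.8):
    """Spread candidate changepoints evenly over the first part of the history."""
    ds = pd.Series(ds).sort_values().reset_index(drop=True)
    hist_size = int(np.floor(len(ds) * changepoint_range))
    # n_changepoints + 1 evenly spaced positions; the first one (index 0) is dropped
    idx = np.linspace(0, hist_size - 1, n_changepoints + 1).round().astype(int)
    return ds.iloc[idx].iloc[1:].reset_index(drop=True)

# Assumption: monthly history from 1949-01 to 1960-12
ds = pd.date_range("1949-01-01", "1960-12-01", freq="MS")
cps = candidate_changepoints(ds)
print(len(cps), cps.iloc[0].date(), cps.iloc[-1].date())
```

Raising `changepoint_range` moves the last candidate closer to the end of the history, and `n_changepoints` controls how densely the candidates are spaced.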

- We can change the portion of the history in which changepoints are inferred by setting the *changepoint_range* parameter.

In [None]:
# Place candidate changepoints in the first 90% of the history instead of the default 80%
pro_change = Prophet(changepoint_range=0.9)
forecast = pro_change.fit(df).predict(future_dates)
fig = pro_change.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), pro_change, forecast)

The number of changepoints can be set with the *n_changepoints* parameter when initializing Prophet.

In [None]:
# Use 20 candidate changepoints instead of the default 25
pro_change = Prophet(n_changepoints=20, yearly_seasonality=True)
forecast = pro_change.fit(df).predict(future_dates)
fig = pro_change.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), pro_change, forecast)

# **9. Adjusting Trend** <a class="anchor" id="9"></a>


[Table of Contents](#0.1)


- Prophet allows us to adjust the flexibility of the trend when it overfits or underfits.

- *changepoint_prior_scale* controls how flexible the trend is.

- The default value of *changepoint_prior_scale* is 0.05.

- Decrease the value to make the trend less flexible.

- Increase the value to make the trend more flexible.

- Below, we increase *changepoint_prior_scale* to 0.08 to make the trend more flexible.


In [None]:
# A larger prior scale (0.08 vs. the default 0.05) allows bigger trend changes
pro_change = Prophet(n_changepoints=20, yearly_seasonality=True, changepoint_prior_scale=0.08)
forecast = pro_change.fit(df).predict(future_dates)
fig = pro_change.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), pro_change, forecast)

- Next, we decrease *changepoint_prior_scale* to 0.001 to make the trend less flexible.

In [None]:
# A much smaller prior scale (0.001) shrinks trend changes towards zero
pro_change = Prophet(n_changepoints=20, yearly_seasonality=True, changepoint_prior_scale=0.001)
forecast = pro_change.fit(df).predict(future_dates)
fig = pro_change.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), pro_change, forecast)
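
To see what trend flexibility means, here is a numpy sketch of the piecewise-linear trend described in the Prophet paper, where the slope changes by an amount `delta` at each changepoint. In the paper, the prior scale governs how large these `delta` values may become. The helper name and the hand-picked numbers below are illustrative assumptions, not fitted values.

```python
import numpy as np

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend with base slope k, offset m, and slope changes deltas at changepoints."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)
    deltas = np.asarray(deltas, dtype=float)
    # a[i, j] is 1 when time t[i] is at or past changepoint s[j]
    a = (t[:, None] >= s[None, :]).astype(float)
    gamma = -s * deltas   # offset corrections that keep the trend continuous
    return (k + a @ deltas) * t + (m + a @ gamma)

t = np.linspace(0.0, 1.0, 11)
# Large slope change at t = 0.5 (flexible) versus no change at all (rigid)
flexible = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[0.5], deltas=[2.0])
rigid = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[0.5], deltas=[0.0])

print(flexible.round(2))
print(rigid.round(2))
```

With a small *changepoint_prior_scale* the fitted slope changes are pulled towards zero, so the trend resembles the `rigid` line; a larger value lets the trend bend as in `flexible`.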

# **10. Conclusion** <a class="anchor" id="10"></a>


[Table of Contents](#0.1)


- In this tutorial, we described how to use the Prophet library to perform time series forecasting in Python.

- We have mostly relied on default parameters, changing only the uncertainty interval and the changepoint settings, but Prophet enables us to specify many more arguments.

- In particular, Prophet provides the functionality to bring our own domain knowledge about the time series to the table.

# **11. References** <a class="anchor" id="11"></a>


[Table of Contents](#0.1)


The concepts and ideas in this notebook are taken from the following websites:

- 1. https://facebook.github.io/prophet/

- 2. https://facebook.github.io/prophet/docs/quick_start.html

- 3. https://peerj.com/preprints/3190.pdf

- 4. https://www.digitalocean.com/community/tutorials/a-guide-to-time-series-forecasting-with-prophet-in-python-3


This brings us to the end of this notebook.

I hope you find this kernel useful and enjoyable.

Your comments and feedback are most welcome.

Thank you


[Go to Top](#0)