# Facebook Prophet

## Intuition

__Prophet__ is open-source software released by Facebook's Core Data Science team. It is a procedure for forecasting time series data based on an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects.

Prophet works best with time series that have strong seasonal effects and several seasons of historical data. For example:
- Input: three years of past sales that capture all the seasonal effects (holidays, weekends, weather, etc.).
- Output: with this information, the model can predict the expected sales for the next two months.

The model is accurate enough that Facebook uses it internally for planning tasks such as:
- planning future demand,
- forecasting the number of customers who will visit the site at a specific time in the future.

For more information, please check this out:
- https://research.fb.com/prophet-forecasting-at-scale/
- https://facebook.github.io/prophet/docs/quick_start.html#python-api

## Technical

__Prophet__ implements an __additive regression__ model with four elements:
1. A piecewise linear trend. Prophet automatically picks up change points in the data and identifies any change in the trend.
2. A yearly seasonal component modeled using Fourier series.
3. A weekly seasonal component.
4. A list of holidays that can be provided manually.
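
The first two elements can be made concrete with a small NumPy sketch. This is an illustration only, not Prophet's internal code: the helper names `fourier_features` and `piecewise_linear_trend`, the Fourier order of 3, and the changepoint positions are all assumptions chosen for the example.

```python
import numpy as np

def fourier_features(t_days, period=365.25, order=3):
    """Return sin/cos columns of a truncated Fourier series with the given period."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

def piecewise_linear_trend(t_days, changepoints, base_slope, slope_changes, offset=0.0):
    """Continuous trend whose slope changes by slope_changes[i] at changepoints[i]."""
    t = np.asarray(t_days, dtype=float)
    trend = offset + base_slope * t
    for cp, delta in zip(changepoints, slope_changes):
        # Hinge term: zero before the changepoint, linear after it.
        trend += delta * np.maximum(t - cp, 0.0)
    return trend

t = np.arange(0, 3 * 365)                  # three years of daily time stamps
X_year = fourier_features(t, order=3)      # six columns: sin/cos for k = 1, 2, 3
trend = piecewise_linear_trend(
    t, changepoints=[365, 730], base_slope=1.0, slope_changes=[-0.5, 1.5]
)
```

The seasonal component is then a weighted sum of the columns of `X_year`, and the trend stays continuous because each hinge term is zero at its own changepoint.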

An additive regression model takes the general form:

$$y = \beta_0 + \sum_{j=1}^{p} f_j(x_j) + \epsilon$$

- The functions $f_j(x_j)$ are unknown smooth functions fit from the data.
- Reference: https://research.fb.com/prophet-forecasting-at-scale/
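
To show what "unknown functions fit from the data" can mean in practice, here is a minimal sketch of backfitting, a classical way to fit a generic additive model. It is not how Prophet fits its own model; the cubic polynomial smoother, the function name `backfit_additive`, and the synthetic data are assumptions made for the illustration.

```python
import numpy as np

def backfit_additive(X, y, degree=3, n_iter=20):
    """Fit y ~ beta0 + sum_j f_j(X[:, j]) with polynomial smoothers via backfitting."""
    n, p = X.shape
    beta0 = y.mean()
    components = np.zeros((n, p))   # components[:, j] holds f_j evaluated at X[:, j]
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: what is left after removing every other component.
            others = components.sum(axis=1) - components[:, j]
            residual = y - beta0 - others
            coeffs = np.polyfit(X[:, j], residual, degree)
            fitted = np.polyval(coeffs, X[:, j])
            # Center each component so the intercept stays identifiable.
            components[:, j] = fitted - fitted.mean()
    return beta0, components

# Synthetic data with a known additive structure (made up for this example).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(500, 2))
y = 1.0 + np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=500)

beta0, components = backfit_additive(X, y)
y_hat = beta0 + components.sum(axis=1)
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Each pass refits one component against the residual left by the others, so the model's prediction is always the intercept plus the sum of the per-feature components.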

## Advantages

### Accurate and Fast
- Facebook teams use Prophet for accurate forecasting and planning.
- Prophet can generate results in seconds.

### Automatic
- Little to no manual data preprocessing is required.
- Prophet is robust to missing data and tolerates outliers.
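
Because missing values are tolerated, one common way to deal with extreme outliers is to blank them out and keep the dates. The sketch below does this with a simple interquartile-range rule. The rule, the 1.5 multiplier, and the toy values are assumptions for illustration and are not something Prophet applies on its own.

```python
import numpy as np
import pandas as pd

# Toy history with one obvious outlier and one gap (values made up for this example).
history = pd.DataFrame({
    "ds": pd.date_range("2015-01-01", periods=8, freq="D"),
    "y": [100.0, 104.0, 98.0, 5000.0, 101.0, np.nan, 97.0, 103.0],
})

# Flag points that fall outside 1.5 * IQR of the middle half of the data.
q1, q3 = history["y"].quantile([0.25, 0.75])
fence = 1.5 * (q3 - q1)
is_outlier = (history["y"] < q1 - fence) | (history["y"] > q3 + fence)

# Keep every date, but replace flagged values with NaN so they are treated as missing.
cleaned = history.copy()
cleaned.loc[is_outlier, "y"] = np.nan
```

The row count is unchanged, so the time axis stays intact and only the suspicious observation is dropped from the fit.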

### Domain Knowledge Integration
- Users can tweak forecasts by adding domain-specific knowledge, such as a custom list of holidays.
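
As a rough sketch of what supplying holidays looks like, the frame below uses the column layout Prophet reads for custom holidays (`holiday`, `ds`, and the optional `lower_window` and `upper_window`). The specific dates and window sizes are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical holiday dates, chosen only for illustration.
christmas = pd.DataFrame({
    "holiday": "christmas",
    "ds": pd.to_datetime(["2013-12-25", "2014-12-25"]),
    "lower_window": -1,   # also flag the day before
    "upper_window": 1,    # and the day after
})
new_year = pd.DataFrame({
    "holiday": "new_year",
    "ds": pd.to_datetime(["2014-01-01", "2015-01-01"]),
    "lower_window": 0,
    "upper_window": 0,
})
holidays = pd.concat([christmas, new_year], ignore_index=True)

# The frame would then be handed to the model constructor:
# model = Prophet(holidays=holidays)
```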

# Model Building

In [5]:
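# !pip install fbprophet
# Note: from version 1.0 onward the package is published as `prophet`
# (pip install prophet; from prophet import Prophet).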
from fbprophet import Prophet

In [6]:
# import all the main libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%config IPCompleter.greedy=True
%config IPCompleter.use_jedi=False

# Display all the columns in pandas without being truncated
pd.options.display.max_columns = None

In [8]:
test_df = pd.read_csv('Datasets/test.csv')
test_df.head()

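| | index | Store | DayOfWeek | Date | Sales | Customers | Promo | StateHoliday | SchoolHoliday | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 5 | 2015-07-31 | 5263 | 555 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | 0.0 | 0.0 | 0 |
| 1 | 1115 | 1 | 4 | 2015-07-30 | 5020 | 546 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | 0.0 | 0.0 | 0 |
| 2 | 2230 | 1 | 3 | 2015-07-29 | 4782 | 523 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | 0.0 | 0.0 | 0 |
| 3 | 3345 | 1 | 2 | 2015-07-28 | 5011 | 560 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | 0.0 | 0.0 | 0 |
| 4 | 4460 | 1 | 1 | 2015-07-27 | 6102 | 612 | 1 | 0 | 1 | c | a | 1270.0 | 9.0 | 2008.0 | 0 | 0.0 | 0.0 | 0 |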


In [9]:
test_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 844392 entries, 0 to 844391
Data columns (total 18 columns):
 #   Column                     Non-Null Count   Dtype  
---  ------                     --------------   -----  
 0   index                      844392 non-null  int64  
 1   Store                      844392 non-null  int64  
 2   DayOfWeek                  844392 non-null  int64  
 3   Date                       844392 non-null  object 
 4   Sales                      844392 non-null  int64  
 5   Customers                  844392 non-null  int64  
 6   Promo                      844392 non-null  int64  
 7   StateHoliday               844392 non-null  object 
 8   SchoolHoliday              844392 non-null  int64  
 9   StoreType                  844392 non-null  object 
 10  Assortment                 844392 non-null  object 
 11  CompetitionDistance        844392 non-null  float64
 12  CompetitionOpenSinceMonth  844392 non-null  float64
 13  CompetitionOpenSinceYear   84

In [13]:
pd.to_datetime(test_df.Date)

0        2015-07-31
1        2015-07-30
2        2015-07-29
3        2015-07-28
4        2015-07-27
            ...    
844387   2013-01-07
844388   2013-01-05
844389   2013-01-04
844390   2013-01-03
844391   2013-01-02
Name: Date, Length: 844392, dtype: datetime64[ns]
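
The call above only displays the converted dates; it does not store them. Prophet expects its input as a frame with a `ds` (date) column and a `y` (value) column, so one possible next step is sketched below. The helper name `to_prophet_frame` and the toy frame are assumptions for illustration: the Store 1 rows reuse values from the preview above, while the Store 2 row is made up.

```python
import pandas as pd

# Toy frame with the same column names as the data above.
toy_df = pd.DataFrame({
    "Store": [1, 1, 1, 2],
    "Date": ["2015-07-31", "2015-07-30", "2015-07-29", "2015-07-31"],
    "Sales": [5263, 5020, 4782, 6064],   # last value is made up
})

def to_prophet_frame(df, store_id):
    """Return one store's sales as a frame with the ds/y columns Prophet expects."""
    one_store = df.loc[df["Store"] == store_id, ["Date", "Sales"]]
    out = one_store.rename(columns={"Date": "ds", "Sales": "y"})
    out["ds"] = pd.to_datetime(out["ds"])   # keep the conversion instead of only displaying it
    return out.sort_values("ds").reset_index(drop=True)

prophet_df = to_prophet_frame(toy_df, store_id=1)

# A fit and a two-month forecast would then look roughly like:
# model = Prophet()
# model.fit(prophet_df)
# future = model.make_future_dataframe(periods=60)
# forecast = model.predict(future)
```

Filtering to a single store keeps one observation per date, which is the shape a single Prophet model works with.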