# Advanced Prophet Demo Part II
Jingyuan Chen, Prepared for ECON 148 by DSUS Modules Team

Install the packages if needed:

In [None]:
#!pip install fredapi
#!pip install prophet
#!pip install yfinance

In [None]:
from prophet import Prophet
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error
import yfinance as yf
from fredapi import Fred
fred = Fred(api_key='e3053cdc3e94dfb2b73c5945b0d1b1f7')

## Prophet: Additive Models


Previously, we learned about ARIMA models, which build a formula for future values as a function of past data, assuming that past values drive future ones. Today, we’ll explore Facebook’s Prophet tool, which uses an “additive model”. Instead of forecasting from a function of past values, an additive model looks for the “best fit curve” for our data by summing (hence “additive”) several components, and it lets the trend shift at major “change points”.

The idea is quite simple: we can decompose a time series as a combination of patterns

$$
y(t) = g(t) + s(t) + h(t) + \epsilon(t)
$$

- $g(t)$ is the trend, which captures the general direction of the time series by fitting piecewise linear or logistic growth functions whose parameters shift at “change points”.
- $s(t)$ is the seasonal component, modeled with Fourier series for each periodic pattern (yearly, weekly, daily).
- $h(t)$ captures important holidays and exception points, supplied by the user or taken from a built-in country list. Each holiday is an indicator variable whose effect is regularized with a Gaussian prior.
- $\epsilon(t)$ is the noise (residual) term: whatever the previous components cannot explain.
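To make the decomposition concrete, here is a minimal sketch that builds a synthetic monthly series by summing a trend, a seasonal term, a holiday bump, and noise. All function names and numbers are made up for illustration; this is not how Prophet estimates the components, only what the sum looks like once they are known.

```python
import math
import random

random.seed(0)  # fixed seed so the illustration is reproducible

def trend(t):
    # Trend component: hypothetical linear growth of 0.5 units per month from a base of 100
    return 100.0 + 0.5 * t

def seasonality(t, period=12):
    # Seasonal component: a single sine/cosine (Fourier) pair with a 12-month period
    angle = 2 * math.pi * t / period
    return 10.0 * math.sin(angle) + 4.0 * math.cos(angle)

def holiday(t):
    # Holiday component: a hypothetical one-off bump in month 6 of every year
    return 15.0 if t % 12 == 6 else 0.0

months = range(48)
noise = [random.gauss(0, 1) for _ in months]  # residual term

# The observed series is simply the sum of the four pieces
y = [trend(t) + seasonality(t) + holiday(t) + e for t, e in zip(months, noise)]
```

Fitting Prophet runs this in reverse: given only `y`, it estimates the pieces that sum to it.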
 

Prophet works best with time series that have strong seasonal effects and several seasons of historical data. You can read more about this model in the team’s [white paper](https://peerj.com/preprints/3190/) and [documentation](https://facebook.github.io/prophet/docs/quick_start.html#python-api). Prophet is also [open source](https://github.com/facebook/prophet).


## Dangers & Assumptions

Like many other forecasting packages, Prophet is a somewhat "opinionated" model with its own biases and assumptions. Keep these limitations in mind when analyzing its forecasts, and cross-validate against other models.

Some food for thought: in 2021, Zillow’s data science team allegedly made heavy use of Prophet as the basis of their price prediction model, which failed to predict house prices accurately enough, leading to a loss of $300 million in a single quarter. [Read more](https://ryxcommar.com/2021/11/06/zillow-prophet-time-series-and-prices/)

Some important assumptions to keep in mind when using Prophet:
- Additivity: the trend, seasonality, holiday effects and error are assumed to be independent and to combine linearly. Prophet also does not assume a stochastic trend.
- Seasonality: seasonal patterns are assumed to repeat cyclically (yearly, monthly, weekly, daily, etc.) and predictably.
- Holidays and exceptions: Prophet uses built-in lists of holidays and assumes that each holiday's effect is constant over time ([read more](https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html)).
- Noise: errors are assumed to be pure noise (and hence Gaussian). Unlike ARIMA, Prophet does not model a dependence of future values on past data or past errors (no autocorrelation in the residuals). This gives up some of ARIMA's inferential advantages in return for more relaxed assumptions.
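One way to probe the last assumption is to check whether a model's residuals are correlated with their own previous values. The sketch below is a minimal illustration on simulated residuals. `lag1_autocorr` is a hypothetical helper, not part of Prophet; in practice the residuals would be the observed `y` minus the fitted `yhat`.

```python
import random

def lag1_autocorr(residuals):
    # Sample autocorrelation at lag 1: how strongly each residual
    # resembles the one immediately before it (0 means no resemblance)
    n = len(residuals)
    mean = sum(residuals) / n
    num = sum((residuals[i] - mean) * (residuals[i - 1] - mean) for i in range(1, n))
    den = sum((r - mean) ** 2 for r in residuals)
    return num / den

random.seed(1)

# Independent Gaussian noise: what the assumption expects
white = [random.gauss(0, 1) for _ in range(500)]

# Persistent residuals: each value carries over 80% of the previous one
persistent = [0.0]
for _ in range(499):
    persistent.append(0.8 * persistent[-1] + random.gauss(0, 1))

print(f"white noise: {lag1_autocorr(white):.2f}")
print(f"persistent:  {lag1_autocorr(persistent):.2f}")
```

A value near zero is consistent with the noise assumption. A value far from zero suggests structure that the components have not captured.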


## Case 1: Air Revenue Passenger Miles Over Time

Let's take as an example the Air Revenue Passenger Miles (AIRRPMTSI) data set from FRED. Revenue passenger miles are calculated by multiplying the number of paying passengers by the distance traveled.

In [None]:
data = fred.get_series('AIRRPMTSI')

In [None]:
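# Prophet expects a DataFrame with columns named 'ds' (dates) and 'y' (values)
df = pd.DataFrame({'ds': data.index, 'y': data.values})
# Multiplicative mode: the seasonal effect scales with the trend
model = Prophet(seasonality_mode='multiplicative')
model.fit(df)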

In [None]:
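# The series is monthly, so extend it by 24 month-start dates (two years)
future = model.make_future_dataframe(periods=24, freq='MS')
forecast = model.predict(future)  # adds yhat, yhat_lower, yhat_upper and component columns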

In [None]:
fig = model.plot(forecast)  # observed points, fitted line, and uncertainty band
plt.title("Air Revenue Passenger Miles Forecast")
plt.xlabel("Date")
plt.ylabel("Air Revenue Passenger Miles (Thousands)")
plt.show()

Run the following cell to mark the candidate change points on the forecast plot:

In [None]:
from prophet.plot import add_changepoints_to_plot
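# Keep the observations whose dates Prophet selected as candidate change points
change_points_df = df.loc[df["ds"].isin(model.changepoints)]
fig = model.plot(forecast, figsize=(12, 6))
# Overlay the change points as red crosses on the forecast plot
plt.scatter(change_points_df['ds'], change_points_df['y'], color='red', marker='x', s=100, label='Change Points')
plt.legend(loc='upper left')
plt.show()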
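To see what a change point does to the trend, the following sketch evaluates a piecewise linear trend by hand: the slope starts at a base rate and is adjusted at each change point, and the intercept is shifted so the line stays continuous. The function name and all numbers are hypothetical. It is a simplified reading of the trend described in the Prophet white paper, not Prophet's own code.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    # k: base growth rate, m: base offset
    # changepoints: times at which the slope changes
    # deltas: slope adjustment applied at each change point
    rate = k
    offset = m
    for s_j, delta_j in zip(changepoints, deltas):
        if t >= s_j:
            rate += delta_j
            # Shift the intercept so the two segments meet exactly at s_j
            offset -= s_j * delta_j
    return rate * t + offset

# Slope 1.0 until t=10, then 0.5 until t=20, then 2.5 afterwards
for t in (5, 10, 20, 30):
    print(t, piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[10, 20], deltas=[-0.5, 2.0]))
```

Because each adjustment is compensated in the intercept, the trend bends at a change point but never jumps.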

A useful feature of Prophet is its ability to provide component plots, which break down the forecast into its individual components. Run the following cell:

In [None]:
fig2 = model.plot_components(forecast)
plt.show()

## Case 2: Monthly Air Passengers

By default, Prophet fits additive seasonalities, meaning the effect of the seasonality is added to the trend to get the forecast. Let's think about why this might not always be appropriate, following our previous example of air travel. Why did we use multiplicative seasonality there?

For this example, we use a separate data set containing the monthly number of air passengers from 1949 to 1960. (Read up on what happened to air travel during this period!)

In [None]:
data2 = pd.read_csv('https://raw.githubusercontent.com/facebook/prophet/main/examples/example_air_passengers.csv')

In [None]:
# Additive seasonality: the seasonal effect is a fixed amount added to the trend
model2 = Prophet(seasonality_mode='additive')
model2.fit(data2)
future = model2.make_future_dataframe(50, freq='MS')  # 50 further month starts
forecast = model2.predict(future)
fig = model2.plot(forecast)
plt.title('Additive Model')
plt.show()

# Multiplicative seasonality: the seasonal effect scales with the trend
model3 = Prophet(seasonality_mode='multiplicative')
model3.fit(data2)
future = model3.make_future_dataframe(50, freq='MS')
forecast = model3.predict(future)
fig = model3.plot(forecast)
plt.title('Multiplicative Model')
plt.show()

Recall that Prophet's default is additive seasonality: the seasonal effect is a fixed amount added to the trend. Is that reasonable here? As air travel becomes more accessible and passenger numbers grow, we would expect the size of the seasonal swings to grow with the trend as well. In general:

- Additive model: suitable when the variance is constant. If the fluctuations in the time series are roughly the same size at every level of the series, an additive model may be appropriate. In other words, the seasonal pattern contributes a constant amount to the series.
- Multiplicative model: suitable when the variance scales with the level. If the fluctuations grow (or shrink) with the overall level of the series, a multiplicative model may be more appropriate. This is often the case when the seasonal pattern contributes a fixed proportion of the overall level.
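The difference is easy to see numerically. In this minimal sketch (all names and numbers are illustrative), the same sine-shaped seasonal pattern is combined with a rising trend in both ways. We then compare the size of the seasonal peak early and late in the series.

```python
import math

def trend(t):
    # Hypothetical trend: starts at 100 and grows by 10 per month
    return 100.0 + 10.0 * t

def seasonal_shape(t, period=12):
    # Seasonal pattern between -1 and 1, repeating every 12 months
    return math.sin(2 * math.pi * t / period)

def additive(t):
    # Seasonal swing is a fixed number of units (20), whatever the trend level
    return trend(t) + 20.0 * seasonal_shape(t)

def multiplicative(t):
    # Seasonal swing is a fixed fraction (20%) of the current trend level
    return trend(t) * (1.0 + 0.2 * seasonal_shape(t))

# Month 3 and month 51 are both seasonal peaks, four years apart
for t in (3, 51):
    print(
        f"t={t}: additive peak = {additive(t) - trend(t):.1f}, "
        f"multiplicative peak = {multiplicative(t) - trend(t):.1f}"
    )
```

The additive peak stays the same size as the trend rises, while the multiplicative peak grows with it. A seasonal swing that widens as passenger numbers climb is the pattern to look for in the air passenger plots.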


## Evaluating Performance
While we can see that one model clearly does a better job in this instance, we would like some way of quantifying the difference. Run the following cell.

In [None]:
from prophet.diagnostics import cross_validation
from prophet.diagnostics import performance_metrics

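# initial: minimum training window; period: spacing between forecast cutoffs;
# horizon: how far ahead each forecast is evaluated
df_cv_additive = cross_validation(model2, initial='730 days', period='180 days', horizon='365 days')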
metrics_additive = performance_metrics(df_cv_additive)

print(metrics_additive[['horizon', 'mae']])
overall_mae_additive = metrics_additive['mae'].mean()
print(f'Overall MAE (Additive Model): {overall_mae_additive:.3f}')

df_cv_multiplicative = cross_validation(model3, initial='730 days', period='180 days', horizon='365 days')
metrics_multiplicative = performance_metrics(df_cv_multiplicative)

print(metrics_multiplicative[['horizon', 'mae']])
overall_mae_multiplicative = metrics_multiplicative['mae'].mean()
print(f'Overall MAE (Multiplicative Model): {overall_mae_multiplicative:.3f}')
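To see roughly what the cross-validation above is doing, the sketch below lays out rolling cutoff dates and computes a mean absolute error by hand. It is a simplified approximation, not Prophet's implementation: `rolling_cutoffs` and `mae` are hypothetical helpers, and the date range assumes the air passenger data runs from 1949-01-01 to 1960-12-01.

```python
from datetime import date, timedelta

def rolling_cutoffs(first, last, initial, period, horizon):
    # Work backwards from the end of the data: the last cutoff leaves exactly
    # one horizon of held-out data, and earlier cutoffs are spaced `period`
    # apart, stopping once fewer than `initial` days of training data remain.
    cutoff = last - horizon
    cutoffs = []
    while cutoff >= first + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

def mae(actual, predicted):
    # Mean absolute error: the average size of the forecast misses
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

cutoffs = rolling_cutoffs(
    first=date(1949, 1, 1),
    last=date(1960, 12, 1),
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=timedelta(days=365),
)
print(len(cutoffs), "cutoffs from", cutoffs[0], "to", cutoffs[-1])

# Toy example: the misses are 1, 0 and 2, so the MAE is 1.0
print(mae([1, 2, 3], [2, 2, 5]))
```

For each cutoff, the model is refit on the data up to that date and scored on the following 365 days. The model with the lower MAE has smaller forecast errors on average.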