I wanted to explore the Prophet package for forecasting. Under the hood, Prophet fits an additive regression model with four components.

A piecewise linear or logistic growth curve trend. Prophet automatically detects changes in trend by selecting changepoints from the data.

A yearly seasonal component is modeled using Fourier series.

A weekly seasonal component modeled using dummy variables.

A user-provided list of important holidays.

To model uncertainty in the seasonality or holiday effects, we can run a few hundred Hamiltonian Monte Carlo iterations so that the forecast includes seasonal uncertainty estimates.
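
To make the Fourier-series idea behind the yearly seasonal component concrete, here is a minimal NumPy sketch. It is not Prophet's internal code: the function name `fourier_features`, the order of 3, and the weights in `beta` are assumptions chosen for illustration.

```python
import numpy as np

def fourier_features(t_days, period=365.25, order=3):
    """Return a (len(t_days), 2 * order) matrix of sine and cosine features."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

t = np.arange(730)  # two years of daily observations, measured in days
X = fourier_features(t, period=365.25, order=3)

# A seasonal curve is a weighted sum of the columns.
# These weights are made up; in a fitted model they would be estimated from data.
beta = np.array([0.5, -0.2, 0.1, 0.05, 0.0, 0.02])
seasonality = X @ beta
```

A higher order lets the seasonal curve change more quickly, at the cost of a greater risk of overfitting.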

The model is fit with the Stan probabilistic programming language. In the default implementation, Stan performs MAP (maximum a posteriori) optimization to estimate the parameters.


In [3]:
import pandas as pd
import numpy as np
from fbprophet import Prophet
import os
os.chdir('..')

In [4]:
df = pd.read_csv("Data/max_years.csv", encoding='utf-8')

In [5]:
df["y"] = np.log(df["y"])
# Model log(y) instead of y; applying np.exp to a forecast recovers the original scale.


In [6]:
df.head()

Unnamed: 0,ds,y
0,3/17/1980,-0.897103
1,3/18/1980,-0.938299
2,3/19/1980,-0.952415
3,3/20/1980,-1.026161
4,3/21/1980,-1.016009


In [7]:
# df = df.rename(columns = {"date 30 min":"ds", "lst trd /lst prce":'y'})
df["ds"] = pd.DatetimeIndex(df["ds"])
# ax = df.set_index('ds').plot(df['ds'],df['y'])

In [8]:
type(df["ds"][1])

pandas._libs.tslib.Timestamp

By default Prophet only returns uncertainty in the trend and observation noise. To get uncertainty in seasonality, we must do full Bayesian sampling, as noted in the Facebook documentation. This replaces the default MAP estimation with Markov Chain Monte Carlo (MCMC) sampling.

MAP stands for Maximum a posteriori estimation: in Bayesian statistics, a MAP estimate is an estimate of an unknown quantity that equals the mode of the posterior distribution. This gives us a point estimate of an unobserved quantity based on observed empirical data. 

MAP is closely related to maximum likelihood (ML) estimation, which estimates the parameters of a statistical model by finding the parameter values that maximize the likelihood of the observations given the parameters, that is, by maximizing the agreement between the observations and the selected model (Source: Wikipedia). ML estimation is the special case of MAP in which the prior is uniform.

MAP differs in that its optimization objective is augmented with a prior distribution, which incorporates additional information available through prior knowledge of a related event. Put simply, MAP can be viewed as ML estimation with regularization.
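
The difference between ML and MAP can be seen in a toy problem: estimating the mean of Gaussian data with a Gaussian prior on that mean. This is only a sketch and has nothing to do with Prophet's actual model; the helper `map_mean`, the noise level, and the prior settings are assumptions picked for illustration.

```python
import numpy as np

def map_mean(x, sigma, mu0, tau):
    """MAP estimate of a Gaussian mean with known noise sigma and prior N(mu0, tau^2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    precision_data = n / sigma**2      # weight carried by the observations
    precision_prior = 1.0 / tau**2     # weight carried by the prior
    return (precision_data * x.mean() + precision_prior * mu0) / (precision_data + precision_prior)

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=5)   # five noisy observations (simulated)

mu_mle = x.mean()                                  # ML estimate: the sample mean
mu_map = map_mean(x, sigma=1.0, mu0=0.0, tau=0.5)  # MAP estimate: pulled toward mu0 = 0

print("ML estimate: ", mu_mle)
print("MAP estimate:", mu_map)
```

The MAP estimate sits between the prior mean and the sample mean, which is the regularization effect described above. As the prior gets wider (large `tau`), the MAP estimate approaches the ML estimate.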

MCMC performs full sampling, so we also see the uncertainty in the seasonal components. This is great, but it is much slower than MAP estimation (especially on Windows using Python). For best results, use R on Windows and Python on Unix-based operating systems.

In [25]:
# Initialize the Prophet model. Setting mcmc_samples > 0 switches fitting from MAP optimization
# to Markov Chain Monte Carlo (MCMC) sampling, which gives us uncertainty intervals in our subplots.
my_model = Prophet(interval_width=0.95, mcmc_samples=10000)  # this will take a long time!

In [None]:
my_model.fit(df) # Very slow

INFO:fbprophet.forecaster:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.


In [None]:
future = my_model.make_future_dataframe(periods = 365)
forecast = my_model.predict(future)

In [None]:
# cols = [col for col in forecast.columns if col not in 'ds']
# forecast[cols] = np.log(forecast[cols])
# forecast

In [None]:
my_model.plot((forecast), uncertainty=True).savefig("Amos's Conservative Bayesian MCMC", dpi = 2000)

Prophet has L1 regularization baked in. It starts with a large number of potential changepoints at which the rate is allowed to change. It then puts a sparse prior on the magnitudes of the rate changes (equivalent to L1 regularization). Essentially, Prophet sets a large number of changepoints and only uses a subset of them (Source: https://research.fb.com/prophet-forecasting-at-scale/).
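
The idea of many candidate changepoints with only a few in use can be sketched as a piecewise linear trend in NumPy. This is a simplified illustration rather than Prophet's implementation: the function name `piecewise_linear_trend`, the time grid, and the hand-set `deltas` are assumptions. In a fitted model the sparse prior would push most rate changes toward zero.

```python
import numpy as np

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Continuous piecewise linear trend with base rate k, offset m, and rate changes deltas."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)
    d = np.asarray(deltas, dtype=float)
    # A[i, j] is 1 when time t[i] is at or past changepoint s[j], else 0.
    A = (t[:, None] >= s[None, :]).astype(float)
    # Offset adjustments that keep the trend continuous at each changepoint.
    gamma = -s * d
    return (k + A @ d) * t + (m + A @ gamma)

t = np.linspace(0.0, 1.0, 101)            # time rescaled to [0, 1]
changepoints = np.linspace(0.1, 0.8, 8)   # many candidate changepoints
deltas = np.zeros(8)                      # most rate changes are exactly zero (sparse)
deltas[3] = -1.5                          # only the candidate near t = 0.4 is used

trend = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=changepoints, deltas=deltas)
```

Here the trend rises with slope 1.0 until roughly t = 0.4 and then falls with slope -0.5, even though eight changepoints were available.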

By default Prophet fits weekly and yearly seasonalities, and it can also fit sub-daily time series. The model updates its priors with new information. The default prior scale for seasonality and holiday effects is 10, which provides very little regularization.

There are three sources of uncertainty in the forecast: uncertainty in the trend, uncertainty in the seasonality estimates, and additional observation noise. The biggest source of uncertainty in the forecast is the potential for future trend changes. The model can effectively fit past trend changes, but how do we predict trend changes in the future? We assume that the future will be similar to the past: the average frequency and magnitude of trend changes in the future will be the same as in the past.
This assumption probably will not hold, so the coverage of the uncertainty intervals may not be entirely accurate.
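
The assumption that future trend changes resemble past ones can be turned into a small simulation. The sketch below is a rough illustration and not Prophet's own procedure: the function `simulate_future_trends`, the Laplace distribution for change sizes, and all input numbers are assumptions.

```python
import numpy as np

def simulate_future_trends(last_slope, past_deltas, n_history, horizon, n_sims=1000, seed=0):
    """Simulate future trend paths whose rate changes mimic the historical ones."""
    rng = np.random.default_rng(seed)
    past_deltas = np.asarray(past_deltas, dtype=float)
    change_prob = len(past_deltas) / n_history   # average frequency of past trend changes
    scale = np.mean(np.abs(past_deltas))         # average magnitude of past trend changes
    # For every simulated path and future step, decide whether the slope changes and by how much.
    occurs = rng.random((n_sims, horizon)) < change_prob
    sizes = rng.laplace(0.0, scale, size=(n_sims, horizon))
    slopes = last_slope + np.cumsum(occurs * sizes, axis=1)
    # Accumulate slopes into a trend, measured relative to the last observed point.
    return np.cumsum(slopes, axis=1)

paths = simulate_future_trends(last_slope=0.01, past_deltas=[0.02, -0.03, 0.01],
                               n_history=300, horizon=365)
lower, upper = np.percentile(paths, [2.5, 97.5], axis=0)   # 95% band per future step
```

The band between `lower` and `upper` widens with the horizon, which is why the trend dominates forecast uncertainty far into the future.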



In [None]:
forecast

In [None]:
np.mean(np.exp(forecast['yhat'][-365:]))

If we include earnings reports and holidays, we can forecast the price of this stock even more accurately.
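
As a sketch of how such events could be supplied, Prophet accepts a holidays table with the columns `holiday`, `ds`, `lower_window`, and `upper_window`. The dates below are placeholders, not the real earnings or market holiday dates for this stock, and the window sizes are assumptions.

```python
import pandas as pd

# Placeholder earnings dates; replace with the actual report dates for the stock.
earnings = pd.DataFrame({
    "holiday": "earnings_report",
    "ds": pd.to_datetime(["2017-01-25", "2017-04-26", "2017-07-26", "2017-10-25"]),
    "lower_window": 0,   # effect starts on the event day
    "upper_window": 1,   # and is allowed to carry over to the following day
})

# Placeholder market holidays.
market_holidays = pd.DataFrame({
    "holiday": "market_holiday",
    "ds": pd.to_datetime(["2017-01-02", "2017-07-04", "2017-12-25"]),
    "lower_window": 0,
    "upper_window": 0,
})

holidays = pd.concat([earnings, market_holidays], ignore_index=True)

# The table would then be passed when the model is created (not run here):
# my_model = Prophet(holidays=holidays, interval_width=0.95)
```

For the effects to show up in the forecast, the table should cover event dates in both the historical data and the forecast period.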

In [None]:
my_model.plot_components(forecast).savefig("Conservative subplots", dpi = 2000)