# Facebook Prophet

## Moving average and exponential smoothing

**Moving average:** a moving average smooths a noisy series by replacing each value with the average of a window of neighboring values, which makes the underlying trend easier to see.
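As a rough illustration, a trailing moving average can be sketched in plain Python. The function name `moving_average` and the sample data are made up for this example, and the trailing (rather than centered) window is an assumption.

```python
def moving_average(values, window):
    """Return the trailing moving average of `values` for a given window size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    averages = []
    # Slide a fixed-size window over the series and average each chunk.
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(sum(chunk) / window)
    return averages


noisy = [3, 9, 4, 10, 5, 11, 6]  # hypothetical data with a lot of variation
smoothed = moving_average(noisy, window=3)
print(smoothed)
```

The smoothed series is shorter than the input by `window - 1` points and swings far less than the raw values.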

**Exponential smoothing:** applies exponentially decreasing weights to the values being averaged over time, so that recent values receive more weight and older values less.
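A minimal sketch of simple exponential smoothing follows. The name `simple_exponential_smoothing`, the smoothing factor `alpha`, and the choice to seed the recursion with the first observation are assumptions made for illustration.

```python
def simple_exponential_smoothing(values, alpha):
    """Smooth `values` with weights that decay exponentially into the past."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [values[0]]  # assumption: seed with the first observation
    for value in values[1:]:
        # The new level blends the latest value with the previous level, so
        # an observation k steps back ends up weighted by alpha * (1 - alpha)**k.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


series = [10, 12, 11, 15]  # hypothetical data
result = simple_exponential_smoothing(series, alpha=0.5)
print(result)  # [10, 11.0, 11.0, 13.0]
```

A larger `alpha` tracks recent values more closely, while a smaller one gives a smoother, slower-reacting line.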

Exponential smoothing originated in the 1950s with **simple exponential smoothing**, 
which does not allow for a trend or seasonality. Charles Holt advanced the technique 
in 1957 to allow for a trend with what he called **double exponential smoothing**; and 
in collaboration with Peter Winters, Holt added seasonality support in 1960, in what is 
commonly called **Holt-Winters exponential smoothing**. 


## ARIMA

In 1970, the mathematicians George Box and Gwilym Jenkins published *Time Series Analysis:* 
*Forecasting and Control*, which described what is now known as the **Box-Jenkins model**. 
This methodology took the idea of the moving average further with the development of 
**ARIMA**. As a term, ARIMA is often used interchangeably with Box-Jenkins, although 
technically, Box-Jenkins refers to a method of parameter optimization for an ARIMA 
model.

ARIMA is an acronym that combines three concepts:

- Autoregressive (AR)
- Integrated (I)
- Moving Average (MA)

**Autoregressive** means that the model uses the dependent relationship between a data point and some number of lagged data points.

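**Integrated** means that the model works with the difference between a data point and some previous data point, rather than with the raw values.

**Moving Average** means that the model uses the dependency between a data point and the residual errors of some number of lagged data points.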


**Parameters:**

*ARIMA(p, d, q)*

- *p:* the number of lag observations to use, also known as the lag order.
- *d:* the number of times that a raw observation is differenced, also known as the degree of differencing.
- *q:* the size of the moving average window.
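To make *d* and *p* more concrete, the sketch below differences a series and then pairs each value with its lagged predecessors. It is not an ARIMA implementation: the helper names `difference` and `lagged_rows` are hypothetical, and the moving average part (*q*) is left out.

```python
def difference(series, d=1):
    """Apply first-order differencing `d` times (the 'I' in ARIMA)."""
    result = list(series)
    for _ in range(d):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


def lagged_rows(series, p):
    """Pair each observation with its `p` preceding values (inputs to the AR part)."""
    return [(series[i - p:i], series[i]) for i in range(p, len(series))]


observations = [2, 4, 7, 11, 16]  # hypothetical data with a curved trend
once = difference(observations, d=1)   # [2, 3, 4, 5]
twice = difference(observations, d=2)  # [1, 1, 1]
rows = lagged_rows(once, p=2)          # [([2, 3], 4), ([3, 4], 5)]
print(once, twice, rows)
```

Here two rounds of differencing remove the trend entirely, which is the reason for choosing a particular *d*; each row in `rows` is what an autoregressive model of order 2 would learn from.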

> One problem with ARIMA models is that they do not support seasonality, that is, data with repeating cycles.

**SARIMA**, or **Seasonal ARIMA**, was developed to overcome this drawback.

Other variations on ARIMA models:

- **VARIMA** (**Vector ARIMA**), for cases with multiple time series as vectors;
- **FARIMA** (**Fractional ARIMA**), also called **ARFIMA** (**Autoregressive Fractionally Integrated Moving Average**), which includes a fractional degree of differencing and so allows long memory, in the sense that observations far apart in time can have non-negligible dependencies;
- **SARIMAX**, a **seasonal ARIMA** model where the X stands for exogenous, or additional, variables added to the model, such as a rain forecast added to a temperature model.







> ARIMA models are complex.  
> Tuning and optimizing them is often computationally expensive.  
> Good results depend upon the skill and experience of the forecaster.  
> ARIMA is therefore not a scalable process, but is better suited to ad hoc analyses by skilled practitioners.


## ARCH/GARCH

When the variance of a dataset is not constant over time, ARIMA models face problems with modeling it.

**Autoregressive Conditional Heteroscedasticity (ARCH)** models were developed to solve this problem. **Heteroscedasticity** is a fancy way of saying that the variance or spread of the data is not constant.

Robert Engle introduced the first ARCH model in 1982 by describing the **conditional variance** as a function of previous values.

Tim Bollerslev and Stephen Taylor introduced a moving average component to the model in 1986 with their **Generalized ARCH model**, or **GARCH**.
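The idea of a conditional variance that depends on previous values can be sketched as follows, assuming the commonly used GARCH(1,1) recursion `variance[t] = omega + alpha * residual[t-1]**2 + beta * variance[t-1]`. The function name, parameter values, and residuals are all made up for illustration; setting `beta=0` reduces the recursion to ARCH(1).

```python
def garch_variance(residuals, omega, alpha, beta=0.0, initial_variance=None):
    """Conditional variance path of a GARCH(1,1) model; beta=0 gives ARCH(1)."""
    if initial_variance is None:
        # Assumption: start from the sample mean of the squared residuals.
        initial_variance = sum(r * r for r in residuals) / len(residuals)
    variances = [initial_variance]
    for previous in residuals[:-1]:
        # Today's variance reacts to yesterday's squared shock (ARCH term)
        # and carries over part of yesterday's variance (GARCH term).
        variances.append(omega + alpha * previous ** 2 + beta * variances[-1])
    return variances


shocks = [0.0, 2.0, 0.0, 0.0]  # hypothetical residuals with one large shock
arch_path = garch_variance(shocks, omega=0.5, alpha=0.5, initial_variance=1.0)
garch_path = garch_variance(shocks, omega=0.5, alpha=0.5, beta=0.25,
                            initial_variance=1.0)
print(arch_path)   # [1.0, 0.5, 2.5, 0.5]
print(garch_path)  # [1.0, 0.75, 2.6875, 1.171875]
```

In the ARCH path the variance drops straight back after the shock, while in the GARCH path the elevated variance decays gradually.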

Neither ARCH nor GARCH models can handle trend or seasonality, 
so in practice an ARIMA model is often built first to extract the seasonal 
variation and trend of a time series, and an ARCH model is then used to model 
the expected variance.


## Neural networks

- A relatively recent development in time series forecasting is the use of **Recurrent Neural Networks (RNNs)**. 

- This was made possible with the development of the **Long Short-Term Memory** unit, or **LSTM**, by Sepp Hochreiter and Jürgen Schmidhuber in 1997. 


- Essentially, an LSTM unit allows a neural network to process a sequence of data, such as speech or video, instead of a single data point, such as an image.

- A standard RNN is called *recurrent* because it has loops built into it, which is what gives it memory, that is, gives it access to previous information. 


- Time series forecasting with LSTMs is still in its infancy when compared to the other forecasting methods discussed here; however, it is showing promise. 

- One strong advantage over other forecasting techniques is the ability of neural networks to capture non-linear relationships. 

- As with any deep learning problem, though, LSTM forecasting requires a great deal of data, computing power, and processing time. 

- Additionally, there are many decisions to be made regarding the architecture of the model and the hyperparameters to be used, which necessitate a very experienced forecaster. 

- In most practical problems, where budget and deadlines must be considered, an ARIMA model is often the better choice.
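The "loop" that gives a recurrent network its memory can be sketched with a single-unit RNN in plain Python. This is a plain recurrent cell, not an LSTM, and the weights `w_input` and `w_hidden` are arbitrary values chosen for illustration rather than learned.

```python
import math


def run_rnn(inputs, w_input, w_hidden, bias=0.0):
    """Run a one-unit recurrent cell over a sequence and return its hidden states."""
    hidden = 0.0  # assumption: the memory starts empty
    states = []
    for x in inputs:
        # The previous hidden state is fed back in alongside the new input.
        hidden = math.tanh(w_input * x + w_hidden * hidden + bias)
        states.append(hidden)
    return states


sequence = [1.0, 0.0, 0.0]  # a single pulse followed by silence
with_memory = run_rnn(sequence, w_input=1.0, w_hidden=0.5)
without_memory = run_rnn(sequence, w_input=1.0, w_hidden=0.0)
print(with_memory)
print(without_memory)
```

With the feedback weight set to zero, the state forgets the pulse immediately; with a non-zero feedback weight, an echo of the pulse persists in later steps.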



## Prophet

Prophet was developed internally at Facebook by Sean J. Taylor and Ben Letham in 
order to overcome two issues often encountered with other forecasting methodologies: 
the **more automatic** forecasting tools available tended to be too inflexible and unable 
to accommodate additional assumptions, and the **more robust** forecasting tools would 
require an experienced analyst with specialized data science skills. Facebook was 
experiencing more demand for high-quality business forecasts than its analysts 
were able to provide. In 2017, Facebook released Prophet to the public as open source 
software.

Prophet was designed to optimally handle business forecasting tasks, which typically 
feature any of these attributes:

- Time series data captured at the hourly, daily, or weekly level with ideally at least a full year of historical data
- Strong seasonality effects occurring daily, weekly, and/or yearly
- Holidays and other special one-time events that don't necessarily follow the seasonality patterns but occur irregularly
- Missing data and outliers
- Significant trend changes that may occur with the launch of new features or products, for example
- Trends that asymptotically approach an upper or lower bound
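A trend that approaches an upper bound can be illustrated with a basic logistic curve. The sketch assumes a constant carrying capacity, and the names `capacity`, `rate`, and `midpoint` are hypothetical; it is a simplified illustration, not Prophet's actual trend code.

```python
import math


def logistic_trend(t, capacity, rate, midpoint):
    """Logistic growth: rises with time and flattens out just below `capacity`."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))


# Sample the curve every 10 time steps (hypothetical parameter values).
values = [logistic_trend(t, capacity=100.0, rate=0.5, midpoint=10.0)
          for t in range(0, 41, 10)]
print(values)
```

The curve passes through half the capacity at the midpoint and then saturates, which is the behavior expected of metrics such as user counts limited by total market size.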

<br>

Prophet is an **additive regression model** built from the following components:

- A linear or logistic growth trend curve
- An annual seasonality curve
- A weekly seasonality curve
- A daily seasonality curve
- Holidays and other special events
- Additional user-specified seasonality curves, such as hourly or quarterly
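"Additive" means the forecast is the sum of independent components. The sketch below is a toy model written for this note, not Prophet's code: it combines a linear trend, a weekly seasonality curve, and a holiday effect, with all function names and parameter values chosen purely for illustration.

```python
import math


def linear_trend(t, slope, intercept):
    """Straight-line growth over time."""
    return intercept + slope * t


def weekly_seasonality(t, amplitude):
    """One sine wave per 7-day period (Prophet itself uses a Fourier series)."""
    return amplitude * math.sin(2 * math.pi * t / 7)


def holiday_effect(t, holidays, bump):
    """Add a fixed bump on days listed as holidays, nothing otherwise."""
    return bump if t in holidays else 0.0


def additive_forecast(t, holidays=frozenset()):
    # The components are simply summed; each can be inspected on its own.
    return (linear_trend(t, slope=0.5, intercept=100.0)
            + weekly_seasonality(t, amplitude=10.0)
            + holiday_effect(t, holidays, bump=25.0))


forecast = [additive_forecast(t, holidays={7}) for t in range(14)]
print(forecast)
```

Because the components add up, the effect of the holiday on day 7 is exactly the size of the bump, regardless of where the trend or the weekly cycle happens to be on that day.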
