<a href="https://colab.research.google.com/github/Uzmamushtaque/CSCI-4967-Projects-in-ML-AI/blob/main/Lecture_21.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Lecture 21

### Topics for Today

1. Introduction to Time Series Analysis
2. Forecasting with Statistical Models
3. Time series forecasting with Deep Learning (Already Covered)
4. Forecasting at Scale: Prophet Model

## Introduction

A time series is a set of data points ordered in time.
The data points are usually equally spaced in time, for example recorded every minute, hour,
month, or quarter. Typical examples of time series include the closing value of a
stock, a household’s electricity consumption, or the temperature outside.

Components of a time series:

1. Trend: The slow-moving changes in a time series are called the trend. It is responsible for making the series gradually increase or decrease over time.

2. Seasonality: The seasonality component represents the seasonal pattern in the series. The cycles
occur repeatedly over a fixed period of time.

3. Residuals: This is what remains once the trend and seasonality are removed. It corresponds to random errors, also termed white noise.
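The three components can be illustrated with a small synthetic series. This is a minimal sketch: the slope, amplitude, 12-month period, and noise level are made-up values chosen only for illustration, and the components are assumed to combine additively.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_months = 48                       # four years of monthly observations (assumed)
t = np.arange(n_months)

trend = 0.5 * t                                   # slow, steady increase
seasonality = 10.0 * np.sin(2 * np.pi * t / 12)   # pattern repeating every 12 months
residuals = rng.normal(loc=0.0, scale=1.0, size=n_months)  # white noise

series = trend + seasonality + residuals          # additive combination
```

Plotting `series` next to `trend` and `seasonality` shows how each component contributes to the overall shape.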

## Important concepts

**Stationarity:** A stationary time series is one whose statistical properties do not change over time. In
other words, it has a constant mean, variance, and autocorrelation, and these properties
are independent of time.
Many forecasting models assume stationarity.

**Transformation:** A transformation is a mathematical operation applied to a time series in order to
make it stationary.
Differencing is a transformation that calculates the change from one timestep to
another. This transformation is useful for stabilizing the mean.
Applying a log function to the series can stabilize its variance.
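As a rough illustration of both transformations, the sketch below applies first-order differencing to a linear trend and a log transform to an exponentially growing series. The series themselves are hypothetical and noise-free so that the effect is easy to see.

```python
import numpy as np

t = np.arange(1, 11, dtype=float)

# A linear trend: the mean changes over time, so the series is not stationary.
linear = 3.0 * t + 2.0
diff_linear = np.diff(linear)          # first-order differencing: y[t] - y[t-1]

# Exponential growth: both the level and the size of the changes grow over time.
exponential = 2.0 * np.exp(0.3 * t)
log_exponential = np.log(exponential)  # log turns exponential growth into a line
diff_log = np.diff(log_exponential)    # differencing the log removes that line
```

Note that each round of differencing shortens the series by one observation.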

**Test for stationarity:** The augmented Dickey-Fuller (ADF) test helps us determine if a time series is stationary
by testing for the presence of a unit root. If a unit root is present, the time series
is not stationary.

**Autocorrelation function:**  The autocorrelation function (ACF) measures the linear relationship between lagged
values of a time series.
In other words, it measures the correlation of the time series with itself.
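To make the definition concrete, here is a minimal NumPy sketch of the sample autocorrelation. The helper name `sample_acf` is hypothetical; in practice a library routine such as `statsmodels`' `plot_acf` would typically be used instead.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Return the sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    acf = [1.0]                                   # lag 0 is always 1
    for k in range(1, max_lag + 1):
        # Correlate the series with a copy of itself shifted by k steps.
        acf.append(np.sum(centered[k:] * centered[:-k]) / denom)
    return np.array(acf)

# A pure sine wave with period 8 is strongly related to its own past.
t = np.arange(400)
wave = np.sin(2 * np.pi * t / 8)
acf_wave = sample_acf(wave, max_lag=8)
```

For this wave the autocorrelation is strongly negative at lag 4 (half a period) and strongly positive at lag 8 (a full period).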

**Partial Autocorrelation Function:** Partial autocorrelation measures the correlation between lagged values in a time
series when we remove the influence of correlated lagged values in between. We can
plot the partial autocorrelation function to determine the order of a stationary AR(p)
process. The coefficients will be non-significant after lag p.
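One way to compute the partial autocorrelation, sketched below, is by successive regressions: the value at lag k is the coefficient on the k-th lag when the series is regressed on its first k lags. The helper `sample_pacf` and the simulated AR(1) series with coefficient 0.6 are assumptions made for illustration; library implementations may use a different estimator.

```python
import numpy as np

def sample_pacf(x, max_lag):
    """Partial autocorrelation at lags 1..max_lag via successive OLS fits."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    pacf = []
    for k in range(1, max_lag + 1):
        target = x[k:]
        # Column j holds the series lagged by j + 1 steps.
        lags = np.column_stack([x[k - j - 1 : n - j - 1] for j in range(k)])
        design = np.column_stack([np.ones(n - k), lags])
        coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
        pacf.append(coefs[-1])          # coefficient on the longest lag
    return np.array(pacf)

# Simulate a stationary AR(1) process: y[t] = 0.6 * y[t-1] + noise[t].
rng = np.random.default_rng(seed=1)
n = 5000
phi = 0.6
noise = rng.normal(size=n)
ar1 = np.zeros(n)
for i in range(1, n):
    ar1[i] = phi * ar1[i - 1] + noise[i]

pacf_ar1 = sample_pacf(ar1, max_lag=4)   # entries correspond to lags 1..4
```

For an AR(1) process the first entry should be close to the AR coefficient, and the remaining entries close to zero.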

**Akaike Information Criterion (AIC):** The Akaike Information Criterion (AIC) estimates how well your model fits the data set without over-fitting it.

The AIC score rewards models that achieve a high goodness-of-fit score and penalizes them if they become overly complex.

By itself, the AIC score is not of much use unless it is compared with the AIC score of a competing model.

The model with the lower AIC score is expected to strike a superior balance between its ability to fit the data set and its ability to avoid over-fitting the data set.
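For reference, the standard definition of the score is shown below, where $k$ is the number of estimated parameters and $\hat{L}$ is the maximized value of the model's likelihood function:

$$
\mathrm{AIC} = 2k - 2\ln(\hat{L})
$$

The first term is the complexity penalty and the second rewards goodness of fit, which is why a lower value is better.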

## Forecasting with statistical models

### Moving average process

In a moving average (MA) process, the current value depends linearly on the mean of
the series, the current error term, and past error terms.
The moving average model is denoted as MA(q), where q is the order.

The order of the process can be determined from the ACF plot of the stationary series by looking for significant autocorrelation coefficients.
If there are no significant coefficients after lag 0, the stationary series is white noise; when that series was obtained by differencing, the original series is a random walk.
If, on the other hand, we see significant coefficients, we must check whether they
become abruptly non-significant after some lag q. If that is the case, then we know that
we have a moving average process of order q.
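The cutoff can be seen on simulated data. The sketch below generates an MA(2) process of the form $y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}$ with a zero mean and illustrative coefficients, then compares its sample ACF to an approximate 95% band of $\pm 1.96/\sqrt{n}$. The helper `sample_acf` is a hypothetical name, not a library function.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Return the sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(np.sum(centered[k:] * centered[:-k]) / denom)
    return np.array(acf)

rng = np.random.default_rng(seed=7)
n = 5000
theta1, theta2 = 0.9, 0.3          # illustrative MA(2) coefficients
eps = rng.normal(size=n + 2)       # two extra draws for the initial lags

# y_t = eps_t + theta1 * eps_{t-1} + theta2 * eps_{t-2}  (mean taken as 0)
ma2 = eps[2:] + theta1 * eps[1:-1] + theta2 * eps[:-2]

acf_ma2 = sample_acf(ma2, max_lag=6)
band = 1.96 / np.sqrt(n)           # approximate 95% significance band

for lag, value in enumerate(acf_ma2):
    flag = "significant" if abs(value) > band else "not significant"
    print(f"lag {lag}: {value:+.3f} ({flag})")
```

Lags 1 and 2 should stand out clearly, while later lags stay close to zero, which points to q = 2.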


### Auto Regressive process (AR)

An autoregressive process is a regression of a variable against itself. In a time
series, this means that the present value is linearly dependent on its past values.
The autoregressive process is denoted as AR(p), where p is the order. The order can be determined from the PACF plot: the coefficients become non-significant after lag p.
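Written out, an AR(p) process takes the following form, where $C$ is a constant, $\phi_1, \dots, \phi_p$ are the coefficients on the past values, and $\epsilon_t$ is the current error term (the symbol names are a notational choice, not fixed by the lecture):

$$
y_t = C + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t
$$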

### Auto Regressive Moving Average process (ARMA)

The autoregressive moving average process is a combination of the autoregressive
process and the moving average process.
It is denoted as ARMA(p,q), where p is the order of the autoregressive process, and
q is the order of the moving average process.
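Unlike a pure AR(p) or MA(q) process, the orders p and q cannot be read directly off the ACF and PACF plots, because neither plot cuts off abruptly.
If your process is stationary and both the ACF and PACF plots show a decaying or sinusoidal
pattern, then it is a stationary ARMA(p,q) process. In that case, a common approach is to fit several candidate (p,q) combinations and keep the one with the lowest AIC.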
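In equation form, an ARMA(p,q) process can be written as follows, with $C$ a constant, $\phi_i$ the autoregressive coefficients, $\theta_j$ the moving average coefficients, and $\epsilon_t$ the error terms (the notation is an assumption made here for illustration):

$$
y_t = C + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}
$$

Setting q = 0 leaves a pure AR(p) process, and setting p = 0 leaves a pure MA(q) process.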

### Auto Regressive Integrated Moving Average (ARIMA)

An autoregressive integrated moving average (ARIMA) process is the combination of
the AR(p) and MA(q) processes, but in terms of the differenced series.
It is denoted as ARIMA(p,d,q), where p is the order of the AR(p) process, d is the order
of integration, and q is the order of the MA(q) process.
Integration is the reverse of differencing, and the order of integration d is equal to the
number of times the series has been differenced to be rendered stationary.
A time series that can be rendered stationary by applying differencing is said to be an
integrated series.


[Implementation](https://github.com/marcopeix/TimeSeriesForecastingInPython/blob/master/CH07/CH07.ipynb)

There are other variants of the ARIMA model, such as SARIMA and SARIMAX.
Read more about these here: [Link](https://machinelearningmastery.com/sarima-for-time-series-forecasting-in-python/)

Deep learning models (RNN, LSTM, GRU, etc.) are used when we have large, complex datasets. In those situations, deep
learning can leverage all the available data to infer relationships between each feature
and the target, usually resulting in good forecasts.

These have been covered in previous lectures.

## Automated Forecasting Libraries

The data science community and companies have developed many libraries to automate
the forecasting process and make it easier. Some of the most popular libraries
and their websites are listed here:

1. Pmdarima—http://alkaline-ml.com/pmdarima/modules/classes.html

2. Prophet—https://facebook.github.io/prophet

3. NeuralProphet—https://neuralprophet.com/html/index.html

4. PyTorch Forecasting—https://pytorch-forecasting.readthedocs.io/en/stable

Prophet is an open source package from Meta Open Source, meaning that it is built
and maintained by Meta. This library was built specifically for business forecasting at
scale. It arose from the internal need at Facebook to produce accurate forecasts quickly,
and the library was then made freely available. Prophet is arguably the best-known forecasting
library in the industry, as it can fit nonlinear trends and combine the effect of
multiple seasonalities.

NeuralProphet builds on the Prophet library to automate the use of hybrid models
for time series forecasting.

[Paper 1](https://arxiv.org/pdf/2111.15397.pdf)

## Exploring Prophet

Under the hood, Prophet implements a general additive model where each time
series y(t) is modeled as the linear combination of a trend g(t), a seasonal component
s(t), holiday effects h(t), and an error term ϵt, which is normally distributed.

The trend component models the non-periodic long-term changes in the time series.
The seasonal component models the periodic change, whether it is yearly, monthly, weekly, or daily. The holiday effect occurs irregularly and potentially on more than
one day. Finally, the error term represents any change in value that cannot be
explained by the previous three components.
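Using the symbols introduced above, the additive model can be written compactly as:

$$
y(t) = g(t) + s(t) + h(t) + \epsilon_t
$$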

Notice that this model does not take into account the time dependence of the
data, unlike the ARIMA(p,d,q) model, where future values are dependent on past values.
Thus, this process is closer to fitting a curve to the data, rather than finding the
underlying process. Although there is some loss of predictive information using this
method, it comes with the advantage that it is very flexible, since it can accommodate
multiple seasonal periods and changing trends. Also, it is robust to outliers and missing
data, which is a clear advantage in a business context.
The inclusion of multiple seasonal periods was motivated by the observation that
human behavior produces multi-period seasonal time series. For example, the five-day
work week can produce a pattern that repeats every week, while school breaks can
produce a pattern that repeats every year. Thus, to take multiple seasonal periods into
account, Prophet uses Fourier series to model multiple periodic effects.
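The idea can be sketched with NumPy. A seasonal effect with period $P$ is approximated by a truncated Fourier series, $s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)$, and several periods are handled by stacking one block of terms per period. The helper `fourier_features`, the synthetic data, and the orders used are illustrative assumptions; this is ordinary least squares, not Prophet's actual fitting procedure.

```python
import numpy as np

def fourier_features(t, period, order):
    """Columns cos(2*pi*n*t/period), sin(2*pi*n*t/period) for n = 1..order."""
    t = np.asarray(t, dtype=float)
    columns = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.cos(angle))
        columns.append(np.sin(angle))
    return np.column_stack(columns)

rng = np.random.default_rng(seed=5)
days = np.arange(2 * 365)

# Hypothetical daily series with a weekly and a yearly pattern plus noise.
weekly = 2.0 * np.sin(2 * np.pi * days / 7)
yearly = 5.0 * np.cos(2 * np.pi * days / 365.25)
observed = weekly + yearly + rng.normal(scale=0.5, size=days.size)

# One block of Fourier terms per seasonal period, fitted jointly by least squares.
design = np.column_stack([
    np.ones(days.size),
    fourier_features(days, period=7, order=3),
    fourier_features(days, period=365.25, order=4),
])
coefs, *_ = np.linalg.lstsq(design, observed, rcond=None)
fitted = design @ coefs
```

A higher order lets the seasonal curve change more quickly, at the cost of a greater risk of fitting noise.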

Finally, this model allows us to consider the effect of holidays. Holidays are irregular
events that can have a clear impact on a time series. For example, events such as
Black Friday in the United States can dramatically increase the attendance in stores or
the sales on an ecommerce website.

[Paper 2](https://peerj.com/preprints/3190/)

[Implementation](https://github.com/marcopeix/TimeSeriesForecastingInPython/blob/master/CH19/CH19.ipynb)

### Transformers in Time Series Analysis

This is a relatively novel research area, because transformers have not generally been found to be effective in time series forecasting problems.

[Link](https://medium.com/intel-tech/how-to-apply-transformers-to-time-series-models-spacetimeformer-e452f2825d2e)

[Paper 3](https://arxiv.org/pdf/2202.07125.pdf)

### Reference for the lecture
*Time Series Forecasting in Python* by Marco Peixeiro
