## Most used Time Series Methodology

### 1. Moving Average Methodology
One of the most commonly used time series methods is the `Moving Average (MA)`. It smooths out random short-term variations, which makes the longer-term components of the time series, such as the trend, easier to see.

**The Moving Average (MA) or Rolling Mean**: The MA value is calculated by averaging the values of the time series over the most recent $k$ periods.

Let's see the types of moving averages:
- Simple Moving Average (SMA)
- Weighted Moving Average (WMA)
- Exponential Moving Average (EMA)
- Cumulative Moving Average (CMA)

#### 1. Simple Moving Average (SMA)
The Simple Moving Average (SMA) is the most basic type of moving average. It is calculated by taking the arithmetic mean of a given set of values over a specified number of periods. The formula for the SMA is:

$$ \text{SMA} = \frac{1}{n} \sum_{i=1}^{n} Y_{t-i+1} $$

Where:
- $n$ is the number of periods
- $Y_{t-i+1}$ is the value at time $t-i+1$

The SMA assigns equal weight to each value in the calculation period.
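As a minimal sketch of the SMA formula above, the following NumPy snippet averages every full window of $n$ values. The function name `simple_moving_average` and the sample data are illustrative assumptions, not part of any library.

```python
import numpy as np

def simple_moving_average(values, n):
    """Return the n-period SMA for every full window in `values`."""
    values = np.asarray(values, dtype=float)
    if n < 1 or n > len(values):
        raise ValueError("n must be between 1 and len(values)")
    # Convolving with a length-n kernel of 1/n gives each value equal weight.
    return np.convolve(values, np.ones(n) / n, mode="valid")

series = [2, 4, 6, 8, 10]  # made-up data
sma = simple_moving_average(series, n=3)
print(sma)  # averages of (2, 4, 6), (4, 6, 8) and (6, 8, 10)
```

The result has `len(values) - n + 1` entries because the first `n - 1` positions do not have a full window yet.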

#### 2. Weighted Moving Average (WMA)
The Weighted Moving Average (WMA) assigns different weights to each value in the calculation period, with more recent values typically given higher weights. The formula for the WMA is:

$$ \text{WMA} = \frac{\sum_{i=1}^{n} w_i Y_{t-i+1}}{\sum_{i=1}^{n} w_i} $$

Where:
- $w_i$ is the weight assigned to the value $Y_{t-i+1}$
- $\sum_{i=1}^{n} w_i$ is the sum of the weights

When the weights favor recent values, the WMA responds more quickly to recent changes in the data than an SMA over the same number of periods.
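The WMA formula can be sketched in the same way. In this illustrative snippet, `weights[0]` plays the role of $w_1$ and is applied to the most recent value $Y_t$; the function name and the weights `[3, 2, 1]` are assumptions chosen for the example.

```python
import numpy as np

def weighted_moving_average(values, weights):
    """WMA over full windows; weights[0] applies to the most recent value."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = len(weights)
    out = []
    for t in range(n - 1, len(values)):
        # Reverse the window so it reads Y_t, Y_{t-1}, ..., Y_{t-n+1}.
        window = values[t - n + 1 : t + 1][::-1]
        out.append(np.dot(weights, window) / weights.sum())
    return np.array(out)

series = [10, 20, 30, 40]  # made-up data
wma = weighted_moving_average(series, weights=[3, 2, 1])
print(wma)
```

With equal weights the function reduces to the SMA, which is a quick sanity check on the implementation.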

#### 3. Exponential Moving Average (EMA)
The Exponential Moving Average (EMA) is a type of moving average that places greater weight on more recent values. It is calculated using a smoothing factor, which determines the rate at which older data points are exponentially decreased. The formula for the EMA is:

$$ \text{EMA}_t = \alpha Y_t + (1 - \alpha) \text{EMA}_{t-1} $$

Where:
- $\alpha$ is the smoothing factor ($0 < \alpha \le 1$), commonly set to $\alpha = \frac{2}{n+1}$ for an $n$-period EMA
- $Y_t$ is the current value
- $\text{EMA}_{t-1}$ is the EMA of the previous period

The EMA reacts more quickly to recent changes than an SMA over the same number of periods.
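The recursion above can be sketched in plain Python. The formula does not say how to start the recursion, so seeding the EMA with the first observation is an assumption made here; the function name is also illustrative.

```python
def exponential_moving_average(values, n):
    """EMA with smoothing factor alpha = 2 / (n + 1)."""
    alpha = 2 / (n + 1)
    ema = [float(values[0])]  # assumption: seed with the first observation
    for y in values[1:]:
        # EMA_t = alpha * Y_t + (1 - alpha) * EMA_{t-1}
        ema.append(alpha * y + (1 - alpha) * ema[-1])
    return ema

series = [10, 20, 30]  # made-up data
ema = exponential_moving_average(series, n=3)  # alpha = 0.5
print(ema)  # [10.0, 15.0, 22.5]
```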

#### 4. Cumulative Moving Average (CMA)
The Cumulative Moving Average (CMA) is the average of all values in a time series up to the current time. It is updated as each new value is added to the series. The formula for the CMA is:

$$ \text{CMA}_t = \frac{\sum_{i=1}^{t} Y_i}{t} $$

Where:
- $\sum_{i=1}^{t} Y_i$ is the sum of all values up to time $t$
- $t$ is the current time period

The CMA provides a long-term average of the time series and is less sensitive to recent changes compared to other moving averages.
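A short NumPy sketch of the CMA formula: a running sum divided by the number of values seen so far. The function name and data are illustrative assumptions.

```python
import numpy as np

def cumulative_moving_average(values):
    """CMA_t = (Y_1 + ... + Y_t) / t for every t."""
    values = np.asarray(values, dtype=float)
    counts = np.arange(1, len(values) + 1)  # t = 1, 2, ..., len(values)
    return np.cumsum(values) / counts

series = [2, 4, 6, 8]  # made-up data
cma = cumulative_moving_average(series)
print(cma)  # [2. 3. 4. 5.]
```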

## Applications of Moving Averages

Moving averages are used in various applications, including:

- **Trend Analysis**: Identifying and confirming long-term trends.
- **Smoothing**: Reducing short-term fluctuations to better understand the underlying pattern.
- **Forecasting**: Predicting future values based on past data.
- **Technical Analysis**: Used in financial markets to identify buy and sell signals.

By understanding the different types of moving averages and their applications, analysts can choose the most appropriate method for their specific needs.


<Br>


### Auto-Correlation Function (ACF)
The ACF measures the degree of similarity between a time series and a lagged version of itself, across the lags being examined. At lag 1, for example, it is the correlation between each value and the value immediately before it.

The Python `statsmodels` library can calculate the autocorrelation. The ACF helps reveal trends in a dataset and how strongly previously observed values influence the currently observed values.
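To make the arithmetic behind the ACF visible, here is a minimal NumPy sketch of the sample autocorrelation. It is written by hand rather than with statsmodels, and the function name `sample_acf` and the alternating test series are assumptions for illustration.

```python
import numpy as np

def sample_acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()          # work with deviations from the mean
    denom = np.dot(x, x)      # lag-0 sum of squares, so acf[0] == 1
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

series = [1, -1, 1, -1, 1, -1, 1, -1]  # made-up alternating series
acf = sample_acf(series, max_lag=2)
print(acf)  # strongly negative at lag 1, strongly positive at lag 2
```

An alternating series is anti-correlated with itself one step back and positively correlated two steps back, which is what the output shows.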

### Partial Auto-Correlation (PACF)
The PACF is similar to the Auto-Correlation Function but a little more challenging to understand. It shows the correlation between the series and a lagged version of itself at a given lag, keeping only the direct effect of that lag: the effects of all intermediate (shorter) lags are removed.
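One way to see what "removing the intermediate effects" means is to estimate the PACF by regression: the partial autocorrelation at lag $k$ is the last coefficient when the series is regressed on its first $k$ lags. The sketch below does this with NumPy least squares; the function name `sample_pacf_ols` and the simulated AR(1) data (coefficient 0.7) are assumptions for illustration.

```python
import numpy as np

def sample_pacf_ols(values, max_lag):
    """PACF at lag k = last OLS coefficient of x_t on x_{t-1}, ..., x_{t-k}."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    pacf = [1.0]
    for k in range(1, max_lag + 1):
        # Column j holds the series lagged by j steps (j = 1..k).
        lags = np.column_stack([x[k - j : len(x) - j] for j in range(1, k + 1)])
        coefs, *_ = np.linalg.lstsq(lags, x[k:], rcond=None)
        pacf.append(coefs[-1])
    return np.array(pacf)

# Simulated AR(1) series: only lag 1 has a direct effect.
rng = np.random.default_rng(0)
noise = rng.normal(size=2000)
ar1 = np.zeros(2000)
for t in range(1, 2000):
    ar1[t] = 0.7 * ar1[t - 1] + noise[t]

pacf = sample_pacf_ols(ar1, max_lag=3)
print(pacf.round(2))  # lag 1 near 0.7, higher lags near 0
```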

| ACF                       | PACF                       | Suggested model      |
|---------------------------|----------------------------|----------------------|
| Plot declines gradually   | Plot drops instantly       | Autoregressive (AR)  |
| Plot drops instantly      | Plot declines gradually    | Moving Average (MA)  |
| Plot declines gradually   | Plot declines gradually    | ARMA                 |
| Plot drops instantly      | Plot drops instantly       | No model             |


### Importance of Autocorrelation and Partial Autocorrelation in Time Series Models

#### AutoRegressive (AR) Model
- **Autocorrelation Function (ACF)**: Measures the correlation between observations at different lags. The AR model captures the dependency between an observation and a number of lagged observations, so for a stationary AR series the ACF typically decays exponentially or shows damped sinusoidal behavior.
- **Partial Autocorrelation Function (PACF)**: Measures the correlation between observations at different lags, excluding the influence of intermediate lags. For an AR model of order $p$, the PACF shows significant spikes up to lag $p$ and then cuts off to approximately zero.

#### Moving Average (MA) Model
- **Autocorrelation Function (ACF)**: For an MA model, the ACF cuts off after the lag order $ q $, showing no significant correlations beyond this point. This helps in identifying the order $q$ of the MA model.
- **Partial Autocorrelation Function (PACF)**: The PACF for an MA model typically decays exponentially or shows damped sinusoidal behavior, which is the opposite of the pattern seen in the AR model.

#### AutoRegressive Moving Average (ARMA) Model
- **Autocorrelation Function (ACF)**:  The combined ARMA model will show a mix of behaviors from both AR and MA models, making it more complex to interpret. However, the ACF can still provide useful information about the underlying process.
- **Partial Autocorrelation Function (PACF)**: Similarly, the PACF will also reflect a combination of AR and MA characteristics, but it remains useful for model identification and diagnostics.

#### AutoRegressive Integrated Moving Average (ARIMA) Model
- **Autocorrelation Function (ACF)**: For non-stationary series, ACF can help identify the order of differencing needed to achieve stationarity. After differencing, the ACF can be used to identify the ARMA components.
- **Partial Autocorrelation Function (PACF)**: The PACF is also used post-differencing to identify the ARMA components of the model.

#### Summary
- **AR Model**: PACF is important for determining the order $p$.
- **MA Model**: ACF is important for determining the order $q$.
- **ARMA/ARIMA Model**: Both ACF and PACF are crucial for identifying the appropriate orders of $p$, $d$, and $q$.

Understanding ACF and PACF is essential for building and diagnosing these time series models effectively.


<br>

### 2. AutoRegressive (AR) Model
The AutoRegressive (AR) model is a type of random process that is used to describe certain time-varying processes in nature, economics, etc. The AR model specifies that the output variable depends linearly on its own previous values and a stochastic term (a random disturbance).

An autoregressive model is a simple model that predicts future values from past values. It is mainly used for forecasting when the values in a time series are correlated with the values that precede them.

An AR model is a linear regression model that uses lagged values of the series as inputs. Once the lagged inputs are constructed, the regression can be built with the scikit-learn library. The statsmodels library also provides autoregression-specific functionality, where you specify an appropriate lag value and train the model. It is provided in the `AutoReg` class, which gives results in a few simple steps:

- Create the model with `AutoReg()`.
- Call `fit()` to train it on the dataset.
- `fit()` returns an `AutoRegResults` object.
- Once fit, make a prediction by calling the `predict()` function.

The general form of an AR model of order $p$, written AR($p$), is given by:

$$Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \epsilon_t $$

Where:
- $ Y_t $ is the value at time $ t $
- $ c $ is a constant
- $ \phi_1, \phi_2, \ldots, \phi_p $ are parameters of the model
- $\epsilon_t $ is white noise (error term)
- $p$ is the order of the model, i.e. the number of past (lagged) values used
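Because an AR model is a linear regression on lagged values, the AR($p$) equation above can be fitted with ordinary least squares. The following is a NumPy sketch under that assumption, not the statsmodels `AutoReg` implementation; the helper names `fit_ar` and `forecast_ar` and the simulated data are hypothetical.

```python
import numpy as np

def fit_ar(values, p):
    """Estimate c and phi_1..phi_p by least squares on lagged values."""
    y = np.asarray(values, dtype=float)
    # Design matrix: intercept column, then Y_{t-1}, ..., Y_{t-p}.
    columns = [np.ones(len(y) - p)]
    columns += [y[p - j : len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack(columns)
    coefs, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coefs[0], coefs[1:]

def forecast_ar(history, c, phi, steps):
    """Iterate the AR equation forward, setting future noise to zero."""
    hist = list(history)
    out = []
    for _ in range(steps):
        lagged = hist[::-1][: len(phi)]  # Y_{t-1}, Y_{t-2}, ...
        nxt = c + float(np.dot(phi, lagged))
        out.append(nxt)
        hist.append(nxt)
    return out

# Simulated AR(2) data with c = 1.0, phi = (0.6, -0.3).
rng = np.random.default_rng(42)
n = 2000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 1.0 + 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

c_hat, phi_hat = fit_ar(y, p=2)
print(round(c_hat, 2), phi_hat.round(2))   # close to 1.0 and (0.6, -0.3)
print(forecast_ar(y, c_hat, phi_hat, steps=3))
```

Multi-step forecasts feed each prediction back in as a lagged value, so they drift toward the series mean as the horizon grows.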

### 3. AutoRegressive Moving Average (ARMA) Model
The AutoRegressive Moving Average (ARMA) model is a combination of the AR model and the Moving Average (MA) model. It is used for understanding and predicting future values in a time series.

The model describes a weakly stationary stochastic process in terms of two polynomials, one for the autoregressive part and one for the moving average part.

The general form of an ARMA model is given by:

$$ Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q} + \epsilon_t $$

Where:
- $ Y_t $ is the value at time $ t $
- $ c $ is a constant/intercept
- $ \phi_1, \phi_2, \ldots, \phi_p $ are parameters of the AR part of the model
- $ \theta_1, \theta_2, \ldots, \theta_q $ are parameters of the MA part of the model
- $ \epsilon_t $ is white noise (error term)
- $p$ is the number of past values used (AR order) and $q$ is the number of past error terms used (MA order)
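To see how the AR and MA terms combine, the sketch below simulates a series directly from the ARMA equation above. The function name `simulate_arma`, the parameter values, and the choice to treat values before the start of the series as zero are all assumptions for illustration.

```python
import numpy as np

def simulate_arma(c, phi, theta, n, seed=0):
    """Generate n values of an ARMA(p, q) process with standard normal noise."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    eps = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(n):
        # AR part: weighted past values (terms before the start are skipped).
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        # MA part: weighted past error terms.
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y[t] = c + ar + ma + eps[t]
    return y, eps

y, eps = simulate_arma(c=0.5, phi=[0.6], theta=[0.4], n=50)
print(y[:5].round(3))
```

Setting `theta=[]` gives a pure AR process and `phi=[]` gives a pure MA process, which mirrors how ARMA contains both models as special cases.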

### 4. AutoRegressive Integrated Moving Average (ARIMA) Model
The AutoRegressive Integrated Moving Average (ARIMA) model is a generalization of the ARMA model. It is used when the time series data shows evidence of non-stationarity, and an initial differencing step (corresponding to the "integrated" part of the model) can be applied to remove the non-stationarity.

- AR ==> Uses past values to predict the future.
- MA ==> Uses past error terms in the given series to predict the future.
- I ==> Differences the observations to make the data stationary.

**AR+I+MA= ARIMA**

The general form of an ARIMA model is denoted as ARIMA(p, d, q) where:
- $ p $ is the order of the AR part
- $d$ is the degree of differencing (the number of times the series is differenced)
- $ q $ is the order of the MA part

The ARIMA model can be written as:

$$ (1 - \sum_{i=1}^{p} \phi_i L^i) (1 - L)^d Y_t = c + (1 + \sum_{i=1}^{q} \theta_i L^i) \epsilon_t $$

Where:
- $ L $ is the lag operator
- $ Y_t $ is the value at time $ t $
- $ c $ is a constant
- $ \phi_i $ are the parameters of the AR part of the model
- $ \theta_i $ are the parameters of the MA part of the model
- $ \epsilon_t $ is white noise (error term)
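The factor $(1 - L)^d$ in the equation above is plain differencing applied $d$ times. A small NumPy sketch with made-up data that has a quadratic trend:

```python
import numpy as np

y = np.array([0.0, 1.0, 4.0, 9.0, 16.0])  # made-up series: t squared

d1 = np.diff(y, n=1)  # (1 - L) Y_t: still trending
d2 = np.diff(y, n=2)  # (1 - L)^2 Y_t: constant, the trend is gone
print(d1)  # [1. 3. 5. 7.]
print(d2)  # [2. 2. 2.]

# Differencing is reversible given the starting value: a cumulative sum
# of the first differences, offset by y[0], restores the original series.
restored = np.concatenate([[y[0]], y[0] + np.cumsum(d1)])
print(restored)
```

Each round of differencing shortens the series by one value, and forecasts made on the differenced scale must be cumulatively summed to return to the original scale.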

### Implementation Steps for ARIMA
- Plot the time series.
- Difference the series to make it stationary in the mean by removing the trend.
- Apply a log transform to stabilize the variance.
- Difference the log-transformed series so that it is stationary in both mean and variance.
- Plot the ACF and PACF, and identify candidate AR and MA orders.
- Find the best-fitting ARIMA model.
- Forecast values using the best-fitting ARIMA model.
- Plot the ACF and PACF of the residuals of the ARIMA model, and check that no information is left in them.
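The steps above can be sketched end to end on synthetic data. To stay self-contained this example makes several simplifying assumptions: it skips the plots and the model search, fixes the model at AR(1) on the once-differenced log series (roughly an ARIMA(1, 1, 0) on the log scale), and fits it by least squares with NumPy instead of a dedicated ARIMA routine. All variable names and data are illustrative.

```python
import numpy as np

# Made-up series with exponential growth, so mean and variance both rise.
rng = np.random.default_rng(1)
t = np.arange(200)
series = np.exp(0.02 * t + 0.05 * rng.normal(size=t.size).cumsum())

log_series = np.log(series)   # log transform to stabilize the variance
diffed = np.diff(log_series)  # d = 1: remove the trend in the mean

# Fit AR(1) with an intercept on the differenced log series.
X = np.column_stack([np.ones(diffed.size - 1), diffed[:-1]])
coefs = np.linalg.lstsq(X, diffed[1:], rcond=None)[0]
c, phi = coefs

# Forecast h steps on the differenced scale.
h = 5
last = diffed[-1]
diff_forecast = []
for _ in range(h):
    last = c + phi * last
    diff_forecast.append(last)

# Undo the differencing (cumulative sum) and then the log (exponential).
log_forecast = log_series[-1] + np.cumsum(diff_forecast)
forecast = np.exp(log_forecast)
print(forecast)

# Residuals of the fitted model; their ACF should show no leftover structure.
residuals = diffed[1:] - (c + phi * diffed[:-1])
print(residuals.mean())
```

In practice the AR and MA orders would be chosen from the ACF and PACF plots or by comparing candidate models, rather than fixed in advance as they are here.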


## Key Points
- **AutoRegressive (AR) Model**: Uses the dependency between an observation and a number of lagged observations.
- **AutoRegressive Moving Average (ARMA) Model**: Combines AR and MA models; useful for stationary time series.
- **AutoRegressive Integrated Moving Average (ARIMA) Model**: Extends ARMA to non-stationary series by including differencing.

## Applications
These models are widely used in various fields such as economics, finance, environmental science, and engineering for forecasting and analyzing time series data.

- **AR Model**: Used when the data shows a clear correlation with past values.
- **ARMA Model**: Used for stationary time series where past values and past forecast errors are used.
- **ARIMA Model**: Used for non-stationary time series which requires differencing to make the series stationary.

By understanding these models, analysts and researchers can better analyze and predict time series data.


<br><br>

## Use of Deep Learning for Time-series Analysis
In recent years, the use of Deep Learning for Time Series Analysis and Forecasting has grown, aimed at problems that classical Machine Learning techniques could not handle. Let’s discuss this briefly.

![image.png](attachment:c3c988ae-e74e-4b63-9ef6-79f9c7c83e92.png)

### Recurrent Neural Networks (RNN) for Time Series Forecasting

The RNN is the most traditional and widely accepted architecture for time series forecasting problems. It is organized into successive layers, divided into:

- **Input**
- **Hidden**
- **Output**

The same weights are used at every time step, and each unit is tied to a fixed time step. The input and output at a given time step are fully connected to the hidden layer at that same time step, and the hidden layers are linked forward in time, so each hidden state depends on the previous one.

![image.png](attachment:cd24e19e-7df0-4f85-acbf-a71fcf8845e4.png)


#### Components of RNN
- **Input:** The vector $x(t)$ is the input at time step $t$.
- **Hidden:** The vector $h(t)$ is the hidden state at time $t$:
  - This acts as a kind of memory for the network.
  - It is calculated from the current input $x(t)$ and the previous time step’s hidden state $h(t-1)$.
- **Output:** The vector $y(t)$ is the output at time step $t$.

#### Weights
In an RNN, the input vector at time $t$ is connected to the hidden layer neurons by a weight matrix $U$. The hidden layer neurons at time $t-1$ are connected to those at time $t$ (and those at time $t$ to those at $t+1$) by a weight matrix $W$. The hidden layer is then connected to the output vector $y(t)$ at time $t$ by a weight matrix $V$. All three weight matrices, $U$, $W$, and $V$, are constant across time steps.
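The role of the three weight matrices can be sketched as a forward pass in NumPy. The text above does not name an activation function or an initial hidden state, so the `tanh` activation, the zero initial state, and the omission of bias terms are assumptions; the function name `rnn_forward` and the layer sizes are illustrative.

```python
import numpy as np

def rnn_forward(xs, U, W, V):
    """Run a vanilla RNN over a sequence, reusing U, W, V at every step."""
    h = np.zeros(W.shape[0])  # assumption: initial hidden state is all zeros
    hs, ys = [], []
    for x in xs:
        # h(t) depends on the current input x(t) and the previous state h(t-1).
        h = np.tanh(U @ x + W @ h)
        hs.append(h)
        ys.append(V @ h)      # y(t) is read out from the hidden state
    return np.array(hs), np.array(ys)

rng = np.random.default_rng(0)
U = rng.normal(size=(5, 3))   # input (3) -> hidden (5)
W = rng.normal(size=(5, 5))   # hidden (5) -> hidden (5)
V = rng.normal(size=(2, 5))   # hidden (5) -> output (2)
xs = rng.normal(size=(4, 3))  # made-up sequence of 4 time steps

hs, ys = rnn_forward(xs, U, W, V)
print(hs.shape, ys.shape)  # (4, 5) (4, 2)
```

Because the loop reuses the same `U`, `W`, and `V` at every step, the number of parameters does not grow with the length of the sequence.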

#### Advantages of RNN
- Its hidden state carries information from earlier time steps, which makes it useful for time series prediction.
- Well suited to learning complex patterns from the input time series dataset.
- Fast at prediction/forecasting once trained.
- Can be fairly tolerant of missing values, so the data cleansing process can sometimes be limited.

#### Disadvantages of RNN
- Training is the main challenge, for example because of vanishing or exploding gradients on long sequences.
- Computation is expensive.


### LSTM (Long Short-Term Memory)

LSTM is a kind of Recurrent Neural Network (RNN) that is good at handling sequence data. It’s widely used in machine translation and speech recognition. Compared with vanilla RNN models, which are bad at remembering long sequences, LSTM adds three special gates to each of its cells so that it can retain both long-term and short-term memory.

![image.png](attachment:921ae6dd-794c-4ba8-85ea-b216088c7c2e.png)

*The structure of each LSTM cell (source)*

![image.png](attachment:1bcecc7a-51da-43b9-81ad-acaacb6594f1.png)

*Notation in the previous image (source)*
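As a rough sketch of the three gates, the snippet below computes one step of a standard LSTM cell in NumPy. The parameter names (`Wf`, `Wi`, `Wo`, `Wg` and the matching biases), the dictionary layout, and the sizes are hypothetical choices for this example and may not match the notation in the figures.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM cell step: returns the new hidden state and cell state."""
    z = np.concatenate([h_prev, x])                # previous state + input
    f = sigmoid(params["Wf"] @ z + params["bf"])   # forget gate
    i = sigmoid(params["Wi"] @ z + params["bi"])   # input gate
    o = sigmoid(params["Wo"] @ z + params["bo"])   # output gate
    g = np.tanh(params["Wg"] @ z + params["bg"])   # candidate cell values
    c = f * c_prev + i * g    # long-term memory: keep some old, add some new
    h = o * np.tanh(c)        # short-term memory / output of the cell
    return h, c

hidden, inputs = 2, 3
# All-zero parameters make every gate output exactly 0.5.
params = {name: np.zeros((hidden, hidden + inputs)) for name in ("Wf", "Wi", "Wo", "Wg")}
params.update({name: np.zeros(hidden) for name in ("bf", "bi", "bo", "bg")})

h, c = lstm_step(np.ones(inputs), np.zeros(hidden), np.ones(hidden), params)
print(h, c)
```

The cell state `c` is only ever scaled by the forget gate and added to, which is what lets information persist over many time steps.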
