# Time Series Analysis

Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed in such a way as to test theories that the current values of one or more independent time series affect the current value of another time series, this type of analysis of time series is not called "time series analysis", which focuses on comparing values of a single time series or multiple dependent time series at different points in time.

----------------------------------------------------------------------------------------------------------------------

## Overview of Framework:

##### 1. Visualize the time-series
It is essential to analyze the trends prior to building any kind of time series model. The details we are interested in pertain to any kind of trend, seasonality or random behaviour in the series. We have covered this part in the second part of this series.
##### 2. Stationarize the series
Once we know the patterns, trends, cycles and seasonality, we can check whether the series is stationary. The Dickey-Fuller test is one of the popular tests for this. We have covered this test in the first part of this article series. It doesn't end there: what if the series is found to be non-stationary? There are three commonly used techniques to make a time series stationary: detrending, differencing and transformation.
##### 3. Identify the optimal parameters
The parameters p, d, q can be found using ACF and PACF plots. In addition, if both the ACF and the PACF decrease gradually, it indicates that we need to make the time series stationary and introduce a value for "d".
##### 4. Build the Models
With the parameters in hand, we can now try to build an ARIMA model. The values found in the previous step might only be approximate estimates, so we need to explore more (p,d,q) combinations. The one with the lowest BIC and AIC should be our choice. We can also try some models with a seasonal component, in case we notice any seasonality in the ACF/PACF plots.
##### 5. Make the prediction
Once we have the final ARIMA model, we are ready to make predictions for future time points. We can also visualize the trends to cross-validate that the model works fine.

----------------------------------------------------------------------------------------------------------------------

## Basic Time Series Concepts:

### [1] Time Series Statistical Models - Trends >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

#### White Noise (Total Random Trend)

" w(t) - Uncorrelated random variables, with mean 0 and variance a^2 "
" w(t) ~ iid N(0, a^2) : Gaussian White Noise "

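As a minimal sketch, Gaussian white noise can be simulated with Python's standard library. The function name `gaussian_white_noise`, the seed and the sample size are illustrative assumptions, not part of the notes above.

```python
import random
import statistics

def gaussian_white_noise(n, a=1.0, seed=0):
    """Return n iid draws w(t) ~ N(0, a^2)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [rng.gauss(0.0, a) for _ in range(n)]

w = gaussian_white_noise(500, a=1.0)
# The sample mean should be near 0 and the sample variance near a^2.
print(statistics.fmean(w), statistics.pvariance(w))
```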

#### Moving Averages and Filtering (Smooth Trend)

" v(t) - moving average over a range of values to smooth the series "
" v(t) = 1/3 * (w(t-1) + w(t) + w(t+1)) "

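The three-point filter can be sketched as follows. Dropping the first and last points (which have no full window) is one possible edge treatment and is an assumption here, as is the helper name `moving_average_3`.

```python
def moving_average_3(w):
    """Centered 3-point moving average: v(t) = (w(t-1) + w(t) + w(t+1)) / 3.

    The two end points have no complete window and are dropped.
    """
    return [(w[t - 1] + w[t] + w[t + 1]) / 3 for t in range(1, len(w) - 1)]

w = [3, 6, 9, 12, 3]
v = moving_average_3(w)
print(v)  # [6.0, 9.0, 8.0]
```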

#### Autoregressions (Smooth Trend)

" x(t) - a regression of the current value on the past n values (here n = 2) "
" x(t) = x(t-1) - 0.9 x(t-2) + w(t) "

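A hedged sketch of the second-order autoregression, driven by a supplied noise sequence. The start-up values x(-1) = x(-2) = 0 and the helper name `ar2_from_noise` are assumptions made for illustration.

```python
def ar2_from_noise(w, phi1=1.0, phi2=-0.9):
    """Build x(t) = phi1 * x(t-1) + phi2 * x(t-2) + w(t) from a noise list w.

    Assumes x(-1) = x(-2) = 0 to start the recursion.
    """
    x_prev, x_prev2 = 0.0, 0.0
    x = []
    for w_t in w:
        x_t = phi1 * x_prev + phi2 * x_prev2 + w_t
        x.append(x_t)
        x_prev2, x_prev = x_prev, x_t
    return x

# Response to a single unit shock: the series oscillates instead of dying out at once.
impulse = ar2_from_noise([1.0, 0.0, 0.0, 0.0, 0.0])
print(impulse)
```

Feeding Gaussian white noise into the same function gives a simulated path of the model.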

#### Random Walk

" X(t) = X(t-1) + Er(t) " # Er(t) = white-noise error term, written w(t) elsewhere

#### Random Walk with Drift (Global Trend)

" X(t) - A random walk model with a drift, if drift = 0, it is simply a random walk "
" X(t) = d + X(t-1) + w(t) " # d = drift 

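Both random walk variants can be sketched with one function, where a drift of 0 gives the plain random walk. The starting level X(0) = 0 and the name `random_walk` are assumptions for illustration.

```python
def random_walk(w, drift=0.0):
    """Accumulate X(t) = drift + X(t-1) + w(t), starting from X(0) = 0."""
    x, level = [], 0.0
    for w_t in w:
        level = drift + level + w_t
        x.append(level)
    return x

shocks = [1.0, -1.0, 2.0]
print(random_walk(shocks))             # [1.0, 0.0, 2.0]
print(random_walk(shocks, drift=0.5))  # [1.5, 1.0, 3.5]
```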


#### Signal in Noise (Periodic trend)

" real signal + noise => x(t) = s(t) + v(t) | s: signal, v: noise, possibly correlated over time "
" p(t) - using a cosine wave model to mimic a real signal "
" p(t) = 2cos(2*pi * (t+15) / 50) + w(t) " # Add noise

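A minimal sketch of the cosine signal with added Gaussian noise. The time index running over t = 1..n, the noise scale `a`, and the function names are assumptions for illustration.

```python
import math
import random

def signal(t):
    """Deterministic part: 2 * cos(2 * pi * (t + 15) / 50), period 50."""
    return 2 * math.cos(2 * math.pi * (t + 15) / 50)

def signal_plus_noise(n, a=1.0, seed=0):
    """Return p(t) = signal(t) + w(t) for t = 1..n, with w(t) ~ N(0, a^2)."""
    rng = random.Random(seed)
    return [signal(t) + rng.gauss(0.0, a) for t in range(1, n + 1)]

p = signal_plus_noise(200, a=1.0)
print(p[:5])
```

Raising `a` buries the cosine wave deeper in the noise, which is the situation the regression and smoothing tools are meant to handle.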


### [2] Measure of Dependence >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>


#### Mean Function

" u(t) = E(x(t)) - the usual expected value at time t "
-- "Moving Average Series" : E(v(t)) = 0 # White noise average = 0
-- "Random Walk with Drift" : E(x(t)) = d*t # with x(0) = 0, the drift accumulates over t steps
-- "Signal Plus Noise" : E(p(t)) = 2cos(2*pi*(t+15)/50) # Just the cosine wave


#### Autocovariance Function

-- "White Noise" : cov(w(s), w(t)) = {s=t then a^2, s<>t then 0} # a^2 = white-noise variance
-- "Moving Average Series (t-1 + t + t+1)" : cov(v(s),v(t)) = {s=t then 3/9 * a^2, 
	                                                          |s-t|=1 then 2/9 * a^2, 
	                                                          |s-t|=2 then 1/9 * a^2, 
	                                                          |s-t| > 2 then 0}
-- "Random Walk" : cov(x(s), x(t)) = min{s,t} * a^2

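In practice the autocovariance is estimated from data. The sketch below uses the common sample estimator that divides by n (not n - h); that choice and the name `sample_autocovariance` are assumptions, not something the notes above specify.

```python
def sample_autocovariance(x, h):
    """Sample autocovariance at lag h >= 0, using the divide-by-n convention."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t + h] - mean) * (x[t] - mean) for t in range(n - h)) / n

x = [1, 2, 3, 4, 5]
print(sample_autocovariance(x, 0))  # 2.0 (the sample variance)
print(sample_autocovariance(x, 1))  # 0.8
```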
#### Autocorrelation Function (ACF)

" ACF - correlation between x(t) and x(t+h), i.e. the autocovariance scaled to lie in [-1, 1] "

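A hedged sketch of the sample ACF: each lag-h autocovariance is divided by the lag-0 autocovariance. The divide-by-n estimator and the name `sample_acf` are assumptions for illustration.

```python
def sample_acf(x, max_lag):
    """Return [rho(0), ..., rho(max_lag)], where rho(h) = gamma(h) / gamma(0)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t + h] * dev[t] for t in range(n - h)) / n / gamma0
        for h in range(max_lag + 1)
    ]

acf = sample_acf([1, 2, 3, 4, 5], 2)
print(acf)  # approximately [1.0, 0.4, -0.1]
```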
#### Partial autocorrelation function (PACF)

" PACF - correlation between x(t) and x(t+h) after removing the linear effect of x(t+1), ... , x(t+h-1) "

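One way to compute the PACF from a list of autocorrelations is the Durbin-Levinson recursion. This is a sketch of that technique, not necessarily what any particular package does; the function name and the input layout (`rho[0] == 1`) are assumptions.

```python
def pacf_durbin_levinson(rho, max_lag):
    """Return [phi_11, ..., phi_KK] from autocorrelations rho[0..K], rho[0] == 1."""
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# For an AR(1) process with coefficient 0.5, rho(h) = 0.5 ** h and the
# PACF is 0.5 at lag 1 and (up to rounding) zero afterwards.
rho = [0.5 ** h for h in range(4)]
pacf = pacf_durbin_levinson(rho, 3)
print(pacf)
```

The cut-off after lag 1 is the pattern that suggests p = 1 when reading a PACF plot.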
#### Cross-Covariance Function

"Between two series x(t), y(t) "

#### Cross-Correlation Function

"Between two series x(t), y(t) "

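A minimal sketch of the sample cross-covariance and cross-correlation between two equally long series. The convention of pairing x(t+h) with y(t) for h >= 0, the divide-by-n estimator and the function names are assumptions for illustration.

```python
import math

def sample_cross_covariance(x, y, h):
    """Sample cross-covariance between x(t+h) and y(t), for lag h >= 0."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((x[t + h] - mean_x) * (y[t] - mean_y) for t in range(n - h)) / n

def sample_cross_correlation(x, y, h):
    """Cross-covariance scaled by the two lag-0 standard deviations."""
    scale = math.sqrt(sample_cross_covariance(x, x, 0) * sample_cross_covariance(y, y, 0))
    return sample_cross_covariance(x, y, h) / scale

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(sample_cross_correlation(x, y, 0))  # close to 1: y is a scaled copy of x
```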


### [3] Stationarity >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

##### Dickey Fuller Test of Stationarity
X(t) = Rho * X(t-1) + Er(t)

=> X(t) - X(t-1) = (Rho - 1) X(t-1) + Er(t)

We have to test whether Rho - 1 is significantly different from zero. The null hypothesis is Rho - 1 = 0 (a unit root, i.e. a non-stationary series); if the null hypothesis is rejected, we treat the series as stationary.
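
The regression behind the test can be sketched with plain least squares on the no-intercept form shown above. The function name and the simulated AR(1) input are assumptions; note also that the resulting statistic must be compared with Dickey-Fuller critical values (roughly -1.95 at the 5% level for this form in large samples), not with the usual t table.

```python
import math
import random

def dickey_fuller_stat(x):
    """Estimate (Rho - 1) in  X(t) - X(t-1) = (Rho - 1) X(t-1) + Er(t).

    Returns (estimate, t_statistic) from a no-intercept least-squares fit.
    """
    lagged = x[:-1]
    diffs = [b - a for a, b in zip(x[:-1], x[1:])]
    sxx = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, diffs)) / sxx
    resid = [d - gamma * l for l, d in zip(lagged, diffs)]
    s2 = sum(r * r for r in resid) / (len(diffs) - 1)  # one fitted parameter
    return gamma, gamma / math.sqrt(s2 / sxx)

# Simulated stationary series with Rho = 0.5 (illustrative data only).
rng = random.Random(0)
x = [0.0]
for _ in range(500):
    x.append(0.5 * x[-1] + rng.gauss(0.0, 1.0))

gamma, t_stat = dickey_fuller_stat(x)
print(gamma, t_stat)  # a strongly negative statistic rejects the unit root
```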

Testing for stationarity and converting a series into a stationary one are the most critical steps in time series modelling. Make sure you understand every detail of this concept before moving on to the next step.

##### Stationary Series
There are three basic criteria for a series to be classified as a stationary series:

The mean of the series should be a constant rather than a function of time. A series with a steady upward trend, for example, has a time-dependent mean and violates this condition.

The variance of the series should not be a function of time. This property is known as homoscedasticity. A series whose spread widens or narrows over time violates this condition.

The covariance of the i-th term and the (i + m)-th term should not be a function of time. A series whose oscillations bunch closer together as time increases violates this condition, because its covariance is not constant over time.


### [4] Model Selection Metrics & DEA plots >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

#### --- Metrics to choose Models

- " MSE - mean squared error "
- " F = MSR/MSE "
- " R^2 "
- " AIC - p49, smaller is better (tends to be superior in smaller samples where the relative number of parameters is large) "
- " AICc - p49, AIC with a correction for small samples "
- " BIC - p50, smaller is better (tends to be superior in large samples; chooses smaller models because of its higher penalty) "

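As a rough sketch, AIC and BIC can be computed from a model's residual sum of squares. The forms below are one common Gaussian-likelihood version (up to an additive constant); other references scale them differently, so treat the exact formulas and the names `aic` and `bic` as assumptions.

```python
import math

def aic(sse, n, k):
    """AIC for a Gaussian model: n * ln(SSE / n) + 2k, with k fitted parameters."""
    return n * math.log(sse / n) + 2 * k

def bic(sse, n, k):
    """BIC for a Gaussian model: n * ln(SSE / n) + k * ln(n)."""
    return n * math.log(sse / n) + k * math.log(n)

# Hypothetical fits on n = 100 points: the bigger model lowers SSE only a little.
small = {"sse": 100.0, "k": 2}
large = {"sse": 97.0, "k": 5}
for name, m in (("small", small), ("large", large)):
    print(name, aic(m["sse"], 100, m["k"]), bic(m["sse"], 100, m["k"]))
```

Because ln(n) exceeds 2 once n is above about 7, BIC charges more per parameter than AIC and leans toward the smaller model.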
#### --- Stationarize Methods

" It is necessary for time-series data to be stationary >>> "

#### 1) Detrending

" Assume -- Trend Stationary (Detrending) "

x(t) = u(t) + y(t) # u(t) is the trend, y(t) is a stationary process
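
A minimal detrending sketch, assuming the trend u(t) is a straight line in time fitted by least squares. The linear form and the name `detrend_linear` are assumptions; other trend shapes would need a different fit.

```python
def detrend_linear(x):
    """Fit u(t) = intercept + slope * t by least squares and return x(t) - u(t)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = (
        sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
        / sum((t - t_mean) ** 2 for t in range(n))
    )
    intercept = x_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(x)]

residuals = detrend_linear([2.0, 4.5, 5.5, 8.0])
print(residuals)  # what is left after removing the fitted line
```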


#### 2) Differencing

" Assume -- Random Walk With Drift (Differencing) "

u(t) = d + u(t-1) + w(t) # First differencing: u(t) - u(t-1) = d + w(t) is stationary

" BackShift Operator " B^k x(t) = x(t-k), k>0
" ForwardShift Operator " B^(-k) x(t) = x(t+k), k>0

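Differencing can be sketched in a few lines. The helper name `difference` and its `lag` parameter are assumptions for illustration; applying it twice gives the second difference.

```python
def difference(x, lag=1):
    """Return x(t) - x(t - lag) for every t that has a lagged partner."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

x = [1, 3, 6, 10]
print(difference(x))              # [2, 3, 4]
print(difference(difference(x)))  # [1, 1]
print(difference(x, lag=2))       # [5, 7]
```

Each round of differencing shortens the series by `lag` points, and the number of rounds needed corresponds to the "d" in ARIMA(p,d,q).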



#### 3) Transformation

" Equalize variability over the length of single series "

" Log Transformation "

" Box-Cox Transformation "

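A hedged sketch of the Box-Cox family, which contains the log transformation as the case lambda = 0. It assumes strictly positive data; the function name `box_cox` is an assumption for illustration.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of positive values: ln(x) if lam == 0, else (x**lam - 1) / lam."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1) / lam for v in x]

x = [1.0, 4.0, 9.0]
print(box_cox(x, 0.5))  # [0.0, 2.0, 4.0]
print(box_cox(x, 0))    # natural logs of x
```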



#### --- DEA Plotting

#### 1) Scatter lag plot

" find nonlinear relationships, e.g. between x(t) and x(t-6) or between x(t) and y(t-3) "

#### 2) Regression 

" To discover a Signal in Noise "


#### --- Smoother for Time Series

#### 1) Moving Average Smoother 

" useful in discovering certain traits in a time series, such as long-term trend and seasonal components "
" the average can apply different weights to the positions in the window "

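A weighted moving average smoother can be sketched as below. It assumes an odd-length, symmetric window and drops end points without a full window; the name `weighted_moving_average` is an assumption.

```python
def weighted_moving_average(x, weights):
    """Centered weighted average; weights are normalised to sum to 1.

    Assumes len(weights) is odd. End points without a full window are dropped.
    """
    half = len(weights) // 2
    total = sum(weights)
    return [
        sum(w * x[t - half + i] for i, w in enumerate(weights)) / total
        for t in range(half, len(x) - half)
    ]

smoothed = weighted_moving_average([1, 2, 3, 4, 5], [1, 2, 1])
print(smoothed)  # [2.0, 3.0, 4.0]
```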

#### 2) Kernel Smoothing 

" use a weight function (normal kernel) "
" The wider the bandwidth - b --> the smoother the result > ksmooth "

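The idea can be sketched as a kernel-weighted average with a normal kernel over the time index. Here `bandwidth` is used directly as the kernel's standard deviation, which is an assumption and may differ from how R's `ksmooth` scales its bandwidth; the function name is also illustrative.

```python
import math

def kernel_smooth(x, bandwidth):
    """Smooth x by averaging all points with normal-kernel weights in time."""
    n = len(x)
    out = []
    for t in range(n):
        weights = [math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in range(n)]
        out.append(sum(w * v for w, v in zip(weights, x)) / sum(weights))
    return out

zigzag = [0, 1, 0, 1, 0, 1, 0, 1]
narrow = kernel_smooth(zigzag, 0.5)
wide = kernel_smooth(zigzag, 3.0)
# A wider bandwidth flattens the zigzag more.
print(max(narrow) - min(narrow), max(wide) - min(wide))
```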

#### 3) Lowess - Nearest Neighbor Regression

" The bigger the % of data used in each local fit, the smoother the result "


#### 4) Smoothing Splines

" fit a polynomial regression in terms of time "


#### ** Smoothing can also be used to determine nonlinear relationships between series











----------------------------------------------------------------------------------------------------------------------

## Step1 - Visualize the Time-Series

## Step2 - Stationarize the Series

## Step3 - Identify the optimal parameters

## Step4 - Building the Models

## Step5 - Make the Prediction