You may already have seen some time‐series analysis techniques in action in my previous books (Chan, 2009 and 2013), as a way to test for stationarity or cointegration of price series. 

Time‐series techniques are most useful in markets where fundamental information and intuition are either lacking or not particularly useful for short‐term predictions. Currencies and bitcoins fit this bill.

The simplest model in time‐series analysis is AR(1).

It is just a linear regression model that relates the price in one bar to the price in the next:
$Y(t) - \mu = \phi(Y(t-1) - \mu) + \epsilon(t)$

where $Y(t)$ is the price at time $t$, $\mu$ is a constant (the mean of the series when it is stationary), $\phi$ is the (auto)regression coefficient, and $\epsilon(t)$ is Gaussian noise with zero mean, sometimes called the innovation.

A time series is called weakly stationary if its mean and variance are constant in time, and AR(1) is weakly stationary if $|\phi| < 1$ (the proof is left as an exercise).

A weakly stationary time series is also mean reverting (Chan, 2013). If $|\phi| > 1$, the time series will trend. If $\phi = 1$, we have a random walk.
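One way to see where these conditions on $\phi$ come from is sketched below. It assumes that $\epsilon(t)$ has a constant variance $\sigma^2$ (a symbol not used in the text) and is uncorrelated with past prices. The first line takes the expectation of the AR(1) equation given the previous price; the second takes the variance of both sides in the stationary case.

```latex
\begin{align*}
E\left[\,Y(t) - \mu \mid Y(t-1)\,\right] &= \phi\,\bigl(Y(t-1) - \mu\bigr) \\
\mathrm{Var}(Y) = \phi^2\,\mathrm{Var}(Y) + \sigma^2
  \;&\Longrightarrow\;
  \mathrm{Var}(Y) = \frac{\sigma^2}{1 - \phi^2}
\end{align*}
```

The expected deviation from $\mu$ shrinks in magnitude when $|\phi| < 1$, persists unchanged when $\phi = 1$, and grows when $|\phi| > 1$. The variance expression is finite and positive only when $|\phi| < 1$.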

To estimate $\phi$, we use the arima and estimate functions in MATLAB's Econometrics Toolbox.

The function arima(p, d, q) reduces to an AR(1) model if we set $p = 1$ and $d = q = 0$. (We will discuss the more general version in the next section.) The estimate function just applies maximum likelihood estimation to find the parameters for the AR(1) model based on the input price series.
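For Gaussian noise, maximizing the likelihood conditional on the first price is equivalent to an ordinary least-squares regression of $Y(t)$ on $Y(t-1)$. The sketch below illustrates that idea in plain Python rather than with the toolbox functions, so its numbers may differ slightly from an exact maximum likelihood fit. The helper names simulate_ar1 and fit_ar1 and the simulated price series are assumptions made for illustration.

```python
import random

def simulate_ar1(mu, phi, sigma, num_bars, seed=0):
    """Simulate Y(t) - mu = phi * (Y(t-1) - mu) + eps(t), starting at mu."""
    rng = random.Random(seed)
    prices = [mu]
    for _ in range(num_bars - 1):
        prices.append(mu + phi * (prices[-1] - mu) + rng.gauss(0.0, sigma))
    return prices

def fit_ar1(prices):
    """Least-squares estimate of (mu, phi) from a single price series."""
    x, y = prices[:-1], prices[1:]          # Y(t-1) and Y(t)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    intercept = mean_y - phi * mean_x       # equals mu * (1 - phi)
    mu = intercept / (1.0 - phi) if phi != 1.0 else float("nan")
    return mu, phi

# Simulated data only; not the price series used in the text.
prices = simulate_ar1(mu=100.0, phi=0.9, sigma=1.0, num_bars=5000)
mu_hat, phi_hat = fit_ar1(prices)
print(f"mu = {mu_hat:.2f}, phi = {phi_hat:.3f}")
```

An estimate of $\phi$ well below 1 on real midprices would point toward mean reversion, and an estimate close to 1 toward a random walk.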

Note that we tested on midprices instead of trade prices to reduce bid–ask bounce, which tends to produce phantom mean‐reversion that cannot really be traded on.

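Generalizing slightly from AR(1), we can consider AR(p), represented by
$Y(t) - \mu = \phi_1 (Y(t-1) - \mu) + \phi_2 (Y(t-2) - \mu) + \dots + \phi_p (Y(t-p) - \mu) + \epsilon(t) \qquad (3.2)$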

You can see that this is just a multiple regression model with the price at time $t$ as the dependent (response) variable and past prices up to a lag of $p$ bars as independent (predictor) variables.

But introducing $p$ as an additional parameter means that we can find the optimal $p$ that gives the best fit of the AR(p) model to our data.

As with many statistical models, we will use the Bayesian information criterion (BIC), which is proportional to the negative log likelihood of the model but with an additional term that is proportional to $p$, which penalizes complexity.

Our objective is to minimize BIC.
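One common form of the criterion is $\mathrm{BIC} = -2 \ln L + k \ln N$, where $L$ is the maximized likelihood, $k$ the number of estimated parameters, and $N$ the number of observations. The sketch below shows how a lag could be selected this way, fitting each AR(p) by least squares in Python with NumPy instead of the toolbox's estimate function. The function names ar_bic and best_lag, the parameter count $k = p + 2$ (lags, constant, noise variance), and the simulated series are all assumptions for illustration.

```python
import numpy as np

def ar_bic(y, p, max_p):
    """BIC of a least-squares AR(p) fit, using observations max_p..N-1
    so that every candidate lag is compared on the same sample."""
    y = np.asarray(y, dtype=float)
    target = y[max_p:]
    # One column per lag k = 1..p, preceded by a column of ones.
    lags = [y[max_p - k: len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones_like(target)] + lags)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n = len(target)
    sigma2 = resid @ resid / n                        # ML variance estimate
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    num_params = p + 2                                # assumption: lags + constant + variance
    return -2.0 * loglik + num_params * np.log(n)

def best_lag(y, max_p):
    """Return the lag with the smallest BIC and the list of BIC values."""
    bics = [ar_bic(y, p, max_p) for p in range(1, max_p + 1)]
    return int(np.argmin(bics)) + 1, bics

# Simulated AR(2) series, for illustration only.
rng = np.random.default_rng(0)
y = np.zeros(3000)
for t in range(2, len(y)):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.standard_normal()

p_best, bics = best_lag(y, max_p=5)
print("lag with lowest BIC:", p_best)
```

Comparing all candidate lags on the same set of observations keeps the log likelihoods comparable; otherwise models with fewer lags would be scored on more data points.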

Once we have decided on the best estimate of $p$, we can apply the estimate function to it to find the coefficients $\phi_1, \dots, \phi_p$.

Once the next bar prediction has been made, we can use it to generate trading signals: Simply buy when the predicted price is higher than the current price, and sell when it is lower.
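The rule can be sketched in a few lines of Python. The functions ar_forecast and trade_signal are hypothetical names, and the coefficients are assumed to have been estimated already, with the model written in the mean-deviation form used for AR(1) above.

```python
def ar_forecast(recent_prices, mu, phis):
    """One-bar-ahead AR(p) forecast; recent_prices[-1] is the current price."""
    p = len(phis)
    lagged = recent_prices[::-1][:p]        # Y(t), Y(t-1), ..., Y(t-p+1)
    return mu + sum(phi * (price - mu) for phi, price in zip(phis, lagged))

def trade_signal(recent_prices, mu, phis):
    """+1 to buy, -1 to sell, 0 to do nothing."""
    predicted = ar_forecast(recent_prices, mu, phis)
    current = recent_prices[-1]
    if predicted > current:
        return 1
    if predicted < current:
        return -1
    return 0

# Illustrative numbers only.
print(trade_signal([98.0, 104.0], mu=100.0, phis=[0.5, 0.25]))
```

This sketch ignores transaction costs and position sizing, which a practical strategy would need to account for.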

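A small extension of the AR model to include $q$ lagged noise terms will often reduce the number of lags necessary.

This is called the ARMA(p, q) model, or an autoregressive moving average process, where the $q$ lagged noise terms are described as a moving average:
$Y(t) - \mu = \phi_1 (Y(t-1) - \mu) + \dots + \phi_p (Y(t-p) - \mu) + \epsilon(t) + \theta_1 \epsilon(t-1) + \dots + \theta_q \epsilon(t-q) \qquad (3.3)$
where $\theta_1, \dots, \theta_q$ are the moving average coefficients.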

For each $p$ and $q$, we save the log likelihood in LOGL(p, q), and $p + q$ in PQ(p, q), the latter because it is used as a penalty term when minimizing BIC.

How do we identify the optimal $p$ and $q$ that minimize BIC from the LOGL and PQ matrices? We have to turn them into one‐dimensional vectors, apply the aicbic function, and then pick out the entry with the smallest BIC.
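The same bookkeeping can be sketched in Python with NumPy. Here the BIC is computed directly from its usual formula rather than through a toolbox call. The function name best_pq, the convention that row and column indices start at $p = 1$ and $q = 1$, the extra parameter added for the constant term, and the numbers in the example are all assumptions for illustration.

```python
import numpy as np

def best_pq(logl, pq, num_obs):
    """Find the (p, q) with the smallest BIC from matrices of
    log likelihoods (logl) and lag counts p + q (pq)."""
    logl = np.asarray(logl, dtype=float)
    pq = np.asarray(pq, dtype=float)
    # Flatten both matrices to one-dimensional vectors.
    num_params = pq.ravel() + 1             # assumption: +1 for the constant
    bic = -2.0 * logl.ravel() + num_params * np.log(num_obs)
    flat_idx = int(np.argmin(bic))
    # Map the flat position back to a (row, column) pair.
    row, col = np.unravel_index(flat_idx, logl.shape)
    return int(row) + 1, int(col) + 1, float(bic[flat_idx])

# Made-up log likelihoods for p, q in {1, 2}.
LOGL = [[-100.0, -95.0], [-94.0, -93.9]]
PQ = [[2, 3], [3, 4]]
p_opt, q_opt, bic_min = best_pq(LOGL, PQ, num_obs=100)
print(p_opt, q_opt)
```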

ARIMA(p, d, q) stands for autoregressive integrated moving average. 

If $Y(t)$ follows an ARIMA(p, 1, q) model, then $\Delta Y(t)$ follows an ARMA(p, q) model, where $\Delta Y(t) = Y(t) - Y(t-1)$.

The equivalence of an ARIMA(p, 1, q) model on log prices to an ARIMA(p, 0, q) model on log returns should not be confused with the statement that an ARMA(p, q) = ARIMA(p, 0, q) model on log prices is equivalent to some ARMA($p'$, $q'$) model on log returns. The latter statement is false.

An ARMA model in $\Delta Y$'s can always be transformed into an ARMA model in $Y$'s. But an ARMA model for $Y$ cannot always be transformed into an ARMA model for $\Delta Y$. This is because an ARMA model for $\Delta Y$ can only have $\Delta Y$ as independent variables, whereas an ARMA model for $Y$ can have both $\Delta Y$ (which is just the difference of two $Y$s) and $Y$ as independent variables.
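A small worked example may make this asymmetry concrete. It assumes the simplest AR(1) forms, with $\Delta Y(t) = Y(t) - Y(t-1)$ and constants omitted for brevity. The first line starts from an AR(1) model for the differences; the second starts from an AR(1) model for the levels.

```latex
\begin{align*}
\Delta Y(t) = \phi\,\Delta Y(t-1) + \epsilon(t)
  \;&\Longrightarrow\;
  Y(t) = (1 + \phi)\,Y(t-1) - \phi\,Y(t-2) + \epsilon(t) \\
Y(t) = \phi\,Y(t-1) + \epsilon(t)
  \;&\Longrightarrow\;
  \Delta Y(t) = (\phi - 1)\,Y(t-1) + \epsilon(t)
\end{align*}
```

The first result is an AR(2) model in the levels. In the second, the right-hand side contains the level $Y(t-1)$ itself rather than a difference of two levels, unless $\phi = 1$.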

The simple autoregressive model AR(p) in equation 3.2 can be easily generalized to $m$ time series. This generalized model is called a vector autoregressive model, or VAR(p). All we need to do is to interpret the autoregressive coefficients $\phi_i$ as $m \times m$ matrices, and allow the noises $\epsilon(t)$, which are $m$‐vectors, to have nonzero cross‐sectional correlations but zero serial correlations.

This means that $\epsilon_i(t)$ is not correlated with $\epsilon_j(s)$, for any $t \neq s$, but $\epsilon_i(t)$ could be correlated with $\epsilon_j(t)$.

Since the autoregressive coefficient matrices relate the current price of every time series to the lagged prices of all time series, the VAR model is particularly suitable for modeling financial instruments that have correlated returns, such as a portfolio of stocks within the same industry group.

To eliminate spurious mean‐reversion effects due to bid‐ask bounce, we will use midprices at market close provided by the Center for Research in Security Prices (CRSP) from January 3, 2007, to December 31, 2013.

As in the section on AR(p), we first need to determine the optimal lag p. We will use the first six years of data as the training set for this determination.

Once this is decided, the other parameters of the model can be determined by the function vgxvarx, which is the equivalent of the estimate function for ARIMA models. 
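For an unrestricted VAR with Gaussian noise, conditional maximum likelihood coincides with least squares applied to each series. The sketch below uses that fact to illustrate the estimation step in Python with NumPy; it is not the toolbox routine. The names fit_var and predict_var, the constant-plus-lags form of the regression, and the simulated two-series data are assumptions for illustration.

```python
import numpy as np

def fit_var(Y, p):
    """Least-squares fit of Y(t) = c + Phi_1 Y(t-1) + ... + Phi_p Y(t-p) + eps(t).
    Y has shape (T, m). Returns (c, [Phi_1, ..., Phi_p])."""
    Y = np.asarray(Y, dtype=float)
    T, m = Y.shape
    target = Y[p:]
    lags = [Y[p - k: T - k] for k in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lags)
    B, *_ = np.linalg.lstsq(X, target, rcond=None)    # shape (1 + m*p, m)
    const = B[0]
    # Each block of m rows, transposed, is one coefficient matrix.
    coefs = [B[1 + (k - 1) * m: 1 + k * m].T for k in range(1, p + 1)]
    return const, coefs

def predict_var(Y, const, coefs):
    """One-bar-ahead forecast from the last p rows of Y."""
    Y = np.asarray(Y, dtype=float)
    return const + sum(Phi @ Y[-k] for k, Phi in enumerate(coefs, start=1))

# Simulated data for two series, for illustration only.
rng = np.random.default_rng(0)
Phi_true = np.array([[0.5, 0.1], [0.2, 0.3]])
series = np.zeros((2000, 2))
for t in range(1, len(series)):
    series[t] = Phi_true @ series[t - 1] + rng.standard_normal(2)

c_hat, Phi_hat = fit_var(series, p=1)
print(np.round(Phi_hat[0], 2))
print(predict_var(series, c_hat, Phi_hat))
```

In a backtest, the fit would use only the training set and the forecasts would be made on the out-of-sample bars.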

To make predictions using this model on the out‐of‐sample data in 2013, use the vgxpred function, which is similar to the forecast function for ARIMA.

In keeping with the linearity of the VAR models, we can construct a linear trading model as well. Furthermore, we can choose to make it sector‐neutral.

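We compute the mean predicted return $\langle r \rangle$ of all the stocks in the industry group every day, and set the target dollar allocation of a stock to be proportional to the difference between its predicted return and the industry group mean,
$w_i \propto r_i - \langle r \rangle \qquad (3.4)$
where $w_i$ is the target dollar allocation of stock $i$ and $r_i$ is its predicted return.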
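A minimal Python sketch of this allocation rule follows. The text only says the allocation is proportional to the excess predicted return; scaling the weights so that their absolute values sum to 1 is an assumption made here, as is the function name sector_neutral_weights.

```python
def sector_neutral_weights(predicted_returns):
    """Dollar weights proportional to each stock's predicted return
    minus the group mean, scaled to a gross exposure of 1 (assumption)."""
    n = len(predicted_returns)
    mean_ret = sum(predicted_returns) / n
    excess = [r - mean_ret for r in predicted_returns]
    gross = sum(abs(e) for e in excess)
    if gross == 0:
        return [0.0] * n                    # no dispersion, no positions
    return [e / gross for e in excess]

# Illustrative predicted returns for three stocks.
weights = sector_neutral_weights([0.01, 0.02, 0.03])
print(weights)
```

Because the excess returns sum to zero, the long and short dollar amounts offset each other, which is what makes the portfolio neutral with respect to the industry group.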

We often want to predict changes in price $\Delta Y(t)$ instead of price $Y(t)$ itself. This makes the VAR models a bit awkward to use, and the resulting AR coefficients do not make too much intuitive sense.

Fortunately, VAR(p) can be transformed to a model with $\Delta Y(t)$ as the dependent variable, and various lagged $Y$'s and $\Delta Y$'s as the independent variables.

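This is called the VEC(q) (vector error correction) model, and is written as
$\Delta Y(t) = M + C\,Y(t-1) + A_1 \Delta Y(t-1) + \dots + A_q \Delta Y(t-q) + \epsilon(t) \qquad (3.5)$

Here $M$ is a constant vector and the $A_k$ are $m \times m$ coefficient matrices. The $m \times m$ matrix $C$ in equation 3.5 is called the error correction matrix.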

To transform the coefficients of VAR(p) to VEC(q), first note that $q = p - 1$; we can then use the function vartovec.
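The algebra behind this transformation is short enough to sketch directly. Writing the VAR(p) as $Y(t) = c + \Phi_1 Y(t-1) + \dots + \Phi_p Y(t-p) + \epsilon(t)$, the error correction matrix is $C = \Phi_1 + \dots + \Phi_p - I$ and the coefficient on $\Delta Y(t-k)$ is $A_k = -(\Phi_{k+1} + \dots + \Phi_p)$. The Python function var_to_vec below is a hypothetical helper that applies these formulas; it is not the toolbox function, and the symbols $c$, $\Phi_i$, and $A_k$ are labels chosen for this sketch.

```python
import numpy as np

def var_to_vec(phis):
    """Convert VAR(p) coefficient matrices [Phi_1, ..., Phi_p] into the
    VEC(p-1) error correction matrix C and lag matrices [A_1, ..., A_{p-1}]."""
    phis = [np.asarray(P, dtype=float) for P in phis]
    m = phis[0].shape[0]
    C = sum(phis) - np.eye(m)
    # phis[k:] holds Phi_{k+1}, ..., Phi_p for k = 1..p-1.
    A = [-sum(phis[k:]) for k in range(1, len(phis))]
    return C, A

# Illustrative VAR(2) coefficients for two series.
Phi1 = np.array([[0.5, 0.1], [0.0, 0.4]])
Phi2 = np.array([[0.2, 0.0], [0.1, 0.3]])
C, A = var_to_vec([Phi1, Phi2])

# Check that both forms give the same one-step change in price.
y_prev, y_prev2 = np.array([1.0, 2.0]), np.array([0.5, 1.5])
delta_var = Phi1 @ y_prev + Phi2 @ y_prev2 - y_prev
delta_vec = C @ y_prev + A[0] @ (y_prev - y_prev2)
print(np.allclose(delta_var, delta_vec))
```

The constant term carries over unchanged, so it is left out of the conversion.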

But we do not need a cointegrating portfolio to use VEC(q) for prediction. Some of the stocks could be trending while others are mean reverting.

The AR, ARMA, VAR, and VEC models we have considered so far all use observable variables (prices of various lags) to predict their future values. However, econometricians have also concocted a class of models with hidden variables, called states, which can determine the values of observed variables (though subject to observation noise). These models are called state space models (SSM), a linear example of which is the Kalman filter, discussed in Chapter 3 of Chan (2013) and used in Chapter 5 in this book.

A state space model starts with a linear relationship that specifies the time‐evolution of the hidden state variable, usually denoted by $x(t)$:
$x(t) = A(t)\,x(t-1) + B(t)\,u(t) \qquad (3.6)$
where $x(t)$ is an $m$‐dimensional vector, $A$ and $B$ are possibly time‐dependent but observable matrices ($A$ is $m \times m$, while $B$ is $m \times k$), and $u(t)$ is $k$‐dimensional Gaussian white noise with zero mean, unit variances, and zero serial and cross correlations.

Equation 3.6 is often called the state transition equation.

The observable variables (also called measurements) are related to the hidden variables by another linear equation:
$y(t) = C(t)\,x(t) + D(t)\,\epsilon(t) \qquad (3.7)$
where $y(t)$ is an $n$‐vector, $C$ and $D$ are possibly time‐dependent but observable matrices ($C$ is $n \times m$, while $D$ is $n \times h$), and $\epsilon(t)$ is $h$‐dimensional Gaussian white noise, also with zero mean, unit variances, and zero serial and cross correlations.

Equation 3.7 is often called the measurement equation.
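To make the two equations concrete, the sketch below simulates a linear state space model in Python with NumPy, using time-invariant matrices for simplicity. The matrix names A, B, C, and D follow a common convention for state space models and are assumed here; the function name simulate_ssm and the example values are hypothetical.

```python
import numpy as np

def simulate_ssm(A, B, C, D, x0, num_steps, seed=0):
    """Simulate x(t) = A x(t-1) + B u(t) and y(t) = C x(t) + D eps(t),
    where u and eps are independent standard Gaussian noise vectors."""
    rng = np.random.default_rng(seed)
    A, B, C, D = (np.atleast_2d(M).astype(float) for M in (A, B, C, D))
    x = np.asarray(x0, dtype=float)
    states, observations = [], []
    for _ in range(num_steps):
        x = A @ x + B @ rng.standard_normal(B.shape[1])   # state transition
        y = C @ x + D @ rng.standard_normal(D.shape[1])   # measurement
        states.append(x)
        observations.append(y)
    return np.array(states), np.array(observations)

# One hidden state observed with noise; values are illustrative only.
states, observations = simulate_ssm(
    A=[[1.0]], B=[[0.5]], C=[[1.0]], D=[[0.2]], x0=[100.0], num_steps=5
)
print(states.ravel())
print(observations.ravel())
```

In practice only the observations are available, and the task of a filter is to recover estimates of the states from them.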

An example of a hidden variable is the familiar moving average.

We can give some structure to this hidden variable $x(t)$ by requiring that it evolves in a particularly simple way:
$x(t) = x(t-1) + B\,u(t) \qquad (3.8)$
We have assumed $A$ is the identity matrix, which is of course invariant in time, and $B$ is an unknown but also time‐invariant matrix that determines the covariance of the estimation errors for the moving average $x(t)$.

Though we had said that $B$ is supposed to be observable, it can be treated as an unknown parameter(s) to be estimated by applying maximum likelihood estimation on training data. (In other words, $B$ is “observable” only to the extent that its values are not updated at each time step during Kalman filter updates.)

Given the moving average (or moving averages, if the time series is multivariate) of a time series, a trader may hypothesize that the prices are trending, and thus the best guess for the observed price at time $t$ is just the estimated moving average at time $t$ as well:
$y(t) = x(t) + D\,\epsilon(t) \qquad (3.9)$
where $D$ is another unknown and time‐invariant matrix to be estimated by MLE.

We will assume that there are as many hidden state variables (five in total) as there are stocks in the computer hardware industry group. This is what a typical moving average model assumes as well—each price series has its own independent moving average. 

Furthermore, we assume also that the state noise of one moving average is uncorrelated with any other but each may have a different variance. Hence, $B$ is a $5 \times 5$ diagonal matrix with unknown parameters. (Unknown parameters are denoted as NaN as an input to the MATLAB estimate function.)

Similarly, we will assume the measurement noise of one stock's price is uncorrelated with another, but each may also have a different variance. Hence, $D$ is also a $5 \times 5$ diagonal matrix with unknown parameters.

One may also consider applying SSM on log prices instead, so that the Gaussian noise assumption is more reasonable.

Once the state transition and measurement equations are fixed, we can use the filter function to generate predictions of both the state and observation values.
The $x(t)$ variable in the output of the filter function is the filtered price (moving average) at time $t$ given observed prices up to time $t$.

This model generates filtered prices that resemble the observed prices very closely, usually with less than 0.1 percent difference.
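With identity transition and measurement matrices and diagonal noise matrices, the model decouples into one scalar filter per stock. The Python sketch below shows the Kalman recursion for a single series. Unlike the procedure in the text, it takes the two noise standard deviations as given instead of estimating them by maximum likelihood; the function name local_level_filter, the diffuse starting variance, and the sample prices are assumptions for illustration.

```python
def local_level_filter(prices, state_std, obs_std, initial_var=1e6):
    """Kalman filter for x(t) = x(t-1) + state_std * u(t),
    y(t) = x(t) + obs_std * eps(t). Returns the filtered states."""
    q = state_std ** 2                      # state noise variance
    r = obs_std ** 2                        # measurement noise variance
    x = prices[0]                           # starting guess for the state
    p = initial_var                         # large variance: little prior knowledge
    filtered = []
    for y in prices:
        p = p + q                           # predict: uncertainty grows
        gain = p / (p + r)                  # Kalman gain, between 0 and 1
        x = x + gain * (y - x)              # update toward the new observation
        p = (1.0 - gain) * p                # uncertainty shrinks after the update
        filtered.append(x)
    return filtered

# Illustrative prices only.
prices = [100.0, 101.0, 100.5, 102.0]
filtered = local_level_filter(prices, state_std=1.0, obs_std=0.1)
print([round(v, 3) for v in filtered])
```

When the state noise is large relative to the measurement noise, the gain stays close to 1 and the filtered price nearly coincides with the observed price, which is one way a fitted model can end up tracking the observations very closely.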

From these predicted prices, we can calculate the predicted returns.
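Under the moving average model described above, with identity transition and measurement matrices, the one-bar-ahead price forecast is simply the current filtered state. One way to write the predicted return, assuming simple rather than log returns and using $\hat{x}(t \mid t)$ as a label for the filtered state (notation not used in the text), is:

```latex
\hat{y}(t+1 \mid t) = \hat{x}(t \mid t), \qquad
\hat{r}(t+1) = \frac{\hat{y}(t+1 \mid t) - y(t)}{y(t)}
```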

These predicted returns can be used in the same way as we did in the VAR model to create a sector‐neutral trading strategy. 

Finding the moving average is not the only way the Kalman filter can be used to predict prices. If we assume trending behavior, we can also use it to find the slope of the recent trend in prices, leading to a prediction of the next price assuming the slope persists. This is left as an exercise for the reader.

Estimates of the hidden state itself may be useful—after all, it is supposed to be a moving average.

Finding estimates of a hidden variable in the presence of noise is the original meaning of filtering and is a well‐known concept in signal processing. 

Besides the Kalman filter, other well‐known filters in finance and economics include the Hodrick‐Prescott filter and the wavelet filter.

Another application of Kalman filtering has been discussed in Chan (2013), where it was used to find the best estimates of the hedge ratio between two cointegrated price series.

But instead of treating the two price series as measurements, we treat EWC as the measurements $y(t)$, and EWA augmented with 1s as the time‐varying matrix $C(t)$ in equation 3.7. (The 1s are necessary to allow for the constant offset in the linear regression relationship between EWA and EWC.)
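A sketch of this setup in Python with NumPy is given below. The hidden state holds the hedge ratio and the offset, both assumed to follow random walks, and each bar's measurement row is the pair (price of the first series, 1). The noise variances are fixed by hand instead of being estimated, and the function name kalman_hedge_ratio, the parameter values, and the synthetic data standing in for EWA and EWC are all assumptions for illustration.

```python
import numpy as np

def kalman_hedge_ratio(x, y, state_var=1e-4, obs_var=1e-3, prior_var=1e3):
    """Track [slope, offset] in y(t) = slope(t) * x(t) + offset(t) + noise,
    with both coefficients following random walks."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(2)                      # state estimate: [slope, offset]
    P = prior_var * np.eye(2)               # state covariance
    Q = state_var * np.eye(2)               # state noise covariance
    betas = np.empty((len(y), 2))
    for t in range(len(y)):
        C = np.array([x[t], 1.0])           # time-varying measurement row
        P = P + Q                           # predict step (transition is identity)
        innovation = y[t] - C @ beta        # forecast error for this bar
        S = C @ P @ C + obs_var             # forecast error variance
        K = P @ C / S                       # Kalman gain
        beta = beta + K * innovation
        P = P - np.outer(K, C) @ P
        betas[t] = beta
    return betas

# Synthetic series with a known relationship, not real ETF prices.
x_demo = np.linspace(10.0, 20.0, 50)
y_demo = 2.0 * x_demo + 1.0
betas = kalman_hedge_ratio(x_demo, y_demo)
print(np.round(betas[-1], 3))
```

The choice of state_var and obs_var controls how quickly the hedge ratio adapts, which is why fitting these quantities too closely to the training data can hurt out-of-sample performance.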

We can see that the equity curve has started to flatten even during the latter part of the training set. This could have been a result of regime change, where EWA and EWC have fallen out of cointegration, or more likely, a result of overfitting the noise covariance matrix.