
## FINANCIAL ECONOMETRICS
MODULE 4 | LESSON 1

---

# **TIME SERIES AND AUTOCORRELATION**

|  |  |
|:---|:---|
|**Reading Time** |  60 minutes |
|**Prior Knowledge** | Regression analysis, Basic statistics  |
|**Keywords** | Time Series Data, Autocovariance, Autocorrelation (ACF), Partial Autocorrelation (PACF), Conditional Probability, Stationarity |


---

*In this module, we are going to talk about data points with a time stamp. A lot of our examples in the previous modules actually used this type of data. We call them time series. Time series analysis is a very important area of statistical analysis in finance, and there are unique challenges when attempting to perform statistical inference. Hence, several analytical methods have been developed that focus on time series data. In the next few modules, we will be going through different topics in time series analysis. In this lesson, we will introduce what time series data is and some challenges we may encounter when handling time series data. We will use Google’s stock price, the U.S. Dollar Index daily return, and U.S. 10-year Treasury bond yield from the beginning of 2016 to the end of 2021 as examples to demonstrate various concepts.*

In [None]:
# Load packages
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm

plt.rcParams["figure.figsize"] = (16, 9)  # Default figure width and height in inches

In [None]:
# Load datasets from local CSV files
m4_data = pd.read_csv("../M4. goog_eur_10.csv")
dxyr_data = pd.read_csv("../M4. dxy_r_data.csv")

# Convert date variable to date format
m4_data["Date2"] = pd.to_datetime(m4_data["Date"], format="%m/%d/%Y")
dxyr_data["Date2"] = pd.to_datetime(dxyr_data["Date"], format="%m/%d/%Y")

# Selecting columns and setting index
goog = m4_data.loc[:, ["Date2", "GOOGLE"]].set_index("Date2")
ust10 = m4_data.loc[:, ["Date2", "UST10Y"]].set_index("Date2")
dxy = dxyr_data[["Date2", "DXY_R"]].set_index("Date2")

## **1. Time Series Data Overview**

### **1.1 Time Series Data Examples and Some Characteristics**

Time series data is a series of data points with date or time stamps attached to them. Let’s look at some time series data examples in the following graphs and then we will formally define time series.


**Figure 1: Dollar Index Daily Return Time Series Daily Chart**


In [None]:
# Plot Dollar index time series chart
dxy["DXY_R"].plot(
    marker="o",
    markersize=4,
    markerfacecolor="none",
    linestyle="-",
    linewidth=0.8,
    xlabel="Year",
    ylabel="Dollar Index Daily Return",
)
plt.show()

In figure 1, we can see that the dollar index daily return moves around $0$. It is common for the return on a financial asset to oscillate around $0$. The volatility of the return, on the other hand, changes from one period to another. On average, 2016 seems to have higher volatility than the other years in the graph, while the beginning of 2020, before COVID-19, was a period of low volatility. The high volatility in 2020 appeared around the start of the COVID-19 pandemic. In other words, periods of high and low volatility cluster in different years.


**Figure 2: Google Stock Price Daily Close Time Series Daily Chart**


In [None]:
# Plot Google price time series chart
goog["GOOGLE"].plot(
    marker="o",
    markersize=4,
    markerfacecolor="none",
    linestyle="-",
    linewidth=1,
    xlabel="Year",
    ylabel="Google Stock Price Daily Adj Close",
)
plt.show()

In figure 2, Google’s stock price trends upward over the period. Although the trend is up, we can also see what looks like a regular cyclical movement along the uptrend. We will learn how to analyze this type of time series data later.


**Figure 3: U.S. 10-Year Treasury Bond Yield Time Series Daily Chart**


In [None]:
# Plot UST10Y time series chart
ust10["UST10Y"].plot(
    marker="o",
    markersize=4,
    markerfacecolor="none",
    linestyle="-",
    linewidth=1,
    xlabel="Year",
    ylabel="UST 10 Year Yield Daily close",
)
plt.show()

In figure 3, there doesn’t seem to be a clear pattern for the U.S. 10-year Treasury bond yield. However, we can see that if the yield goes down on one day, it tends to keep going down the next day. It looks like the yield movement over the previous few days can help predict where the yield will go today.

From the last three figures we have noticed several patterns:

> a. trend <br>
> b. cyclical movement (seasonality) <br>
> c. volatility clusters (volatilities can be different during different time periods) <br>
> d. correlation between observations <br>
> e. extreme values or outliers <br>

These are some characteristics of time series data that we will address later in the course.

Since we would like to apply statistics to analyze time series data, we need to define time series data in a mathematical way. 



### **1.2 Time Series Definition**

The data series $ \{ x_1, x_2,\cdots ,x_{t-1}, x_t \}$ is called a time series if 

> a. $t$ is an ordered time stamp (date, hour, year, etc.) <br>
> b. $x_{t}$ is from a random variable $X_{t}$ <br>

Let's use the following table to explain this definition.


**Figure 4: Time Series Definition Example**

| Date | Price: <br>Observed Data | Associated <br>Random Variable |
| :---: | :---:   | :---:   |
| day 1 | $$x_1$$ | $$X_1$$ |
| day 2 | $$x_2$$ | $$X_2$$ |
| day 3 | $$x_3$$ | $$X_3$$ |
| day 4 | $$x_4$$ | $$X_4$$ |
| day 5 | $$x_5$$ | $$X_5$$ |


In figure 4, we assume this is the price data for Stock A for five days. We have the time series of Stock A’s price for five days $\{ x_1, x_2, x_3, x_4, x_5 \}$. The price $x_1$ is from a random variable $X_1$ and the price $x_2$ is from another random variable $X_2$. Each price value at time $t$ is a realization from its random variable at time $t$. In this course, we will limit $t$ to be discrete numbers and $X_t$ to be a continuous random variable.

For a time series, we already know that data observations do not necessarily need to be independent. Another thing to note is they also do not need to be identically distributed. What is important for a time series is that the data needs to be ordered by time. Many time series models provide results based on the order.

From now on, we will use an uppercase letter, as in $\{ X_t \}$, to denote a time series that contains the data observations, and a lowercase letter, as in $x_t$, to denote a single data observation from a time series. This convention will help avoid confusion later in the course.

In our example above, the time series can be written as: 

$$ \{ X_5 \} = \{ x_1, x_2, x_3, x_4, x_5 \} $$ 

In general, the time series $\{ X_t \}$ can be written as:

$$ \{ X_t \} = \{ x_t, x_{t-1}, x_{t-2}, x_{t-3}, \cdots, x_1, x_0 \} $$

For $x_t$ and $x_{t-1}$, the observations are separated by one time unit. In general, we call this one **lag**. For $x_t$ and $x_{t-k}$, the two observations are separated by $k$ lags. The following table demonstrates how to express lag in our example.


**Figure 5: Lag Expressions for Time Series**

| Time Series   | | $$\hspace{1.5cm}$$ | $$\hspace{1.5cm}$$ | Data Points | $$\hspace{1.5cm}$$ | $$\hspace{1.5cm}$$ |
| :---:             | |  :---:  |  :---:  |  :---:  |  :---:  |  :---:  |
| $$X_5$$           | | $$x_5$$ | $$x_4$$ | $$x_3$$ | $$x_2$$ | $$x_1$$ |
| $$X_{(5-1)}=X_4$$ | | $$x_4$$ | $$x_3$$ | $$x_2$$ | $$x_1$$ |         |
| $$X_{(5-2)}=X_3$$ | | $$x_3$$ | $$x_2$$ | $$x_1$$ |         |         |


We can see from figure 5 that $\{ X_4 \}$ is lag 1 of $\{ X_5 \}$ and $\{ X_3 \}$ is lag 2 of $\{ X_5 \}$.

We can do time series difference calculations among the above time series. The following table shows a few examples.


**Figure 6: Time Series Difference Calculation Example**


| Time Series <br>Difference Calculation   | | $$\hspace{1.5cm}$$ | $$\hspace{1.5cm}$$ | Data Points | $$\hspace{1.5cm}$$ | $$\hspace{1.5cm}$$ |
| :---:             | |  :---:  |  :---:  |  :---:  |  :---:  |  :---:  |
| $$X_5-X_4$$ | | $$x_5-x_4$$ | $$x_4-x_3$$ | $$x_3-x_2$$ | $$x_2-x_1$$ |         |
| $$X_5-X_3$$ | | $$x_5-x_3$$ | $$x_4-x_2$$ | $$x_3-x_1$$ |         |         |


Figure 6 shows the calculation of one difference and two differences for time series $\{ X_{5} \}$. One difference means the current time series minus the lag 1 time series. Two differences means the current time series minus the lag 2 time series. Notice that we lose one observation when we conduct a one-difference calculation, as indicated in the above table, and two observations when we conduct a two-difference calculation. We will use this calculation a lot going forward.
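
The lag and difference calculations in figures 5 and 6 can be sketched with pandas. This is a minimal illustration: the five prices are made-up values, not data from this lesson. `shift` and `diff` are standard pandas `Series` methods.

```python
import pandas as pd

# Hypothetical prices x_1, ..., x_5 for Stock A (illustrative values only)
prices = pd.Series([10.0, 10.5, 10.2, 10.8, 11.0], index=[1, 2, 3, 4, 5], name="x")

# Lag 1 and lag 2 series: each value is moved forward by one or two time steps
lag1 = prices.shift(1)
lag2 = prices.shift(2)

# Difference with lag 1 (x_t - x_{t-1}) and with lag 2 (x_t - x_{t-2})
diff1 = prices.diff(1)
diff2 = prices.diff(2)

table = pd.DataFrame(
    {"x": prices, "lag1": lag1, "lag2": lag2, "diff1": diff1, "diff2": diff2}
)
print(table)

# The leading NaN values are the observations lost to differencing
print(diff1.isna().sum(), diff2.isna().sum())
```

The missing values at the start of `diff1` and `diff2` correspond to the empty cells in figure 6.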


## **3. Autocorrelation**

In Module 2 Lesson 1, we discussed how one of the assumptions for OLS regression is that the observations should be independent of one another. If a data series violates this assumption because its observations are correlated with each other, the series has **autocorrelation**. In our last section, we saw that U.S. 10-year Treasury bond yields have an autocorrelation issue because today’s bond yield is likely to go down if the yields over the past few days have been declining. The issue of correlated observations is one challenge in analyzing time series data. We cannot properly apply OLS to time series data without addressing the issue of autocorrelation. So how do we analyze the autocorrelation of a time series? We will introduce the autocorrelation function as a measure in this section. Before getting into the autocorrelation function, we would like to introduce some concepts.


### **3.1 The Mean Function of Time Series**

First, let's look at the mean function for time series:

$$ \mu_{X_{t}} = E(X_{t}) = \int_{a}^{b} x f_{t}(x) dx $$

where $f_{t}$ is the marginal density function for $X_{t}$, and $a$ and $b$ are the bounds of the random variable $X_{t}$.

The mean function is a measure to describe the center of the time series. It will play a key part in the later time series theories.


### **3.2 Autocovariance Function**

The first metric we can use to study the strength of the relationship between two observations in a time series is the **autocovariance function**. The autocovariance function measures the linear dependence between two observations. It plays a similar role to the covariance between two data series, except that it assesses the relationship among observations within one data series.

Assume $X_t$ has finite variance for $t = 1, 2, \cdots, T$. The autocovariance function is defined as:

$$ \gamma_{_{X}}(s,t) = cov(X_{s}, X_{t}) = E[ (X_{s}-\mu_{_{X_{s}}})(X_{t}-\mu_{_{X_{t}}}) ] \ $$  

for all $s$ and $t$



Here are some properties for autocovariance function:

> a. It only measures the linear relationship between $X_{s}$ and $X_{t}$ <br>
> b. $\gamma_{_{X}}(s,t) = \gamma_{_{X}}(t,s) $ <br>
> c. When $\gamma_{_{X}}(s,t)$ is close to $0$ and the distance between $s$ and $t$ is large, the time series graph would exhibit a choppy curve between $s$ and $t$ <br>
> d. When the absolute value of $\gamma_{_{X}}(s,t)$ is large and the distance between $s$ and $t$ is large, the time series graph would exhibit a smooth curve between $s$ and $t$ <br>
> e. When $s = t$, the autocovariance becomes variance of $X_t$ <br>
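
The definition above is stated in terms of expectations. With a single observed series, a common sample estimate replaces the expectations with averages. The sketch below assumes that the mean and the autocovariance depend only on the lag, not on the time point (the stationarity idea covered later in this lesson). The function name `sample_autocovariance` and the data values are hypothetical.

```python
import numpy as np

def sample_autocovariance(x, h):
    """Sample autocovariance at lag h, using the overall sample mean and dividing by n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x_bar = x.mean()
    return np.sum((x[h:] - x_bar) * (x[: n - h] - x_bar)) / n

# Hypothetical observations (illustrative values only)
x = [2.0, 4.0, 6.0, 8.0, 10.0]

gamma_0 = sample_autocovariance(x, 0)  # lag 0 reduces to the sample variance
gamma_1 = sample_autocovariance(x, 1)
print(gamma_0, gamma_1)
```

At lag 0, the estimate matches `np.var(x)`, in line with property (e).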


### **3.3 Autocorrelation**

Now we can introduce the autocorrelation function (ACF). Its definition is:

$$ \rho_{_{X}}(s,t) = \frac{\gamma_{_{X}}(s,t)}{\sqrt{\gamma_{_{X}}(s,s)\gamma_{_{X}}(t,t)}} $$

Here are some properties for ACF:

> a. It only measures the linear relationship between $X_{s}$ and $X_{t}$ <br>
> b. The value of $\rho_{_{X}}(s,t) $ is between $-1$ and $1$ <br>

By using autocorrelation, we can see if the current observation has any significant correlation with observations in different time lags.
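
As a rough illustration, the sample autocorrelation can be computed by dividing the sample autocovariance at each lag by the sample variance. The sketch below compares two simulated series that are not part of this lesson's data: independent random draws and their running sum. The helper `sample_acf` is a hypothetical name, and the estimate assumes the autocorrelation depends only on the lag.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag (autocovariance divided by variance)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    gamma_0 = np.sum(d * d) / n
    return np.array(
        [np.sum(d[h:] * d[: n - h]) / n / gamma_0 for h in range(max_lag + 1)]
    )

rng = np.random.default_rng(seed=0)
noise = rng.normal(size=1000)  # independent draws: little autocorrelation expected
walk = np.cumsum(noise)        # running sum: neighboring values are strongly related

acf_noise = sample_acf(noise, max_lag=5)
acf_walk = sample_acf(walk, max_lag=5)
print(np.round(acf_noise, 3))
print(np.round(acf_walk, 3))
```

Both results start at 1 for lag 0 and stay between $-1$ and $1$. The running-sum series keeps autocorrelations close to 1, while the independent draws drop to roughly 0 after lag 0.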


## **4. Conditional Probability**

Before we move to the next time series topic, let's review the concept of conditional probability. Conditional probability is an important concept incorporated in the topic we will discuss next. 

Let’s use an example to explain conditional probability. Say we have a cookie jar with 20 cookies. Using the cookies' ingredients, the distribution of cookies is as follows:


**Figure 7. Cookie Distribution by Ingredients**

|           | Dark Chocolate |Milk Chocolate | White Chocolate |
| :---      | :---: | :---: | :---: |
| With Nuts |   6   |   4   |   2   |
| No Nuts   |   3   |   4   |   1   |


Now, given that we get a cookie with dark chocolate, what is the probability that this cookie contains nuts? This is a conditional probability question. In mathematical form, we can write $P(\text{cookie has nuts} \ | \ \text{cookie has dark chocolate})$. The "|" stands for "given." Instead of trying to get the answer by checking all 20 cookies, we only need to focus on the nine cookies with dark chocolate. Among these nine cookies, six have nuts. Hence, $P(\text{cookie has nuts} \ | \ \text{cookie has dark chocolate}) = 6 / 9$. We can see that conditioning reduces the sample space from 20 cookies to nine cookies.

Formally, assume event $A$ and event $B$ are in the same sample space and $P(B) > 0$. The **conditional probability** that event $A$ happens given that event $B$ happens is:

$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
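
The cookie example can be checked with a few lines of Python. The sketch uses the counts from figure 7 and the standard ratio of the joint probability to the marginal probability, $P(A \cap B) / P(B)$. The dictionary layout and variable names are illustrative choices.

```python
from fractions import Fraction

# Cookie counts from figure 7: (chocolate type, nuts) -> number of cookies
cookies = {
    ("dark", "nuts"): 6, ("milk", "nuts"): 4, ("white", "nuts"): 2,
    ("dark", "no nuts"): 3, ("milk", "no nuts"): 4, ("white", "no nuts"): 1,
}
total = sum(cookies.values())

# Joint probability P(nuts and dark) and marginal probability P(dark)
p_nuts_and_dark = Fraction(cookies[("dark", "nuts")], total)
p_dark = Fraction(cookies[("dark", "nuts")] + cookies[("dark", "no nuts")], total)

# Conditional probability P(nuts | dark) = P(nuts and dark) / P(dark)
p_nuts_given_dark = p_nuts_and_dark / p_dark
print(total, p_nuts_given_dark)  # 20 2/3
```

The result, $2/3$, equals the $6/9$ obtained by counting within the nine dark chocolate cookies.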

Please review the required reading on conditional probability to learn its properties and its relationship with joint probability and marginal probability.


## **5. Partial Autocorrelation**

Earlier, we talked about using autocorrelation to evaluate the relationship between two observations in a time series. Say we have $x_{t}$ and $x_{t-3}$ from a time series. We can write their autocorrelation as follows:

$$ \rho_{_{X}}(t,t-3) = \frac{cov(x_t, x_{t-3})}{\sqrt{var(x_t) var(x_{t-3})}} $$

The above autocorrelation assesses the relationship between $x_{t}$ and $x_{t-3}$, including the correlation explained by $x_{t-1}$ and $x_{t-2}$. However, let's assume we want to evaluate the pure correlation between $x_{t}$ and $x_{t-3}$, with the correlation from $x_{t-1}$ and $x_{t-2}$ stripped out. In this case, we would need to use partial autocorrelation (PACF). Partial autocorrelation can be written as follows:

$$ \phi_{_{X}}(t,t-3 \ | \ t-1,t-2) = \frac{cov(x_t,x_{t-3} | x_{t-1},x_{t-2})}{\sqrt{var(x_t | x_{t-1},x_{t-2}) var(x_{t-3} | x_{t-1},x_{t-2})}} $$

As you can see, partial autocorrelation is a conditional correlation. Unlike autocorrelation, partial autocorrelation gives information about the pure correlation between $x_{t}$ and $x_{t-3}$ that $x_{t-1}$ and $x_{t-2}$ do not explain. For PACF notation, we can drop the condition part and simply write the notation as follows:

$$ \phi_{_{X}}(t,t-3) $$

Both ACF and PACF are very useful when we talk about time series regression analysis later.
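
One common way to estimate a partial autocorrelation, assuming the dependence is linear, is to regress both $x_t$ and $x_{t-3}$ on the intermediate lags and then correlate the two sets of residuals. The sketch below applies this to a simulated series in which each value depends directly only on the previous one. The simulated model, its coefficient of 0.7, and the helper `residuals` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Simulate a hypothetical series x_t = 0.7 * x_{t-1} + e_t (illustrative assumption)
n = 5000
e = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + e[t]

# Align x_t with its first three lags
x_t, x_l1, x_l2, x_l3 = x[3:], x[2:-1], x[1:-2], x[:-3]

def residuals(y, regressors):
    """Residuals from an OLS regression of y on the regressors plus a constant."""
    design = np.column_stack([np.ones(len(y))] + regressors)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

# Ordinary correlation at lag 3 includes the indirect link through lags 1 and 2
acf_3 = np.corrcoef(x_t, x_l3)[0, 1]

# Partial correlation at lag 3: correlate what lags 1 and 2 leave unexplained
pacf_3 = np.corrcoef(
    residuals(x_t, [x_l1, x_l2]), residuals(x_l3, [x_l1, x_l2])
)[0, 1]

print(round(acf_3, 3), round(pacf_3, 3))
```

In this simulated series, the lag 3 autocorrelation is clearly positive, but the lag 3 partial autocorrelation is close to 0, because the link between $x_t$ and $x_{t-3}$ runs entirely through the lags in between.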


## **6. Stationarity**

We introduced time series and discussed the mean and autocorrelation of time series. We have not yet discussed what kinds of behaviors or assumptions about time series make it possible to model them. In this section, we are going to introduce a concept called **stationarity**.

Let $\{ X_t \}$ be a time series with finite variance. $\{ X_t \}$ is stationary if

> a. the mean function $\mu_{_{X}}(t) = E(X_{t})$ is constant and independent of $t$ <br>
> b. the autocovariance $\gamma_{_{X}}(t-h,t)$ is independent of $t$ for each $h$.

This means autocovariance is only dependent on time difference $h$. For example, $Cov(X_t, X_{t+h}) = Cov(X_s, X_{s+h})$.

It also means when $h = 0$, autocovariance is independent of $t$. Since autocovariance is variance when $h = 0$, variance is constant. We therefore rule out the issue of heteroskedasticity in a stationary time series. The above stationarity definition is also called **weak stationarity**. There is also strict stationarity. 

For a time series $\{ X_t \}$ to be **strictly stationary**, the joint probability distribution of every finite collection of values $\{ x_{t_1}, x_{t_2}, \cdots, x_{t_k} \}$ needs to be identical to the joint distribution of the time-shifted collection $\{ x_{t_{1}+h}, x_{t_{2}+h}, \cdots, x_{t_{k}+h} \}$. It means the probabilistic behavior of a time series does not change when the time series is shifted in time. We can also express strict stationarity in the following way.

$$ P \{ x_{t_1} \le c_1, \cdots, x_{t_k} \le c_k \} = P\{ x_{t_{1}+h} \le c_1, \cdots, x_{t_{k}+h} \le c_k \} $$

for all $k = 1, 2, \cdots$, all time stamps $t_1, t_2, \cdots, t_k$, all numbers $c_1, c_2, \cdots, c_k$, and all time shifts $h$. It means the joint cumulative distribution function of the collection is the same before and after the time shift.

For a time series with finite variance, strict stationarity implies weak stationarity. The converse is not true: strict stationarity is a much stronger requirement and much more difficult to check. Thankfully, most statistical procedures involving time series only require weak stationarity.

Because of the definition of strict stationarity, both the mean and the covariances of a strictly stationary time series with finite variance are independent of time.

Since the mean of $\{ X_t \}$ is independent of $t$ and is constant, we can write it as follows:

$$ \mu_{t} = \mu $$

From (b), we know autocovariance for a stationary time series only depends on the distance between two time points, and we can write autocovariance as follows:

$$ \gamma_{_{X}}(t-h,t) = \gamma_{_{X}}(h) $$

Since autocovariance is independent of $t$, both ACF and PACF are independent of $t$, too. Hence, we can also write ACF as $\rho_{_{X}}(h)$ and PACF as $\phi_{_{X}}(h)$ or $\phi_{hh}$.

In the following time series analysis, we will introduce several models that require time series to be stationary before running the model. Hence, it is important to make sure the time series is stationary before conducting any time series analysis. Later in this course, we will introduce methods to check if a time series is stationary.
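
To make the two conditions concrete, the sketch below simulates many paths of two textbook processes that are not part of this lesson's data: independent shocks (white noise) and their running sum (a random walk). Both have a mean of zero at every time point, but only the first has a variance that stays the same over time. The number of paths, the path length, and the two time points are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# 2000 simulated paths with 200 time steps each (hypothetical sizes for illustration)
shocks = rng.normal(size=(2000, 200))
white_noise = shocks                     # X_t = e_t
random_walk = np.cumsum(shocks, axis=1)  # X_t = X_{t-1} + e_t

# Variance across paths at t = 50 and t = 200 (columns 49 and 199)
noise_var = white_noise[:, [49, 199]].var(axis=0)
walk_var = random_walk[:, [49, 199]].var(axis=0)

print("white noise variance at t=50 and t=200:", np.round(noise_var, 2))
print("random walk variance at t=50 and t=200:", np.round(walk_var, 2))
```

The white noise variance is about the same at both time points, which is consistent with condition (b) at $h = 0$. The random walk variance grows with $t$, so the random walk is not stationary even though its mean is constant.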


## **7. Plots of ACF and PACF**

We introduced ACF and PACF in earlier sections. It is usually easier to use a plot to visualize information from ACF and PACF. Let’s use our Google stock price, U.S. 10-year Treasury bond yield, and dollar index daily return as examples to show ACF and PACF plots.


**Figure 8: ACF Plots for Google Stock Price, U.S. Treasury 10-Year Bond Yield and U.S. Dollar Index Daily Return**


In [None]:
# ACF Plots for GOOGLE, U.S. UST10Y and DXY_R
fig, ax = plt.subplots(1, 3, figsize=(18, 7))
sm.graphics.tsa.plot_acf(goog["GOOGLE"], title="Google stock ACF", lags=30, ax=ax[0])
sm.graphics.tsa.plot_acf(
    ust10["UST10Y"], title="US 10-Year Treasury Bond Yield ACF", lags=30, ax=ax[1]
)
sm.graphics.tsa.plot_acf(
    dxy["DXY_R"], title="US Dollar Index Daily Return ACF", lags=30, ax=ax[2]
)
plt.show()

An ACF plot shows the autocorrelation of a time series at each lag as a vertical bar. Lag 0 is at the left end of the plot. Since the autocorrelation at lag 0 is just the correlation of the series with itself, its value is 1. Then come the autocorrelations at lag 1, lag 2, and so on. The plot also shows a 95% confidence interval as a shaded area around $0$. If a bar extends beyond the shaded area, the autocorrelation at that lag is statistically significant at the 5% level.
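
As a rough guide to reading the shaded area, a widely used approximation for a series with no autocorrelation puts the 95% band at about $\pm 1.96 / \sqrt{n}$, where $n$ is the number of observations. The sketch below applies this rule to simulated independent data. The sample size is a hypothetical stand-in for about six years of daily observations, and the band drawn by the plotting function may be computed differently (for example, it may widen with the lag), so treat this as a rough check only.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

n = 1500      # hypothetical sample size, roughly six years of daily observations
max_lag = 30
x = rng.normal(size=n)  # independent draws, so the true autocorrelations are zero

# Sample autocorrelations at lags 1..max_lag
d = x - x.mean()
gamma_0 = np.sum(d * d) / n
acf = np.array(
    [np.sum(d[h:] * d[: n - h]) / n / gamma_0 for h in range(1, max_lag + 1)]
)

# Approximate 95% band for a series with no autocorrelation
band = 1.96 / np.sqrt(n)
n_outside = int(np.sum(np.abs(acf) > band))
print(f"band = +/-{band:.4f}, lags outside the band: {n_outside} of {max_lag}")
```

With 30 lags and a 95% band, one or two bars can fall outside the band by chance even when the data are independent, so an isolated significant bar at a distant lag should be interpreted with some care.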

In figure 8, we can see that Google stock and the U.S. Treasury 10-year bond yield have high and slowly declining autocorrelations, which usually means there is some sort of trend in the time series. It also means these time series are not stationary. We will usually proceed to detrend the time series to make it stationary before modeling. The next lesson shows how to detrend a time series, using the U.S. dollar index daily return as the example.

We see from the ACF for the U.S. dollar index daily return that lag 17 and lag 26 have significant autocorrelations. This piece of information will help us later to specify a time series model.

Now let’s look at PACF plots for Google stock price, U.S. 10-year Treasury bond yield, and dollar index daily return.


**Figure 9: PACF Plots for Google Stock Price, U.S. Treasury 10-Year Bond Yield, and U.S. Dollar Index Daily Return**


In [None]:
# PACF plots for GOOGLE, U.S. UST10Y and DXY_R
fig, ax = plt.subplots(1, 3, figsize=(18, 7))
sm.graphics.tsa.plot_pacf(goog["GOOGLE"], title="Google stock PACF", lags=30, ax=ax[0])
sm.graphics.tsa.plot_pacf(
    ust10["UST10Y"], title="US 10-Year Treasury Bond Yield PACF", lags=30, ax=ax[1]
)
sm.graphics.tsa.plot_pacf(
    dxy["DXY_R"], title="US Dollar Index Daily Return PACF", lags=30, ax=ax[2]
)
plt.show()

You read PACF plots as you read ACF plots, except that each bar now represents the PACF. From figure 9, we see that for Google stock, the PACF at lag 18 is positive and significant. For the U.S. Treasury 10-year bond yield, the PACF at lag 8 and lag 19 is significant, and the PACF at lag 19 is negative. For the U.S. dollar index daily return, the PACF at lag 16 is negative and significant, and the PACF at lag 26 is positive and significant.

ACF and PACF plots will be applied frequently in the following lessons. They are important tools for providing information about a time series that can help us to specify a time series model.


## **8. Conclusion**

This module introduces time series analysis. In this lesson, we started by observing a few time series graphs to give an idea of what time series are about. We then formally defined time series and showed some math operations on time series. Then, we introduced autocorrelation and partial autocorrelation of time series and discussed what it means for a time series to be stationary. Finally, we completed the lesson with ACF and PACF plots and how to read them. In the next lesson, we will start to learn some basic time series statistical models.


---
Copyright 2023 WorldQuant University. This
content is licensed solely for personal use. Redistribution or
publication of this material is strictly prohibited.
