## **1. Ergodicity**

In the real world we observe only one realization of the data, yet we usually expect that the more data points we collect, the better informed we are about the structure that generates them. This is the general idea behind the Law of Large Numbers (LLN). The property of **ergodicity** ensures that the characteristics of a stochastic process can be learned from a large number of data points under the LLN. In other words, <u> as we get more data points, we gain more information about the stochastic process that generates them, provided the process is **ergodic**. </u>

The intuitive definition of **ergodicity** is as follows: a stochastic process $\{ X_t \}_{t=-\infty }^\infty $ is ergodic <u> if any two collections of random variables that lie far apart in the sequence are essentially **independent**. </u>


When the time difference between any two random variables from the process $X_t$ gets large enough, their correlation approaches $0$. This diminishing dependence as the time gap widens illustrates the concept of **ergodicity**. It also means that observations far apart in time provide distinct information about the stochastic process to which they belong. Hence, ergodicity ensures that the LLN works for the stochastic process.
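
The fading correlation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the AR(1) process, the coefficient `phi = 0.8`, the sample size, and the helper `sample_autocorr` are all hypothetical choices made for illustration, not part of the lesson's model.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Assumed example process: stationary AR(1), X_t = phi * X_{t-1} + u_t with |phi| < 1
phi, n = 0.8, 50_000
u = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + u[t]

def sample_autocorr(series, lag):
    """Sample correlation between X_t and X_{t-lag} (hypothetical helper)."""
    return np.corrcoef(series[lag:], series[:-lag])[0, 1]

# Correlation between observations that are 1, 5, 20, and 50 periods apart
autocorrs = {lag: sample_autocorr(x, lag) for lag in (1, 5, 20, 50)}
for lag, rho in autocorrs.items():
    print(f"lag {lag:>2}: sample correlation = {rho:.3f}, theoretical = {phi ** lag:.3f}")
```

For this assumed process the theoretical correlation at lag $j$ is $\phi^j$, so observations 50 periods apart are nearly uncorrelated.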

If a time series $X_t$ is **stationary** and its autocovariances are absolutely summable, $\sum_{j=0}^{\infty } |cov(X_t,X_{t-j})| < \infty $, then $X_t$ is **ergodic** for the mean. This means that, given a large enough data sample from the time series $X_t$, <u> the sample mean of the time series is a consistent estimate of the population mean of the underlying process. </u>
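
A minimal sketch of this consistency property, assuming a stationary AR(1) process with population mean `mu = 2.0` (the process, its parameters, and the sample sizes are hypothetical and chosen only for illustration). The sample mean is computed from a single realization using more and more observations:

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible

# Assumed example process: X_t = c + phi * X_{t-1} + u_t, with population mean mu = c / (1 - phi)
phi, mu, n = 0.5, 2.0, 200_000
c = mu * (1 - phi)
u = rng.standard_normal(n)
x = np.empty(n)
x[0] = mu
for t in range(1, n):
    x[t] = c + phi * x[t - 1] + u[t]

# Sample mean from ONE realization, using the first m observations
sample_means = {m: x[:m].mean() for m in (100, 10_000, 200_000)}
for m, mean in sample_means.items():
    print(f"first {m:>7} observations: sample mean = {mean:.4f} (population mean = {mu})")
```

Because the autocovariances of this assumed AR(1) process decay geometrically, they are absolutely summable, so the sample mean should settle near the population mean as the sample grows.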

## **2. Vector Autoregressive Model (VAR)**

The Vector Autoregressive (VAR) model is a time series model used to forecast two or more time series jointly. A VAR model can also be used to understand the joint dynamic interaction among the time series. It is the multivariate version of an AR model. Let's use an example to explain how it works.

Assume we have two time series $x_t$ and $y_t$ and we would like to understand how they interact with each other. With a lag 2 VAR model, we can write the model as follows:

$$ x_t = \alpha_0 + \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \beta_1 y_{t-1}+\beta_2 y_{t-2} + e_t $$

$$ y_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \theta_1 y_{t-1} + \theta_2 y_{t-2} + \epsilon_t $$

where $e_t \sim \text{white noise} (0, \sigma^2_e)$ and $\epsilon_t \sim \text{white noise} (0, \sigma^2_\epsilon)$, so neither $e_t$ nor $\epsilon_t$ is autocorrelated.
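
As a compact restatement, the two equations can be stacked into matrix form. The symbols $z_t$, $c$, $A_1$, $A_2$, and $u_t$ are notation introduced here for illustration only, and $\phi_2$ and $\theta_2$ denote the coefficients on $x_{t-2}$ and $y_{t-2}$ in the $y_t$ equation:

$$ \underbrace{\begin{pmatrix} x_t \\ y_t \end{pmatrix}}_{z_t} = \underbrace{\begin{pmatrix} \alpha_0 \\ \phi_0 \end{pmatrix}}_{c} + \underbrace{\begin{pmatrix} \alpha_1 & \beta_1 \\ \phi_1 & \theta_1 \end{pmatrix}}_{A_1} \begin{pmatrix} x_{t-1} \\ y_{t-1} \end{pmatrix} + \underbrace{\begin{pmatrix} \alpha_2 & \beta_2 \\ \phi_2 & \theta_2 \end{pmatrix}}_{A_2} \begin{pmatrix} x_{t-2} \\ y_{t-2} \end{pmatrix} + \underbrace{\begin{pmatrix} e_t \\ \epsilon_t \end{pmatrix}}_{u_t} $$

so that $z_t = c + A_1 z_{t-1} + A_2 z_{t-2} + u_t$. Each row of this system reproduces one of the two equations above.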


![image.png](attachment:image.png)

A VAR model has a few key features.

**a. VAR model is not a simultaneous model.**

One key point to note is that the current values of $x_t$ and $y_t$ depend only on the past values of both time series; <u> there is no dependence on the **current** values of the variables. </u> This matters because it means that <u> a VAR model does not require solving both equations at the same time </u> to estimate the parameters of the model. In other words, a VAR model is not a simultaneous model, and we can estimate each equation separately. This structure makes estimation and forecasting very convenient, which is a key reason why the VAR model is popular.
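
Because the right-hand side contains only past values, a one-step-ahead forecast is a direct calculation once the coefficients are known. The sketch below assumes the coefficients are collected into an intercept vector `c` and lag matrices `A1` and `A2`. The function name `var2_forecast` and every numeric value are hypothetical and serve only as an illustration.

```python
import numpy as np

def var2_forecast(c, A1, A2, z_lag1, z_lag2):
    """One-step-ahead forecast of (x_t, y_t) from the two most recent observations."""
    return c + A1 @ z_lag1 + A2 @ z_lag2

# Hypothetical coefficients, for illustration only
c = np.array([0.1, -0.2])                 # intercepts (alpha_0, phi_0)
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])   # lag-1 coefficients: rows are the x and y equations
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])   # lag-2 coefficients: rows are the x and y equations

z_lag1 = np.array([1.0, 2.0])             # assumed (x_{t-1}, y_{t-1})
z_lag2 = np.array([0.5, 1.5])             # assumed (x_{t-2}, y_{t-2})

forecast = var2_forecast(c, A1, A2, z_lag1, z_lag2)
print("forecast of (x_t, y_t):", forecast)
```

No current value of $x_t$ or $y_t$ appears on the right-hand side, so neither forecast needs the other to be solved first.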

**b. $e_t$ and $\epsilon_t$ are contemporaneously correlated.**

<u> $e_t$ and $\epsilon_t$ can be correlated at the same time $t$. </u> However, <u> they are **independent** of their own and each other's **past values**. </u> Both can be affected by factors that are not in the equation system. For example, if $x_t$ and $y_t$ are two stock prices, outside factors such as the outbreak of war or government action on financial markets can affect both $e_t$ and $\epsilon_t$ at the same time. We say that $e_t$ and $\epsilon_t$ are **contemporaneously** (meaning at the same time $t$) correlated. We can also call this equation system a **seemingly unrelated regression (SUR)**. The name **SUR** reflects that, although each equation can be estimated separately, the correlation between the error terms means that the equations are, in fact, <u> not entirely independent. </u>
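
The difference between contemporaneous and lagged correlation can be sketched with simulated errors. The covariance matrix, with a contemporaneous correlation of 0.6, the normal distribution, and the sample size are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(7)  # fixed seed so the sketch is reproducible

# Assumed error covariance: unit variances and a contemporaneous correlation of 0.6
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
errors = rng.multivariate_normal(np.zeros(2), cov, size=100_000)  # draws are independent over time
e, eps = errors[:, 0], errors[:, 1]

same_time = np.corrcoef(e, eps)[0, 1]            # corr(e_t, eps_t)
own_lag = np.corrcoef(e[1:], e[:-1])[0, 1]       # corr(e_t, e_{t-1})
cross_lag = np.corrcoef(e[1:], eps[:-1])[0, 1]   # corr(e_t, eps_{t-1})

print(f"corr(e_t, eps_t)     = {same_time:.3f}")
print(f"corr(e_t, e_(t-1))   = {own_lag:.3f}")
print(f"corr(e_t, eps_(t-1)) = {cross_lag:.3f}")
```

Only the same-period correlation should be clearly different from zero; the two lagged correlations should be close to zero.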



**c. $x_t$ and $y_t$ are stationary and ergodic.**

If $x_t$ and $y_t$ are $I(1)$, we can take the first difference of both time series to make them stationary. A VAR model can also use non-stationary time series directly. However, the model then needs some adjustments before estimation. We will talk more about this situation in the following lessons.
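
First differencing can be sketched as follows, assuming a pure random walk as the $I(1)$ series (a hypothetical example series, not data from the lesson):

```python
import numpy as np

rng = np.random.default_rng(3)  # fixed seed so the sketch is reproducible

# Assumed I(1) series: a random walk built by cumulating white-noise shocks
shocks = rng.standard_normal(10_000)
walk = np.cumsum(shocks)

# First difference: d_t = walk_t - walk_{t-1}, which recovers the stationary shocks
diffed = np.diff(walk)

half = len(walk) // 2
print("random walk variance, first vs second half:", walk[:half].var(), walk[half:].var())
print("differenced variance, first vs second half:", diffed[:half].var(), diffed[half:].var())
```

The differenced series has one fewer observation than the original, and its variance should be roughly the same in both halves of the sample, which is what we would expect from a stationary series.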

**d. Estimate Each Equation Using OLS Estimation.**

As mentioned in point a, each equation can be estimated separately because <u> 1. They are not simultaneous equations </u> and <u>2. They have identical **exogenous** (right-hand-side) variables, </u> namely the same lags of $x_t$ and $y_t$. We can estimate each equation by using OLS estimation, <u> one at a time. </u>
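
Equation-by-equation OLS can be sketched with NumPy alone. Everything numeric below is an assumption for illustration: the "true" coefficients (chosen so that the simulated system is stable), the error covariance, and the sample size. The data are simulated, not real financial series.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

# Hypothetical "true" parameters, chosen only so that the simulated VAR(2) is stable
c = np.array([0.1, -0.2])                        # intercepts (alpha_0, phi_0)
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])          # lag-1 coefficients
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])          # lag-2 coefficients
shock_cov = np.array([[1.0, 0.5], [0.5, 1.0]])   # contemporaneously correlated errors

# Simulate the system; column 0 is x_t and column 1 is y_t
n = 20_000
shocks = rng.multivariate_normal(np.zeros(2), shock_cov, size=n)
z = np.zeros((n, 2))
for t in range(2, n):
    z[t] = c + A1 @ z[t - 1] + A2 @ z[t - 2] + shocks[t]

# Both equations share the same regressors:
# constant, x_{t-1}, y_{t-1}, x_{t-2}, y_{t-2}
X = np.column_stack([np.ones(n - 2), z[1:-1], z[:-2]])

# Estimate each equation separately by OLS, one at a time
coef_x, *_ = np.linalg.lstsq(X, z[2:, 0], rcond=None)
coef_y, *_ = np.linalg.lstsq(X, z[2:, 1], rcond=None)

true_x = np.concatenate([[c[0]], A1[0], A2[0]])
true_y = np.concatenate([[c[1]], A1[1], A2[1]])
print("x equation: estimated", coef_x.round(3), "true", true_x)
print("y equation: estimated", coef_y.round(3), "true", true_y)
```

The two `lstsq` calls use the same regressor matrix `X` but different left-hand-side variables, which mirrors the "identical right-hand-side variables, one equation at a time" argument.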

In our example, we have two time series, two equations, and two lags in each equation. We call this VAR model a **two-dimensional VAR(2)** model. "Two-dimensional" refers to the two time series in the model, and VAR(2) refers to the two lags. In the next section, we are going to show how to run a VAR model for financial assets.
