# VECM and Cointegration Analysis for Interest Rates
*A Practical Application of Cointegrating Relationship Analysis*

## Theoretical Reasoning:

The connection between linear regression and the Vector Error Correction Model stems from the problem of spurious regression in non-stationary time series. A spurious regression arises when we regress a unit-root process on an independent unit-root process: the regression tends to produce apparently significant coefficients and a well-fitted model even though there is no real-world relationship between the variables. Regressing one unit-root process on another is valid only if there exists a linear combination of the two processes that is integrated of order 0, i.e. stationary. The two processes are then said to be cointegrated, and the cointegrating relationship describes the long-run dynamics of the non-stationary series.
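
The spurious regression problem can be illustrated with a small simulation. This is a minimal sketch in base R, assuming two independent Gaussian random walks; the seed, sample size and variable names are arbitrary choices for illustration and are not part of the interest-rate data used later.

```r
# Two independent random walks: by construction there is no true relationship.
set.seed(42)                 # arbitrary seed, only for reproducibility
n <- 500                     # hypothetical sample size
x <- cumsum(rnorm(n))        # I(1) process
y <- cumsum(rnorm(n))        # independent I(1) process

# Regression in levels (at risk of being spurious)
levels_fit   <- lm(y ~ x)
levels_stats <- summary(levels_fit)

# Regression in first differences (both sides are stationary)
dy <- diff(y)
dx <- diff(x)
diff_fit   <- lm(dy ~ dx)
diff_stats <- summary(diff_fit)

# Compare slope t-statistics and R-squared across the two regressions
c(levels_t = levels_stats$coefficients["x", "t value"],
  diff_t   = diff_stats$coefficients["dx", "t value"])
c(levels_r2 = levels_stats$r.squared,
  diff_r2   = diff_stats$r.squared)
```

Re-running the sketch with different seeds gives a feel for how often the levels regression looks significant while the regression in differences does not.
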
Given two non-stationary variables $Y_t,X_t$ such that:

<br/>
<p style="text-align: center;">$Y_t,X_t  \sim I(1)$
<br/>
    
Regressing them onto each other yields two relationships:

1. **Levels:** which makes sense only if a linear combination of the two variables yields a cointegrating relationship. In that case we can think of the linear regression as the equilibrium level of the variable $Y_t$. Otherwise, the regression is spurious.

<br/>
<p style="text-align: center;">$Y_t^E= \alpha +\beta X_t+ \epsilon_t$
<br/>
    
The cointegrating relationship is given by the vector $\beta$ such that:

<br/>
<p style="text-align: center;">$Z_t= [Y_t \;\; X_t]' \sim I(1)$
<br/>
    
<br/>
<p style="text-align: center;">$\beta'=[1 \;\; -\beta_2]$

<br/>
<p style="text-align: center;">$\beta' Z_t=[1 \;\; -\beta_2] \, [Y_t \;\; X_t]' = Y_t- \beta_2 X_t \sim I(0)$
<br/><br/>
    
2. **Differences:** which yields the short-run dynamics of how the time-series variables interact with each other. Because the first differences of $I(1)$ series are $I(0)$, this regression always involves stationary variables.

<br/>
<p style="text-align: center;">$\Delta Y_t= \delta \Delta X_t+u_t$
<br/><br/>

If there is no cointegrating relationship between the two variables, i.e. no linear combination of $Y_t, X_t$ that is stationary, the link between the regression in levels and the one in differences runs in one direction only: a long-run relationship in levels is likely to show up in the differences as well, but a relationship in differences does not imply one in levels. It is therefore important to test for cointegrating relationships in order to establish that the two non-stationary variables are related in the long run, in both levels and differences.

This is done with the cointegration approach developed by Engle and Granger, which specifies an error correction model that accounts for both the short-run dynamics (embedded in the differences relationship) and the long-run dynamics (embedded in the levels relationship at equilibrium). This approach has limitations, however, chiefly that it considers only one cointegrating relationship at a time. It is therefore useful to use the Vector Error Correction Model (VECM) instead, which captures the short-run and long-run dynamics of non-stationary variables across several cointegrating relationships, i.e. linear combinations. To derive it, we start from a VAR process and rearrange the terms algebraically to arrive at the VECM.
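
The two-step idea behind the Engle and Granger approach can be sketched in base R. The example below is an illustration under stated assumptions: the data are simulated, the long-run coefficients (2 and 0.8) are hypothetical, and the residual-based stationarity test that a full Engle-Granger procedure requires is omitted.

```r
# Simulate a cointegrated pair: y - 0.8 * x is stationary by construction.
set.seed(1)                          # arbitrary seed, only for reproducibility
n <- 500                             # hypothetical sample size
x <- cumsum(rnorm(n))                # I(1) driver
y <- 2 + 0.8 * x + rnorm(n)          # hypothetical long-run relationship plus I(0) noise

# Step 1: the levels regression estimates the long-run (equilibrium) relationship.
long_run <- lm(y ~ x)
u <- residuals(long_run)             # estimated deviation from equilibrium

# Step 2: error correction model in differences with the lagged deviation.
dy    <- diff(y)                     # change in y at t = 2, ..., n
dx    <- diff(x)                     # change in x at t = 2, ..., n
u_lag <- u[-n]                       # u_{t-1}, aligned with dy and dx
ecm   <- lm(dy ~ dx + u_lag)

coef(long_run)                       # long-run coefficients
coef(ecm)                            # short-run coefficient and speed of adjustment
```

A negative coefficient on `u_lag` indicates that deviations from the equilibrium are corrected in the following period, which is the error-correcting behaviour described above.
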


<br/>
<p style="text-align: center;"> $VAR(p):$ $X_t=A_0+\sum_{i=1}^{p} A_i X_{t-i}+ \epsilon_t$ 
    
<br/>
<p style="text-align: center;"> $VECM(p-1):$ $\Delta X_t=A_0+\Pi X_{t-1}+\sum_{i=1}^{p-1} C_i\Delta X_{t-i}+ \epsilon_t$
<br/><br/>
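
The algebraic step from the VAR to the VECM can be written out for the simplest non-trivial case. The derivation below is a sketch for $p = 2$, using the same symbols as the two equations above; the general expressions in the last line follow the same pattern.

$$
\begin{aligned}
X_t &= A_0 + A_1 X_{t-1} + A_2 X_{t-2} + \epsilon_t \\
X_t - X_{t-1} &= A_0 + (A_1 - I) X_{t-1} + A_2 X_{t-2} + \epsilon_t \\
\Delta X_t &= A_0 + (A_1 + A_2 - I) X_{t-1} - A_2 (X_{t-1} - X_{t-2}) + \epsilon_t \\
\Delta X_t &= A_0 + \Pi X_{t-1} + C_1 \Delta X_{t-1} + \epsilon_t,
\qquad \Pi = A_1 + A_2 - I, \quad C_1 = -A_2 \\
\text{In general:}\quad \Pi &= \sum_{i=1}^{p} A_i - I, \qquad C_i = -\sum_{j=i+1}^{p} A_j
\end{aligned}
$$

A VAR with two lags in levels therefore yields a VECM with a single lagged difference.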

    
The VECM representation is useful for uncovering the cointegrating relationships between variables in a multivariate setting, because it lets us examine how the short-run and long-run dynamics of the cointegrated processes interact. The number of cointegrating relationships $r$ is given by the rank of the matrix $\Pi$. For cointegration, the rank must lie strictly between 0 and the number of variables $n$, i.e. $0 < r < n$: if $r = 0$ there is no cointegration, and if $r = n$ the variables are already stationary in levels. Once $r$ has been determined, we can decompose $\Pi$ into a product of two matrices, **AB**, where **A** is $n \times r$ and **B** is $r \times n$. This is useful because each row of **B** plays the role of the cointegrating vector $\beta'$ illustrated above.


<br/>
<p style="text-align: center;">$\Pi=A*B$
<br/><br/>

**B** = The $r$ linearly independent rows that, when multiplied by $X_t$, yield $r$ stationary long-term relationships.

**A** = The response of the change in each variable to deviations from the long-term relationships in $BX_t$.
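
The decomposition can be checked numerically. The sketch below builds $\Pi = AB$ from the rounded adjustment coefficients (ECT1, ECT2) and cointegrating vectors reported in the VECM summary at the end of this notebook, treating the entry printed as `5.829457e-17` as zero, and confirms that the product has rank 2.

```r
# A: n x r matrix of adjustment coefficients (ECT1, ECT2 columns of the VECM summary)
A <- matrix(c(-0.3478,  0.3937,
              -0.1019, -0.0007,
              -0.0965,  0.0389),
            nrow = 3, byrow = TRUE)

# B: r x n matrix whose rows are the cointegrating vectors (r1, r2)
B <- matrix(c(1, 0, -0.9827959,
              0, 1, -0.9930257),
            nrow = 2, byrow = TRUE)

# Pi is n x n but has reduced rank r = 2
Pi_mat <- A %*% B
dim(Pi_mat)
qr(Pi_mat)$rank
```

Because the third column of `Pi_mat` is a linear combination of the first two, the matrix is singular even though it is $3 \times 3$.
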

More succinctly, and in the case $r=1$, i.e. there is only one cointegrating relationship, the equation for variable $i$ is:

<br/>
<p style="text-align: center;">$\Delta X_{it}=\alpha_i U_{t-1}+\sum_{j=1}^{p-1} C_{j,i}\,\Delta X_{t-j}+ \epsilon_{it}$
<br/><br/>

$U_{t-1}$ = Deviation from the long-run equilibrium in the previous period, $U_{t-1} = BX_{t-1}$. The error-correcting mechanism pushes these deviations back toward the long-run equilibrium.

$\alpha_i$ = Speed at which variable $i$ adjusts to deviations from the long-run equilibrium.

$C_{j,i} \Delta X_{t-j}$ = Captures the short-run dynamics in the process, where $C_{j,i}$ denotes the $i$-th row of $C_j$.

Therefore, we can see that the connection between linear regression and the VECM process is one of investigating both the short-run and long-run dynamics of non-stationary variables that have cointegrating relationships.


### VAR and Differencing

In practice, modelling the stationary changes in the weekly Euro-Dollar interest rates with a VAR process "throws away" all information contained in the levels. Differencing is nevertheless necessary if we want to apply the VAR approach to stationary series.
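
A small example makes the loss of level information concrete. The sketch below uses the first six `WED1` observations shown in the data preview of this notebook and a hypothetical second series that sits two percentage points higher; after differencing, the two series are indistinguishable.

```r
# First six WED1 observations from the data preview
rate_a <- c(8.56, 8.25, 8.19, 8.19, 8.20, 8.19)

# Hypothetical series with the same movements but a higher level
rate_b <- rate_a + 2

# The first differences coincide, so the level gap is no longer visible
diff(rate_a)
diff(rate_b)
all.equal(diff(rate_a), diff(rate_b))
```
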

### VECM and Cointegration

With the VECM approach, by contrast, we can simultaneously model both the levels and the differences of the series, provided the levels are cointegrated.

In [3]:
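# readxl loads the spreadsheet, urca provides the Johansen test, tsDyn estimates the VECM
library(readxl)
library(urca)
library(tsDyn)

# Weekly Euro-Dollar rates for the 1-, 3- and 6-month maturities
euro_dollar_rates <- read_excel("./Module_5_Data_Euro-Dollar_Rates.xls", sheet = "Weekly,_Ending_Friday")
head(euro_dollar_rates)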

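| DATE (dttm) | WED1 (dbl) | WED3 (dbl) | WED6 (dbl) |
| ----- | ---- | ---- | ---- |
| 1989-12-29 | 8.56 | 8.33 | 8.19 |
| 1990-01-05 | 8.25 | 8.25 | 8.17 |
| 1990-01-12 | 8.19 | 8.19 | 8.16 |
| 1990-01-19 | 8.19 | 8.2 | 8.25 |
| 1990-01-26 | 8.2 | 8.25 | 8.31 |
| 1990-02-02 | 8.19 | 8.25 | 8.31 |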


## Johansen Test

The Johansen test involves testing for the number of statistically non-zero eigenvalues of the matrix $\Pi$. For this we estimate an unrestricted VECM and run two tests: the trace test and the maximum eigenvalue test developed by Johansen (1988).
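
Both statistics are simple functions of the estimated eigenvalues: $\lambda_{max}(r) = -T \ln(1-\hat\lambda_{r+1})$ and $\lambda_{trace}(r) = -T \sum_{i=r+1}^{n} \ln(1-\hat\lambda_i)$. The sketch below recomputes them from the eigenvalues printed in the `ca.jo` summaries in this notebook. The effective sample size of 888 - 9 = 879 is an assumption (full sample minus the `K = 9` lags), chosen because it matches the reported statistics.

```r
# Eigenvalues as printed in the ca.jo summaries
lambda <- c(0.146156486, 0.065201628, 0.006132301)

# Assumption: effective sample = 888 observations minus K = 9 lags
T_eff <- 888 - 9

# Maximum eigenvalue statistics for H0: r = 0, r <= 1, r <= 2
max_eigen <- -T_eff * log(1 - lambda)

# Trace statistics: cumulative sums of the same terms, from the smallest eigenvalue up
trace_stat <- rev(cumsum(rev(max_eigen)))

round(rbind(max_eigen, trace_stat), 2)
```

The first column corresponds to the `r = 0` row of the test tables and the last column to the `r <= 2` row.
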

In [4]:
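# Series in levels (the d_ prefix is only a naming choice; no differencing is applied)
d_one_month_rate <- euro_dollar_rates$WED1
d_three_month_rate <- euro_dollar_rates$WED3
d_six_month_rate <- euro_dollar_rates$WED6

# Combine the three maturities into one matrix
d_rates <- cbind(d_one_month_rate, d_three_month_rate, d_six_month_rate)

# Maximum eigenvalue test with K = 9 lags in levels and no deterministic term in the cointegrating relation
jotest1 <- ca.jo(d_rates, type = "eigen", K = 9, ecdet = "none", spec = "longrun")
summary(jotest1)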


###################### 
# Johansen-Procedure # 
###################### 

Test type: maximal eigenvalue statistic (lambda max) , with linear trend 

Eigenvalues (lambda):
[1] 0.146156486 0.065201628 0.006132301

Values of teststatistic and critical values of test:

           test 10pct  5pct  1pct
r <= 2 |   5.41  6.50  8.18 11.65
r <= 1 |  59.27 12.91 14.90 19.19
r = 0  | 138.89 18.90 21.07 25.75

Eigenvectors, normalised to first column:
(These are the cointegration relations)

                      d_one_month_rate.l9 d_three_month_rate.l9
d_one_month_rate.l9             1.0000000              1.000000
d_three_month_rate.l9          -1.6015884              2.904978
d_six_month_rate.l9             0.6080626             -3.886571
                      d_six_month_rate.l9
d_one_month_rate.l9              1.000000
d_three_month_rate.l9           -1.257257
d_six_month_rate.l9              3.767824

Weights W:
(This is the loading matrix)

                     d_one_month_rate.l9 d_three

In [5]:
# Trace test with the same specification as the maximum eigenvalue test
jotest2 <- ca.jo(d_rates, type = "trace", K = 9, ecdet = "none", spec = "longrun")
summary(jotest2)


###################### 
# Johansen-Procedure # 
###################### 

Test type: trace statistic , with linear trend 

Eigenvalues (lambda):
[1] 0.146156486 0.065201628 0.006132301

Values of teststatistic and critical values of test:

           test 10pct  5pct  1pct
r <= 2 |   5.41  6.50  8.18 11.65
r <= 1 |  64.67 15.66 17.95 23.52
r = 0  | 203.56 28.71 31.52 37.22

Eigenvectors, normalised to first column:
(These are the cointegration relations)

                      d_one_month_rate.l9 d_three_month_rate.l9
d_one_month_rate.l9             1.0000000              1.000000
d_three_month_rate.l9          -1.6015884              2.904978
d_six_month_rate.l9             0.6080626             -3.886571
                      d_six_month_rate.l9
d_one_month_rate.l9              1.000000
d_three_month_rate.l9           -1.257257
d_six_month_rate.l9              3.767824

Weights W:
(This is the loading matrix)

                     d_one_month_rate.l9 d_three_month_rate.l9
d_one_month

## Johansen Test Results

These tests show strong evidence of cointegration, and the estimated rank satisfies the requirement we place on $\Pi$: with $r = 2$ cointegrating relationships and $n = 3$ variables, $0 < 2 < 3$.

### Test decision rule

* If test value > critical value (10%, 5%, 1%) --> Reject $H_0$

* If test value < critical value (10%, 5%, 1%) --> Fail to reject $H_0$
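
The decision rule can be applied sequentially, starting from $r = 0$ and stopping at the first null hypothesis that is not rejected. The sketch below does this for the trace statistics and the 5% critical values reported in the `ca.jo` output; the choice of the 5% level and the variable names are illustrative assumptions.

```r
# Trace statistics and 5% critical values from the ca.jo summary, ordered from r = 0 upward
trace_stat <- c("r = 0" = 203.56, "r <= 1" = 64.67, "r <= 2" = 5.41)
crit_5pct  <- c("r = 0" = 31.52,  "r <= 1" = 17.95, "r <= 2" = 8.18)

# Reject H0 whenever the test value exceeds the critical value
reject <- trace_stat > crit_5pct

# Estimated rank: index of the first non-rejected null, minus one.
# If every null were rejected, the rank would equal the number of variables.
r_hat <- if (all(reject)) length(reject) else unname(which(!reject)[1]) - 1

reject
r_hat
```
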

#### Relationship between hypothesis and cointegration

| Null Hypothesis ($H_0$) | Trace statistic | 1% | 5% | 10% | Alternative Hypothesis ($H_1$) | Results |
| ----- | ---- | ---- | ---- | ---- | ---- | ---- |
| r <= 2 | 5.41 | 5.41 < 11.65 | 5.41 < 8.18 | 5.41 < 6.50 | r > 2: fail to reject the null hypothesis | at most 2 cointegrating relationships |
| r <= 1 | 64.67 | 64.67 > 23.52 | 64.67 > 17.95 | 64.67 > 15.66 | r > 1: reject the null hypothesis | more than 1 cointegrating relationship |
| r = 0 | 203.56 | 203.56 > 37.22 | 203.56 > 31.52 | 203.56 > 28.71 | r > 0: reject the null hypothesis | more than 0 cointegrating relationships |

Both tests strongly reject the hypotheses of no cointegrating relationship and of at most 1 cointegrating relationship, but fail to reject the hypothesis of at most 2.

We conclude that there are 2 cointegrating relationships between the variables. We can now estimate the VECM imposing this restriction.

In [6]:
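# Estimate the VECM by maximum likelihood with r = 2 cointegrating relationships.
# In tsDyn, lag = 1 is the number of lagged differences in the VECM representation.
vecm_fit <- VECM(d_rates, lag = 1, r = 2, include = "const", estim = "ML", LRinclude = "none")
summary(vecm_fit)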

#############
###Model VECM 
#############
Full sample size: 888 	End sample size: 886
Number of variables: 3 	Number of estimated slope parameters 18
AIC -15291.93 	BIC -15196.19 	SSR 20.12427
Cointegrating vector (estimated by ML):
   d_one_month_rate d_three_month_rate d_six_month_rate
r1     1.000000e+00                  0       -0.9827959
r2     5.829457e-17                  1       -0.9930257


                            ECT1               ECT2               
Equation d_one_month_rate   -0.3478(0.0325)*** 0.3937(0.0585)***  
Equation d_three_month_rate -0.1019(0.0242)*** -0.0007(0.0436)    
Equation d_six_month_rate   -0.0965(0.0256)*** 0.0389(0.0461)     
                            Intercept          d_one_month_rate -1
Equation d_one_month_rate   -0.0112(0.0039)**  0.0278(0.0517)     
Equation d_three_month_rate -0.0127(0.0029)*** -0.1679(0.0385)*** 
Equation d_six_month_rate   -0.0094(0.0031)**  -0.1466(0.0408)*** 
                            d_three_month_rate -1 d_six_mont