## Task 2


The data file FwdSpot1.dat contains monthly spot and 1-month forward exchange rates, and
the data file FwdSpot3.dat contains monthly spot and 3-month forward exchange rates, in $/foreign
currency, for the British Pound, French Franc and Japanese Yen, for 1973:3 to 1992:8 (234
observations). Each row contains the month, the year, the spot rates for Pound, Franc, and
Yen, and then the forward rates for the same three currencies. Download the data, then
take logarithms of the rates.

In [1]:
from scipy.stats import chi2, norm # for tests

import matplotlib.pyplot as plt # for graphs
import numpy as np # for matrix operations
import pandas as pd # for DataFrames
import statsmodels.api as sm # for statistics

In [2]:
data = pd.read_csv('FwdSpot1.dat', header=None,
                    sep=' ', skipinitialspace=True)
# skipinitialspace=True skips the leading spaces; otherwise
# we get a column of NaNs and a column of dates
data

Unnamed: 0,0,1,2,3,4,5,6,7
0,3,73,2.4755,0.2203,0.003752,2.469621,0.220649,0.003780
1,4,73,2.4869,0.2187,0.003763,2.482403,0.218873,0.003766
2,5,73,2.5720,0.2305,0.003788,2.568699,0.230600,0.003803
3,6,73,2.5825,0.2410,0.003805,2.578110,0.241000,0.003834
4,7,73,2.5072,0.2424,0.003772,2.501225,0.242823,0.003823
...,...,...,...,...,...,...,...,...
229,4,92,1.7793,0.1801,0.007513,1.769202,0.179159,0.007508
230,5,92,1.8315,0.1853,0.007834,1.822495,0.184383,0.007829
231,6,92,1.9110,0.1966,0.008019,1.900904,0.195526,0.008013
232,7,92,1.9190,0.2004,0.007852,1.908094,0.199241,0.007847


Note that the dates are in ascending order (column 0 is the month, column 1 is the year).

In [3]:
# clean the data
data = data.iloc[:, 2:]
# rename columns according to the task
data.rename(columns={2: 'BP', 3: 'FF', 4: 'JY', 5: 'BP_f', 6: 'FF_f', 7: 'JY_f'}, inplace=True)
# take logarithms
data = data.apply(np.log)
data.head(3)

Unnamed: 0,BP,FF,JY,BP_f,FF_f,JY_f
0,0.906442,-1.512765,-5.585466,0.904065,-1.511182,-5.578031
1,0.911037,-1.520054,-5.582539,0.909227,-1.519264,-5.581742
2,0.944684,-1.467504,-5.575917,0.9434,-1.467071,-5.571965


We are interested in testing the conditional unbiasedness hypothesis that:

$E_t[s_{t+k}] = f_{t,k} \qquad (1)$

where:
- $s_t$ is the spot rate at time t
- $f_{t,k}$ is the forward rate for k-month forwards at time t
- $E_t$ denotes the mathematical expectation conditional on time t information

The statement above says that the forward rate is a conditionally unbiased predictor of the future spot exchange rate.

To test this theory, one nests (1) within the following econometric model:

$s_{t+k} - s_t = \beta + \gamma (f_{t,k} - s_t) + \epsilon_{t+k} \qquad (2)$

$E_t[\epsilon_{t+k}] = 0$

and tests $H_0$: $\beta = 0$, $\gamma = 1$.

The current spot rate is subtracted to achieve stationarity.
The difference $s_{t+k} - s_t$ is called the exchange rate depreciation, the difference $f_{t,k} - s_t$ the
forward premium. 

For the three currencies and both types of forwards, estimate (2) by OLS
and test for conditional unbiasedness. Do not forget HAC variance estimation whenever
appropriate; explain why it is needed or not needed. Discuss the test results.

### OLS model

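In terms of the exchange rate depreciation $ed_{t+k} = s_{t+k} - s_t$ and the forward premium $fp_t = f_{t,k} - s_t$, the model estimated by OLS is:

$ed_{t+k} = \beta + \gamma fp_t + \epsilon_{t+k}$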

### Beta:

$\hat{\beta} = (X' X)^{-1} X' Y$

$\hat{\beta}$ is the vector of estimated coefficients (here it stacks the intercept $\beta$ and the slope $\gamma$ of the model).

$X$ is the matrix of predictor variables (also known as the design matrix).

$Y$ is the vector of the target values.

In [4]:
def beta_est(self):
    beta_hat = np.linalg.inv(self.X.T @ self.X) @ self.X.T @ self.Y
    return beta_hat
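
As a quick sanity check of the formula above, the following sketch applies it to simulated data and compares the result with `np.linalg.lstsq`. The function name `ols_beta`, the simulated sample and the "true" coefficients are illustrative assumptions and are not part of the task data.

```python
import numpy as np

def ols_beta(X, Y):
    """OLS coefficients from the normal equations (X'X) b = X'Y."""
    # solve() avoids forming the explicit inverse, which is numerically safer
    return np.linalg.solve(X.T @ X, X.T @ Y)

# simulated data (assumption): intercept 0.5, slope -2.0, small noise
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])
Y = 0.5 - 2.0 * x + 0.1 * rng.normal(size=200)

beta_hat = ols_beta(X, Y)
beta_ref = np.linalg.lstsq(X, Y, rcond=None)[0]  # reference solution
print(beta_hat, beta_ref)
```

Both routes should agree up to floating-point error; solving the linear system is usually preferred to `np.linalg.inv` when $X'X$ is poorly conditioned.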

### Covariance matrix (White)

To get the estimated covariance matrix (White estimator):

$\hat{e} = Y - X\hat{\beta}$

$\hat{V}_{\beta} = \hat{Q}_{xx}^{-1}\hat{V}_{xe}\hat{Q}_{xx}^{-1}$; this is our covariance matrix.

$\hat{V}_{xe} = \frac{1}{n}X' \Omega X$, where $\Omega$ is a diagonal matrix with the squared residual $\hat{e}_i^2$ in the $i$-th diagonal position.

$\hat{Q}_{xx}^{-1} = \left(\frac{1}{n}X'X\right)^{-1}$

In [5]:
# def White_est(self, beta_hat):
#     n = self.X.shape[0]
#     e_hat = self.Y - self.X @ beta_hat
#     QxxInv = np.linalg.inv(self.X.T @ self.X / n)
#     Omega = np.diag((e_hat**2).reshape(n))
#     V_ex = self.X.T @ Omega @ self.X / n
#     V_b = QxxInv @ V_ex @ QxxInv
#     return V_b
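
The sandwich formula can also be sketched as a standalone function. The version below is only an illustration on simulated heteroskedastic data; the name `white_cov` and the data-generating process are assumptions. It avoids building the $n \times n$ matrix $\Omega$ by weighting the rows of $X$ directly.

```python
import numpy as np

def white_cov(X, Y):
    """Return (beta_hat, V_beta) with V_beta = Qxx^{-1} V_xe Qxx^{-1}."""
    n = X.shape[0]
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    e_hat = Y - X @ beta_hat
    Qxx_inv = np.linalg.inv(X.T @ X / n)
    # equals X' Omega X / n, computed without the n x n diagonal matrix
    V_xe = (X * e_hat[:, None] ** 2).T @ X / n
    return beta_hat, Qxx_inv @ V_xe @ Qxx_inv

# simulated data (assumption): the error variance grows with |x|
rng = np.random.default_rng(1)
x = rng.normal(size=500)
X = np.column_stack([np.ones_like(x), x])
Y = 1.0 + 0.5 * x + np.abs(x) * rng.normal(size=500)

beta_hat, V_beta = white_cov(X, Y)
se = np.sqrt(np.diag(V_beta) / X.shape[0])  # standard errors
print(beta_hat, se)
```

Row-weighting gives the same matrix as `X.T @ np.diag(e_hat**2) @ X / n` but needs only $O(n)$ memory.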

### Covariance matrix (HAC: Newey-West)

To get the estimated covariance matrix (HAC: Newey-West):

$Z_t = x_t\hat{e}_t$, where $x_t$ is the $t$-th row of $X$ (as a column vector)

lags: $j = 0, \pm 1, ..., \pm (T-1)$

$\hat{\Gamma}_j = \frac{1}{T} \sum_{t = \max(1,1+j)}^{\min(T,T+j)} (Z_t - \overline{Z})(Z_{t-j} - \overline{Z})'\xrightarrow{p} \Gamma_j$

then the estimator itself:

$\hat{V}_z^{NW} = \sum_{j=-m}^{m}\left(1-\frac{|j|}{m+1}\right)\hat{\Gamma}_j$

and, as before, $\hat{V}_{\beta} = \hat{Q}_{xx}^{-1}\hat{V}_z^{NW}\hat{Q}_{xx}^{-1}$,

where

$m = \lfloor 4\left(\frac{T}{100}\right)^{1/3} \rfloor$

$\lfloor \cdot \rfloor$ denotes the integer part
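
To make the truncation rule and the kernel weights concrete, here is a small sketch that evaluates both. The helper names `nw_bandwidth` and `bartlett_weights` and the sample length `T = 230` are illustrative assumptions.

```python
import numpy as np

def nw_bandwidth(T):
    """Truncation lag m = floor(4 * (T / 100) ** (1/3)), the rule stated above."""
    return int(4 * (T / 100) ** (1 / 3))

def bartlett_weights(m):
    """Weights 1 - |j| / (m + 1) for j = 0, ..., m."""
    return 1 - np.arange(m + 1) / (m + 1)

T = 230  # hypothetical sample length, of the same order as the data used here
m = nw_bandwidth(T)
weights = bartlett_weights(m)
print(m, weights)
```

For `T = 230` the rule gives `m = 5`, so the weights fall linearly from 1 at lag 0 to 1/6 at lag 5; autocovariances beyond lag 5 are ignored.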

In [6]:
# def G_est(self, Z, j):
#     Z_mean = Z.mean(axis=0)
#     T = Z.shape[0]
#     i_min = max(0, j)
#     i_max = min(T, T + j)
#     Z_t = Z[i_min:i_max] - Z_mean
#     Z_t_j = Z[i_min-j:i_max-j] - Z_mean
#     G_j = Z_t.T @ Z_t_j / T
#     return G_j

# def NW_est(self, beta_hat):
#     n = self.X.shape[0]
#     e_hat = self.Y - self.X @ beta_hat
#     Z = self.X * e_hat[:, None]
#     T = n
#     m = int(4 * (T / 100) ** (1/3))
#     V = 0
#     for j in range(-m, m + 1):
#         V += (1 - abs(j) / (m + 1)) * self.G_est(Z, j)
#     QxxInv = np.linalg.inv(self.X.T @ self.X / n)
#     V_b = QxxInv @ V @ QxxInv
#     return V_b
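
The same estimator can be sketched as one standalone function. This is an illustration under assumptions, not the module's code: the name `newey_west_cov` and the simulated AR(1) errors are hypothetical. It uses $\hat{\Gamma}_{-j} = \hat{\Gamma}_j'$ to loop over positive lags only.

```python
import numpy as np

def newey_west_cov(X, Y, m):
    """Return (beta_hat, V_beta) from Bartlett-weighted autocovariances of Z_t = x_t * e_t."""
    T = X.shape[0]
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    e_hat = Y - X @ beta_hat
    Z = X * e_hat[:, None]
    Z = Z - Z.mean(axis=0)
    V = Z.T @ Z / T                      # Gamma_0
    for j in range(1, m + 1):
        G_j = Z[j:].T @ Z[:-j] / T       # Gamma_j
        # Gamma_{-j} is the transpose of Gamma_j and gets the same weight
        V += (1 - j / (m + 1)) * (G_j + G_j.T)
    Qxx_inv = np.linalg.inv(X.T @ X / T)
    return beta_hat, Qxx_inv @ V @ Qxx_inv

# simulated data (assumption): regression errors follow an AR(1) process
rng = np.random.default_rng(2)
T = 400
x = rng.normal(size=T)
shocks = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + shocks[t]
X = np.column_stack([np.ones(T), x])
Y = 1.0 + 0.5 * x + u

m = int(4 * (T / 100) ** (1 / 3))
beta_hat, V_nw = newey_west_cov(X, Y, m)
_, V_m0 = newey_west_cov(X, Y, 0)        # m = 0 drops all autocovariance terms
print(np.sqrt(np.diag(V_nw) / T), np.sqrt(np.diag(V_m0) / T))
```

With `m = 0` only $\hat{\Gamma}_0$ remains, so the result coincides with the White estimator whenever the regression includes a constant (the scores $x_t\hat{e}_t$ then sum to zero).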

Get the standard errors:

$se(\hat{\beta}_j) = \sqrt{\frac{1}{n}[\hat{V}_{\beta}]_{jj}}$

In [7]:
# def SD(self, V):
#     n = self.X.shape[0]
#     return [(np.diag(V/ n)[i]) ** 0.5 for i in range(V.shape[0])]

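Now let's construct the Wald test:

$W = n\,h(\hat{\beta})'(H\hat{V}_{\beta}H')^{-1}h(\hat{\beta})$

$h(\hat{\beta}) = H\hat{\beta} - q$, where $q$ is the right-hand side of the restrictions

Note that

$ed_{t+k} = \beta + \gamma fp_t + \epsilon_{t+k}$

$H_0$: $\beta = 0$, $\gamma = 1$.

Then H = [[0, 1], [1, 0]] covers both restrictions ([0, 1] selects $\gamma$, [1, 0] selects $\beta$), and q = [1, 0] ($\gamma = 1$, $\beta = 0$). Under $H_0$, $W$ is asymptotically $\chi^2$ distributed with 2 degrees of freedom (the number of restrictions).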

In [8]:
# example
# H = np.array([[0, 1], [1, 0]])
# q = np.array([1, 0])

# def Wald_test(self, beta_hat, V_hat, H, q):
#     n = self.X.shape[0]
#     h = H @ beta_hat - q
#     W = n * h.T @ np.linalg.inv(H @ V_hat @ H.T) @ h
#     pv2 = chi2.sf(W, H.shape[0])
#     return pv2
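
A worked numerical example may help to see how the statistic is assembled. All inputs below are hypothetical and chosen only so the arithmetic is easy to follow; `wald_stat` is an illustrative name. For exactly two restrictions the $\chi^2(2)$ survival function has the closed form $e^{-W/2}$, so this sketch does not need `scipy`.

```python
import math
import numpy as np

def wald_stat(beta_hat, V_hat, H, q, n):
    """Wald statistic W = n * h' (H V H')^{-1} h with h = H beta_hat - q."""
    h = H @ beta_hat - q
    return float(n * h @ np.linalg.solve(H @ V_hat @ H.T, h))

# hypothetical inputs, for illustration only
beta_hat = np.array([0.01, 0.40])       # (intercept, slope)
V_hat = np.array([[0.04, 0.00],
                  [0.00, 9.00]])        # asymptotic covariance, before dividing by n
n = 100
H = np.array([[0, 1], [1, 0]])          # row 1 selects the slope, row 2 the intercept
q = np.array([1, 0])                    # slope = 1, intercept = 0

W = wald_stat(beta_hat, V_hat, H, q, n)
# chi-square(2) survival function in closed form (valid for 2 restrictions only)
p_value = math.exp(-W / 2)
print(W, p_value)
```

Here $h = (-0.6, 0.01)'$, so $W = 100 \cdot (0.36/9 + 0.0001/0.04) = 4.25$ and the p-value is $e^{-2.125} \approx 0.119$; with these made-up numbers $H_0$ would not be rejected at the 5% level.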

## Class with all these functions

In [9]:
# class Estimators:
#     def __init__(self, X, Y):
#         self.X = X
#         self.Y = Y
        
#     def beta_est(self):
#         beta_hat = np.linalg.inv(self.X.T @ self.X) @ self.X.T @ self.Y
#         return beta_hat
    
#     def White_est(self, beta_hat):
#         n = self.X.shape[0]
#         e_hat = self.Y - self.X @ beta_hat
#         QxxInv = np.linalg.inv(self.X.T @ self.X / n)
#         Omega = np.diag((e_hat**2).reshape(n))
#         V_ex = self.X.T @ Omega @ self.X / n
#         V_b = QxxInv @ V_ex @ QxxInv
#         return V_b
    
#     def G_est(self, Z, j):
#         Z_mean = Z.mean(axis=0)
#         T = Z.shape[0]
#         i_min = max(0, j)
#         i_max = min(T, T + j)
#         Z_t = Z[i_min:i_max] - Z_mean
#         Z_t_j = Z[i_min-j:i_max-j] - Z_mean
#         G_j = Z_t.T @ Z_t_j / T
#         return G_j

#     def NW_est(self, beta_hat):
#         n = self.X.shape[0]
#         e_hat = self.Y - self.X @ beta_hat
#         Z = self.X * e_hat[:, None]
#         T = n
#         m = int(4 * (T / 100) ** (1/3))
#         V = 0
#         for j in range(-m, m + 1):
#             V += (1 - abs(j) / (m + 1)) * self.G_est(Z, j)
#         QxxInv = np.linalg.inv(self.X.T @ self.X / n)
#         V_b = QxxInv @ V @ QxxInv
#         return V_b

#     def SD(self, V):
#         n = self.X.shape[0]
#         return [(np.diag(V/ n)[i]) ** 0.5 for i in range(V.shape[0])]
        
#     def Wald_test(self, beta_hat, V_hat, H, q):
#         n = self.X.shape[0]
#         h = H @ beta_hat - q
#         W = n * h.T @ np.linalg.inv(H @ V_hat @ H.T) @ h
#         pv2 = chi2.sf(W, H.shape[0])
#         return pv2


In [10]:
# move the class into a module and import it from there
# (the code cells above are commented out for this test)
import sys

path = 'C:/Users/Popov/Documents/NES_studies/Python/NES_Helper' # location of the module on disk
sys.path.append(path)

from NES_helper import Estimators as est

## k = 1 (lags)

### One month prediction

Econometric model:

$s_{t+k} - s_t = \beta + \gamma (f_{t,k} - s_t) + \epsilon_{t+k}$

In [11]:
# start with the 1-month horizon
k = 1
# rows are in chronological order: row t holds period t, row t + k holds period t + k
# current log spot rates s_t for pound, franc and yen
s_t = data.iloc[:, :3]
# future log spot rates s_{t+k}: shift(-k) moves row t + k up to row t
s_t_k = s_t.shift(-k)
# exchange rate depreciation s_{t+k} - s_t (the last k rows are NaN and are dropped)
e_d = (s_t_k - s_t).dropna().values
# current log forward rates f_{t,k}
f_t_k = data.iloc[:, 3:]
# forward premium f_{t,k} - s_t, trimmed to the rows kept in e_d
f_p = (f_t_k - s_t.values).iloc[:e_d.shape[0]].values
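
The alignment done by `shift(-k)` is easy to get backwards, so here is a toy check. The five made-up values and the names `toy`, `s_future`, `depreciation` and `premium` are illustrative only.

```python
import pandas as pd

k = 1
# toy log spot and forward series in chronological order (values are made up)
toy = pd.DataFrame({'s': [1.0, 2.0, 4.0, 7.0, 11.0],
                    'f': [1.5, 2.5, 4.5, 7.5, 11.5]})

s_future = toy['s'].shift(-k)                   # row t now holds s_{t+k}
depreciation = (s_future - toy['s']).dropna()   # s_{t+k} - s_t, last k rows dropped
premium = (toy['f'] - toy['s']).iloc[:len(depreciation)]  # f_{t,k} - s_t on the same rows
print(depreciation.tolist(), premium.tolist())
```

Row `t` of `depreciation` pairs with row `t` of `premium`, which is exactly the pairing the regression of $s_{t+k} - s_t$ on $f_{t,k} - s_t$ requires.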

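With k = 1 the forecast horizon equals the sampling interval, so under $H_0$ the error $\epsilon_{t+1}$ is unpredictable from time-$t$ information and therefore serially uncorrelated. Thus HAC is not needed, and the White (heteroskedasticity-robust) estimator is sufficient.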

In [12]:
coef_names = [['alpha_1', 'beta_1'], ['alpha_2', 'beta_2'], ['alpha_3', 'beta_3']]
names = data.columns.tolist() 
# Loop through OLS (for each currency)
for i, name in enumerate(coef_names):
    print("OLS for " + names[i])
    
    X = sm.add_constant(f_p[:, i])
    Y = e_d[:, i]
    
    # Initialize the Estimators class with data
    esti = est(X, Y)
    
    # Estimate beta
    beta_hat = esti.beta_est()
    
    # Calculate White standard errors
    V_hat = esti.White_est(beta_hat)
    SDs = esti.SD(V_hat)
    
    # Perform Wald test
    p_val = esti.Wald_test(beta_hat, V_hat, np.array([[0, 1], [1, 0]]), np.array([1, 0]))
    
    # Print coefficients with their standard errors
    for coef, coef_name, sd in zip(beta_hat, coef_names[i], SDs):
        print(f"{coef_name}: {round(coef, 4)} ({round(sd, 4)})")
        
    print("p-value for Wald test is:", round(p_val, 3), end="\n\n")
    print('----------------------\n')

OLS for BP
alpha_1: -0.0023 (0.0024)
beta_1: -0.7261 (0.6401)
p-value for Wald test is: 0.024

----------------------

OLS for FF
alpha_2: -0.0023 (0.0026)
beta_2: -0.9606 (0.85)
p-value for Wald test is: 0.057

----------------------

OLS for JY
alpha_3: 0.0036 (0.0023)
beta_3: -0.1528 (0.5595)
p-value for Wald test is: 0.086

----------------------



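At the 5% level, $H_0$ is not rejected for the franc and the yen, but is rejected for the pound. (In the output, alpha_i is the intercept $\beta$ and beta_i is the slope $\gamma$ of the model.)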

## k = 3 (lags)

### Three month prediction

In [13]:
data = pd.read_csv('FwdSpot3.dat', header=None,
                    sep=' ', skipinitialspace=True)
# skipinitialspace=True skips the leading spaces; otherwise
# we get a column of NaNs
data

Unnamed: 0,0,1,2,3,4,5,6,7
0,3,73,2.4755,0.2203,0.003752,2.458357,0.221151,0.003840
1,4,73,2.4869,0.2187,0.003763,2.473906,0.219171,0.003782
2,5,73,2.5720,0.2305,0.003788,2.562098,0.230973,0.003843
3,6,73,2.5825,0.2410,0.003805,2.571008,0.241259,0.003886
4,7,73,2.5072,0.2424,0.003772,2.488521,0.243514,0.003919
...,...,...,...,...,...,...,...,...
229,4,92,1.7793,0.1801,0.007513,1.751900,0.177368,0.007502
230,5,92,1.8315,0.1853,0.007834,1.804800,0.182498,0.007821
231,6,92,1.9110,0.1966,0.008019,1.882100,0.193424,0.008004
232,7,92,1.9190,0.2004,0.007852,1.886600,0.196951,0.007840


In [14]:
# clean the data
data = data.iloc[:, 2:]
# rename columns according to the task
data.rename(columns={2: 'BP', 3: 'FF', 4: 'JY', 5: 'BP_f', 6: 'FF_f', 7: 'JY_f'}, inplace=True)
# take logarithms
data = data.apply(np.log)
data.head(3)

Unnamed: 0,BP,FF,JY,BP_f,FF_f,JY_f
0,0.906442,-1.512765,-5.585466,0.899493,-1.50891,-5.562283
1,0.911037,-1.520054,-5.582539,0.905798,-1.517903,-5.577502
2,0.944684,-1.467504,-5.575917,0.940826,-1.465454,-5.561502


In [15]:
# now the 3-month horizon
k = 3
# rows are in chronological order: row t holds period t, row t + k holds period t + k
# current log spot rates s_t for pound, franc and yen
s_t = data.iloc[:, :3]
# future log spot rates s_{t+k}: shift(-k) moves row t + k up to row t
s_t_k = s_t.shift(-k)
# exchange rate depreciation s_{t+k} - s_t (the last k rows are NaN and are dropped)
e_d = (s_t_k - s_t).dropna().values
# current log forward rates f_{t,k}
f_t_k = data.iloc[:, 3:]
# forward premium f_{t,k} - s_t, trimmed to the rows kept in e_d
f_p = (f_t_k - s_t.values).iloc[:e_d.shape[0]].values

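With k = 3 > 1 the forecast horizons of consecutive monthly observations overlap, so $\epsilon_{t+k}$ is serially correlated (up to lag k - 1 = 2 under $H_0$). Thus HAC is needed, and the standard Newey-West estimator is used.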
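
The overlap argument can be illustrated with a short simulation. It assumes, purely for illustration, that a k-month forecast error is the sum of k i.i.d. monthly shocks; the names `autocorr`, `shocks` and `errors` are hypothetical. Under that assumption the theoretical autocorrelation at lag $j$ is $(k - j)/k$ for $j < k$ and zero afterwards.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a 1-D array at a positive lag."""
    x = x - x.mean()
    return float(x[lag:] @ x[:-lag] / (x @ x))

k = 3
rng = np.random.default_rng(3)
shocks = rng.normal(size=200_000)            # i.i.d. one-month "news" (assumption)
# k-month forecast error = sum of the k monthly shocks inside the forecast window
errors = np.convolve(shocks, np.ones(k), mode='valid')

rho = [autocorr(errors, j) for j in range(1, 5)]
print(np.round(rho, 3))
```

For k = 3 the sample autocorrelations should be close to 2/3 and 1/3 at lags 1 and 2 and close to zero from lag 3 on, which is the moving-average pattern that makes a HAC estimator necessary.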

In [16]:
# Loop through OLS (for each currency)
coef_names = [['alpha_1', 'beta_1'], ['alpha_2', 'beta_2'], ['alpha_3', 'beta_3']]
names = data.columns.tolist()
for i, name in enumerate(coef_names):
    print("OLS for " + names[i])
    
    X = sm.add_constant(f_p[:, i])
    Y = e_d[:, i]
    
    # Initialize the Estimators class with data
    esti = est(X, Y)
    
    # Estimate beta
    beta_hat = esti.beta_est()
    
    # Calculate Newey-West standard errors
    V_hat = esti.NW_est(beta_hat)
    SDs = esti.SD(V_hat)
    
    # Perform Wald test
    p_val = esti.Wald_test(beta_hat, V_hat, np.array([[0, 1], [1, 0]]), np.array([1, 0]))
    
    # Print coefficients with their standard errors
    for coef, coef_name, sd in zip(beta_hat, coef_names[i], SDs):
        print(f"{coef_name}: {round(coef, 4)} ({round(sd, 4)})")
        
    print("p-value for Wald test is:", round(p_val, 3), end="\n\n")
    print('----------------------\n')

OLS for BP
alpha_1: -0.0187 (0.0076)
beta_1: -2.0586 (0.7002)
p-value for Wald test is: 0.0

----------------------

OLS for FF
alpha_2: -0.0044 (0.0091)
beta_2: -0.4804 (0.8427)
p-value for Wald test is: 0.153

----------------------

OLS for JY
alpha_3: 0.0131 (0.0071)
beta_3: -0.602 (0.4634)
p-value for Wald test is: 0.002

----------------------



At the 5% level, $H_0$ is not rejected for the franc, but is rejected for the pound and the yen (3-month forecast).

Recall the results for the 1-month forecast:

At the 5% level, $H_0$ is not rejected for the franc and the yen, but is rejected for the pound (1-month forecast).

Both procedures suggest that the null hypothesis does not hold for the pound, while it is not rejected for the franc. For the yen the evidence is mixed ($H_0$ is not rejected for the 1-month forecast, but is rejected for the 3-month forecast).

Given that the slope estimates are all negative rather than close to 1 as the hypothesis assumes, the cases in which the null hypothesis is not rejected remain doubtful: the standard errors are large, so non-rejection may simply reflect low power.