# OLS on a System of Regressions

This notebook illustrates how to estimate a system of regressions with OLS, and how to test hypotheses about coefficients across the regressions.

## Load Packages and Extra Functions

The key function `OlsSure()` is from the (local) `FinEcmt_OLS` module.

In [1]:
MyModulePath = joinpath(pwd(),"src")
!in(MyModulePath,LOAD_PATH) && push!(LOAD_PATH,MyModulePath)
using FinEcmt_OLS

In [2]:
#=
include(joinpath(pwd(),"src","FinEcmt_OLS.jl"))
using .FinEcmt_OLS
=#

In [3]:
using DelimitedFiles, LinearAlgebra, Distributions

## Loading Data

In [4]:
x    = readdlm("Data/FFmFactorsPs.csv",',',skipstart=1)
(Rme,Rf) = (x[:,2],x[:,5])          #market excess return, interest rate

x  = readdlm("Data/FF25Ps.csv",',') #no header line
R  = x[:,2:end]                     #returns for 25 FF portfolios
Re = R .- Rf                        #excess returns for the 25 FF portfolios
Re = Re[:,[1,7,13,19,25]]           #use just 5 assets to make the printing easier

(T,n) = size(Re)                    #number of observations and test assets

(388, 5)

## A Function for Joint Estimation of Several Regressions (OLS)

Consider the linear regressions

$
y_{it}=x_{t}^{\prime}\beta_i+u_{it}, 
$

where $i=1,2,\ldots,n$ indicates $n$ different dependent variables. The $K$ regressors are the *same* across the $n$ regressions. (This is often called SURE, Seemingly Unrelated Regression Equations.)

For the case of two regressions, the variance-covariance matrix has the following structure.
Stack the estimated $\beta$ coefficients into a vector (equation 1 first, then equation 2). The variance-covariance matrix of this vector is

$
\mathrm{Var}\left(
	\begin{bmatrix}
		\hat{\beta}_{1}\\
		\hat{\beta}_{2}%
	\end{bmatrix}
	\right)  =
	\begin{bmatrix}
		S_{xx}^{-1} & \mathbf{0}\\
		\mathbf{0} & S_{xx}^{-1}
	\end{bmatrix}
	\Omega
	\begin{bmatrix}
		S_{xx}^{-1} & \mathbf{0}\\
		\mathbf{0} & S_{xx}^{-1}
	\end{bmatrix}
$

where 

$
\Omega  = \mathrm{Var}\left(
\sum\nolimits_{t=1}^{T}
\begin{bmatrix}
	x_{t}u_{1t}\\
	x_{t}u_{2t}
\end{bmatrix}
\right) 
$

Notice that $x_{t}u_{1t}$ is a vector with $K$ elements (as many as there are regressors) and $x_{t}u_{2t}$ is similar. The $\Omega$ matrix is thus $2K \times 2K$.

The case of $n$ regressions (rather than 2) involves creating similar matrices. This is implemented in the `OlsSure()` function.
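
As a rough illustration of the two-regression formula above, the following sketch builds the sandwich matrix on simulated data. It assumes no autocorrelation, so that $\Omega$ can be estimated by the sum of outer products of the stacked moment conditions. The function name `TwoEqSandwich` and the simulated inputs are hypothetical and not part of `FinEcmt_OLS`.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not part of FinEcmt_OLS): sandwich covariance matrix
# for two regressions with the same regressors, assuming no autocorrelation
function TwoEqSandwich(Y,X)
    b = X\Y                                    #K×2 OLS coefficients
    u = Y - X*b                                #T×2 residuals
    g = hcat(X.*u[:,1], X.*u[:,2])             #T×2K, row t holds [x_t*u_1t; x_t*u_2t]
    Ω = g'g                                    #2K×2K, sum of outer products of g_t
    A = kron(Matrix(1.0I,2,2), inv(X'X))       #block-diagonal, inv(Sxx) in each block
    return b, A*Ω*A
end

Random.seed!(123)                              #simulated data, for illustration only
Tsim = 200
Xsim = [ones(Tsim) randn(Tsim)]
Ysim = Xsim*[0.5 -0.2; 1.0 2.0] + randn(Tsim,2)

(bsim,Vsim) = TwoEqSandwich(Ysim,Xsim)
println(size(Vsim))                            #2K×2K with K = 2
```

The first $K$ rows and columns of `Vsim` refer to the coefficients of equation 1, the next $K$ to equation 2, which is the same ordering as in `vec(bsim)`.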

In [5]:
@doc2 OlsSure

```
OlsSure(Y,X,NWQ=false,m=0)
```

LS of `Y` on `X`; where `Y` is Txn, and `X` is the same for all regressions

### Input

  * `Y::Matrix`:     Txn, the n dependent variables
  * `X::Matrix`:     Txk matrix of regressors (including deterministic ones)
  * `NWQ::Bool`:     if true, then Newey-West's covariance matrix is used, otherwise Gauss-Markov
  * `m::Int`:        scalar, bandwidth in Newey-West

### Output

  * `b::Matrix`:     kxn, regression coefficients (one column for each `Y[:,i]`)
  * `u::Matrix`:     Txn, residuals Y - Yhat
  * `Yhat::Matrix`:  Txn, fitted values X*b
  * `V::Matrix`:     covariance matrix of θ=vec(b)
  * `R²::Matrix`:    1xn matrix, R² values


In [6]:
using CodeTracking
println(@code_string OlsSure([1],[1]))   #print the source code

function OlsSure(Y,X,NWQ=false,m=0)

    (T,n) = (size(Y,1),size(Y,2))
    k     = size(X,2)

    b     = X\Y
    Yhat  = X*b
    u     = Y - Yhat

    Sxx = X'X

    if NWQ
        g      = hcat([X.*u[:,i] for i=1:n]...)    #hcat(X.*u[:,1],X.*u[:,2], etc)
        S      = CovNW(g,m)           #Newey-West covariance matrix
        SxxM_1 = kron(I(n),inv(Sxx))
        V      = SxxM_1 * S * SxxM_1
    else
        V = kron(cov(u),inv(Sxx))      #traditional covariance matrix, Gauss-Markov
    end

    R²   = 1 .- var(u,dims=1)./var(Y,dims=1)

    return b, u, Yhat, V, R²

end
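
The line `V = kron(cov(u),inv(Sxx))` can be motivated as follows. This is a sketch under stated assumptions: the residuals are iid across time with covariance matrix $\Sigma$, and the regressors are treated as fixed. The symbols $\Sigma$ and $u_t$ (the $n$-vector of residuals in period $t$) are notation introduced here. The stacked moment conditions in period $t$ can be written $u_t \otimes x_t$, so

$
\Omega = \sum\nolimits_{t=1}^{T} \Sigma \otimes x_{t}x_{t}^{\prime} = \Sigma \otimes S_{xx},
$

where $S_{xx} = \sum\nolimits_{t=1}^{T} x_{t}x_{t}^{\prime}$. Using this in the sandwich formula gives

$
(I_n \otimes S_{xx}^{-1})(\Sigma \otimes S_{xx})(I_n \otimes S_{xx}^{-1}) = \Sigma \otimes S_{xx}^{-1}.
$

In the code, $\Sigma$ is estimated by `cov(u)`.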


## Using the Function

In [7]:
(b,u,yhat,V,R²) = OlsSure(Re,[ones(T) Rme],true)
Stdb   = sqrt.(reshape(diag(V),2,n))      #V = Cov(vec(b)), in vec(b) 1:2 are for asset 1, 3:4 for asset 2,...
tstat  = b./Stdb

printblue("CAPM regressions: α is the intercept, γ the coeff on Rme\n")
assetNames = [string("asset ",i) for i=1:n]
xNames      = ["c","Rme"]

println("coeffs")
printmat(b;colNames=assetNames,rowNames=["α","γ"])

println("t-stats")
printmat(tstat;colNames=assetNames,rowNames=["α","γ"])

CAPM regressions: α is the intercept, γ the coeff on Rme

coeffs
    asset 1   asset 2   asset 3   asset 4   asset 5
α    -0.504     0.153     0.305     0.279     0.336
γ     1.341     1.169     0.994     0.943     0.849

t-stats
    asset 1   asset 2   asset 3   asset 4   asset 5
α    -1.720     1.045     2.436     2.094     2.070
γ    22.322    30.609    28.416    23.209    17.242



## Testing Across Regressions

To test across regressions, we first stack the point estimates into a vector by `θ = vec(b)`.
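
Since Julia stores matrices column by column, `vec(b)` lists all coefficients of regression 1 first, then those of regression 2, and so on. A small sketch with made-up numbers shows the ordering; the names `b_toy`, `θ_toy` and `pos` are hypothetical.

```julia
b_toy = [1.0 3.0 5.0;            #row 1: intercepts of regressions 1,2,3 (made-up numbers)
         2.0 4.0 6.0]            #row 2: slopes of regressions 1,2,3
(K_toy,n_toy) = size(b_toy)

θ_toy = vec(b_toy)               #column-major stacking: [1,2,3,4,5,6]

pos(i,j) = (i-1)*K_toy + j       #position in θ_toy of coefficient j in regression i
println(θ_toy[pos(2,1)])         #intercept of regression 2
```

The same index rule, `(i-1)*K + j`, is what picks out the right column of a restriction matrix for a given coefficient.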

The test below applies the usual $\chi^2$ test, where 

$
H_0: R\theta=q,
$

where $R$ is a $J \times nK$ matrix (one column for each element of $\theta$) and $q$ is a $J$-vector. To test this, use

$
(R\theta-q)^{\prime}(RVR^{\prime}) ^{-1}(R\theta-q)\overset{d}{\rightarrow}\chi_{J}^{2}.
$

The $R$ matrix depends on which hypotheses we want to test.
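
The test statistic can be wrapped in a small function. The sketch below is one possible way to do it; the name `WaldStat` and the toy inputs are hypothetical, and `V` is assumed to be the covariance matrix of `θ`.

```julia
using LinearAlgebra

# Hypothetical helper: Wald statistic for H₀: Rθ = q, where V = Cov(θ)
function WaldStat(θ,V,R,q)
    d = R*θ - q                  #J-vector of deviations from the null hypothesis
    Γ = R*V*R'                   #J×J covariance matrix of d
    return dot(d, Γ\d)           #d'inv(Γ)*d, without forming the inverse
end

θ_toy = [1.0, 2.0]               #made-up point estimates
V_toy = [0.5 0.0; 0.0 0.5]       #made-up covariance matrix
R_toy = [1.0 -1.0]               #one restriction (J = 1): θ₁ = θ₂
q_toy = [0.0]

W = WaldStat(θ_toy,V_toy,R_toy,q_toy)
println(W)                       #(1-2)^2/(0.5+0.5) = 1.0
```

Under the null hypothesis, this toy statistic would be compared with a $\chi^2_1$ distribution, since there is a single restriction.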

The next cell creates a matrix of coefficient names that will help us see how the results are organised.

In [8]:
bNames = fill("",2,n)       #matrix of coef names, subscript for the asset number
for i = 1:n
    bNames[:,i] = [string("α",'₀'+i),string("γ",'₀'+i)]         #'₀'+1 to get ₁
end
printmat(bNames)

        α₁        α₂        α₃        α₄        α₅
        γ₁        γ₂        γ₃        γ₄        γ₅



In [9]:
θ = vec(b)

printblue("stacking the coeffs into a vector:")
printmat(θ;rowNames=vec(bNames))

stacking the coeffs into a vector:
α₁    -0.504
γ₁     1.341
α₂     0.153
γ₂     1.169
α₃     0.305
γ₃     0.994
α₄     0.279
γ₄     0.943
α₅     0.336
γ₅     0.849



In [10]:
#R = [1 0 -1 0 zeros(1,2*n-4)]           #are intercepts the same for assets 1 and 2?
R = zeros(n,2*n)                         #are all intercepts == 0?
for i = 1:n
    R[i,(i-1)*2+1] = 1
end

printblue("The R matrix:")
hypNames = string.("hypothesis ",1:size(R,1))
printmat(R;colNames=bNames,rowNames=hypNames,width=4,prec=0)

J = size(R,1)
printlnPs("The number of hypotheses that we test: $J \n")

q = zeros(J)

printblue("R*vec(b) - q:")
printmat(R*θ-q;rowNames=hypNames)

The R matrix:
hypothesis 1   1   0   0   0   0   0   0   0   0   0
hypothesis 2   0   0   1   0   0   0   0   0   0   0
hypothesis 3   0   0   0   0   1   0   0   0   0   0
hypothesis 4   0   0   0   0   0   0   1   0   0   0
hypothesis 5   0   0   0   0   0   0   0   0   1   0

The number of hypotheses that we test: 5

R*vec(b) - q:
hypothesis 1    -0.504
hypothesis 2     0.153
hypothesis 3     0.305
hypothesis 4     0.279
hypothesis 5     0.336



In [11]:
println("Joint test of all hypotheses")

Γ = R*V*R'
test_stat = (R*θ - q)'inv(Γ)*(R*θ - q)

critval = quantile(Chisq(J),0.9)          #10% critical value

printmat([test_stat,critval];rowNames=["test statistic","10% crit value"])

Joint test of all hypotheses
test statistic    10.707
10% crit value     9.236
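
Since the test statistic (10.707) exceeds the 10% critical value (9.236), the joint hypothesis that all five intercepts are zero is rejected at the 10% level. A p-value conveys the same information in a different form. The sketch below assumes the `Distributions` package is available and hard-codes the rounded statistic from the printed output; the variable names are hypothetical.

```julia
using Distributions

test_stat_printed = 10.707       #copied from the printed output (rounded)
J_printed         = 5            #number of restrictions tested

pval = ccdf(Chisq(J_printed),test_stat_printed)    #Pr(χ²(5) > test statistic)
println("p-value: ",round(pval,digits=3))
```

The p-value lies between 0.05 and 0.10, so the null hypothesis would not be rejected at the 5% level.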

