# OLS on a System of Regressions

This notebook illustrates how to estimate a system of regressions with OLS, and how to test hypotheses on coefficients across the regressions.

## Load Packages and Extra Functions

In [1]:
using Printf, DelimitedFiles, Statistics, LinearAlgebra, Distributions

include("jlFiles/printmat.jl")
include("jlFiles/CovNWFn.jl")        #function for Newey-West covariance matrix

CovNWFn

## Loading Data

In [2]:
x    = readdlm("Data/FFmFactorsPs.csv",',',skipstart=1)
Rme  = x[:,2]                #market excess return
Rf   = x[:,5]                #interest rate


x  = readdlm("Data/FF25Ps.csv",',') #no header line: x is matrix
R  = x[:,2:end]                     #returns for 25 FF portfolios
Re = R .- Rf                        #excess returns for the 25 FF portfolios
Re = Re[:,[1;7;13;19;25]]           #use just 5 assets to make the printing easier 

(T,n) = size(Re)                    #number of observations and test assets

(388, 5)

## A Function for Joint Estimation of Several Regressions (OLS)

Consider the linear regressions

$
y_{it}=\beta_i^{\prime}x_{t}+\varepsilon_{it}, 
$

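where $i=1,2,\ldots,n$ indexes $n$ different dependent variables. The regressors are the *same* across the $n$ regressions. (Such a system is often called SURE, Seemingly Unrelated Regression Equations. With identical regressors, OLS equation by equation gives the same point estimates as GLS on the whole system.)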
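
As a small check of the setup, the sketch below uses simulated data (all names and numbers are hypothetical, not taken from the data files) to illustrate that regressing the whole `Txn` matrix on `X` in one step gives the same coefficients as running the `n` regressions one at a time.

```julia
using LinearAlgebra, Random

Random.seed!(123)                     #simulated data, for illustration only
T, n, k = 200, 3, 2
X = [ones(T) randn(T)]                #Txk, same regressors in all equations
B = [0.5 -1.0 2.0; 1.0 0.8 1.2]       #kxn, hypothetical true coefficients
Y = X*B + randn(T,n)                  #Txn dependent variables

b_joint = X\Y                                 #all n regressions in one step, kxn
b_sep   = hcat([X\Y[:,i] for i = 1:n]...)     #one regression at a time, kxn

println(maximum(abs.(b_joint - b_sep)))       #should be close to zero
```

Column `i` of `b_joint` holds the coefficients of regression `i`, which is the layout used in the rest of the notebook.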

The next cell defines a function for this estimation. 

In [3]:
"""
    OlsSureFn(Y,X,NWQ=false,m=0)

LS of Y on X; where Y is Txn, and X is the same for all regressions

# Usage
(b,u,Yhat,V,R2) = OlsSureFn(Y,X,NWQ,m)

# Input
- `Y::Matrix`:     Txn, the n dependent variables
- `X::Matrix`:     Txk matrix of regressors (including deterministic ones)
- `NWQ::Bool`:     if true, then Newey-West's covariance matrix is used, otherwise Gauss-Markov
- `m::Int`:        scalar, bandwidth in Newey-West

# Output
- `b::Matrix`:     kxn, regression coefficients (column i is for regression i)
- `u::Matrix`:     Txn, residuals Y - Yhat
- `Yhat::Matrix`:  Txn, fitted values X*b
- `V::Matrix`:     n*kxn*k, covariance matrix of vec(b)
- `R2::Matrix`:    1xn, R2 values

"""
function OlsSureFn(Y,X,NWQ=false,m=0)

    (T,n) = (size(Y,1),size(Y,2))
    k     = size(X,2)

    b     = X\Y
    Yhat  = X*b
    u     = Y - Yhat

    g     = zeros(T,n*k)
    for i = 1:n
      vv      = (1+(i-1)*k):(i*k)   #1:k,(1+k):2k,...
      g[:,vv] = X.*u[:,i]           #X*u for regression i
    end

    Sxx = X'X
    if NWQ
        S     = CovNWFn(g,m)            #Newey-West covariance matrix
        Sxx_1 = kron(I(n),inv(Sxx))
        V     = Sxx_1 * S * Sxx_1
    else
        V = kron(cov(u),inv(Sxx))      #traditional covariance matrix, Gauss-Markov 
    end

    R2   = 1 .- var(u,dims=1)./var(Y,dims=1)

    return b, u, Yhat, V, R2

end

OlsSureFn
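
The `NWQ=true` branch relies on `CovNWFn` from the included file, whose code is not shown in this notebook. The sketch below is a hypothetical stand-in (`NWSketch` is an invented name) with Bartlett weights. It assumes that the function should return the covariance matrix of the *sum* of the moment conditions, which is the scaling that makes `Sxx_1 * S * Sxx_1` a covariance matrix of `vec(b)`. Whether the included `CovNWFn` uses exactly this scaling is an assumption.

```julia
using Statistics, LinearAlgebra

#hypothetical helper, not the CovNWFn from jlFiles/CovNWFn.jl
function NWSketch(g0,m=0)
    T = size(g0,1)
    g = g0 .- mean(g0,dims=1)              #demean the moment conditions
    S = g'*g                               #lag 0 term
    for s = 1:min(m,T-1)
        Λ = g[s+1:T,:]'*g[1:T-s,:]         #sum over t of g(t)*g(t-s)'
        S = S + (1 - s/(m+1))*(Λ + Λ')     #Bartlett weights
    end
    return S                               #covariance matrix of the sum of g0 over t
end

g0 = [1.0 2.0; 2.0 1.0; 4.0 3.0; 3.0 6.0]  #tiny made-up example, T=4
S0 = NWSketch(g0,0)                        #no lags: (T-1)*cov(g0)
S1 = NWSketch(g0,1)                        #one lag
println(S0)
println(S1)
```

With `m=0` the result reduces to a White-type estimator. For OLS moment conditions the demeaning is harmless, since the columns of `X.*u[:,i]` already sum to zero.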

In [4]:
(b,u,yhat,V,R2) = OlsSureFn(Re,[ones(T) Rme])
Stdb   = sqrt.(reshape(diag(V),2,n))      #V = Cov(vec(b)), in vec(b) 1:2 are for asset 1, 3:4 for asset 2,...
tstat  = b./Stdb

printblue("CAPM regressions:\n")
assetNames = [string("asset ",i) for i=1:n]
xNames      = ["c","Rme"]

println("coeffs")
printmat(b,colNames=assetNames,rowNames=xNames)

println("t-stats")
printmat(tstat,colNames=assetNames,rowNames=xNames)

CAPM regressions:

coeffs
      asset 1   asset 2   asset 3   asset 4   asset 5
c      -0.504     0.153     0.305     0.279     0.336
Rme     1.341     1.169     0.994     0.943     0.849

t-stats
      asset 1   asset 2   asset 3   asset 4   asset 5
c      -1.656     1.031     2.471     2.163     2.073
Rme    20.427    36.534    37.298    33.848    24.279
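
The `reshape(diag(V),2,n)` step works because `V` follows the ordering of `vec(b)`: elements `1:k` belong to regression 1, `(k+1):2k` to regression 2, and so on. A minimal sketch on simulated data (hypothetical names and numbers) illustrates this for the Gauss-Markov covariance matrix, where the diagonal block for regression `i` equals the usual single-equation result.

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(1)                        #simulated data, for illustration only
T, n, k = 100, 3, 2
X = [ones(T) randn(T)]
Y = X*[0.1 0.2 0.3; 1.0 0.9 0.8] + randn(T,n)

b = X\Y                                #kxn
u = Y - X*b
V = kron(cov(u),inv(X'X))              #same formula as the Gauss-Markov branch

i  = 2                                 #pick regression 2
vv = (1+(i-1)*k):(i*k)                 #positions of its coefficients in vec(b)
sameCoefs = vec(b)[vv] == b[:,i]       #vec() stacks the columns of b
V_i       = var(u[:,i])*inv(X'X)       #single-equation covariance matrix
sameCov   = isapprox(V[vv,vv],V_i)

Stdb = sqrt.(reshape(diag(V),k,n))     #kxn, same layout as b
println(sameCoefs," ",sameCov)
```

The off-diagonal blocks of `V` hold the covariances between coefficients of different regressions, which is what tests across regressions need.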



## Testing Across Regressions

To test across regressions, we first stack the point estimates into a vector by `θ = vec(b)`.

The test below applies the usual $\chi^2$ test, where 

$
H_0: R\theta=q,
$

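where $R$ is a $J \times K$ matrix, $K$ is the number of elements in $\theta$ (here $2n$) and $q$ is a $J$-vector. With $V$ denoting the covariance matrix of the estimate of $\theta$, the test statistic is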

$
(R\theta-q)^{\prime}(RVR^{\prime}) ^{-1}(R\theta-q)\overset{d}{\rightarrow}\chi_{J}^{2}.
$

The $R$ matrix depends on which hypotheses we want to test: each row picks out one linear combination of the elements of $\theta$.
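
To make the role of $R$ and $q$ concrete, the sketch below builds them for two hypothetical sets of restrictions (these are illustrations, not the test run in this notebook). It uses the rounded point estimates from the CAPM regressions above, ordered as `[c1,β1,c2,β2,...]`.

```julia
using LinearAlgebra

n, k = 5, 2                            #5 regressions, 2 coefficients each
#rounded point estimates from the CAPM regressions, ordered as vec(b)
θ = [-0.504,1.341,0.153,1.169,0.305,0.994,0.279,0.943,0.336,0.849]

#(a) one restriction: the intercepts of assets 1 and 2 are equal, c1 - c2 = 0
Ra = zeros(1,n*k)
Ra[1,1] = 1
Ra[1,3] = -1
qa = zeros(1)

#(b) n restrictions: all slopes equal one, βi = 1
Rb = zeros(n,n*k)
for i = 1:n
    Rb[i,i*k] = 1                      #βi is element i*k of θ
end
qb = ones(n)

println(Ra*θ - qa)                     #c1 - c2
println(Rb*θ - qb)                     #βi - 1 for each asset
```

In both cases the test statistic is formed from `R*θ - q` and `R*V*R'`, with `J` equal to the number of rows in `R`.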

In [5]:
bNames = fill("",2,n)       #useful to have a corresponding matrix of coef names
for i = 1:n
    bNames[:,i] = [string("c",i),string("β",i)]
end
printmat(bNames)

        c1        c2        c3        c4        c5
        β1        β2        β3        β4        β5



In [6]:
θ = vec(b)

printblue("stacking the coeffs into a vector:")
printmat(θ,rowNames=vec(bNames))

stacking the coeffs into a vector:
c1    -0.504
β1     1.341
c2     0.153
β2     1.169
c3     0.305
β3     0.994
c4     0.279
β4     0.943
c5     0.336
β5     0.849



In [7]:
#R = [1 0 -1 0 zeros(1,2*n-4)]           #are intercepts the same for assets 1 and 2?
R = zeros(n,2n)                          #are all intercepts == 0? 
for i = 1:n
    R[i,(i-1)*2+1] = 1
end

printblue("The R matrix:")
hypNames = string.("hypothesis ",1:size(R,1))
printmat(R,colNames=bNames,rowNames=hypNames,width=4,prec=0)

J = size(R,1)
printlnPs("The number of hypotheses that we test: $J \n")

printblue("R*vec(b):")
printmat(R*θ,rowNames=hypNames)

The R matrix:
              c1  β1  c2  β2  c3  β3  c4  β4  c5  β5
hypothesis 1   1   0   0   0   0   0   0   0   0   0
hypothesis 2   0   0   1   0   0   0   0   0   0   0
hypothesis 3   0   0   0   0   1   0   0   0   0   0
hypothesis 4   0   0   0   0   0   0   1   0   0   0
hypothesis 5   0   0   0   0   0   0   0   0   1   0

The number of hypotheses that we test: 5

R*vec(b):
hypothesis 1    -0.504
hypothesis 2     0.153
hypothesis 3     0.305
hypothesis 4     0.279
hypothesis 5     0.336



In [8]:
println("Joint test of all hypotheses")

Γ = R*V*R'
test_stat = (R*θ)'inv(Γ)*(R*θ)

critval = quantile(Chisq(J),0.9)          #10% critical value

printmat([test_stat,critval],rowNames=["test statistic","10% crit value"])

Joint test of all hypotheses
test statistic    10.930
10% crit value     9.236
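
The test statistic (10.930) exceeds the 10% critical value (9.236), so the joint hypothesis that all five intercepts are zero is rejected at the 10% level. A p-value gives the same information in a single number. The sketch below assumes the `Distributions` package (already loaded in this notebook) and plugs in the printed test statistic rather than recomputing it.

```julia
using Distributions

test_stat = 10.930                     #value printed above
J         = 5                          #number of restrictions

critval = quantile(Chisq(J),0.9)       #10% critical value
pval    = ccdf(Chisq(J),test_stat)     #P(χ²(J) > test_stat)

println("reject H0 at the 10% level: ",test_stat > critval)
println("p-value: ",round(pval,digits=3))
```

The p-value falls between 0.05 and 0.10, so the rejection at the 10% level would not carry over to the 5% level.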

