# Betas and Covariance Matrices

This notebook estimates single- and multi-index models and uses them to construct alternative estimates of the covariance matrix of asset returns.

## Load Packages and Extra Functions

In [1]:
using Printf, LinearAlgebra, Statistics, DelimitedFiles

include("jlFiles/printmat.jl")
include("jlFiles/OlsM.jl");

## Loading Data

We load data from two files: one with the returns on the Fama-French equity factors and one with the returns on the 25 Fama-French portfolios. To keep the output easy to display, we use only 5 of those portfolios.

In [2]:
x    = readdlm("Data/FFmFactorsPs.csv",',',skipstart=1)
Rme  = x[:,2]                #market excess return
RSMB = x[:,3]                #small minus big firms
RHML = x[:,4]                #high minus low book-to-market ratio
Rf   = x[:,5]                #interest rate

x  = readdlm("Data/FF25Ps.csv",',') #no header line: x is matrix
R  = x[:,2:end]                     #returns for 25 FF portfolios
Re = R .- Rf                        #excess returns for the 25 FF portfolios
Re = Re[:,[1,7,13,19,25]]           #use just 5 assets to make the printing easier

(T,n) = size(Re)                    #no. obs and  no. test assets

(388, 5)

# Covariance Matrix with Average Correlations

The next cell contains two functions that help us construct a covariance matrix from *(a)* estimated standard deviations and *(b)* a common (average) value for all correlations.
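In matrix form, the construction in `Cov_withSameCorr` can be summarised as follows. This is only a restatement of the code, where $\sigma$ is the $n$-vector of standard deviations, $\rho$ the common correlation, $\mathbf{1}$ an $n$-vector of ones and $\odot$ denotes element-wise multiplication (the last two symbols are notation introduced here):

$\Sigma = (\sigma\sigma^{\prime}) \odot \left[\rho \mathbf{1}\mathbf{1}^{\prime} + (1-\rho) I\right],$

so $\Sigma_{ii} = \sigma_i^2$ and $\Sigma_{ij} = \rho \sigma_i \sigma_j$ for $i \neq j$.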

In [3]:
"""
    vech(x,k=0)

Stack the matrix elements on and below the principal diagonal into a vector (k=0). 
With k=-1, instead stacks just the elements below the diagonal.
"""
function vech(x,k=0)
    vv = tril(trues(size(x)),k)
    y  = x[vv]
    return y
end


"""
    Cov_withSameCorr(σ,ρ)

Compute covariance matrix from vector of standard deviations (σ) 
and a single correlation (assumed to be the same across all pairs).
"""
function Cov_withSameCorr(σ,ρ)
    σ = vec(σ)                         #to make sure it is a vector
    n = length(σ)
    CorrMat = ones(n,n)*ρ + (1-ρ)*I    #corr matrix, correlation = ρ
    CovMat  = (σ*σ').*CorrMat
    return CovMat
end

Cov_withSameCorr
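
To see what the two functions do, here is a minimal self-contained sketch on a toy 3x3 correlation matrix. All numbers and the `_toy` names are made up for illustration. It repeats the masking idea of `vech(x,-1)` and the constant-correlation construction inline, rather than calling the notebook's functions.

```julia
using LinearAlgebra, Statistics

# Toy 3x3 correlation matrix (made-up numbers, for illustration only)
C_toy = [1.0 0.5 0.2;
         0.5 1.0 0.8;
         0.2 0.8 1.0]

# Boolean mask that is true strictly below the diagonal (k = -1)
below_diag = tril(trues(size(C_toy)), -1)

# Logical indexing stacks the selected elements column by column
ρ_pairs   = C_toy[below_diag]       #elements (2,1), (3,1), (3,2)
ρ_avg_toy = mean(ρ_pairs)           #average of the 3 distinct correlations

# Covariance matrix implied by hypothetical standard deviations and ρ_avg_toy
σ_toy = [2.0, 3.0, 4.0]
Σ_toy = (σ_toy*σ_toy') .* (ones(3,3)*ρ_avg_toy + (1 - ρ_avg_toy)*I)
```

Using `k=-1` matters here: with `k=0` the diagonal ones would be included and would bias the average correlation upwards.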

In [4]:
S = cov(Re)  #Covariance matrix calculated from data (to compare with)         
#printblue("Covariance matrix calculated from data (to compare with):")
#printmat(S)

C = cor(Re)              #nxn correlation matrix
printblue("Correlation matrix:")
printmat(C)

ρ_avg = mean(vech(C,-1))
printblue("Average correlation:")
printlnPs(ρ_avg)

Correlation matrix:
     1.000     0.821     0.664     0.581     0.468
     0.821     1.000     0.919     0.819     0.696
     0.664     0.919     1.000     0.909     0.805
     0.581     0.819     0.909     1.000     0.852
     0.468     0.696     0.805     0.852     1.000

Average correlation:
     0.753


In [5]:
Σ₁ = Cov_withSameCorr(std(Re,dims=1),ρ_avg)

printblue("Covariance matrix assuming same correlation:")
printmat(Σ₁)

printblue("Difference to data:")
printmat(Σ₁-S)

Covariance matrix assuming same correlation:
    73.475    39.483    33.433    32.422    32.509
    39.483    37.371    23.844    23.123    23.185
    33.433    23.844    26.796    19.580    19.632
    32.422    23.123    19.580    25.200    19.039
    32.509    23.185    19.632    19.039    25.335

Difference to data:
     0.000    -3.540     3.962     7.410    12.321
    -3.540    -0.000    -5.229    -2.013     1.778
     3.962    -5.229    -0.000    -4.052    -1.349
     7.410    -2.013    -4.052    -0.000    -2.494
    12.321     1.778    -1.349    -2.494    -0.000



# Covariance Matrix from a Single-Index Model

## Step 1: Do Regressions

### A Remark on the Code
- The function `OlsM` is defined in the file `OlsM.jl` (included in the first cell of the notebook). It does OLS estimation and reports the point estimates and the standard deviations of the residuals.
- The function handles several dependent variables at once (one per column of the first input) and reports the results for each of them.
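
Since `OlsM.jl` is not shown in the notebook, here is a rough sketch of what such a function might look like. The name `ols_multi`, the toy data and the `T-1` scaling of the residual standard deviation are assumptions made for illustration; the actual `OlsM` may differ.

```julia
using LinearAlgebra, Statistics

"""
    ols_multi(Y,X)

Hypothetical stand-in for `OlsM` (not the code in `OlsM.jl`): OLS of each column of Y on X.
Returns the kxn matrix of coefficients and a 1xn matrix of residual standard deviations.
"""
function ols_multi(Y,X)
    b = X\Y                  #least squares, one column of coefficients per column of Y
    u = Y - X*b              #Txn matrix of residuals
    σ = std(u,dims=1)        #assumption: divides by T-1, the actual OlsM may scale differently
    return b, σ
end

# Toy data (made up): two dependent variables that are exact linear functions of the regressor
x_toy = [ones(5) [1.0,2.0,3.0,4.0,5.0]]
Y_toy = x_toy*[1.0 2.0; 3.0 4.0]

(b_toy,σ_toy) = ols_multi(Y_toy,x_toy)
β_toy = b_toy[2:2,:]         #slopes only, kept as a 1xn matrix
```

Indexing with `2:2` rather than `2` keeps the slopes as a $1 \times n$ matrix, which is the $K \times n$ layout that the covariance calculation further down expects.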

In [6]:
x          = [ones(T) Rme]                   #regressors

(b,σ)      = OlsM(Re,x)                      
(β,VarRes) = (b[2:2,:],vec(σ).^2)

colNames = [string("asset ",i) for i=1:n]
printblue("β for $n assets, from OLS of Re on constant and Rme:")
printmat(β,colNames=colNames,rowNames=["β on Rme"])

β for 5 assets, from OLS of Re on constant and Rme:
           asset 1   asset 2   asset 3   asset 4   asset 5
β on Rme     1.341     1.169     0.994     0.943     0.849



## Step 2: Use OLS Estimates to Calculate the Covariance Matrix

The single index model implies that the covariance of assets $i$ and $j$ is

$\sigma_{ij} = \beta_i \beta_j \text{Var}(R_{mt}) + \text{Cov}(\varepsilon_{it},\varepsilon_{jt}),$

where we *assume* that $\text{Cov}(\varepsilon_{it},\varepsilon_{jt}) = 0$ if $i \neq j$ (for $i = j$, the term is the residual variance $\text{Var}(\varepsilon_{it})$).

The betas are typically estimated by a linear regression (OLS), as above.
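
The covariance expression follows from substituting the regression equation into the covariance. As a sketch of that step, write the single-index regression as $R_{it} = \alpha_i + \beta_i R_{mt} + \varepsilon_{it}$ (the symbol $\alpha_i$ for the intercept is notation introduced here) and use that OLS residuals are uncorrelated with the regressor, so the cross terms $\beta_i \text{Cov}(R_{mt},\varepsilon_{jt})$ and $\beta_j \text{Cov}(\varepsilon_{it},R_{mt})$ drop out:

$\text{Cov}(R_{it},R_{jt}) = \text{Cov}(\beta_i R_{mt} + \varepsilon_{it}, \beta_j R_{mt} + \varepsilon_{jt}) = \beta_i \beta_j \text{Var}(R_{mt}) + \text{Cov}(\varepsilon_{it},\varepsilon_{jt}).$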

### A Remark on the Code
- The `CovFromIndexModel()` function can handle both the case with one factor (as used here) and with several factors (a multi-index model, as will be used further down). In the code $\beta$ is a $K \times n$ matrix, where $K$ is the number of factors ($K=1$ for the single index model) and $n$ is the number of assets.

In [7]:
"""
    CovFromIndexModel(b,VarRes,Ω)

Calculate covariance matrix from a multi-factor model.

b is Kxn where K is the number of factors and n is the number of assets.

Cov(Ri,Rj) = bᵢ'*Ω*bⱼ for i ≠ j and Var(Ri) = bᵢ'*Ω*bᵢ + VarRes[i], where Ω is Cov(indices),
VarRes is the vector of residual variances and bᵢ is the vector of slope coefficients
from regressing Ri on a constant and the indices (and bⱼ is for asset j).
"""
function CovFromIndexModel(b,VarRes,Ω)    #coefs for regression i are in b[:,i]
    n    = length(VarRes)
    CovR = fill(NaN,n,n)
    for i = 1:n, j = 1:n         #loop over both i and j
        CovR[i,j] = i == j ? b[:,i]'Ω*b[:,i] + VarRes[i] : b[:,i]'Ω*b[:,j]
    end
    return CovR
end

CovFromIndexModel
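
The double loop in `CovFromIndexModel` can also be written as a single matrix expression, $b^{\prime}\Omega b$ plus a diagonal matrix of residual variances. The sketch below shows this under the same $K \times n$ layout of `b`. The function name `cov_from_index_model_mat` and the toy numbers are hypothetical and not part of the notebook's files.

```julia
using LinearAlgebra

# Vectorized counterpart to the double loop (hypothetical name, illustration only)
function cov_from_index_model_mat(b,VarRes,Ω)
    return b'*Ω*b + Diagonal(VarRes)   #b is Kxn, Ω is KxK (or a scalar if K=1), VarRes is an n-vector
end

# Toy inputs (made up): K=1 factor, n=2 assets
b_toy      = [1.0 0.5]                 #1x2 matrix of betas
VarRes_toy = [1.0, 2.0]                #residual variances
Ω_toy      = 4.0                       #variance of the single factor

Σ_toy = cov_from_index_model_mat(b_toy,VarRes_toy,Ω_toy)
```

With these inputs, the off-diagonal element is $1.0 \times 0.5 \times 4 = 2$ and the diagonal elements are $1.0^2 \times 4 + 1 = 5$ and $0.5^2 \times 4 + 2 = 3$. The loop version is easier to read against the formula, while the matrix version avoids the explicit indexing.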

In [8]:
σₘₘ = var(Rme)
Σ₂ = CovFromIndexModel(β,VarRes,σₘₘ)

printblue("Covariance matrix calculated from betas:")
printmat(Σ₂)

printblue("Difference to data:")
printmat(Σ₂-S)

Covariance matrix calculated from betas:
    73.475    33.232    28.269    26.797    24.146
    33.232    37.371    24.644    23.361    21.049
    28.269    24.644    26.796    19.872    17.906
    26.797    23.361    19.872    25.200    16.973
    24.146    21.049    17.906    16.973    25.335

Difference to data:
    -0.000    -9.791    -1.202     1.784     3.958
    -9.791    -0.000    -4.428    -1.775    -0.357
    -1.202    -4.428    -0.000    -3.760    -3.075
     1.784    -1.775    -3.760    -0.000    -4.560
     3.958    -0.357    -3.075    -4.560     0.000



# Covariance Matrix from a Multi-Index Model

A multi-index model is based on 

$R_{it} =a_{i}+b_{i}^{\prime}I_{t}+\varepsilon_{it}$,

where $b_{i}$ is a $K$-vector of slope coefficients.

If $\Omega$ is the covariance matrix of the indices $I_t$, then the covariance of
assets $i$ and $j$ is

$\sigma_{ij}=b_{i}^{\prime}\Omega b_{j}  + \text{Cov}(\varepsilon_{it},\varepsilon_{jt}),$

where we again assume that $\text{Cov}(\varepsilon_{it},\varepsilon_{jt}) = 0$ if $i \neq j$.

In [9]:
x          = [ones(T) Rme RSMB RHML]               #regressors

(b,σ)      = OlsM(Re,x)                      
(β,VarRes) = (b[2:end,:],vec(σ).^2)

printblue("OLS slope coefficients:")
printmat(β,colNames=colNames,rowNames=["β on Rme", "β on RSMB", "β on RHML"])

OLS slope coefficients:
            asset 1   asset 2   asset 3   asset 4   asset 5
β on Rme      1.070     1.080     1.035     1.056     1.041
β on RSMB     1.264     0.768     0.437     0.153    -0.088
β on RHML    -0.278     0.160     0.487     0.603     0.770



In [10]:
Ω = cov(x[:,2:end])      #covariance matrix of the (non-constant) factors

Σ₃ = CovFromIndexModel(β,VarRes,Ω)

printblue("Covariance matrix calculated from betas:")
printmat(Σ₃)

printblue("Difference to data:")
printmat(Σ₃-S)

Covariance matrix calculated from betas:
    73.475    41.847    31.050    25.449    18.940
    41.847    37.371    27.384    24.141    20.145
    31.050    27.384    26.796    22.239    20.109
    25.449    24.141    22.239    25.200    20.717
    18.940    20.145    20.109    20.717    25.335

Difference to data:
    -0.000    -1.176     1.578     0.436    -1.248
    -1.176    -0.000    -1.688    -0.995    -1.261
     1.578    -1.688    -0.000    -1.393    -0.873
     0.436    -0.995    -1.393    -0.000    -0.816
    -1.248    -1.261    -0.873    -0.816    -0.000



# A Shrinkage Estimator

A shrinkage estimator of the covariance matrix is

$\Sigma = \delta F + (1-\delta)S$,

where $0 < \delta < 1$, $F$ is the "target covariance matrix" (for instance, from the constant correlation approach or one of the index models) and $S$ is the sample variance-covariance matrix.

Since $S$ matches the sample moments exactly, any $\delta > 0$ makes the in-sample fit worse (by construction). The estimator may nevertheless be better out-of-sample (as a forecast for the coming period).

In [11]:
δ = 0.6
Σ₄ = δ*Σ₁ + (1-δ)*S

printmat(Σ₄)

    73.475    40.899    31.849    29.458    27.581
    40.899    37.371    25.935    23.928    22.473
    31.849    25.935    26.796    21.201    20.172
    29.458    23.928    21.201    25.200    20.036
    27.581    22.473    20.172    20.036    25.335
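
To illustrate the in-sample cost of shrinkage, here is a minimal sketch with made-up 2x2 matrices. The helper `shrink_cov` and all numbers are hypothetical and not part of the notebook's files. Since $\Sigma - S = \delta(F - S)$, the distance to the sample matrix grows linearly in $\delta$.

```julia
using LinearAlgebra

# Hypothetical helper: convex combination of a target matrix F and the sample matrix S
shrink_cov(S,F,δ) = δ*F + (1-δ)*S

# Toy 2x2 matrices (made up): same variances, different covariance
S_toy = [4.0 3.0; 3.0 9.0]
F_toy = [4.0 1.5; 1.5 9.0]

for δ in (0.0, 0.3, 0.6, 1.0)
    Σ_δ  = shrink_cov(S_toy,F_toy,δ)
    dist = norm(Σ_δ - S_toy)           #Frobenius distance to the sample matrix
    println("δ = $δ: off-diagonal = $(Σ_δ[1,2]), distance to S = $dist")
end
```

Whether a given $\delta$ pays off out-of-sample cannot be read from such an in-sample distance. That would require evaluating the estimates on data not used in the estimation.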

