# Calendar Time Regressions vs. Panel Regressions (extra)

This notebook illustrates how calendar time regressions (form portfolios based on characteristics and then estimate a system of regressions) are related to panel regressions. The approach is to *apply* some functions, rather than to explain how they are built up.

## Load Packages and Extra Functions

In [1]:
using Printf, Statistics, LinearAlgebra, FileIO, JLD2, HDF5

include("jlFiles/printmat.jl")
include("jlFiles/Ols.jl")
include("jlFiles/CovNWFn.jl")
include("jlFiles/OlsSureFn.jl")
include("jlFiles/PanelOls.jl")
include("jlFiles/UtilityFunctions.jl");

## Loading Data

The data is stored in a JLD2 file, a format for saving Julia data structures that is built on a subset of HDF5. It is loaded with the [JLD2.jl](https://github.com/JuliaIO/JLD2.jl) package.

`Re` is a $T\times N$ matrix with daily excess returns (stored as `ER` in the file), `Factors` is a $T\times 3$ matrix of pricing factors, and `N_Changes` is an $N$-vector with the number of fund changes (over the entire sample).

In [2]:
(Re,Factors,N_Changes) = load("Data/PPM.jld2","ER","Factors","N_Changes")

(T,N) = size(Re)
D     = N_Changes .> 50                #logical dummies: very active
D0    = .!D                            #inactive

println("T=$T and N=$N")

T=2354 and N=2637


## Individual alphas

The following code takes the matrix of individual daily excess returns `Re` (a $T\times N$ matrix) and runs one regression for each of the $N$ individuals on three risk factors (in `Factors`, a $T\times 3$ matrix): the excess returns on Swedish equity, Swedish bonds and international equity. Each regression also includes a constant, whose coefficient is the alpha.

The `D` vector ($N$ elements) is: `D[i] = false` if investor $i$ is classified as inactive (no/few portfolio changes, see above), and `D[i] = true` if active (many portfolio changes).

The cell reports the average annualised alpha (the daily alpha times 252) for each of the two groups defined by `D`.
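
In equation form (the notation is introduced here for illustration and is not taken from the included functions), the regression for investor $i$ is

$$
R^e_{it} = \alpha_i + \beta_i' f_t + \varepsilon_{it}, \quad t = 1,\dots,T,
$$

where $R^e_{it}$ is the excess return of investor $i$ on day $t$ and $f_t$ is the $3\times 1$ vector of factors. With $N_g$ investors in group $g$ (inactive or active), the reported group average is

$$
\bar{\alpha}_g = \frac{252}{N_g}\sum_{i \in g}\hat{\alpha}_i .
$$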

In [3]:
alphaM = fill(NaN,N)                                #individual alphas
for i = 1:N
   #local b           #local/global is needed in script
   b, = OlsNWFn(Re[:,i],[Factors ones(T)],0)
   alphaM[i] = b[end]
end

printblue("\nAverage annualised alphas for each of the two groups:\n")
xx = [mean(alphaM[D0]) mean(alphaM[D])]*252
colNames = ["Inactive","Active"]
printmat(xx;colNames,rowNames=["α"])


[34m[1mAverage annualised alphas for each of the two groups:[22m[39m

   Inactive    Active
α    -0.787     6.217



## Calendar Time Portfolios

The following code creates two time series (with $T$ observations in each) of portfolio returns: one for inactive investors, the other for active investors. In both cases, the portfolios are equally weighted, so the return is the average return of those in the portfolio.

Then, it calculates the (time series) average excess returns, the Sharpe ratios and finally the alphas.

The alphas and betas are estimated with OLS, and we test the hypothesis that the two alphas are the same, using an OLS system estimation (SURE).
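
Because all regressions use the same regressors and OLS is linear in the dependent variable, the alpha of an equally weighted portfolio equals the average of the alphas of its members. This is why the portfolio alphas coincide with the group averages of the individual alphas. A minimal sketch on simulated data (the sample sizes, the seed and all variable names are assumptions made for illustration, unrelated to the PPM data; the built-in `\` operator is used instead of the included OLS functions):

```julia
using Statistics, Random

Random.seed!(123)
T, N = 200, 6                          # hypothetical sample sizes
f    = randn(T,3)                      # simulated factors
x    = [ones(T) f]                     # regressors: constant first, then the factors
Re   = x*randn(4,N) + 0.1*randn(T,N)   # simulated excess returns, T x N

b_ind  = x\Re                          # 4 x N matrix: one OLS regression per column of Re
alphaM = b_ind[1,:]                    # individual alphas
b_port = x\mean(Re,dims=2)             # regression for the equally weighted portfolio

println("average individual alpha: ", mean(alphaM))
println("portfolio alpha:          ", b_port[1])
```

The two printed numbers agree up to rounding error. The result relies on a balanced panel (no missing returns) and equal weights.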

### A Remark on the Code

- The system/SURE approach is implemented in the function `OlsSureFn` (included in one of the first cells above).

- If `x` is a 1x1 matrix or a vector with a single element, then `only(x)` returns that element as a scalar.
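
A small illustration of `only()` (the numbers are made up): multiplying a row vector by a vector or a matrix gives a one-element array, which `only()` converts to a scalar.

```julia
R = [1 0 -1]                        # 1x3 matrix (row vector)
b = [0.5, 2.0, 0.2]                 # 3-element vector
C = [1.0 0 0; 0 1 0; 0 0 1]         # 3x3 matrix

d = R*b                             # a 1-element vector, not a scalar
a = only(d)                         # the scalar b[1] - b[3]
v = only(R*C*R')                    # R*C*R' is a 1x1 matrix; only() extracts its element

println(typeof(d), " ", typeof(a), " ", typeof(v))
```

`only()` throws an error if the collection does not have exactly one element, so it also works as a check on the dimensions.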

In [4]:
PortfRe = hcat( mean(Re[:,D0],dims=2),mean(Re[:,D],dims=2) )   #create portfolios as average across individuals

printblue("group by group, annualised values:\n")
Avg = mean(PortfRe,dims=1)*252          #average excess return on portfolios, annualised
Std = std(PortfRe,dims=1)*sqrt(252)
SR  = Avg./Std
(b,res,yhat,Covb) = OlsSureFn(PortfRe,[ones(T) Factors],true,0)

xx = [Avg;Std;SR;b[1:1,:]*252]
printmat(xx;colNames,rowNames=["Avg","Std","SR","α"])

[34m[1mgroup by group, annualised values:[22m[39m

     Inactive    Active
Avg    -1.262     5.534
Std    15.728    13.882
SR     -0.080     0.399
α      -0.787     6.217



In [5]:
R       = [1 0 0 0 -1 0 0 0]                       #testing if α₁ = α₂
a_diff  = only(R*vec(b))                           #only() to make it a scalar
tstatLS = a_diff/sqrt(only(R*Covb*R'))

printblue("diff of annual alphas:\n")
xx = [a_diff*252;tstatLS]
printmat(xx;rowNames=["α₁-α₂","t-stat"])

[34m[1mdiff of annual alphas:[22m[39m

α₁-α₂     -7.004
t-stat    -2.784



## Panel Regressions

Finally, a panel ($T\times N$) regression is done by stacking all data points, with the constant and the factors interacted with the activity dummies so that each group gets its own alpha and betas. The hypothesis of equal alphas is tested with both a traditional OLS covariance matrix (which assumes iid residuals) and a Driscoll-Kraay covariance matrix (which accounts for cross-sectional correlations).
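
Written out (in notation introduced here for illustration), with $D_i = 1$ for an active investor and $D_i = 0$ for an inactive one, the panel regression is

$$
R^e_{it} = (1-D_i)\,(\alpha_1 + \beta_1' f_t) + D_i\,(\alpha_2 + \beta_2' f_t) + \varepsilon_{it},
$$

so group 1 is the inactive investors and group 2 the active ones. Stacking the coefficients as $\theta = (\alpha_1, \beta_1', \alpha_2, \beta_2')'$, the hypothesis $\alpha_1 = \alpha_2$ is $R\theta = 0$ with $R = [1\ 0\ 0\ 0\ {-1}\ 0\ 0\ 0]$, and the test statistic is

$$
t = \frac{R\hat{\theta}}{\sqrt{R\,\widehat{\mathrm{Cov}}(\hat{\theta})\,R'}} .
$$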

The code for the panel regression is in the function `PanelOls()`. It does a straightforward LS regression and then estimates the covariance matrix in several different ways: traditional OLS, White, Driscoll-Kraay and optionally also clustered (the cluster/group membership can be supplied to the function). Also, autocorrelation can be accounted for by applying a Newey-West approach to the (White, DK, clustered) methods.
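
To make the difference between the covariance estimators concrete, the following is a minimal sketch of the traditional OLS and Driscoll-Kraay covariance matrices. It is not the code in `PanelOls()`: the function `panel_cov_sketch` is hypothetical, it assumes a balanced panel, ignores autocorrelation and skips degrees-of-freedom adjustments. The data are simulated with a shock that is common to all individuals within a period.

```julia
using LinearAlgebra, Random

# Pooled LS with two covariance estimators (sketch: balanced panel, no autocorrelation)
function panel_cov_sketch(Y,X)
    (T,K,N) = size(X)
    Sxx = zeros(K,K)
    Sxy = zeros(K)
    for t in 1:T, i in 1:N
        x_it = X[t,:,i]
        Sxx += x_it*x_it'
        Sxy += x_it*Y[t,i]
    end
    theta = Sxx\Sxy                          # pooled LS estimate

    s2  = 0.0                                # sum of squared residuals
    Sdk = zeros(K,K)
    for t in 1:T
        h_t = zeros(K)                       # sum over i of x_it*e_it in period t
        for i in 1:N
            x_it = X[t,:,i]
            e_it = Y[t,i] - dot(x_it,theta)
            h_t += x_it*e_it
            s2  += e_it^2
        end
        Sdk += h_t*h_t'                      # outer product of the period sums
    end
    CovLS = inv(Sxx)*(s2/(T*N))              # assumes iid residuals
    CovDK = inv(Sxx)*Sdk*inv(Sxx)            # allows correlation across i within a period
    return (theta=theta, CovLS=CovLS, CovDK=CovDK)
end

Random.seed!(42)
T, N, K = 300, 20, 2                         # hypothetical sizes
X = randn(T,K,N)
X[:,1,:] .= 1.0                              # first regressor is a constant
common = randn(T)                            # shock shared by all individuals in period t
Y = [0.5 + 1.0*X[t,2,i] + common[t] + randn() for t in 1:T, i in 1:N]

fn = panel_cov_sketch(Y,X)
println("std of constant: LS ", sqrt(fn.CovLS[1,1]), ", DK ", sqrt(fn.CovDK[1,1]))
```

With a common shock, the residuals are correlated across individuals, so the OLS standard error of the constant is too small while the Driscoll-Kraay one is considerably larger. In the group-dummy setup of this notebook, summing the moments across individuals within a period reproduces the portfolio-level moments (up to scale), which is consistent with the Driscoll-Kraay t-stat matching the one from the calendar time regressions.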

In calling `PanelOls()`, we use the individual returns (`Re`, which is $T \times N$) as the dependent variable and a $T \times K \times N$ array of regressors (interactions of `[ones(T) Factors]` with the dummies in `[D0 D]`, so $K=8$ here). This approach is somewhat wasteful with memory since the dummies are (here) time-invariant. However, `PanelOls()` is set up to also handle more general cases.
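
The layout of the regressor array can be sketched on a small simulated example (the sizes, dummies and data are made up, and the stacked regression uses the built-in `\` operator rather than `PanelOls()`). It also shows that, in a balanced panel, the pooled coefficients equal those from the two portfolio regressions.

```julia
using Statistics, Random

Random.seed!(1)
T, N = 100, 5                                    # hypothetical sizes
x    = [ones(T) randn(T,3)]                      # constant and three simulated factors
K1   = size(x,2)
Re   = x*randn(K1,N) + 0.1*randn(T,N)            # simulated excess returns, T x N
D    = [true,false,true,false,false]             # hypothetical activity dummies
D0   = .!D

X = fill(NaN,T,2*K1,N)                           # T x K x N array of regressors
for i in 1:N
    X[:,:,i] = hcat(x.*D0[i],x.*D[i])            # columns 1:4 active only if D0[i], 5:8 only if D[i]
end

Xs    = reduce(vcat,[X[:,:,i] for i in 1:N])     # stack individual by individual, (T*N) x K
ys    = vec(Re)                                  # vec() stacks the columns in the same order
theta = Xs\ys                                    # pooled LS estimate

b0 = x\mean(Re[:,D0],dims=2)                     # portfolio regression, inactive group
b1 = x\mean(Re[:,D],dims=2)                      # portfolio regression, active group
println(maximum(abs.(theta - [vec(b0);vec(b1)])))
```

The printed maximum difference is zero up to rounding error.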

In [6]:
# ?PanelOls        #uncomment to see the documentation

In [7]:
printblue("panel regression:\n")

x  = [ones(T) Factors]
K1 = size(x,2)
X = fill(NaN,T,2*K1,N)                  #create TxKxN array of regressors
for i = 1:N
    X[:,:,i] = hcat(x.*D0[i],x.*D[i])
end

fnO = PanelOls(Re,X)                        #panel regression

R       = [1 0 0 0 -1 0 0 0]                #testing if α₁ = α₂
a_diff  = only(R*vec(fnO.theta))

tstatLS = a_diff/sqrt(only(R*fnO.CovLS*R'))
tstatDK = a_diff/sqrt(only(R*fnO.CovDK*R'))

xx = [a_diff*252;tstatLS;tstatDK]
printmat(xx;rowNames=["α₁-α₂","t-stat (LS)","t-stat (DK)"])

printred("\nCompare with calendar time regressions. Also notice the difference (any?) between the two t-stats")

[34m[1mpanel regression:[22m[39m

α₁-α₂          -7.004
t-stat (LS)   -24.017
t-stat (DK)    -2.784


[31m[1mCompare with calendar time regressions. Also notice the difference (any?) between the two t-stats[22m[39m
