# Calendar Time Regressions vs. Panel Regressions

## Loading Packages

In [1]:
using Compat, Missings        #in Julia 0.6 
#using Dates, DelimitedFiles, Statistics, LinearAlgebra  #in Julia 0.7 (mean/std and diag live in stdlibs)

include("jlFiles/printmat.jl")
include("jlFiles/NWFn.jl")
include("jlFiles/OlsSureFn.jl")

OlsSureFn

## Loading Data

In [2]:
ER1 = readdlm("Data/PPM_ER1.csv",',')                   #load data from csv files
ER2 = readdlm("Data/PPM_ER2.csv",',')
ER  = [ER1;ER2]
#ER = randn(2354,2637)           #uncomment this line (and comment out the previous 3 lines)
                                 #if you do not have PPM_ER1.csv and PPM_ER2.csv
(ER1,ER2) = (nothing,nothing)

Factors   = readdlm("Data/PPM_Factors.csv",',')         
Investors = readdlm("Data/PPM_N_Changes.csv",',')
N_Changes = Investors[:,1]

(T,N) = size(ER)
D     = N_Changes .> 50                #logical dummy: true if active (more than 50 changes)
D0    = .!D                            #true if inactive

println("T=$(size(ER,1)) and N=$(size(ER,2))")
println("done loading data")

T=2354 and N=2637
done loading data


## Individual alphas

The following code takes the matrix of individual daily
excess returns, $ER_{T\times N}$, and runs one regression for each individual on
three risk factors, $Factors_{T\times3}$ (excess returns on Swedish equity, Swedish
bonds and international equity). 

The $D$ vector ($N$ elements) is a dummy: `D[i]` is `true` if investor $i$ is active and `false` if inactive. 

The next cell shows the average alphas for the two groups.
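
As a rough illustration of the per-investor regressions, the sketch below runs them on simulated data with Julia's built-in least-squares operator `\`. It assumes Julia 1.x syntax; the sample sizes, the simulated returns and the name `trueAlpha` are made up for the example, and the `\` operator is used in place of `OlsSureFn`, whose internals are not shown here.

```julia
using Random, Statistics

Random.seed!(1)
T, N, K   = 500, 6, 3                          # small simulated sample (hypothetical sizes)
Factors   = randn(T, K)                        # simulated factor returns
trueAlpha = [0.0, 0.0, 0.0, 0.1, 0.1, 0.1]     # hypothetical alphas: last 3 investors are "active"
ER = Factors * fill(0.5, K, N) .+ trueAlpha' .+ 0.1 * randn(T, N)

D  = trueAlpha .> 0                            # active dummy, N elements
D0 = .!D                                       # inactive dummy

X      = [Factors ones(T)]                     # constant in the last column
alphaM = fill(NaN, N)
for i in 1:N
    b = X \ ER[:, i]                           # least-squares coefficients for investor i
    alphaM[i] = b[end]                         # the intercept is the alpha
end

avgAlpha = [mean(alphaM[D0]) mean(alphaM[D])]  # average alpha: inactive, active
println(round.(avgAlpha, digits=3))
```

The group averages should land close to the simulated alphas (0 and 0.1), since each alpha is estimated from 500 observations with small noise.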

In [3]:
alphaM = fill(NaN,N)                                #individual alphas
for i = 1:N
   local b 
   b, = OlsSureFn(ER[:,i],[Factors ones(T)])
   alphaM[i] = b[end]
end

println("\nAverage annualised alphas for the two groups")
printmat([mean(alphaM[D0]) mean(alphaM[D])]*252)


Average annualised alphas for the two groups
    -0.787     6.217



## Calendar Time Portfolios

The following code creates two time series ($T\times1$) of portfolio returns: one is the cross-sectional average return of the inactive investors, the other of the active investors. 

Then, it calculates the average excess returns and the Sharpe ratios. 

The alphas and betas are estimated with OLS, and we test the hypothesis that the two alphas are the same (using a SURE approach).
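
The steps above can be sketched on simulated data as follows (Julia 1.x syntax; the data and the grouping are hypothetical). Because both portfolios are regressed on the same regressors, the difference of the two alphas equals the alpha from regressing the return difference on those regressors. The sketch uses that shortcut with a classical iid standard error, which is an assumption of this example and not necessarily what `OlsSureFn` computes.

```julia
using Random, Statistics, LinearAlgebra

Random.seed!(2)
T, N    = 1000, 8
Factors = 0.01 * randn(T, 3)                       # simulated factors (hypothetical data)
ER      = Factors * ones(3, N) .+ 0.01 * randn(T, N)
D       = [falses(4); trues(4)]                    # hypothetical grouping: last 4 are active
D0      = .!D

# calendar time portfolios: cross-sectional average return in each period
PortfER = [mean(ER[:, D0], dims=2) mean(ER[:, D], dims=2)]   # Tx2

Avg = mean(PortfER, dims=1) * 252                  # annualised average excess return
Std = std(PortfER, dims=1) * sqrt(252)             # annualised standard deviation
SR  = Avg ./ Std                                   # Sharpe ratios

X      = [ones(T) Factors]                         # constant in the first column
b      = X \ PortfER                               # 4x2, one column per portfolio
a_diff = b[1, 1] - b[1, 2]                         # alpha(inactive) - alpha(active)

# same difference from a single regression of the return difference on X
d     = PortfER[:, 1] - PortfER[:, 2]
bd    = X \ d
e     = d - X * bd
s2    = sum(abs2, e) / (T - size(X, 2))            # residual variance (iid assumption)
tstat = bd[1] / sqrt(s2 * inv(X'X)[1, 1])
```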

In [4]:
println("\nLS, group by group")

PortfER      = fill(NaN,(T,2))     #create portfolios as average across individuals
PortfER[:,1] = Compat.mean(ER[:,D0],dims=2)    #Tx1, portfolio return = average individual return
PortfER[:,2] = Compat.mean(ER[:,D],dims=2)     #0.7 syntax


Avg = Compat.mean(PortfER,dims=1)*252          #average excess return on portfolios
Std = Compat.std(PortfER,dims=1)*sqrt(252)     #0.7 syntax
SR  = Avg./Std
(b,res,yhat,Covb) = OlsSureFn(PortfER,[ones(T) Factors])

println("\nStats for the two portfolios:")
println("     Avg       Std       SR      alpha")
printmat([Avg' Std' SR' b[1:1,:]'*252])

R       = [1 0 0 0 -1 0 0 0]                       #testing if alpha(1) = alpha(2)
a_diff  = (R*vec(b))[1]                            #[1] to make it a scalar      
tstatLS = a_diff/sqrt((R*Covb*R'/T)[1])

println("diff of annual alpha (inactive - 51+), tstat (LS)")
printmat([a_diff*252 tstatLS])


LS, group by group

Stats for the two portfolios:
     Avg       Std       SR      alpha
    -1.262    15.728    -0.080    -0.787
     5.534    13.882     0.399     6.217

diff of annual alpha (inactive - 51+), tstat (LS)
    -7.004    -2.784



## Panel Regressions

Finally, a panel ($T\times N$) regression is done by simply stacking all data
points, with the factors (and the constant) interacted with the dummies. The
hypothesis of equal alphas is tested by both an OLS approach (assuming that
all data points are iid) and a Driscoll-Kraay (DK) approach (which accounts for
cross-sectional correlations).

The code for the panel regression is in the function `HszDkFn()`. It does a
straightforward LS regression (by a loop over $t$, to save memory) and
then estimates the covariance matrix of the moment conditions as in
Driscoll-Kraay (allowing for cross-sectional correlations). The code makes no attempt to be fast.
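
For reference, the quantities computed by `HszDkFn()` can be written out as follows. This is a reading of the code in the next cell, using its names: $x_t$ holds the constant and the factors at time $t$, $z_i$ holds the two dummies of investor $i$, and $x_{it} = z_i \otimes x_t$ is the effective regressor vector.

$$
ER_{it} = x_{it}'\theta + \varepsilon_{it}, \qquad
\hat{\theta} = S_{xx}^{-1}S_{xy}, \qquad
S_{xx} = \frac{1}{TN}\sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}x_{it}', \qquad
S_{xy} = \frac{1}{TN}\sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}ER_{it}
$$

$$
h_t = \frac{1}{N}\sum_{i=1}^{N} x_{it}\hat{\varepsilon}_{it}, \qquad
\hat{S} = \frac{1}{T^2}\sum_{t=1}^{T} h_t h_t', \qquad
\mathrm{Cov}_{DK}(\hat{\theta}) = S_{xx}^{-1}\hat{S}S_{xx}^{-1}
$$

$$
s^2 = \frac{1}{(TN)^2}\sum_{t=1}^{T}\sum_{i=1}^{N}\hat{\varepsilon}_{it}^2, \qquad
\mathrm{Cov}_{LS}(\hat{\theta}) = S_{xx}^{-1}s^2
$$

Note that $\hat{S}$ as coded uses only the contemporaneous outer products $h_t h_t'$, so it allows for cross-sectional correlation but includes no lagged (autocovariance) terms.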

In [5]:
function HDirProdFn(x,y)
#HDirProdFn    Calculates horizontal direct product of two matrices with equal number of rows.
#              z[i,:] is the Kronecker product of x[i,:] and y[i,:]
  Kx = size(x,2)       #columns in x
  Ky = size(y,2)       #columns in y
  z  = repeat(y,outer=(1,Kx)) .* kron(x,ones(Int,1,Ky))   #in 0.7: repeat(y,1,Kx) works too
  return z
end
#-----------------------------------------------

function HszDkFn(y,x,z)
#HszDkFn   LS and Driscoll-Kraay standard errors for panel, assuming x(t,i) = x(t) * z(i)

  (T,N) = (size(y,1),size(y,2))
  K     = size(x,2)*size(z,2)

  Sxx = zeros(K,K)
  Sxy = zeros(K,1)
  for t = 1:T                           #OLS by looping over t
    y_t  = y[t,:]                       #dependent variable, Nx1
    x0_t = repeat(x[t:t,:],outer=(N,1)) #regressors in t, repeated for each individual, NxKx
    x_t  = HDirProdFn(z,x0_t)           #effective regressors, z is NxKz, x_t is NxK
    Sxx  = Sxx + x_t'x_t/(T*N)          #building up Sxx and Sxy
    Sxy  = Sxy + x_t'y_t/(T*N)
  end
  theta = Sxx\Sxy

  s2     = 0.0
  omegaj = zeros(K,K)
  for t = 1:T                          #Covariance matrix by looping over t
    y_t  = y[t,:]                      #create y_t and x_t (again)
    x0_t = repeat(x[t:t,:],outer=(N,1))
    x_t  = HDirProdFn(z,x0_t)
    e_t  = y_t - x_t*theta             #residuals in t
    h_t  = (x_t'e_t)'/N                #moment conditions in t (divided by N)
    omegaj = omegaj + h_t'h_t          #building up covariance matrix
    s2     = s2 + sum(e_t.^2)/N^2
  end
  Shat = omegaj/T^2                     #estimate of S
  s2   = s2/T^2

  zx_1  = inv(Sxx)
  CovDK = zx_1 * Shat * zx_1'                     #covariance matrix, DK
  stdDK = sqrt.(diag(CovDK))                      #standard errors, DK

  CovLS = zx_1 * s2                               #covariance matrix, LS iid
  stdLS = sqrt.(diag(CovLS))                      #standard errors, LS iid

  return theta,CovDK,CovLS

end

HszDkFn (generic function with 1 method)
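
To see what the horizontal direct product does, the small sketch below builds the effective regressors for three investors at one date and checks that each row is the Kronecker product of the corresponding rows. The name `hdirprod` is a hypothetical stand-in that copies the one-line body of `HDirProdFn`, and the numbers are made up.

```julia
# Row-wise Kronecker product; same expression as in HDirProdFn
hdirprod(x, y) = repeat(y, outer=(1, size(x, 2))) .* kron(x, ones(Int, 1, size(y, 2)))

z   = [1 0; 0 1; 0 1]                  # dummies [inactive active] for 3 investors
x_t = [1.0 0.2 -0.1]                   # 1x3: constant and two factor values at date t
x0  = repeat(x_t, outer=(3, 1))        # same regressors for every investor, 3x3
X_t = hdirprod(z, x0)                  # effective regressors, 3x6

for i in 1:3
    @assert X_t[i, :] == kron(z[i, :], x0[i, :])   # row i is z[i,:] ⊗ x0[i,:]
end
display(X_t)
```

The first three columns are non-zero only for the inactive investor and the last three only for the active ones, so the first three coefficients belong to the inactive group and the last three to the active group. With four regressors, as in `HszDkFn(ER,[ones(T) Factors],...)`, the alphas sit in positions 1 and 5, which is why `R` picks out those two elements.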

In [6]:
println("\npanel")
(theta,CovDK,CovLS) = HszDkFn(ER,[ones(T) Factors],[D0 D] .+ 0.0)

R       = [1 0 0 0 -1 0 0 0]                #testing if alpha(1) = alpha(2)
a_diff  = (R*vec(theta))[1]
tstatLS = a_diff/sqrt((R*CovLS*R')[1])
tstatDK = a_diff/sqrt((R*CovDK*R')[1])

println("\ndiff of annual alpha (inactive - 51+)")
println("     diff    tstat (LS)  tstat (DK)")
printmat([a_diff*252 tstatLS tstatDK])

println("\nCompare with calendar regressions. Also notice the difference (any?) between the two t-stats")


panel

diff of annual alpha (inactive - 51+)
     diff    tstat (LS)  tstat (DK)
    -7.004   -24.017    -2.784


Compare with calendar regressions. Also notice the difference (any?) between the two t-stats
