# Calendar Time Regressions vs. Panel Regressions

In [1]:
include("jlFiles/printmat.jl")
include("jlFiles/OlsFn.jl")

OlsFn (generic function with 1 method)

In [2]:
ER1 = readdlm("Data/PPM_ER1.csv",',')                   #load data from csv files
ER2 = readdlm("Data/PPM_ER2.csv",',')
ER  = [ER1;ER2]
#ER = randn(2354,2637)           #uncomment this line (and comment out the previous 3 lines)
                                 #if you do not have PPM_ER1.csv and PPM_ER2.csv
(ER1,ER2) = (nothing,nothing)

Factors   = readdlm("Data/PPM_Factors.csv",',')         #no header line: Factors is a Tx3 matrix
Investors = readdlm("Data/PPM_N_Changes.csv",',')
N_Changes = Investors[:,1]

(T,N) = size(ER)
D     = N_Changes .> 50                #logical dummies: true if active (more than 50 changes)
D0    = broadcast(!,D)                 #true if inactive. Works on both 0.5 and 0.6. Could do .!D on 0.6

println("done loading data")

done loading data


## Individual alphas

The following code takes the matrix of individual daily
excess returns $ER_{T\times N}$ and runs one regression for each individual on
three risk factors, $Factors_{T\times3}$ (excess returns on Swedish equity, Swedish
bonds and international equity). The alpha is the intercept of that regression.

The $D_{N\times1}$ vector contains dummies: 0 (false) for an inactive investor, 1 (true) for an active investor (more than 50 changes according to N_Changes).

The code prints the average annualised alphas for the two groups.
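
As a sketch, the regression for investor $i$ can be written as below. The symbols $\alpha_i$, $\beta_i$, $f_t$ and $\varepsilon_{it}$ are notation introduced here for illustration; they do not appear in the code.

$$
ER_{it} = \alpha_i + \beta_i' f_t + \varepsilon_{it}, \quad t = 1,\dots,T,
$$

where $f_t$ is the $3\times1$ vector of factors in row $t$ of $Factors$. The annualisation multiplies the daily alphas by 252, as in the printmat call of the next cell.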

In [3]:
alphaM = fill(NaN,N)                                #individual alphas
for i = 1:N
   b, = OlsFn(ER[:,i],[Factors ones(T)])
   alphaM[i] = b[end]
end

println("\nAverage annualised alphas for the two groups")
printmat([mean(alphaM[D0]) mean(alphaM[D])]*252)


Average annualised alphas for the two groups
    -0.787     6.217



## Calendar Time Portfolios

The following code creates two $T\times1$ time series
of portfolio returns: one is the cross-sectional average return
of the inactive investors, the other of the active investors.

Then, it calculates the average excess returns and the Sharpe ratios. 
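
The annualisation used in the code can be summarised as follows, where $\bar{R}_p$ and $\sigma_p$ are notation introduced here for the daily mean and standard deviation of a portfolio's excess return. The factor 252 is the number of trading days per year used in the code.

$$
Avg = 252\,\bar{R}_p, \quad Std = \sqrt{252}\,\sigma_p, \quad SR = \frac{Avg}{Std}
$$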

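The alphas and betas are estimated with OLS, portfolio by portfolio, and we test the hypothesis that the two alphas are the
same (using a SURE approach, which accounts for the correlation between the residuals of the two portfolios).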
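
The internals of OlsFn are not shown in this notebook, so the following is only a minimal sketch of one common way to get a SURE-style covariance matrix when all equations share the same regressors. The function name ols_sur_sketch and the simulated data are hypothetical. Unlike the notebook cells, which target Julia 0.5/0.6, the sketch is written for Julia 1.x, and it assumes residuals that are iid over time.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not the notebook's OlsFn): OLS for several dependent
# variables that share the same regressors, with a SUR-style covariance matrix.
function ols_sur_sketch(Y, X)
    T     = size(Y, 1)
    b     = X \ Y                        # K x n, column j holds the coefficients for Y[:, j]
    res   = Y - X * b                    # T x n residuals
    Sigma = res' * res / T               # n x n residual covariance across equations
    Covb  = kron(Sigma, inv(X' * X))     # covariance of vec(b), assuming iid residuals over time
    return b, Covb
end

Random.seed!(123)
T       = 500
Factors = randn(T, 3)                    # simulated factors, for illustration only
X       = [ones(T) Factors]
Y       = X * [0.0 0.1; 1.0 0.9; 0.2 0.3; 0.5 0.4] + randn(T, 2)   # two simulated portfolios

b, Covb = ols_sur_sketch(Y, X)
R       = [1 0 0 0 -1 0 0 0]             # picks alpha(1) - alpha(2) from vec(b)
a_diff  = (R * vec(b))[1]
tstat   = a_diff / sqrt((R * Covb * R')[1])
println("alpha difference: ", a_diff, "  t-stat: ", tstat)
```

Since the constant is the first regressor, vec(b) stacks the four coefficients of portfolio 1 on top of those of portfolio 2, which is why R has its non-zero entries in positions 1 and 5.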

In [4]:
println("\nLS, group by group")

PortfER      = fill(NaN,(T,2))     #create portfolios as average across individuals
PortfER[:,1] = mean(ER[:,D0],2)    #Tx1, portfolio return = average individual return
PortfER[:,2] = mean(ER[:,D],2)

Avg = mean(PortfER,1)*252          #average excess return on portfolios
Std = std(PortfER,1)*sqrt(252)
SR  = Avg./Std
(b,res,yhat,Covb) = OlsFn(PortfER,[ones(T) Factors])

println("Stats for the portfolios:\n Avg, Std, SR, alpha")
printmat([Avg' Std' SR' b[1:1,:]'*252])

R       = [1 0 0 0 -1 0 0 0]                       #testing if alpha(1) = alpha(2)
a_diff  = R*vec(b)                                 
tstatLS = a_diff/sqrt((R*Covb*R')[1])

println("diff of annual alpha (inactive - 51+), tstat (LS)")
printmat([a_diff*252 tstatLS])


LS, group by group
Stats for the portfolios:
 Avg, Std, SR, alpha
    -1.262    15.728    -0.080    -0.787
     5.534    13.882     0.399     6.217

diff of annual alpha (inactive - 51+), tstat (LS)
    -7.004    -2.780



## Panel Regressions

Finally, a panel ($T\times N$) regression is done by stacking all data
points, with the factors (and the constant) interacted with the dummies. The
hypothesis of equal alphas is tested with both an OLS approach (assuming that
all residuals are iid) and a Driscoll-Kraay (DK) approach (which accounts for cross-sectional correlations).

The code for the panel regression is in the function HszDkFn. It does a
straightforward LS regression (by a loop over $t$, to save memory) and
then estimates the covariance matrix of the moment conditions as in
DK (allowing for cross-sectional correlations).
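
The calculations in HszDkFn can be sketched as follows. Here $x_t$ is the row of constant and factors, $z_i$ the row of dummies and $e_{it}$ the fitted residual; $x_{it}$ and $\varepsilon_{it}$ are shorthand introduced here, while $h_t$, $S_{xx}$ and $S_{xy}$ mirror the variable names in the code.

$$
\begin{aligned}
ER_{it} &= x_{it}'\theta + \varepsilon_{it}, \quad x_{it} = z_i \otimes x_t, \\
S_{xx} &= \frac{1}{TN}\sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}x_{it}', \quad
S_{xy} = \frac{1}{TN}\sum_{t=1}^{T}\sum_{i=1}^{N} x_{it}ER_{it}, \quad
\hat\theta = S_{xx}^{-1}S_{xy}, \\
h_t &= \frac{1}{N}\sum_{i=1}^{N} x_{it}e_{it}, \quad
\hat S = \frac{1}{T^2}\sum_{t=1}^{T} h_t h_t', \\
Cov_{DK}(\hat\theta) &= S_{xx}^{-1}\hat S S_{xx}^{-1}, \quad
Cov_{LS}(\hat\theta) = S_{xx}^{-1}\,\frac{1}{(TN)^2}\sum_{t=1}^{T}\sum_{i=1}^{N} e_{it}^2 .
\end{aligned}
$$

Because $h_t$ sums over investors before the outer product is taken, cross-sectional correlations of the residuals enter $\hat S$. The code includes no lags of $h_t$, so autocorrelation is not accounted for.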

In [5]:
function HDirProdFn(x,y)
#HDirProdFn    Calculates horizontal direct product of two matrices with equal number of rows.
#              z[i,:] is the Kronecker product of x[i,:] and y[i,:]
  Kx = size(x,2)       #columns in x
  Ky = size(y,2)       #columns in y
  z = repmat(y,1,Kx) .* kron(x,ones(Int,1,Ky))
  return z
end

function HszDkFn(y,x,z)
#HszDkFn   LS and Driscoll-Kraay standard errors for panel, assuming x(t,i) = x(t) * z(i)

  (T,N) = size(y)
  K = size(x,2)*size(z,2)

  Sxx = 0.0
  Sxy = 0.0
  for t = 1:T                          #OLS by looping over t
    y_t  = y[t:t,:]'                   #dependent variable, Nx1
    x0_t = repmat(x[t:t,:],N,1)        #factors repeated for each individual, NxKx
    x_t  = HDirProdFn(z,x0_t)          #effective regressors, z is NxKz, x_t is NxK
    Sxx  = Sxx + x_t'x_t/(T*N)         #building up Sxx and Sxy
    Sxy  = Sxy + x_t'y_t/(T*N)
  end

  theta = Sxx\Sxy

  s2     = 0.0
  omegaj = zeros(K,K)
  for t = 1:T                          #Covariance matrix by looping over t
    y_t  = y[t:t,:]'                   #create y_t and x_t (again)
    x0_t = repmat(x[t:t,:],N,1)
    x_t  = HDirProdFn(z,x0_t)
    e_t  = y_t - x_t*theta             #residuals in t
    h_t  = (x_t'e_t)'/N                #moment conditions in t (divided by N)
    omegaj = omegaj + h_t'h_t          #building up covariance matrix
    s2     = s2 + sum(e_t.^2)/N^2
  end
  Shat = omegaj/T^2                     #estimate of S
  s2   = s2/T^2                         #residual variance divided by T*N

  zx_1  = inv(Sxx)
  CovDK = zx_1 * Shat * zx_1'                     #covariance matrix, DK
  stdDK = sqrt.(diag(CovDK))                      #standard errors, DK

  CovLS = zx_1 * s2                               #covariance matrix, LS iid
  stdLS = sqrt.(diag(CovLS))                      #standard errors, LS iid

  return theta,CovDK,CovLS

end

HszDkFn (generic function with 1 method)
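
To see what the horizontal direct product does, here is a small sketch with two investors (one inactive, one active) and made-up factor values for a single day. The function hdirprod_sketch is a hypothetical Julia 1.x version of HDirProdFn (it uses repeat instead of repmat); the numbers are for illustration only.

```julia
using LinearAlgebra

# Row-wise Kronecker product: row i of the result is kron(x[i,:], y[i,:])
function hdirprod_sketch(x, y)
    Kx = size(x, 2)                      # columns in x
    Ky = size(y, 2)                      # columns in y
    return repeat(y, 1, Kx) .* kron(x, ones(Int, 1, Ky))
end

z    = [1.0 0.0;                         # investor 1: inactive, [D0 D] = [1 0]
        0.0 1.0]                         # investor 2: active,   [D0 D] = [0 1]
x_t  = [1.0 0.5 -0.2 0.3]                # constant and three made-up factor values
x0_t = repeat(x_t, size(z, 1), 1)        # same factors for every investor, N x 4

regressors = hdirprod_sketch(z, x0_t)    # N x 8
println(regressors)
```

Each row holds the constant and the factors in the block belonging to the investor's own group and zeros in the other block. This is why theta stacks the four coefficients of the inactive group on top of those of the active group.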

In [6]:
println("\npanel")
(theta,CovDK,CovLS) = HszDkFn(ER,[ones(T) Factors],[D0 D]+0.0)

R       = [1 0 0 0 -1 0 0 0]                #testing if alpha(1) = alpha(2)
a_diff  = R*theta
tstatLS = a_diff/sqrt.(R*CovLS*R')
tstatDK = a_diff/sqrt.(R*CovDK*R')

println("diff of annual alpha (inactive - 51+), tstat (LS), tstat (DK)")
printmat([a_diff*252 tstatLS tstatDK])

println("\nCompare with calendar regressions. Also notice the difference between the t-stats")


panel
diff of annual alpha (inactive - 51+), tstat (LS), tstat (DK)
    -7.004   -24.017    -2.784


Compare with calendar regressions. Also notice the difference between the t-stats
