# Portfolio Sorts

This notebook implements univariate and bivariate portfolio sorts.

## Load Packages and Extra Functions

In [1]:
using Printf, Dates, DelimitedFiles, Statistics

include("jlFiles/printmat.jl")
include("jlFiles/lagFn.jl")                #creating lags of a matrix
include("jlFiles/EMAFn.jl")                #moving average of rows of a matrix

EMAFn

In [2]:
"""
    ReturnStats(Re,Annfactor=252)

Calculate the average excess return, the standard deviation and the Sharpe ratio, and annualise them.
Returns a 3xn matrix with (annualised) [μ;σ;SR], where `n=size(Re,2)`.
"""
function ReturnStats(Re,Annfactor=252)
    μ  = mean(Re,dims=1)*Annfactor
    σ  = std(Re,dims=1)*sqrt(Annfactor)
    SR = μ./σ
    stats = [μ;σ;SR]
    return stats
end

ReturnStats
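
As a quick check of the annualisation, the sketch below applies `ReturnStats` to a tiny made-up return matrix (the numbers in `ReToy` are illustrative only, not taken from the data set). The function is repeated so the block runs on its own. The mean is scaled by 252 and the standard deviation by `sqrt(252)`, so the Sharpe ratio is scaled by `sqrt(252)`.

```julia
using Statistics

# Same function as in the cell above, repeated so this block is self-contained
function ReturnStats(Re,Annfactor=252)
    μ  = mean(Re,dims=1)*Annfactor
    σ  = std(Re,dims=1)*sqrt(Annfactor)
    SR = μ./σ
    return [μ;σ;SR]
end

# Made-up daily excess returns (in %) for two assets over three days
ReToy = [ 0.1  0.2;
         -0.1  0.0;
          0.3  0.1]

StatsToy = ReturnStats(ReToy,252)      #3x2 matrix: rows are μ, σ and SR
println(round.(StatsToy,digits=3))
```

Both toy assets have a daily mean of 0.1, so the first row is 25.2 for both columns. The second asset has the lower standard deviation and therefore the higher Sharpe ratio.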

## Load Data

The data set contains daily data for "dates", the equity market return, the riskfree rate and the returns of the 25 Fama-French portfolios. All returns are in percent.

In [3]:
x   = readdlm("Data/MomentumSR.csv",',')
dN  = Date.(x[:,1],"yyyy-mm-dd")                  #Julia dates 
y   = convert.(Float64,x[:,2:end])

(Rm,Rf,R) = (y[:,1],y[:,2],y[:,3:end])

println("\nThe first few rows of dN, Rm and Rf")
printmat([dN[1:4] Rm[1:4] Rf[1:4]])

println("size of dN, Rm, Rf, R")
println(size(dN),"\n",size(Rm),"\n",size(Rf),"\n",size(R))

(T,n) = size(R);                      #number of periods and assets


The first few rows of dN, Rm and Rf
1979-01-02     0.615     0.035
1979-01-03     1.155     0.035
1979-01-04     0.975     0.035
1979-01-05     0.685     0.035

size of dN, Rm, Rf, R
(9837,)
(9837,)
(9837,)
(9837, 25)


# A Univariate Sort

on recent returns.

In [4]:
X = lagFn(EMAFn(R,22));                     #sort on lag of MA(22) of R

#println("rows 2-4 of X*10:")
#printmat(X[2:4,:]*10,width=5,prec=2,colNames=string.(1:n))
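
The files `lagFn.jl` and `EMAFn.jl` are not shown in this notebook, so the following is only a sketch of what the sorting variable might look like. It assumes that the moving average uses the current and the previous `m-1` rows (with a shorter window at the start of the sample) and that the lag shifts all rows down by one period, filling the first row with `NaN`. The names `maSketch` and `lagSketch` are hypothetical stand-ins, not the functions from `jlFiles/`.

```julia
using Statistics

# Hypothetical stand-in for a moving average of the rows of a matrix (an assumption,
# not the actual EMAFn): average over rows max(1,t-m+1) to t
function maSketch(x::AbstractMatrix,m::Int)
    T = size(x,1)
    y = fill(NaN,size(x))
    for t = 1:T
        lo     = max(1,t-m+1)
        y[t,:] = vec(mean(x[lo:t,:],dims=1))
    end
    return y
end

# Hypothetical stand-in for a one-period lag (an assumption, not the actual lagFn):
# row t of the output is row t-1 of the input, and row 1 is NaN
function lagSketch(x::AbstractMatrix)
    y          = fill(NaN,size(x))
    y[2:end,:] = x[1:end-1,:]
    return y
end

RToy = [1.0 4.0;
        2.0 3.0;
        3.0 2.0;
        4.0 1.0]                       #made-up returns, 4 periods and 2 assets

XToy = lagSketch(maSketch(RToy,2))     #row t only uses returns up to t-1
println(XToy)
```

Under these assumptions, the first row of the sorting variable is `NaN`, which would explain why the loops over periods start at `t = 2`. The lag means that the portfolio formed for period `t` only uses information available at the end of `t-1`.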

In [5]:
(RHi,RLo) = (fill(NaN,T),fill(NaN,T))
for t = 2:T         #loop over periods, save portfolio returns
    #local sort1, wHi, wLo          #only needed in script
    sort1                  = sortperm(X[t,:])  #X is lagged already, sort1[1] is index of worst asset
    (wLo,wHi)              = (zeros(n),zeros(n))
    wLo[sort1[1:5]]       .= 1/5    #equally weighted inside Lo portfolio
    wHi[sort1[end-4:end]] .= 1/5
    RLo[t]                 = wLo'R[t,:]
    RHi[t]                 = wHi'R[t,:]
end

ReLo = RLo[2:end] - Rf[2:end]
ReHi = RHi[2:end] - Rf[2:end];       #cut out t=1 and create excess returns

Calculate the mean (excess) return, its standard deviation and the Sharpe ratio. Annualize by assuming 252 trading days per year. Compare with the excess return on passively holding an equity market index.

In [6]:
Rme = Rm - Rf           #market excess return
Stats = ReturnStats([ReLo ReHi Rme[2:end]],252)

printblue("Stats for the portfolio returns, annualized:")
printmat(Stats,colNames=["Lo" "Hi" "market"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m
          Lo        Hi    market
μ      3.836    14.168     8.374
σ     19.042    17.331    16.837
SR     0.201     0.818     0.497



# A Univariate Sort (again)

on recent returns - but using another approach, which turns out to be easy to apply to the double sort as well.

The `sortLoHi(x,v,m)` function below creates vectors `vL` and `vH` with trues/falses (Bools) indicating membership of the Lo and Hi groups.

The `EWportf(v)` function takes such a Bool vector (e.g. `vL`) and creates (equal) portfolio weights. It handles the case of an empty portfolio (all elements in `v` are false) by setting all portfolio weights to `NaN`.

In [7]:
"""
    sortLoHi(x,v,m)

Create vectors `vL` and `vH` with trues/falses indicating membership of the Lo and Hi
groups. It sorts according to x[v], setting the `m` lowest (in `vL`) and 
highest values (in `vH`) to `true`. All other elements 
(also those in x[.!v]) are set to false.

# Input
- `x::Vector`:    n-vector, sorting variable
- `v::Vector`:    n-vector of true/false. Sorting is done within x[v]
- `m::Int`:       number of assets in Lo/Hi portfolio

# Output
- `vL::Vector`:   n-vector of true/false, indicating membership of Lo portfolio
- `vH::Vector`:   n-vector of true/false, indicating membership of Hi portfolio

"""
function sortLoHi(x,v,m)
    
    xb  = copy(x)
    nv  = sum(v)
    (nv < 2m) && error("sum(v) < 2m")

    (vL,vH) = [falses(length(x)) for i=1:2]
    xb[.!v]               .= Inf   #v[i] = false are put to Inf to sort last
    sort1                  = sortperm(xb)   #lowest are first
    vL[sort1[1:m]]        .= true
    vH[sort1[nv-m+1:nv]]  .= true
    
    return vL, vH
end


"""
    EWportf(v)

Create (equal) portfolio weights from a vector of trues/falses. If all elements are falses,
then the weights are NaNs.

# Examples
- EWportf([true,false,true]) gives [0.5,0.0,0.5]. 
- EWportf([false,false]) gives [NaN,NaN]

"""
function EWportf(v) 
    w = ifelse( all(.!v), fill(NaN,length(v)), v/sum(v) )
    return w
end    

EWportf
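
To see what the two functions return, the sketch below applies them to a made-up sorting vector with six assets (the values in `xToy` are purely illustrative). The functions are repeated in compact form so that the block runs on its own.

```julia
# Compact copies of the functions in the cell above, so this block is self-contained
function sortLoHi(x,v,m)
    xb  = copy(x)
    nv  = sum(v)
    (nv < 2m) && error("sum(v) < 2m")
    (vL,vH) = [falses(length(x)) for i=1:2]
    xb[.!v]              .= Inf           #excluded assets are sorted last
    sort1                 = sortperm(xb)  #lowest are first
    vL[sort1[1:m]]       .= true
    vH[sort1[nv-m+1:nv]] .= true
    return vL, vH
end

EWportf(v) = ifelse( all(.!v), fill(NaN,length(v)), v/sum(v) )

xToy = [0.3,-0.1,0.5,0.2,-0.4,0.0]       #made-up sorting variable, 6 assets

(vL,vH) = sortLoHi(xToy,trues(6),2)      #2 assets in each of Lo and Hi
println("Lo members: ",findall(vL))      #assets 2 and 5 have the lowest values
println("Hi members: ",findall(vH))      #assets 1 and 3 have the highest values
println("Lo weights: ",EWportf(vL))

#restrict the sort to assets 1-4: assets 5 and 6 can then not be picked
(vLb,vHb) = sortLoHi(xToy,[true,true,true,true,false,false],1)
println("Lo and Hi within assets 1-4: ",findall(vLb)," ",findall(vHb))
```

The second call illustrates the role of `v`: asset 5 has the lowest value overall, but it is excluded, so asset 2 becomes the Lo member.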

In [8]:
X = lagFn(EMAFn(R,22));                     #sort on lag of MA(22) of R

m = 5

(RH,RL) = (fill(NaN,T),fill(NaN,T))
for t = 2:T                    #loop over periods, save portfolio returns 
    #local vL,vH,wL,wH         #only needed in script
    (vL,vH) = sortLoHi(X[t,:],trues(n),m)
    (wL,wH) = (EWportf(vL),EWportf(vH))       #portfolio weights, EW
    RL[t]   = wL'R[t,:]
    RH[t]   = wH'R[t,:]
end

ReL = RL[2:end] - Rf[2:end]
ReH = RH[2:end] - Rf[2:end]       #cut out t=1, excess returns

Stats = ReturnStats([ReL ReH Rme[2:end]],252)
printblue("Stats for the portfolio returns, annualized:")
printmat(Stats,colNames=["Lo" "Hi" "market"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m
          Lo        Hi    market
μ      3.836    14.168     8.374
σ     19.042    17.331    16.837
SR     0.201     0.818     0.497



# An Independent Double Sort

on recent volatility (X) and returns (Z).

This creates four portfolios: (Low X, Low Z), (Low X, High Z), (High X, Low Z) and (High X, High Z). Each of them is an intersection from independent sorts on X and Z.

The size of each of these portfolios can vary over time - and sometimes be an empty portfolio. The portfolio weights created by the `EWportf()` function are then `NaN`.
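
A minimal sketch of how an intersection can be empty, using hand-made membership vectors for four assets (the vectors and the returns in `rToy` are made up for illustration). Here the two sort variables rank the assets identically, so no asset is in both Low X and High Z.

```julia
# Compact copy of EWportf from above, so this block is self-contained
EWportf(v) = ifelse( all(.!v), fill(NaN,length(v)), v/sum(v) )

#made-up memberships: assets 1-2 are Lo and assets 3-4 are Hi on both X and Z
vXL = [true,true,false,false]
vZL = [true,true,false,false]
vZH = [false,false,true,true]

vLL = vXL .& vZL                 #assets 1 and 2
vLH = vXL .& vZH                 #no assets: empty portfolio

(wLL,wLH) = (EWportf(vLL),EWportf(vLH))

rToy = [1.0,2.0,3.0,4.0]         #made-up returns for one period
RLLt = wLL'*rToy                 #1.5, the average of assets 1 and 2
RLHt = wLH'*rToy                 #NaN, since all weights are NaN

println("wLL: ",wLL,"  return: ",RLLt)
println("wLH: ",wLH,"  return: ",RLHt)
```

The `NaN` weights propagate into the portfolio return, which flags the periods where a portfolio is empty.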

In [9]:
X = lagFn(EMAFn(abs.(R),22))                #sort on lag of MA(22) of |R|, first sort variable
Z = lagFn(EMAFn(R,22));                     #sort on lag of MA(22) of R, second sort variable

In [10]:
(mX,mZ) = (10,10)

(RLL,RLH,RHL,RHH) = [fill(NaN,T) for i=1:4]
for t = 2:T                    #loop over periods, save portfolio returns
    #local vXL,vXH,vZL,vZH,vLL,vLH,vHL,vHH,wLL,wLH,wHL,wHH       #only needed in script
    (vXL,vXH) = sortLoHi(X[t,:],trues(n),mX)       #in Lo/Hi according to X
    (vZL,vZH) = sortLoHi(Z[t,:],trues(n),mZ)       #in Lo/Hi according to Z
    vLL     = vXL .& vZL                                  #in Lo X, Lo Z
    vLH     = vXL .& vZH                                  #in Lo X, Hi Z
    vHL     = vXH .& vZL                                  #in Hi X, Lo Z
    vHH     = vXH .& vZH                                  #in Hi X, Hi Z
    (wLL,wLH,wHL,wHH) = (EWportf(vLL),EWportf(vLH),EWportf(vHL),EWportf(vHH))  #portfolio weights, EW
    (RLL[t],RLH[t],RHL[t],RHH[t])  = (wLL'R[t,:],wLH'R[t,:],wHL'R[t,:],wHH'R[t,:])
end

There are NaNs in the return series, since the portfolios are sometimes empty. We need to decide on how to handle those data points. 

The assumption below is that an empty portfolio (when all weights are `NaN`) means that the full investment is done in the riskfree asset, so the excess return is zero. We therefore replace every `NaN` excess return with zero.
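
The choice matters for the statistics. As a rough illustration with made-up numbers (`ReToy` is not from the data set), the sketch below compares replacing `NaN` by zero with the alternative of dropping those periods. Dropping the periods is shown only for comparison and is not what this notebook does.

```julia
using Statistics

ReToy = [1.0,NaN,3.0,NaN]                 #made-up excess returns, NaN = empty portfolio

ReZero = replace(ReToy,NaN=>0.0)          #NaN treated as holding the riskfree asset
ReDrop = filter(!isnan,ReToy)             #alternative: drop the empty periods

println("mean with NaN replaced by 0: ",mean(ReZero))
println("mean with NaN periods dropped: ",mean(ReDrop))
```

Replacing by zero keeps all periods in the sample and pulls the average excess return towards zero when the portfolio is often empty.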

We typically want to study `RLH - RLL` and `RHH - RHL` since they show the "effect" of the second sort variable (`Z`) while controlling for the first (`X`).

In [11]:
ReAll = [RLL RLH RHL RHH] .- Rf
replace!(ReAll,NaN=>0)         #replace NaN by 0, assuming investment in riskfree    
Stats = ReturnStats(ReAll,252) 

printblue("Stats for the portfolio returns, annualized:")
colNames = ["LL","LH","HL","HH"]
printmat(Stats,colNames=colNames,rowNames=["μ";"σ";"SR"])

printblue("Study LH-LL and HH-HL to see the momentum effect (controlling for volatility)")
RLH_LL = ReAll[:,2] - ReAll[:,1]         #(L,H) minus (L,L)
RHH_HL = ReAll[:,4] - ReAll[:,3]         #(H,H) minus (H,L)

Stats = ReturnStats([RLH_LL RHH_HL],252)
printmat(Stats,colNames=["RLH-RLL","RHH-RHL"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m
          LL        LH        HL        HH
μ      6.327    12.499     4.708    12.628
σ     14.635    14.335    20.553    19.057
SR     0.432     0.872     0.229     0.663

[34m[1mStudy LH-LL and HH-HL to see the momentum effect (controlling for volatility)[22m[39m
     RLH-RLL   RHH-RHL
μ      6.172     7.920
σ      7.880    11.196
SR     0.783     0.707



# A Dependent Double Sort

on recent volatility (X) and returns (Z).

This also creates four portfolios: (Low X, Low Z), (Low X, High Z), (High X, Low Z) and (High X, High Z). In this case, the Low X group is split up into (Low X, Low Z) and (Low X, High Z), and similarly for the high X group.

The size of each of these portfolios is constant over time (unless there are missing values).
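
The nesting can be illustrated with eight made-up assets (the values in `XToy` and `ZToy` are hypothetical). The first sort splits the assets into Low X and High X groups of four, and the second sort is then done on Z within each group by passing the group membership as the `v` argument. `sortLoHi` is repeated in compact form so that the block runs on its own.

```julia
# Compact copy of sortLoHi from above, so this block is self-contained
function sortLoHi(x,v,m)
    xb  = copy(x)
    nv  = sum(v)
    (nv < 2m) && error("sum(v) < 2m")
    (vL,vH) = [falses(length(x)) for i=1:2]
    xb[.!v]              .= Inf           #excluded assets are sorted last
    sort1                 = sortperm(xb)  #lowest are first
    vL[sort1[1:m]]       .= true
    vH[sort1[nv-m+1:nv]] .= true
    return vL, vH
end

XToy = [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0]           #made-up first sort variable
ZToy = [0.5,-0.2,0.1,0.9,-0.7,0.3,0.0,0.4]         #made-up second sort variable

(vXL,vXH) = sortLoHi(XToy,trues(8),4)      #assets 1-4 are Lo X, assets 5-8 are Hi X
(vLL,vLH) = sortLoHi(ZToy,vXL,2)           #within Lo X: sort on Z
(vHL,vHH) = sortLoHi(ZToy,vXH,2)           #within Hi X: sort on Z

println("LL: ",findall(vLL),"  LH: ",findall(vLH))
println("HL: ",findall(vHL),"  HH: ",findall(vHH))
println("sizes: ",sum.([vLL,vLH,vHL,vHH]))
```

Each of the four portfolios has exactly `m = 2` members here, in contrast to the independent sort where the sizes depend on how the two rankings overlap.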

In [12]:
(mX,mZ) = (10,5)

(RLL,RLH,RHL,RHH) = [fill(NaN,T) for i=1:4]
for t = 2:T                    #loop over periods, save portfolio returns
    #local vXL,vXH,vLL,vLH,vHL,vHH,wLL,wLH,wHL,wHH       #only needed in script
    (vXL,vXH) = sortLoHi(X[t,:],trues(n),mX)       #Lo/Hi according to X
    (vLL,vLH) = sortLoHi(Z[t,:],vXL,mZ)            #within Lo X, Lo/Hi according to Z
    (vHL,vHH) = sortLoHi(Z[t,:],vXH,mZ)            #within Hi X, Lo/Hi according to Z    
    (wLL,wLH,wHL,wHH) = (EWportf(vLL),EWportf(vLH),EWportf(vHL),EWportf(vHH))  #portfolio weights, EW
    (RLL[t],RLH[t],RHL[t],RHH[t])  = (wLL'R[t,:],wLH'R[t,:],wHL'R[t,:],wHH'R[t,:])
end

In [13]:
ReAll = [RLL RLH RHL RHH] .- Rf
replace!(ReAll,NaN=>0)         #replace NaN by 0, assuming investment in riskfree    
Stats = ReturnStats(ReAll,252) 

printblue("Stats for the portfolio returns, annualized:")
colNames = ["LL","LH","HL","HH"]
printmat(Stats,colNames=colNames,rowNames=["μ";"σ";"SR"])

printblue("Study LH-LL and HH-HL to see the momentum effect (controlling for volatility)")
RLH_LL = ReAll[:,2] - ReAll[:,1]         #(L,H) minus (L,L)
RHH_HL = ReAll[:,4] - ReAll[:,3]         #(H,H) minus (H,L)

Stats = ReturnStats([RLH_LL RHH_HL],252)
printmat(Stats,colNames=["RLH-RLL","RHH-RHL"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m
          LL        LH        HL        HH
μ      7.433    12.768     5.357    13.211
σ     15.000    14.456    20.809    19.769
SR     0.495     0.883     0.257     0.668

[34m[1mStudy LH-LL and HH-HL to see the momentum effect (controlling for volatility)[22m[39m
     RLH-RLL   RHH-RHL
μ      5.335     7.854
σ      4.563     7.319
SR     1.169     1.073

