# Portfolio Sorts

This notebook implements univariate and bivariate portfolio sorts.

## Load Packages and Extra Functions

The key functions used in this notebook are from the (local) `FinEcmt_OLS` module.

In [1]:
MyModulePath = joinpath(pwd(),"src")
!in(MyModulePath,LOAD_PATH) && push!(LOAD_PATH,MyModulePath)
using FinEcmt_OLS
using FinEcmt_TimeSeries: EMA

In [2]:
#=
include(joinpath(pwd(),"src","FinEcmt_OLS.jl"))
include(joinpath(pwd(),"src","FinEcmt_TimeSeries.jl"))
using .FinEcmt_OLS
using .FinEcmt_TimeSeries: EMA
=#

In [3]:
using Dates, DelimitedFiles, Statistics

## Load Data

The data set contains daily data: dates, the equity market return, the riskfree rate and the returns of the 25 Fama-French portfolios. All returns are in percent.

In [4]:
x   = readdlm("Data/MomentumSR.csv",',')
dN  = Date.(x[:,1],"yyyy-mm-dd")                  #Julia dates
y   = convert.(Float64,x[:,2:end])

(Rm,Rf,R) = (y[:,1],y[:,2],y[:,3:end])

println("\nThe first few rows of dN, Rm and Rf")
printmat([dN[1:4] Rm[1:4] Rf[1:4]])

println("size of dN, Rm, Rf, R")
println(size(dN),"\n",size(Rm),"\n",size(Rf),"\n",size(R))

(T,n) = size(R);                      #number of periods and assets


The first few rows of dN, Rm and Rf
1979-01-02     0.615     0.035
1979-01-03     1.155     0.035
1979-01-04     0.975     0.035
1979-01-05     0.685     0.035

size of dN, Rm, Rf, R
(9837,)
(9837,)
(9837,)
(9837, 25)


# A Univariate Sort

on recent returns. 

The next cells create a sorting variable `X` and do a portfolio sort to create a `Lo` portfolio and a `Hi` portfolio (corresponding to the 5 lowest and 5 highest values of `X`). The sort is done for each trading day of the sample, so the portfolios are dynamic (they have time-varying portfolio weights).

### A Remark on the Code

- The `EMA(R,22)` function (included above) calculates a moving average of `R[t-21:t,i]` for each column `i`. We then lag this result using `lag()`.
- With `x = [9,7,8]`, the `rankPs(x)` function (included above) gives the output `[3,1,2]`. This says, for instance, that `7` is the lowest number (rank 1).
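
The two remarks above can be illustrated with a minimal sketch. The helpers `moving_avg_sketch`, `lag_sketch` and `rank_sketch` are hypothetical stand-ins written for this illustration; they are not the `EMA`, `lag` and `rankPs` functions from the modules, whose details may differ. In particular, the sketch assumes an equal-weight average that uses a shorter window for the first rows and fills lagged rows with `NaN`.

```julia
using Statistics

# Hypothetical helper: equal-weight moving average of rows max(1,t-q+1):t, column by column.
# Assumption: early rows use the (shorter) window that is available.
function moving_avg_sketch(R::AbstractMatrix, q::Int)
    T = size(R, 1)
    Y = similar(R, Float64)
    for t in 1:T
        Y[t, :] = vec(mean(R[max(1, t-q+1):t, :], dims=1))
    end
    return Y
end

# Hypothetical helper: shift all rows down by k periods, padding the top with NaN.
function lag_sketch(X::AbstractMatrix, k::Int=1)
    Y = fill(NaN, size(X))
    Y[k+1:end, :] = X[1:end-k, :]
    return Y
end

# Hypothetical helper: rank of each element (1 = lowest), assuming no ties.
rank_sketch(x) = invperm(sortperm(x))

Rdemo = [1.0 2.0; 3.0 4.0; 5.0 6.0]
Xdemo = lag_sketch(moving_avg_sketch(Rdemo, 2))   # row t only uses data up to t-1
println(rank_sketch([9, 7, 8]))                   # [3, 1, 2]
```

Lagging the sorting variable means that the portfolio formed for day `t` only uses information available at the end of day `t-1`.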

In [5]:
m = 5                       #number of assets in each of Lo and Hi portfolio
q = 22
X = lag(EMA(R,q));      #(lag of) moving average of R

#println("rows 2-4 of X*10:")
#printmat(X[2:4,:]*10,width=5,prec=2,colNames=string.(1:n))

In [6]:
(RHi,RLo) = (fill(NaN,T),fill(NaN,T))
for t = 2:T         #loop over periods, save portfolio returns
    #local r, wHi, wLo          #local/global is needed in script
    r                 = rankPs(X[t,:])  #X is lagged already, r[i] is the rank of asset i (1 = lowest X)
    (wLo,wHi)         = (zeros(n),zeros(n))
    wLo[r.<=m]       .= 1/m        #low rank: in Lo portfolio
    wHi[(n-m+1).<=r] .= 1/m        #high rank: in Hi portfolio
    RLo[t]            = wLo'R[t,:]
    RHi[t]            = wHi'R[t,:]
end

Rp = mean(R,dims=2);         #'passive' portfolio, equal weight on all assets

## Return Statistics

...comparing with a passive portfolio. 

Excess returns for the `Lo`, `Hi` and passive (`p`) portfolios are calculated.

We then use the `ReturnStats(Re,Annfactor)` function (included above) to calculate the mean (excess) return, its standard deviation and the Sharpe ratio. Annualisation is done by assuming 252 trading days per year.

In calculating the return stats, we drop the first `q` observations (and thus extract observations `q+1:end`), since we use `q` observations to form the first dynamic portfolio.
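
As a rough illustration of the annualisation, the sketch below scales the mean by the annualisation factor and the standard deviation by its square root, and takes their ratio as the Sharpe ratio. This is an assumption about what `ReturnStats` does (the function name `return_stats_sketch` is hypothetical), although it agrees with the printed results, where SR is approximately μ/σ.

```julia
using Statistics

# Hypothetical stand-in for ReturnStats(Re,Annfactor): rows are μ, σ and SR, one column per portfolio.
# Assumption: μ is scaled by annFactor and σ by sqrt(annFactor).
function return_stats_sketch(Re::AbstractMatrix, annFactor=252)
    μ  = mean(Re, dims=1) * annFactor          # annualised mean excess return
    σ  = std(Re, dims=1) * sqrt(annFactor)     # annualised standard deviation
    SR = μ ./ σ                                # Sharpe ratio
    return [μ; σ; SR]
end

ReDemo = [0.0 1.0; 2.0 3.0]                    # two periods, two portfolios (made-up numbers)
println(return_stats_sketch(ReDemo, 4))
```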

In [7]:
ReLo = RLo - Rf         #create excess returns
ReHi = RHi - Rf
Rep  = Rp - Rf          #excess return of passive portfolio

Stats = ReturnStats([ReLo ReHi Rep][q+1:end,:],252)    #return stats for obs q+1 to T

printblue("Stats for the portfolio returns, annualized:\n")
printmat(Stats;colNames=["Lo" "Hi" "passive"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m

μ      3.745    14.022     9.626
σ     19.056    17.340    16.928
SR     0.197     0.809     0.569



# A Univariate Sort (again)

on recent returns - but using another approach, which turns out to be easy to apply to the double sort as well.

### A Remark on the Code

- The `sortLoHi(x,v,m)` function (included above) creates n-vectors `vL` and `vH` with trues/falses indicating membership of the Lo and Hi groups.
- As an example, `(vL,vH) = sortLoHi([3,1,2],[false,true,true],1)` gives `vL=[false,true,false]` and `vH=[false,false,true]`. This means that the 'Lo' portfolio consists of asset 2 and the 'Hi' portfolio of asset 3. Asset 1 is not assigned to any.

- The `EWportf(v)` function (also from `FinEcmt_OLS`) takes such a Bool vector (e.g. `vL`) and calculates (equal) portfolio weights among those assets that are `true` in `v`. It handles the case of an empty portfolio (all elements in `v` are false) by setting all portfolio weights to `NaN`.
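
The behaviour described in these remarks can be sketched as follows. The functions `sort_lo_hi_sketch` and `ew_portf_sketch` are hypothetical re-implementations written only for illustration; the actual `sortLoHi` and `EWportf` in `FinEcmt_OLS` may be coded differently. In particular, how the sketch handles `m` exceeding the number of candidates is an assumption.

```julia
# Hypothetical stand-in for sortLoHi(x,v,m): sort within x[v], flag the m lowest and m highest.
function sort_lo_hi_sketch(x, v, m)
    n        = length(x)
    (vL, vH) = (falses(n), falses(n))
    idx      = findall(v)                 # positions of the candidates
    ord      = idx[sortperm(x[idx])]      # candidates ordered from lowest to highest x
    k        = min(m, length(ord))        # assumption: never pick more than the candidates available
    vL[ord[1:k]]         .= true
    vH[ord[end-k+1:end]] .= true
    return vL, vH
end

# Hypothetical stand-in for EWportf(v): equal weights on the true elements, NaN if none are true.
function ew_portf_sketch(v)
    k = count(v)
    return k == 0 ? fill(NaN, length(v)) : v ./ k
end

(vL, vH) = sort_lo_hi_sketch([3, 1, 2], [false, true, true], 1)
println(vL, " ", vH)                      # asset 2 in Lo, asset 3 in Hi
println(ew_portf_sketch([true, false, true]))
```

Working with Bool vectors rather than ranks makes it straightforward to combine sorts, since group membership can be intersected or passed on as the candidate set `v` of a second sort.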

In [8]:
@doc2 sortLoHi

```
sortLoHi(x,v,m)
```

Create vectors `vL` and `vH` with trues/falses indicating membership of the Lo and Hi groups. It sorts according to `x[v]`, setting the m lowest (in `vL`) and m highest values (in `vH`) to `true`. All other elements  (also those in `x[.!v]`) are set to false.

### Input

  * `x::Vector`:    n-vector, sorting variable
  * `v::Vector`:    n-vector of true/false. Sorting is done within x[v]
  * `m::Int`:       number of assets in Lo/Hi portfolio

### Output

  * `vL::Vector`:   n-vector of true/false, indicating membership of Lo portfolio
  * `vH::Vector`:   n-vector of true/false, indicating membership of Hi portfolio


In [9]:
#using CodeTracking
#println(@code_string sortLoHi([1],trues(1),1))    #print the source code

In [10]:
X = lag(EMA(R,q))                      #sort on lag of MA(q) of R

m = 5                                      #m assets in each of Lo and Hi

(RH,RL) = (fill(NaN,T),fill(NaN,T))
for t = 2:T                    #loop over periods, save portfolio returns
    #local vL,vH,wL,wH         #local/global is needed in script
    (vL,vH) = sortLoHi(X[t,:],trues(n),m)
    (wL,wH) = (EWportf(vL),EWportf(vH))       #portfolio weights, EW
    RL[t]   = wL'R[t,:]
    RH[t]   = wH'R[t,:]
end

ReL = RL - Rf
ReH = RH - Rf       #excess returns

Statm = ReturnStats([ReL ReH Rep][q+1:end,:],252)
printblue("Stats for the portfolio returns, annualized:\n")
printmat(Statm,colNames=["Lo" "Hi" "passive"],rowNames=["μ";"σ";"SR"])    #print the new results (Statm)

printred("Compare with the previous results to verify that they are the same")

[34m[1mStats for the portfolio returns, annualized:[22m[39m

μ      3.745    14.022     9.626
σ     19.056    17.340    16.928
SR     0.197     0.809     0.569

[31m[1mCompare with the previous results to verify that they are the same[22m[39m


# An Independent Double Sort

on recent volatility ($X$) and returns ($Z$).

This creates four portfolios: (Low $X$, Low $Z$), (Low $X$, High $Z$), (High $X$, Low $Z$) and (High $X$, High $Z$). Each of them is an intersection from independent sorts on $X$ and $Z.$

The size of each of these portfolios can vary over time: sometimes the portfolio is empty. The portfolio weights created by the `EWportf()` function are then `NaN`.


### A Remark on the Code

- `vXL .& vZL` tests if `vXL[i]` and `vZL[i]` are both true (and then repeats for each `i`).
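
A small made-up example of the intersection step is given below, using four assets and hand-written membership vectors (not output from `sortLoHi`). It is only a sketch, chosen so that the two sorts agree perfectly, which leaves the off-diagonal portfolios empty.

```julia
# Made-up membership vectors for 4 assets; here the X and Z sorts happen to coincide
vXL = [true, true, false, false]     # Lo according to X
vXH = [false, false, true, true]     # Hi according to X
vZL = [true, true, false, false]     # Lo according to Z
vZH = [false, false, true, true]     # Hi according to Z

vLL = vXL .& vZL                     # assets 1 and 2
vLH = vXL .& vZH                     # empty: no asset is both Lo X and Hi Z
vHL = vXH .& vZL                     # empty
vHH = vXH .& vZH                     # assets 3 and 4

println("members in (LL,LH,HL,HH): ", count.((vLL, vLH, vHL, vHH)))
```

When the two sorting variables are strongly correlated, empty (or very small) intersections like `vLH` and `vHL` become more likely.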

In [11]:
X = lag(EMA(abs.(R),q))                #(lag of) MA of |R|, first sort variable
Z = lag(EMA(R,q));                     #(lag of) MA of R, second sort variable

In [12]:
(mX,mZ) = (10,10)

(RLL,RLH,RHL,RHH) = [fill(NaN,T) for i=1:4]
for t = 2:T                    #loop over periods, save portfolio returns
    #local vXL,vXH,vZL,vZH,vLL,vLH,vHL,vHH,wLL,wLH,wHL,wHH       #local/global is needed in script
    (vXL,vXH) = sortLoHi(X[t,:],trues(n),mX)       #in Lo/Hi according to X
    (vZL,vZH) = sortLoHi(Z[t,:],trues(n),mZ)       #in Lo/Hi according to Z
    vLL     = vXL .& vZL                                  #in Lo X, Lo Z
    vLH     = vXL .& vZH                                  #in Lo X, Hi Z
    vHL     = vXH .& vZL                                  #in Hi X, Lo Z
    vHH     = vXH .& vZH                                  #in Hi X, Hi Z
    (wLL,wLH,wHL,wHH) = (EWportf(vLL),EWportf(vLH),EWportf(vHL),EWportf(vHH))  #portfolio weights, EW
    (RLL[t],RLH[t],RHL[t],RHH[t])  = (wLL'R[t,:],wLH'R[t,:],wHL'R[t,:],wHH'R[t,:])
end

## Handling NaNs and Reporting Results

There are NaNs in the return series, since the portfolios are sometimes empty. We need to decide on how to handle those data points. 

The assumption below is that an empty portfolio (when all weights are `NaN`) means that the full investment is done in the riskfree asset, so the excess return is zero. We thus replace each `NaN` excess return with zero.

We typically want to study `RLH - RLL` and `RHH - RHL` since they show the "effect" of the second sort variable (`Z`) while controlling for the first (`X`).
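
The treatment of empty portfolios can be sketched with a few made-up numbers (three periods, two portfolios). The values below are purely illustrative and are not taken from the data set.

```julia
Rf  = [0.03, 0.03, 0.03]            # made-up riskfree rate
RLL = [0.50, NaN, -0.20]            # NaN: the (L,L) portfolio is empty in period 2
RLH = [0.80, 0.40, NaN]             # NaN: the (L,H) portfolio is empty in period 3

ReAll = [RLL RLH] .- Rf             # excess returns, NaN stays NaN
replace!(ReAll, NaN => 0)           # empty portfolio: invested in riskfree, excess return 0

RLH_LL = ReAll[:,2] - ReAll[:,1]    # (L,H) minus (L,L)
println(ReAll)
println(RLH_LL)
```

Note that the replacement is done on the excess returns, not on the raw returns: setting a raw return to zero would instead imply an excess return of `-Rf`.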

In [13]:
ReAll = [RLL RLH RHL RHH] .- Rf  #excess returns
replace!(ReAll,NaN=>0)           #replace NaN by 0, assuming investment in riskfree
Stats = ReturnStats(ReAll[q+1:end,:],252)

printblue("Stats for the portfolio returns, annualized:\n")
colNames = ["LL","LH","HL","HH"]
printmat(Stats,colNames=colNames,rowNames=["μ";"σ";"SR"])

printblue("Study LH-LL and HH-HL to see the momentum effect (controlling for volatility):\n")
RLH_LL = ReAll[:,2] - ReAll[:,1]         #(L,H) minus (L,L)
RHH_HL = ReAll[:,4] - ReAll[:,3]         #(H,H) minus (H,L)

Stats = ReturnStats([RLH_LL RHH_HL][q+1:end,:],252)
printmat(Stats,colNames=["LH-LL","HH-HL"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m

          LL        LH        HL        HH
μ      6.207    12.508     4.731    12.483
σ     14.644    14.350    20.575    19.069
SR     0.424     0.872     0.230     0.655

[34m[1mStudy LH-LL and HH-HL to see the momentum effect (controlling for volatility):[22m[39m

       LH-LL     HH-HL
μ      6.301     7.752
σ      7.878    11.195
SR     0.800     0.692



# A Dependent Double Sort

on recent volatility ($X$) and returns ($Z$).

This also creates four portfolios: (Low $X$, Low $Z$), (Low $X$, High $Z$), (High $X$, Low $Z$) and (High $X$, High $Z$). However, in this case, the Low $X$ group is split up into: (Low $X$, Low $Z$) and (Low $X$, High $Z$), and similarly for the high $X$ group.

The size of each of these portfolios is constant over time (unless there are missing values).

### A Remark on the Code

- The `vXL` from `(vXL,vXH) = sortLoHi(X[t,:],trues(n),mX)` is an n-vector where element $i$ is `true` when asset $i$ belongs to 'low according to `X[t,:]`'.

- `sortLoHi(Z[t,:],vXL,mZ)` sorts `Z[t,vXL]` into low/high. The other elements in `Z[t,:]` will not belong to either.
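
The conditional sort can be sketched on four made-up assets. The helper `sort_lo_hi_sketch` is a hypothetical stand-in for `sortLoHi` (the real function may be implemented differently), and the values of `Xt` and `Zt` are invented for the illustration.

```julia
# Hypothetical stand-in for sortLoHi(x,v,m): sort within x[v], flag the m lowest and m highest.
function sort_lo_hi_sketch(x, v, m)
    (vL, vH) = (falses(length(x)), falses(length(x)))
    idx = findall(v)
    ord = idx[sortperm(x[idx])]          # candidates ordered from lowest to highest x
    k   = min(m, length(ord))
    vL[ord[1:k]]         .= true
    vH[ord[end-k+1:end]] .= true
    return vL, vH
end

Xt = [1.0, 2.0, 3.0, 4.0]                # made-up first sort variable (one period)
Zt = [0.4, 0.1, 0.3, 0.2]                # made-up second sort variable
(mX, mZ) = (2, 1)

(vXL, vXH) = sort_lo_hi_sketch(Xt, trues(4), mX)   # Lo X: assets 1,2; Hi X: assets 3,4
(vLL, vLH) = sort_lo_hi_sketch(Zt, vXL, mZ)        # within Lo X, sort on Z
(vHL, vHH) = sort_lo_hi_sketch(Zt, vXH, mZ)        # within Hi X, sort on Z

println("LL: ", findall(vLL), "  LH: ", findall(vLH), "  HL: ", findall(vHL), "  HH: ", findall(vHH))
```

Since the second sort is done within each `X` group, every portfolio has `mZ` members here, in contrast to the independent sort where the intersections can be empty.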

In [14]:
(mX,mZ) = (10,5)

(RLL,RLH,RHL,RHH) = [fill(NaN,T) for i=1:4]
for t = 2:T                    #loop over periods, save portfolio returns
    #local vXL,vXH,vLL,vLH,vHL,vHH,wLL,wLH,wHL,wHH       #local/global is needed in script
    (vXL,vXH) = sortLoHi(X[t,:],trues(n),mX)       #Lo/Hi according to X
    (vLL,vLH) = sortLoHi(Z[t,:],vXL,mZ)            #within Lo X, Lo/Hi according to Z
    (vHL,vHH) = sortLoHi(Z[t,:],vXH,mZ)            #within Hi X, Lo/Hi according to Z
    (wLL,wLH,wHL,wHH) = (EWportf(vLL),EWportf(vLH),EWportf(vHL),EWportf(vHH))  #portfolio weights, EW
    (RLL[t],RLH[t],RHL[t],RHH[t])  = (wLL'R[t,:],wLH'R[t,:],wHL'R[t,:],wHH'R[t,:])
end

In [15]:
ReAll = [RLL RLH RHL RHH] .- Rf
replace!(ReAll,NaN=>0)         #replace NaN by 0, assuming investment in riskfree
Stats = ReturnStats(ReAll[q+1:end,:],252)

printblue("Stats for the portfolio returns, annualized:\n")
colNames = ["LL","LH","HL","HH"]
printmat(Stats,colNames=colNames,rowNames=["μ";"σ";"SR"])

printblue("Study LH-LL and HH-HL to see the momentum effect (controlling for volatility):\n")
RLH_LL = ReAll[:,2] - ReAll[:,1]         #(L,H) minus (L,L)
RHH_HL = ReAll[:,4] - ReAll[:,3]         #(H,H) minus (H,L)

Stats = ReturnStats([RLH_LL RHH_HL][q+1:end,:],252)
printmat(Stats,colNames=["LH-LL","HH-HL"],rowNames=["μ";"σ";"SR"])

[34m[1mStats for the portfolio returns, annualized:[22m[39m

          LL        LH        HL        HH
μ      7.324    12.648     5.212    13.063
σ     15.010    14.466    20.824    19.783
SR     0.488     0.874     0.250     0.660

[34m[1mStudy LH-LL and HH-HL to see the momentum effect (controlling for volatility):[22m[39m

       LH-LL     HH-HL
μ      5.324     7.852
σ      4.565     7.326
SR     1.166     1.072

