# Robust linear programming in portfolio optimization using the NAG Library

# Correct Rendering of this notebook

This notebook makes use of the `latex_envs` Jupyter extension for equations and references. If the LaTeX is not rendering properly in your local installation of Jupyter, it may be because you have not installed this extension. Details are available at https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/latex_envs/README.html

## Introduction

We consider a classic portfolio problem with loss risk constraints:
\vspace{0.1cm}
\begin{equation}\label{prob}
\begin{array}{ll}
\underset{x\in\Re^n}{\mbox{maximize}} & \bar{r}^Tx\\[0.6ex]
\mbox{subject to} & \mbox{Prob}(r^Tx\leq\alpha)\leq\beta,\\[0.6ex]
     & \sum_{i=1}^n x_i=1,\\[0.6ex]
     & x\geq0.
\end{array}
\end{equation}
\vspace{0.1cm}
Here we suppose the return of assets $r\in\Re^n$ is Gaussian with mean $\bar{r}\in\Re^n$ and covariance $V\in\Re^{n\times n}$, $\alpha$ is a given unwanted return level (e.g. an excessive loss) and $\beta$ is a given maximum probability.

We will demonstrate that problem (\ref{prob}) can be transformed into a second-order cone programming (SOCP) problem of the following form, which can be solved using the SOCP solver introduced at Mark $27$ of the NAG Library:
\vspace{0.1cm}
\begin{equation}\label{SOCP}
\begin{array}{ll}
\underset{x\in\Re^n}{\mbox{minimize}} & c^Tx\\[0.6ex]
\mbox{subject to} & l_A\leq Ax\leq u_A,\\[0.6ex]
     & l_x\leq x\leq u_x,\\[0.6ex]
     & x\in{\cal K},
\end{array}
\end{equation}
\vspace{0.1cm}
where $A\in\Re^{m\times n}$, $l_A, u_A\in\Re^m$, $c, l_x, u_x\in\Re^n$ are the problem data, and $\cal K={\cal K}^{n_1}\times\cdots\times{\cal K}^{n_r}\times\Re^{n_l}$ where ${\cal K}^{n_i}$ is either a quadratic cone or a rotated quadratic cone defined as follows:
\begin{itemize}
\item Quadratic cone:
\begin{equation}\label{SOC}
{\cal K}_q^{n_i}:=\left\lbrace x=\left(x_1,\ldots,x_{n_i}\right)\in\Re^{n_i}~:~x_1^2\geq\sum_{j=2}^{n_i} x_j^2, ~x_1\geq0\right\rbrace.
\end{equation}
\item Rotated quadratic cone:
\begin{equation}\label{RSOC}
{\cal K}_r^{n_i}:=\left\lbrace x=\left(x_1,x_2,\ldots,x_{n_i}\right)\in\Re^{n_i}~:~2x_1x_2\geq\sum_{j=3}^{n_i} x_j^2,\quad x_1\geq0,\quad x_2\geq0\right\rbrace.
\end{equation}
\end{itemize}
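
The two cone definitions above can be checked numerically on small vectors. The following is a minimal sketch in plain Python; the helper names `in_quadratic_cone` and `in_rotated_quadratic_cone` and the tolerance `tol` are illustrative assumptions and are not part of the NAG Library.

```python
def in_quadratic_cone(x, tol=1e-12):
    """Check x_1^2 >= x_2^2 + ... + x_n^2 and x_1 >= 0 (up to tol)."""
    head, tail = x[0], x[1:]
    return head >= 0 and head * head + tol >= sum(v * v for v in tail)


def in_rotated_quadratic_cone(x, tol=1e-12):
    """Check 2*x_1*x_2 >= x_3^2 + ... + x_n^2, x_1 >= 0 and x_2 >= 0 (up to tol)."""
    return (x[0] >= 0 and x[1] >= 0
            and 2 * x[0] * x[1] + tol >= sum(v * v for v in x[2:]))


# (5, 3, 4) lies on the boundary of the quadratic cone: 25 = 9 + 16
print(in_quadratic_cone([5.0, 3.0, 4.0]))
# (1, 1, 1) is outside: 1 < 1 + 1
print(in_quadratic_cone([1.0, 1.0, 1.0]))
# (1, 2, 2) lies on the boundary of the rotated cone: 2*1*2 = 4
print(in_rotated_quadratic_cone([1.0, 2.0, 2.0]))
# (1, 1, 2) is outside: 2*1*1 < 4
print(in_rotated_quadratic_cone([1.0, 1.0, 2.0]))
```

The small tolerance only guards against rounding on boundary points; an interior-point solver will typically return points that satisfy the cone constraints up to its own tolerance.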

## A closer look at the probability constraint

Let $u=r^Tx$, with mean $\bar{u}=\bar{r}^Tx$ and variance $\sigma=x^TVx$. The probability constraint in problem (\ref{prob}) can then be written as
\vspace{0.1cm}
$$
\mbox{Prob}\left(\frac{u-\bar{u}}{\sqrt{\sigma}}\leq\frac{\alpha-\bar{u}}{\sqrt{\sigma}}\right)\leq\beta.
$$
\vspace{0.1cm}
Since $(u-\bar{u})/\sqrt{\sigma}$ is a standard Gaussian random variable, the probability above is simply $\Phi((\alpha-\bar{u})/\sqrt{\sigma})$, where
\vspace{0.1cm}
$$
\Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^z e^{-t^2/2}dt
$$
\vspace{0.1cm}
is the CDF of the standard Gaussian random variable. Since $\Phi$ is increasing, the probability constraint in problem (\ref{prob}) can be written as
\vspace{0.1cm}
$$
\frac{\alpha-\bar{u}}{\sqrt{\sigma}}\leq\Phi^{-1}(\beta)
$$
\vspace{0.1cm}
or, equivalently,
\vspace{0.1cm}
\begin{equation}\label{label_1}
\bar{u} + \Phi^{-1}(\beta)\sqrt{\sigma}\geq\alpha.
\end{equation}
\vspace{0.1cm}
Substituting $\bar{u}=\bar{r}^Tx$ and $\sigma=x^TVx=\|Fx\|^2$, where $V=F^TF$ is a factorization of $V$, (\ref{label_1}) is equivalent to
\vspace{0.1cm}
\begin{equation}\label{label_2}
\bar{r}^Tx + \Phi^{-1}(\beta)\|Fx\|\geq\alpha.
\end{equation}
\vspace{0.1cm}
Depending on the value of $\beta$, the feasible set of (\ref{label_2}) may or may not be convex. By setting $\beta\leq0.5$ (which is reasonable for risk control), we have $\Phi^{-1}(\beta)\leq0$, so (\ref{label_2}) is convex; for $\beta<0.5$ it can be written as
\vspace{0.1cm}
\begin{equation}\label{label_3}
\bar{r}^Tx + \Phi^{-1}(\beta)t=\alpha ~ \mbox{and} ~ \|Fx\|\leq t.
\end{equation}
\vspace{0.1cm}
Note that, by letting $y=Fx$, the constraint $\|Fx\|=\|y\|\leq t$ in (\ref{label_3}) fits exactly the quadratic cone constraint (\ref{SOC}) and can therefore be handled by second-order cone programming. The final SOCP model equivalent to problem (\ref{prob}) is
\vspace{0.1cm}
\begin{equation}\label{final}
\begin{array}{ll}
\underset{x\in\Re^n}{\mbox{maximize}} & \bar{r}^Tx\\[0.6ex]
\mbox{subject to} & \sum_{i=1}^n x_i=1,\\[0.6ex] 
     & Fx = y,\\[0.6ex]
     & \bar{r}^Tx + \Phi^{-1}(\beta)t=\alpha, \\[0.6ex]
     & (t;y)\in{\cal K}^{n+1}_q,\\[0.6ex]
     & x\geq0.
\end{array}
\end{equation}
\vspace{0.1cm}
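
The equivalence between the probability constraint and the inequality $\bar{u} + \Phi^{-1}(\beta)\sqrt{\sigma}\geq\alpha$ can be illustrated numerically. The sketch below uses only the Python standard library (`statistics.NormalDist`); the function names and the trial values of $\bar{u}$ and $\sigma$ are made up for illustration, while $\alpha$ and $\beta$ match the values used later in this notebook.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard Gaussian: mean 0, standard deviation 1


def loss_probability(u_bar, sigma, alpha):
    """Prob(u <= alpha) for u ~ N(u_bar, sigma), where sigma is the variance."""
    return std_normal.cdf((alpha - u_bar) / sqrt(sigma))


def satisfies_cone_form(u_bar, sigma, alpha, beta):
    """Check u_bar + Phi^{-1}(beta) * sqrt(sigma) >= alpha."""
    return u_bar + std_normal.inv_cdf(beta) * sqrt(sigma) >= alpha


alpha, beta = 0.0001, 0.05

# Hypothetical (mean, variance) pairs of the portfolio return
for u_bar, sigma in [(0.08, 0.002), (0.08, 0.01), (0.03, 0.0002)]:
    prob_ok = loss_probability(u_bar, sigma, alpha) <= beta
    cone_ok = satisfies_cone_form(u_bar, sigma, alpha, beta)
    # Both tests should agree for every pair
    print(u_bar, sigma, prob_ok, cone_ok)
```

For each pair the two checks agree: a higher variance at the same mean eventually violates both forms of the constraint.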

## Using the NAG Library

We demonstrate how to use the NAG SOCP solver to model and solve problem (\ref{prob}) by solving the equivalent SOCP problem (\ref{final}).

In [1]:
# Import utility libraries and the NAG Library
import numpy as np
import math as mt
from scipy.stats import norm
from naginterfaces.library import opt
from naginterfaces.library import lapackeig

As an example, we set $\alpha=0.0001$, $\beta=0.05$ and randomly generate $\bar{r}$ and $V$ for $8$ assets as follows.

In [2]:
# Set alpha and beta
alpha = 0.0001
beta = 0.05

# Fix random seed
np.random.seed(9)

# Number of assets
n_assets = 8

# Vector of expected returns
r = np.ones(n_assets)*.02 + np.random.rand(n_assets)*.15

# Covariance matrix
V = np.matrix(np.random.randn(n_assets, n_assets))
V = V.T * V
V = V / np.max(np.abs(np.diag(V))) * .2

Now factorize $V=F^TF$ using eigenvalue decomposition from the NAG Library.

In [3]:
# Note one could use sparse factorization if V is input as sparse matrix
U, lamda = lapackeig.dsyevd('V', 'L', V)

# Find positive eigenvalues and corresponding eigenvectors
i = 0
k = 0
F = []

while i<len(lamda):
    if lamda[i] > 0:
        F = np.append(F, mt.sqrt(lamda[i])*U[:,i])
        k += 1
    i += 1

F = F.reshape((k, n_assets))
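
As a cross-check of the construction above, the same factorization can be sketched with NumPy alone. This is an illustration under stated assumptions, not the notebook's method: it substitutes `numpy.linalg.eigh` for the NAG routine, and the names `V_demo` and `F_demo` and the small random matrix are hypothetical.

```python
import numpy as np

# Small symmetric positive semidefinite matrix, built as B^T B
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
V_demo = B.T @ B

# eigh returns eigenvalues in ascending order and eigenvectors as columns
eigvals, eigvecs = np.linalg.eigh(V_demo)

# Keep strictly positive eigenvalues; row i of F_demo is sqrt(lambda_i) * u_i^T
positive = eigvals > 0
F_demo = np.sqrt(eigvals[positive])[:, None] * eigvecs[:, positive].T

# F_demo^T F_demo should reproduce V_demo, so x^T V x equals ||F x||^2
x = np.full(4, 0.25)
print(F_demo.shape)
print(np.allclose(F_demo.T @ F_demo, V_demo))
print(np.isclose(x @ V_demo @ x, np.linalg.norm(F_demo @ x) ** 2))
```

Dropping non-positive eigenvalues means $F$ has $k\leq n$ rows, which is why the number of auxiliary variables added later depends on $k$ rather than on the number of assets.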

For modelling, the NAG SOCP solver requires several input arguments describing the objective and the constraints. We now initialize the data that will be passed to the solver.

In [4]:
# Number of variables
n = n_assets
# Number of constraints
m = 0

# Objective coefficient c
c = np.full(n, 0.0, float)

# Bounds on variables
blx = np.full(n, -1.e20, float)
bux = np.full(n, 1.e20, float)

# Linear constraint bl <= Ax <= bu
# A in coordinate list format (COO)
irowa = np.empty(0, int)
icola = np.empty(0, int)
a = np.empty(0, float)
# Bounds on Ax
bl = np.empty(0, float)
bu = np.empty(0, float)

# Cone constraints
ctype = []
group = []

Because we will add auxiliary variables and constraints during the process, we need to keep track of the current number of variables and constraints in the model.

In [5]:
# Initialize the up-to-date problem size
n_up = n
m_up = m

We now update the data above as we add the objective and the constraints one by one. First is the objective coefficient $c$.

In [6]:
# Add objective function min -r'x
c = -r

Then we add the long-only constraint $x\geq0$ together with the budget constraint $\sum_{i=1}^n x_i=1$.

In [7]:
# Number of linear constraints will increase by 1
m += 1

# Set lower bound on x to 0
blx[0:n] = np.zeros(n)

# Add sum(x) = 1
irowa = np.append(irowa, np.full(n, m_up+1, dtype=int))
icola = np.append(icola, np.arange(1, n+1))
a = np.append(a, np.full(n, 1.0, dtype=float))
bl = np.append(bl, np.full(1, 1.0, dtype=float))
bu = np.append(bu, np.full(1, 1.0, dtype=float))

Now add the probability constraint by adding
$$
Fx = y,~ \bar{r}^Tx + \Phi^{-1}(\beta)t=\alpha ~\mbox{and}~(t;y)\in{\cal K}^{n+1}_q.
$$

In [8]:
# Evaluate the quantile function (inverse normal CDF) at beta
quantile = norm.ppf(beta)

# Up-to-date problem size
m_up = m
n_up = n

# Then k + 1 more variables need to be added together with
# k + 1 linear constraints and a second-order cone constraint
# Enlarge the model
n = n + k + 1
m = m + k + 1

# All the added auxiliary variables do not take part in obj
c = np.append(c, np.zeros(k+1))

# Enlarge bounds on x, add inf bounds on the new added k+1 variables
blx = np.append(blx, np.full(k+1, -1.e20, dtype=float))
bux = np.append(bux, np.full(k+1, 1.e20, dtype=float))

# Enlarge linear constraints
# Sparsity pattern of F (COO)
row, col = np.nonzero(F)
val = F[row, col]

# Convert to 1-based indices and shift rows down by m_up
# Add Fx = y and Phi^-1(beta)*t + r_bar'x - alpha = 0
# [x,t,y]
row = row + 1 + m_up
col = col + 1

row = np.append(row, np.arange(m_up+1, m_up+k+1+1))
col = np.append(np.append(col, np.arange(n_up+2, n_up+k+1+1)), n_up+1)
val = np.append(val, np.append(np.full(k, -1.0, dtype=float), quantile))

irowa = np.append(irowa, row)
icola = np.append(icola, col)
a = np.append(a, val)
bl = np.append(bl, np.append(np.zeros(k), alpha))
bu = np.append(bu, np.append(np.zeros(k), alpha))

# Coefficients of x in Phi^-1(beta)*t + r'x - alpha = 0
irowa = np.append(irowa, np.full(n_assets, m_up+k+1, dtype=int))
icola = np.append(icola, np.arange(1, n_assets+1))
a = np.append(a, r)

# Enlarge cone constraints
ctype.append('Q')
group_temp = np.arange(n_up+1, n_up+k+1+1)
group.append(group_temp)
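
To see the layout that the cell above produces, it can help to assemble a tiny instance and expand the 1-based COO triplets into a dense matrix. The sketch below is illustrative only: `coo_to_dense` is a hypothetical helper, and `F_toy`, `r_toy` and `q_toy` are made-up data for 2 assets with variable ordering $[x, t, y]$.

```python
import numpy as np


def coo_to_dense(irow, icol, vals, n_rows, n_cols):
    """Assemble a dense matrix from 1-based COO triplets (duplicates are summed)."""
    dense = np.zeros((n_rows, n_cols))
    idx = (np.asarray(irow) - 1, np.asarray(icol) - 1)
    np.add.at(dense, idx, np.asarray(vals, dtype=float))
    return dense


# Hypothetical toy data: factor F, expected returns and a stand-in quantile
F_toy = np.array([[1.0, 2.0], [0.0, 3.0]])
r_toy = np.array([0.1, 0.2])
q_toy = -1.5
k, n_x = F_toy.shape

irow, icol, vals = [], [], []

# Row 1: sum(x) = 1
irow += [1] * n_x
icol += list(range(1, n_x + 1))
vals += [1.0] * n_x

# Rows 2..k+1: F x - y = 0, with y stored in columns n_x+2 .. n_x+k+1
rows, cols = np.nonzero(F_toy)
irow += (rows + 2).tolist()
icol += (cols + 1).tolist()
vals += F_toy[rows, cols].tolist()
irow += list(range(2, k + 2))
icol += list(range(n_x + 2, n_x + k + 2))
vals += [-1.0] * k

# Row k+2: r' x + q t = alpha, with t stored in column n_x+1
irow += [k + 2] * (n_x + 1)
icol += list(range(1, n_x + 1)) + [n_x + 1]
vals += r_toy.tolist() + [q_toy]

A_toy = coo_to_dense(irow, icol, vals, n_rows=k + 2, n_cols=n_x + k + 1)
print(A_toy)
```

Printing the dense matrix makes it easy to confirm that the column of $t$ sits immediately after the asset columns and that each $y_i$ enters its own row with coefficient $-1$.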

All the data we need is now ready. We pass it to the NAG SOCP solver and solve.

In [9]:
# Create problem handle
handle = opt.handle_init(n)

# Set objective function
opt.handle_set_linobj(handle, c)

# Set box constraints
opt.handle_set_simplebounds(handle, blx, bux)

# Set linear constraints
opt.handle_set_linconstr(handle, bl, bu, irowa, icola, a)

# Set cone constraints
i = 0
while i<len(ctype):
    opt.handle_set_group(handle, ctype[i], 0, group[i])
    i += 1

# Set options
for option in [
        'Print Options = NO',
        'Print File = 1',
        'SOCP Scaling = A'
]:
    opt.handle_opt_set(handle, option)

# Call socp interior point solver
slt = opt.handle_solve_socp_ipm(handle)

naginterfaces.base.opt.handle_solve_socp_ipm:
naginterfaces.base.opt.handle_solve_socp_ipm:  ------------------------------------------------
naginterfaces.base.opt.handle_solve_socp_ipm:   E04PT, Interior point method for SOCP problems
naginterfaces.base.opt.handle_solve_socp_ipm:  ------------------------------------------------
naginterfaces.base.opt.handle_solve_socp_ipm:
naginterfaces.base.opt.handle_solve_socp_ipm:  Original Problem Statistics
naginterfaces.base.opt.handle_solve_socp_ipm:
naginterfaces.base.opt.handle_solve_socp_ipm:    Number of variables                          17
naginterfaces.base.opt.handle_solve_socp_ipm:    Number of linear constraints                 10
naginterfaces.base.opt.handle_solve_socp_ipm:    Number of nonzeros                           89
naginterfaces.base.opt.handle_solve_socp_ipm:    Number of cones                               1
naginterfaces.base.opt.handle_solve_socp_ipm:
naginterfaces.base.opt.handle_solve_socp_ipm:
naginterfaces.base.o

Now we can print the optimal portfolio and the corresponding return.

In [10]:
# Optimal portfolio
slt.x[0:n_assets]

array([1.93322779e-10, 3.97165177e-01, 3.35524516e-01, 7.15459782e-10,
       1.16762817e-01, 1.15110232e-10, 8.27684160e-02, 6.77790766e-02])

In [11]:
# Optimal expected return
r.dot(slt.x[0:n_assets])

0.08505797359531912
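
A quick sanity check on the reported portfolio is to confirm that the weights are nonnegative and sum to one within a small tolerance. The sketch below copies the weights printed above; the tolerance `1e-6` and the threshold used to list the assets actually held are assumptions chosen for illustration.

```python
# Weights as printed by the solver above
x_opt = [1.93322779e-10, 3.97165177e-01, 3.35524516e-01, 7.15459782e-10,
         1.16762817e-01, 1.15110232e-10, 8.27684160e-02, 6.77790766e-02]

# Budget constraint sum(x) = 1, up to the solver's accuracy
budget_gap = abs(sum(x_opt) - 1.0)

# Long-only constraint x >= 0
min_weight = min(x_opt)

# 0-based indices of assets with a non-negligible weight (threshold is an assumption)
held = [i for i, w in enumerate(x_opt) if w > 1e-6]

print(budget_gap < 1e-6, min_weight >= 0.0, held)
```

Weights of order $10^{-10}$ are typical of an interior-point method, which approaches the bound $x\geq0$ without reaching it exactly, so those assets can be treated as not held.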

# Conclusion

In this notebook, we demonstrated how to use the NAG Library to model and solve a portfolio optimization problem with a probability constraint via SOCP. SOCP is widely used in portfolio optimization because it can handle a large variety of constraints beyond the probability constraint discussed above, e.g. leverage, turnover, maximum position and tracking-error constraints. We refer readers to \cite{AG03, LVBL98, NAGDOC} for more details.

# References

[<a id="cit-AG03" href="#call-AG03">1</a>] Farid Alizadeh and Donald Goldfarb, "_Second-order cone programming_", Mathematical Programming, vol. 95, no. 1, pp. 3–51, 2003.

[<a id="cit-LVBL98" href="#call-LVBL98">2</a>] Miguel Sousa Lobo, Lieven Vandenberghe, Stephen Boyd <em>et al.</em>, "_Applications of second-order cone programming_", Linear Algebra and its Applications, vol. 284, no. 1–3, pp. 193–228, 1998.

[<a id="cit-NAGDOC" href="#call-NAGDOC">3</a>] Numerical Algorithms Group, "_NAG documentation_", 2019. [online](https://www.nag.com/numeric/fl/nagdoc_latest/html/frontmatter/manconts.html)

