## Installing the NAG Library and running this notebook
To run this notebook, you will need to install the NAG Library for Python (Mark 29.3 or newer) and a license key. You can find the software and obtain a license key (trials are available) from [Getting Started with the NAG Library](https://www.nag.com/content/getting-started-nag-library?lang=py&os=linuxto).


# Portfolio optimization with MILP using the NAG Library

A mixed integer linear programming (MILP) model is an extension to a linear programming model where some or all of the variables are constrained to be integer. The ability to handle integer variables makes this an extremely powerful tool with applications in a huge number of industries. Here, we will consider a MILP model of an optimal mean/Value-at-Risk (VaR) portfolio optimization problem by [Benati and Rizzi (2007)](#References). This is an extension to a classic Markowitz model in which the aim is to maximize returns while minimizing risk. However, in this instance, the variance risk measure has been replaced by VaR.

VaR is a widely used risk measure in portfolio management: it quantifies the maximum potential loss at a specified confidence level over a defined time frame. Specifically, VaR is the $\alpha$-quantile of the return distribution. Unlike other risk measures such as standard deviation or expected shortfall, VaR provides a clear and intuitive assessment of downside risk, making it a preferred choice for investors concerned with the probability of experiencing significant losses. While the mathematical properties of VaR are somewhat unappealing (it is a piecewise linear, non-convex function), its simple interpretation makes it a popular tool nonetheless.

In this instance, VaR is calculated using a non-parametric method, meaning that no assumptions are made about the distribution of portfolio returns. Instead of fitting a parametric model (e.g. a normal or Student's t-distribution) to historical data, which can be restrictive, we use historical simulation. This method sorts the historical return data from worst to best. The $\alpha$-quantile is then estimated as the observation that has a fraction $\alpha$ of the data to its left.
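
The historical-simulation estimate described above can be sketched in a few lines of NumPy. This is only an illustration: the helper name `historical_var` and the choice of the observation at position $\lceil \alpha n \rceil$ are assumptions (several quantile conventions exist), and it is not part of the MILP model that follows.

```python
import numpy as np

def historical_var(returns, alpha):
    """Empirical alpha-quantile of a 1-D sample of portfolio returns.

    Hypothetical helper: sorts returns from worst to best and picks the
    observation at 1-based position ceil(alpha * n), one convention among several.
    """
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    n = sorted_returns.size
    # Small tolerance guards against round-off in alpha * n
    position = int(np.ceil(alpha * n - 1e-12))
    # Clip to a valid 1-based position
    position = min(max(position, 1), n)
    return sorted_returns[position - 1]

# Five hypothetical observed returns; 40% of the data lies at or below the result
var_40 = historical_var([0.3, -0.2, 0.1, -0.5, 0.0], alpha=0.4)
```

Here the sorted sample is $(-0.5, -0.2, 0.0, 0.1, 0.3)$, so the estimate is the second-worst observation.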

In this notebook, we solve the "Min Risk/Fixed Return" implementation of this problem. This problem contains VaR constraints, cardinality constraints, semicontinuous constraints, a minimum return constraint, a budget (full investment) constraint, and long-only constraints.

In [1]:
# Import modules
from naginterfaces.base import utils
from naginterfaces.library import opt, mip
import numpy as np
import time

For this model, we need to set some parameters. 

We have past observations $i \in I=\{1,\dots,T\}$ and assets $j \in J=\{1,\dots,K\}$.

We set $r^*$ to be the minimum expected return that will be accepted, and $r^{\textrm{Min}}$ to be the minimum return that can be observed in the market. We also choose $r^{\textrm{VaR}}$, a parameter set by the decision maker to control risk: they will only accept portfolios for which the probability of a return less than $r^{\textrm{VaR}}$ is less than or equal to $\alpha^{\textrm{VaR}}$. Further, each observation $x_i$ has an associated probability $p_i$, the probability of occurrence of past realization $i$. For illustrative purposes, we give every observation equal probability and generate the returns data synthetically.

In [2]:
# Set parameters
n_assets = 300 # K
n_periods = 20 # T
r_star = 0.05
r_min = -1
r_var = 0.05
prob = 1/n_periods

# Generate synthetic returns: one row per time period, one column per asset
np.random.seed(0)
expected_returns = 0.25 * np.random.randn(n_periods, n_assets) 

Next, we define the variables used in the model:
* Variables $\lambda_j$ are the fraction of wealth that is allocated to asset $j$.
* Variables $x_i$ represent the observed portfolio return in time period $i$.
* Variables $y_i$ are binary (associated with $x_i$) and used for modelling the VaR constraints.
* Variables $z_j$ are binary (associated with $\lambda_j$) and used for modelling the cardinality and semicontinuous constraints.
* $\alpha^{\textrm{VaR}}$ is the probability associated with VaR.

In [3]:
# There are asset_weights[n_assets] + binary_y[n_periods] + observed_return[n_periods] + binary_z[n_assets] + aVaR[1]
n_vars = n_assets + n_periods + n_periods + n_assets + 1

# Create index sets for each set of variables:
idx_asset_w = np.arange(1, n_assets+1, dtype=int)
idx_bin_y = np.arange(n_assets+1, n_assets+n_periods+1, dtype=int)
idx_obs_ret = np.arange(n_assets+n_periods+1, n_assets+2*n_periods+1, dtype=int)
idx_bin_z = np.arange(n_assets+2*n_periods+1, 2*n_assets+2*n_periods+1, dtype=int)
idx_avar = [n_vars]

# Initialize the problem handle:
handle = opt.handle_init(nvar=n_vars)

# Set binary variables:
opt.handle_set_property(handle=handle, ptype='Bin', idx=idx_bin_y)
opt.handle_set_property(handle=handle, ptype='Bin', idx=idx_bin_z)
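
As a sanity check on the variable layout used above, the five 1-based index sets should cover $1,\dots,$ `n_vars` exactly once, in order. The following sketch re-creates the index sets with NumPy only (same sizes as in this notebook), so it runs without the NAG Library; `layout_ok` is a name introduced here for illustration.

```python
import numpy as np

n_assets, n_periods = 300, 20
n_vars = 2 * n_assets + 2 * n_periods + 1

# Same 1-based layout as in the cell above: lambda, y, x, z, alpha_VaR
idx_asset_w = np.arange(1, n_assets + 1)
idx_bin_y = np.arange(n_assets + 1, n_assets + n_periods + 1)
idx_obs_ret = np.arange(n_assets + n_periods + 1, n_assets + 2 * n_periods + 1)
idx_bin_z = np.arange(n_assets + 2 * n_periods + 1, 2 * n_assets + 2 * n_periods + 1)
idx_avar = np.array([n_vars])

# Concatenated in order, the sets should reproduce 1..n_vars with no gaps or overlaps
all_idx = np.concatenate([idx_asset_w, idx_bin_y, idx_obs_ret, idx_bin_z, idx_avar])
layout_ok = np.array_equal(all_idx, np.arange(1, n_vars + 1))
```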

## Constraints:
Since this is a MILP model, we set all linear constraints using [***handle_set_linconstr()***](https://support.nag.com/numeric/py/nagdoc_latest/naginterfaces.library.opt.handle_set_linconstr.html).

The first constraint that we implement enforces that the expected return of the optimal portfolio is at least the **minimum acceptable portfolio expected return**:
$$\sum_{i=1}^{T} p_i x_i \geq r^*.$$

In [4]:
ones_periods = np.ones(n_periods, dtype=int)
opt.handle_set_linconstr(
    handle=handle,
    bl=r_star,
    bu=1.e20,
    irowb=ones_periods,
    icolb=idx_obs_ret,
    b=[prob]*n_periods
);

Next, we need to enforce that, for each time period, $x_i$ equals the **weighted sum of the asset returns, with the fraction of wealth allocated to each asset as the weight**:

$$\sum_{j=1}^{K} \lambda_j r_{ij} = x_i \quad \forall \, i=1,\dots,T.$$

This requires a slight rearrangement before it can be input into the model: we stack the variables $\lambda$ and $x_i$ and express the constraint in matrix notation:

\begin{equation*}
\begin{bmatrix}
r_{i1}  &
\dots &
r_{iK} &
-1
\end{bmatrix}
\begin{bmatrix}
\lambda_1  \\
\vdots \\
\lambda_K \\
x_i
\end{bmatrix}
= 0 \quad \forall \, i = 1,\dots,T.
\end{equation*}


In [5]:
ones_assets_1 = np.ones(n_assets+1, dtype=int)
for i in range(n_periods):
    i_obs_ret = n_assets + n_periods + 1 + i
    returns = expected_returns[i, :]
    returns_x = np.append(returns, -1.)
    idx_col = np.append(idx_asset_w, i_obs_ret)
    opt.handle_set_linconstr(
        handle=handle,
        bl=0.,
        bu=0.,
        irowb=ones_assets_1,
        icolb=idx_col,
        b=returns_x
    )
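
The stacked row used in the loop above can be checked numerically without the solver. The sketch below uses a small hypothetical example (five assets, equal weights) and confirms that $[r_{i1}, \dots, r_{iK}, -1]$ applied to $[\lambda_1, \dots, \lambda_K, x_i]$ gives zero when $x_i$ is the weighted return.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets = 5                                        # small hypothetical example
returns_i = 0.25 * rng.standard_normal(n_assets)    # r_{i1}, ..., r_{iK}
weights = np.full(n_assets, 1.0 / n_assets)         # hypothetical lambda values

# Portfolio return in period i
x_i = returns_i @ weights

# Constraint row [r_i1 ... r_iK  -1] and stacked variables [lambda_1 ... lambda_K  x_i]
row = np.append(returns_i, -1.0)
stacked = np.append(weights, x_i)

# Should vanish (up to round-off), matching bl = bu = 0 in the constraint
residual = row @ stacked
```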

Next, we want to add constraints that **prevent the selection of portfolios with VaR below the threshold**. To do this, we introduce binary variables $y_i$ associated with each $x_i$. Then the VaR constraint can be modelled in the following way:

$$r^{\textrm{Min}} + (r^{\textrm{VaR}} - r^{\textrm{Min}})y_i \leq x_i \quad \forall \, i=1,\dots,T,$$
$$\sum_{i=1}^{T} p_i(1-y_i) \leq \alpha^{\textrm{VaR}}.$$

The first constraint forces $y_i = 0$ whenever $x_i$ is less than $r^{\textrm{VaR}}$. In the second constraint, such a period contributes $p_i(1 - y_i) = p_i$, so the sum is at least the total probability of the time periods whose returns fall below the VaR threshold. A portfolio for which this probability exceeds $\alpha^{\textrm{VaR}}$ is infeasible.
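
To make the role of $y_i$ concrete, here is a minimal sketch with five hypothetical period returns. It sets each $y_i$ to the largest value the first constraint allows (which is what minimizing $\alpha^{\textrm{VaR}}$ encourages) and evaluates the resulting lower bound on $\alpha^{\textrm{VaR}}$ from the second constraint. The data and variable names are illustrative only.

```python
import numpy as np

r_min, r_var = -1.0, 0.05
x = np.array([0.20, -0.10, 0.07, 0.01, 0.30])   # hypothetical period returns
p = np.full(x.size, 1.0 / x.size)               # equal probabilities

# Largest y_i permitted by r_min + (r_var - r_min) * y_i <= x_i
y = (x >= r_var).astype(int)

# Verify the first constraint for every period
first_ok = bool(np.all(r_min + (r_var - r_min) * y <= x))

# Smallest alpha_VaR compatible with sum_i p_i * (1 - y_i) <= alpha_VaR
alpha_lower = float(np.sum(p * (1 - y)))
```

Two of the five returns fall below $r^{\textrm{VaR}} = 0.05$, so any feasible $\alpha^{\textrm{VaR}}$ for this portfolio must be at least $2/5$.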

Rearranged into matrix notation, the first constraint becomes:

\begin{equation*}
\begin{bmatrix}
(r^{\textrm{Min}}-r^{\textrm{VaR}})  &
1
\end{bmatrix}
\begin{bmatrix}
y_i  \\
x_i
\end{bmatrix}
\geq r^{\textrm{Min}} \quad \forall \, i = 1,\dots,T,
\end{equation*}

and the second constraint becomes:

\begin{equation*}
\begin{bmatrix}
p_{1}  &
\dots &
p_{T} &
1
\end{bmatrix}
\begin{bmatrix}
y_1  \\
\vdots \\
y_T \\
\alpha^{\textrm{VaR}}
\end{bmatrix}
\geq 1.
\end{equation*}
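
The second rearrangement follows by expanding the sum and assuming that the probabilities sum to one, which holds here because $p_i = 1/T$:

\begin{equation*}
\sum_{i=1}^{T} p_i(1-y_i) \leq \alpha^{\textrm{VaR}}
\iff
\sum_{i=1}^{T} p_i - \sum_{i=1}^{T} p_i y_i \leq \alpha^{\textrm{VaR}}
\iff
\sum_{i=1}^{T} p_i y_i + \alpha^{\textrm{VaR}} \geq \sum_{i=1}^{T} p_i = 1.
\end{equation*}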

In [6]:
y_coef = r_min - r_var
for i in range(n_periods):
    y_i = n_assets + 1 + i
    x_i = n_assets + n_periods + 1 + i     
    opt.handle_set_linconstr(
        handle=handle,
        bl=r_min,
        bu=1.e20,
        irowb=[1, 1],
        icolb=[y_i, x_i],
        b=[y_coef, 1.]
    )

ones_periods_1 = np.ones(n_periods+1, dtype=int)
idx_bin_avar = np.append(idx_bin_y, idx_avar)
opt.handle_set_linconstr(
    handle=handle,
    bl=1.0,
    bu=1.e20,
    irowb=ones_periods_1,
    icolb=idx_bin_avar,
    b=np.append([prob]*n_periods, 1.)
);

**Full investment** constraint:

$$\sum_{j=1}^{K} \lambda_j = 1.$$

In [7]:
ones_assets_float = np.ones(n_assets, dtype=float)
ones_assets = np.ones(n_assets, dtype=int)
opt.handle_set_linconstr(
    handle=handle,
    bl=1.0,
    bu=1.0,
    irowb=ones_assets,
    icolb=idx_asset_w,
    b=ones_assets_float
);

Next, we want to add **semicontinuous constraints** so that each asset's allocation is either zero or between $5\%$ and $70\%$:
$$\lambda_j \in \{0\} \cup [0.05, 0.7] \quad \forall \, j = 1,\dots, K.$$

Using binary variables $z_j$, the semicontinuous constraint can be expressed as follows:

$$0.05 \cdot z_j \leq \lambda_j \leq 0.7 \cdot z_j \quad \forall \, j=1,\dots, K.$$

These constraints enforce that if $z_j$ is equal to $1$, then $\lambda_j$ can take any value in the interval between $0.05$ and $0.7$. When $z_j$ equals $0$, then $\lambda_j$ must also equal $0$.
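
The behaviour of the semicontinuous constraint can be illustrated with a small checker. The function `semicontinuous_ok` and its tolerance are hypothetical, introduced only for this sketch; it tests $0.05 \cdot z_j \leq \lambda_j \leq 0.7 \cdot z_j$ for given weights and binary indicators.

```python
import numpy as np

def semicontinuous_ok(weights, z, lower=0.05, upper=0.7, tol=1e-9):
    """Check lower * z_j <= lambda_j <= upper * z_j for every asset j."""
    weights = np.asarray(weights, dtype=float)
    z = np.asarray(z, dtype=float)
    above_lower = np.all(lower * z - tol <= weights)
    below_upper = np.all(weights <= upper * z + tol)
    return bool(above_lower and below_upper)

ok_case = semicontinuous_ok([0.0, 0.3, 0.7], [0, 1, 1])   # within the allowed set
too_small = semicontinuous_ok([0.02, 0.98], [1, 1])       # 0.02 < 0.05 and 0.98 > 0.7
not_selected = semicontinuous_ok([0.3, 0.7], [0, 1])      # z_1 = 0 forces lambda_1 = 0
```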

In [8]:
for j in range(n_assets):
    lambda_j = 1 + j
    z_j = n_assets + 2*n_periods + 1 + j
    opt.handle_set_linconstr(
        handle=handle,
        bl=-1.e20,
        bu=0.0,
        irowb=[1, 1],
        icolb=[lambda_j, z_j],
        b=[1., -0.7]
    )
   
    opt.handle_set_linconstr(
        handle=handle,
        bl=0.0,
        bu=1.e20,
        irowb=[1, 1],
        icolb=[lambda_j, z_j],
        b=[1., -0.05]
    )

Next, we want at most $10\%$ of the assets to be included in the optimal portfolio, so we add a **cardinality constraint**:
$$||\lambda||_0 \leq 0.1 \cdot K.$$

This can be modelled with the binary variables $z_j$ introduced for the semicontinuous constraints. When $z_j$ is equal to $0$, asset $j$ is not allocated; when $z_j$ is equal to $1$, asset $j$ has been allocated. Then the cardinality constraint can be modelled by summing the binary variables:

$$\sum_{j=1}^K z_j \leq 0.1 \cdot K.$$

In [9]:
opt.handle_set_linconstr(
    handle=handle,
    bl=-1.e20,
    bu=0.1*n_assets,
    irowb=ones_assets,
    icolb=idx_bin_z,
    b=ones_assets_float
);
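
For intuition, $||\lambda||_0$ simply counts the nonzero weights. The sketch below builds a hypothetical weight vector with three selected assets out of $300$, derives the corresponding indicators $z_j$, and checks the count against the limit $0.1 \cdot K$.

```python
import numpy as np

n_assets = 300
max_assets = 0.1 * n_assets                 # cardinality limit, 0.1 * K

# Hypothetical portfolio holding three assets
weights = np.zeros(n_assets)
weights[[3, 57, 120]] = [0.5, 0.3, 0.2]

# z_j = 1 exactly when asset j is held
z = (weights > 0).astype(int)
cardinality = int(z.sum())                  # equals the l0 "norm" of the weights
cardinality_ok = cardinality <= max_assets
```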

Bound constraints to **prevent short-selling**:

$$\lambda_j \geq 0 \quad \forall \, j=1,\dots,K.$$

In [10]:
# Set bound constraints:
zeros_asset_w = np.zeros(n_assets)
neg_lrg_bnd = np.full(n_periods + n_periods + n_assets + 1, -1.e20, dtype=float)
bl = np.hstack([zeros_asset_w, neg_lrg_bnd])
opt.handle_set_simplebounds(
    handle,
    bl=bl,
    bu=[1.e20]*n_vars,
)

## Objective function:
We want to minimize risk:

\begin{equation*}
\min_{\alpha^{\textrm{VaR}},\lambda,x,y,z} \alpha^{\textrm{VaR}}
\end{equation*}


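Even though the objective function is linear, we use [***handle_set_quadobj()***](https://support.nag.com/numeric/py/nagdoc_latest/naginterfaces.library.opt.handle_set_quadobj.html) because it accepts the linear term in sparse format, so only the single nonzero coefficient of $\alpha^{\textrm{VaR}}$ needs to be supplied.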

In [11]:
# Set the objective function:
opt.handle_set_quadobj(
    handle,
    idxc=idx_avar,
    c=[1.0],
)

## Optimal mean/Value-at-Risk model
We can now specify the full model and solve it using the [**MILP solver**](https://support.nag.com/numeric/py/nagdoc_latest/naginterfaces.library.mip.handle_solve_milp.html).

\begin{equation*}
    \begin{aligned}
        &\min_{\alpha^{\textrm{VaR}},\lambda,x,y,z} &&\alpha^{\textrm{VaR}}
        \\
        &\textrm{subject to } &&\sum_{i=1}^{T} p_i x_i \geq r^*,
        \\
        & &&\sum_{j=1}^{K} \lambda_j r_{ij} = x_i \quad \forall \, i=1,\dots,T,
        \\
        & &&r^{\textrm{Min}} + (r^{\textrm{VaR}} - r^{\textrm{Min}})y_i \leq x_i \quad \forall \, i=1,\dots,T,
        \\
        & &&\sum_{i=1}^{T} p_i(1-y_i) \leq \alpha^{\textrm{VaR}},
        \\
        & &&\sum_{j=1}^{K} \lambda_j = 1,
        \\
        & &&\sum_{j=1}^K z_j \leq 0.1 \cdot K,
        \\
        & && 0.05 \cdot z_j \leq \lambda_j \leq 0.7 \cdot z_j \quad \forall \, j = 1,\dots, K,
        \\
        & &&y_i \in \{0,1\} \quad \forall \, i=1,\dots,T,
        \\
        & &&z_j \in \{0,1\} \quad \forall \, j=1,\dots,K,
        \\
        & &&\lambda_j \geq 0 \quad \forall \, j=1,\dots,K.
    \end{aligned}
\end{equation*}
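
Independently of the solver, a candidate weight vector can be checked against each constraint family of the model above. The following is a minimal NumPy sketch under stated assumptions: the function `check_portfolio`, its argument names, and the tolerance are hypothetical, the indicators $z_j$ are inferred from the nonzero weights, and each $y_i$ is taken as large as the VaR constraint allows.

```python
import numpy as np

def check_portfolio(weights, returns, r_star, r_var, alpha, probs=None,
                    lower=0.05, upper=0.7, max_frac=0.1, tol=1e-9):
    """Return one boolean per constraint family for a candidate portfolio."""
    weights = np.asarray(weights, dtype=float)
    returns = np.asarray(returns, dtype=float)      # shape (T, K)
    n_periods, n_assets = returns.shape
    if probs is None:
        probs = np.full(n_periods, 1.0 / n_periods)
    x = returns @ weights                           # observed returns x_i
    selected = weights > tol                        # plays the role of z_j
    # sum_i p_i (1 - y_i) with y_i = 1 wherever x_i >= r_var
    shortfall_prob = probs[x < r_var - tol].sum()
    held = weights[selected]
    return {
        "min_return": bool(probs @ x >= r_star - tol),
        "var": bool(shortfall_prob <= alpha + tol),
        "budget": bool(abs(weights.sum() - 1.0) <= tol),
        "cardinality": bool(selected.sum() <= max_frac * n_assets + tol),
        "semicontinuous": bool(np.all((held >= lower - tol) & (held <= upper + tol))),
        "long_only": bool(np.all(weights >= -tol)),
    }

# Hypothetical data: 4 periods, 20 assets, only the first two assets have nonzero returns
returns = np.zeros((4, 20))
returns[:, 0] = [0.10, 0.08, 0.02, 0.12]
returns[:, 1] = [0.06, 0.10, 0.12, 0.04]

good = np.zeros(20)
good[:2] = [0.5, 0.5]
report = check_portfolio(good, returns, r_star=0.05, r_var=0.05, alpha=0.0)

bad = np.zeros(20)
bad[:2] = [0.98, 0.02]                              # violates the 5%-70% range
bad_report = check_portfolio(bad, returns, r_star=0.05, r_var=0.05, alpha=0.0)
```

The equally weighted candidate passes every check, while the second candidate fails only the semicontinuous check.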

In [12]:
# Set some algorithmic options:
for option in [
        'Print Level = 1',
]:
    opt.handle_opt_set(handle, option)

# Use an explicit I/O manager for abbreviated iteration output:
iom = utils.FileObjManager(locus_in_output=False)

# Start timer:
start_solve = time.time()

# Call the MILP solver:
mip.handle_solve_milp(handle, io_manager=iom)

# Print computation time:
end = time.time()
print(f"\n Computation time: {end-start_solve:.3f} seconds")

# Destroy the handle:
opt.handle_free(handle)

 H02BK, Solver for MILP problems
 Begin of Options
     Print File                    =                   9     * d
     Print Level                   =                   1     * U
     Print Options                 =                 Yes     * d
     Print Solution                =                  No     * d
     Monitoring File               =                  -1     * d
     Monitoring Level              =                   4     * d

     Infinite Bound Size           =         1.00000E+20     * d
     Task                          =            Minimize     * d
     Time Limit                    =         1.00000E+06     * d

     Milp Presolve                 =                 Yes     * d
     Milp Random Seed              =                   0     * d
     Milp Feasibility Tol          =         1.00000E-06     * d
     Milp Rel Gap                  =         1.00000E-04     * d
     Milp Abs Gap                  =         1.00000E-06     * d
     Milp Small Matrix Value       = 

In this instance, we solved a model with $641$ variables ($320$ of which were binary) and $644$ constraints. This was efficiently solved in $0.41$ seconds.
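
The problem size quoted above can be recovered from the model dimensions by counting each variable block and each group of linear constraints added in this notebook. A short arithmetic check:

```python
n_assets, n_periods = 300, 20

# Variables: lambda (K) + y (T) + x (T) + z (K) + alpha_VaR (1)
n_vars = n_assets + n_periods + n_periods + n_assets + 1
n_binary = n_periods + n_assets               # y and z

# Linear constraints, in the order they were added
n_constraints = (
    1                 # minimum expected return
    + n_periods       # definition of x_i
    + n_periods       # VaR constraint linking y_i and x_i
    + 1               # VaR probability constraint
    + 1               # full investment
    + 2 * n_assets    # semicontinuous upper and lower links
    + 1               # cardinality
)
```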

Using MILP allows the modelling of complex constraints, adding flexibility and sophistication to your problem. The MILP solver used in this portfolio optimization example is available in the NAG Library Optimization Modelling Suite. Learn more about it [here](https://nag.com/mixed-integer-linear-programming/). 

## References

Benati, S., & Rizzi, R. (2007). A mixed integer linear programming formulation of the optimal mean/value-at-risk portfolio problem. *European Journal of Operational Research*, *176*(1), 423-434.