# Robust Optimization for Genetic Selection

This is a working notebook looking at how the quadratic programming problem (QP) that arises in robust genetic selection can be solved with Python (tested with version 3.10). It depends on some standard packages, imported below.

<!-- TODO this needs updating to reflect the current purpose of this file -->

In [2]:
import numpy as np                  # defines matrix structures
from qpsolvers import solve_qp      # used for quadratic optimization
from time import perf_counter       # fine grained timing 
import gurobipy as gp               # Gurobi optimization interface (1)
from gurobipy import GRB            # Gurobi optimization interface (2)

Utility functions and output settings used in this notebook are defined in the two cells below.

In [3]:
def printTime(task, tic, toc):
    """Quick function for nicely printing a time to 5 decimal places."""
    print(f"{task} in {toc - tic:0.5f} seconds\n")


def printMatrix(matrix, description="ans =", precision=3):
    """Quick function for nicely printing a matrix"""
    print(f"{description}\n", np.round(matrix, precision))


# print arrays in full rather than summarising long ones with an ellipsis
np.set_printoptions(threshold=np.inf)

# only show numpy output to five decimal places
np.set_printoptions(formatter={'float_kind':"{:.5f}".format})

## Standard Problem

In the context of genetic selection, we want to maximize the genetic merit (a measure of desirable traits) of the selected candidates while minimizing risks due to inbreeding. This can be formulated mathematically as
$$
    \min_w \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu\ \text{ subject to }\ w_{\mathcal{S}}^{T}e_{\mathcal{S}}^{} = \frac{1}{2},\ w_{\mathcal{D}}^{T}e_{\mathcal{D}}^{} = \frac{1}{2},\ l\leq w\leq u,
$$
where $w$ is the vector of proportional contributions, $\Sigma$ is a matrix encoding risk, $\mu$ is a vector encoding returns, $l$ encodes lower bounds on contributions, $u$ encodes upper bounds on contributions, $\mathcal{S}$ is an index set of candidates who are sires, and $\mathcal{D}$ is an index set of candidates who are dams.

In this representation of the problem, $\lambda$ is a control variable which balances how we trade off between risk and return. Each value of $\lambda$ will give a different solution on the critical frontier of the problem.

### Constraint Formulation

Since it is beneficial to work with problems in a standard form,
$$
    \min_x \frac{1}{2} x^T A x + q^T x\ \text{ subject to }\ Gx\leq h,\ Mx = m,\ l\leq x\leq u,
$$
we will need to do a very slight rearrangement of the problem to incorporate our two sum-to-half constraints within a single equality constraint. We also do not use the $Gx\leq h$ constraint in our problem.

We observe that the two vector constraints
$$
    w_{\mathcal{S}}^{T}e_{\mathcal{S}}^{} = \frac{1}{2},\ w_{\mathcal{D}}^{T}e_{\mathcal{D}}^{} = \frac{1}{2},
$$
are equivalent to the single matrix constraint
$$
    Mw := \begin{bmatrix}
        \mathbb{I}\lbrace 1\in\mathcal{S}\rbrace & \mathbb{I}\lbrace 2\in\mathcal{S}\rbrace & \cdots & \mathbb{I}\lbrace n\in\mathcal{S}\rbrace \\
        \mathbb{I}\lbrace 1\in\mathcal{D}\rbrace & \mathbb{I}\lbrace 2\in\mathcal{D}\rbrace & \cdots & \mathbb{I}\lbrace n\in\mathcal{D}\rbrace \end{bmatrix}w = \begin{bmatrix} 0.5 \\ 0.5\end{bmatrix},
$$
where $\mathbb{I}\lbrace i\in\mathcal{I}\rbrace$ is an indicator function denoting whether index $i$ is in the set of indices $\mathcal{I}$.
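
As a minimal sketch of this construction, assuming a hypothetical helper `build_constraint_matrix` and made-up index sets for five candidates (neither appears in this notebook), the indicator rows can be filled with NumPy and compared against the two separate sums:

```python
import numpy as np


def build_constraint_matrix(n, sire_indices, dam_indices):
    """Hypothetical helper: row 0 flags sires, row 1 flags dams (0-based indices)."""
    M = np.zeros((2, n))
    M[0, sire_indices] = 1.0
    M[1, dam_indices] = 1.0
    return M


# illustrative cohort of five candidates, not taken from the notebook
demo_sires, demo_dams = [0, 3], [1, 2, 4]
M_demo = build_constraint_matrix(5, demo_sires, demo_dams)

# a contribution vector whose sire and dam parts each sum to one half
w_demo = np.array([0.3, 0.1, 0.25, 0.2, 0.15])

# the single matrix product reproduces the two separate sums
print(M_demo @ w_demo)
print(w_demo[demo_sires].sum(), w_demo[demo_dams].sum())
```

Every column of `M_demo` contains exactly one nonzero entry, which reflects each candidate being either a sire or a dam.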

### Toy Example ($n = 3$)

Let's see how this works in an example. We will start by looking at how this problem might be solved using Python's [qpsolvers](https://qpsolvers.github.io/qpsolvers/index.html) library. Consider the problem where
$$
    \mu = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix},\quad
    \Sigma = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 3 \end{bmatrix}, \quad
    \mathcal{S} = \lbrace 1 \rbrace, \quad
    \mathcal{D} = \lbrace 2, 3 \rbrace, \quad
    l = {\bf 0}, \quad
    u = {\bf 1}.
$$
We define these variables in Python using the following code.

In [4]:
# KEY PROBLEM VARIABLES
problem_size = 3
expected_breeding_values = np.array([
    1.0,
    5.0,
    2.0
])
relationship_matrix = np.array([
    [1, 0, 0],
    [0, 5, 0],
    [0, 0, 3]
])
sire_indices = [0]
dam_indices  = [1,2]
lower_bound = np.full((problem_size, 1), 0.0)
upper_bound = np.full((problem_size, 1), 1.0)

There are some additional variables which need setting up so that the problem works in `qpsolvers`.

In [5]:
# OPTIMIZATION SETUP VARIABLES
lam = 0.5
# define M so that column i is [1;0] if i is a sire and [0;1] if i is a dam
M = np.zeros((2, problem_size))
M[0, sire_indices] = 1
M[1, dam_indices] = 1
# define the right hand side of the constraint Mx = m
m = np.array([[0.5], [0.5]])

Finally, we solve the problem using the module's `solve_qp` function. Here it calls Gurobi through its API, a fact which will be important once we start to consider larger problem sizes.

In [6]:
# SOLVE THE PROBLEM
def solveGS(lam):
    return solve_qp(
        P = relationship_matrix,
        q = -lam*expected_breeding_values,
        G = None,
        h = None,
        A = M,
        b = m,
        lb = lower_bound,
        ub = upper_bound,
        solver = "gurobi"
    )

print(f"QP solution: w = {solveGS(lam)}")

Set parameter Username
Academic license - for non-commercial use only - expires 2025-02-26
QP solution: w = [0.50000 0.37500 0.12500]


Excellent, we have code which is telling us that our optimal contributions for $\lambda = 0.5$ are $w_1 = 0.5$, $w_2 = 0.375$, and $w_3 = 0.125$. We could vary our $\lambda$ value too and find other points on the frontier.

In [7]:
print(f"lambda = 0.0: w = {solveGS(0.0)}")
print(f"lambda = 0.2: w = {solveGS(0.2)}")
print(f"lambda = 0.4: w = {solveGS(0.4)}")
print(f"lambda = 0.6: w = {solveGS(0.6)}")
print(f"lambda = 0.8: w = {solveGS(0.8)}")
print(f"lambda = 1.0: w = {solveGS(1.0)}")

lambda = 0.0: w = [0.50000 0.18750 0.31250]
lambda = 0.2: w = [0.50000 0.26250 0.23750]
lambda = 0.4: w = [0.50000 0.33750 0.16250]
lambda = 0.6: w = [0.50000 0.41250 0.08750]
lambda = 0.8: w = [0.50000 0.48750 0.01250]
lambda = 1.0: w = [0.50000 0.50000 0.00000]
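
These values can be cross-checked by hand. Because $\mathcal{S}$ contains a single candidate, $w_1 = 0.5$ is forced, and writing $w_3 = 0.5 - w_2$ leaves a one-dimensional convex problem whose stationary point satisfies $8w_2 = 1.5 + 3\lambda$. The sketch below (the helper name `closed_form_w` is an assumption for illustration) clips that stationary point to $[0, 0.5]$ and compares it with the printed solutions:

```python
import numpy as np


def closed_form_w(lam):
    """Toy-example solution: w1 is fixed at 0.5 and w3 = 0.5 - w2."""
    # stationary point of the reduced 1-D objective: 8*w2 = 1.5 + 3*lam
    w2 = np.clip((1.5 + 3.0 * lam) / 8.0, 0.0, 0.5)
    return np.array([0.5, w2, 0.5 - w2])


# values copied from the solver sweep printed above
printed = {
    0.0: [0.5, 0.1875, 0.3125],
    0.2: [0.5, 0.2625, 0.2375],
    0.4: [0.5, 0.3375, 0.1625],
    0.6: [0.5, 0.4125, 0.0875],
    0.8: [0.5, 0.4875, 0.0125],
    1.0: [0.5, 0.5, 0.0],
}

# largest componentwise gap between the hand calculation and the solver
max_gap = max(
    np.abs(closed_form_w(lam) - np.array(w_solver)).max()
    for lam, w_solver in printed.items()
)
print(f"largest gap = {max_gap:.2e}")
```

The clip matters only at $\lambda = 1.0$, where the unconstrained stationary point $w_2 = 0.5625$ would push $w_3$ below its lower bound.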


## Robust Optimization

A wrinkle to this is that we don't typically know our problem variables with certainty. There _are_ ways to define $\Sigma$ based on relationships within the cohort (which are known and discussed more later) but $\mu$ is an estimated variable. In particular we say $\mu$ has a multivariate normal distribution, $\mu\sim N(\bar{\mu}, \Omega)$. This means we must turn to optimization tools which can address this uncertainty.¹

Robust optimization is one such tool in which we adjust the objective function to model the inherent uncertainty in the problem. We may either do this with a quadratic uncertainty set, in which case our objective has an additional square-root term as in
$$
    \min_w \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu - \kappa\sqrt{w^{T}\Omega w}\ \text{ subject to }\ Mw = \begin{bmatrix} 0.5 \\ 0.5\end{bmatrix},\ l\leq w\leq u,
$$
or with a box uncertainty set, in which case our objective has an additional absolute value term as in
$$
    \min_w \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu - \kappa\|\Omega^{\frac{1}{2}} w\|_1\ \text{ subject to }\ Mw = \begin{bmatrix} 0.5 \\ 0.5\end{bmatrix},\ l\leq w\leq u,
$$
where $\kappa\in\mathbb{R}$ is our robust optimization parameter. For practical reasons relating to continuity and differentiability, the quadratic uncertainty set is far more favourable to work with.
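
One way to read the square-root term: if $\mu\sim N(\bar{\mu},\Omega)$ then the return $w^{T}\mu$ has standard deviation $\sqrt{w^{T}\Omega w}$. The following sketch checks this by sampling. The values of $\bar{\mu}$, $\Omega$ and $w$ are illustrative assumptions, as are the seed and sample size.

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative values (assumptions for this sketch)
mu_bar_check = np.array([1.0, 5.0, 2.0])
omega_check = np.diag([1.0, 4.0, 0.125])
w_check = np.array([0.5, 0.375, 0.125])

# draw many plausible mu vectors and compute the return of w under each
mu_samples = rng.multivariate_normal(mu_bar_check, omega_check, size=200_000)
returns = mu_samples @ w_check

# compare the sampled spread with the analytic expression
analytic_sd = np.sqrt(w_check @ omega_check @ w_check)
print(f"sampled sd  = {returns.std():.4f}")
print(f"analytic sd = {analytic_sd:.4f}")
```

The two numbers should agree to within sampling error, which is why $\kappa$ can be thought of as scaling how many standard deviations of return uncertainty enter the objective.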

This is obviously no longer a quadratic problem, so `qpsolvers` is no longer a viable tool. We will instead now need to work with the Gurobi API directly.

<!-- TODO this cell skips over the detail about how to go from the bilevel robust problem to the single level robust problem theory wise, it would be worth delving into that in more detail. -->

### Using Gurobi

To illustrate how the setup changes when uncertainty is added, we will first look at how Gurobi handles the standard problem. The following code returns the same solution as the `qpsolvers` call above.

In [8]:
# create a model for standard genetic selection
model = gp.Model("standardGS")

# define the variable of interest as continuous
w = model.addMVar(shape=problem_size, vtype=GRB.CONTINUOUS, name="w")

# set the objective function
model.setObjective(
    0.5*w@(relationship_matrix@w) - lam*w.transpose()@expected_breeding_values,
GRB.MINIMIZE)

# add sum-to-half constraints
model.addConstr(M @ w == m, name="sum-to-half")
# add weight-bound constraints
model.addConstr(w >= lower_bound, name="lower bound")
model.addConstr(w <= upper_bound, name="upper bound")

# solve the problem with Gurobi
model.optimize()
print(f"w = {w.X}")

Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - "Ubuntu 22.04.4 LTS")

CPU model: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 22 rows, 3 columns and 24 nonzeros
Model fingerprint: 0xd96d3c43
Model has 3 quadratic objective terms
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [5e-01, 2e+00]
  QObjective range [1e+00, 5e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [5e-01, 1e+00]
Presolve removed 21 rows and 1 columns
Presolve time: 0.01s
Presolved: 1 rows, 2 columns, 2 nonzeros
Presolved model has 2 quadratic objective terms
Ordering time: 0.00s

Barrier statistics:
 AA' NZ     : 0.000e+00
 Factor NZ  : 1.000e+00
 Factor Ops : 1.000e+00 (less than 1 second per iteration)
 Threads    : 1

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     

Unfortunately Gurobi cannot handle our problem with its objective function in the form
$$
    \min_w \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu - \kappa\sqrt{w^{T}\Omega w}\ \text{ subject to }\ Mw = \begin{bmatrix} 0.5 \\ 0.5\end{bmatrix},\ l\leq w\leq u,
$$
so some further adjustments are needed first. If we define a real auxiliary variable $z\geq0$ such that $z\leq\sqrt{w^{T}\Omega w}$, then our problem becomes
$$
    \min_w \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu - \kappa z\ \text{ s.t. }\ z\leq\sqrt{w^{T}\Omega w},\ Mw = \begin{bmatrix} 0.5 \\ 0.5\end{bmatrix},\ l\leq w\leq u.
$$
<!-- TODO: better explain the idea of $z$ pushing up so that this switch doesn't have any practical difference -->
However, Gurobi _still_ can't handle this due to the presence of the square root, so we further make use of both $z$ and $\sqrt{w^{T}\Omega w}$ being non-negative to note that $z\leq\sqrt{w^{T}\Omega w}$ can be squared on both sides:
$$
    \min_w \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu - \kappa z\ \text{ s.t. }\ z^2\leq w^{T}\Omega w,\ Mw = \begin{bmatrix} 0.5 \\ 0.5\end{bmatrix},\ l\leq w\leq u.
$$
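
To see why replacing $\sqrt{w^{T}\Omega w}$ by $z$ should not change the optimum when $\kappa > 0$, note that for any fixed $w$ the objective decreases linearly in $z$, so minimization drives $z$ up to its bound. A small numerical sketch of that argument, assuming the toy $\Sigma$ and $\mu$ from earlier together with illustrative values for $\Omega$, $\kappa$ and a fixed $w$:

```python
import numpy as np

# toy data; omega_fixed, kappa_fixed and w_fixed are illustrative assumptions
sigma_fixed = np.diag([1.0, 5.0, 3.0])
mu_fixed = np.array([1.0, 5.0, 2.0])
omega_fixed = np.diag([1.0, 4.0, 0.125])
lam_fixed, kappa_fixed = 0.5, 0.5
w_fixed = np.array([0.5, 0.375, 0.125])

# largest z allowed by the constraint z <= sqrt(w' Omega w)
z_bound = np.sqrt(w_fixed @ omega_fixed @ w_fixed)

# evaluate the objective over a grid of feasible z values for this fixed w
z_grid = np.linspace(0.0, z_bound, 101)
objective = (
    0.5 * w_fixed @ sigma_fixed @ w_fixed
    - lam_fixed * w_fixed @ mu_fixed
    - kappa_fixed * z_grid
)

# the smallest objective value sits at the upper end of the grid
best_z = z_grid[np.argmin(objective)]
print(f"bound on z = {z_bound:.5f}, minimizing z = {best_z:.5f}")
```

Squaring the constraint does not alter this picture, since $z\mapsto z^2$ is increasing on $z\geq0$ and so the feasible interval for $z$ is unchanged.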

We will define
$$
    \bar{\mu} = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix},\quad
    \Omega = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & \frac{1}{8} \end{bmatrix},\quad \kappa = 0.5,
$$
and our other problem variables as before.

In [25]:
omega = np.array([
    [1, 0, 0],
    [0, 4, 0],
    [0, 0, 1/8]
])

kappa = 0.5

We then formulate this in Python as follows.

In [39]:
# create a new model for robust genetic selection
model = gp.Model("robustGS")

# define the variables of interest as continuous
w = model.addMVar(shape=problem_size, vtype=GRB.CONTINUOUS, name="w")
z = model.addVar(name="z")

# setup the robust objective function
model.setObjective(
    0.5*w@(relationship_matrix@w) - lam*w.transpose()@expected_breeding_values - kappa*z,
GRB.MINIMIZE)

# add quadratic uncertainty constraint
model.addConstr(z**2 <= np.inner(w, omega@w), name="uncertainty")
model.addConstr(z >= 0, name="z positive")
# add sum-to-half constraints
model.addConstr(M @ w == m, name="sum-to-half")
# add weight-bound constraints
model.addConstr(w >= lower_bound, name="lower bound")
model.addConstr(w <= upper_bound, name="upper bound")

# solve the problem with Gurobi
model.optimize()
print(f"w = {w.X},\nz = {z.X}.")

Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - "Ubuntu 22.04.4 LTS")

CPU model: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 23 rows, 4 columns and 25 nonzeros
Model fingerprint: 0xb2b6242a
Model has 3 quadratic objective terms
Model has 3 quadratic constraints
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  QMatrix range    [1e-01, 4e+00]
  Objective range  [5e-01, 2e+03]
  QObjective range [1e+00, 5e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [5e-01, 1e+00]
Presolve removed 22 rows and 1 columns
Presolve time: 0.01s
Presolved: 9 rows, 8 columns, 14 nonzeros
Presolved model has 3 second-order cone constraints
Ordering time: 0.00s

Barrier statistics:
 AA' NZ     : 2.200e+01
 Factor NZ  : 4.500e+01
 Factor Ops : 2.850e+02 (less than 1 second per iteration)
 Threads    : 1

                  Objective                Resid

We can repeat this experiment with varying $\kappa$ to see how different tolerances for uncertainty impact the robust turning point returned.

| $\kappa$ |        $w$            |   $z$   | $f(w,z)$ |
| ---: | ------------------------: | ------: | -------: |
|  0.0 | [0.50000 0.37500 0.12500] | 0.02756 | -0.81250 |
|  0.5 | [0.50000 0.35282 0.14718] | 0.05204 | -0.83655 |
|  1.0 | [0.50000 0.33073 0.16927] | 0.05985 | -0.86451 |
|  2.0 | [0.50000 0.28663 0.21337] | 0.07544 | -0.93214 |
|  4.0 | [0.50000 0.19816 0.30184] | 0.10671 | -1.11428 |
|  8.0 | [0.50000 0.07511 0.42489] | 0.15022 | -1.65453 |
| 16.0 | [0.50000 0.07511 0.42489] | 0.15022 | -2.85630 |

We can see that in the case $\kappa = 0$ we recover the standard optimization solution, as expected. However, we do have $z\neq0$ in that row. Since $z$ drops out of the objective when $\kappa = 0$, any feasible $z$ is optimal there, so this value most likely reflects where Gurobi happened to stop rather than numerical error in $w$.
<!-- TODO check that these values are correct, talk about subbing into the KKT conditions -->
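
As a partial check on the table, the sketch below recomputes the objective value from each tabulated $w$ and $z$, taking $f(w,z) = \frac{1}{2}w^{T}\Sigma w - \lambda w^{T}\mu - \kappa z$ with $\lambda = 0.5$ (an assumption about how the last column was produced). It also compares each $z$ with two candidate bounds: $\sqrt{w^{T}\Omega w}$ and the elementwise quantity $\min_i \sqrt{\Omega_{ii}}\,w_i$. The second comparison is exploratory and not part of the formulation above; all variable names are chosen for this sketch only.

```python
import numpy as np

sigma_tab = np.diag([1.0, 5.0, 3.0])
mu_tab = np.array([1.0, 5.0, 2.0])
omega_tab = np.diag([1.0, 4.0, 0.125])
lam_tab = 0.5  # assumed value of lambda behind the table

# rows copied from the table: (kappa, w, z, f)
rows = [
    (0.0, [0.50000, 0.37500, 0.12500], 0.02756, -0.81250),
    (0.5, [0.50000, 0.35282, 0.14718], 0.05204, -0.83655),
    (1.0, [0.50000, 0.33073, 0.16927], 0.05985, -0.86451),
    (2.0, [0.50000, 0.28663, 0.21337], 0.07544, -0.93214),
    (4.0, [0.50000, 0.19816, 0.30184], 0.10671, -1.11428),
    (8.0, [0.50000, 0.07511, 0.42489], 0.15022, -1.65453),
    (16.0, [0.50000, 0.07511, 0.42489], 0.15022, -2.85630),
]

f_gaps, z_gaps = [], []
for kappa_row, w_row, z_row, f_row in rows:
    w_row = np.array(w_row)
    # objective recomputed from the tabulated w and z
    f_calc = 0.5 * w_row @ sigma_tab @ w_row - lam_tab * w_row @ mu_tab - kappa_row * z_row
    # bound from the single quadratic form, and an elementwise alternative
    full_bound = np.sqrt(w_row @ omega_tab @ w_row)
    elementwise_bound = np.min(np.sqrt(np.diag(omega_tab)) * w_row)
    f_gaps.append(abs(f_calc - f_row))
    if kappa_row > 0:
        z_gaps.append(abs(z_row - elementwise_bound))
    print(f"kappa = {kappa_row:4.1f}: f = {f_calc:.5f}, z = {z_row:.5f}, "
          f"sqrt(w'Omega w) = {full_bound:.5f}, min_i = {elementwise_bound:.5f}")
```

The recomputed objective agrees with the last column to the printed precision. For the rows with $\kappa > 0$ the tabulated $z$ also agrees with the elementwise quantity and sits well below $\sqrt{w^{T}\Omega w}$, which may indicate that the uncertainty constraint is being applied per component rather than as a single quadratic form; the Gurobi log's report of three quadratic constraints points the same way. This is an observation from the printed numbers and would need confirming against the model itself.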




## Footnotes

1. In reality, $\Omega$ is also an estimated variable but we can ignore this for now to avoid going down a rabbit hole of uncertainties.