# CHAPTER 5 - Convex Optimisation Problems

---
---

**Author:** Dr Giordano Scarciotti (g.scarciotti@imperial.ac.uk) - Imperial College London 

**Module:** ELEC70066 - Advanced Optimisation

**Version:** 1.1.5 - 14/02/2023

---
---

The material of this chapter is adapted from $[1]$.

In this chapter we define convex and quasiconvex optimisation problems and look at several popular classes of problems. Contents:

*   Section 5.1 Convex Optimisation
*   Section 5.2 Linear Optimisation Problems
*   Section 5.3 Quadratic Optimisation Problems
*   Section 5.4 Geometric Programming
*   Section 5.5 Generalised Inequality constraints
*   Section 5.6 Vector Optimisation

It is assumed that the student is familiar with the definitions of objective function, inequality constraints, equality constraints, constrained and unconstrained optimisation problem, domain, feasible (infeasible) point and set, optimal value, local and global minimizer, level sets, active/inactive constraint, slack variables.

It is assumed that the student knows how to manipulate an optimisation problem: turn a maximisation problem into a minimisation problem, change variables, introduce or eliminate slack variables, and eliminate or introduce linear equality constraints.

If you have attended "Optimisation" you are good to go. If not, read Chapter $4.1$ of $[1]$.

# 5.1 Convex Optimisation

## 5.1.1 Definitions

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/icMMC4kOrOc"></iframe>')

A **convex optimisation problem** in standard form is described by

$$
\begin{array}{lll}
\min & f_0(x) &\\
s.t. & f_i(x) \le 0, & i = 1,\dots,m\\
& a_i^\top x = b_i,  & i = 1,\dots,p 
\end{array} \tag{1}
$$

where $f_0,\dots,f_m$ are convex functions. Thus, three requirements make an optimisation problem convex:


1.   The objective function $f_0$ must be convex
2.   The inequality constraint functions $f_1$, ..., $f_m$ must be convex
3.   The equality constraint functions must be affine

The feasible set of a convex optimisation problem is convex, as it is the intersection of the (convex) domain of the problem with $m$ convex sublevel sets and $p$ hyperplanes.

If $f_0$ is quasiconvex and the other requirements (i.e. 2. and 3.) stay the same, then we have a **quasiconvex optimisation problem**.

For both convex and quasiconvex optimisation problems, the optimal set is convex.

Two optimisation problems are called **equivalent** if from the solution of one, the solution of the other can be readily obtained, and vice versa. Equivalent problems are not considered equal.






---

**Example 5.1:** The problem 

$$
\begin{array}{lll}
\min & x_1^2 + x_2^2 &\\
s.t. & \frac{x_1}{(1+x_2^2)} \le 0, & \\
& (x_1+x_2)^2=0,  & 
\end{array}
$$

is not convex because the equality constraint is not affine and the inequality constraint is not convex. The problem is equivalent to

$$
\begin{array}{lll}
\min & x_1^2 + x_2^2 &\\
s.t. & x_1 \le 0, & \\
& x_1+x_2=0,  & 
\end{array}
$$

which is a convex optimisation problem in standard form. 

---


The example shows that our definition of convex optimisation problem is strict. If a problem does not satisfy 1., 2. and 3. then we do not call it convex, even though it is equivalent to a convex problem. Practically this is irrelevant but it helps us to keep the notation and definitions clear.

There is only one exception to the rule above: we will also call convex the problem of maximising a concave function (with 2. and 3. as above).

**A fundamental property of convex optimisation problems is that any locally optimal point is also globally optimal**. The intuition is simple. If a locally optimal point is not globally optimal, then there is another feasible point which achieves a lower value. By convexity, the objective along the segment joining the two points lies below the chord, so feasible points arbitrarily close to the local optimiser achieve a strictly lower value, which contradicts local optimality. The figure below illustrates the idea.

<div>
<img src="https://drive.google.com/uc?export=view&id=1Fdg_alDBe8U_kN_AJG2WdJ-FXYA0Deb_" width="400"/>
</div>

Figure 5.1. *The presence of two minimisers implies downward curvature. The function cannot be convex.*

### Special case: Feasibility Problems

If the objective function is identically zero, the optimal value is either zero (if the feasible set is nonempty) or infinity (if the feasible set is empty). We call this a **feasibility problem**, and it is sometimes written as

$$
\begin{array}{lll}
\text{find} & x &\\
s.t. & f_i(x) \le 0, & i = 1,\dots,m\\
& a_i^\top x = b_i,  & i = 1,\dots,p 
\end{array}
$$

In other words, a feasibility problem consists in determining whether the constraints are consistent and, if so, in finding a point that satisfies them.

### Special case: Quasiconvex Optimisation

The most important difference between convex and quasiconvex optimization is that a quasiconvex optimization problem can have locally optimal solutions that are not globally optimal. See for instance the figure below for an example.

<div>
<img src="https://drive.google.com/uc?export=view&id=1jOTKRRLwemYB6iPoyGyLgt_Gi0eNq7uH" width="400"/>
</div>

Figure 5.2. *A quasiconvex function $f$ with a locally optimal point $x$
that is not globally optimal. This example shows that the global optimality
condition $\nabla f(x) = 0$, valid for convex functions, does not hold for quasiconvex functions.*

As we have hinted in [Chapter 4](https://colab.research.google.com/drive/1w7WGrNK-G-CT7g_EqWAFTbO_Ej7okEF2?usp=sharing), a quasiconvex optimisation problem can be solved by using a representation of the sublevel sets of a quasiconvex function via a family of convex inequalities. Let $\phi_t$ be a family of convex functions parametrised by $t$ such that $f_0(x) \le t \iff \phi_t(x)\le 0$ for each $x$, with $\phi_t(x)$ nonincreasing in $t$. Let $p^*$ be the optimal value of the quasiconvex optimisation problem. If the convex feasibility problem

$$
\begin{array}{lll}
\text{find} & x &\\
s.t. & \phi_t(x) \le 0 & \\
& f_i(x) \le 0, & i = 1,\dots,m\\
& a_i^\top x = b_i,  & i = 1,\dots,p 
\end{array}\tag{2}
$$

is feasible, then we have $p^* \le t$ and any $x$ found satisfies $f_0(x) \le t$, whereas if the problem is infeasible then we have $p^* \ge t$.

Thus quasiconvex optimisation problems can be solved using a bisection algorithm:

$$
\begin{array}{l}
\textbf{given } l\le p^*,\, u\ge p^*, \text{ tolerance } \varepsilon>0\\
\textbf{repeat}\\
\,\,\,1.\,\,t:=(l+u)/2\\
\,\,\,2.\,\,\text{Solve the convex feasibility problem (2)}\\
\,\,\,3.\,\,\textbf{if }\text{(2) is feasible, }u:=t; \quad \textbf{else }l:=t\\
\textbf{until }u-l \le \varepsilon
\end{array}
$$

The algorithm terminates after exactly $\lceil\log_2((u-l)/\varepsilon)\rceil$ iterations because at each iteration the interval is halved, so the length of the interval after $k$ iterations is $2^{-k}(u - l)$, where $u - l$ is the length of the initial interval.
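The bisection scheme above can be sketched on a toy one-dimensional instance. Everything in this example is an assumption chosen for illustration and is not taken from $[1]$: the objective $f_0(x)=\sqrt{|x-1|}$, the feasible interval $[2,5]$, the representation $\phi_t(x)=|x-1|-t^2$ (valid for $t\ge 0$) and the function names. The feasibility problem $(2)$ then reduces to intersecting two intervals.

```python
import math

def feasible_point(t, lo=2.0, hi=5.0):
    """Feasibility oracle for problem (2) on a 1-D toy instance.

    f0(x) = sqrt(|x - 1|) is quasiconvex and, for t >= 0, its t-sublevel set
    is {x : phi_t(x) <= 0} with phi_t(x) = |x - 1| - t**2, which is convex
    in x and nonincreasing in t. The constraint set is the interval [lo, hi].
    Returns a feasible x, or None if the intersection is empty.
    """
    left = max(lo, 1.0 - t * t)
    right = min(hi, 1.0 + t * t)
    return left if left <= right else None

def bisection(l, u, eps):
    """Bisection on the optimal value; assumes l <= p* <= u."""
    x_best, iterations = None, 0
    while u - l > eps:
        t = 0.5 * (l + u)
        x = feasible_point(t)
        if x is not None:      # (2) is feasible, hence p* <= t
            u, x_best = t, x
        else:                  # (2) is infeasible, hence p* >= t
            l = t
        iterations += 1
    return l, u, x_best, iterations

# f0 ranges over [1, 2] on the feasible interval, so l = 0 and u = 2 are valid.
l, u, x_best, iterations = bisection(0.0, 2.0, 1e-6)
print(l, u, x_best, iterations)
```

In a realistic problem the oracle would call a convex solver; only the outer loop is generic. The iteration count can be compared with $\lceil\log_2((u-l)/\varepsilon)\rceil$ computed from the initial interval.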

## 5.1.2 A Simple Optimality Criterion for Differentiable $f_0$

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/EdbSQPi1MNE"></iframe>')

**N.B.** In the following and in the rest of the module we assume that the functions $f_0$, $f_1$, ..., $f_m$ are differentiable for simplicity. All results (including the general KKT conditions given in the next chapter) hold for non-differentiable functions by replacing derivatives with [subderivatives](https://en.wikipedia.org/wiki/Subderivative).

Consider the convex optimisation problem $(1)$. Suppose $f_0$ is differentiable. Then by convexity we know that for all $x,y\in\textbf{dom }f_0$

$$
f_0(y)\ge f_0(x) + \nabla f_0(x)^\top (y-x) \tag{3}
$$

Let $X$ be the feasible set, i.e. $X = \{x : f_i(x) \le 0,\, i=1,\dots,m,\, a_i^\top x = b_i,\,   i = 1,\dots,p \}$. Then $x$ is optimal if and only if $x\in X$ and 

$$
\nabla f_0(x)^\top (y-x) \ge 0 \tag{4}
$$

for all $y\in X$. Geometrically this means that if $\nabla f_0(x) \ne 0$ then $-\nabla f_0(x)$ defines a supporting hyperplane to the feasible set at $x$, see the figure below.


<div>
<img src="https://drive.google.com/uc?export=view&id=1CfyzS_ox9qwRH-eRUpFKnzu2loWvHtMd" width="400"/>
</div>

Figure 5.3. *The feasible set $X$ is shown shaded. Some level curves of $f_0$ are shown as dashed lines. The point $x$ is optimal: $-\nabla f_0(x)$ defines a supporting hyperplane to $X$ at $x$.*

The proof is simple. Suppose $x\in X$ satisfies $(4)$ for all $y \in X$. Then by $(3)$ it follows that $f_0(y) \ge f_0(x)$ for all $y \in X$, i.e. $x$ is optimal. Conversely, suppose $x$ is optimal but $(4)$ does not hold, i.e. for some $y\in X$ we have $\nabla f_0(x)^\top (y-x) < 0$. Consider the point $z(t) = ty + (1-t) x$ where $t \in [0,1]$ is a parameter. Since the feasible set is convex, $z(t)$ is feasible because it lies on the line segment between two feasible points. Note that 

$$
\left.\frac{d}{dt}f_0(z(t))\right|_{t=0} = \nabla f_0(x)^\top (y-x) < 0
$$

so for small positive $t$ we have $f_0(z(t))< f_0(x)$ which contradicts the hypothesis.
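Condition $(4)$ can be checked numerically on a small instance. The sketch below uses a hypothetical problem chosen only for illustration: minimise $f_0(x) = (x_1-2)^2 + (x_2-1)^2$ over the box $X=[0,1]^2$, whose minimiser is $(1,1)$. Testing $(4)$ on a finite grid of feasible points $y$ is only a necessary check, not a proof of optimality.

```python
def grad_f0(x):
    """Gradient of f0(x) = (x1 - 2)^2 + (x2 - 1)^2."""
    return (2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0))

def satisfies_condition_4(x, samples=21):
    """Check grad f0(x)^T (y - x) >= 0 on a grid of feasible y in [0, 1]^2."""
    g = grad_f0(x)
    grid = [i / (samples - 1) for i in range(samples)]
    return all(
        g[0] * (y1 - x[0]) + g[1] * (y2 - x[1]) >= -1e-12
        for y1 in grid for y2 in grid
    )

x_opt = (1.0, 1.0)      # closest point of the box to (2, 1)
x_other = (0.5, 0.5)    # feasible but not optimal
print(satisfies_condition_4(x_opt), satisfies_condition_4(x_other))
```

At `x_opt` the gradient is $(-2, 0)$ and every feasible $y$ has $y_1 \le 1$, so the inner product is nonnegative. At `x_other` the direction towards $y=(1,1)$ is a descent direction and the check fails.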

The optimality conditions for convex optimisation problems will be studied in detail in subsequent chapters, but for now let us consider some special cases.

### Unconstrained problems

For an unconstrained problem condition $(4)$ reduces to the well-known necessary and sufficient condition

$$
\nabla f_0(x) =0 \tag{5}
$$

for $x$ to be optimal. In fact, let $x$ be optimal, which means that $(4)$ holds for all feasible $y$. Since $f_0$ is differentiable, its domain is [open](https://math.stackexchange.com/a/268965) and all $y$ sufficiently close to $x$ are feasible. Select $y = x -t \nabla f_0(x)$ where $t > 0$ is a parameter. For $t$ small enough, $y$ is feasible and so

$$
\nabla f_0(x)^\top (y-x) = -t ||\nabla f_0(x)||_2^2 \ge 0
$$

Since $t>0$, this forces $||\nabla f_0(x)||_2 = 0$, from which $(5)$ follows. 

If $(5)$ has no solutions, then there are no optimal points: the problem is either unbounded below or its optimal value is finite but not attained. On the other hand, $(5)$ may have one or multiple solutions, and each of them is optimal.



---

**Example 5.2:** Consider the problem of minimising $f_0(x) = \frac{1}{2}x^\top P x + q^\top x + r$ where $P\in \mathbb{S}^n_+$. $(5)$ is $\nabla f_0(x) = P x + q =0$. Then we can have the following cases:

*   If $P\succ 0$ (this also implies that $P$ is not singular), then there is a unique minimiser $x^*= - P^{-1}q$.
*   If $q \not \in \mathcal{R}(P)$ (this also implies that $P$ is singular), then there is no optimal solution. In this case $f_0$ is unbounded below.
*   If $P$ is singular but $q \in \mathcal{R}(P)$ then the set of optimal points is the affine set $x^* = - P^{\dagger} q + \mathcal{N}(P)$, where $P^{\dagger}$ denotes the pseudo-inverse of $P$. So there are multiple optimal solutions.
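
As a quick illustration of the last two cases (the numbers are chosen here only for illustration), take $n=2$ and

$$
P = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}, \qquad f_0(x) = \tfrac{1}{2}x_1^2 + q_1 x_1 + q_2 x_2 + r,
$$

so that $\mathcal{R}(P)=\textbf{span}\{e_1\}$, $\mathcal{N}(P)=\textbf{span}\{e_2\}$ and $P^\dagger = P$. If $q=(-1,0)\in\mathcal{R}(P)$, then $Px+q=0$ gives $x_1=1$ with $x_2$ free: the optimal set is $\{(1,s) : s\in\mathbb{R}\} = -P^\dagger q + \mathcal{N}(P)$ and the optimal value is $r-\frac{1}{2}$. If instead $q=(0,1)\not\in\mathcal{R}(P)$, then $f_0(0,x_2) = x_2 + r \to -\infty$ as $x_2\to-\infty$, so the problem is unbounded below.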

---

### Problems with equality constraints only

Consider now the case with just equality constraints

$$
\begin{array}{ll}
\min & f_0(x)\\
s.t. & A x = b.  
\end{array}
$$

Here the feasible set is affine. We assume that the problem is feasible, so there is a feasible $x$. Then every feasible $y$ can be written as $y = x + z$ for some $z \in \mathcal{N}(A)$, where $\mathcal{N}(A)$ indicates the nullspace of $A$ (because $Ay = A(x+z)=b+0$). Thus $(4)$ can be rewritten as 

$$
\nabla f_0(x)^\top z \ge 0 \tag{6}
$$

for all $z\in \mathcal{N}(A)$. But since $\mathcal{N}(A)$ is a subspace, also $-z\in\mathcal{N}(A)$, which implies that $(6)$ holds with equality, i.e. 

$$
\nabla f_0(x)^\top z = 0 
$$

for all $z\in \mathcal{N}(A)$. This means that $\nabla f_0(x)$ is perpendicular to $\mathcal{N}(A)$, which in turn implies that $\nabla f_0(x)$ must belong to the range of $A$ transpose. Thus the condition becomes

$$
\nabla f_0(x) = A^\top (-v) 
$$

for some $v \in \mathbb{R}^p$ (the minus is irrelevant, we added it for notational purposes). Thus we have obtained the classical Lagrange multiplier condition, i.e. $x\in\textbf{dom }f_0$ is optimal if and only if there exists a $v$ such that

$$
Ax = b \qquad \nabla f_0(x) + A^\top v =0. \tag{7}
$$
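
Condition $(7)$ can be solved by hand when $f_0$ is a simple quadratic. The sketch below treats the hypothetical instance $\min \frac{1}{2}||x||_2^2$ subject to a single equality $a^\top x = b$ (so $p=1$); the data and the function name are assumptions made for illustration.

```python
def min_norm_on_hyperplane(a, b):
    """Solve min 0.5*||x||^2 s.t. a^T x = b using condition (7).

    Here grad f0(x) = x, so (7) reads x + a*v = 0 together with a^T x = b,
    which gives v = -b / (a^T a) and x = -a*v.
    """
    aa = sum(ai * ai for ai in a)
    v = -b / aa
    x = [-ai * v for ai in a]
    return x, v

a, b = [1.0, 2.0, 2.0], 9.0
x, v = min_norm_on_hyperplane(a, b)
residual_eq = sum(ai * xi for ai, xi in zip(a, x)) - b      # A x - b
residual_grad = [xi + ai * v for ai, xi in zip(a, x)]       # grad f0(x) + A^T v
print(x, v, residual_eq, residual_grad)
```

Both residuals vanish, so the pair $(x, v)$ satisfies $(7)$ and $x$ is optimal for this instance.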

### Minimization over the nonnegative orthant

Consider now the case where the only constraint is the nonnegativity of the variables, i.e.

$$
\begin{array}{ll}
\min & f_0(x)\\
s.t. & x \succcurlyeq 0.  
\end{array}
$$

The optimality conditions then are $x \succcurlyeq 0$ with $(4)$ for all $y\succcurlyeq 0$. Note that if some component of $\nabla f_0(x)$ is negative, then $\nabla f_0(x)^\top y$ is unbounded below on $y \succcurlyeq 0$, so a large enough $y$ makes the test fail. Thus $\nabla f_0(x) \succcurlyeq 0$. Under this condition the term $\nabla f_0(x)^\top y$ is smallest at $y=0$, so it remains to check $(4)$ there, i.e. $-\nabla f_0(x)^\top x \ge 0$. But since $x \succcurlyeq 0$ and $\nabla f_0(x) \succcurlyeq 0$ we also have $\nabla f_0(x)^\top x \ge 0$, and it follows that $\nabla f_0(x)^\top x = 0$. Since this is the sum of terms which are the product of two nonnegative numbers, it follows that in each term of the sum one of the two numbers must be zero for the equality to hold. Thus the condition reduces to $x_i (\nabla f_0(x))_i =0$ for $i=1,\dots, n$. In summary, the optimality conditions in this case are

$$
x \succcurlyeq 0 \qquad \nabla f_0(x) \succcurlyeq 0 \qquad x_i (\nabla f_0(x))_i =0 \quad i=1,\dots, n. \tag{8}
$$

The last condition is called *complementarity* since it implies that the sparsity patterns of the vectors $x$ and $\nabla f_0(x)$ are complementary.
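
The conditions $(8)$ are easy to verify on a small example. The sketch below assumes the hypothetical objective $f_0(x)=\frac{1}{2}||x-c||_2^2$, for which the minimiser over the nonnegative orthant is the componentwise maximum of $c$ and $0$; the data and helper names are illustrative only.

```python
def solve_projection(c):
    """Minimise 0.5*||x - c||^2 over x >= 0; the minimiser is max(c, 0)."""
    return [max(ci, 0.0) for ci in c]

def check_conditions_8(x, grad, tol=1e-12):
    """Return True if x >= 0, grad >= 0 and x_i * grad_i = 0 for every i."""
    return all(
        xi >= -tol and gi >= -tol and abs(xi * gi) <= tol
        for xi, gi in zip(x, grad)
    )

c = [1.0, -2.0, 0.5]
x = solve_projection(c)
grad = [xi - ci for xi, ci in zip(x, c)]    # grad f0(x) = x - c
print(x, grad, check_conditions_8(x, grad))
```

The nonzero entries of `x` and of `grad` sit at different indices, which is exactly the complementary sparsity pattern described above.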




A combination of $(7)$ and $(8)$ forms the conditions known as the Karush-Kuhn-Tucker optimality conditions, which you have already seen in the Autumn module "Optimisation" and that we will analyse in more detail in the next chapter (in the context of duality). 

## 5.1.3 Equivalent Convex Problems

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/vQVUUFwADfo"></iframe>')

It is useful to know which standard transformations preserve convexity.

### Eliminating equality constraints

The convex problem

$$
\begin{array}{lll}
\displaystyle \min_x & f_0(x) &\\
s.t. & f_i(x) \le 0, & i = 1,\dots,m\\
& A x = b,  & 
\end{array}
$$

is equivalent to the convex problem

$$
\begin{array}{lll}
\displaystyle \min_z & f_0(Fz+x_0) &\\
s.t. & f_i(Fz+x_0) \le 0, & i = 1,\dots,m 
\end{array}
$$

where $F$ and $x_0$ are such that $Ax = b \iff x = Fz + x_0$ for some $z$, i.e. $x_0$ is a particular solution of $Ax = b$ and the range of $F$ is the nullspace of $A$.
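
A minimal sketch of the elimination, on the hypothetical problem $\min x_1^2 + x_2^2$ subject to $x_1 + x_2 = 1$ (so $A = [1\;\,1]$ and $b=1$). The choices of $x_0$ and $F$ below are one valid parametrisation among many.

```python
def eliminate_and_solve():
    """min x1^2 + x2^2  s.t.  x1 + x2 = 1, via the substitution x = F z + x0."""
    x0 = (1.0, 0.0)     # particular solution of A x = b
    F = (1.0, -1.0)     # spans the nullspace of A = [1, 1]
    # g(z) = ||x0 + F z||^2 is a convex quadratic in z; setting g'(z) = 0
    # gives z = -(F^T x0) / (F^T F).
    z = -(F[0] * x0[0] + F[1] * x0[1]) / (F[0] ** 2 + F[1] ** 2)
    x = (x0[0] + F[0] * z, x0[1] + F[1] * z)
    return z, x

z, x = eliminate_and_solve()
print(z, x)
```

The reduced problem is unconstrained in $z$, and the recovered $x$ automatically satisfies $Ax=b$.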

While intuition suggests that eliminating equality constraints is advantageous, in many cases it is better to retain the equality constraints. In fact, eliminating them can make the problem harder to understand and analyze, or ruin the efficiency of an algorithm that solves it. For instance, eliminating the equality constraints could destroy sparsity or some other useful structure of the problem.

### Introducing equality constraints

We can introduce new variables and equality constraints into a convex optimization problem, provided the equality constraints are linear. The convex problem 

$$
\begin{array}{lll}
\displaystyle \min_x & f_0(A_0 x + b_0) &\\
s.t. & f_i(A_i x + b_i) \le 0, & i = 1,\dots,m 
\end{array}
$$

is equivalent to the convex problem

$$
\begin{array}{lll}
\displaystyle \min_{x,y_i} & f_0(y_0) &\\
s.t. & f_i(y_i) \le 0, & i = 1,\dots,m\\
& y_i = A_i x + b_i,  & i = 0,\dots, m 
\end{array}
$$

### Introducing slack variables for linear inequalities

If an inequality is linear, then the introduction of a slack variable preserves convexity. The convex problem

$$
\begin{array}{lll}
\min & f_0(x) &\\
s.t. & a_i^\top x \le b_i, & i = 1,\dots,m 
\end{array}
$$

is equivalent to the convex problem

$$
\begin{array}{lll}
\min & f_0(x) &\\
s.t. & a_i^\top x + s_i = b_i, & i = 1,\dots,m\\
& s_i \ge 0,  & i = 1,\dots,m 
\end{array}
$$

### Epigraph form

The epigraph form of the convex optimisation problem $(1)$ is 

$$
\begin{array}{lll}
\displaystyle \min_{x,t} & t &\\
s.t. & f_0(x)-t \le 0  \\
& f_i(x) \le 0, & i = 1,\dots,m\\
& a_i^\top x = b_i,  & i = 1,\dots,p 
\end{array} 
$$

and is a convex problem. The epigraph form can be interpreted geometrically as an optimization
problem in the "graph space" $(x, t)$: we minimize $t$ over the epigraph of $f_0$, subject to the constraints on $x$. Since the objective is linear, it is sometimes said that a linear objective is universal for convex optimisation, because any convex optimisation problem can be readily transformed into epigraph form. This has practical consequences, as in the end we just need optimisation algorithms that can handle linear objectives.

### Minimising over some variables

The convex problem 

$$
\begin{array}{lll}
\displaystyle \min_{x_1,x_2} & f_0(x_1,x_2) &\\
s.t. & f_i(x_1) \le 0, & i = 1,\dots,m 
\end{array}
$$

is equivalent to the convex problem

$$
\begin{array}{lll}
\displaystyle \min_{x_1} & \tilde{f}_0(x_1) &\\
s.t. & f_i(x_1) \le 0, & i = 1,\dots,m 
\end{array}
$$

where $\tilde{f}_0(x_1)= \inf_{x_2} f_0(x_1,x_2)$.

# 5.2 Linear Optimisation Problems

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/7D90ElXh6Ys"></iframe>')

## 5.2.1 Linear Programming

When the objective and constraint functions are all affine, the problem is called a **linear program** (LP), namely

$$
\begin{array}{ll}
\min & c^\top x + d \\
s.t. & G x \preccurlyeq h\\
& Ax = b 
\end{array}\tag{9}
$$

where $G \in \mathbb{R}^{m\times n}$ and $A \in \mathbb{R}^{p\times n}$. Note that if in $(9)$ we replace "min" with "max" and keep the rest identical, we still have a convex problem because the affine objective is also concave. The figure below shows a geometric interpretation of problem $(9)$.

<div>
<img src="https://drive.google.com/uc?export=view&id=1g6dS31JyXUIXZtkMRtEOxLXKevC6xGf6" width="400"/>
</div>

Figure 5.4. *Geometric interpretation of an LP. The feasible set $P$, which is a polyhedron, is shaded. The objective $c^\top x$ is linear, so its level curves are hyperplanes orthogonal to $c$ (shown as dashed lines). The point $x^*$ is optimal; it is the point in $P$ as far as possible in the direction $-c$.*




---

**Example 5.3:** (Diet problem) Select nonnegative quantities $x_1$, ..., $x_n$ of $n$ different foods to make a healthy diet which contains $m$ different nutrients in quantities of at least $b_1$, ..., $b_m$. One unit of food $j$ contains an amount $a_{ij}$ of nutrient $i$ and has cost $c_j$. We want to find the cheapest healthy diet, i.e.

$$
\begin{array}{ll}
\min & c^\top x \\
s.t. & A x \succcurlyeq b\\
& x  \succcurlyeq 0 
\end{array}
$$

**Example 5.4:** (Piecewise-linear minimisation) Consider the problem of minimising the piecewise-linear convex function

$$
\displaystyle f_0(x)=\max_{i=1,\dots,m} (a_i^\top x + b_i)
$$

This can be transformed into an equivalent LP in two steps. First we write the epigraph form and then we express the inequality $f_0(x) \le t$ as a set of $m$ separate inequalities, i.e.

$$
\begin{array}{lll}
\displaystyle\min_{x,t} & t & \\
s.t. & a_i^\top x + b_i\le t, &i=1,\dots,m.
\end{array}
$$
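
For a scalar variable $x$ the LP above lives in the $(x,t)$ plane and is small enough to solve by enumerating vertices. The sketch below is not a general LP solver: it assumes scalar $x$ and slopes $a_i$ of both signs (so that the LP is bounded), and the data are made up for illustration.

```python
from itertools import combinations

def minimise_max_affine_1d(a, b, tol=1e-9):
    """Solve min_{x,t} t  s.t.  a_i*x + b_i <= t  by enumerating vertices.

    Every vertex of the feasible set lies on two of the lines
    t = a_i*x + b_i, so we intersect each pair of lines and keep the
    feasible intersection with the smallest t.
    """
    best = None
    for i, j in combinations(range(len(a)), 2):
        if a[i] == a[j]:
            continue                        # parallel lines: no vertex
        x = (b[j] - b[i]) / (a[i] - a[j])
        t = a[i] * x + b[i]
        if all(ak * x + bk <= t + tol for ak, bk in zip(a, b)):
            if best is None or t < best[1]:
                best = (x, t)
    return best

a = [-1.0, 0.5, 2.0]
b = [0.0, 1.0, -3.0]
x_star, t_star = minimise_max_affine_1d(a, b)
print(x_star, t_star)
```

At the returned point, `t_star` coincides with $\max_i (a_i x + b_i)$, which confirms that the epigraph variable is tight at the optimum.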

---

## 5.2.2 Linear-fractional Programming


The problem of minimizing a ratio of affine functions over a polyhedron is called a **linear-fractional program**

$$
\begin{array}{ll}
\min & \displaystyle\frac{c^\top x + d}{e^\top x + f} \\
s.t. & G x \preccurlyeq h\\
& Ax = b 
\end{array}
$$

The objective function is quasiconvex, so this is a quasiconvex optimisation problem. This problem can be solved with the bisection algorithm. However, if the feasible set is nonempty then it is possible to show that this problem can also be transformed into the linear program

$$
\begin{array}{ll}
\displaystyle \min_{y,z} & c^\top y + dz \\
s.t. & G y - hz \preccurlyeq 0\\
& Ay - bz = 0\\
& e^\top y + fz = 1\\
& z \ge 0 
\end{array}
$$

which is then solvable in one shot. We skip the proof of the equivalence between the two problems.
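
Although the full proof is omitted, the change of variables behind the transformation is short enough to record. This is a sketch of one direction only, following the standard argument in $[1]$ and assuming the domain of the objective is $\{x : e^\top x + f > 0\}$. For any feasible $x$ define

$$
y = \frac{x}{e^\top x + f}, \qquad z = \frac{1}{e^\top x + f}.
$$

Then $z > 0$ and $e^\top y + f z = 1$ by construction, while multiplying $Gx \preccurlyeq h$ and $Ax = b$ by $z$ gives $Gy - hz \preccurlyeq 0$ and $Ay - bz = 0$. The objective values coincide, since

$$
c^\top y + d z = \frac{c^\top x + d}{e^\top x + f}.
$$

Conversely, a feasible pair $(y,z)$ of the LP with $z>0$ gives back the feasible point $x = y/z$ with the same objective value. The case $z = 0$ requires a separate limiting argument, which is not reproduced here.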


It is also possible to consider a generalisation of the problem in which the objective function is the pointwise maximum of $r$ ratios of affine functions. This is called a generalised linear-fractional problem and is a quasiconvex problem. In this case, there is no one-shot equivalent and the problem must be solved using bisection. 

---

**Example 5.5:** (*Von Neumann growth problem*) We consider an economy with $n$ sectors, and activity levels $x_i > 0$ in the current period, and activity levels $x_i^+ > 0$ in the next period. An activity level $x$ consumes
goods $Bx \in \mathbb{R}^m$, and produces goods $Ax$. The goods consumed in the next period
cannot exceed the goods produced in the current period, i.e., $Bx^+ \preccurlyeq Ax$. The growth rate in sector $i$, over the period, is given by $x_i^+/ x_i$. We look for an activity level vector $x$ that maximizes the minimum growth rate across all sectors of the economy. This problem can be expressed as a generalized linear-fractional problem

$$
\begin{array}{ll}
\displaystyle \max_{x,x^+} & \displaystyle \min_{i=1,\dots,n}\frac{x_i^+}{x_i} \\
s.t. & x^+ \succcurlyeq 0\\
& Bx^+ \preccurlyeq Ax 
\end{array}
$$

---

# 5.3 Quadratic Optimisation Problems

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/g5VAlib3qow"></iframe>')

## 5.3.1 Quadratic Programming

When the objective function is convex quadratic and the constraint functions are affine we have a **quadratic program** (QP) that can be expressed as

$$
\begin{array}{ll}
\min & \frac{1}{2}x^\top P x + q^\top x + r \\
s.t. & G x \preccurlyeq h\\
& Ax = b 
\end{array}
$$

where $P\in\mathbb{S}_+^n$, $G \in \mathbb{R}^{m\times n}$ and $A \in \mathbb{R}^{p\times n}$. The figure below shows a geometric interpretation of the problem.




<div>
<img src="https://drive.google.com/uc?export=view&id=1nD1U-i44JjKAyBE-5Rxoa87Lq1_ag1O6" width="400"/>
</div>

Figure 5.5. *Geometric interpretation of a QP. The feasible set $P$, which is a polyhedron, is shown shaded. The level lines of the objective function, which
is convex quadratic, are shown as dashed curves. The point $x^*$ is optimal.*

If also the inequality constraints are convex quadratic functions then we have a **quadratically constrained quadratic program** (QCQP), namely

$$
\begin{array}{ll}
\min & \frac{1}{2}x^\top P_0 x + q_0^\top x + r_0 \\
s.t. & \frac{1}{2}x^\top P_i x + q_i^\top x + r_i \le 0 \qquad i=1,\dots,m\\
& Ax = b 
\end{array}
$$

where $P_i\in\mathbb{S}_+^n$, $i=0,\dots,m$.

Note that LP $\subseteq$ QP $\subseteq$ QCQP.

---

**Example 5.6:** The least-squares approximation problem seen in Chapter 1 is a QP.

**Example 5.7:** The least-norm problem seen in Chapter 1 is a QP.

**Example 5.8:** The Euclidean distance between the polyhedra $\mathcal{P}_1=\{x:A_1x \preccurlyeq b_1\}$ and $\mathcal{P}_2=\{x:A_2x \preccurlyeq b_2\}$ is defined as $\textbf{dist}(\mathcal{P}_1,\mathcal{P}_2)= \inf \{||x_1-x_2||_2 : x_1 \in \mathcal{P}_1, x_2 \in \mathcal{P}_2\}$. We can find the distance between the two polyhedra by solving the QP

$$
\begin{array}{ll}
\displaystyle \min_{x_1,x_2} & ||x_1-x_2||_2^2 \\
s.t. & A_1 x_1 \preccurlyeq b_1 \\
& A_2 x_2 \preccurlyeq b_2
\end{array}
$$

This problem is infeasible if and only if one of the polyhedra is empty. The optimal value is zero if and only if the polyhedra intersect. Otherwise the optimal $x_1$ and $x_2$ are the points in $\mathcal{P}_1$ and $\mathcal{P}_2$, respectively, that are closest to each other.

**Example 5.9:** Consider an LP in which the cost function is $c^\top x$. Assume that $c$ is a random vector with mean $\bar c$ and covariance $\Sigma$. Then $\mathbf{E}(c^\top x)= \bar{c}^\top x$ and $\textbf{var}(c^\top x)= x^\top \Sigma x$. In general there is a trade-off between small expected cost and small cost variance. This trade-off can be expressed with the cost function $\mathbf{E}(c^\top x)+\gamma \textbf{var}(c^\top x)$ where $\gamma\ge 0$ is called the *risk aversion parameter*. In summary, an LP with random cost can be formulated as the QP 

$$
\begin{array}{ll}
\min & \bar{c}^\top x + \gamma x^\top \Sigma x\\
s.t. & G x \preccurlyeq h\\
& Ax = b 
\end{array}
$$

---

## 5.3.2 Second-order Cone Programming

A problem that is closely related to quadratic programming is the **second-order cone program** (SOCP):

$$
\begin{array}{ll}
\min & f^\top x\\
s.t. & ||A_i x + b_i||_2 \le c_i^\top x + d_i \qquad i=1,...,m\\
& Fx = g 
\end{array}
$$

The constraint in the problem above is called a *second-order cone constraint*, since it is the same as requiring that the affine function $(A_i x + b_i, c_i^\top x + d_i)$ lies in the second-order cone in $\mathbb{R}^{n_i+1}$, where $A_i\in\mathbb{R}^{n_i\times n}$. We see that if all $c_i$'s are zero then the SOCP reduces, once the constraints are squared, to a QCQP. We also see that if the $A_i$'s are all zero then the SOCP reduces to an LP. SOCPs are, however, more general than QCQPs.


SOCPs can be used to formulate robust linear programs. Consider the LP

$$
\begin{array}{ll}
\min & c^\top x \\
s.t. & a_i^\top x \le b_i, \qquad i=1,\dots,m
\end{array}
$$

and assume that there is uncertainty on the parameters $a_i$. 

In one approach, we can assume that the $a_i$'s are known to lie in a given ellipsoid

$$
a_i\in\mathcal{E}_i=\{\bar{a}_i+P_i u : ||u||_2\le 1\}
$$

where $\bar{a}_i$ is the center of the ellipsoid and $P_i\in \mathbb{R}^{n\times n}$. If $P_i$ is singular, it means that some values of $a_i$ are known perfectly. Then the robust linear constraint becomes $a_i^\top x \le b_i$ for all $a_i\in\mathcal{E}_i$, which can be expressed as

$$
\sup\{a_i^\top x : a_i\in\mathcal{E}_i\} = \bar{a}_i^\top x + \sup\{u^\top P_i^\top x: ||u||_2\le 1\} = \bar{a}_i^\top x + ||P_i^\top x||_2 \le b_i
$$
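
The closed form of the supremum can be sanity-checked numerically in two dimensions. The sketch below uses made-up data $\bar a$, $P$, $x$ and hypothetical helper names; it samples $u$ on the unit circle only, which is enough because a linear function attains its maximum over the unit ball on the boundary.

```python
import math

def worst_case_lhs(a_bar, P, x):
    """Closed form a_bar^T x + ||P^T x||_2 of sup{a^T x : a in the ellipsoid}."""
    n = len(x)
    Ptx = [sum(P[i][j] * x[i] for i in range(n)) for j in range(n)]
    return sum(ai * xi for ai, xi in zip(a_bar, x)) + math.hypot(*Ptx)

def sampled_sup(a_bar, P, x, samples=3600):
    """Brute-force the supremum over u on the unit circle (2-D only)."""
    best = -math.inf
    for k in range(samples):
        th = 2.0 * math.pi * k / samples
        u = (math.cos(th), math.sin(th))
        a = [a_bar[i] + P[i][0] * u[0] + P[i][1] * u[1] for i in range(2)]
        best = max(best, a[0] * x[0] + a[1] * x[1])
    return best

a_bar, P, x = [1.0, 2.0], [[0.5, 0.1], [0.0, 0.3]], [2.0, -1.0]
exact = worst_case_lhs(a_bar, P, x)
approx = sampled_sup(a_bar, P, x)
print(exact, approx)
```

The sampled value never exceeds the closed form and approaches it as the number of samples grows.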

In summary, we have the following SOCP

$$
\begin{array}{ll}
\min & c^\top x \\
s.t. & \bar{a}_i^\top x + ||P_i^\top x||_2 \le b_i, \qquad i=1,\dots,m
\end{array}
$$

Note that the additional norm terms act as regularization terms; they prevent $x$ from being large in directions with considerable uncertainty in the parameters $a_i$.

In the other approach we can use a statistical framework. In this case we assume that the $a_i$ are independent Gaussian random vectors with mean $\bar{a}_i$ and covariance $\Sigma_i$. We require that the constraints hold with probability exceeding $\eta\ge 0.5$, i.e.

$$
\textbf{prob}(a_i^\top x \le b_i) \ge \eta
$$

This can be formulated as an SOCP exploiting the relation

$$
\textbf{prob}(a_i^\top x \le b_i) = \Phi \left(\frac{b_i - \bar{a}_i^\top x}{||\Sigma_i^{1/2}x||_2} \right)
$$

where $\Phi(x) = (1/\sqrt{2\pi})\int_{-\infty}^x e^{-t^2/2}dt$ is the cumulative distribution function of a zero mean unit variance Gaussian random variable. In summary, the problem can be formulated with the SOCP

$$
\begin{array}{ll}
\min & c^\top x \\
s.t. & \bar{a}_i^\top x + \Phi^{-1}(\eta) ||\Sigma^{1/2}_i x||_2 \le b_i, \qquad i=1,\dots,m
\end{array}
$$
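
The equivalence between the probabilistic constraint and its second-order cone form can be checked with the standard library. The sketch below assumes a diagonal covariance and $x \ne 0$, and all numerical values are illustrative; `NormalDist` from the `statistics` module (Python 3.8+) supplies $\Phi$ and $\Phi^{-1}$.

```python
import math
from statistics import NormalDist

def chance_constraint_check(a_bar, sigma_diag, b, x, eta):
    """Compare prob(a^T x <= b) >= eta with its second-order cone form.

    Assumes a ~ N(a_bar, diag(sigma_diag)), so that
    ||Sigma^{1/2} x||_2 = sqrt(sum_i sigma_i * x_i^2), and x != 0.
    """
    std = NormalDist()                      # zero mean, unit variance
    mean = sum(ai * xi for ai, xi in zip(a_bar, x))
    dev = math.sqrt(sum(s * xi * xi for s, xi in zip(sigma_diag, x)))
    prob = std.cdf((b - mean) / dev)
    soc_holds = mean + std.inv_cdf(eta) * dev <= b
    return prob, soc_holds

prob_ok, soc_ok = chance_constraint_check(
    [1.0, 1.0], [0.04, 0.09], 3.0, [1.0, 1.0], 0.95)
prob_bad, soc_bad = chance_constraint_check(
    [1.0, 1.0], [0.04, 0.09], 2.2, [1.0, 1.0], 0.95)
print(prob_ok, soc_ok, prob_bad, soc_bad)
```

In both cases the cone constraint holds exactly when the computed probability is at least $\eta$. The requirement $\eta \ge 0.5$ makes $\Phi^{-1}(\eta)\ge 0$, which is what keeps the constraint convex.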


# 5.4 Geometric Programming

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/2U94CIfxYg0"></iframe>')

We now consider a family of optimisation problems that are not convex in their natural form but that can be transformed into convex optimisation problems by a change of variables and a transformation of the objective and constraint functions.

A function $f:\mathbb{R}^n\to \mathbb{R}$ with $\textbf{dom }f=\mathbb{R}^n_{++}$ defined as

$$
f(x) = c\, x_1^{a_1}x_2^{a_2}\cdots x_n^{a_n}
$$

where $c>0$ and $a_i\in\mathbb{R}$ is called a **monomial**. The exponents $a_i$ of a monomial can be any real numbers, including fractional or negative, but the coefficient $c$ can only be positive. Note that this definition is more general than the one usually given in algebra. A sum of monomials of the form

$$
\displaystyle f(x) = \sum_{k=1}^K c_k x_1^{a_{1k}}x_2^{a_{2k}}\cdots x_n^{a_{nk}}
$$

where $c_k >0$ is called a **posynomial**. Posynomials are closed under addition, multiplication, and nonnegative scaling. Monomials are closed under multiplication and division. If a posynomial is multiplied by a monomial, the result is a posynomial;  similarly, a posynomial can be divided by a monomial, resulting in a posynomial.


A **geometric programme** (GP) **in posynomial form** is an optimisation problem described by

$$
\begin{array}{lll}
\min & f_0(x) &\\
s.t. & f_i(x) \le 1, & i = 1,\dots,m\\
& h_i(x) = 1,  & i = 1,\dots,p 
\end{array}
$$

where $f_0$, ..., $f_m$ are posynomials and $h_1$, ..., $h_p$ are monomials. The domain of the problem is $\mathbb{R}_{++}^n$, i.e. the constraint $x \succ 0$ is implicit. A GP in posynomial form is, in general, not a convex optimisation problem.

Some more general cases can be easily treated. For instance $f(x)\le h(x)$ can be handled if $f$ is a posynomial and $h$ is a monomial by expressing it as $f(x)/h(x) \le 1$. Similarly, we can deal with equality constraints of the form $h_1(x) = h_2(x)$ if $h_1$ and $h_2$ are nonzero monomials. Also, we can maximize a nonzero monomial objective by minimising its inverse, which is also a monomial.

The GP in posynomial form can be transformed into a convex problem. Consider the change of variables $y_i = \log x_i$, so $x_i = e^{y_i}$. A posynomial in the new variables becomes

$$
\displaystyle f(x) = \sum_{k=1}^K c_k x_1^{a_{1k}}x_2^{a_{2k}}\cdots x_n^{a_{nk}} = \sum_{k=1}^K e^{a_k^\top y + b_k}
$$

where $a_k = (a_{1k},\dots,a_{nk})$ and $b_k = \log c_k$. The transformed problem is

$$
\begin{array}{lll}
\min & \sum_{k=1}^{K_0} e^{a_{0k}^\top y + b_{0k}} &\\
s.t. & \sum_{k=1}^{K_i} e^{a_{ik}^\top y + b_{ik}} \le 1, & i = 1,\dots,m\\
& e^{g_i^\top y + h_i} = 1,  & i = 1,\dots,p 
\end{array}
$$

This is not yet a convex problem. However, if we take the logarithm of all the functions we obtain

$$
\begin{array}{lll}
\min & \log\left(\sum_{k=1}^{K_0} e^{a_{0k}^\top y + b_{0k}}\right) &\\
s.t. & \log\left(\sum_{k=1}^{K_i} e^{a_{ik}^\top y + b_{ik}}\right) \le 0, & i = 1,\dots,m\\
& g_i^\top y + h_i = 0,  & i = 1,\dots,p 
\end{array} \tag{10}
$$

Since log-sum-exp is a convex function, the resulting problem is a convex optimisation problem (noting that the logarithm is a monotone function and so it does not change the optimal point). We refer to $(10)$ as **geometric programme in convex form**.
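
The change of variables can be verified numerically. The sketch below evaluates a made-up posynomial, $f(x) = 2x_1x_2^{-1} + 3x_1^{1/2}$, both directly and in log-sum-exp form after the substitution $y_i = \log x_i$; the function names are hypothetical and `math.prod` requires Python 3.8 or later.

```python
import math

def posynomial(x, coeffs, exponents):
    """f(x) = sum_k c_k * prod_i x_i**a_ik for x > 0 and c_k > 0."""
    return sum(
        c * math.prod(xi ** a for xi, a in zip(x, a_k))
        for c, a_k in zip(coeffs, exponents)
    )

def log_sum_exp_form(y, coeffs, exponents):
    """log f(e^y) written as log(sum_k exp(a_k^T y + b_k)) with b_k = log c_k."""
    terms = [
        sum(a * yi for a, yi in zip(a_k, y)) + math.log(c)
        for c, a_k in zip(coeffs, exponents)
    ]
    return math.log(sum(math.exp(s) for s in terms))

coeffs = [2.0, 3.0]
exponents = [(1.0, -1.0), (0.5, 0.0)]     # f(x) = 2*x1/x2 + 3*sqrt(x1)
x = (4.0, 0.5)
y = tuple(math.log(xi) for xi in x)
lhs = math.log(posynomial(x, coeffs, exponents))
rhs = log_sum_exp_form(y, coeffs, exponents)
print(lhs, rhs)
```

The two values agree up to rounding, and `log_sum_exp_form` is the function of $y$ that appears in $(10)$.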



---

**Example 5.10:** (*Design of a cantilever beam*) Consider the problem of designing a cantilever beam, which consists of $N$ segments. Each segment has unit
length and a uniform rectangular cross-section with width $w_i$ and height $h_i$. A vertical load $F$ is applied at the right end of the beam. This load causes
the beam to deflect (downward), and induces stress in each segment of the beam.

<div>
<img src="https://drive.google.com/uc?export=view&id=1hwIWxWdJILTe1vXfl9vKnJeRS_Gj6xdJ" width="400"/>
</div>

Figure 5.6. *Cantilever beam with 4 segments. Source: page 164 of [1].*

We want to minimise the total weight of the beam $w_1 h_1 + \cdots + w_N h_N$ subject to constraints on width, height, aspect ratio, stress and vertical deflection. The vertical deflection $y_1$ at the loaded end of the beam can be computed recursively as a function of the widths and heights of all the segments

$$
y_i = 6\left(i-\frac{1}{3}\right)\frac{F}{E w_i h_i^3} + v_{i+1} + y_{i+1} \qquad v_i = 12\left(i-\frac{1}{2}\right)\frac{F}{E w_i h_i^3} + v_{i+1}
$$

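where $E$ is Young's modulus of the material, and the recursion is run for $i = N, N-1, \dots, 1$ starting from $v_{N+1} = y_{N+1} = 0$. The problem then can be formulated as a GP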

$$
\begin{array}{llll}
\displaystyle \min_{w_i,h_i} & \sum_{i=1}^N w_i h_i & & \text{total weight}\\
s.t. & w_{\min} \le w_i \le w_{\max}, & i = 1,\dots,N & \text{bounds on width}\\
& h_{\min} \le h_i \le h_{\max}, & i = 1,\dots,N & \text{bounds on height}\\
& S_{\min} \le h_i/w_i \le S_{\max}, & i = 1,\dots,N & \text{bounds on aspect ratio}\\
& 6iF/(w_i h_i^2) \le \sigma_{\max}, & i = 1,\dots,N & \text{bounds on maximum stress}\\
& y_1 \le y_{\max},  &   & \text{bound on vertical displacement of last segment}
\end{array}
$$
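
The deflection recursion is straightforward to evaluate for given dimensions. The sketch below assumes the terminal values $v_{N+1} = y_{N+1} = 0$ (as in $[1]$), unit-length segments, and default values $F = E = 1$ that are placeholders rather than physical data.

```python
def tip_deflection(w, h, F=1.0, E=1.0):
    """Deflection y_1 from the recursion, run for i = N, ..., 1.

    Assumes v_{N+1} = y_{N+1} = 0. The lists w and h hold the widths and
    heights of segments 1..N (list index 0 corresponds to segment 1).
    """
    N = len(w)
    v_next, y_next = 0.0, 0.0
    for i in range(N, 0, -1):
        k = F / (E * w[i - 1] * h[i - 1] ** 3)
        v_i = 12.0 * (i - 0.5) * k + v_next
        y_i = 6.0 * (i - 1.0 / 3.0) * k + v_next + y_next
        v_next, y_next = v_i, y_i
    return y_next

print(tip_deflection([1.0], [1.0]))             # single unit segment
print(tip_deflection([1.0] * 4, [1.0] * 4))     # four identical segments
```

For identical segments the result can be compared with the classical formula $FL^3/(3EI)$ for a uniform cantilever of length $L=N$ and $I = wh^3/12$, which gives $4$ for $N=1$ and $256$ for $N=4$ with unit data. Every term of the recursion is a posynomial in $(w,h)$, which is why the constraint $y_1 \le y_{\max}$ fits the GP format.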

---

# 5.5 Generalised Inequality constraints

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/vBj96R8TFek"></iframe>')

A useful generalization is obtained by allowing the inequality constraint functions to be vector valued, and using generalized inequalities in the constraints. A **convex optimisation problem with generalized inequality constraints** in standard form is given by

$$
\begin{array}{lll}
\min & f_0(x) &\\
s.t. & f_i(x) \preccurlyeq_{K_i} 0, & i = 1,\dots,m\\
& Ax=b,  & 
\end{array}
$$

where $f_0$ is convex, $K_i\subseteq \mathbb{R}^{k_i}$ are proper cones and the functions $f_i: \mathbb{R}^n \to \mathbb{R}^{k_i}$ are $K_i$-convex. As for standard convex optimisation problems, the feasible set, every sublevel set and the optimal set are convex, and any locally optimal point is globally optimal. Convex optimisation problems with generalised inequality constraints can often be solved as easily as standard convex problems.

We now discuss two special cases. The first is the class of **conic programmes** described by

$$
\begin{array}{ll}
\min & c^\top x \\
s.t. & Fx+g \preccurlyeq_{K} 0,\\
& Ax=b.
\end{array}
$$

Conic programmes are a generalisation of LP because if $K$ is the nonnegative orthant then the problem reduces to an LP.

The other special case is the **semidefinite programme** (SDP), which is obtained when $K$ is the cone of positive semidefinite $k \times k$ matrices $\mathbb{S}^k_+$, and takes the form

$$
\begin{array}{ll}
\min & c^\top x \\
s.t. & x_1F_1+ \cdots + x_nF_n + G \preccurlyeq_{K} 0,\\
& Ax=b,  
\end{array}
$$

where $G$, $F_1$, ..., $F_n\in\mathbb{S}^k$ are symmetric matrices (not necessarily positive semidefinite) and $A \in \mathbb{R}^{p\times n}$. Note that the inequality constraint is a linear matrix inequality (LMI). This problem is also a generalisation of an LP (if all the matrices $F_i$ and $G$ are diagonal it reduces to an LP).

---

**Exercise 5.11:** Show that a problem with multiple LMIs can still be rewritten as an SDP (i.e. with a single LMI).

**Example 5.12:** An SOCP can be expressed as a conic programme (which is the reason for its name)

$$
\begin{array}{ll}
\min & c^\top x\\
s.t. & -(A_i x + b_i, c_i^\top x + d_i) \preccurlyeq_{K_i} 0 \qquad i=1,...,m\\
& Fx = g 
\end{array}
$$

with $K_i = \{(y,t) \in \mathbb{R}^{n_i+1} : ||y||_2\le t \}$, which is the second-order cone in $\mathbb{R}^{n_i+1}$.
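
To make the conic constraint concrete, the sketch below checks whether a given $x$ satisfies $-(A_i x + b_i, c_i^\top x + d_i) \preccurlyeq_{K_i} 0$, which is the same as $||A_i x + b_i||_2 \le c_i^\top x + d_i$. The helper names and the numerical data are illustrative assumptions, and a small tolerance is used for the comparison.

```python
import math


def dot(u, v):
    """Euclidean inner product of two sequences."""
    return sum(a * b for a, b in zip(u, v))


def in_second_order_cone(y, t, tol=1e-12):
    """Check (y, t) in K = {(y, t) : ||y||_2 <= t}, up to a tolerance."""
    return math.sqrt(dot(y, y)) <= t + tol


def soc_constraint_holds(A, b, c, d, x):
    """Check -(A x + b, c^T x + d) <=_K 0, i.e. (A x + b, c^T x + d) in K."""
    y = [dot(row, x) + b_i for row, b_i in zip(A, b)]
    t = dot(c, x) + d
    return in_second_order_cone(y, t)


# Illustrative data: the constraint reduces to ||x||_2 <= 5
A = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
c = [0.0, 0.0]
d = 5.0
inside = soc_constraint_holds(A, b, c, d, [3.0, 4.0])   # ||x||_2 = 5
outside = soc_constraint_holds(A, b, c, d, [3.0, 4.1])  # ||x||_2 > 5
```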

**Example 5.13:** (*Matrix norm minimisation*) Let $A(x) = A_0 + x_1 A_1 + \dots + x_n A_n$ with $A_i \in \mathbb{R}^{p\times q}$. We consider the problem 

$$
\min\,\,\, ||A(x)||_2
$$

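where $||\cdot||_2$ denotes the [spectral norm](https://en.wikipedia.org/wiki/Matrix_norm#Matrix_norms_induced_by_vector_p-norms) (i.e. the maximum singular value). This is a convex problem since $||A(x)||_2$ is a convex function of $x$ (it is the composition of a norm with an affine map). This problem can be formulated as an SDP in two steps. First, we use the fact that $||A||_2 \le t$ if and only if $A^\top A \preccurlyeq t^2 I$ and $t \ge 0$. Minimising the squared norm $s = t^2$ instead of the norm, we write the epigraph form of the problem, namely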

$$
\begin{array}{ll}
\displaystyle \min_{x,s} & s \\
s.t. & A(x)^\top A(x) \preccurlyeq sI.
\end{array}
$$

Then we note that, by the [Schur complement](https://en.wikipedia.org/wiki/Schur_complement), the condition $A^\top A \preccurlyeq t^2 I$ with $t \ge 0$ is equivalent to the linear matrix inequality

$$
\left[\begin{array}{ll}tI & A\\A^\top & tI
\end{array}\right]\succcurlyeq 0.
$$

As a result, the matrix norm minimisation problem is equivalent to the SDP

$$
\begin{array}{ll}
\displaystyle \min_{x,t} & t \\
s.t. & \left[\begin{array}{cc}tI & A(x)\\A(x)^\top & tI
\end{array}\right] \succcurlyeq 0.
\end{array}
$$
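
The equivalence used above can be checked numerically for a fixed matrix. The sketch below (assuming NumPy is available, with a randomly generated $A$ as illustrative data) builds the block matrix and verifies that it is positive semidefinite exactly when $t$ is at least the largest singular value of $A$. It does not solve the SDP itself, which would require an SDP solver.

```python
import numpy as np


def lmi_matrix(A, t):
    """Build the block matrix [[t I, A], [A^T, t I]] for a p x q matrix A."""
    p, q = A.shape
    return np.block([[t * np.eye(p), A], [A.T, t * np.eye(q)]])


def min_eig(M):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(M)[0]


rng = np.random.default_rng(0)        # illustrative random data
A = rng.standard_normal((3, 2))
sigma_max = np.linalg.norm(A, 2)      # spectral norm of A

eig_above = min_eig(lmi_matrix(A, sigma_max + 0.1))  # LMI holds
eig_at = min_eig(lmi_matrix(A, sigma_max))           # LMI holds with equality
eig_below = min_eig(lmi_matrix(A, sigma_max - 0.1))  # LMI violated
```

The eigenvalues of the block matrix are $t \pm \sigma_i$ (and $t$ itself when $p \ne q$), so its smallest eigenvalue is $t - \sigma_{\max}$, which is why the smallest feasible $t$ equals the spectral norm.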

---

# 5.6 Vector Optimisation

In [None]:
from IPython.display import HTML
HTML('<iframe width="850" height="480" src="https://www.youtube.com/embed/EPpRcLfA6zg"></iframe>')

**Errata:** At 11:30 the video says "there is some zero". It should be "there is some nonzero".

In the previous section we considered vector-valued constraints. We now consider vector-valued objective functions. A **convex vector optimisation problem** is given by

$$
\begin{array}{lll}
\displaystyle \min_{\text{wrt a cone }K} & f_0(x) &\\
s.t. & f_i(x) \le 0, & i = 1,\dots,m\\
& a_i^\top x = b_i,  & i = 1,\dots,p 
\end{array} \tag{11}
$$

where $K \subset \mathbb{R}^q$ is a proper cone, $f_0: \mathbb{R}^n \to \mathbb{R}^q$ is the $K$-convex objective function and $f_1,\dots,f_m$ are convex. The only difference from problem $(1)$ is that $f_0$ takes values in $\mathbb{R}^q$ and the comparison of values is made with respect to a cone $K$. As one can expect from previous lectures (see [Chapter 3.5](https://colab.research.google.com/drive/1oMYvG4PZQt8M35tLbU3GrzTB_5nx34kY#scrollTo=-ad-F9bsEB3k)), in most vector optimisation problems we will not be able to find an optimal point because two objective values $f_0(x)$ and $f_0(y)$ are not necessarily comparable with respect to $\preccurlyeq_K$. Let us define the set of achievable objective values $\mathcal{O} = \{f_0(x) : x \text{ feasible} \}$. 

We recall that a feasible $x$ is optimal if $f_0(x)$ is the minimum value of $\mathcal{O}$. If $x^*$ is an optimal point, then $f_0(x^*)$ can be compared to the objective at every other feasible point and it is better than or equal to it. Mathematically, this can be expressed compactly with

$$
\mathcal{O} \subseteq f_0(x^*) + K \tag{12}
$$

which means that the entire set of achievable values lies in the cone $K$ with vertex $f_0(x^*)$. Most vector optimisation problems do not have an optimal point and an optimal value, but this does occur in some special cases.

We also recall that a feasible point $x$ is Pareto optimal if $f_0(x)$ is a minimal value of $\mathcal{O}$. Thus, a point $x$ is Pareto optimal if it is feasible and, for any feasible $y$, $f_0(y) \preccurlyeq_K f_0(x)$ implies $f_0(y) = f_0(x)$. In other words: any feasible point $y$ that is better than or equal to $x$ has exactly the same objective value as $x$. Mathematically, this can be expressed compactly with 

$$
(f_0(x) - K) \cap \mathcal{O} = \{f_0(x)\} \tag{13}
$$

which means that the only achievable point in the inverted cone with vertex $f_0(x)$ is just $f_0(x)$ itself. 

The figures below illustrate $(12)$ and $(13)$.



<div>
<img src="https://drive.google.com/uc?export=view&id=1IAxW6rqbCiYRv_sMQiPFdZ2IU03bpZyz" width="400"/>
</div>

Figure 5.7. *Geometric representation of $(12)$ where $K = \mathbb{R}_+^2$.*

<div>
<img src="https://drive.google.com/uc?export=view&id=1QFBLnV7hvvtJa_QAzjkLryIB6cJLit1Q" width="400"/>
</div>

Figure 5.8. *Geometric representation of $(13)$ where $K = \mathbb{R}_+^2$.*

A vector optimisation problem can have many Pareto optimal values (and, of course, many Pareto optimal points). **N.B.** Here we are saying that there can be different values $f_0(x)$ that are Pareto optimal. In scalar optimisation there can be multiple optimal points $x$, but they all return the same value of the objective function. The set of Pareto optimal values is denoted by $\mathcal{P}$ and it has the property

$$
\mathcal{P} \subseteq \mathcal{O} \cap \textbf{bd } \mathcal{O}
$$

This means that Pareto optimal values are achievable values that always lie on the boundary of the set of achievable values.
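
For a finite set of achievable values and $K = \mathbb{R}^q_+$, conditions $(12)$ and $(13)$ can be checked directly. The following sketch uses a small hand-picked set `O` (an illustrative assumption, not taken from the figures) and hypothetical helper names.

```python
def leq_K(u, v):
    """u <=_K v for K = R^q_+, i.e. componentwise u_i <= v_i."""
    return all(a <= b for a, b in zip(u, v))


def optimal_value(points):
    """Return the minimum value of the set as in (12), or None if none exists."""
    for u in points:
        if all(leq_K(u, v) for v in points):
            return u
    return None


def pareto_values(points):
    """Return the minimal values of the set as in (13)."""
    return [v for v in points
            if not any(leq_K(u, v) and u != v for u in points)]


# Illustrative set of achievable values in R^2
O = [(1, 4), (2, 2), (4, 1), (3, 3), (2, 4)]
P = pareto_values(O)      # (3, 3) and (2, 4) are dominated
best = optimal_value(O)   # no single value is below all the others
```

In this set no value is comparable to, and better than, all the others, so there is no optimal value, while three values are Pareto optimal.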

A standard technique to find (some) Pareto optimal points is **scalarisation**. Choose any $\lambda \succ_{K^*} 0$, i.e. any vector that is positive in the dual generalized inequality. The scalarised problem is

$$
\begin{array}{lll}
\displaystyle \min & \lambda^\top f_0(x) &\\
s.t. & f_i(x) \le 0, & i = 1,\dots,m\\
& a_i^\top x = b_i,  & i = 1,\dots,p 
\end{array} \tag{14}
$$

Let $x$ be optimal for the scalarised problem $(14)$. Then $x$ is Pareto optimal for the vector optimisation problem $(11)$. In fact, if $x$ were not Pareto optimal, then there would be a feasible $y$ that satisfies $f_0(y) \preccurlyeq_K f_0(x)$ and $f_0(x) \ne f_0(y)$. Since $f_0(x) - f_0(y) \succcurlyeq_K 0$ and is nonzero, and $\lambda \succ_{K^*} 0$, we have $\lambda^\top  (f_0(x) - f_0(y)) > 0$, i.e., $\lambda^\top f_0(x) > \lambda^\top f_0(y)$. This contradicts the assumption that $x$ is optimal for the scalar problem $(14)$. The weight vector is a free parameter; by varying it we obtain (possibly) different Pareto optimal solutions of the vector optimisation problem $(11)$. This is illustrated in the figure below.


<div>
<img src="https://drive.google.com/uc?export=view&id=1UvmEP8KHS0fRQGHSu8lbLTDrRIPFnhVl" width="400"/>
</div>

Figure 5.9. *Scalarisation for a non-convex problem: three Pareto optimal values $f_0(x_1)$,
$f_0(x_2)$, $f_0(x_3)$ are shown. The first two values can be obtained by scalarization: $f_0(x_1)$ minimizes $\lambda^\top_1 u$ over all $u \in\mathcal{O}$ and $f_0(x_2)$ minimizes $\lambda^\top_2 u$, where $\lambda_1$, $\lambda_2 \succ 0$. The value $f_0(x_3)$ is Pareto optimal, but cannot be found by scalarization.*

The figure also shows that there are Pareto optimal points that cannot be obtained via scalarisation for any value of the weight $\lambda \succ_{K^*} 0$.
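
The same phenomenon can be reproduced on a tiny non-convex example. In the sketch below the set `O` of three achievable values and the grid of integer weights are illustrative assumptions. Every scalarised minimiser is Pareto optimal, but the Pareto optimal value $(3, 3)$ is never returned.

```python
def dot(lam, u):
    """Inner product lam^T u."""
    return sum(l * a for l, a in zip(lam, u))


def is_pareto(v, points):
    """v is minimal w.r.t. K = R^2_+: no other point is componentwise <= v."""
    return not any(all(a <= b for a, b in zip(u, v)) and u != v
                   for u in points)


def scalarised_minimisers(points, lam):
    """All points minimising lam^T u over the finite set."""
    best = min(dot(lam, u) for u in points)
    return [u for u in points if dot(lam, u) == best]


# Illustrative non-convex set: all three values are Pareto optimal
O = [(0, 4), (4, 0), (3, 3)]

# Grid of strictly positive integer weights (exact arithmetic)
weights = [(k, 10 - k) for k in range(1, 10)]

found = set()
for lam in weights:
    found.update(scalarised_minimisers(O, lam))

missed = [u for u in O if is_pareto(u, O) and u not in found]
```

The grid is only a finite sample, but here the conclusion holds for every $\lambda \succ 0$: the value $(3,3)$ has cost $3(\lambda_1 + \lambda_2) \ge 6\min(\lambda_1,\lambda_2)$, whereas one of the other two values has cost $4\min(\lambda_1,\lambda_2)$.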

All statements above hold for general (i.e. non-convex) vector optimisation problems. For convex vector optimisation problems we have a partial converse. For every Pareto optimal point $x^{\text{po}}$, there is some nonzero $\lambda \succcurlyeq_{K^*} 0$ (N.B. the inequality is not strict) such that $x^{\text{po}}$ is a solution of the scalarised problem $(14)$.

So, for convex problems the method of scalarization would yield all Pareto optimal points, as the weight vector $\lambda$ varies over the $K^*$-nonnegative and nonzero values, i.e. $\lambda \succcurlyeq_{K^*} 0$, $\lambda \ne 0$. **Attention:** every solution of the scalarised problem with $\lambda \succ_{K^*} 0$ is Pareto optimal, but it is not true that every solution of the scalarised problem with $\lambda \succcurlyeq_{K^*} 0$ and $\lambda \ne 0$ is Pareto optimal.

In summary, if we use $\lambda \succ_{K^*} 0$ we find only Pareto optimal points but possibly not all of them, and if we use nonzero $\lambda \succcurlyeq_{K^*} 0$ we find all Pareto optimal points, possibly together with other points that are not Pareto optimal.
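
A minimal sketch of the second case, on an illustrative finite set `O` with $K = \mathbb{R}^2_+$: with the weight $\lambda = (1, 0)$, which is nonnegative and nonzero but not strictly positive, the scalarised problem has two solutions, and one of them is not Pareto optimal. The set and the helper names are assumptions made for this example.

```python
def is_pareto(v, points):
    """v is minimal w.r.t. K = R^2_+: no other point is componentwise <= v."""
    return not any(all(a <= b for a, b in zip(u, v)) and u != v
                   for u in points)


def minimisers(points, lam):
    """All points minimising lam^T u over the finite set."""
    cost = lambda u: sum(l * a for l, a in zip(lam, u))
    best = min(cost(u) for u in points)
    return [u for u in points if cost(u) == best]


O = [(1, 1), (1, 3), (3, 0)]              # illustrative achievable values

weak_solutions = minimisers(O, (1, 0))    # weight on the boundary of K*
spurious = [u for u in weak_solutions if not is_pareto(u, O)]

strict_solutions = minimisers(O, (2, 1))  # strictly positive weight
```

With $\lambda = (1, 0)$ the second objective is ignored, so $(1, 3)$ ties with $(1, 1)$ even though it is dominated by it. This is why the solutions obtained with weights on the boundary of $K^*$ must be checked one by one.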

A standard way of proceeding is to first find Pareto optimal points by scalarisation with $\lambda \succ_{K^*} 0$ (see for instance [Example 2.3](https://colab.research.google.com/drive/1i4EDeNKubjRN55t8m5ycgJD83adxwKJc#scrollTo=7sqheDfFwAXb) in Chapter 2). To find the remaining Pareto optimal solutions, we have to consider nonzero weight vectors $\lambda$ that satisfy $\lambda \succcurlyeq_{K^*} 0$ but not $\lambda \succ_{K^*} 0$. For each such weight vector, we first identify all solutions of the scalarised problem. Then among these solutions we must check which are, in fact, Pareto optimal for the vector optimisation problem. These ‘extreme’ Pareto optimal points can also be found as the limits of the Pareto optimal points obtained from positive weight vectors.

A **multicriterion** or **multi-objective optimisation problem** is a vector optimisation problem on the cone $K = \mathbb{R}^q_+$. Let $F_1$, ..., $F_q$ be the components of $f_0$. In a multicriterion problem, an optimal point $x^*$ satisfies $F_i(x^*) \le F_i(y)$ for all $i = 1,\dots,q$ and for every feasible $y$. In other words, the optimal point is simultaneously optimal for each of the scalar problems which consider one $F_i$ at a time. We then say that the objectives are **noncompeting**. When no optimal point exists, the Pareto optimal points exhibit instead a trade-off between the objectives. In this case the Pareto optimal surface is also called the **optimal trade-off surface**. In the scalarised problem, the weight $\lambda_i$ can be thought of as quantifying our desire to make $F_i$ small (or our objection to $F_i$ being large). In particular, we should take $\lambda_i$ large if we want $F_i$ to be small; if we care much less about $F_i$, we can take $\lambda_i$ small. We can interpret the ratio $\lambda_i/\lambda_j$ as the relative weight or relative importance of the $i$-th objective compared to the $j$-th objective.

---

**Example 5.14:** Consider the regularised least-squares approximation problem (Tikhonov regularisation) that we have seen in Chapter $1$, i.e. $f_0(x)=(||Ax-b||^2_2,||x||^2_2)$. Recall that the solution of the scalarised problem with $\lambda \succ 0$ is $x^* = (A^\top A + \delta I)^{-1}A^\top b$, where $\delta = \lambda_2/\lambda_1$. Solving the regularised problem for different $\delta > 0$ we obtain all Pareto optimal points, except for the extreme points associated with $\delta \to \infty$ and $\delta \to 0$. In the first case we have the Pareto optimal solution $x = 0$, corresponding to $\lambda = (0, 1)$. In the second case we have the Pareto optimal solution $A^\dagger b$, where $A^\dagger$ is the pseudoinverse of $A$, which corresponds to $\lambda \to  (1, 0)$. The figure below illustrates many of the concepts covered in this section.

<div>
<img src="https://drive.google.com/uc?export=view&id=1qV-kltJckkVrqx6Gm8GzjJc4sGA9KTKu" width="400"/>
</div>

Figure 5.10. *Optimal trade-off curve (dark) for a regularised least-squares problem. The two dots correspond to weights $\lambda = (0, 1)$ (only $F_2$ matters, which gives $F_2(x^*)=0$) and $\lambda \to  (1, 0)$ (only $F_1$ matters, which gives $F_1(x^*)=||AA^\dagger b-b||^2_2$).*
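
A trade-off curve of this kind can be traced numerically by sweeping $\delta$. The sketch below assumes NumPy is available and uses randomly generated $A$ and $b$ as illustrative data (they are not the data behind Figure 5.10). The helper `tikhonov` is a hypothetical name for the closed-form solution recalled above.

```python
import numpy as np


def tikhonov(A, b, delta):
    """Solve (A^T A + delta I) x = A^T b for delta > 0."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ b)


rng = np.random.default_rng(0)          # illustrative random data
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

deltas = np.logspace(-3, 3, 25)         # delta = lambda_2 / lambda_1
F1 = np.empty(len(deltas))              # ||A x - b||_2^2 along the curve
F2 = np.empty(len(deltas))              # ||x||_2^2 along the curve
for k, delta in enumerate(deltas):
    x = tikhonov(A, b, delta)
    F1[k] = np.sum((A @ x - b) ** 2)
    F2[k] = np.sum(x ** 2)

# Extreme point for delta -> 0: the least-squares solution A^+ b
x_ls = np.linalg.pinv(A) @ b
```

Plotting `F2` against `F1` gives the Pareto optimal values for $\delta > 0$. As $\delta$ grows, `F1` increases and `F2` decreases, which is the trade-off between the two objectives.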

---

# End of CHAPTER 5