# Optimization-Optimality-Conditions-in-Convex-Problems

> The Lagrangian and the KKT Conditions

- hide: true
- toc: true
- badges: true
- comments: true
- categories: ['Optimization','Applied Mathematics','Proofs']

# Introduction

Taking advantage of the geometry of *linear programs*, we [were able to deduce](https://v-poghosyan.github.io/blog/optimization/applied%20mathematics/proofs/2022/01/25/Optimization-Geometry-of-Linear-Programs.html) that their optima occur at extreme points of the polytopal constraint set. This key geometric insight reduced the search space of optimal solutions to the set of extreme points, thereby simplifying the problem of solving LP's.

In this post we will develop a more general geometric condition for identifying the optima of *convex programs (CP's)*. The theory developed here will also apply to linear programs, since LP's are a strict subset of convex programs. 

CP's can be stated most generally as 

$
\begin{cases}
\min_x: f(x)
\\
s.t.: \begin{aligned} &g_i(x) \leq 0 \ \ \forall i = 1,...,m
\\ 
&h_j(x) = 0 \ \ \forall j = 1,...,p
\end{aligned}
\end{cases}
$

Here $f$ is a convex objective, the $g_i$'s are convex functions defining $m$ inequality constraints, and the $h_j$'s are linear (more precisely, affine) functions defining $p$ equality constraints. Note that the equality constraints *must* be linear because, for a more general convex function $h_j$, the set $\{ x : h_j(x) = 0 \}$ need *not* be convex.
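
As a quick illustration of why nonlinear equality constraints break convexity (the function below is chosen here for illustration and is not taken from the rest of the post), take the convex function $h(x) = x^2 - 1$ on $\mathbb{R}$:

$
\{ x : x^2 - 1 = 0 \} = \{ -1, 1 \}
$

Both $-1$ and $1$ satisfy the constraint, but their midpoint $0$ does not, since $h(0) = -1 \neq 0$, so the set is not convex. By contrast, for an affine $h(x) = a^Tx - b$ and any two solutions $x, y$ with $\theta \in [0,1]$:

$
a^T(\theta x + (1-\theta) y) - b = \theta (a^Tx - b) + (1-\theta)(a^Ty - b) = 0
$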

# The Lagrangian

Consider the simple convex problem with an objective $f(x)$ and only one equality constraint $h(x)=0$.

The picture below illustrates this situation using a maximization problem with the objective function $f(x_1,x_2) = x_1^2 e^{x_2} x_2$ and a circular equality constraint. 

Note that the circular equality constraint does not define a convex set, so the given problem is not a CP. However it's intended as a general example that demonstrates a key geometric property of optimal solutions.

![](my_icons/lagrange-condition.png "Optima occur on the tangent point(s) or, more generally, tangent space(s) of the level sets of the objective function and the constraint boundary")
<br> 

The key observation is that, if the problem is feasible and an optimum exists, the optimum occurs at a point where a level set of the objective function is tangent to the constraint boundary.

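That is, if $x^*$ is a local optimum of the constrained problem and $\nabla_x h(x^*) \neq 0$, then $\nabla_x f(x^*) = \pm \lambda \nabla_x h(x^*)$ for some $\lambda \geq 0$; in other words, the two gradients are parallel. Whether $\lambda$ is added or subtracted depends on whether the gradients point in the same or in opposite directions at $x^*$, which in turn depends on whether the problem is one of minimization or maximization and on how the constraint is written. In general this tangency condition is only necessary, but for a convex objective and a linear equality constraint it is also sufficient for a minimum.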

Interestingly, this condition is general enough that for an unconstrained convex problem it reduces to $\nabla_x f(x^*) = 0$, which is the familiar first-order necessary condition for an interior point (such as any candidate solution of an unconstrained problem) to be a local optimum. Points that satisfy it are called *stationary points*. And, since the objectives in this post are all assumed to be convex, the second-order condition $\nabla_{xx} f \succeq 0$ holds everywhere (when $f$ is twice differentiable), in particular at any stationary point. Hence, $\nabla_x f(x^*) = 0$ is the only condition we need to determine that $x^*$ is a local, and by convexity global, minimizer.
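
To make the unconstrained case concrete, here is a minimal sketch in plain Python. The objective, the step size, and the iteration count are all assumptions picked for illustration; the point is only that the iterate where the gradient vanishes is also the minimizer of a convex function.

```python
def f(x):
    """Illustrative convex objective: f(x) = (x1 - 1)^2 + 2 * (x2 + 3)^2."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


def grad_f(x):
    """Gradient of f, computed by hand."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


def gradient_descent(grad, x0, step=0.1, iters=500):
    """Plain fixed-step gradient descent; step and iters are ad hoc choices."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


x_star = gradient_descent(grad_f, [0.0, 0.0])
grad_norm = sum(g * g for g in grad_f(x_star)) ** 0.5

# Stationarity: the gradient (numerically) vanishes at x_star.
print("x* ~", x_star, " |grad f(x*)| ~", grad_norm)

# Convexity makes the stationary point a minimizer: none of the sampled
# perturbations achieves a smaller objective value.
perturbations = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
is_minimizer = all(
    f([x_star[0] + dx, x_star[1] + dy]) >= f(x_star) for dx, dy in perturbations
)
print("no sampled perturbation improves f:", is_minimizer)
```

For this particular objective the stationary point can also be read off by hand as $x^* = (1, -3)$, which is what the iteration approaches.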

Since the stationary points of an unconstrained problem can be found with such relative ease, the goal is to construct an unconstrained problem related to a given constrained problem so that the former's stationary points are optimal and feasible for the constrained problem.

With this in mind, we define the *Lagrangian* as the function $\mathcal{L}(x, \lambda) := f(x) - \lambda h(x)$. Now, any $(x^*, \lambda^*)$ that satisfies $\nabla \mathcal{L}(x^*, \lambda^*) = 0$ is a stationary point of the Lagrangian, and its $x^*$ component is a feasible, optimal solution to the constrained problem.

To see this, note that $\nabla \mathcal{L} = [\nabla_x  \mathcal{L}, \nabla_{\lambda} \mathcal{L}]^T = [0,0]^T$ implies $\nabla_x f(x^*) = \lambda^* \nabla_x h(x^*)$, which is the optimality condition (the $\pm$ sign is absorbed into $\lambda^*$, which may be positive or negative), and $h(x^*) = 0$, which is the feasibility condition.

So, $\nabla \mathcal{L} = 0$ is taken as a *certificate of optimality*.
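
As a sketch of how such a certificate can be checked, consider the illustrative problem of minimizing $f(x) = x_1^2 + x_2^2$ subject to $h(x) = x_1 + x_2 - 1 = 0$ (this problem, and the helper names below, are assumptions made for the example). Setting the gradient of $\mathcal{L}(x, \lambda) = f(x) - \lambda h(x)$ to zero by hand gives $x^* = (1/2, 1/2)$ and $\lambda^* = 1$; the code only verifies that candidate.

```python
def f(x):
    """Illustrative convex objective: squared distance to the origin."""
    return x[0] ** 2 + x[1] ** 2


def h(x):
    """Illustrative linear equality constraint: h(x) = x1 + x2 - 1 = 0."""
    return x[0] + x[1] - 1.0


def grad_lagrangian(x, lam):
    """Gradient of L(x, lam) = f(x) - lam * h(x) with respect to (x1, x2, lam)."""
    dL_dx1 = 2.0 * x[0] - lam
    dL_dx2 = 2.0 * x[1] - lam
    dL_dlam = -h(x)
    return [dL_dx1, dL_dx2, dL_dlam]


# Candidate obtained by solving grad L = 0 by hand.
x_star, lam_star = [0.5, 0.5], 1.0
certificate = grad_lagrangian(x_star, lam_star)
print("grad L at the candidate:", certificate)

# Sanity check: sample other feasible points x = (t, 1 - t) and confirm
# that none of them has a smaller objective value than x_star.
feasible_values = [f([t / 10.0, 1.0 - t / 10.0]) for t in range(-20, 31)]
print("best sampled feasible value:", min(feasible_values), " f(x*):", f(x_star))
```

The first two components of the certificate encode the tangency condition and the last one encodes feasibility, so all three vanishing is what certifies the candidate.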

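From the above discussion, we also see that $\lambda^* = 0$ would reduce the optimality condition to $\nabla_x f(x^*) = 0$, so the constraint would play no role in locating $x^*$. This can only happen when an unconstrained stationary point already satisfies $h(x^*) = 0$. Whenever the constraint genuinely restricts the optimum, we need $\lambda^* \neq 0$.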


## Convex Problem with Multiple Equality Constraints

For a convex program with multiple equality constraints, the condition generalizes to:


$\nabla_x f(x^*) = \pm \lambda_1 \nabla_x h_1(x^*) \pm \lambda_2 \nabla_x h_2(x^*) \pm ... \pm \lambda_p \nabla_x h_p(x^*)$ with $\lambda_1, \lambda_2, ..., \lambda_p \geq 0$.

That is, the gradient of the objective function at the optimizer should be a conic combination of the $\pm$ gradients of the equality constraints or, equivalently, a linear combination of the gradients themselves.

This is pointless to illustrate in two dimensions, since the feasible set of two equality constraints would generically either be $\emptyset$ or contain a single point, which is optimal by default. Instead, we can imagine the spherical level sets of $f(x) = x_1^2 + x_2^2 + x_3^2$ and the line of intersection of two planes in $\mathbb{R}^3$, $h_1(x) = 0$ and $h_2(x) = 0$. The minimizer would then be the point at which the line of intersection is tangent to a spherical level set of the objective.
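
The three-dimensional picture can be checked numerically. The sketch below assumes two specific planes, $h_1(x) = x_1 + x_2 + x_3 - 1$ and $h_2(x) = x_1 - x_2 - 1$, chosen only for illustration, together with a candidate $x^* = (5/6, -1/6, 1/3)$ and multipliers $(2/3, 1)$ worked out by hand. It verifies that the objective's gradient is a linear combination of the constraint gradients and that the feasible line is tangent to the sphere through $x^*$.

```python
from fractions import Fraction as F


def dot(a, b):
    """Dot product of two vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Cross product in R^3."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


# Illustrative planes: h1(x) = x1 + x2 + x3 - 1 and h2(x) = x1 - x2 - 1.
grad_h1 = [F(1), F(1), F(1)]
grad_h2 = [F(1), F(-1), F(0)]


def h1(x):
    return x[0] + x[1] + x[2] - 1


def h2(x):
    return x[0] - x[1] - 1


def grad_f(x):
    """Gradient of f(x) = x1^2 + x2^2 + x3^2."""
    return [2 * xi for xi in x]


# Hand-derived candidate and multipliers (exact rational arithmetic).
x_star = [F(5, 6), F(-1, 6), F(1, 3)]
lam1, lam2 = F(2, 3), F(1)

# Optimality: grad f(x*) - lam1 * grad h1 - lam2 * grad h2 should vanish.
residual = [
    gf - lam1 * g1 - lam2 * g2
    for gf, g1, g2 in zip(grad_f(x_star), grad_h1, grad_h2)
]

# Tangency: the direction of the feasible line (orthogonal to both plane
# normals) should be orthogonal to grad f(x*), the normal of the sphere.
line_direction = cross(grad_h1, grad_h2)
tangency = dot(grad_f(x_star), line_direction)

print("feasible:", h1(x_star) == 0 and h2(x_star) == 0)
print("residual:", residual)
print("grad f(x*) . line direction:", tangency)
```

Exact fractions are used so that the checks are equalities rather than tolerance comparisons.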


# Karush-Kuhn-Tucker Conditions

The KKT conditions generalize the Lagrange conditions above to problems that also have inequality constraints. Under suitable assumptions they are both necessary and sufficient for optimality, so they're considered to be a certificate of optimality and are often used to verify whether a particular guess is optimal. In many cases this beats solving the problem from scratch.

Consider the following convex program with both inequality and equality constraints. 

$
\begin{cases}
\min_x: f(x)
\\
s.t.: \begin{aligned} &g_i(x) \leq 0 \ \ \forall i = 1,...,m
\\ 
&h_j(x) = 0 \ \ \forall j = 1,...,p
\end{aligned}
\end{cases}
$

Let the dual variables corresponding to the inequality constraints be $\lambda$ and those corresponding to the equality constraints be $\mu$. 

> **KKT Conditions:** &nbsp; $x^*, (\lambda^*, \mu^*)$ satisfy the KKT conditions if the following hold:
&nbsp;
> 1. $g_i(x^*) \leq 0$
> 2. $h_j(x^*) = 0$ 
> 3. $\lambda^* \geq 0$
> 4. $\lambda^*_ig_i(x^*) = 0 \ \ \forall i$
> 5. $\nabla_x f(x^*) + \sum_i \lambda^*_i\nabla_xg_i(x^*) + \sum_j \mu^*_j\nabla_xh_j(x^*) = 0$, where, by condition 4, only the inequality constraints $g_i$ that are active at $x^*$ contribute to the first sum
<br>

> Note: If there are no inequality constraints, the KKT conditions turn into the Lagrange conditions. Then condition $5$ is simply a statement about the gradient of the Lagrangian being zero, i.e. $\nabla \mathcal{L} = 0$. However, as we shall soon see, it has a strong geometric interpretation as well. 

These are simply the conditions the primal-dual pair must meet to satisfy the KKT conditions. The following theorem is what establishes the certificate of optimality mentioned earlier.

> **Theorem** &nbsp; If Strong Duality holds, then $x^*, (\lambda^*, \mu^*)$ are respectively primal-dual optimal if and only if they satisfy the KKT conditions. 
<br> 
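
To illustrate how the KKT conditions can be used to test a guess, here is a minimal sketch of a checker. The function name `check_kkt`, its argument layout, the tolerance, and the example problem (minimize $(x_1-2)^2 + (x_2-1)^2$ subject to $x_1 + x_2 - 2 \leq 0$ and $x_1 - x_2 = 0$) are all assumptions made for this example.

```python
def check_kkt(x, lam, mu, grad_f, gs, grad_gs, hs, grad_hs, tol=1e-9):
    """Report which of the five KKT conditions hold at the guess (x, lam, mu)."""
    g_vals = [g(x) for g in gs]
    h_vals = [h(x) for h in hs]
    # Left-hand side of condition 5:
    # grad f + sum_i lam_i * grad g_i + sum_j mu_j * grad h_j.
    lhs = list(grad_f(x))
    for weight, grad in list(zip(lam, grad_gs)) + list(zip(mu, grad_hs)):
        lhs = [s + weight * d for s, d in zip(lhs, grad(x))]
    return {
        "primal_inequality": all(v <= tol for v in g_vals),
        "primal_equality": all(abs(v) <= tol for v in h_vals),
        "dual_feasibility": all(l >= -tol for l in lam),
        "complementary_slackness": all(
            abs(l * v) <= tol for l, v in zip(lam, g_vals)
        ),
        "stationarity": all(abs(s) <= tol for s in lhs),
    }


# Illustrative problem data (gradients computed by hand).
grad_f = lambda x: [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)]
gs = [lambda x: x[0] + x[1] - 2.0]
grad_gs = [lambda x: [1.0, 1.0]]
hs = [lambda x: x[0] - x[1]]
grad_hs = [lambda x: [1.0, -1.0]]

# A guess that satisfies all five conditions.
good = check_kkt([1.0, 1.0], [1.0], [1.0], grad_f, gs, grad_gs, hs, grad_hs)

# The optimum obtained by ignoring the inequality constraint: stationary
# and dual feasible, but it violates g(x) <= 0.
bad = check_kkt([1.5, 1.5], [0.0], [1.0], grad_f, gs, grad_gs, hs, grad_hs)

print("good guess:", good)
print("bad guess: ", bad)
```

The second guess shows why all five conditions are needed together: stationarity alone would accept a point that is not even feasible.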

To see how the KKT conditions are a generalization of the optimality conditions in a simple, yet elucidative case, we consider the unconstrained problem $\min_x f(x)$ and the same problem with a single inequality constraint

$
\begin{cases}
\min_x: f(x)
\\
s.t.: g(x) \leq 0
\end{cases}
$

**Case 1:** The unconstrained optimum is in the feasible region.
<br> 

**Picture**

![](my_icons/interior-optimum.png "Case 1: Optimum is an interior point")
<br> 

Recall that, for a convex objective, $x^*$ is a local minimum of the unconstrained problem if and only if $\nabla_x f(x^*) = 0$ and $\nabla_{xx} f(x^*) \succeq 0$. The same characterization holds whenever $x^*$ is an interior point of the feasible region. In particular, it holds in the unconstrained case, since there all points, including $x^*$, are interior points.

Since the inequality constraint is not active at the constrained optimum, the latter is identified by the same conditions as the unconstrained optimum. That is, $\nabla_x f(x^*) = 0$ and $\nabla_{xx} f(x^*) \succeq 0$.

The Lagrangian of the unconstrained problem is, of course, the objective function itself.


In this case, the optimum can be found simply by solving for $x^*$ that satisfies the above necessary and sufficient conditions.
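
As a small worked example of this case (the functions are chosen here purely for illustration), take $f(x) = (x-1)^2$ with the constraint $g(x) = x - 3 \leq 0$ on $\mathbb{R}$:

$
\begin{aligned}
&\nabla_x f(x^*) = 2(x^* - 1) = 0 \implies x^* = 1
\\
&g(x^*) = 1 - 3 = -2 < 0
\\
&\lambda^* g(x^*) = 0 \implies \lambda^* = 0
\end{aligned}
$

The constraint is inactive at $x^* = 1$, so complementary slackness forces $\lambda^* = 0$, and the stationarity condition $\nabla_x f(x^*) + \lambda^* \nabla_x g(x^*) = 0$ collapses to the unconstrained condition $\nabla_x f(x^*) = 0$.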

![](my_icons/exterior-optimum.png "Case 2: Optimum is an exterior point")

Take a simple convex program with one inequality constraint. 