# Constrained NLP 
## Introduction 
Recall that in constrained NLP, the objective is to find the optimal value (maximum or minimum) of an objective function, subject to a set of constraints, where at least one function is not linear. 
In this section, we will present the canonical form for non-linear problems, and use this form to introduce the Kuhn-Tucker conditions, which are a set of conditions that any optimal solution must satisfy.

## Canonical Form
The canonical form of a **minimisation** problem is defined as: 

$\min f(x)$

$\text{s.t.}$

$g_i(x) \leq 0 \quad \forall i = 1, ..., m$

where $x = [x_1, x_2, ..., x_n]$ is an array with the n different decision variables of the problem, $f(x)$ is the (non-linear) objective function and $g_i(x)$ are the (non-linear) functions of the left hand sides of the m constraints.
Note that in this canonical form, all the constraints are of type *less or equal* and all the right hand sides are 0. It is not required that all of the constraints are of the same type, but for now, we will use this definition without loss of generality to present the Kuhn-Tucker conditions.
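As a small illustration of the canonical form, the sketch below rewrites two constraints of other types so that each reads $g_i(x) \leq 0$. The constraints themselves ($x_1 + x_2 \geq 2$ and $x_1 \leq 3$) and the helper name `is_feasible` are assumptions made up for this example; they do not come from a specific problem in this section.

```python
# Hypothetical constraints, rewritten in canonical form g_i(x) <= 0.

def g1(x):
    # x1 + x2 >= 2  is equivalent to  2 - x1 - x2 <= 0
    return 2 - x[0] - x[1]

def g2(x):
    # x1 <= 3  is equivalent to  x1 - 3 <= 0
    return x[0] - 3

def is_feasible(x, constraints, tol=1e-9):
    # A point is feasible if every canonical constraint is non-positive
    # (up to a small numerical tolerance).
    return all(g(x) <= tol for g in constraints)

print(is_feasible([1.0, 1.5], [g1, g2]))  # True: g1 = -0.5, g2 = -2.0
print(is_feasible([0.5, 0.5], [g1, g2]))  # False: g1 = 1.0 > 0
```

A *greater or equal* constraint is handled by multiplying both sides by -1, and a non-zero right hand side is handled by moving it to the left hand side.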

Similarly, the canonical form of a **maximisation** problem is defined as: 

$\max f(x)$

$\text{s.t.}$

$g_i(x) \leq 0 \quad \forall i = 1, ..., m$

Note that, apart from the direction of optimisation (maximise instead of minimise), the canonical form is the same as for minimisation problems.

## Kuhn-Tucker Conditions
The Kuhn-Tucker (KT) conditions are a set of equations and inequalities, involving first order partial derivatives, that can be used to find the optimal value of a constrained NLP. They are based on the **Lagrangian** defined below:

### Lagrangian
Given an optimisation problem with m constraints in the canonical form, let us define the **Lagrangian** function as: 

$\text{L}(x,\lambda) =  f(x) + \sum_{i=1}^{m}{\lambda_i*g_i(x)}$

That is, to take into account the constraints in the objective function, we add the functions in the left hand sides of the constraints, each multiplied by a coefficient noted as $\lambda_i$. These coefficients are known as Lagrangian multipliers.
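The definition of the Lagrangian can be sketched in a few lines of Python. The objective `f`, the constraint `g1`, and the function name `lagrangian` are hypothetical and chosen only for illustration: we assume the problem $\min (x_1 - 2)^2 + (x_2 - 1)^2$ subject to $x_1 + x_2 - 2 \leq 0$.

```python
# Assumed example problem (not from the text):
#   min (x1 - 2)^2 + (x2 - 1)^2   s.t.   x1 + x2 - 2 <= 0

def f(x):
    return (x[0] - 2) ** 2 + (x[1] - 1) ** 2

def g1(x):
    return x[0] + x[1] - 2

def lagrangian(x, lam, f, constraints):
    # L(x, lambda) = f(x) + sum_i lambda_i * g_i(x)
    return f(x) + sum(l * g(x) for l, g in zip(lam, constraints))

# On the boundary of the constraint (g1 = 0) the Lagrangian equals f(x).
print(lagrangian([1.5, 0.5], [1.0], f, [g1]))  # 0.5
# Away from the boundary the multiplier term contributes: 0 + 3 * 1.
print(lagrangian([2.0, 1.0], [3.0], f, [g1]))  # 3.0
```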

### Gradient condition
Now, for any candidate solution to be an optimal solution, we know that, as for unconstrained NLP, it must be a critical point and thus satisfy: 

$\nabla_x \text{L}(x,\lambda) = 0$

That is, since each component of the gradient is the first order partial derivative of the Lagrangian with respect to the corresponding decision variable, the Lagrangian must satisfy:

$\frac{\partial \text{L}}{\partial x_j} = 0 \quad \forall j = 1, ..., n$
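The gradient condition can be checked numerically at a candidate point. The sketch below assumes the illustrative problem $\min (x_1 - 2)^2 + (x_2 - 1)^2$ subject to $x_1 + x_2 - 2 \leq 0$, and approximates each partial derivative with a central finite difference. The function names, the step `h`, and the tolerance `tol` are assumptions for this example, not part of the method itself.

```python
# Assumed example problem (not from the text):
#   min (x1 - 2)^2 + (x2 - 1)^2   s.t.   x1 + x2 - 2 <= 0

def f(x):
    return (x[0] - 2) ** 2 + (x[1] - 1) ** 2

def g1(x):
    return x[0] + x[1] - 2

def lagrangian(x, lam):
    return f(x) + lam * g1(x)

def grad_x(x, lam, h=1e-6):
    # Central finite-difference approximation of dL/dx_j for each j.
    grad = []
    for j in range(len(x)):
        up = list(x)
        down = list(x)
        up[j] += h
        down[j] -= h
        grad.append((lagrangian(up, lam) - lagrangian(down, lam)) / (2 * h))
    return grad

def satisfies_gradient_condition(x, lam, tol=1e-6):
    # The condition holds if every partial derivative is (numerically) zero.
    return all(abs(d) <= tol for d in grad_x(x, lam))

print(satisfies_gradient_condition([1.5, 0.5], 1.0))  # True
print(satisfies_gradient_condition([2.0, 1.0], 1.0))  # False
```

For this assumed problem, the analytical derivatives are $2(x_1 - 2) + \lambda_1$ and $2(x_2 - 1) + \lambda_1$, which both vanish at $x = [1.5, 0.5]$ with $\lambda_1 = 1$.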

### Feasibility condition 
Additionally, we must ensure that the solution is **feasible**, i.e. that it meets all the constraints. Therefore, the feasibility condition yields:

$g_i(x) \leq 0 \quad \forall i = 1, ..., m$

Note that, given the expression of the Lagrangian, this is equivalent to: 

$\frac{\partial \text{L}}{\partial \lambda_i} = g_i(x) \leq 0 \quad \forall i = 1, ..., m$
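Because the Lagrangian is linear in each multiplier, its partial derivative with respect to $\lambda_i$ is simply $g_i(x)$. The sketch below verifies this numerically for an assumed single-constraint problem ($\min (x_1 - 2)^2 + (x_2 - 1)^2$ subject to $x_1 + x_2 - 2 \leq 0$); the function names and the step `h` are hypothetical choices for this illustration.

```python
# Assumed example problem (not from the text):
#   min (x1 - 2)^2 + (x2 - 1)^2   s.t.   x1 + x2 - 2 <= 0

def f(x):
    return (x[0] - 2) ** 2 + (x[1] - 1) ** 2

def g1(x):
    return x[0] + x[1] - 2

def lagrangian(x, lam):
    return f(x) + lam * g1(x)

def dL_dlam(x, lam, h=1e-6):
    # Central finite-difference approximation of dL/dlambda.
    return (lagrangian(x, lam + h) - lagrangian(x, lam - h)) / (2 * h)

x = [1.0, 0.5]
# The two printed values agree up to rounding error, and both are
# non-positive, so this point is feasible.
print(dL_dlam(x, 1.0), g1(x))
```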

### Orthogonality condition
Now, note that for the Lagrangian and the objective function to take exactly the same value at the optimal solution, the following conditions must be met:

$\lambda_i*g_i(x)=0 \quad \forall i = 1, ..., m$

That is, for each constraint, either the Lagrangian multiplier $\lambda_i$ is equal to zero or the corresponding left hand side $g_i(x)$ is equal to zero, meaning that the constraint holds with equality.
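The orthogonality condition can be checked constraint by constraint. The sketch below assumes two hypothetical constraints, $x_1 + x_2 - 2 \leq 0$ and $-x_1 \leq 0$, and a candidate point at which the first holds with equality while the second does not. The function name `orthogonality_holds` and the tolerance `tol` are assumptions for this example.

```python
# Hypothetical constraints in canonical form g_i(x) <= 0.

def g1(x):
    return x[0] + x[1] - 2  # holds with equality at x = [1.5, 0.5]

def g2(x):
    return -x[0]            # strictly negative at x = [1.5, 0.5]

def orthogonality_holds(x, lam, constraints, tol=1e-9):
    # Every product lambda_i * g_i(x) must be (numerically) zero.
    return all(abs(l * g(x)) <= tol for l, g in zip(lam, constraints))

x = [1.5, 0.5]
# The second constraint is not at equality, so its multiplier must be zero.
print(orthogonality_holds(x, [1.0, 0.0], [g1, g2]))  # True
# A non-zero multiplier on that constraint violates the condition.
print(orthogonality_holds(x, [1.0, 0.5], [g1, g2]))  # False
```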

### Non-positive, non-negativity conditions




