# Practice with the Lagrangian

First, read Section 5.1 of [Boyd and Vandenberghe, 2004](https://web.stanford.edu/~boyd/cvxbook/) about the Lagrange dual function.  (You can omit Section 5.1.6 for now, if you like.)  As you can see from the reading, for each optimization problem of the standard form

$$
\begin{align*}
\text{minimize}&\quad f_0(x)\\
\text{subject to}&\quad f_i(x)\le 0\quad i=1,2,\ldots,m\\
&\quad h_i(x) = 0\quad i=1, 2, \ldots,p
\end{align*}
$$

we define the Lagrangian

$$
L(x,\lambda,\nu) = f_0(x) + \sum_{i=1}^m\lambda_i f_i(x) + \sum_{i=1}^p\nu_ih_i(x).
$$

Note that $x$, $\lambda$ and $\nu$ are all vectors in general.  Make careful note of the direction of the inequality on $f_i(x) \le 0$.  That's important.  Sometimes you need to do a little work to convert the problem to standard form.  Also note that the Lagrangian isn't quite unique; for example, you have some freedom in choosing what order to put the constraints in, and the constraints $h_i(x)=0$ and $-h_i(x)=0$ are equivalent.
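
To make the definition concrete, here is a minimal Python sketch that assembles $L$ from lists of constraint functions.  The helper name `make_lagrangian` and the tiny one-variable problem are illustrative assumptions, not something taken from the reading.

```python
def make_lagrangian(f0, ineqs, eqs):
    """Return L(x, lam, nu) for a problem in standard form.

    f0    -- objective function
    ineqs -- list of functions f_i, each constraint meaning f_i(x) <= 0
    eqs   -- list of functions h_i, each constraint meaning h_i(x) == 0
    """
    def L(x, lam, nu):
        value = f0(x)
        value += sum(l * f(x) for l, f in zip(lam, ineqs))
        value += sum(n * h(x) for n, h in zip(nu, eqs))
        return value
    return L


# Hypothetical problem: minimize x^2 subject to x >= 1.
# In standard form the constraint reads 1 - x <= 0.
L = make_lagrangian(lambda x: x**2, [lambda x: 1 - x], [])

# At a feasible point with lam >= 0, the Lagrangian never exceeds the objective.
x_feasible = 3.0
print(L(x_feasible, [2.0], []))  # 9 + 2*(1 - 3) = 5.0
```

Note how the constraint $x \ge 1$ had to be flipped to $1 - x \le 0$ before it could be passed in; getting that sign wrong would change the Lagrangian.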

## Example

The Lagrangian for the problem

$$
\begin{align*}
\text{minimize}&\quad x_1^2 - 2x_1x_2\\
\text{subject to}&\quad 1 \le x_1 \le 8\\
&\quad x_1-x_2 \ge 7\\
&\quad x_1^2 + x_2^2 = 9
\end{align*}
$$

is

$$
L(x,\lambda,\nu) = x_1^2 - 2x_1x_2 + \lambda_1(1-x_1) + \lambda_2(x_1-8) + \lambda_3(7-x_1+x_2) + \nu_1(x_1^2 + x_2^2 - 9). 
$$

Make sure you understand why.
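
One way to see it is to first rewrite each constraint in the standard form $f_i(x)\le 0$ or $h_i(x)=0$.  The ordering of the constraints below is just one possible choice:

$$
\begin{align*}
1 \le x_1 &\iff f_1(x) = 1 - x_1 \le 0\\
x_1 \le 8 &\iff f_2(x) = x_1 - 8 \le 0\\
x_1 - x_2 \ge 7 &\iff f_3(x) = 7 - x_1 + x_2 \le 0\\
x_1^2 + x_2^2 = 9 &\iff h_1(x) = x_1^2 + x_2^2 - 9 = 0
\end{align*}
$$

Multiplying each $f_i$ by $\lambda_i$ and $h_1$ by $\nu_1$, then adding the results to the objective, gives the Lagrangian shown above.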

## 📝 Your turn



The following Markdown cell demonstrates what *typeset* answers (as opposed to coding answers) look like in a Jupyter notebook.  Notice that there is currently a placeholder that says "YOUR ANSWER HERE".  You need to **replace** that text with the correct answer.  In this case, you'll be using a Markdown cell.  You can (and should!) read more about Markdown formatting online.  Basically it provides a fairly simple text-based syntax for typeset text.  The Jupyter flavor of Markdown can also handle a fairly large subset of $\LaTeX$ commands.  You can view the source of the cell you are currently reading (just double-click it) to get an idea of how this works.

At this point in your careers, you should be developing good communications skills as well as creative problem-solving skills.  So, your typeset answers are expected to be clear, grammatically correct (when applicable), and nicely formatted.  If you are uncertain how to do something in the Jupyter interface, spend some time looking online for the answer.  You can also ask your instructor(s) for help, but it is good to practice finding answers from other sources as well.

So, here's your first typeset question for the class: write the Lagrangian for the optimization problem

$$
\begin{align*}
\text{minimize}&\quad x_1^2 + 3x_2\\
\text{subject to}&\quad 2\le x_1 \le 12\\
&\quad x_2 \ge 1\\
&\quad x_1 x_2 = 4
\end{align*}
$$

My Answer:

$$
L(x, \lambda, \nu) = x_1^2 + 3x_2 + \lambda_1(2-x_1) + \lambda_2(x_1-12) + \lambda_3(1-x_2) + \nu_1(x_1x_2-4)
$$

# Lagrangian duality

The next concept to understand is the Lagrange dual problem.  Every standard optimization problem (convex or not) has a natural *dual* optimization problem of the form
$$
\begin{align*}
\text{maximize}&\quad g(\lambda,\nu)\\
\text{subject to}&\quad \lambda \succeq 0
\end{align*}
$$
where $g(\lambda,\nu) = \inf_x L(x,\lambda,\nu)$ is the *Lagrange dual function*, which plays the role of the *dual objective*.

Read Sections 5.1 and 5.2 of Boyd and Vandenberghe.  You may also wish to take a look at 5.3 and 5.4, which give some helpful ways of looking at duality.  The following videos fill in a few details missing from the text.

<center>
<table>
    <tr>
        <td>
            <a href="https://youtu.be/vgw0qHyG4CY">
                <img src="https://img.youtube.com/vi/vgw0qHyG4CY/hqdefault.jpg"><br>
                Intro to the Lagrangian
            </a>
        </td>
        <td>
            <a href="https://youtu.be/ku-286oAXwc">
                <img src="https://img.youtube.com/vi/ku-286oAXwc/hqdefault.jpg"><br>
                Weak Duality
            </a>
        </td>
    </tr>
</table>
</center>
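
As a small numerical illustration of weak duality, the sketch below uses a made-up one-variable problem (minimize $x^2$ subject to $1-x\le 0$), which is an assumption for this example and does not come from the text or the videos.  Its Lagrangian is $L(x,\lambda)=x^2+\lambda(1-x)$, which is minimized over $x$ at $x=\lambda/2$, so the dual function has the closed form $g(\lambda)=\lambda-\lambda^2/4$.

```python
def g(lam):
    """Dual function of: minimize x^2 subject to 1 - x <= 0.

    L(x, lam) = x^2 + lam*(1 - x) is minimized over x at x = lam/2,
    which gives g(lam) = lam - lam^2/4.
    """
    return lam - lam**2 / 4


p_star = 1.0  # primal optimum, attained at x = 1

# Weak duality: every lam >= 0 gives a lower bound on p_star.
lams = [k / 10 for k in range(0, 51)]  # 0.0, 0.1, ..., 5.0
assert all(g(lam) <= p_star + 1e-12 for lam in lams)

# The best lower bound on this grid is the dual optimum d_star.
lam_best = max(lams, key=g)
print(lam_best, g(lam_best))  # 2.0 1.0
```

In this example the best lower bound equals the primal optimum, so the duality gap is zero.  That is not guaranteed in general; weak duality only promises $g(\lambda,\nu)\le p^*$.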

# The KKT conditions

Now you should read Section 5.5 about the KKT conditions.  Summarized in equation (5.49), the KKT conditions are a generalization of the [Lagrange multiplier theory](https://www.khanacademy.org/math/multivariable-calculus/applications-of-multivariable-derivatives/constrained-optimization/a/lagrange-multipliers-examples) you learned in Calc III.  In particular, notice that if there are no inequality constraints, you get exactly the equations you would solve in Calc III.

Solving an optimization problem using the KKT conditions is a little more complicated than what you've done before, because the inequality constraints introduce cases to consider.  A good technique is to guess which inequality constraints will be active.  If you assume a dual variable $\lambda_i$ is positive, then the complementary slackness condition $\lambda_i f_i(x)=0$ shows that the associated constraint $f_i(x)\le 0$ must hold with equality.  Conversely, if you assume a constraint is inactive ($f_i(x)<0$), then $\lambda_i$ must be zero.  For small problems, you can often make an educated guess and then check it against the remaining conditions.
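
As a small illustration of the guessing strategy, consider a made-up one-variable problem (not from the text): minimize $(x-2)^2$ subject to $x-1\le 0$.  The Lagrangian is $L(x,\lambda)=(x-2)^2+\lambda(x-1)$, so stationarity reads $2(x-2)+\lambda=0$.  There are two guesses to try:

$$
\begin{align*}
\text{Guess inactive } (\lambda=0):&\quad x = 2, \text{ which violates } x\le 1, \text{ so this guess fails.}\\
\text{Guess active } (x=1):&\quad \lambda = -2(1-2) = 2 \ge 0, \text{ so every KKT condition holds.}
\end{align*}
$$

Because this problem is convex, a point satisfying the KKT conditions is optimal, so $x^*=1$ with $\lambda^*=2$.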

You might also be interested in reading Sections 5.6 and 5.7.

## Example

Consider, for example, the optimization problem
$$
\begin{align*}
    \text{minimize}&\quad (x-a)^2 + (y-b)^2\\
    \text{subject to}&\quad 0\le x \le 1\\
    &\quad 0\le y\le 1.
\end{align*}
$$
Here $a$ and $b$ are parameters of the problem, not decision variables.  Now take a look at the example video below.  It shows how to derive the KKT conditions:
$$
\begin{align*}
    0 \le x^*,y^*\le 1 &\qquad\text{primal feasibility}\\
    \lambda_1^*,\ldots,\lambda_4^*\ge 0 &\qquad\text{dual feasibility}\\
    x^* = a + \frac{1}{2}\left(\lambda_1^*-\lambda_2^*\right) &\qquad\text{stationarity}\\
    y^* = b + \frac{1}{2}\left(\lambda_3^*-\lambda_4^*\right)\\
    \lambda_1^*x^* = 0,\quad \lambda_2^*(x^*-1)=0 &\qquad\text{complementary slackness}\\
    \lambda_3^*y^* = 0,\quad \lambda_4^*(y^*-1)=0
\end{align*}
$$

It also shows how to solve this system in two cases.  Take a look and see if you can get the hang of how the arguments work.  As a challenge, analyze the case $\lambda_1^*,\lambda_3^*>0$, $\lambda_2^*=\lambda_4^*=0$.  (As you might guess, this corresponds to the case that $a,b<0$.  See if you can prove it.)

<center>
<table>
    <tr>
        <td>
            <a href="https://youtu.be/f1pWxXduIqk">
                <img src="https://img.youtube.com/vi/f1pWxXduIqk/hqdefault.jpg"><br>
                KKT Example
            </a>
        </td>
    </tr>
</table>
</center>
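
When you work through a case by hand, it can help to check your candidate numerically.  The sketch below is one possible checker for the KKT system above; the function name `satisfies_kkt`, the tolerance, and the sample values $a=0.3$, $b=1.5$ are assumptions chosen for illustration.

```python
def satisfies_kkt(x, y, lam, a, b, tol=1e-9):
    """Check the KKT conditions of the box-constrained example.

    lam = (lam1, lam2, lam3, lam4) pairs with the constraints
    -x <= 0, x - 1 <= 0, -y <= 0, y - 1 <= 0, in that order.
    Returns True when every condition holds up to tol.
    """
    l1, l2, l3, l4 = lam
    primal = -tol <= x <= 1 + tol and -tol <= y <= 1 + tol
    dual = all(l >= -tol for l in lam)
    stationary = (abs(x - (a + (l1 - l2) / 2)) <= tol
                  and abs(y - (b + (l3 - l4) / 2)) <= tol)
    slack = all(abs(v) <= tol for v in
                (l1 * x, l2 * (x - 1), l3 * y, l4 * (y - 1)))
    return primal and dual and stationary and slack


# Sample case a = 0.3, b = 1.5: guess that only y <= 1 is active.
# Then y = 1 and stationarity gives lam4 = 2*(b - 1) = 1.
print(satisfies_kkt(0.3, 1.0, (0.0, 0.0, 0.0, 1.0), a=0.3, b=1.5))  # True

# Same point with all multipliers zero: stationarity in y fails.
print(satisfies_kkt(0.3, 1.0, (0.0, 0.0, 0.0, 0.0), a=0.3, b=1.5))  # False
```

A checker like this only confirms or rejects a candidate; deciding which constraints to guess as active is still up to you.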

## 📝 Your turn

Explain why it is impossible to have both $\lambda_1^*>0$ and $\lambda_2^*>0$ for this example.  (Hint: look at the complementary slackness conditions.)

If $\lambda_1^*>0$, the complementary slackness condition $\lambda_1^*x^*=0$ forces $x^*=0$.  If $\lambda_2^*>0$, the condition $\lambda_2^*(x^*-1)=0$ forces $x^*=1$.  Since $x^*$ cannot equal both $0$ and $1$, it is impossible to have both $\lambda_1^*>0$ and $\lambda_2^*>0$.

# 💻 Using cvxpy

One handy tool for solving small convex optimization problems is [cvxpy](http://www.cvxpy.org).  It may not be the right tool for very large or sophisticated problems, but it is so easy to use that it's a nice place to start.  A good way to learn about it is to solve a simple problem with it.  The examples on the website should give you a good template to base your solution on.

Using cvxpy, implement the function below.  It should take two numbers, $a$ and $b$, and output the optimal point $(x^*,y^*)$ solving the problem
$$
\begin{align*}
    \text{minimize}&\quad (x-a)^2 + (y-b)^2\\
    \text{subject to}&\quad x^2-1\le y\le x
\end{align*}
$$

Notice that you've been given the skeleton of a function.  You should implement the function to have the behavior described in the docstring.  Look for the point in the code that says

```
# YOUR CODE HERE
raise NotImplementedError()
```

You need to **remove** this code and replace it with the correct implementation.  The cell that follows the function is a testing cell.  Once your function is working correctly, the tests should all pass with no errors.  (Be sure to go back and re-read the text at the top of this notebook.  To ensure that your code is not using variables or functions that have inadvertently been removed from the notebook, you should always restart the kernel and run all of your code one last time before submitting it.)

```python
def solve_opt(a,b):
    """
    Uses cvxpy to solve the assigned optimization problem.

    Args:
        a: The 'a' parameter.
        b: The 'b' parameter.

    Returns:
        x,y: Tuple containing the optimal point.
    """
    
    import cvxpy as cvx
    
    # initialize decision variables
    x = cvx.Variable()
    y = cvx.Variable()

    # YOUR CODE HERE
    
    obj = cvx.Minimize((x-a)**2 + (y-b)**2)
    constr = [x**2-1 <= y, y <= x]
    prob = cvx.Problem(obj, constr)
    prob.solve()
    return x.value, y.value
```

```python
# run some tests on the solution
from numpy.testing import assert_almost_equal

solutions = [((0.0, 0.0), (1.0646712704950963e-05, -7.265375755273091e-06)), 
             ((1.0, 0.0), (0.9999790375808776, 7.67899185353985e-06)), 
             ((0.0, 1.0), (0.49999699408512543, 0.4999969938855593)), 
             ((-2.0, -1.5), (-0.6180339879194733, -0.6180339902459888)), 
             ((2.0, 0.3), (1.2619190379423255, 0.592439656055633)), 
             ((0.8, -1.3), (0.41235233211011496, -0.8299655541699843))]

for (a,b),s in solutions:
    assert_almost_equal(s,solve_opt(a,b),decimal=4)
    
print('All tests passed!')
```

```
All tests passed!
```
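
If you want an independent sanity check on the reference values in the test cell, a crude grid search over the feasible region is one option.  The sketch below uses only the standard library; the function name `grid_search`, the grid spacing, and the search window $[-2,2]^2$ are assumptions chosen for illustration (the window is large enough because $x^2-1\le x$ keeps the feasible $x$ roughly within $[-0.62, 1.62]$).

```python
def grid_search(a, b, points_per_unit=100, half_width=2):
    """Brute-force the problem on a grid with spacing 1/points_per_unit.

    Only grid points satisfying x^2 - 1 <= y <= x are considered.
    Returns the feasible grid point closest to (a, b).
    """
    best, best_val = None, float("inf")
    n = points_per_unit * half_width
    for i in range(-n, n + 1):
        x = i / points_per_unit
        for j in range(-n, n + 1):
            y = j / points_per_unit
            if x * x - 1 <= y <= x:
                val = (x - a) ** 2 + (y - b) ** 2
                if val < best_val:
                    best, best_val = (x, y), val
    return best


print(grid_search(0.0, 1.0))  # (0.5, 0.5)
```

The result agrees with the third test case, where the solver returns a point within about $10^{-5}$ of $(0.5, 0.5)$.  A grid search is only accurate to the grid spacing and scales poorly with dimension, so treat it as a check rather than a replacement for the solver.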
