### Lagrange Multipliers

This is a method for finding minima and maxima subject to equality constraints.

[This YouTube video](https://www.youtube.com/watch?v=yuqB-d5MjZA&list=PLSQl0a2vh4HC5feHa6Rc5c0wbRTx56nF7&index=93) and the one that follows do a nice job of summarising it:

![lagrangian function](Screenshot_2023-09-04_20-08-12.png)

Note that the circle is the equality constraint.

Then in that example:

$ \nabla f(x_m, y_m) = \lambda \nabla g(x_m, y_m) $

And $ \lambda $ is the Lagrange Multiplier.

The strategy is to find $ \nabla f $ and $ \nabla g $, then solve:

$ \nabla f = \lambda \nabla g $

In the example, $ \nabla f $ is a vector of partials:

$ \nabla f = \left[ \begin{matrix} 2xy \\ x^2 \end{matrix} \right] $

$ \nabla g = \left[ \begin{matrix} 2x \\ 2y \end{matrix} \right] $

And don't forget the constraint:

$ x^2 + y^2 = 1 $

Then the question is: what value of $ \lambda $ satisfies these conditions? It's handy to think of $ \lambda $ as a proportionality constant between the two gradients.
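To make the conditions above concrete, here is a minimal Python sketch for this example, taking $ f = x^2 y $ and $ g = x^2 + y^2 $ from the gradients shown. The candidate points and their $ \lambda $ values were worked out by hand for this note (they are not taken from the video), so treat them as an assumption that the code then checks. A brute-force scan around the circle is included as an independent sanity check.

```python
import math

def f(x, y):
    return x**2 * y

def grad_f(x, y):
    return (2 * x * y, x**2)

def grad_g(x, y):
    return (2 * x, 2 * y)

def residual(x, y, lam):
    """Largest violation of grad f = lam * grad g and of x^2 + y^2 = 1."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    return max(abs(fx - lam * gx), abs(fy - lam * gy), abs(x**2 + y**2 - 1))

# Hand-derived candidates (an assumption of this sketch), as (x, y, lambda):
# either x = 0 (so y = +/-1 and lambda = 0), or y = lambda (so x^2 = 2 y^2).
s = math.sqrt(2 / 3)
t = 1 / math.sqrt(3)
candidates = [
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (s, t, t), (-s, t, t),
    (s, -t, -t), (-s, -t, -t),
]

for x, y, lam in candidates:
    print(f"({x:+.4f}, {y:+.4f})  lambda={lam:+.4f}  "
          f"f={f(x, y):+.4f}  residual={residual(x, y, lam):.1e}")

best = max(candidates, key=lambda c: f(c[0], c[1]))

# Independent check: scan the unit circle and take the largest f found.
n = 100_000
grid_max = max(
    f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
    for k in range(n)
)
print("best candidate f:", f(best[0], best[1]), " grid max:", grid_max)
```

If the hand derivation is right, every residual should be close to zero and the best candidate value should agree with the grid maximum.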

### Lagrangian Functions

Another excellent video is this one https://www.youtube.com/watch?v=hQ4UNu1P2kw&list=PLSQl0a2vh4HC5feHa6Rc5c0wbRTx56nF7&index=97

It shows us how to package up the three elements of the Lagrange Multiplier method into a single function.

$ \mathcal{L}(x, y, \lambda) = f(x, y) - \lambda (g(x, y) - b) $

where $ b $ is the RHS of the constraint that was originally given.

So in the previous example, this becomes

$ \mathcal{L}(x, y, \lambda) = (x^2 y) - \lambda ((x^2 + y^2) - 1 ) $

And now we can calculate the gradient of the Lagrangian and set it to zero to find the stationary points:

$ \nabla \mathcal{L} = \vec 0 $

So this is:

$ \left[ \begin{matrix} \frac{\partial \mathcal{L}}{\partial x} \\ \frac{\partial \mathcal{L}}{\partial y} \\
\frac{\partial \mathcal{L}}{\partial \lambda} \end{matrix} \right] = \left[ \begin{matrix} 0 \\ 0 \\ 0 \end{matrix} \right]$
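
As a quick check, expanding each component for the Lagrangian of the previous example, $ \mathcal{L} = x^2 y - \lambda (x^2 + y^2 - 1) $, should give back the three pieces of the Lagrange Multiplier method:

$ \frac{\partial \mathcal{L}}{\partial x} = 2xy - 2\lambda x = 0 $

$ \frac{\partial \mathcal{L}}{\partial y} = x^2 - 2\lambda y = 0 $

$ \frac{\partial \mathcal{L}}{\partial \lambda} = -(x^2 + y^2 - 1) = 0 $

The first two rows are $ \nabla f = \lambda \nabla g $ written out component by component, and the third row is just the constraint $ x^2 + y^2 = 1 $.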

#### Example of using a Lagrangian function

Consider:

$ \min f(x_1, x_2) = x_1 - x_2^2 $

$ \text{s.t. } g(x_1, x_2) = x_1^2 + x_2^2 - 1 = 0 $

The Lagrangian is:

$ \mathcal{L}(x_1, x_2, \lambda) = x_1 - x_2^2 - \lambda (x_1^2 + x_2^2 - 1) $

But it's frequently more convenient to move the minus sign inside the bracket, which rearranges $ g $ (this is the same function, just written differently):

$ \mathcal{L}(x_1, x_2, \lambda) = x_1 - x_2^2 + \lambda (1 - (x_1^2 + x_2^2)) $

We wish to find out what happens when:

$ \nabla \mathcal{L} = \vec 0 $

$ \frac{\partial \mathcal{L}}{\partial x_1} = 1 - 2\lambda x_1 = 0 $ (DE 1)

$ \frac{\partial \mathcal{L}}{\partial x_2} = -2x_2 - 2\lambda x_2 = 0 $ (DE 2)

$ \frac{\partial \mathcal{L}}{\partial \lambda} = (x_1^2 + x_2^2) - 1 = 0 $ (DE 3)

The sign of the constraint in (DE 3) doesn't matter since it's an equality constraint. This is not the case for inequality constraints.

Now we can think about solving these. In other words, what are the conditions that make all three equations equal $ 0 $?

Take the second equation: $ -2 x_2(1+\lambda) = 0 $.

This is true if $ x_2 = 0 $ or if $ \lambda = -1 $.

Consider $ x_2 = 0 $: then (DE 3) implies $ x_1 = \pm 1 $ (notice $ x_1 $ is squared).

That means that $ (-1,0), (1,0) $ are stationary points.

But if you consider $ \lambda = -1 $ and have a look at (DE 1), it becomes $ 1 + 2x_1 = 0 $, which means $ x_1 = -\frac{1}{2} $.

Then from (DE 3) we find that for $ x_1 = -\frac{1}{2} $, $ x_2 = \pm \frac{\sqrt{3}}{2} $.

So that means we have another two stationary points: $ \left(-\frac{1}{2}, -\frac{\sqrt{3}}{2} \right), \left(-\frac{1}{2}, \frac{\sqrt{3}}{2} \right) $

_Note: it is a good habit, when you find a stationary point using two of the equations, to check that the third equality also holds_.
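
That check is easy to automate. The sketch below plugs each stationary point into all three equations (DE 1) to (DE 3) and evaluates the objective there. The $ \lambda $ values paired with $ (\pm 1, 0) $ are not stated above; they are an assumption of this sketch, obtained by solving (DE 1) for $ \lambda = \frac{1}{2 x_1} $.

```python
import math

def f(x1, x2):
    return x1 - x2**2

def grad_L(x1, x2, lam):
    """Left-hand sides of (DE 1), (DE 2), (DE 3)."""
    return (
        1 - 2 * lam * x1,
        -2 * x2 - 2 * lam * x2,
        x1**2 + x2**2 - 1,
    )

r = math.sqrt(3) / 2
# (x1, x2, lambda); lambda for the first two points comes from (DE 1).
stationary = [
    (1.0, 0.0, 0.5),
    (-1.0, 0.0, -0.5),
    (-0.5, r, -1.0),
    (-0.5, -r, -1.0),
]

residuals = [max(abs(c) for c in grad_L(*p)) for p in stationary]
values = [f(x1, x2) for x1, x2, _ in stationary]

for (x1, x2, lam), res, val in zip(stationary, residuals, values):
    print(f"({x1:+.3f}, {x2:+.3f})  lambda={lam:+.1f}  "
          f"f={val:+.3f}  residual={res:.1e}")

print("smallest f among stationary points:", min(values))
```

Comparing the printed values of $ f $ shows which stationary points are candidates for the constrained minimum and which for the maximum.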

And now, we can see where the objective function touches the optimality condition below:

![objective function and optimality conditions](Screenshot_2023-09-05_12-27-09.png)
