Tutorial written by Jacob M. Dean.

### An explanation of the minimize function

In the exercise for this course you will need the minimize function from the scipy.optimize module. It is imported with the following line:

In [1]:
from scipy.optimize import minimize

In this tutorial we will briefly go through how the minimize function works and some issues you may encounter when using it.

From your previous studies you should know that the function $f(x) = x^2$ looks like this:

<img src="img/image_1.png" alt="drawing" width="400"/>

This clearly has a minimum at $x=0$. We can use the minimize function to find this minimum. But first we must define the function in Python.

In [2]:
def square(x):
    """
    Description: a function that squares a number
    
    Args:
    x (int/float): the number to be squared
    
    Returns:
    x_squared (int/float): x squared
    """
    x_squared = x**2
    return x_squared

The minimize function requires a function to minimize and an initial guess. We know this because the documentation tells us so (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html), or alternatively we could type `minimize?` in a notebook cell, which brings up the function description. Let's arbitrarily choose an initial guess of 4. We can now find the minimum of the function square by typing:

In [3]:
minimum_squared = minimize(square, 4)

The output of the minimize function is stored in the minimum_squared variable. Let's see what information has been created:

In [4]:
minimum_squared

      fun: 6.668345606604765e-17
 hess_inv: array([[0.5]])
      jac: array([-1.43082657e-09])
  message: 'Optimization terminated successfully.'
     nfev: 12
      nit: 3
     njev: 4
   status: 0
  success: True
        x: array([-8.16599388e-09])

This is a lot of information, but the three pieces you need are fun, message, and x. These represent:
- fun: the value of the function at the solution given.
- message: this tells you whether the minimize function has converged on a solution.
- x: this is the solution the minimize function has found.
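
The fields listed above can be read as attributes of the returned object. The sketch below redefines `square` so that it runs on its own; the variable name `result` is only illustrative, and `success` is the True/False flag shown in the output above.

```python
from scipy.optimize import minimize


def square(x):
    """Return x squared."""
    return x**2


# Minimize square starting from an initial guess of 4.
result = minimize(square, 4)

# Check convergence before trusting the solution.
if result.success:
    print("Converged:", result.message)
    print("Solution x =", result.x[0])
    print("f(x) at the solution =", result.fun)
else:
    print("Did not converge:", result.message)
```

Checking `success` (or reading `message`) before using `x` is a cheap way to avoid working with a solution that never converged.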

The solution that has been found has a magnitude of about $8 \times 10^{-9}$, which is basically zero. So the minimize function has found our minimum correctly! In order to obtain our solution we can type:

In [5]:
minimum_squared.x[0] #the [0] is required as our solution is stored in an array. We will see why later.

-8.165993881092959e-09

But what has the minimize function done to get from $x=4$ to the solution at $x=0$? If we imagine placing a ball at $x=4$ on our graph then it will "roll down the hill" until it gets to the bottom as follows:

<img src="img/image_2.png" alt="drawing" width="800"/>
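
The rolling-ball picture can be sketched as a simple gradient descent: repeatedly take a small step downhill, in proportion to the slope. This is only an illustration of the idea and not what minimize does internally (for an unconstrained problem like this one it defaults to the more sophisticated BFGS method). The function name `roll_downhill`, the step size, and the number of steps are all hypothetical choices.

```python
def roll_downhill(gradient, start, step_size=0.1, n_steps=100):
    """Follow the negative slope from start for n_steps steps."""
    x = start
    for _ in range(n_steps):
        # Move against the slope: downhill to the left if the slope is
        # positive, downhill to the right if it is negative.
        x = x - step_size * gradient(x)
    return x


def square_gradient(x):
    """Slope of f(x) = x**2, which is 2x."""
    return 2 * x


# Place the "ball" at x=4 and let it roll.
final_position = roll_downhill(square_gradient, start=4)
print(final_position)
```

Each step shrinks the distance to $x=0$, so the final position ends up very close to the bottom of the curve.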

This is in effect how the minimize function finds a minimum. $f(x)=x^2$ has only one minimum, but what happens if we have more than one? Take the function $f(x)=(x+1)^2(x-1)^2$, which looks like:

<img src="img/image_3.png" alt="drawing" width="400"/>

This has two minima, one at $x=+1$ and one at $x=-1$. Where does the ball go now? It depends on where we start, that is, on our initial guess:

<img src="img/image_4.png" alt="drawing" width="800"/>

So if our initial guess is less than 0 we expect to obtain the minimum at $x=-1$, and if our initial guess is greater than 0 we expect to obtain the minimum at $x=1$. Let's see if this is the case by writing a function for $f(x)=(x+1)^2(x-1)^2$ and using the minimize function:

In [6]:
def example_polynomial(x):
    """
    Description: a function that evaluates (x+1)(x+1)(x-1)(x-1)
    
    Args:
    x (int/float): the number to be evaluated
    
    Returns:
    evaluated_value (int/float): (x+1)(x+1)(x-1)(x-1)
    """
    evaluated_value = (x+1)*(x+1)*(x-1)*(x-1)
    return evaluated_value

greater_than_minimum = minimize(example_polynomial, +2)
less_than_minimum = minimize(example_polynomial, -2)

print("If we start at x=2, the minimum we find is x=", greater_than_minimum.x[0], "and f(x) =",greater_than_minimum.fun)
print("If we start at x=-2, the minimum we find is x=", less_than_minimum.x[0], "and f(x) =",less_than_minimum.fun)

If we start at x=2, the minimum we find is x= 0.9999989561808204 and f(x) = 4.358229369988024e-12
If we start at x=-2, the minimum we find is x= -0.9999989710772856 and f(x) = 4.23472345132519e-12


These are the solutions that we would expect from our ball rolling example. Here both minima have a value of zero. What happens if we have two minima which have different values though? This is the case for $f(x)=(2x+1)(x+1)(x-1)(x-2)$, which looks like:

<img src="img/image_5.png" alt="drawing" width="400"/>

Where will the ball roll to now? 

<img src="img/image_6.png" alt="drawing" width="800"/>

Let's see if these predictions are correct by writing this polynomial as a function and initialising our minimization at $x=-3$ and $x=+3$:

In [7]:
def second_example_polynomial(x):
    """
    Description: a function that evaluates (2x+1)(x+1)(x-1)(x-2)
    
    Args:
    x (int/float): the number to be evaluated
    
    Returns:
    evaluated_value (int/float): (2x+1)(x+1)(x-1)(x-2)
    """
    evaluated_value = (2*x+1)*(x+1)*(x-1)*(x-2)
    return evaluated_value

second_greater_than_minimum = minimize(second_example_polynomial, +3)
second_less_than_minimum = minimize(second_example_polynomial, -3)

print("If we start at x=3, the minimum we find is x=", second_greater_than_minimum.x[0], "and f(x) =",second_greater_than_minimum.fun)
print("If we start at x=-3, the minimum we find is x=", second_less_than_minimum.x[0], "and f(x) =",second_less_than_minimum.fun)

If we start at x=3, the minimum we find is x= 1.6029119759082024 and f(x) = -2.620904951418587
If we start at x=-3, the minimum we find is x= -0.7784457524125226 and f(x) = -0.6096677436637878


These are the two minima that we would expect from $f(x)=(2x+1)(x+1)(x-1)(x-2)$. However, note that the value of $f(x)$ is lower at $x=1.603$ than at $x=-0.778$. This is a very important observation. We often want to find the **global minimum**, which is the lowest value of $f(x)$. But as we have seen, the minimize algorithm can get stuck in **local minima**, like the one at $x=-0.778$ above (where $f(x)=-0.610$), if we "start the ball rolling" in the wrong place.
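
One common, though not foolproof, way to reduce the risk of settling in a local minimum is to run minimize from several initial guesses and keep the result with the lowest `fun`. The sketch below assumes the starting points -3, -1, 1 and 3, which are arbitrary choices; it cannot guarantee that the global minimum is found.

```python
from scipy.optimize import minimize


def second_example_polynomial(x):
    """Evaluate (2x+1)(x+1)(x-1)(x-2)."""
    return (2*x + 1)*(x + 1)*(x - 1)*(x - 2)


# Arbitrary starting points spread across the region of interest.
initial_guesses = [-3, -1, 1, 3]
results = [minimize(second_example_polynomial, guess) for guess in initial_guesses]

# Keep the run that reached the lowest function value.
best = min(results, key=lambda result: result.fun)
print("Lowest minimum found at x =", best.x[0], "with f(x) =", best.fun)
```

The more starting points are tried, the more likely it is that one of them rolls into the deepest valley, at the cost of more function evaluations.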

Up until now we have just used functions of one variable, $f(x)$. What about functions of two or more variables? Can the minimize function find the minima of those functions? Let's think about the simplest of these functions, $g(x,y) = x^2 + y^2$, which looks like:

<img src="img/image_7.png" alt="drawing" width="400"/>

It may be difficult to see, but $g(x,y) = x^2 + y^2$ is in essence a bowl, with the bottom of the bowl at $x=0, y=0$. But can the minimize function find this minimum? Yes, it can! We just need to give it the information in the correct way.

**The minimize function can only optimize a single argument of the function it is given.** How can we fit two variables, $(x,y)$, into one argument? A list is a good solution to this problem. We can use something like variables = [x,y]. Let's write a function to evaluate $g(x,y)$, optimize it starting at $x=2,y=-2$, and see what result we obtain using the minimize function.

In [8]:
def two_dimensional_polynomial(variables):
    """
    Description: a function that evaluates x**2 + y**2
    
    Args:
    variables (list): a list containing x as the first item and y as the second item
    
    Returns:
    evaluated_value (float/int): x**2 + y**2
    """
    evaluated_value = variables[0]**2 + variables[1]**2
    return evaluated_value

two_dimensional_minimum = minimize(two_dimensional_polynomial, [2,-2])
two_dimensional_minimum

      fun: 1.3999347093356855e-14
 hess_inv: array([[0.75000002, 0.25      ],
       [0.25      , 0.74999998]])
      jac: array([-1.63721946e-07, -1.40312156e-07])
  message: 'Optimization terminated successfully.'
     nfev: 12
      nit: 2
     njev: 3
   status: 0
  success: True
        x: array([-8.93115537e-08, -7.76066587e-08])

We can see that we have found our minimum successfully! Our x and y values are basically zero. As before, our solution is stored in two_dimensional_minimum.x. **The order in which the solutions are stored is the same as the input into the function being minimized.** So two_dimensional_minimum.x[0] corresponds to the x value at our minimum and two_dimensional_minimum.x[1] corresponds to the y value at our minimum, as this is the input order specified in two_dimensional_polynomial.
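
To illustrate the ordering, the sketch below uses a hypothetical function, $(x-1)^2 + (y+2)^2$, whose minimum lies at $x=1, y=-2$, so the two entries of the solution are easy to tell apart. The function name `shifted_bowl` and the initial guess of [0, 0] are assumptions made for this example.

```python
from scipy.optimize import minimize


def shifted_bowl(variables):
    """Evaluate (x-1)**2 + (y+2)**2 with variables = [x, y]."""
    x, y = variables
    return (x - 1)**2 + (y + 2)**2


shifted_minimum = minimize(shifted_bowl, [0, 0])

# Unpack in the same order as the input: x first, then y.
x_at_minimum, y_at_minimum = shifted_minimum.x
print("x =", x_at_minimum, "y =", y_at_minimum)
```

Unpacking the solution in the same order as the function unpacks its input keeps the two from getting mixed up.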

The minimize function can be used in any number of dimensions; it is not limited to one or two. In fact, that is why the solutions are always stored in arrays: the minimize function is designed to work in multiple dimensions.

**Key takeaways:**
- you can check whether the minimize function has converged by checking the output information.
- the solution the minimize function finds depends on your initial guess.
- the minimize function can only optimize a single argument, but you can optimize several variables by storing them in a list.