# Discussion 1: Overview of Optimization

In this discussion, we will talk about:
* The general task of optimization
* The different types of optimization problems and algorithms

---

The entire field of **optimization** concerns one broad topic only:

>Given some (usually scalar) function $f(\mathbf{x})$ of in general many variables (i.e. $\mathbf{x}\in\mathbb{R}^n$, $n\ge1$), find a point $\mathbf{x}^*$ such that $f(\mathbf{x}^*)\le f(\mathbf{x})$ for all other $\mathbf{x}$, i.e. $\mathbf{x}^*$ is a minimizer of $f$.

This may seem like a very niche and abstract topic, but a large number of problems in many different fields can be essentially boiled down to this idea. For several interesting examples of optimization problems in various fields, see [here](https://neos-guide.org/Case-Studies).

Mathematically, characterizing a minimum of a (twice continuously differentiable) function, even one with many inputs, is straightforward. A **sufficient condition** for $\mathbf{x}^*$ to be (at least a local) minimizer is that
1. $\nabla f(\mathbf{x}^*)=\mathbf{0}$.
2. $\nabla^2 f(\mathbf{x}^*)$ is a [symmetric positive definite (SPD)](https://en.wikipedia.org/wiki/Definite_symmetric_matrix) matrix.
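
The two conditions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `satisfies_sufficient_conditions`, the tolerance `tol`, and the test function $f(\mathbf{x})=x_1^2+3x_2^2$ are illustrative choices, not part of the course material. Positive definiteness is tested by attempting a Cholesky factorization, which succeeds only for SPD matrices.

```python
import numpy as np

def satisfies_sufficient_conditions(grad, hess, x, tol=1e-8):
    """Return True if grad(x) is (numerically) zero and hess(x) is SPD."""
    g = np.asarray(grad(x), dtype=float)
    H = np.asarray(hess(x), dtype=float)
    # Condition 1: the gradient vanishes (up to the tolerance).
    if np.linalg.norm(g) > tol:
        return False
    # Condition 2a: the Hessian is symmetric.
    if not np.allclose(H, H.T):
        return False
    # Condition 2b: the Hessian is positive definite.
    # Cholesky raises LinAlgError if the matrix is not positive definite.
    try:
        np.linalg.cholesky(H)
    except np.linalg.LinAlgError:
        return False
    return True

# Hypothetical test function: f(x) = x1^2 + 3*x2^2, minimized at the origin.
quad_grad = lambda x: np.array([2.0 * x[0], 6.0 * x[1]])
quad_hess = lambda x: np.array([[2.0, 0.0], [0.0, 6.0]])

print(satisfies_sufficient_conditions(quad_grad, quad_hess, np.array([0.0, 0.0])))
print(satisfies_sufficient_conditions(quad_grad, quad_hess, np.array([1.0, 0.0])))
```

Note that the conditions are sufficient, not necessary: a point that fails this check (for example, one where the Hessian is only positive semidefinite) may still be a minimizer.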

In practice, however, finding such an $\mathbf{x}^*$ can be extremely difficult, particularly if the system of equations $\nabla f(\mathbf{x})=\mathbf{0}$ is highly nonlinear. This entire class is devoted to
1. Describing various types of optimization (constrained vs. unconstrained, continuous vs. discrete, local vs. global, linear vs. nonlinear, etc.). See [here](https://neos-guide.org/content/optimization-taxonomy) for an excellent graphic "taxonomy" of optimization types.
2. Outlining various algorithms for solving optimization problems of some of the types. See [here](https://neos-guide.org/content/algorithms-by-type) for a (nonexhaustive) list of algorithms.

We will mainly focus on *continuous, unconstrained* optimization, though we will discuss a few continuous, constrained optimization problems such as [linear programming](https://en.wikipedia.org/wiki/Linear_programming), which has many applications, e.g. in business and economics. We will not discuss discrete optimization, which is in general a much more difficult problem.

---

### Example problem: [Rosenbrock function](https://en.wikipedia.org/wiki/Rosenbrock_function)

Most problems (e.g. on homework) in this class will require programming, but here is one we can do by hand:

**Problem:** Compute the gradient $\nabla f(\mathbf{x})$ and Hessian $\nabla^2 f(\mathbf{x})$ of the Rosenbrock function

$$f(\mathbf{x}) = 100(x_2 - x_1^2)^2 + (1-x_1)^2$$

and show that $\mathbf{x}^*=(1, 1)$ is the only local minimizer of this function, and that the Hessian matrix at that point is positive definite.

**Solution:** Since $\nabla f(\mathbf{x})=\left\langle \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}\right\rangle$, and

$$ \begin{align*}
    \frac{\partial f}{\partial x_1} & = -400x_1(x_2-x_1^2)-2(1-x_1)\\
    \frac{\partial f}{\partial x_2} & = 200(x_2-x_1^2)
\end{align*} $$

to satisfy $\nabla f=\mathbf{0}$, we must have from the second component, $200(x_2-x_1^2)=0\implies x_2=x_1^2$. Plugging this into the equation for the first component, the first term vanishes and we have $-2(1-x_1)=0\implies x_1=1$, so $x_2=1^2=1$. Thus at $\mathbf{x}^*=(1,1)$ and *only* at this $\mathbf{x}^*$, $\nabla f(\mathbf{x}^*)=\mathbf{0}$. Since every local minimizer of a differentiable function must satisfy $\nabla f=\mathbf{0}$, this is the only candidate for a local minimizer.
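
As a quick sanity check on the partial derivatives above, the analytic gradient can be compared against a central finite-difference approximation. This is only a sketch, assuming NumPy; the function names, the step size `h`, and the test point $(-1.2, 1)$ are illustrative choices.

```python
import numpy as np

def rosenbrock(x):
    """Rosenbrock function f(x) = 100 (x2 - x1^2)^2 + (1 - x1)^2."""
    x1, x2 = x
    return 100.0 * (x2 - x1**2) ** 2 + (1.0 - x1) ** 2

def rosenbrock_grad(x):
    """Analytic gradient, copied from the partial derivatives above."""
    x1, x2 = x
    return np.array([
        -400.0 * x1 * (x2 - x1**2) - 2.0 * (1.0 - x1),
        200.0 * (x2 - x1**2),
    ])

def central_difference_grad(f, x, h=1e-6):
    """Approximate the gradient of f at x with central differences."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h  # perturb only the i-th coordinate
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x_test = np.array([-1.2, 1.0])  # arbitrary test point (an assumption)
print(rosenbrock_grad(x_test))
print(central_difference_grad(rosenbrock, x_test))

# The gradient should vanish at the stationary point (1, 1).
print(np.allclose(rosenbrock_grad(np.array([1.0, 1.0])), 0.0))
```

The two printed gradients should agree to several digits; a large discrepancy would point to an error in the hand-computed derivatives.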

For the Hessian, we have
$$ \nabla^2 f(\mathbf{x}) = \begin{bmatrix}
    \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1\partial x_2} \\
    \frac{\partial^2 f}{\partial x_2\partial x_1} & \frac{\partial^2 f}{\partial x_2^2}
\end{bmatrix} = \begin{bmatrix}
    -400x_1\cdot(-2x_1)-400(x_2-x_1^2)+2 & -400x_1 \\
    -400x_1 & 200
\end{bmatrix} = \begin{bmatrix}
    800x_1^2-400(x_2-x_1^2)+2 & -400x_1 \\
    -400x_1 & 200
\end{bmatrix} $$

so at $\mathbf{x}^*=(1,1)$, we have

$$ \nabla^2 f(1,1) = \begin{bmatrix}
    802 & -400 \\
    -400 & 200
\end{bmatrix} $$
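
The Hessian formula can be checked the same way, by differencing the analytic gradient. The sketch below assumes NumPy; the function names, the step size `h`, and the test point are illustrative, not prescribed by the problem.

```python
import numpy as np

def rosenbrock_grad(x):
    """Analytic gradient of the Rosenbrock function."""
    x1, x2 = x
    return np.array([
        -400.0 * x1 * (x2 - x1**2) - 2.0 * (1.0 - x1),
        200.0 * (x2 - x1**2),
    ])

def rosenbrock_hess(x):
    """Analytic Hessian, written in the same form as the derivation."""
    x1, x2 = x
    return np.array([
        [800.0 * x1**2 - 400.0 * (x2 - x1**2) + 2.0, -400.0 * x1],
        [-400.0 * x1, 200.0],
    ])

def central_difference_hess(grad, x, h=1e-6):
    """Approximate the Hessian by central differences of the gradient."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h  # perturb only the j-th coordinate
        H[:, j] = (grad(x + e) - grad(x - e)) / (2.0 * h)
    return H

x_test = np.array([-1.2, 1.0])  # arbitrary test point (an assumption)
print(rosenbrock_hess(x_test))
print(central_difference_hess(rosenbrock_grad, x_test))

# Evaluate the analytic Hessian at the stationary point (1, 1).
print(rosenbrock_hess(np.array([1.0, 1.0])))
```

Under these assumptions, the last line should reproduce the matrix $\nabla^2 f(1,1)$ shown above.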

The eigenvalues of this matrix are solutions to the characteristic equation

$$ (802-\lambda)(200-\lambda)-160000=0 \implies \lambda^2 - 1002\lambda + 400 = 0 \implies \lambda=501\pm\sqrt{250601},$$

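both of which are positive (the smaller one, $501-\sqrt{250601}\approx 0.399$, only barely!), so the Hessian matrix is positive definite. Together with $\nabla f(\mathbf{x}^*)=\mathbf{0}$, the sufficient condition is satisfied, so $\mathbf{x}^*=(1,1)$ is a local minimizer; because it is the only point where the gradient vanishes, it is the only one.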
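
The eigenvalue computation is easy to confirm numerically. A minimal sketch, assuming NumPy: `np.linalg.eigvalsh` computes the eigenvalues of a symmetric matrix and returns them in ascending order, and the variable names here are illustrative.

```python
import numpy as np

# Hessian of the Rosenbrock function at x* = (1, 1).
H = np.array([[802.0, -400.0],
              [-400.0, 200.0]])

# Eigenvalues of a symmetric matrix, returned in ascending order.
eigenvalues = np.linalg.eigvalsh(H)

# Closed-form roots of lambda^2 - 1002*lambda + 400 = 0.
closed_form = np.array([501.0 - np.sqrt(250601.0),
                        501.0 + np.sqrt(250601.0)])

print(eigenvalues)
print(closed_form)
print(bool(np.all(eigenvalues > 0)))  # positive definite if all eigenvalues > 0
```

The smaller eigenvalue is several thousand times smaller than the larger one, which is what "only barely positive" means here in relative terms.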