# Derivative-free optimization

In this set of notes, we discuss how to optimize a function when its derivative is unavailable, hard to obtain, noisy, or otherwise unreliable. This is common in engineering: the underlying physical models can have abrupt changes in properties, leading to non-smooth objective functions. The models can be too complicated to differentiate analytically, or the objective is evaluated by software that does not provide gradients (the software is a <i>black box</i>). Further, when each function evaluation is computationally heavy, numerical approximations of the derivative become expensive, since a finite-difference gradient in $n$ dimensions requires at least $n$ additional function evaluations. Lastly, the response surface of the model can be erratic, which makes the derivative unreliable for optimization purposes. Derivative-free methods can also be less prone to getting trapped in a local extremum of a multi-modal function.
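
To illustrate why a noisy objective makes numerical derivatives unreliable, here is a minimal sketch that is not part of the original notes. The test function $x^2$, the noise level, and the helper names `forward_difference`, `smooth`, and `noisy` are all assumptions chosen for illustration. The noise contributes an error of up to $2\varepsilon/h$ to the forward difference, where $\varepsilon$ is the noise amplitude, so shrinking the step $h$ makes the estimate worse rather than better.

```python
import random

def forward_difference(f, x, h):
    """Forward-difference estimate of f'(x) with step h."""
    return (f(x + h) - f(x)) / h

rng = random.Random(0)  # fixed seed so the run is repeatable
noise_level = 1e-3      # assumed amplitude of the evaluation noise

def smooth(x):
    """Noise-free test function; its exact derivative is 2*x."""
    return x**2

def noisy(x):
    """Same function, but every evaluation carries a small random error."""
    return x**2 + noise_level * rng.uniform(-1.0, 1.0)

x0 = 1.0  # the exact derivative of x**2 at x0 is 2
for h in (1e-1, 1e-3, 1e-5):
    d_smooth = forward_difference(smooth, x0, h)
    d_noisy = forward_difference(noisy, x0, h)
    print(f"h = {h:.0e}: smooth {d_smooth:.4f}, noisy {d_noisy:.4f}")
```

For the smooth function the estimate approaches 2 as $h$ decreases. For the noisy function the worst-case noise error $2\varepsilon/h$ grows from $0.02$ at $h = 10^{-1}$ to $200$ at $h = 10^{-5}$.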

Again, let $F \colon \mathbb{R}^n \to \mathbb{R}$ be a multi-variable real-valued function (our objective function). We want to find a point $x_o \in \mathbb{R}^n$ such that $F(x_o) \leq F(x)$ for all $x \in \mathbb{R}^n$ (or, conversely, to find a maximum: a point $x_o \in \mathbb{R}^n$ such that $F(x_o) \geq F(x)$ for all $x \in \mathbb{R}^n$). The derivative-based methods use the local gradient to find the direction of steepest descent (ascent), and then move in that direction.

As the name indicates, derivative-free methods do not use the derivative in the optimization; they use only function evaluations. They might still use steps that resemble numerical derivatives (see the upcoming pattern search algorithm). Instead of derivatives, they typically rely on heuristics or metaheuristics. A <i>heuristic</i> is a procedure for solving a problem either more quickly, when classical methods are too slow, or approximately, when classical methods fail to find an exact solution. A <i>metaheuristic</i> is a higher-level, problem-independent strategy for guiding such a search.
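
As a concrete picture of "function evaluations only", the following is a minimal sketch of a random local search. It is not the pattern search algorithm of the next note, and the function name `random_local_search`, its parameters `step`, `n_iter`, and `seed`, and the non-smooth test objective are assumptions made for illustration. The method proposes a random neighbour of the current best point and keeps it only if the objective value decreases; no derivative is ever computed.

```python
import random

def random_local_search(f, x0, step=0.5, n_iter=2000, seed=0):
    """Minimize f using function evaluations only.

    At each iteration a random neighbour of the best point so far is
    proposed, and it is accepted only if it lowers the objective.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    x_best = list(x0)
    f_best = f(x_best)
    for _ in range(n_iter):
        candidate = [xi + rng.uniform(-step, step) for xi in x_best]
        f_cand = f(candidate)
        if f_cand < f_best:  # greedy acceptance: keep improvements only
            x_best, f_best = candidate, f_cand
    return x_best, f_best

def objective(x):
    """Non-smooth test function with minimum value 0 at (1, -2)."""
    return abs(x[0] - 1.0) + abs(x[1] + 2.0)

x_min, f_min = random_local_search(objective, x0=[0.0, 0.0])
print("best point:", x_min, "objective value:", f_min)
```

The objective has kinks along the lines $x_1 = 1$ and $x_2 = -2$, where the gradient does not exist, yet the search still makes progress because it only compares function values. Note that `step` and `n_iter` are input parameters of the kind that may need tuning for a given problem.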

We can divide the metaheuristic algorithms into two classes, based on the number of candidate solutions used in each iteration. Single-solution-based metaheuristic algorithms update one candidate per iteration; an example is simulated annealing, which we will explore in an upcoming note. Population-based metaheuristic algorithms maintain a population of candidates that interact at each iteration. These include evolutionary algorithms such as the genetic algorithm (GA) and swarm-intelligence algorithms such as particle swarm optimization (PSO). Both will be considered in the upcoming notes.
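
The structural difference between the two classes can be sketched with two toy update rules. These are deliberately simplified and are not simulated annealing, the GA, or PSO; the function names `single_solution_step` and `population_step`, the parameters `step`, `pull`, and `jitter`, and the one-dimensional objective are all hypothetical choices for illustration.

```python
import random

def objective(x):
    """One-dimensional test function with its minimum at x = 2."""
    return (x - 2.0) ** 2

def single_solution_step(x, rng, step=0.5):
    """One candidate: propose a neighbour and keep it only if it is better."""
    candidate = x + rng.uniform(-step, step)
    return candidate if objective(candidate) < objective(x) else x

def population_step(population, rng, pull=0.5, jitter=0.1):
    """Many candidates: each moves partway toward the current best member
    (the interaction), plus a small random perturbation."""
    best = min(population, key=objective)
    moved = [x + pull * (best - x) + rng.uniform(-jitter, jitter)
             for x in population]
    # Replace the worst moved candidate with the previous best, so the
    # best objective value in the population never gets worse.
    worst = max(range(len(moved)), key=lambda i: objective(moved[i]))
    moved[worst] = best
    return moved

rng = random.Random(1)  # fixed seed so the run is repeatable
x = 10.0
population = [rng.uniform(-10.0, 10.0) for _ in range(8)]
for _ in range(200):
    x = single_solution_step(x, rng)
    population = population_step(population, rng)

print("single-solution result:", x)
print("population best:", min(population, key=objective))
```

In the single-solution loop the state is one point, whereas in the population loop every candidate's update depends on information from the other candidates, here through the current best member.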

Derivative-free optimization methods often have input parameters that must be tuned to obtain good convergence. In general, different problems may call for different methods, so derivative-free optimization is a bit of an art, unfortunately.


[Previous note](ludecomposition.ipynb) -- [Next note](patternSearch.ipynb)