.. meta::
   :description: Differential Evolution (DE) is an Evolutionary Algorithm (EA) originally designed for solving optimization problems over continuous domains. It is simple to implement yet effective, which has made it one of the most popular population-based algorithms, with many successful applications reported.

.. meta::
   :keywords: Differential Evolution, DE, Multi-modal Optimization, Nature-inspired Algorithm, Single-objective Optimization, Python

# DE: Differential Evolution

Differential evolution <cite data-cite="de_article"></cite> is an Evolutionary Algorithm (EA) originally designed for solving optimization problems over continuous domains. It is simple to implement yet effective, which has made it one of the most popular population-based algorithms, with many successful applications reported.

From its original conception, DE was designed to fulfill some requirements that have made it particularly useful:

- Ability to handle non-differentiable, nonlinear, and multimodal cost functions.
- Parallelizability to cope with computationally intensive cost functions.
- Ease of use: few control variables to steer the minimization. These variables should also be robust and easy to choose.
- Good convergence properties: consistent convergence to the global minimum in consecutive independent trials.

Several DE variants have been proposed in the literature (including multi-objective variants). Most of them share some common operations when producing offspring for the next generation. A detailed overview of DE mechanisms can be found in <cite data-cite="de_book"></cite>.

At each generation, $N$ new individuals (as many as the population size) are produced by two operations originally defined as *mutation* and *crossover*. Note that *mutation* in DE is conceptually different from the usual definition in genetic algorithms; in pymoo, it is implemented as a *Crossover* operator.

Several reproduction schemes have been proposed for DE. Usually, they are denoted DE/*x*/*y*/*z*, in which *x* corresponds to the mutation *parent selection* scheme, *y* to the number of difference vectors in *mutation*, and *z* to the *crossover* strategy.
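
To make the DE/*x*/*y*/*z* notation concrete, the short sketch below splits such a string into its three components. The helper `parse_variant` is hypothetical and written only for illustration; it is not part of pymoo's API.

```python
def parse_variant(variant):
    """Split a 'DE/x/y/z' string into (parent selection, number of difference vectors, crossover)."""
    prefix, selection, n_diffs, crossover = variant.split("/")
    if prefix != "DE":
        raise ValueError(f"expected a string starting with 'DE/', got {variant!r}")
    return selection, int(n_diffs), crossover


selection, n_diffs, crossover = parse_variant("DE/rand/1/bin")
print(selection, n_diffs, crossover)  # rand 1 bin
```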

Probably the most popular mutation scheme is DE/rand/1, given by the equation below.

$$
\boldsymbol{v}_i=\boldsymbol{x}_{r1}+F(\boldsymbol{x}_{r2}-\boldsymbol{x}_{r3})
$$

Here, $\boldsymbol{v}_i$ is the mutant vector of index $i$, and $r1$, $r2$, and $r3$ are mutually different indices that also differ from $i$. The difference between individuals $r2$ and $r3$, scaled by the parameter $F$, is added to individual $r1$.
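
The DE/rand/1 mutation can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not pymoo's implementation: the function name `de_rand_1`, the population bounds, and the population size are chosen only for the example, and the population must contain at least four individuals so that three distinct indices other than $i$ exist.

```python
import numpy as np


def de_rand_1(pop, F, rng):
    """Return one DE/rand/1 mutant per individual in pop (shape: N x n_var)."""
    n = len(pop)
    mutants = np.empty_like(pop)
    for i in range(n):
        # r1, r2, r3: mutually different indices, all different from i
        candidates = [k for k in range(n) if k != i]
        r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
        # scaled difference of r2 and r3 added to r1
        mutants[i] = pop[r1] + F * (pop[r2] - pop[r3])
    return mutants


rng = np.random.default_rng(1)
pop = rng.uniform(-5.0, 5.0, size=(6, 3))  # 6 individuals, 3 variables (illustrative)
V = de_rand_1(pop, F=0.8, rng=rng)
print(V.shape)  # (6, 3)
```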

The *crossover* operation occurs between a given mutant vector and its corresponding parent of the same index. The most common is binomial (bin) crossover, given by the equation below. Note that at least one attribute $j$ of $u$ must be inherited from $v$; this is the role of the randomly chosen index $j_{rand}$.

$$
u_{i, j} =
\begin{cases}
 v_{i, j} & \text{if } \text{rand}(0, 1)_{i, j} < CR \; \lor \; j = j_{rand} \\
 x_{i, j} & \text{otherwise}
\end{cases}
$$
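
Binomial crossover as described above can be sketched in a few lines of NumPy: each attribute is taken from the mutant with probability $CR$, and one randomly chosen position $j_{rand}$ is always taken from the mutant. The function name `binomial_crossover` and the toy vectors are assumptions made for this example; this is not pymoo's implementation.

```python
import numpy as np


def binomial_crossover(x, v, CR, rng):
    """Combine parent x and mutant v into a trial vector u."""
    n_var = len(x)
    from_mutant = rng.random(n_var) < CR      # rand(0, 1)_j < CR
    from_mutant[rng.integers(n_var)] = True   # j_rand: always inherited from v
    return np.where(from_mutant, v, x)


rng = np.random.default_rng(1)
x = np.zeros(5)  # parent (toy values)
v = np.ones(5)   # mutant (toy values)
u = binomial_crossover(x, v, CR=0.5, rng=rng)
print(u)
```

With `CR=0.0`, the trial vector still receives exactly one attribute from the mutant, which is what the $j_{rand}$ condition guarantees.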

The creation of a new individual in the DE/rand/1/bin scheme is illustrated below:


<div style="text-align: center;">
    <img src="https://github.com/anyoptimization/pymoo-data/blob/main/docs/images/de_mating.png?raw=true" width="350">
</div>


A great tutorial and more detailed information can be found [here](https://web.archive.org/web/20190928024126/http://www1.icsi.berkeley.edu/~storn/code.html). The following guideline is copied from the description there (variable names are modified):

If you are going to optimize your own objective function with DE, you may try the following classical settings for the input file first: Choose method e.g. DE/rand/1/bin, set the population size $N$ to 10 times the number of parameters, select weighting factor `F=0.8`, and crossover constant `CR=0.9`. Recently, it has been found that selecting $F$ from the interval (0.5, 1.0) randomly for each generation or each difference vector, a technique called dither, improves convergence behavior significantly, especially for noisy objective functions. 

It has also been found that setting $CR$ to a low value, e.g., `CR=0.2` helps to optimize separable functions since it fosters the search along the coordinate axes. It can also be helpful to avoid premature convergence in some complex problems. On the contrary, this choice is not effective if parameter interdependence is encountered, which frequently occurs in real-world optimization problems rather than artificial test functions. So for parameter interdependence, the choice of $CR$ between 0.7 and 0.9 is likely to be more appropriate.

Another interesting empirical finding is that raising $N$ above, say, 40 does not substantially improve the convergence, independent of the number of parameters. It is worthwhile to experiment with these suggestions. Ensure that you initialize your parameter vectors by exploiting their full numerical range, i.e., if a parameter is allowed to exhibit values in the range (-100, 100), it is a good idea to pick the initial values from this range instead of unnecessarily restricting diversity.

Keep in mind that different problems often require different settings for $N$, $F$, and $CR$ (have a look into the different papers to get a feeling for the settings). If you still get misconvergence, you might want to try a different method. We mostly use 'DE/rand/1/...' or 'DE/best/1/...'. The crossover method is not so crucial, although Ken Price claims that binomial is never worse than exponential. In the case of misconvergence, also check your choice of objective function. There might be a better one to describe your problem. Any knowledge that you have about the problem should be worked into the objective function. A good objective function can make all the difference.
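
Two of the suggestions in the guideline above, initializing over the full parameter range and applying dither to $F$, can be sketched with NumPy as below. The variable names, the number of parameters, and the bounds are illustrative assumptions; note also that `rng.uniform` samples from the half-open interval [0.5, 1.0), which only approximates the interval stated in the guideline.

```python
import numpy as np

rng = np.random.default_rng(1)

n_var = 5
N = 10 * n_var            # classical setting: 10 times the number of parameters
xl, xu = -100.0, 100.0    # allowed range of every parameter (illustrative)

# Initialize over the full allowed range instead of a narrow sub-range.
pop = rng.uniform(xl, xu, size=(N, n_var))

# Dither per generation: a single F shared by all difference vectors.
F_generation = rng.uniform(0.5, 1.0)

# Dither per difference vector: a separate F for each individual.
F_vector = rng.uniform(0.5, 1.0, size=(N, 1))

print(pop.shape, F_vector.shape)  # (50, 5) (50, 1)
```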

DE can be used in pymoo as follows:

### Example

In [None]:
from pymoo.algorithms.soo.nonconvex.de import DE
from pymoo.problems import get_problem
from pymoo.optimize import minimize


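# Ackley test function with 10 decision variables
problem = get_problem("ackley", n_var=10)


# DE/rand/1/bin with the crossover constant CR and weighting factor F set explicitly
algorithm = DE(pop_size=50, variant="DE/rand/1/bin", CR=0.5, F=(0.3, 0.8))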

res = minimize(problem,
               algorithm,
               ("n_gen", 300),
               seed=1,
               verbose=False)

print("Best solution found: \nX = %s\nF = %s" % (res.X, res.F))

### API