Differential evolution (#1062)

* Differential evolution --------- Co-authored-by: Matt Borland <matt@mattborland.com>
boostorg · Jan 2, 2024 · 4ee8391 · 4ee8391
1 parent 56b6ae1
commit 4ee8391
Show file tree

Hide file tree

Showing 6 changed files with 629 additions and 0 deletions.
diff --git a/doc/roots/differential_evolution.qbk b/doc/roots/differential_evolution.qbk
@@ -0,0 +1,110 @@
+[/
+Copyright (c) 2023 Nick Thompson
+Use, modification and distribution are subject to the
+Boost Software License, Version 1.0. (See accompanying file
+LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+]
+
+[section:differential_evolution Differential Evolution]
+
+[heading Synopsis]
+
+``
+    #include <boost/math/tools/differential_evolution.hpp>
+
+    namespace boost::math::tools {
+
+    template <typename ArgumentContainer> struct differential_evolution_parameters {
+        using Real = typename ArgumentContainer::value_type;
+        ArgumentContainer lower_bounds;
+        ArgumentContainer upper_bounds;
+        Real F = static_cast<Real>(0.65);
+        double crossover_ratio = 0.5;
+        // Population in each generation:
+        size_t NP = 200;
+        size_t max_generations = 1000;
+        size_t threads = std::thread::hardware_concurrency();
+	ArgumentContainer const * initial_guess = nullptr;
+    };
+
+    template <typename ArgumentContainer, class Func, class URBG>
+    ArgumentContainer differential_evolution(
+        const Func cost_function, differential_evolution_parameters<ArgumentContainer> const &de_params, URBG &g,
+        std::invoke_result_t<Func, ArgumentContainer> target_value = std::numeric_limits<std::invoke_result_t<Func, ArgumentContainer>>::quiet_NaN(),
+        std::atomic<bool> *cancellation = nullptr,
+        std::vector<std::pair<ArgumentContainer, std::invoke_result_t<Func, ArgumentContainer>>> *queries = nullptr,
+        std::atomic<std::invoke_result_t<Func, ArgumentContainer>> *current_minimum_cost = nullptr);
+
+    } // namespaces
+``
+
+The `differential_evolution` function provides an implementation of the (classical) differential evolution optimization algorithm, often going by the label `de/rand/bin/1`.
+It is capable of minimizing a cost function defined on a continuous space represented by a set of bounds.
+This function has been designed more for progress observability, graceful cancellation, and post-hoc data analysis than for speed of convergence.
+We justify this design choice by reference to the "No free lunch" theorem of Wolpert and Macready, which establishes "that for any algorithm, any elevated performance over one class of problems is offset by performance over another class".
+
+[heading Parameters]
+
+    `lower_bounds`: A container representing the lower bounds of the optimization space along each dimension. The `.size()` of the bounds should return the dimension of the problem.
+    `upper_bounds`: A container representing the upper bounds of the optimization space along each dimension. It should have the same size of `lower_bounds`, and each element should be >= the corresponding element of `lower_bounds`.
+    `F`: The scale factor controlling the rate at which the population evolves (default is 0.65).
+    `crossover_ratio`: The crossover ratio determining the trade-off between exploration and exploitation (default is 0.5).
+    `NP`: The population size (default is 200). Parallelization occurs over the population, so this should be "large".
+    `max_generations`: The maximum number of generations for the optimization (default is 1000).
+    `threads`: The number of threads to use for parallelization (default is the hardware concurrency). If the objective function is already multithreaded, then this should be set to 1 to prevent oversubscription.
+    `initial_guess`: An optional guess for where we should start looking for solutions.
+
+The defaults were chosen by a reading of Price, Storn, and Lampinen, chapter 3, with the exception of the population size, which we have chosen a bit larger than they found as core counts have increased since publication of this reference and parallelization occurs within each generation.
+Note that there is a tradeoff between finding global minima and convergence speed.
+The most robust way of decreasing the probability of getting stuck in a local minima is to increase the population size.
+
+[heading The function]
+
+``
+template <typename ArgumentContainer, class Func, class URBG>
+ArgumentContainer differential_evolution(const Func cost_function,
+    differential_evolution_parameters<ArgumentContainer> const &de_params,
+    URBG &g,
+    std::invoke_result_t<Func, ArgumentContainer> value_to_reach = std::numeric_limits<std::invoke_result_t<Func, ArgumentContainer>>::quiet_NaN(),
+    std::atomic<bool> *cancellation = nullptr,
+    std::vector<std::pair<ArgumentContainer, std::invoke_result_t<Func, ArgumentContainer>>> *queries = nullptr,
+    std::atomic<std::invoke_result_t<Func, ArgumentContainer>> *current_minimum_cost = nullptr)
+``
+
+Parameters:
+
+    `cost_function`: The cost function to be minimized.
+    `de_params`: The parameters to the algorithm as described above.
+    `rng`: A uniform random bit generator, like `std::mt19937_64`.
+    `value_to_reach`: An optional value that, if reached, stops the optimization. This is the most robust way to terminate the calculation, but in most cases the optimal value of the cost function is unknown. If it is, use it! See the referenced book for clear examples of when target values can be deduced. 
+    `cancellation`: An optional atomic boolean to allow the user to stop the computation and gracefully return the best result found up to that point. N.B.: Cancellation is not immediate; the in-progress generation finishes.
+    `queries`: An optional vector to store intermediate results during optimization. This is useful for debugging and perhaps volume rendering of the objective function after the calculation is complete.
+    `current_minimum_cost`: An optional atomic variable to store the current minimum cost during optimization. This allows developers to (e.g.) plot the progress of the minimization over time and in conjunction with the `cancellation` argument allow halting the computation when the progress stagnates. Refer to Price, Storn, and Lampinen, Section 3.2 for caveats with this approach.
+
+Returns:
+
+The argument vector corresponding to the minimum cost found by the optimization.
+
+N.B.: The termination criteria is an "OR", not an "AND".
+So if the maximum generations is hit, the iteration stops, even if (say) a `value_to_reach` has not been attained.
+
+[h4:examples Examples]
+
+An example use can be found [@../../example/differential_evolution.cpp here].
+More examples and API use cases can be studied in [@../../test/differential_evolution_test.cpp differential_evolution_test.cpp].
+
+[h5:caveats Caveats]
+
+We have decided to only support classic `de/rand/1/bin` because there are too many parameters for this class as it stands, and we have not seen benchmarks that indicate that other variants of the algorithm perform better.
+If a compelling usecase is provided, support for the `de/x/y/z` variants can be added.
+
+Supported termination criteria are explicit requests for termination, value-to-reach, and max generations.
+Price, Storn, and Lampinen, Section 2.8 also list population statistics and lack of accepted trials over many generations as sensible termination criteria.
+These could be supported if there is demand.
+
+[h4:references References]
+
+* Price, Kenneth, Rainer M. Storn, and Jouni A. Lampinen. ['Differential evolution: a practical approach to global optimization.] Springer Science & Business Media, 2006.
+* D. H. Wolpert and W. G. Macready, ['No free lunch theorems for optimization.] IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67-82, April 1997, doi: 10.1109/4235.585893.
+
+[endsect] [/section:differential_evolution Differential Evolution]
diff --git a/doc/roots/roots_overview.qbk b/doc/roots/roots_overview.qbk
@@ -21,6 +21,7 @@ There are several fully-worked __root_finding_examples, including:
 [include quartic_roots.qbk]
 [include root_finding_examples.qbk]
 [include minima.qbk]
+[include differential_evolution.qbk]
 [include root_comparison.qbk]
 
 [/ roots_overview.qbk

diff --git a/example/differential_evolution.cpp b/example/differential_evolution.cpp
@@ -0,0 +1,41 @@
+/*
+ * Copyright Nick Thompson, 2023
+ * Use, modification and distribution are subject to the
+ * Boost Software License, Version 1.0. (See accompanying file
+ * LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+ */
+#include <iostream>
+#include <boost/math/tools/differential_evolution.hpp>
+
+using boost::math::tools::differential_evolution_parameters;
+using boost::math::tools::differential_evolution;
+
+double rosenbrock(std::vector<double> const & x) {
+   double result = 0;
+   for (size_t i = 0; i < x.size() - 1; ++i) {
+       double tmp = x[i+1] - x[i]*x[i];
+       result += 100*tmp*tmp + (1-x[i])*(1-x[i]); 
+   }
+   return result;
+}
+
+int main() {
+    auto de_params = differential_evolution_parameters<std::vector<double>>();
+    constexpr const size_t dimension = 10;
+    // Search on [0, 2]^dimension:
+    de_params.lower_bounds.resize(dimension, 0);
+    de_params.upper_bounds.resize(dimension, 2);
+    // This is a challenging function, increase the max generations 10x from default so we don't terminate prematurely:
+    de_params.max_generations *= 10;
+    std::random_device rd;
+    std::mt19937_64 rng(rd());
+    // The global minima is exactly zero-but some leeway is required:
+    double value_to_reach = 1e-5;
+    auto local_minima = differential_evolution(rosenbrock, de_params, rng, value_to_reach);
+    std::cout << "Minima: {";
+    for (auto l : local_minima) {
+        std::cout << l << ", ";
+    }
+    std::cout << "}\n";
+    std::cout << "Value of cost function at minima: " << rosenbrock(local_minima) << "\n";
+}