blackbox: A Python module for parallel optimization of expensive black-box functions
What is this?
A minimalistic and easy-to-use Python module that efficiently searches for a global minimum of an expensive black-box function (e.g. optimal hyperparameters of a simulation, a neural network, or anything else that takes significant time to run). The user needs to provide a function, a search domain (ranges of each input parameter), and the total number of function calls available. The code scales well on multicore CPUs and clusters: all function calls are divided into batches, and each batch is evaluated in parallel.
The mathematical method behind the code is described in this arXiv note (there have been a few updates to the method recently): https://arxiv.org/pdf/1605.00998.pdf
Don't forget to cite this note if you use the method or the code.
(a) - demo function (unknown to the method).
(b) - running the procedure using 15 evaluations.
(c) - running the procedure using 30 evaluations.
How do I represent my objective function?
It simply needs to be wrapped into a Python function. An external application, if any, can be accessed using a system call.
def fun(par):
    ...
    return output
par is a vector of input parameters (a Python list),
output is a scalar value to be minimized.
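When the objective lives in an external program, the wrapper might look like the sketch below. The command line here is only a stand-in (a Python one-liner that prints a number) and is an assumption made for illustration; replace it with your own application and adapt the parsing to whatever that application prints.

```python
import subprocess
import sys


def fun(par):
    # Hypothetical external application: a Python one-liner stands in for
    # a real simulator. It receives the parameters as command-line
    # arguments and prints a single number to stdout.
    cmd = [sys.executable, "-c",
           "import sys; print(sum(float(x)**2 for x in sys.argv[1:]))"]
    cmd += [str(p) for p in par]

    # Run the application and capture what it prints.
    # check=True raises an error if the application exits with a failure code.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)

    # Convert the printed text into the scalar value to be minimized.
    output = float(result.stdout.strip())
    return output
```

Passing the command as a list (rather than a single shell string) avoids quoting problems when parameter values are negative or contain exponents.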
How do I run the procedure?
Place blackbox.py into your working directory. The main file should look like this:
import blackbox as bb


def fun(par):
    return par[0]**2 + par[1]**2  # dummy example


def main():
    bb.search_min(f = fun,  # given function
                  domain = [[-10., 10.], [-10., 10.]],  # ranges of each parameter
                  budget = 40,  # total number of function calls available
                  batch = 4,  # number of calls that will be evaluated in parallel
                  resfile = 'output.csv')  # text file where results will be saved


if __name__ == '__main__':
    main()
- All function calls are divided into batches, and each batch is evaluated in parallel. The total number of batches is budget/batch. The value of batch should correspond to the number of available computational units.
- An optional parameter executor = ... should be specified within bb.search_min() when a custom parallel engine is used (ipyparallel, dask.distributed, pathos, etc.). executor should be an object that has a
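The batching described in the first bullet above can be pictured with the following minimal sketch. It is not the module's internal implementation: the helper names number_of_batches and evaluate_in_batches are hypothetical, and a thread pool from the standard library is used only to keep the example short (for CPU-bound functions a process-based or cluster engine would be the more natural choice).

```python
from concurrent.futures import ThreadPoolExecutor


def fun(par):
    return par[0]**2 + par[1]**2  # dummy example


def number_of_batches(budget, batch):
    # Total number of batches, as in the text: budget/batch.
    return budget // batch


def evaluate_in_batches(f, points, batch):
    # Hypothetical helper: split the points into consecutive batches of
    # size `batch` and evaluate the points of each batch in parallel.
    values = []
    with ThreadPoolExecutor(max_workers=batch) as pool:
        for start in range(0, len(points), batch):
            # map() preserves the order of the inputs within the batch.
            values += list(pool.map(f, points[start:start + batch]))
    return values
```

With budget = 40 and batch = 4, as in the main file above, this gives 10 batches of 4 calls each.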
How about results?
Iterations are sorted by function value (best solution is at the top) and saved in a text file with the following structure:
| Parameter #1 | Parameter #2 | ... | Parameter #n | Function value |
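Since the best solution is stored at the top, retrieving it comes down to reading the first data row. The sketch below assumes the results file is comma-separated (suggested by the .csv extension in the example, but not guaranteed) and skips any row that is not purely numeric, such as a possible header line. The function name read_best is hypothetical.

```python
import csv


def read_best(resfile):
    # Assumption: one iteration per row, comma-separated, with the
    # function value in the last column and the best row first.
    with open(resfile, newline='') as fh:
        for row in csv.reader(fh):
            try:
                values = [float(x) for x in row]
            except ValueError:
                continue  # not a data row (e.g. a header line)
            if values:
                # Split into (parameters, function value).
                return values[:-1], values[-1]
    return None  # no numeric rows found
```

Calling read_best('output.csv') after a run would then return the parameter list and the corresponding function value of the best iteration.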
Paul Knysh (firstname.lastname@example.org)
I receive tons of useful feedback that helps me to improve the code. Feel free to email me if you have any questions or comments.