# GPSO optimisation example with saving the state and resuming

Yes, you can optimise for $X$ evaluations and then realise: oh, I need more! You can even realise that after a few days, so yes, you can save the state of the optimiser to disk and resume from the saved state later.

In [1]:
# imports
import os
import numpy as np
from shutil import rmtree
import logging
from gpflow.utilities import print_summary

from gpso import ParameterSpace, GPSOptimiser, GPRSurrogate
from gpso.utils import set_logger, make_dirs

Again, the objective function and the parameter space are unchanged.

In [2]:
# define objective function - compare with paper Fig. 2
def obj(point, offset=0.0, rotate=True):
    x, y = point
    if rotate:
        ct = np.cos(np.pi / 4)
        st = np.sin(np.pi / 4)
        xn = ct * x + st * y
        yn = ct * y - st * x
        x = xn
        y = yn
    return (
        3 * (1 - x) ** 2.0 * np.exp(-(x ** 2) - (y + 1) ** 2)
        - 10 * (x / 5.0 - x ** 3 - y ** 5) * np.exp(-(x ** 2) - y ** 2)
        - 1 / 3 * np.exp(-((x + 1) ** 2) - y ** 2)
        - offset
    )


# bounds of the parameters we will optimise
x_bounds = [-3, 5]
y_bounds = [-3, 3]
# number of points per dimension for plotting
N_POINTS = 120

In [3]:
space = ParameterSpace(
    parameter_names=["x", "y"], parameter_bounds=[x_bounds, y_bounds]
)
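
Before optimising, it can help to have a brute-force reference for what the surface looks like. The block below is a minimal sketch that evaluates the objective on a regular `N_POINTS` x `N_POINTS` grid and reports the best grid point. It repeats `obj` and the bounds from above so it runs on its own; the names `xx`, `yy`, `zz`, `grid_best_point` and `grid_best_score` are introduced here for illustration only and are not part of GPSO.

```python
import numpy as np


def obj(point, offset=0.0, rotate=True):
    # same objective as above, repeated so this block runs on its own
    x, y = point
    if rotate:
        ct = np.cos(np.pi / 4)
        st = np.sin(np.pi / 4)
        x, y = ct * x + st * y, ct * y - st * x
    return (
        3 * (1 - x) ** 2.0 * np.exp(-(x ** 2) - (y + 1) ** 2)
        - 10 * (x / 5.0 - x ** 3 - y ** 5) * np.exp(-(x ** 2) - y ** 2)
        - 1 / 3 * np.exp(-((x + 1) ** 2) - y ** 2)
        - offset
    )


x_bounds = [-3, 5]
y_bounds = [-3, 3]
N_POINTS = 120

# regular grid over the parameter space
xs = np.linspace(x_bounds[0], x_bounds[1], N_POINTS)
ys = np.linspace(y_bounds[0], y_bounds[1], N_POINTS)
xx, yy = np.meshgrid(xs, ys)

# obj only uses NumPy arithmetic, so it evaluates the whole grid at once
zz = obj((xx, yy))

# location and value of the best grid point
row, col = np.unravel_index(np.argmax(zz), zz.shape)
grid_best_point = (xx[row, col], yy[row, col])
grid_best_score = zz[row, col]
print(grid_best_point, grid_best_score)
```

The grid costs 14,400 objective evaluations, which is fine for a cheap analytic function but is exactly what a surrogate-based optimiser is meant to avoid for expensive ones.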

### First optimisation
Let's be naive and optimise for only 25 evaluations.

In [4]:
opt = GPSOptimiser(
    parameter_space=space,
    gp_surrogate=GPRSurrogate.default(),
    exploration_method="tree",
    exploration_depth=5,
    update_cycle=1,
    budget=25,
    stopping_condition="evaluations",
    n_workers=4,
)

In [5]:
# log_level INFO: reasonable amount of information on what is happening
# log_level DEBUG: a lot of information on what is happening
set_logger(log_level=logging.INFO)
# run vanilla, with default initialisation and just 1 repetition of objective function (since it's deterministic...)
best_point_v1 = opt.run(obj)

[2020-05-13 23:39:46] INFO: Starting 2-dimensional optimisation with budget of 25 objective function evaluations...
[2020-05-13 23:39:46] INFO: Sampling 2 vertices per dimension within L1 ball of 0.25 of the domain size radius in normalised coordinates using 4 worker(s)...
[2020-05-13 23:39:46] INFO: Update step: retraining GP model and updating scores...
[2020-05-13 23:39:48] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:39:48] INFO: Selecting step: evaluating best leaves...
[2020-05-13 23:39:48] INFO: Update step: retraining GP model and updating scores...
[2020-05-13 23:39:49] INFO: After 1th iteration: 
	 number of obj. func. evaluations: 6 
	 highest score: 2.019346080880167 
	 highest UCB: 0.5464137300581137
[2020-05-13 23:39:49] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:39:49] INFO: Selecting step: evaluating best leaves...
[2020-05-13 23:39:49] INFO: Update step: retraining GP model and updating scores...
[

In [6]:
print(best_point_v1)
print_summary(opt.gp_surr.gpflow_model, fmt="notebook")

GPPoint(normed_coord=array([0.27777778, 0.72222222]), score_mu=6.822127355673395, score_sigma=0.0, score_ucb=0.0, label=<PointLabels.evaluated: 1>)


| name | class | transform | prior | trainable | shape | dtype | value |
|---|---|---|---|---|---|---|---|
| GPR.mean_function.c | Parameter | | | True | () | float64 | 0.311109 |
| GPR.kernel.variance | Parameter | Softplus | | True | () | float64 | 3.38538 |
| GPR.kernel.lengthscales | Parameter | Softplus | | True | () | float64 | 0.104901 |
| GPR.likelihood.variance | Parameter | Softplus + Shift | | True | () | float64 | 1.04832e-06 |
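
The `normed_coord` printed above lives in normalised coordinates, so it is not directly comparable with `x_bounds` and `y_bounds`. The sketch below maps it back to the original space, assuming a plain linear (min-max) scaling of each parameter onto the unit interval. The helper `denormalise` is hypothetical and written here for illustration; it is not a GPSO function. As a sanity check on the assumption, the objective is re-evaluated at the recovered point, which should land close to the reported `score_mu`.

```python
import numpy as np

x_bounds = [-3, 5]
y_bounds = [-3, 3]


def denormalise(normed_coord, bounds):
    """Map unit-interval coordinates back to the original bounds (assumed linear scaling)."""
    bounds = np.asarray(bounds, dtype=float)
    lower, upper = bounds[:, 0], bounds[:, 1]
    return lower + np.asarray(normed_coord, dtype=float) * (upper - lower)


def obj(point, offset=0.0, rotate=True):
    # same objective as above, repeated so this block runs on its own
    x, y = point
    if rotate:
        ct = np.cos(np.pi / 4)
        st = np.sin(np.pi / 4)
        x, y = ct * x + st * y, ct * y - st * x
    return (
        3 * (1 - x) ** 2.0 * np.exp(-(x ** 2) - (y + 1) ** 2)
        - 10 * (x / 5.0 - x ** 3 - y ** 5) * np.exp(-(x ** 2) - y ** 2)
        - 1 / 3 * np.exp(-((x + 1) ** 2) - y ** 2)
        - offset
    )


# normed_coord copied from the printed GPPoint of the first run
best_normed = [0.27777778, 0.72222222]
best_xy = denormalise(best_normed, [x_bounds, y_bounds])
print("original coordinates:", best_xy)
print("objective at that point:", obj(best_xy))
```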


So we can see we are not there yet...
Let's resume

In [7]:
# let's just resume for an additional 25 evaluations
best_point_v2 = opt.resume_run(additional_budget=25)

[2020-05-13 23:39:53] INFO: Resuming optimisation for with additional budget of 25
[2020-05-13 23:39:53] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:39:53] INFO: Selecting step: evaluating best leaves...
[2020-05-13 23:39:53] INFO: Update step: retraining GP model and updating scores...
[2020-05-13 23:39:54] INFO: After 9th iteration: 
	 number of obj. func. evaluations: 31 
	 highest score: 6.822127355673395 
	 highest UCB: 6.375786411525929
[2020-05-13 23:39:54] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:39:54] INFO: Selecting step: evaluating best leaves...
[2020-05-13 23:39:54] INFO: Update step: retraining GP model and updating scores...
[2020-05-13 23:39:55] INFO: After 10th iteration: 
	 number of obj. func. evaluations: 36 
	 highest score: 7.880218798440746 
	 highest UCB: 7.6297163908975705
[2020-05-13 23:39:55] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:39:55] INFO: 

In [8]:
print(best_point_v2)
print_summary(opt.gp_surr.gpflow_model, fmt="notebook")

GPPoint(normed_coord=array([0.23662551, 0.68518519]), score_mu=8.102201204076508, score_sigma=0.0, score_ucb=0.0, label=<PointLabels.evaluated: 1>)


| name | class | transform | prior | trainable | shape | dtype | value |
|---|---|---|---|---|---|---|---|
| GPR.mean_function.c | Parameter | | | True | () | float64 | 0.1915 |
| GPR.kernel.variance | Parameter | Softplus | | True | () | float64 | 3.16384 |
| GPR.kernel.lengthscales | Parameter | Softplus | | True | () | float64 | 0.128714 |
| GPR.likelihood.variance | Parameter | Softplus + Shift | | True | () | float64 | 1.04826e-06 |


So now we are done! We get the same result as optimising for 50 evaluations straight (which makes sense...)

### Save and resume
Are we **really** done? Let's try to save the current state of the optimisation and imagine getting back in a few days...

Side note: saving the optimiser state unfortunately loses the callbacks and the saver, if any were set. We need to provide new ones when resuming.
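
To see why callables tend to get lost, here is a toy illustration that assumes pickle-style serialisation; GPSO's own saving mechanism may work differently. Everything in the block (the `state` dictionary, its values, `is_picklable`, `strip_callables`) is hypothetical and only mimics the pattern: plain data survives a round trip, while lambdas do not and have to be supplied again after loading.

```python
import pickle

# hypothetical optimiser state: plain data plus callables (values are made up)
state = {
    "evaluated": [(0.5, 1.25), (0.1, 2.0)],  # (point, score) pairs
    "budget": 25,
    "objective": lambda p: -(p ** 2),
    "callbacks": [lambda point, score: None],
}


def is_picklable(value):
    """Return True if `value` can be serialised with pickle."""
    try:
        pickle.dumps(value)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def strip_callables(full_state):
    """Keep only the entries that survive pickling."""
    return {key: value for key, value in full_state.items() if is_picklable(value)}


# "save": only the plain data makes it into the serialised bytes
saved = pickle.dumps(strip_callables(state))

# "load": the objective and callbacks are gone and must be provided again
restored = pickle.loads(saved)
print(sorted(restored))
restored["objective"] = lambda p: -(p ** 2)
restored["callbacks"] = []
```

This is the same shape as the workflow in this notebook: the saved state carries the data, and the objective function is passed in again on resume.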

In [9]:
output_dir = "output"
opt.save_state(output_dir)

[2020-05-13 23:39:58] INFO: Saved optimiser to output


In [10]:
# we need to provide the objective function again; it's hard to save a callable like that...
best_point_v3, opt_loaded = GPSOptimiser.resume_from_saved(
    output_dir, additional_budget=25, objective_function=obj
)
# `resume_from_saved` resumes the optimisation directly and, at the end, returns the new best point
# and the optimiser itself, so you can save it again

[2020-05-13 23:39:58] INFO: Resuming optimisation for with additional budget of 25
[2020-05-13 23:39:58] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:39:59] INFO: Selecting step: evaluating best leaves...
[2020-05-13 23:39:59] INFO: Update step: retraining GP model and updating scores...
[2020-05-13 23:39:59] INFO: After 14th iteration: 
	 number of obj. func. evaluations: 56 
	 highest score: 8.10343046869997 
	 highest UCB: 8.098495608051296
[2020-05-13 23:39:59] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:40:00] INFO: Selecting step: evaluating best leaves...
[2020-05-13 23:40:00] INFO: Update step: retraining GP model and updating scores...
[2020-05-13 23:40:00] INFO: After 15th iteration: 
	 number of obj. func. evaluations: 61 
	 highest score: 8.10343046869997 
	 highest UCB: 8.098763407817744
[2020-05-13 23:40:00] INFO: Exploration step: sampling children in the ternary tree...
[2020-05-13 23:40:01] INFO: Se

In [11]:
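print(best_point_v3)
# note: `opt` is still the optimiser from before saving, so the summary below is identical to In [8];
# to inspect the resumed surrogate, use `opt_loaded.gp_surr.gpflow_model` instead
print_summary(opt.gp_surr.gpflow_model, fmt="notebook")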

GPPoint(normed_coord=array([0.23433928, 0.68564243]), score_mu=8.106141171216883, score_sigma=0.0, score_ucb=0.0, label=<PointLabels.evaluated: 1>)


| name | class | transform | prior | trainable | shape | dtype | value |
|---|---|---|---|---|---|---|---|
| GPR.mean_function.c | Parameter | | | True | () | float64 | 0.1915 |
| GPR.kernel.variance | Parameter | Softplus | | True | () | float64 | 3.16384 |
| GPR.kernel.lengthscales | Parameter | Softplus | | True | () | float64 | 0.128714 |
| GPR.likelihood.variance | Parameter | Softplus + Shift | | True | () | float64 | 1.04826e-06 |


Not much better... Meaning that we could barely top the highest score by optimising more. At least we see that we optimised the hell out of it!
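
To put a number on "not much better", the short block below compares the three `score_mu` values printed above. The labels refer to the cumulative budget requested (25, then 25 more, then 25 more after reloading); the dictionary and variable names are just for this illustration.

```python
# best scores copied from the three printed GPPoint outputs above
scores = {
    "budget 25": 6.822127355673395,
    "budget 50": 8.102201204076508,
    "budget 75": 8.106141171216883,
}

labels = list(scores)
gains = {}
for prev, curr in zip(labels, labels[1:]):
    absolute = scores[curr] - scores[prev]
    relative = absolute / scores[prev]
    gains[curr] = (absolute, relative)
    print(f"{prev} -> {curr}: +{absolute:.4f} ({relative:.2%})")
```

The second batch of evaluations accounts for almost all of the gain, while the third changes the best score only in the third decimal place.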

In [12]:
# cleaning - run after you check the results!
rmtree(output_dir)