
Manually adding points and changing the utility function #337

Closed
sguysc opened this issue Jul 22, 2022 · 17 comments


@sguysc

sguysc commented Jul 22, 2022

I'm using the suggest-register paradigm, and I'm doing two things that I think break the optimization; I'd like to know how to do them correctly.

At some point, I want to suggest a parameter myself, externally. So what I do is:

my_pnt_from_an_oracle = 0
next_point = {'x': my_pnt_from_an_oracle}
optimizer.register(params=next_point, target=f(my_pnt_from_an_oracle))
_ = optimizer.suggest(utility)  # I also call this, although I'm not sure it is necessary

After that, I also want to change the utility to exploitation rather than exploration, because I happen to know that the point I'm manually adding is supposed to be at least a local maximum, and also that it is definitely higher than all the points suggested so far. So I change:

utility = UtilityFunction(kind="ei", kappa=0, xi=1e-4)  # xi was 10 originally

But now, whenever I run the suggest-register combo, I don't get any suggestions near my_pnt_from_an_oracle that try to find a better maximum there. I suspect the point may not be registered correctly, although it does appear in optimizer.res.

Any suggestions on how to add a point and change the utility function in the middle of a run?

@bwheelz36
Collaborator

Hey - at first glance, you have the target and the suggested point as the same thing, e.g. point 0 has target 0. Depending on your objective function, that may not be a very good target value, in which case it's not surprising the optimiser doesn't look around that point any more. You still have to evaluate the real target value at your suggestion, and it still has to be good compared to previously probed points for the optimizer to keep exploring there.

Overall, there should be only one line of difference between probing your own suggestion and using the optimizer's internal suggestion. The snippet below won't run by itself, but I hope it illustrates the idea.

if (self.Nsuggestions is not None) and (self.SuggestionsProbed < self.Nsuggestions):
    # evaluate any suggested solutions first
    next_point_to_probe = self.Suggestions[self.SuggestionsProbed]
    self.SuggestionsProbed += 1
else:
    next_point_to_probe = self.optimizer.suggest(utility)
target = self.BlackBoxFunction(next_point_to_probe)
self.optimizer.register(params=next_point_to_probe, target=target)
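
For completeness, here is a self-contained sketch of the same pattern. The objective black_box, the bounds, and the single manual suggestion are placeholders I made up, not part of the original question:

from bayes_opt import BayesianOptimization, UtilityFunction

def black_box(x):
    return -(x - 2) ** 2  # placeholder objective; true maximum at x = 2

optimizer = BayesianOptimization(f=None, pbounds={'x': (-5, 5)}, verbose=0, random_state=1)
utility = UtilityFunction(kind="ucb", kappa=2, xi=0.0)
manual_suggestions = [{'x': 2.0}]  # points we want probed before the optimizer's own

for i in range(20):
    if manual_suggestions:
        next_point = manual_suggestions.pop(0)  # probe our own suggestion first
    else:
        next_point = optimizer.suggest(utility)  # otherwise use the internal suggestion
    target = black_box(next_point['x'])  # always evaluate the real objective here
    try:
        optimizer.register(params=next_point, target=target)
    except KeyError:
        pass  # the optimizer may re-suggest an already-probed point; skip duplicates

print(optimizer.max)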

@sguysc
Author

sguysc commented Jul 26, 2022

I didn't follow the explanation of what was wrong with my example, so let me clarify: I sample N points using suggest-register normally. If, after those N points, I don't have an objective value larger than some value V, then I manually add a point, say 0, which I know for a fact has a larger objective value than all of the N points sampled so far. It might not be the true maximum over the whole range, but it is greater than the other sampled points.

So I would want the optimizer to investigate the vicinity of the added point, at least, depending on the exploration-exploitation knobs. I think I essentially have the same code as you; the only difference is that I also change the utility function at the same time. Could that be it? I'll dig into the code more and see if I spot something.

P.S.: I just thought - maybe you missed that I call the target function with f(my_pnt_from_an_oracle)? f is my BlackBoxFunction.

@bwheelz36
Collaborator

At the optimizer.register step, you need to register the location of the point, and the value of the objective function at that point.

In the example you gave, you are using the same value (0) for both.

@sguysc
Author

sguysc commented Jul 26, 2022

This is what I'm doing; it's not registering the same value for both (f is the objective):

optimizer.register(params={'x': 0}, target=f(0))

But my f only accepts scalars and not dictionaries.
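
(If it helps, a scalar-only f can also be wrapped to take keyword params; a one-line sketch with a hypothetical wrapper name:

f_wrapped = lambda **params: f(params['x'])  # adapts scalar f to dict-style params

but calling f directly, as above, works just as well.)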

@bwheelz36
Collaborator

bwheelz36 commented Jul 26, 2022

OK, my apologies - I didn't see the f there.
In that case, I think what you are doing is correct. Is there any chance you could set up a simple example that replicates the issue?

@bwheelz36
Collaborator

OK. Again, apologies for not understanding your initial question; I was trying to answer on my phone.

I've just had a look into this and I agree the behavior is not as expected. Below is a standalone example I wrote to investigate this.
A few points:

  • I think the point is being correctly registered. You can check the predicted value at the suggestion via the optimizer._gp.predict call near the end of the example; following the registration of the suggested point, the prediction closely matches the real value.
  • It seems to me that when I set length scales in the gaussian process model, the behavior is more like what I expect: I get a clustering of points around the suggestion. I think this may be related to Choice of length_scale in GP kernel - wow! #287; the predictions of the gaussian process model can be highly dependent on the gaussian process settings. You can turn the setting of length scales on/off using the set_gp_params parameter.
    I think what is happening here is this: when no length scales are set, the gp model uses one length scale for all parameters. When they are set on an individual basis, each parameter is modeled independently. In both cases, the length scales are optimized behind the scenes - but setting them as an array results in much better predictions. When the predictions are off, it is not surprising that the suggest utility doesn't suggest what we think it should. What perhaps IS surprising is that the optimization still works quite well!
  • As I suggested in Choice of length_scale in GP kernel - wow! #287, maybe we should change the default behaviour to always use an array of length scales instead of a single value.

I would really appreciate it if you could play around with this example and check whether you agree with the conclusions I draw above...

from bayes_opt import BayesianOptimization
from bayes_opt import UtilityFunction
from scipy.optimize import rosen
import numpy as np
from matplotlib import pyplot as plt
from sklearn.gaussian_process.kernels import Matern

def run_optimization(use_suggestion=False, iterations=100, set_gp_params=True):
    """
    optimize the rosenbrock function; the correct answer is x=1, y=1.

    :param use_suggestion: False: ignore suggestion, True: use suggestion
    :param iterations: iterations to run
    :param set_gp_params: if true, will set length scales; if false, uses the defaults
    """
    optimizer = BayesianOptimization(
        f=None,
        pbounds={'x': (-1, 1), 'y': (-1, 1)},
        verbose=1,
        random_state=1
    )
    if set_gp_params:
        optimizer.set_gp_params(alpha=0,
                                kernel=Matern(length_scale=[.2, .2]))

    utility = UtilityFunction(kind="ucb", kappa=2, xi=1e-4)
    suggestion = {'x': 1, 'y': 1}  # this is the correct answer
    objective_values = []
    x_values = []
    y_values = []
    for i in range(iterations):
        if use_suggestion:
            target = -1 * rosen(list(suggestion.values()))
            optimizer.register(params=suggestion, target=target)
            objective_values.append(target)
            x_values.append(suggestion['x'])
            y_values.append(suggestion['y'])
            use_suggestion = False

        next_point = optimizer.suggest(utility)
        target = -1*rosen(list(next_point.values()))
        # -1 because we seek max not min
        try:
            optimizer.register(params=next_point, target=target)
            objective_values.append(target)
            x_values.append(next_point['x'])
            y_values.append(next_point['y'])
        except KeyError:
            '''
            the optimiser has a tendency to try and probe points it already has probed
            when it gets very confident about where the maximum is. I'm just skipping these
            for this example
            '''
            pass
    print(optimizer.max)
    suggested_point_array = np.array(list(suggestion.values())).reshape(1, -1)
    print(f'predicted value at true maximum is {optimizer._gp.predict(suggested_point_array)}')
    return objective_values, x_values, y_values

if __name__ == '__main__':
    print('running no-suggestion sim')
    no_sug_convergence, no_sug_x, no_sug_y = run_optimization(use_suggestion=False)
    print('running with-suggestion sim')
    sug_convergence, sug_x, sug_y = run_optimization(use_suggestion=True)

    fig, axs = plt.subplots(nrows=1, ncols=2, figsize=[8,5])
    axs[0].plot(no_sug_convergence)
    axs[0].plot(sug_convergence)
    axs[0].legend(['without suggestion', 'with suggestion'])

    axs[1].scatter(no_sug_x, no_sug_y)
    axs[1].scatter(sug_x, sug_y)
    axs[1].grid()
    axs[1].legend(['without suggestion', 'with suggestion'])
    plt.show()

@bwheelz36
Collaborator

PS: changing kappa to e.g. 0.1 makes the behavior a lot clearer. I'm not entirely confident that my theory about the GP parameters is correct...

@sguysc
Author

sguysc commented Jul 26, 2022

I will check it out. I also wonder whether the acquisition function type has anything to do with it; I use EI rather than UCB, and then xi is the relevant parameter.
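
(For reference, the standard forms - which I believe match this library's util.py: UCB(x) = mu(x) + kappa * sigma(x), while EI(x) = (mu(x) - y_max - xi) * Phi(z) + sigma(x) * phi(z), with z = (mu(x) - y_max - xi) / sigma(x) and y_max the best target so far. In both cases, a smaller kappa or xi means more exploitation.)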

@sguysc
Author

sguysc commented Jul 27, 2022

OK, I've dug into the code and I have a preliminary observation (I'm not sure this is the only situation where it happens, but it is what I see now): after I sample N points as described and add the extra point manually, asking for the next suggestion makes the acquisition step fit the GP to the current data (util.py). The problem is that it fails to fit a good function: the predicted mean and std are both constant. Because of that, it suggests a point essentially at random (the last value in x_seeds is the one that gets accepted).

Any suggestions on how to deal with a fit that comes out as a constant line (the data and targets are not constant)?
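
(A quick way to check for this - a sketch assuming the GP has just been fitted and the bounds are (0, 6.283) as in my problem:)

import numpy as np

# probe the fitted GP on a dense grid; a (near-)zero spread in the
# posterior mean means the fit has collapsed to a constant
x = np.linspace(0, 6.283, 1000).reshape(-1, 1)
mu, sigma = optimizer._gp.predict(x, return_std=True)
if np.ptp(mu) < 1e-9:
    print('GP posterior mean is flat - the fit has likely failed')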

@bwheelz36
Collaborator

That sounds strange - could you post some code to reproduce it?

@sguysc
Author

sguysc commented Jul 27, 2022

This is trying to emulate what I have:

import matplotlib.pyplot as plt
import numpy as np
from bayes_opt import BayesianOptimization, UtilityFunction

data_target = np.array([[ 4.11527487e+00, -5.25998354e-02],
			[ 9.87746489e-02, -4.13503230e-02],
			[ 3.57381547e+00, -5.25998354e-02],
			[ 1.60683672e+00, -5.25998354e-02],
			[ 5.26929603e+00, -6.18080378e-02],
			[ 9.08232311e-01, -5.25998354e-02],
			[ 2.39112060e-01, -3.52116406e-02],
			[ 4.29874823e+00, -4.23911691e-02],
			[ 9.88334539e-01, -5.25998354e-02],
			[ 3.10196661e+00, -5.25998354e-02],
			[ 7.87734943e-01, -5.25998354e-02],
			[ 8.27486483e-01, -5.16990900e-02],
			[ 3.80523113e+00, -5.25998354e-02],
			[ 7.03828442e-01, -2.22452462e-02],
			[ 5.83227361e+00, -3.09045196e-02],
			[ 4.65850859e+00, -2.24894464e-02],
			[ 5.35498380e+00, -4.55683947e-02],
			[ 5.15052521e+00, -2.59802818e-02],
			[ 3.43894560e+00, -5.25998354e-02],
			[ 3.40215825e+00, -5.25998354e-02],
			[ 0.00000000e+00, -4.35041189e-03],  # <- manually added point
			[ 6.28318531e+00, -4.35041189e-03]]) # <- manually added point

pbounds = {'x': (0, 6.283)}
optimizer = BayesianOptimization(f=None, pbounds=pbounds, verbose=0, random_state=None)
utility = UtilityFunction(kind="ei", kappa=0, xi=10) 

# this includes all the data I currently have, what was suggested from the optimizer, and the manually added points.
for i in range(data_target.shape[0]):
	next_point = {'x': data_target[i,0]}
	optimizer.register(params=next_point, target=data_target[i,1])

utility = UtilityFunction(kind="ei", kappa=0, xi=0.001) 
optimizer._gp.fit(optimizer._space.params, optimizer._space.target) #or: data_target[:,0:1], data_target[:,1:])

# to "understand" what's happening beneath in the suggestion function
x = np.linspace(0, 6.28, 1000).reshape(-1,1)
mu, sigma = optimizer._gp.predict(x, return_std=True)

plt.scatter(data_target[:,0], data_target[:,1], marker='*')
plt.scatter(x, mu, marker='.')
plt.show()
# or, just see what is the suggestion
newpnt = optimizer.suggest(utility)
print(newpnt['x'])
# the maximum should occur near either of the bounds.

Every time I run this snippet, I get different results. When the GP cannot predict correctly, the suggestion is random; when it does predict correctly, I get sound suggestions most of the time.

@bwheelz36
Collaborator

If you put optimizer._gp.fit(optimizer.space.params, optimizer.space.target) inside the for loop, it should work. I don't entirely know why, to be honest; I would have guessed that what you are doing is fine. The different results are likely because you are not setting random_state.

[attached figure: Figure_1]

@sguysc
Author

sguysc commented Jul 27, 2022

For me it doesn't. If I run this several times:

import matplotlib.pyplot as plt
import numpy as np
from bayes_opt import BayesianOptimization, UtilityFunction
from sklearn.preprocessing import StandardScaler

use_scalar = False

def closest(x, y, K):
    return y[min(range(len(x)), key = lambda i: abs(x[i]-K))]

scalerx = StandardScaler()
scalery = StandardScaler()

data_target = np.array([[ 4.11527487e+00, -5.25998354e-02],
					   [ 9.87746489e-02, -4.13503230e-02],
					   [ 3.57381547e+00, -5.25998354e-02],
					   [ 1.60683672e+00, -5.25998354e-02],
					   [ 5.26929603e+00, -6.18080378e-02],
					   [ 9.08232311e-01, -5.25998354e-02],
					   [ 2.39112060e-01, -3.52116406e-02],
					   [ 4.29874823e+00, -4.23911691e-02],
					   [ 9.88334539e-01, -5.25998354e-02],
					   [ 3.10196661e+00, -5.25998354e-02],
					   [ 7.87734943e-01, -5.25998354e-02],
					   [ 8.27486483e-01, -5.16990900e-02],
					   [ 3.80523113e+00, -5.25998354e-02],
					   [ 7.03828442e-01, -2.22452462e-02],
					   [ 5.83227361e+00, -3.09045196e-02],
					   [ 4.65850859e+00, -2.24894464e-02],
					   [ 5.35498380e+00, -4.55683947e-02],
					   [ 5.15052521e+00, -2.59802818e-02],
					   [ 3.43894560e+00, -5.25998354e-02],
					   [ 3.40215825e+00, -5.25998354e-02],
					   [ 0.00000000e+00, -4.35041189e-03],  # <- manually added point
					   [ 6.28318531e+00, -4.35041189e-03]]) # <- manually added point


pbounds = {'x': (0, 6.283)}
optimizer = BayesianOptimization(f=None, pbounds=pbounds, verbose=0, random_state=None)
utility = UtilityFunction(kind="ei", kappa=0, xi=10) 

# this includes all the data I currently have, what was suggested from the optimizer, and the manually added points.
for i in range(data_target.shape[0]):
    next_point = {'x': data_target[i,0]}
    optimizer.register(params=next_point, target=data_target[i,1])
    if use_scalar:
        # optimizer._gp.set_params(kernel__length_scale=15)
        scalerx.fit(optimizer._space.params)
        scalery.fit(optimizer._space.target.reshape(-1, 1))  # StandardScaler needs 2-D input
        X = scalerx.transform(optimizer._space.params)
        Y = scalery.transform(optimizer._space.target.reshape(-1, 1))
    else:
        X = optimizer._space.params
        Y = optimizer._space.target
        
    optimizer._gp.fit(X, Y) 
    
utility = UtilityFunction(kind="ei", kappa=0, xi=0.001) 
# optimizer._gp.fit(optimizer._space.params, optimizer._space.target) #or: data_target[:,0:1], data_target[:,1:])
optimizer._gp.fit(X, Y) 

# or, to "understand" what's happening beneath
x_pnts = np.linspace(0, 6.28, 1000).reshape(-1, 1)
if use_scalar:
    x = scalerx.transform(x_pnts)
else:
    x = x_pnts

mu, sigma = optimizer._gp.predict(x, return_std=True)
if use_scalar:
    mu = scalery.inverse_transform(mu.reshape(-1, 1))

plt.scatter(data_target[:, 0], data_target[:, 1], marker='*')
plt.scatter(x_pnts, mu, marker='.')

newpnt_scaled = optimizer.suggest(utility)
if use_scalar:
    newpnt = scalerx.inverse_transform(np.array(newpnt_scaled['x']).reshape(-1, 1))
else:
    newpnt = np.array(newpnt_scaled['x']).reshape(-1, 1)

print('suggested point:', newpnt)
plt.scatter(newpnt, closest(x_pnts, mu, newpnt), marker='s', color='r')

plt.show()

[attached figure: flat_prediction - the GP posterior mean comes out flat]

When I run this several times, the flat fit happens. On a single run, the results could be what you showed.

(I also tried playing around with scalers, to see if that would help.)

@sguysc
Author

sguysc commented Jul 27, 2022

I guess the problem is with the random_state of the GaussianProcessRegressor. If you're unlucky, it may do a "bad" fit, and then your predictions will be bad too. This is a problem whenever you call suggest, not just when you add a point manually.
But what is the remedy?

@bwheelz36
Collaborator

OK:

  1. I'm glad the issue wasn't actually whether the fit was inside the for loop or not, because that had me confused.
  2. If you set random_state=1 (or some other number) you should get consistent results - be they consistently good or consistently bad.
  3. The problem is that the gaussian process is struggling to fit your data. I fixed this by adding the following line after the optimizer is created (it requires from sklearn.gaussian_process.kernels import Matern):

optimizer.set_gp_params(kernel=Matern(length_scale=[0.1]))

With this I get consistently good fits for random_state = 0, 1, and 2. My understanding is that the length scale should be optimized for best fit anyway, so I'm slightly surprised that this made the difference - you could investigate why that is if you have any time! One way to investigate is sketched below.
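
A sketch, assuming X and Y hold the probed points and targets (e.g. optimizer._space.params and optimizer._space.target):

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# refit from several initial length scales; if the optimized log marginal
# likelihoods differ, the hyperparameter search is landing in local optima
for ls in [0.1, 1.0, 10.0]:
    gp = GaussianProcessRegressor(kernel=Matern(length_scale=ls, nu=2.5),
                                  n_restarts_optimizer=5, random_state=0)
    gp.fit(X, Y)
    print(ls, gp.log_marginal_likelihood_value_, gp.kernel_)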

Now that your GP is working, we can get back to the initial question of the clustering around suggestions!

@sguysc
Author

sguysc commented Jul 29, 2022

Yes, another thing that helped me is:

optimizer.set_gp_params(n_restarts_optimizer=25)

You're right about the length scale: it is weird that the initial length_scale matters, because if I look at optimizer._gp.kernel_ after the fit, it has a different length_scale.
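
(For anyone else checking this, both attributes exist on the sklearn regressor:)

print(optimizer._gp.kernel)   # the kernel passed in - the starting point for the fit
print(optimizer._gp.kernel_)  # the kernel after hyperparameter optimization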

I think I'm withdrawing my question, now that I understand the problem is that I don't get a good GP fit - or, if I do, it overfits the observations.

The reason is that the GP is going to have high uncertainty between the observations anyway, and the problem is essentially ill-posed, so the suggestions are random or almost random. The optimizer won't necessarily go to the maximum-value point I added; it goes wherever the acquisition function points.

Also, it is not just when adding points manually; it's going to happen every time I can't get a good fit. I'll have to think about what I can do about it, but I don't think there's any problem with your package.

p.s. - thank you for your help and time!!

@bwheelz36
Collaborator

If you are having issues with overfitting, you can also set the alpha parameter in the call to set_gp_params; higher values will regularize more heavily and prevent overfitting.
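
For instance (the value 1e-3 here is just an illustration, not a recommendation):

optimizer.set_gp_params(alpha=1e-3)  # larger alpha = stronger regularisation / noise tolerance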

It's good that you checked that the gaussian process model is actually fitting your data. I think we should maybe generate a plot or something like that to make it easier for people to realise when the fit has failed...
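
Something like this sketch, for 1-D problems (the bounds are assumed to be passed in by the caller):

import numpy as np
from matplotlib import pyplot as plt

def plot_gp_fit(optimizer, lower, upper):
    """Plot the fitted GP mean +/- std against the probed points (1-D only)."""
    x = np.linspace(lower, upper, 1000).reshape(-1, 1)
    mu, sigma = optimizer._gp.predict(x, return_std=True)
    plt.fill_between(x.ravel(), mu - sigma, mu + sigma, alpha=0.3, label='mean +/- std')
    plt.plot(x.ravel(), mu, label='GP mean')
    plt.scatter(optimizer._space.params, optimizer._space.target, marker='*', label='observations')
    plt.legend()
    plt.show()

e.g. plot_gp_fit(optimizer, 0, 6.283) for the example above.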
