This repository has been archived by the owner on Feb 23, 2023. It is now read-only.

different results vs. runs #337

Closed
pavel-rev opened this issue Aug 12, 2020 · 13 comments

Comments

@pavel-rev

pavel-rev commented Aug 12, 2020

I run the optimization and get different results on each run. Sometimes it hits the optimum (I have an independent "slow" exhaustive search, so I know the true optimum); sometimes it does not. Are there known rules of thumb w.r.t. parameters to get more consistent results?

@pavel-rev
Author

Analysis: the problem is in the initialization of the numpy random number generators. TODO: 1. the APIs should be updated to the latest numpy ones (details to be provided); 2. a seed should be passable for testing purposes to achieve consistency. NB: do NOT close this issue, I am going to work on it soon.
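
For context, a minimal sketch contrasting the legacy numpy RNG API with the newer Generator API (illustrative only; nothing here is GPyOpt-specific):

```python
import numpy as np

# Legacy global-state API: seeding here affects every np.random.* call in the process.
np.random.seed(123)
legacy_sample = np.random.normal(size=3)

# Newer Generator API: an explicit generator object that can be passed around,
# which is what "seed should be passed" would look like in practice.
rng = np.random.default_rng(123)
new_sample = rng.normal(size=3)
```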

@Seal-o-O

Seal-o-O commented Sep 22, 2020

I have the same problem. Even when I used the same seed, the results were different.
I also found that the optimizer tends to explore the bounds of the inputs.

@apaleyes
Collaborator

@Seal-o-O that's weird; if the seed is the same, the output should be the same too. We rely on this fact extensively in the functional test suite. How are you fixing the seed?

@Seal-o-O

Seal-o-O commented Sep 22, 2020

> @Seal-o-O that's weird; if the seed is the same, the output should be the same too. We rely on this fact extensively in the functional test suite. How are you fixing the seed?

@apaleyes Thanks for taking the time to answer me.
I think that's weird too: when I ran the code before, I got the same result with the same seed, but now the result changes even though the seed is the same.
My code is below, please tell me if I made a mistake.

```python
def optimize_as(self):
    # module-level imports assumed: time, numpy as np, GPyOpt,
    # and `from numpy.random import seed`
    start = time.time()

    domain = [{'name': 'd1',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd2',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd3',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd4',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd5',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd6',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd7',  'type': 'continuous', 'domain': (0, 0),       'dimensionality': 1},
              {'name': 'd8',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd9',  'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd10', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd11', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd12', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd13', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd14', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd15', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1},
              {'name': 'd16', 'type': 'continuous', 'domain': (0, 6.27999), 'dimensionality': 1}]

    seed(999)

    opt = GPyOpt.methods.BayesianOptimization(f=self.fit_lib,
                                              domain=domain,
                                              acquisition_type='MPI')

    opt.run_optimization(max_iter=100, max_time=np.inf)

    end = time.time()
    print('Optimization time: ', end - start)
    opt.plot_convergence()
    x_best = opt.X[np.argmin(opt.Y)]
    print('The optimized phase is :', x_best)
    self.ans = x_best
```

And as you can see, the input has 16 dimensions. I don't know why it always prefers to explore the bounds of the input (for example: [6.27999, 0, 0, 6.27999, 0, 0, 0, 0, 6.27999, 0, 0, 6.27999, 0, 2.20965747, 0, 6.27999]). I suppose it's because of L-BFGS: the optimizer thinks it has found the optimum (when it actually hasn't), so it converges and keeps searching at the limits of the bounds.
I have problems installing DIRECT, so I don't know whether my assumption is right or not.
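
As an aside, a rough sketch of how the acquisition optimizer could be swapped (this assumes GPyOpt's `acquisition_optimizer_type` argument; the 'DIRECT' option needs the DIRECT package installed, which is the one I could not install):

```python
# Rough sketch: same setup as above, but asking GPyOpt for a different
# acquisition optimizer ('lbfgs' is the default; 'DIRECT' and 'CMA' need
# the corresponding optional packages installed).
opt = GPyOpt.methods.BayesianOptimization(f=self.fit_lib,
                                          domain=domain,
                                          acquisition_type='MPI',
                                          acquisition_optimizer_type='DIRECT')
```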

@apaleyes
Collaborator

The code looks reasonable. But an important question is: where is the seed function imported from? In our test cases we use np.random.seed(1).

@Seal-o-O

> The code looks reasonable. But an important question is: where is the seed function imported from? In our test cases we use np.random.seed(1).

I imported it at the top with from numpy.random import seed, and I even used np.random.seed(999), but the result still keeps changing.

@apaleyes
Collaborator

I see. Is there any stochasticity at all in self.fit_lib?

btw, d7 bounds look weird

@Seal-o-O

No, self.fit_lib is a function that calculates the phase and the ultrasound field.

d7 is set to 0 on purpose, to avoid all 16 dims sharing the same offset, which is meaningless in the ultrasound field.

@pavel-rev
Author

pavel-rev commented Sep 24, 2020

venv/lib/python3.7/site-packages/GPy/core/__init__.py

```python
if rand_gen is None:
    rand_gen = np.random.normal  # line 30
```

np.random.normal on line 30 should be updated to numpy.random.default_rng(seed=SEED).normal with SEED passed in (if SEED=None it stays purely random; otherwise it gives reproducible results for testing purposes).
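
A rough sketch of what that change could look like (SEED here is an assumed configuration value; the current GPy code does not expose it):

```python
# Sketch only: the current GPy code falls back to the global np.random.normal.
# A seedable Generator would give reproducible draws when SEED is supplied and
# keep the old purely random behaviour when SEED is None.
if rand_gen is None:
    rand_gen = np.random.default_rng(seed=SEED).normal
```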

also it says:

```python
#===========================================================================
# Handle priors, this needs to be
# cleaned up at some point
#===========================================================================
```

it is time to clean it up :)
NB: GPy is a dependency (but developed by the same uni)

@Seal-o-O

Seal-o-O commented Sep 24, 2020

> NB: GPy is a dependency (but developed by the same uni)

I tried it and it did not seem to work.
But I found that the problem is that the seed has to be defined before the class, otherwise it does not take effect.
Sorry for bothering you for such a long time, and thanks. @pavel-rev @apaleyes
And if you don't mind, I still wonder why GPyOpt prefers to search the bounds of the domain. For example:
```
RMSE:
0.06481827043112744
Phase:  # input found by BO
[0. 6.27999 0. 6.27999 0. 6.27999
 6.27999 0. 6.27999 6.27999 6.27999 6.27999
 0. 2.88350738 0. 0. ]  # always searching 0 and 6.27999

RMSE:
0.06531209718003994
Phase:
[0. 6.27999 0. 0. 0. 6.27999
 6.27999 6.27999 2.8160093 0. 1.23583263 6.27999
 0. 0. 0. 0. ]
```
In my case the domain is (0, 6.27999).
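
For the record, a minimal sketch of the arrangement that worked, with the seed set at module level before the class definition (the class name is hypothetical):

```python
from numpy.random import seed

seed(999)  # set the seed here, at module level, before the class definition

class PhaseOptimizer:  # hypothetical name for the class holding fit_lib / optimize_as
    def fit_lib(self, x):
        ...  # objective: phase / ultrasound calculation

    def optimize_as(self):
        ...  # GPyOpt setup as posted above
```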

@apaleyes
Collaborator

Thanks, @pavel-rev, great find! I've just realized that in our testing suite we mock the model, while this code obviously uses GPy, so that dependency is very likely the reason for the inconsistent behavior.

But whether it is this exact line, I am still not sure. I mean, wouldn't fixing the numpy random seed affect all generators, including numpy.random.normal? Anyway, at this point it's probably reasonable to do a few tests with GPy alone (GPRegression, which GPyOpt uses by default) and cut an issue with the GPy folks to discuss this further.
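
Something like this minimal sketch could serve as that standalone check (the data is made up, just enough to exercise the RNG during restarts):

```python
import numpy as np
import GPy

def fit_once(seed):
    np.random.seed(seed)
    X = np.random.uniform(0., 1., (20, 2))
    Y = np.sin(6. * X[:, :1]) + np.random.normal(0., 0.05, (20, 1))
    m = GPy.models.GPRegression(X, Y)
    m.optimize_restarts(num_restarts=5, verbose=False)  # restarts draw from the global numpy RNG
    return m.param_array.copy()

# If GPy alone is deterministic under a fixed seed, this prints True on every run.
print(np.allclose(fit_once(1), fit_once(1)))
```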

Despite what the GitHub org says, GPyOpt is no longer maintained by the University of Sheffield. Some of the people involved are Sheffield alumni, though. As for GPy, I don't know who is taking care of it at the moment; @zhenwendai may have a better idea.

@pavel-rev
Author

pavel-rev commented Sep 24, 2020

@apaleyes I had this change outside of version control, and it has since been lost after reinstalling GPyOpt while working on other issues. I will try to reproduce it by writing a standalone test and will share it here.

@lovis-heindrich

lovis-heindrich commented Oct 5, 2020

I have the same issue.

The randomness is fixed like this:

```python
import numpy as np
import GPyOpt

np.random.seed(123456)

# feasible_region and blackbox are defined earlier in my script
initial_design = GPyOpt.experiment_design.initial_design('random', feasible_region, 30)
objective = GPyOpt.core.task.SingleObjective(blackbox)
model = GPyOpt.models.GPModel(exact_feval=True, optimize_restarts=10, verbose=False)
acquisition_optimizer = GPyOpt.optimization.AcquisitionOptimizer(feasible_region)
acquisition = GPyOpt.acquisitions.AcquisitionEI(model, feasible_region, optimizer=acquisition_optimizer)
evaluator = GPyOpt.core.evaluators.Sequential(acquisition)
bo = GPyOpt.methods.ModularBayesianOptimization(model, feasible_region, objective,
                                                acquisition, evaluator, initial_design)

bo.run_optimization(max_iter=50, max_time=None, eps=1e-6, verbosity=True)
```

Even if the evaluated function contains some randomness, the seed should fix that too, shouldn't it? I also tried restarting my Jupyter kernel in between runs to rule out cached values, but that doesn't seem to be the cause, since I still got different results. Here are three results I got by restarting my Jupyter kernel and then running the optimization:

  1. 0.36285324 0.32473314 1.31686286
  2. 0.29892593 0. 1.
  3. 0.36293122 0.32411331 1.31771813
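
One caveat I'm aware of (a hedged aside, not a confirmed diagnosis): np.random.seed only pins numpy's legacy global generator, so randomness coming from Python's random module or from a separately constructed numpy Generator inside the objective would not be fixed by it. A tiny illustration:

```python
import random
import numpy as np

np.random.seed(123456)
print(np.random.normal())                # reproducible across runs
print(random.random())                   # Python's RNG: not controlled by the numpy seed
print(np.random.default_rng().normal())  # fresh unseeded Generator: also not controlled
```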
