
Optimization chooses and gets stuck at an infeasible point #93

Closed
jkaardal opened this issue May 22, 2017 · 10 comments

@jkaardal

jkaardal commented May 22, 2017

Hi GPyOpt developers. I have an issue where GPyOpt chooses an infeasible next point when the number of variables in my problem exceeds 8 and then immediately converges to a suboptimal infeasible point (infeasible with respect to the inequality constraints; the chosen point is still within the domain). I am optimizing multiple l2-norm regularization parameters, collected in an n-dimensional vector x, where the unknown function f(x) is the optimum of a regularized logistic regression with respect to some weights w, subject to the inequality constraints x_1 <= x_2 <= x_3 <= ... <= x_n. The purpose of these constraints is to reduce computation time, since any permutation of the elements of x results in the same value of f(x). Here is an example of how I set up the bounds and constraints for an n = 5 problem before passing them into BayesianOptimization():

domain = [{'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_1_b'},
          {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_2_b'},
          {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_3_b'},
          {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_4_b'},
          {'domain': (0.0, 1.0), 'type': 'continuous', 'name': 'x_5_b'}]

constraints = [{'constrain': 'x[:,0] - x[:,1]', 'name': 'x_1_2_c'},
               {'constrain': 'x[:,1] - x[:,2]', 'name': 'x_2_3_c'},
               {'constrain': 'x[:,2] - x[:,3]', 'name': 'x_3_4_c'},
               {'constrain': 'x[:,3] - x[:,4]', 'name': 'x_4_5_c'}]

Other information: I am using the Matern32 kernel (the same phenomenon occurs with Matern52) with maximum likelihood updates of the Gaussian process hyperparameters and the lower confidence bound acquisition function (default values, except the jitter is set to 1e-4 to protect against singularity of the kernel matrix). The optimization has worked fine with these constraints when n < 9, and also works for n > 8 when the constraints are removed.
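For completeness, this is roughly how everything gets passed into BayesianOptimization(); the objective below is just a stand-in for the real regularized logistic regression and the keyword values are illustrative:

import numpy as np
import GPyOpt

# Stand-in objective (the real f(x) solves a regularized logistic regression).
def f(x):
    return np.sum((x - 0.3)**2, axis=1, keepdims=True)

bayes_opt = GPyOpt.methods.BayesianOptimization(
    f=f,
    domain=domain,              # bounds defined above
    constraints=constraints,    # ordering constraints defined above
    acquisition_type='LCB',     # lower confidence bound
    initial_design_numdata=10)

bayes_opt.run_optimization(max_iter=50)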

@javiergonzalezh
Member

javiergonzalezh commented May 22, 2017 via email

@jkaardal
Author

jkaardal commented May 22, 2017

Hi Javier, thank you for the suggestion.

I previously tried a few runs with a jitter of 0.01 and the algorithm still chose an infeasible point after the initialization trials. I just tried jitters of 1.0, 10.0, and 100.0 and ran each a number of times. With these jitters the first acquired point is only sometimes feasible; probably a bit more often than chance, but still less often than it is infeasible. There may be a slight improvement in this probability, but it is hard to say for certain since the logistic regression itself takes a while to run, and I worry that the jitter is becoming impractically large while infeasible initial acquisitions still occur.

Out of curiosity, what optimization algorithm is being used to maximize the acquisition function? From what I can surmise, the default appears to be fmin_l_bfgs_b from scipy, right? What strategy is used to enforce the inequality constraints? I have an interior-point algorithm I can try if there is an easy way to tell GPyOpt to use an external optimizer.
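If it is fmin_l_bfgs_b, my understanding of the call is roughly the following (illustrative only, with a dummy acquisition); the box bounds are respected, but the ordering constraints are simply invisible to it:

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Dummy stand-in for the negative acquisition -LCB(x).
def neg_acqu(x):
    return np.sum((x - 0.5)**2)

n = 5
bounds = [(0.0, 1.0)] * n
x0 = np.random.uniform(0.0, 1.0, n)   # starting point may already violate the ordering
x_opt, f_opt, info = fmin_l_bfgs_b(neg_acqu, x0, approx_grad=True, bounds=bounds)
# x_opt satisfies the box bounds, but nothing forces x_opt[0] <= ... <= x_opt[n-1].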

@jkaardal
Author

jkaardal commented May 22, 2017

I should add that when I let GPyOpt choose random initialization trials, these random points are in the correct feasible space but the initial acquisition point following the initialization phase is still usually infeasible.

Edit: Hm, actually it looks like when n = 10 the initialization phase has trouble finding feasible points to try and the program hangs. I guess it isn't surprising since those constraints become increasingly difficult to satisfy as n increases when drawing random points from the domain.
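To put a number on that: a uniform draw from the box satisfies x_1 <= ... <= x_n with probability 1/n!, so rejection sampling collapses quickly. A quick back-of-the-envelope check:

import math
import numpy as np

# Probability that an i.i.d. uniform draw is already sorted is 1/n!.
for n in (4, 8, 10):
    print(n, 1.0 / math.factorial(n))
# n=4: ~4.2e-2, n=8: ~2.5e-5, n=10: ~2.8e-7

# Empirical hit rate at n = 8 (illustrative).
samples = np.random.uniform(0.0, 1.0, size=(200000, 8))
print(np.mean(np.all(np.diff(samples, axis=1) >= 0, axis=1)))   # on the order of 2.5e-5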

@jkaardal
Author

Here is a minimal working example that reproduces the problem (SMF_example.py). Note that the algorithm nearly always terminates immediately at an infeasible point when n = 8, but at n = 10 the program hangs during the initialization phase. If n is decreased below 8, the incidence of infeasibility decreases substantially, becoming virtually non-existent at n = 4. When an infeasible acquisition occurs, the example code prints an alert to the screen.

@jkaardal
Author

jkaardal commented May 25, 2017

I ended up locating the problem. The issue is that samples_multidimensional_uniform() from the GPyOpt.util.general submodule draws a finite number of random samples within the bounds but neglects the inequality constraints. This causes problems later on, during the minimization of the negative acquisition function, because the L-BFGS line search cannot find an f_acqu < 0. This explanation is also consistent with the difficulty observed in drawing a feasible initial x from a uniform distribution at n = 10.

I would imagine the only quick and practical solution for this would be to allow the user to customize or replace samples_multidimensional_uniform() to meet problem-specific constraints. I attempted to do this by replacing the submodule function (e.g. GPyOpt.util.general.samples_multidimensional_uniform = mySampling) which worked for the initialization phase but reverted back to the default when I ran bayes_opt.run_optimization(max_iter=max_iter, eps=eps), probably due to a re-import of GPyOpt.util.general somewhere. Instead, modifying samples_multidimensional_uniform() in the GPyOpt source code fixed the problem (but, of course, this is not a general solution).
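For reference, the replacement I tried looks roughly like this. Since f(x) is permutation invariant, sorting each uniform draw yields feasible points without biasing the search (this relies on all of the bounds being identical, and assumes the original signature is samples_multidimensional_uniform(bounds, num_data)):

import numpy as np
import GPyOpt.util.general

def sorted_uniform_samples(bounds, num_data):
    # Draw uniformly within the box, then sort each sample so that
    # x_1 <= x_2 <= ... <= x_n holds by construction.
    bounds = np.asarray(bounds, dtype=float)
    Z = np.random.uniform(bounds[:, 0], bounds[:, 1], size=(num_data, len(bounds)))
    return np.sort(Z, axis=1)

# The monkey-patch described above; as noted, it can get undone if the
# submodule is re-imported somewhere inside GPyOpt.
GPyOpt.util.general.samples_multidimensional_uniform = sorted_uniform_samples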

A less practical but more general/automatic solution might be to provide an option to use fmin_cobyla or fmin_slsqp from scipy for the acquisition function maximization, since these allow infeasible starting points (at least fmin_cobyla is supposed to), but this may require a substantial modification of the code.
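Sketched out, an SLSQP-based acquisition optimization could look something like this (again with a dummy acquisition; SLSQP expects inequality constraints as g(x) >= 0, so the ordering constraints become consecutive differences):

import numpy as np
from scipy.optimize import fmin_slsqp

# Dummy stand-in for the negative acquisition -LCB(x).
def neg_acqu(x):
    return np.sum((x - 0.5)**2)

n = 5
bounds = [(0.0, 1.0)] * n
ordering = lambda x: np.diff(x)   # x_{i+1} - x_i >= 0 enforces x_1 <= ... <= x_n

x0 = np.random.uniform(0.0, 1.0, n)
x_opt = fmin_slsqp(neg_acqu, x0, f_ieqcons=ordering, bounds=bounds, iprint=0)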

@javiergonzalezh
Member

javiergonzalezh commented May 25, 2017 via email

@jkaardal
Author

Certainly, I can do that. I should be able to send a pull request sometime next week after I do some more testing. I will probably try to implement the intermediate solution you suggested as well if I have time.

@nakpinar

nakpinar commented Jul 7, 2017

Hi Joel,
Do you by any chance already have a pull request ready? I stumbled upon the same problem...
Thank you!

@jkaardal
Author

jkaardal commented Jul 7, 2017

Hi nakpinar,

I ended up getting distracted by other obligations and have not yet had the chance to send the pull request. I was also holding off because I was somewhat dissatisfied with my solution: it just has the user pass a new keyword argument that replaces GPyOpt.util.general.samples_multidimensional_uniform with a custom sampling function, rather than being the general, automatic solution discussed in the May 24 post.

Having thought about it more, however, I am not sure an automatic solution is possible for anything but the simplest constraints (e.g. linear constraints). I have tested both my own interior-point algorithm and scipy's fmin_slsqp on nonlinear constraint satisfaction problems and found that there remains a significant danger of "converging" to infeasible points. I would imagine fmin_cobyla would not fare any better, since it is an even more primitive approach to constrained optimization (but I have not tested this).

As an aside, there is an intuitive explanation for why infeasible "convergence" might occur with nonlinear constraints. If one tries to find a feasible point by minimizing the sum of squared violations of the active constraints (a least-squares problem), the objective is zero at any feasible point. But for nonlinear constraints this objective can be non-convex and have local minima with non-zero value, and a local solver can get stuck there.
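A toy one-dimensional illustration of that aside (the constraint and numbers below are made up just to show the effect):

import numpy as np
from scipy.optimize import minimize

# One nonlinear inequality constraint g(x) <= 0; feasible only for |x| >~ 1.49,
# but g has a strictly positive local minimum at x = 0.
def g(x):
    return -x**4 + 2.0*x**2 + 0.5

# Sum of squared violations (here just the one constraint).
def squared_violation(x):
    return float(np.maximum(0.0, g(x[0]))**2)

res = minimize(squared_violation, x0=[0.1])
print(res.x, res.fun)   # typically stalls near x = 0 with value ~0.25 > 0, an infeasible point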

I will send a pull request in the next couple days with the simple keyword argument solution after I run the tests.

Question for the developers: Why is the value for the key 'constrain' a string (that is later evaluated) rather than a function? Is this meant to give the user access to local and class variables?
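(My guess is that a string is convenient because it can be evaluated in bulk against a whole array of candidate points, something along these lines, with feasibility meaning the expression is <= 0; but I would like to confirm.)

import numpy as np

x = np.random.uniform(0.0, 1.0, size=(6, 5))   # 6 candidate points, n = 5
expr = 'x[:,0] - x[:,1]'                        # same format as the 'constrain' strings above
feasible = eval(expr) <= 0                      # boolean mask over the candidate points
print(feasible)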

@javiergonzalezh
Member

Thanks for the PR! Closing the issue now.
