
Usability of optimize.shgo #13813

Open · 11 tasks

andyfaff opened this issue Apr 6, 2021 · 6 comments

Comments

@andyfaff
Contributor

andyfaff commented Apr 6, 2021

I've been trying to familiarise myself with the operation of optimize.shgo, and have run into several usability issues which limit the ability of a general user to apply it to their problems. A lot of these could be addressed through better documentation.

  • it's not clear how n and iters need to be adjusted to solve a particular problem, or how they should be varied if the dimensionality changes.

  • it's not clear how many function evaluations (nfev) the user should expect when they change the different solver options. How does nfev scale with iters, n and sampling_method? By experimenting, it looks like there are a total of n * iters function evaluations (plus any local minimizer evaluations) if the sampling method is sobol (this isn't documented), i.e. nfev - nlfev = n * iters.
    When the sampling method is changed to simplicial, the total number of function evaluations appears to be much larger, with no clear dependence on n or iters.

  • When sampling_method = 'sobol' and iters = 1, it still looks like there are two iterations done. This might be a bug (a short reproduction sketch follows this list):

nfev: 1067   # output when iters=1, n=100, sampling_method='sobol'
nit: 2
nlfev: 867
  • Setting maxev does not have any bearing on the total number of function evaluations actually used. For one example, I tried setting options={'maxfev': 100, 'maxev': 100}, but 750 function evaluations were used overall. How does one limit nfev?
  • When is the callback actually called? I counted the number of times the callback function was called, and it's not equal to iters. My expectation is that it would be called iters times.
  • The callback function is not called with the best value of x found so far, it varies quite a bit. From my point of view this renders the callback of little use; is there a good reason for this?
  • When a callback is provided, a message Callback for minimizer starting at *x* is printed. There's no way of removing that message; I tried setting options['disp']=False, but to little effect.
  • It's not clear from the OptimizeResult why/how the sampling terminates. There is a minhgrd and f_tol, but the reason for termination is not generally given.
  • Is f_tol an absolute tolerance or a relative tolerance? Should the minimiser have both?
  • The default value for f_tol is not explained.
  • How many local minimisations are done at each iteration of the sampling? When are the local minimizations done? From the docstring minimize_every_iter defaults to False, and the default value of local_iter is not mentioned in the docstring. I would naively expect a total of 1 local minimization if iters=5 and the default options apply (minimize_every_iter should be False)
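
A short reproduction sketch of the sobol observations above (the Rosenbrock test function and the bounds are placeholders; the exact counts depend on the SciPy version):

```python
# Sketch only: reproduces the nfev/nit/nlfev inspection described above.
from scipy.optimize import shgo, rosen

bounds = [(-2.0, 2.0)] * 2

# iters=1 with the sobol sampler; one would expect the sampling cost to be
# roughly n * iters objective evaluations on top of the local minimizer's
# evaluations (nlfev), i.e. nfev - nlfev ≈ n * iters.
res = shgo(rosen, bounds, n=100, iters=1, sampling_method='sobol')

print('nfev :', res.nfev)
print('nit  :', res.nit)    # reportedly 2 even though iters=1
print('nlfev:', res.nlfev)
print('nfev - nlfev:', res.nfev - res.nlfev)
```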
@andyfaff
Contributor Author

andyfaff commented Apr 6, 2021

@Stefan-Endres

@tupui
Member

tupui commented Apr 6, 2021

The points about sobol are solved in #13469, which is ready to merge on my side (there is indeed a bug). This PR is currently blocking @Stefan-Endres, as he wanted to ship enhancements after it.

@Stefan-Endres
Contributor

A lot of these could be addressed through better documentation.

I definitely agree. I am working on the documentation before pulling the code into scipy. Currently, what is eating most of my time is that I wrote a new caching layer for the sampling step which fixed all of the bugs mentioned here, and I'm trying to clean and document the code before opening a PR to scipy. In addition, I need to update the parallelisation to be consistent with the other routines in scipy and avoid any extra dependency, and finally address the issues mentioned in #13469.

When the sampling method is changed to simplicial, the total number of function evaluations appears to be much larger, with no clear dependence on n or iters.

This is a bug that was fixed in the upstream repository.

When sampling_method = 'sobol' and iters = 1, it still looks like there are two iterations done. This might be a bug:

There is a known bug where optimize.shgo always does one extra iteration. This has been fixed in the upstream repository.

Setting maxev does not have any bearing on the total number of function evaluations actually used.

  • maxev is the maximum number of evaluations, including constraint evaluations at sampling points that are infeasible (in which case the objective function is not evaluated).
  • maxfev is the maximum number of objective function evaluations at sampling points where the constraints are feasible.

Perhaps a small tutorial problem would be worth writing to demonstrate the difference.
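
A small tutorial-style sketch along those lines (the objective, constraint and option values are made up for illustration): sampling points that violate the constraint only incur a constraint evaluation, which counts towards maxev but not towards maxfev.

```python
# Sketch only: contrasting maxev and maxfev on a constrained problem.
from scipy.optimize import shgo

def objective(x):
    # cheap quadratic objective, evaluated only at feasible sampling points
    return x[0]**2 + x[1]**2

# Feasible region: x0 + x1 >= 1 (SciPy convention: g(x) >= 0 for 'ineq')
constraints = [{'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1.0}]
bounds = [(-2.0, 2.0)] * 2

# maxev caps all sampling-point evaluations (feasible or not),
# maxfev caps objective evaluations at feasible points only.
res = shgo(objective, bounds, constraints=constraints,
           options={'maxev': 200, 'maxfev': 100})

print(res.nfev, res.message)
```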

When is the callback actually called? I counted the number of times the callback function was called, and it's not equal to iters. My expectation is that it would be called iters times.
The callback function is not called with the best value of x found so far, it varies quite a bit. From my point of view this renders the callback of little use; is there a good reason for this?

It is called in the local minimization routine. The current callback is used only in the scipy.optimize.minimize step. It might be more appropriate to have separate callback functions for the global and local routines, so that the current best global solution can be returned?

When a callback is provided, a message Callback for minimizer starting at x is printed. There's no way of removing that message; I tried setting options['disp']=False, but to little effect.

I see, thank you, this is a bug that still needs to be fixed.

It's not clear from the OptimizeResult why/how the sampling terminates. There is a minhgrd and f_tol, but the reason for termination is not generally given.

Is f_tol an absolute tolerance or a relative tolerance? Should the minimiser have both?

It is a relative tolerance; I will update this. I don't know about having both. I would assume that only a relative tolerance is needed, unless there are applications where an absolute tolerance is required (currently, if the solution vector is at 0.0 then an absolute tolerance is used; everywhere else a relative tolerance is used).
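
A rough paraphrase of that description in code (this is not the SciPy implementation; the names and the exact comparison are hypothetical):

```python
def f_tol_satisfied(f_best, f_min, f_tol):
    # Paraphrase of the behaviour described above: a relative comparison in
    # general, falling back to an absolute comparison when the target is 0.0.
    if f_min == 0.0:
        return abs(f_best) <= f_tol
    return abs(f_best - f_min) <= f_tol * abs(f_min)
```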

How many local minimisations are done at each iteration of the sampling?
From the docstring minimize_every_iter defaults to False, and the default value of local_iter is not mentioned in the docstring. I would naively expect a total of 1 local minimization if iters=5 and the default options apply (minimize_every_iter should be False)

This depends on how many proven basins of attraction were found in a given sampling step, or over several iterations without minimization.

When are the local minimizations done?

For each attractive basin a local minimization should only be done once. There is a bug in the current scipy version where the same basin of attraction could be minimized again in the following iterations. This bug has been fixed with the new caching (this was the main reason for writing the new file, which is unfortunately very large right now).

In general, there are a large number of options and the docstring is already quite large, but a lot of the confusion pointed out here is due to bugs. I think the usability of shgo could mainly be improved with more tutorial examples, in addition to fixing the problems pointed out in this issue.

@andyfaff
Contributor Author

andyfaff commented Apr 6, 2021

People often like to know how far through the minimisation procedure they are, perhaps using a progress bar like tqdm.

If you know the total length of computation (e.g. with differential_evolution there may be a maxiter), then you can express this as a percentage, and update a progress bar as the iterations finish and the callback is called.

If you don't know the total length of computation you can, e.g., update a progress bar with the minimum cost function value found so far. If you get the best x so far in the callback this is relatively easy; if you don't, then you have to cache the best cost function value yourself (a few more lines of code) for comparison. Either way, you need to know what x value is being given to the callback.
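
A minimal sketch of that caching approach, assuming the callback receives the candidate x from the local minimizer (as it currently does); tqdm is optional here and used only for display:

```python
# Sketch only: cache the lowest cost seen by the callback and show it in a
# tqdm progress bar. shgo does not pass the best-so-far x, so the callback
# re-evaluates the objective and keeps its own running minimum.
import numpy as np
from scipy.optimize import shgo, rosen
from tqdm import tqdm

bounds = [(-2.0, 2.0)] * 2
pbar = tqdm(desc='shgo')
best = {'fun': np.inf}

def callback(x):
    f = rosen(x)
    if f < best['fun']:
        best['fun'] = f
    pbar.update(1)                       # one tick per callback invocation
    pbar.set_postfix(best=best['fun'])   # lowest cost seen so far

res = shgo(rosen, bounds, iters=3, callback=callback)
pbar.close()
print(res.x, res.fun)
```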

@tupui
Member

tupui commented Apr 7, 2021

People often like to know how far through the minimisation procedure they are, perhaps using a progress bar like tqdm.

If you know the total length of computation (e.g. with differential_evolution there may be a maxiter), then you can express this as a percentage, and update a progress bar as the iterations finish and the callback is called.

This is actually interesting and could be shown in the docs. We could even think about having an optional dependency on tqdm for that. It would be cool and would be like PyMC3.

@yuriok

yuriok commented Jan 1, 2024

I am currently attempting to solve an optimization problem using shgo. During the process, I found that the parameter n can control the number of sampling points generated. However, only a portion of the sampling points undergo local optimization in the final output. After inspecting the source code, I discovered that this happens at line 1133 of scipy/optimize/_shgo.py: the code checks whether each sampling point is the lowest among its neighbouring points. In my view, this behavior is intended to optimize only those points that could potentially be local minima, thereby reducing calculation time. However, for my optimization problem this operation may not be suitable. Would it be possible to add a parameter option to shgo to control this filtering behavior?
