[WIP] MAINT: Update optimize.shgo #14013
Thank you for all these improvements! I will try to take a look at it soon so that it has a chance of making it into the release.
This is a huge update, thanks again for putting this together.
I did a first pass, and I would first advise cleaning these up to make the review easier. Things like reverting the spelling changes back to American spelling would reduce the diff size a lot.
For workers, try to follow the same usage as in differential_evolution. There is also a helper function to handle the pool (_util.MapWrapper).
Also, there are a lot of TODOs. I would try to clean these up as well (put them in the issue with a checklist instead).
scipy/optimize/_shgo.py
Outdated
minimizer_kwargs=None, options=None, sampling_method='simplicial', | ||
workers=None): |
New arguments should be keywords only.
- minimizer_kwargs=None, options=None, sampling_method='simplicial',
- workers=None):
+ minimizer_kwargs=None, options=None, sampling_method='simplicial', *,
+ workers=None):
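To illustrate the keyword-only suggestion above, here is a minimal sketch. The function `solve` and its return value are hypothetical and are not the actual `shgo` signature; the point is only that a bare `*` in the parameter list forces every parameter after it to be passed by name.

```python
def solve(func, bounds, n=100, *, workers=1):
    """Hypothetical stand-in for a solver; `workers` is keyword-only."""
    # Everything after the bare `*` can only be supplied as `name=value`.
    return {"n": n, "workers": workers}


# Passing `workers` by name works.
ok_result = solve(abs, [(0, 1)], workers=2)

# Passing it positionally raises TypeError, so older positional call
# patterns cannot silently bind a value to the new argument.
try:
    solve(abs, [(0, 1)], 100, 2)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Making new arguments keyword-only also leaves room to reorder or insert parameters later without breaking callers.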
scipy/optimize/_shgo.py
Outdated
local minimisation routine every iteration. If False then only the | ||
final minimiser pool will be run. Defaults to False. |
American spelling with z should be used, as the whole documentation uses optimize. Here and in other places.
- local minimisation routine every iteration. If False then only the
- final minimiser pool will be run. Defaults to False.
+ local minimization routine every iteration. If False then only the
+ final minimizer pool will be run. Defaults to False.
scipy/optimize/_shgo.py
Outdated
workers : int optional | ||
Uses `multiprocessing.Pool <multiprocessing>`) to sample and run the | ||
local serial minimizatons in parrallel. |
Add the default and information about -1 (copy from differential_evolution).
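As a rough sketch of the `workers` convention being discussed (an integer process count, with -1 meaning "use all available cores", as documented for differential_evolution), something like the following could be used. The helper names `resolve_workers` and `parallel_map` are hypothetical and use only the standard library; they are not SciPy's `_util.MapWrapper`.

```python
import os
from multiprocessing import Pool


def resolve_workers(workers=1):
    """Hypothetical helper: turn a `workers` argument into a process count."""
    if workers == -1:
        # -1 is taken to mean "all available CPU cores".
        return os.cpu_count() or 1
    if not isinstance(workers, int) or workers < 1:
        raise ValueError("workers must be -1 or a positive integer")
    return workers


def parallel_map(func, items, workers=1):
    """Map `func` over `items`, serially by default, in a pool otherwise."""
    n_procs = resolve_workers(workers)
    if n_procs == 1:
        # Serial path: no pool is created, so nothing needs cleaning up.
        return list(map(func, items))
    # The context manager shuts the pool down even if an error occurs.
    with Pool(processes=n_procs) as pool:
        return pool.map(func, items)
```

With the default `workers=1` the call stays serial, which matches the reviewer's request that the default should not spawn processes.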
scipy/optimize/_shgo.py
Outdated
minimizer_kwargs=None, options=None, sampling_method='simplicial'): | ||
def shgo(func, bounds, args=(), constraints=None, n=100, iters=1, callback=None, | ||
minimizer_kwargs=None, options=None, sampling_method='simplicial', | ||
workers=None): |
Should default to 1 here, as in differential_evolution and other functions.
scipy/optimize/_shgo.py
Outdated
self.symmetry, self.bounds, self.g_cons, | ||
self.g_args) | ||
if self.disp: | ||
print('Constructing and refining simplicial complex graph ' |
Print or logging? Same below.
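For the print-versus-logging question, one possible pattern is to route all progress output through a module-level logger and keep the `disp` flag as the switch. This is only a sketch: the logger name, the function `report_progress`, and the message text are hypothetical, not the PR's code.

```python
import logging

# A named logger lets library users configure or silence the output
# themselves, which a bare print() does not allow.
logger = logging.getLogger("shgo_sketch")


def report_progress(n_vertices, disp=False):
    """Emit a progress message through logging only when `disp` is set."""
    if disp:
        # Lazy %-style formatting: the string is built only if the
        # record is actually emitted.
        logger.info("Refining complex with %d vertices", n_vertices)
```

A caller who wants to see the messages would enable them with, for example, `logging.basicConfig(level=logging.INFO)`.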
scipy/optimize/_shgo.py
Outdated
logging.info('=' * 30) | ||
|
||
if self.HC.V[x] not in self.minimizer_pool: | ||
self.minimizer_pool.append(self.HC.V[x]) | ||
|
||
if self.disp: | ||
logging.info('Neighbors:') | ||
logging.info('Neighbours:') |
These are not needed either (while in this case it does not really matter, it increases the diff which makes it harder to review). In general we use American spelling.
scipy/optimize/_shgo.py
Outdated
results['x'] = self.xl_maps[ind_sorted[0]] # Save global minima | ||
results['fun'] = self.f_maps[ind_sorted[0]] # Save global fun value | ||
|
||
self.xl_maps = np.ndarray.tolist(self.xl_maps) | ||
self.f_maps = np.ndarray.tolist(self.f_maps) | ||
return results | ||
|
||
# TODO: In scipy version delete this |
Also for caching there is functools.lru_cache
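A minimal sketch of what `functools.lru_cache` offers for caching function evaluations is shown below. The objective `cached_objective` and the counter are hypothetical; note that the cached function's arguments must be hashable, so a point is passed as a tuple rather than a list or NumPy array.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def cached_objective(x):
    """Hypothetical objective; `x` must be hashable (a tuple here)."""
    global call_count
    call_count += 1
    return sum(xi ** 2 for xi in x)


first = cached_objective((1.0, 2.0))   # computed
second = cached_objective((1.0, 2.0))  # served from the cache
```

`cached_objective.cache_info()` reports hits and misses, and `cached_objective.cache_clear()` resets the cache.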
scipy/optimize/tests/test__shgo.py
Outdated
import pytest | ||
from pytest import raises as assert_raises, warns | ||
from nose.tools import nottest |
Don't use nose. If you want to skip a test, use @pytest.mark.skip(reason="This is not run because ...")
scipy/optimize/_shgo_lib/_vertex.py
Outdated
self.process_fpool = self.proc_fpool_g | ||
else: | ||
self.workers = workers | ||
self.pool = mp.Pool(processes=workers) #TODO: Move this pool to |
Maybe rely on _util.MapWrapper instead
Thank you very much @tupui. Apologies about the docstrings; I think I accidentally pulled in the wrong direction in the last merge. I will update and clean this as soon as possible to get it up to review standard.
No problem at all 😃 |
BTW, we recently added bounds to Nelder-Mead, so the |
Hi @Stefan-Endres just checking in 😃 Any update? |
Hi, apologies for letting the time slip on this. I will have the weekend free to work on and hopefully finish this.
No problem 😃 Thanks! |
I did another superficial pass.
There are unchecked items in your list. Are you still working on these or is the PR ready for more review?
Did you check that the resource usage is constant? (CPU, RAM). With more workers in the loop and the rewrite, things can go bad fast.
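One way to sanity-check memory usage, sketched with the standard library's `tracemalloc`, is to record the peak allocation of a run and compare it across repeated runs or different settings. The helper `peak_memory` is hypothetical, and `tracemalloc` only sees allocations made by Python in the current process, so memory used inside worker processes would need to be measured separately.

```python
import tracemalloc


def peak_memory(func, *args, **kwargs):
    """Run `func` and return (result, peak bytes allocated during the call)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing so later measurements start clean.
        tracemalloc.stop()
    return result, peak


def build_list(n):
    """Stand-in workload; replace with the routine under test."""
    return [float(i) for i in range(n)]


_, peak_small = peak_memory(build_list, 1_000)
_, peak_large = peak_memory(build_list, 100_000)
```

If the peak keeps growing across repeated identical runs, that is a hint that something is being retained between runs.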
@@ -241,7 +260,7 @@ def shgo(func, bounds, args=(), constraints=None, n=None, iters=1, | |||
specified using the `bounds` argument. | |||
|
|||
While most of the theoretical advantages of SHGO are only proven for when | |||
``f(x)`` is a Lipschitz smooth function, the algorithm is also proven to | |||
``f(x)`` is a Lipschitz smooth function. the algorithm is also proven to |
- ``f(x)`` is a Lipschitz smooth function. the algorithm is also proven to
+ ``f(x)`` is a Lipschitz smooth function, the algorithm is also proven to
sfield=self.func, sfield_args=self.args, | ||
symmetry=self.symmetry, | ||
constraints=self.constraints, | ||
# constraints=self.g_cons, |
- # constraints=self.g_cons,
self.fn = 0 # Number of feasible sampling points evaluations performed | ||
self.hgr = 0 # Homology group rank | ||
|
||
# Default settings if no sampling criteria. | ||
if (self.n is None) and (self.iters is None): | ||
self.n = 128 | ||
self.nc = 0 # self.n |
- self.nc = 0 # self.n
+ self.nc = 0
self.n = n # Sampling points per iteration | ||
self.nc = n # Sampling points to sample in current iteration | ||
self.nc = 0 # n # Sampling points to sample in current iteration |
- self.nc = 0 # n # Sampling points to sample in current iteration
+ self.nc = 0 # Sampling points to sample in current iteration
if (not isinstance(constraints, tuple)) and (not | ||
isinstance(constraints, list)): |
- if (not isinstance(constraints, tuple)) and (not
- isinstance(constraints, list)):
+ if (not isinstance(constraints, tuple)) and \
+ (not isinstance(constraints, list)):
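As a side note to the suggestion above, `isinstance` accepts a tuple of types, so the same check can be written on one line without a continuation. This is an alternative phrasing, not what the reviewer proposed, and the helper `as_constraint_tuple` is hypothetical.

```python
def as_constraint_tuple(constraints):
    """Hypothetical helper: wrap a single constraint so callers can iterate."""
    # isinstance(obj, (A, B)) is equivalent to
    # isinstance(obj, A) or isinstance(obj, B).
    if not isinstance(constraints, (tuple, list)):
        return (constraints,)
    return tuple(constraints)


single = as_constraint_tuple({"type": "ineq"})
several = as_constraint_tuple([{"type": "ineq"}, {"type": "eq"}])
```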
# Generate sampling points. | ||
# Generate uniform sample points in [0, 1]^m \subset R^m | ||
if self.n_sampled == 0: | ||
self.C = self.sobol_points(n, dim) |
Where is sobol_points defined?
import itertools | ||
import json | ||
import decimal | ||
from functools import lru_cache # For Python 3 only |
- from functools import lru_cache # For Python 3 only
+ from functools import lru_cache
HC.V: The cache of vertices and their connection | ||
HC.H: Storage structure of all vertex groups | ||
|
||
:param dim: int, Spatial dimensionality of the complex R^dim |
The doc is in the wrong format here, should be numpydoc.
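For reference, the quoted `:param dim:` line might look roughly like this in numpydoc style. The class `Complex` below is a hypothetical stub used only to carry the docstring; the descriptions are taken from the quoted snippet, and the attribute types are left unspecified because the snippet does not give them.

```python
class Complex:
    """
    Illustrative stub showing numpydoc section layout.

    Parameters
    ----------
    dim : int
        Spatial dimensionality of the complex R^dim.

    Attributes
    ----------
    V
        The cache of vertices and their connection.
    H
        Storage structure of all vertex groups.
    """

    def __init__(self, dim):
        self.dim = dim
        self.V = {}  # placeholder container for the sketch
        self.H = []  # placeholder container for the sketch
```

numpydoc uses underlined section headers (`Parameters`, `Returns`, `Attributes`, and so on) with a `name : type` line followed by an indented description, rather than `:param name:` fields.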
|
||
# Printing | ||
if printout: | ||
print("=" * 19) |
Some print calls are left.
""" | ||
Cache objects | ||
""" | ||
class VertexCacheBase(object): |
Not needed anymore
- class VertexCacheBase(object):
+ class VertexCacheBase:
Hi @tupui, apologies, I should've commented that I need a bit more time before another review pass. Aside from the TODOs, there are also a few more things to implement from your earlier comments in this PR. I compared CPU times on the GO benchmarks, but not RAM usage yet; I didn't think about that. I suppose the appropriate measure would be to benchmark memory usage on the for |
No worries, let me know when you're ready or need help with something 😃
Thanks! Would be great so that we have full confidence with all the changes. |
@Stefan-Endres : I was thinking of adding support for the Bounds class for shgo. Is it worthwhile at the moment before this PR is merged? |
I think it's worth doing now as I would need to rebase this PR anyway, I am not certain when I will have full free weekend again to finish this. |
FYI we've merged the improvement @Stefan-Endres. |
Hi, I'm just checking in to see if this update is already in the latest version of scipy because I'm still having problems of this kind. If it is already included, I can open a new issue so that maybe someone can look at it. |
Hi @wilrop, thanks for checking in. Please don't open any new issue on this. This PR is still not merged and would address the issues. @Stefan-Endres is still working on it. |
@Stefan-Endres I am adding a release milestone. We have until around June to finish this. |
@Stefan-Endres FYI the branching for 1.9 is next week (26 of May). Let me know if I can help. Otherwise this will be pushed to 1.10. |
Great, thank you! I will be on the lookout to review. The memory leak check is just (I hope) a sanity check, so yes, just running it twice, once with a low worker count and once with a high worker count, is fine. |
I ran into compilation issues while rebasing this PR last night and am currently running the test suite. Some preliminary results indicate memory usage of
Running
import sys
from memory_profiler import profile
from pympler import asizeof
from scipy.optimize._shgo import SHGO
from scipy.optimize import rosen
# Define SHGO class
bounds = [(0, 2), (0, 2), (0, 2), (0, 2)]
shc1 = SHGO(rosen, bounds, iters=4)  # With only 1 worker
shc2 = SHGO(rosen, bounds, iters=4, workers=2)  # With 2 workers using multiprocessing
shc1.iterate_all(), shc2.iterate_all()
print('Results with 1 cpu:')
print(f'asizeof.asizeof(shc1) (SHGO class) = {asizeof.asizeof(shc1)}')
# Report the cache container of shc1 here (the original line measured shc2.HC)
print(f'asizeof.asizeof(shc1.HC) (Cache container) = {asizeof.asizeof(shc1.HC)}')
print('-')
print('Results with 2 cpus:')
print(f'asizeof.asizeof(shc2) (SHGO class) = {asizeof.asizeof(shc2)}')
print(f'asizeof.asizeof(shc2.HC) (Cache container) = {asizeof.asizeof(shc2.HC)}')
print('-')
print('Difference in memory usage:')
print(f'SHGO class: asizeof.asizeof(shc1) - asizeof.asizeof(shc2) = {asizeof.asizeof(shc1) - asizeof.asizeof(shc2)}')
print(f'Cache containers: asizeof.asizeof(shc1.HC) - asizeof.asizeof(shc2.HC) = {asizeof.asizeof(shc1.HC) - asizeof.asizeof(shc2.HC)}')
produces
However, using the shgo library it is only about 0.003% bigger:
from shgo._shgo import SHGO  # Use shgo instead of scipy version
from scipy.optimize import rosen
Produces
I don't fully understand why the scipy code, which is identical, has a bigger overhead for multiprocessing. The next step is to finish rebasing and rerun the same tests on all the scipy GO benchmarks. Please let me know if anyone prefers different methods of measurement for testing the memory usage. |
…ns that occur for large parameter values (#14036) Avoid the numerical issues in `roots_jacobi`, `roots_gegenbauer`, and `_gen_roots_and_weights` that occur for large values of a and b in `roots_jacobi(n,a,b)`. Also helps with large values of n (i.e. n > 200). [ci skip]
Apologies, no idea why this happened. @tupui perhaps it would be better to close this PR and start a new one? Despite rebasing I still get this error when trying to build and run the tests:
|
Seems like you rebased against a maintenance branch and not main. @Stefan-Endres feel free to open a new PR if that's easier for you. FYI, since you opened this PR the infrastructure has evolved a bit and we have a new convenient developer interface. If you have issues, I would suggest looking into creating a new conda environment with the meson yaml and using |
I'm not sure how to undo the rebase and GitHub automatically adding a bunch of reviewer requests etc. I think it will be easier if you don't mind.
Thank you, I will try this after work tonight instead of the old build instructions. |
Let me know if you need help. We can make a call as well 😃 |
Hi, I'm probably super late to the party but I think there is a (different?) bug in how constraints are interpreted. If I add a single dictionary as a constraint everything is fine, but if I give a list of dictionaries as constraints it completely ignores all of them. Is this something that is handled with the current improvements or completely new? I've developed a minimal example, so I can create a new issue if that helps. |
@wilrop if this bug is different from existing ones you are free to create an issue. Another possibility is that you wait a bit for an update here so we can try to reproduce your bug with latest changes (better IMO). |
Okay no problem, I'll wait and open an issue if the problem persists. |
@wilrop please open an issue with your code, there are unittests that have tuples of active constraints so this is possibly a new bug. |
I've just discovered that the bug is not specific to the list of constraints but rather with indexing in the |
Solution suggested by jjramsey in scipy#14589 (comment). The answer at the time was that an important refactor was in progress (scipy#14013). This pull request is now closed and the bug persists in version 1.8.1, according to this Stack Overflow question: https://stackoverflow.com/q/72794609/12750353
Reference issues/fixes
args (should also resolve scipy.optimize.shgo(): 'args' is incorrectly passed to constraint function when 'sobol' sampling is used #12114).
Improved features
workers argument.
SHGO.iterate() until a "satisfactory" solution is found when no other stopping criteria is available (commonly requested feature).
Code streamlining
_shgo.py file to improve maintainability and readability.
-- (The main reason this PR for _shgo_lib is unfortunately so large is because adopting a new cache turned out to be the only way to solve the issue with the simplicial sampling method not iterating correctly).
--
if a is 'teststring':
-- type(thing)
Benchmark profiles
As a preliminary test I reran the GO benchmarks for the simplicial method and quickly compared them to the previous results by overlaying a transparent window of the new performance profiles over the previous results. As can be seen below, the new simplicial method (darker blue solid line overlay) is noticeably faster than the previous results (lighter blue solid line).
The increased performance is largely due to fixing the issue that required the sampling method to refine the entire complex instead of a fixed number of points.
Still to do is to rerun all the results and new sampling methods together with the other GO methods.
TODO:
Sobol and Halton sequences to improve deterministic performance.
optimize.shgo #13813, but as this pull request is.
Proposed reviewers
@tupui @andyfaff