-
-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: biteopt algorithm for global optimization #15298
Comments
A first thought is that it'd be preferable (if it's decided to add the algorithm) to have the entire minimiser code base ported to Pythran/Cython/Python.
|
Sounds like this might be difficult in the short term. Let's leave this open for dicussion though. |
For the RNG: if there is nothing special done, it should use the QMC generators we have in SciPy. Halton is flexible enough in terms of number of points and there is not anymore questions about convergence superiority over plain MC. Going to Pythran/Cython is not a requirement at first. As with DIRECT, we can first add the C version and convert it over time. What matters the most is the user facing API. |
QMC is a fine option to have, but I think users would expect the ability to use NumPy RNG. We did this sort of thing in gh-13330: |
Just to make sure we are talking about the same thing. I am only advising to use QMC if internally the optimizer needs to generate random numbers to generate a population (like in |
We were lucky with biasedurn because it was written with swappable random backends in mind (for integration with R). At a high level, what we did was implement a basic dependency injection scheme where we allowed assignment of rng functions (see base class function data members). The integrator is responsible for setting these data member functions to numpy rng generators before use. It looks like you could provide a new implementation for the CBiteRnd class that uses numpy and/or QMC under-the-hood. The same dependency injection scheme would probably work here -- that doesn't seem too difficult to me, but you can't really say until you try :) |
BiteOpt author here, stumbled upon this thread via websearch. "Better" PRNG won't make solver perform better. Just a waste of CPU cycles. Beside that, I've developed an astonishing PRNG myself (unlimited period), it also makes no sense to use it in BiteOpt. https://github.com/avaneev/prvhash As for the C++ code maintenance, I do not think this is an issue as I do all C++ development myself. So far no bugs were reported while according to PyPI analyzer, some 300 biteopt installations (of another precompiled Python wrapper) are done per month. |
Also important to note (so that no misunderstanding happens) is that BiteOpt is in a broader "stochastic global optimizer" category, it's not Differential Evolution. DE-alike population generator is used (among several others), but it's a non-standard DE, no crossover and no usual popindex randomization is used. I do have DE-alike optimizer in BiteOpt project called |
Why a better PRNG does not make a difference? The LCG used in BiteOpt passes 8MB window in PractRand. This is a lot more than necessary in a single optimization run. |
The discussion about replacing the PRNG is not about quality. It's about being able to use the same pseudorandom stream in all of one's code. In all of the other places |
I would say that PRNG in BiteOpt is "self-contained". In |
Hi @avaneev, thanks for dropping by. |
I do not think QMC will improve the method. E.g. if we are talking about 20-dimensional problem and a population size between 60 and 200, population sampling is so sparse relative to the problem's possible surface there's no difference in sampling method. |
On the contrary 200 points in 20 dimensions is well enough to show QMC's benefits. If you try to construct a response surface (with RBF, PC or GP for instance) you will see that it makes a huge difference. There is an extensive body of literature showing this. |
Maybe, if it's DE with 20*Dims, or in the thousands (there are some DE variants that converge extremely slowly and sample thousands of points, in an attempt to do "exhaustive" search). BiteOpt's popsize formula is 13+3*Dims. It really is not worth it, even the distribution of sampling does not play a role - I've extensively experimented with uniform, box, Gaussian. Just no measurable difference on big testsets. BiteOpt is not a pure DE - it has "random moves", not just "mutations". Sampling in DE may be important as it derives solutions from the sampled population. As I've replied earlier, there should be no misunderstanding that BiteOpt is not DE. |
I will wait to see a PR and try to experiment then 😃 Thanks for the discussion. I could understand that you did not see much difference with different distributions if the samples were from some kind of MC. MC is just extremely bad as the dimensionality grows and sampling a Gaussian or a uniform could lead to the same samples with a low number of points. It's similar to an integration problem in nD, exactly what QMC was made for and is known to have at worst the same convergence rate. (p.s. I am only talking about DE because we enabled use of QMC in SciPy's implementation. My comments are more general about sampling a small population in nD). |
I think QMC as a solution to sampling depends on the optimization method. DE or Nelder-Mead, whose solution generation depends on the diversity of initial population it may be useful, but even if so, it should be some "special" quasi-randomness, tuned to subsequent "derivations" via mutation operations like |
@avaneev It looks like you've written that integrality constraints can be added simply by rounding the decision variables. I know it might not be efficient, but currently, we don't have any global optimizers that claim to handle integrality constraints. It would be helpful to have something. Does it sound good to add Would a simple approach like this be reasonable with any of our other global optimizers? |
Yes, BiteOpt handles discontinuous variables acceptably. And a little number of boolean variables at the same time. But such variables, if they were discretized to only a few possible states, transform the problem into a combinatorial problem which BiteOpt and likely any other continuous method can't solve well. So, I would say BiteOpt can handle "discretized" continuous variables. |
scipy would have the advantage to use code which is actively being developed. If a PRNG backend agnostic version of biteopt is required, it would be harder to benefit from upstream updates. Still, if numpy RNGs are a strategic decision for scipy as users and depending libraries are very much used to them, that should have higher priority after all. While we are it: Maybe the importance of numpy PRNG should then also be prominent in the developer docs (might have missed it too). |
Here's an example of BiteOpt use in mixed-integer optimization, by Dietmar Wolz, an active participant of space trajectory optimization competitions. BiteOpt is fast at finding "good" solutions, and only slightly worse with an unlimited time budget (not the case in fast-paced competitions). Scipy's DE is basically unusable for this sort of applications. https://github.com/dietmarwo/fast-cma-es/blob/master/tutorials/Scheduling.adoc |
Biteopt clearly performs well enough (on our benchmarks and in other places) that it'd be great to have in SciPy. The two key issues for integration are:
(1) is discussed enough above (someone should try this as @mckib2 said), so I'll focus on (2). As @andyfaff said, maintenance issues can be quite painful when we integrate external C/C++/Fortran code bases. We are currently a month behind releasing The code is in okay-ish but not great state, it's ~3500 lines of C++ code (based on the That the code is actively maintained and improved is a significant plus, as is the fact that @avaneev shows up here to participate. The former means it does not make sense to translate the code to Cython/Pythran - and C/C++ with a thin wrapping layer is actually preferable anyway once code gets over a certain level of complexity. @dschmitz89 you've been fairly active recently, working on multiple optimizers - if you'd want to help maintain the added code, that would certainly be helpful here too. Given that
What would also be helpful is to modernize the upstream code a bit. Just running it through |
Sorry for off-topic, but the style clang-format creates is a "common denominator" style. First, it does not look good without a syntax highlight, and secondly, it turns constructs like after Just for perspective, I have 20 years of full-time C++ experience. Also, generalizing problems with C++ code with some packages to other instances of C++ code is not a great idea. |
@avaneev styles and tastes about styles vary, and there's probably not much point arguing about them. It's just that if we incorporate this code, it'd be preferable if it didn't look very foreign and therefore much harder to read to people who are used to "regular" or "common denominator" style.
You do have to realize that when we incorporate this code, it's going to go from 300 installs/month to 30,000,000 installs/month. So you're going to get the weirdest bug reports, on hardware platforms you don't have access to, and on corner cases you may or may not personally care about. The level of support we have to offer is unfortunately painfully high. |
Well, 300 installs/month without issues is a good indicator 30mln installs/month won't cause a trouble if supplied in binary form. If supplied in source code form it's another story as some systems may have an outdated C++ compiler. My other open-source C++ projects range from 1000 to 6000 installs/month, and even in source code form they generate no issues. In commercial field I have 100000 binary installs per month, across Windows and Mac OS. Not that this matters much, but I think you can rely on my experience at least. If there are bugs discovered, they will be fixed. |
Yes, unfortunately we have to support both packagers and end users with the most niche configs. xlC on AIX, mingw on Windows, custom compilers for HPC systems and cross-compiling, etc.
Thank you, much appreciated. Okay, I think the point on maintenance is clear. We do want to integrate
This is a separate thing to do, would indeed be nice to add a section on this at http://scipy.github.io/devdocs/dev/api-dev/api-dev-toc.html. |
I have this covered in my practice as well: I simply do not use C++ features beyond C++98 in my code; yes, not even STL, which increases compile times considerably (e.g. Eigen is a bad practice C++ IMO). As you can imagine, several thousands of C++ source code installs end up in very diverse developer workflows. Not to mention the compiled code ends up in millions of binary installs. |
This is not an appropriate forum. I am asking you to stop. You may write about whatever you want on your own repo or blog. If it's "hardly discussable" in those forums, it is not at all discussable here. Thank you. |
Okay, I'll stop. But I've noticed an interesting logic: if it's not discussible there, then it's not discussible here. |
Overly elliptical phrasing on my part. "If your issues about how BiteOpt is integrated into |
Hm. I think I've posted much more than "biteopt in fast-cma-es-related issues". Seems like you are somehow not wanting to refer to posts other than "biteopt in fast-cma-es-related issues"? But anyway, I won't continue. Don't want to hear "stop posting bullshit". |
Thank you. I can clarify more if you like in private email (address in my Github profile), but we can leave that off of here as the meta-discussion is also off-topic. |
I just want to mention that I'm not a novice in writing "garbage journal-quality papers", so I'm not just telling that figuratively. A bit of self-promotion: https://vixra.org/author/aleksey_vaneev It's worth to note a recent geomagnetic storm that floored Starlink satellites. That's plasma from Sun if I'm not mistaken. Unpredictable sattelite fall-off: unknown gravity perturbations. That's what one of my papers assumes-plasma can create gravity. A paper about "energy", if experiment can be replicated at all, would just destroy all energy business worldwide. Junk science, of course. |
This is a constructive idea to the community. When someone tells you CMA-ES is "fast", check the complexity analysis of "Symmetric tridiagonal QL algorithm", or something 100% equivalent that has less complexity. It's what makes CMA-ES uncompetitive on higher-dimensional non-expensive problems. SMA-ES is a result of my deconstruction of CMA-ES and then reconstruction using my own ideas. Source of creative inspiration, if scientists talk that way at all. |
You are continuing to be off-topic after promising to stop. This is not your personal blog. |
Then close it, if you have the authority. |
Or saying softer, you may simply unsubscribe. When a topic turns into a wrong direction, I usually just unsubscribe from updates. |
Just that you know, I'm officially diagnosed with "paranoid schizophrenia" in Syktyvkar's (my native city) psychiatric system. I can prove it, they were stupid enough to provide me an official signed reply on my request to change the diagnosis. Legally in Russia, what I do or say can be interpreted as behavioral disorder, unless, of course, I do some unlawful things. So, unfortunately, what I do or say may cause schizophrenia symtpoms in others, because I overcame the "disease" without anyone's help, and neuroleptics just made things worse (not taking them for quite a while already). The root case of schizophrenia is "cognitive dissonance", it can be so great that psychosis is imminent (it's a relief actually, resetting all cognitive stress). Any professional can induce it (so that you now) depending on his level and the level of "bullshit" barriers of a victim. Some psychiatrists want to know one's life completely, which is not exactly logical. This is not exactly a nice "bullshit" I know, but you can close a topic, really. If you thought that I want 'scipy' intergration, you are not exactly right. I want to talk to people, like any schizophrenic wants, but can't - the disorder can be so great the speech becomes completely unintelligible gibberish. And I think some psychiatrists can "educate" a victim to shift to such "style". |
This issue does not belong to you. Yes, I certainly have the authority to close it, but the issue that @dschmitz89 opened is perfectly on-topic and appropriate; you have hijacked it with your off-topic comments. Blocking you and you alone would be the next step if you persist. You have shown that you are perfectly capable of identifying when your comments are off-topic (you labeled them so) and you have been told that the off-topic comments are unwelcome and promised to stop. |
(I myself speak intelligible Russian, of course, but I witnessed an psychiatrical attempt, can't prove it, so treat it as an information) |
It's called don't give a microphone to a guy you do not know exactly well. Sorry, that's life, we all want to survive. |
It's truth mixed with lies, of course. But it's irrelevant, dates, numbers are there. |
I do not have a problem with psychiatry myself, the laws are just in Russia in this matter. But you are going to be classified a schizophrenic to the end of your life, which is not exactly right. It's a worldwide problem. |
And I'll add that I'm completely debt-free. Banks are owing me money. And they are sometimes acting like little sluts asking to invest here or there. |
I think "schizophrenics" that talk "unintelligible gibberish" can be healed - I witnessed they can say an intelligible phrase that does not dissonate with my perception of reality. To me it feels like medical science wants to silence such people (check the life expectancy adjustments of schizophrenics' population). The VERY first dose of mouth-dissolvable Olanzapine (Zyprexa), AFAIR, completely disorganized contact of my thought with my vocal tract. But I recovered after some time: not all brains are created equal, it's a basis of evolutionary theory. But I think I'm not the only one, of course, some are untrained or turned into complete human bullshit: populations evolve in cycles, that's why civilizations fall and any prior form of control falls apart. |
Talend |
Whatever that means. "Dietmar Wolz/Woltz" mentioned it. |
Complete, full loss of sight, a cancer of eyes, is the real, experienced lifetime fear, mainly for complete control freaks that rose to power. No need to wonder what a "black hole" is: Project-ion of fear to everyone in a hope somebody solves the problem and shares with everyone. It is an inbuilt populational protection mechanism. Some turned blind already during our lifetimes and were easily removed from the scene by forces noone knows about. |
I witnessed that local psychiatrists, somewhere around after 2011, told me a lie that I told them that I have a "black hole" in my stomach. Do not wonder why. |
Companies like Gogel may hope that a programmer's "work of art" they have bought hiring them has an unlimited timespan of copyright. While in a sane law system author's rights last for a specified timespan after his death, and then author's rights become "public domain" that, of course, cannot be copyrighted (otherwise it's schizo). It's usually too long in the current situation (70 years in some countries). Even if GGL hopes to have a copyright indefinitely, he can be duped to believe so by "deaths". CORPO-RATS. Even companies turn into shitty living organisms emanating decay from unknown reality. |
QoS, you know. ... a feeling something goes wrong, and may probably kill you in real life |
Like a nurturing and helping REAL mother... it's cognitive dissonance, a schizo. |
This issue has been brought to the attention of the Code of Conduct Committee. We want to acknowledge this is a difficult issue. The discussion here clearly has gone off-topic, and despite that being pointed out and being asked to stop, @avaneev hasn't stoppped posting. That said, we believe @avaneev isn't intentionally trying to harm anyone else here, and may have legitimate medical problems - we are not qualified to address those, but nevertheless will reach out privately to @avaneev for dialogue. Regarding this issue, our opinion is that the initial feature request and part of the discussion that followed it has value, however the issue now is too polluted to be useful. We will therefore close this issue and lock discussion, and we will summarize the useful parts in a new issue. We ask @avaneev and all other participants to please keep discussion on that and other issues on-topic, and respect our code of conduct. |
Is your feature request related to a problem? Please describe.
biteopt is a differential evolution like global optimizer which scored first in Scipy's benchmark suite in an extensive recent benchmark .
Besides great accuracy it is also very fast as it is completely implemented in C++: https://scipybiteopt.readthedocs.io/en/latest/Benchmark.html
A scipy.like API for biteopt is available at scipybiteopt based on this wrapper. The python wrapper is implemented based on Python's C API. Currently, the wrapper does not properly handle Python exceptions or errors. Maybe a ctypes or Cython wrapper would be a more robust alternative?
Describe the solution you'd like.
A function
scipy.optimize.biteopt
.Describe alternatives you've considered.
No response
Additional context (e.g. screenshots, GIFs)
Using the same benchmarking code in the DIRECT PR, I get the following performance plots for biteopt: 180 solved problems. This is 13 more than for
dual_annealing
anddirect
which seem to perform best currently.The text was updated successfully, but these errors were encountered: