ENH: vectorize scalar zero-search-functions #8357
Conversation
* add test for newton with arrays
* import asarray and np.abs
* check any of the derivatives for zero, cast fder to array
* check all newton step deltas, use numpy abs to cast (p - p0) to array

Signed-off-by: Mark Mikofski <bwana.marko@yahoo.com>
Hi! I suggest to change the title to explain what it is --- BUG, ENH, etc, just look at the recent pull request. And move the references to the issues into the description, so everyone could follow the links (they are not useful being in the title). |
I took the liberty to do that actually, also to GitHub-link the issue in question. |
* make pep8 happy
* instead of just at testing, so that halley and secant are easier
* remove asarray(fder == 0).any() when testing if any fder is zero; just use (fder == 0).any() since fder is now an ndarray by default
* still need np_abs(p - p0) because there is no .abs() method for ndarrays
* convert initial guess, p0 -> ndarray using asarray(1.0 * p0)
* convert fder -> ndarray using asarray(fprime(*myargs))
* convert fval -> ndarray using asarray(func(*myargs))
* just compare max newton step instead of all of them, e.g.: abs(p - p0).max() < tol
* use the non-variant form of halley's, e.g.: dx = -1 / (fder/fval - fder2/fder/2 - fder(N+1)/fderN/N ...)
* use boolean indexing for parabolic halley's variant, but comment out
* add test for arrays in newton for secant
* remove sign import, but import where from numpy
* replace 1e-4 with machine-specific delta as finfo(float).eps**0.33
* convert x0 to ndarray using asarray
* use where to make array of +dx where x0 >= 0 else -dx
* combine if-else conditions into 1 line
* convert q0 and q1 to ndarray using asarray
* add comment "check for divide by zero"
* use .any() on arrays to check for divide by zero
* use np_abs for all newton steps and just check the .max()
* convert q1 if re-evaluated to ndarray using asarray
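Taken together, the commit messages above describe a vectorized Newton loop. A minimal sketch of that shape (illustrative only, not the merged scipy code) might look like:

```python
import numpy as np

def newton_vectorized(func, x0, fprime, args=(), tol=1.48e-8, maxiter=50):
    # Sketch of the vectorized iteration described in the commits above:
    # every element updates each step, and convergence is judged by the
    # largest step, i.e. abs(p - p0).max() < tol.
    p0 = np.asarray(1.0 * x0)  # convert initial guess to a float ndarray
    for _ in range(maxiter):
        fder = np.asarray(fprime(p0, *args))
        if (fder == 0).any():  # any zero derivative aborts, as in the PR
            raise RuntimeError("derivative was zero")
        p = p0 - np.asarray(func(p0, *args)) / fder
        if np.abs(p - p0).max() < tol:
            return p
        p0 = p
    raise RuntimeError("failed to converge after %d iterations" % maxiter)
```

For example, this solves x**2 = a for several coefficients a in a single call.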
Hi @person142 , @ilayn , @rgommers , @nmayorov , @andyfaff , OK I think this is ready for review. Can you please review these changes or suggest reviewers and consider merging it into master? I look forward to your feedback. Thanks! |
@mikofski thank you for your eagerness to contribute. For new features like this it's advisable to post on the scipy-dev mailing list to begin with, so there can be wider discussion about whether the new feature is desired, and possible design, before the development work is started. |
Thanks @andyfaff. I've posted a comment with a link on the scipy-dev mailing list as you suggested. IMO multiprocessing will add unnecessary overhead, but I'll do some benchmarking to demonstrate this. Also IMO, a much easier approach is to use numpy arrays. I only had to make a few changes to use numpy arrays, and I also improved the Halley method, addressing #5922, and removed several if-else statements, which IMO simplifies the codebase - the new code is 5 lines shorter. I've added new tests for arrays using all 3 methods (Newton, Halley, and secant) and all tests pass. |
@andyfaff here is the benchmark of multiprocessing versus numpy arrays. TL;DR: using NumPy arrays is about 200x faster than using multiprocessing.
Here are the code snippets. I ran these from an interpreter and timed them with %timeit:

>>> from benchmarking_newton import bench_mp_newton, bench_mp_secant, bench_mp_halley
>>> %timeit bench_mp_newton()
# 1 loop, best of 3: 2.66 s per loop
>>> %timeit bench_mp_halley()
# 1 loop, best of 3: 4.25 s per loop
>>> %timeit bench_mp_secant()
# 1 loop, best of 3: 2.15 s per loop

Then I checked the output to make sure it was consistent, and repeated with the other snippet. Of course this is probably because I have a 2011 Mac with an Intel 2.3GHz i5 and 16GB of 1333MHz DDR3 RAM, but I think that just underscores my reason to prefer NumPy arrays over multiprocessing for this application. |
* use assert_warns context to catch runtime warning
* don't return anything
* comparing a sequence with a scalar is always false, so use p0 instead of x0, since p0 is an array and so can be compared
* also can't multiply a sequence by a float, so use asarray(x0) * 1.0 to convert to floats
* just p0 instead of x0 in secant
* need to find just where the denominator is zero but the numerator is still non-zero; those are the only places that should issue a warning
* fix assert_warns: don't use context, remove test for secant, just check zero derivative
Hi @person142 , @ilayn , @rgommers , @nmayorov , @andyfaff , PS @bmeyers , @scopatz , @jakevdp , @katyhuff , @tkelman maybe could you take a look at this and comment? thx! |
I'm not too familiar with this code, but I guess vectorizing roughly works here if you're wanting to simultaneously solve many independent univariate problems. I'd personally throw a multivariate constrained solver at the problem (e.g. ipopt via pyomo) but that's my solution to everything.
scipy/optimize/zeros.py

-        fder = fprime(*myargs)
-        if fder == 0:
+        fder = asarray(fprime(*myargs))  # convert to ndarray
+        if (fder == 0).any():
wouldn't you need to keep iterating the other elements?
You could, but IMO you really don't need to. I think it requires indexing which IMO makes the code a little messy. Do you feel very strongly that it should keep iterating the other cases? I'm open to changes.
you'd exit early upon hitting a problematic termination condition for any element, even if remaining elements are far from convergence. but I'd defer to a scipy maintainer's opinion here, since non-scalar usage would be new functionality (right? does the released code error on arrays?)
Yes. See the traceback in #8354.
scipy/optimize/zeros.py

+        fder2 = asarray(fprime2(*myargs))  # convert to ndarray
+        # Halley's method
+        # https://en.wikipedia.org/wiki/Halley%27s_method
+        p = p0 - 2 * fval * fder / (2 * fder ** 2 - fval * fder2)
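The standard (non-parabolic) Halley update shown in this hunk can be exercised standalone. A sketch, assuming elementwise callables func, fprime, fprime2 (not the merged scipy code):

```python
import numpy as np

def halley_step(p0, func, fprime, fprime2):
    # One vectorized iteration of the standard Halley update:
    #     p = p0 - 2*f*f' / (2*f'**2 - f*f'')
    fval = np.asarray(func(p0))
    fder = np.asarray(fprime(p0))
    fder2 = np.asarray(fprime2(p0))
    return p0 - 2 * fval * fder / (2 * fder ** 2 - fval * fder2)
```

Because Halley's method converges cubically, a handful of steps suffices for smooth problems such as square roots via f(x) = x**2 - a.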
is removing the switch changing the scalar behavior here?
No, this switch is particular to the "parabolic Halley's method" variant; see the link in #5922. I removed it because branches require indexing, which IMO makes the code a little messy. Using the traditional Halley's method (see the Wikipedia link in the code) removed the need for branching, and IMHO makes the code cleaner. Does this answer your question?
Gotcha. Guess I'd avoid coupling behavior changes for the scalar case to the feature addition of vectorization (but again my opinion here matters less than the maintainers'). The two changes could be done separately, one but not the other, or both, though vectorizing the modified version is simpler as you say.
> Guess I'd avoid coupling behavior changes for the scalar case to the feature addition of vectorization

I agree 100% with this. Let's stick to one issue at a time. (The modification looks more prone to overflow because of the fder**2, so it's not clear the changes are a win.)
The current version (as of scipy-1.0.0) with the "parabolic" variant of Halley's already has fder**2 on L183, so if it were likely to overflow, it would have already.
I agree it would be more explicit for future maintainers if we separated the switch from the parabolic variant of Halley's to the more widely accepted form into its own PR. However, the switch is effectively a requirement for this PR, because the more common version of Halley's doesn't require any branching, which I'm really hoping to avoid here. So should I open another PR to propose switching out the parabolic variant of Halley's?
scipy/optimize/zeros.py

    if p1 != p0:
        msg = "Tolerance of %s reached" % (p1 - p0)
    divide_by_zero = (q1 == q0)
    if divide_by_zero.any():
here as well, wouldn't it be better to continue for all other elements?
See my previous answer in the newton branch. My ethic was to make the smallest changes possible to add some new features. If users demand more features, then I think we should discuss the trade-offs.
The halting conditions are the biggest thing to think about here. This way is simple, but I'm sure that someone is going to yell at us in the future about evaluating their callback function too many times. Might be worth seeing how messy the indexing would look.
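To gauge how messy the indexing would be, a boolean-indexing variant that keeps iterating only the unconverged elements might look like this sketch (hypothetical, not the merged code; it assumes func and fprime are elementwise callables with no per-element extra arguments):

```python
import numpy as np

def newton_masked(func, fprime, x0, tol=1.48e-8, maxiter=50):
    # Sketch: only elements that have not yet converged are re-evaluated,
    # so the callbacks are not called again for finished elements.
    p = np.asarray(1.0 * x0)
    active = np.ones(p.shape, dtype=bool)  # elements still iterating
    for _ in range(maxiter):
        pa = p[active]
        step = func(pa) / fprime(pa)  # Newton step for active elements only
        p[active] = pa - step
        done = np.abs(step) < tol
        active[np.flatnonzero(active)[done]] = False
        if not active.any():
            return p
    raise RuntimeError("some elements failed to converge")
```

The trade-off is exactly the one discussed above: fewer callback evaluations for converged elements, at the cost of masking bookkeeping on every iteration.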
Thanks Tony! Great feedback! I had already thought of using |
PS @tkelman and others, this PR is in support of pvlib-python issue 408 in case you want to comment or contribute. |
@tkelman, see this comment for a comparison that includes multivariate methods |
Hi @person142 , Are you the maintainer for Best Regards |
No, we don't really have dedicated maintainers, just people who work more or less on various modules.
I guess I already laid out my personal preferences, but it seems there's a demand here and I don't want to stand in the way of that. I can take a look at the code, though I would note that there are other people around more qualified to review |
Thank you very much for your response @person142. I really appreciate your time, and I look forward to your review. If there's anyone else who you think should review this PR, please let me know. I am also working on the
Thanks again! |
Thanks @mikofski
@pvanmulbregt IIRC there was some more conversation after your last comment above in this PR discussion. Can't find it so quickly - where are we now on this? |
@jaimefrio's comments seem to have been addressed long ago
@rgommers I haven't seen any response to my comments. |
I'm OK with letting the scalar and array versions diverge. I'm sorry if I implied otherwise. IMO it's up to the SciPy Dev community to decide the future. @pvanmulbregt are you OK with merging this PR now, or are there more changes you want to discuss? |
Hi @rgommers, if there are no further comments, do you think this PR can be merged now? Thanks! |
Hi @mikofski, one more request: please add a clear example in the docstring. I did some testing, and it seems to me that your intended use-case is quite specific here, namely many starting points that are close together. If they're far apart, my first test sees 5x slowdown over a naive loop:
Did I understand your intention with this code? Either way, a clear example would help. |
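The kind of docstring example being asked for might look like the following (the function and coefficients are illustrative; array input for x0 relies on the support added by this PR):

```python
import numpy as np
from scipy.optimize import newton

# Solve x**3 = a for many coefficients a in a single vectorized call,
# with all starting points close together.
a = np.array([8.0, 27.0, 64.0])
roots = newton(lambda x, a: x ** 3 - a, np.full(3, 2.5),
               fprime=lambda x, a: 3 * x ** 2, args=(a,))
```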
I did go through the whole conversation here again. Overall there does seem to be a need, although there are also some reservations. It looks like a hesitant +0.5 on balance.
Given this API, it looks like users would expect the same behavior for scalar as for sequence input, also for future extensions. However, it always seems possible to support sequence input with a simple for-loop in case of future extensions that aren't easily vectorizable. That would then just not be as fast as expected, but it would work. |
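The for-loop fallback mentioned here could be as small as the sketch below (scalar_solver stands in for any scalar-only root finder; the helper name is hypothetical):

```python
import numpy as np

def solve_sequence(scalar_solver, func, x0_seq, args=()):
    # Fallback for sequence input when a future feature isn't easily
    # vectorizable: call the scalar solver once per starting point.
    # Correct, just not as fast as a truly vectorized implementation.
    return np.array([scalar_solver(func, x0, args=args) for x0 in x0_seq])
```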
Hi @rgommers, I added a note to indicate the use cases that might benefit most from the vectorized form, and an example of a vectorized function. RE: the intention of this code, it was prompted by work on a solar energy modeling package called PVLib-Python, in which I needed to solve a very large (>1e5) set of implicit equations that have the same form but different coefficients. Thanks very much for your input; I am glad you think it will be a positive contribution, and I hope that others do as well. I will strive to keep it up to date with the scalar branch when appropriate and in collaboration with the SciPy Dev community. |
Okay, in it goes. Thanks @mikofski for this feature and the persistence, and everyone else for review/discussion. |
I'll send a follow-up with some minor tweaks, it was just time to get this merged. |
Added in scipy gh-8357, seems unnecessary.
Follow-up in gh-8969. Most importantly, it removes the unnecessary |
Sorry I didn't have a chance to review the PR when it first came out. I support this change + think it's useful... for supporting applications that can benefit from Newton's method and related root finders. It is not a solution for addressing #7242 and I would ask that you edit the issue description accordingly. I've edited the title of #7242 to clarify that I meant a robust method such as Brent's or Chandrupatla's. |
@jason-s thanks. I removed "addresses 7242" from the description to clarify that this only addresses Newton and not bounded root finders. |
Fixes #8354
Signed-off-by: Mark Mikofski bwana.marko@yahoo.com