Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This adds root finding to pchip as this is possible to do optimal because pchip is a monotonic interpolation. #3260

Closed
wants to merge 3 commits into from

Conversation

bmcage
Copy link
Contributor

@bmcage bmcage commented Jan 31, 2014

Alternative would be fsolve and friends, but those would not use the known structure of pchip.

A typical use for pchip is an interpolation that can be inverted. This adds a root finding to pchip that uses the pchip form to invert fast.

I am not sure this is the best way. It might be better to use phcip to obtain yi=f(xi) and yi' = f'(xi), then create the cubic monotonic interpolation through yi with derivatives yi' as self.pchipinv. This should be the correct f^-1.
Then when asking roots, one can use this self.pchipinv.

The algorithm proposed here works differently: search the cubic for the y value given, calculate the 3 cubic roots. As it is a monotonic function only 1 of the roots will be in the correct interval.

Some problems:

  1. computing cubic roots gives imaginary parts which need to be discarded.
    See the np.allclose(ires, 0., atol=1e-10) to discard this. -10 is hard coded
  2. return value can be numerically close to the bounding edges.
    See the np.allclose(x1-rres, 0., atol=1e-10) to recognize this. -10 is hard coded
  3. Use of newaxis to dump the results: x[c] = ress[:, np.newaxis]
    I believe this will be correct for the use cases possible

pchip is a monotonic interpolation.
Alternative would be fsolve and friends, but those would not use the known structure of pchip
@pv
Copy link
Member

pv commented Jan 31, 2014

It could be more efficient to convert pchip to use the new PPoly piecewise polynomial representation, which has efficient root finding already implemented.

We will probably eventually deprecate polyint.PiecewisePolynomial in favor of that.

@bmcage
Copy link
Contributor Author

bmcage commented Jan 31, 2014

PPoly seems not optimal to use. It's not actually root finding but inversion, more like fsolve. You then know the interval. After all pchip is monotone, so you know there is one, and only one value that satisfies y=f(x).
My problem 1 and 2 however can be fixed by using the same inversion fuction as PPoly:
real_roots in https://github.com/scipy/scipy/blob/master/scipy/interpolate/_ppoly.pyx

Would this real_roots be exposed?

@pv
Copy link
Member

pv commented Jan 31, 2014

The root finding internal function is exposed as scipy.interpolate._ppoly.real_roots.

I see, the point is that you can find the correct interval fast via searchsorted in y. One could think about adding an additional argument "assume_monotonic" to real_roots, so that the binary search can be done inside the routine itself.

@bmcage
Copy link
Contributor Author

bmcage commented Feb 7, 2014

I tried the other possible approach: add an inverse function that returns an interpolated object which is the inverse. This is not working however, too unstable. The numerical precision errors in solution values or inverse derivatives, are such that the inverse interpolated polynomial is very different over large intervals. So that approach is a no go, root finding will be the only acceptable approach.
So, an assume_monotonic added to real_roots is an option. Approach here can be done via inheritance if needed in current scipy

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling c6401f1 on bmcage:monohermspline into 233ad82 on scipy:master.

@pv
Copy link
Member

pv commented Feb 13, 2014

This should be reimplemented on top of current master branch, as gh-3267 converted Pchip to use BPoly.

Actually, I think this PR would best be done by adding a method def solve(self, y, assume_monotonic=False): to the PPoly piecewise polynomial class. After that, a similar method could be added to Pchip.

For assume_monotonic == True, we should use the faster approach here, and for assume_monotonic=False a slower approach that finds all the roots. Ideally, the solve method would be written in Cython directly in the real_roots routine.

@ev-br
Copy link
Member

ev-br commented Feb 13, 2014

If/when assume_monotonic is added, the behavior should be at least very clearly documented:

In [21]: %history
import numpy as np
from scipy.interpolate import pchip
xi = [1., 2., 3., 4.]
yi = [-1., 1, -1, 1]
p = pchip(xi, yi)
In [22]: p.root(1)
Out[22]: array(3.9999999947957403)

In [23]: p.root(0.5)
Out[23]: array(3.6736481776669305)

@ev-br
Copy link
Member

ev-br commented Feb 13, 2014

@bmcage can you provide an example where the inverse interpolation gives unacceptable numerical errors?

@bmcage
Copy link
Contributor Author

bmcage commented Feb 14, 2014

@ev-br I pushed my local branch with that test: https://github.com/bmcage/scipy/tree/monohermsplineinv
If you then run the test: $ python runtests.py --python scipy/interpolate/tests/test_polyint.py
you will see the bad interpolation
In test_inv_vGn of test_polyint.py uncomment the part using pylab to create a plot to see it. That is the function I need to inverse hundred of thousands of times in an inverse algorithm.

@ev-br
Copy link
Member

ev-br commented Feb 16, 2014

Hmm... This example looks a bit artificial to me, is it actual data you're interpolating? What is the actual problem you're solving?
In any case, since the h is logarithmically spaced and u follows a power law, have you tried working with them in the log space?
[If this turns into a discussion about how to deal with these exact data, we might want to move it from a github issue to the scipy-user mailing list]

@bmcage
Copy link
Contributor Author

bmcage commented Feb 16, 2014

I'm doing it logwise in reality. It's just a testcase which shows how it can fail dramatically. Rootfinding does not cause such errors.

@pv pv removed the PR label Aug 13, 2014
bmcage added a commit to bmcage/centrifuge-1d that referenced this pull request May 20, 2015
@pv pv added needs-work Items that are pending response from the author and removed needs-work Items that are pending response from the author labels Feb 17, 2016
@ev-br ev-br added this to 1D splines / polynomials in scipy.interpolate Apr 29, 2022
@ev-br ev-br moved this from 1D splines / polynomials to In progress in scipy.interpolate May 8, 2022
ev-br pushed a commit to ev-br/scipy that referenced this pull request May 9, 2022
ev-br pushed a commit to ev-br/scipy that referenced this pull request May 23, 2022
@ev-br ev-br moved this from In progress to 1D splines / polynomials in scipy.interpolate May 24, 2022
@ev-br
Copy link
Member

ev-br commented Jun 26, 2022

OK, I looked at this PR again, in gh-16147. Implementation details have changed a bit since this was first submitted, so here's a brief summary.

Given a piecewise polynomial, f(x), represented as a PPoly object, PPoly.solve(y) returns solutions, x, of f(x) = y. Since pchip (a PPoly subclass) constructs a locally monotonic polynomial, the idea is that it can use a more efficient implementation which takes the monotonicity into account.

The currently missing bit is that PPoly.solve only accepts a single float number, and does not accept arrays : The idea is that given the array of y values, .solve(y) produces an array of solutions, so that
p = PPoly(...); p.solve(y) should be equivalent to [p.solve(yval) for yval in y] etc.

The discussion in #4356 (comment) claims that vectorizing .solve is an easy-fix enhancement, but I now don't think it is.

Here's a problematic thing. Consider

In [2]: from scipy.interpolate import pchip

In [3]: xi = [1., 2., 3., 4.]
   ...: yi = [-1., 2, -1, 1]
   ...: p = pchip(xi, yi)

In [4]: p.solve(0)
Out[4]: array([1.18350342, 2.61303686, 3.73205081])

In [5]: p.solve(2)
Out[5]: array([2.        , 4.19582335])

Note that the number of solutions is different for different values of y. So, what would be the output of p.solve([0, 2])?
It cannot be a numpy array with dtype=float since numpy does not have ragged arrays.

So, possible options are

  1. one can pad the shorter arrays with nans:
array([[1.18350342, 2.61303686, 3.73205081],
       [2., 4.19582335, nan]])
  1. always return an object array of arrays,
array(array([1.18350342, 2.61303686, 3.73205081]),
      array([2., 4.19582335]),
      dtype=object)
  1. return an array of floats when possible, an object array otherwise

Option 1 feels ugly.
Option 2 is at least consistent, but these sorts of arrays are not very useful I'd guess.
Option 3 is certainly a bad form to change the array dtype depending on input values.

All in all, I fail to see an advantage over just manually looping over the input array in user code: [p.solve(_) for _ in y] --- then each element of the list is then a float numpy array and the rest is up to a user who is better equipped to handle it then a library.

@bmcage would appreciate your thoughts (I realize it's been a while)

I'll also ping @AtsushiSakai and @mdhaber, maybe y'all have better ideas? maybe you would weigh in @pv?

@mdhaber
Copy link
Contributor

mdhaber commented Jun 26, 2022

  1. Output convention side, is there a way to vectorize the calculation efficiently within the solve method of PPoly, or would it just be a loop anyway?

  2. What does the solve method return currently when there is no solution?

  3. The comment ENH: add PPoly.solve(y) for solving p(x) == y #4356 (comment) suggests that solve for array y would be helpful to complete the present PR. Without digging, i can't see why that is needed, but if it is needed, does it need to be a feature of the public interface of solve, or can you just loop as needed in this PR?

@ev-br
Copy link
Member

ev-br commented Jun 26, 2022

Output convention side, is there a way to vectorize the calculation efficiently within the solve method of PPoly, or would it just be a loop anyway?

It's a loop this way or another. In the solve method it's likely this one: https://github.com/scipy/scipy/blob/main/scipy/interpolate/_interpolate.py#L1268

What does the solve method return currently when there is no solution?

Following up the example above,

In [3]: p.solve(8)
Out[3]: array([4.91391421])

In [4]: p.solve(8, extrapolate=False)
Out[4]: array([], dtype=float64)

The comment ENH: add PPoly.solve(y) for solving p(x) == y #4356 (comment) suggests that solve for array y would be helpful to complete the present PR. Without digging, i can't see why that is needed, but if it is needed, does it need to be a feature of the public interface of solve, or can you just loop as needed in this PR?

The intended usage AFAIU is exemplified by the tests in this PR, which are somewhat cleaned up and expanded in gh-16147.

@AtsushiSakai
Copy link
Member

Personally, I like option 1 style output, because dtype stability is more important than nan filtering. I have no better ideas for ragged array representation.

@ev-br
Copy link
Member

ev-br commented Jul 18, 2022

OK, so per review comments object arrays are not wanted; on the table there are only option 1 --- pad shorter arrays with nans --- and the unnumbered option of not accepting arrays at all and asking a user to do [p.solve(_) for _ in y] themselves, which is the status quo. I actually prefer the status quo, as accepting arrays here

  • does not seem to bring drastic benefits as opposed to user looping
  • does complicate the user API considerably

On balance, I suggest we close this PR and gh-16147. I'm going to hit the red button in a couple of weeks, please comment if you prefer a different course of action.

@ev-br
Copy link
Member

ev-br commented Aug 4, 2022

OK, no further comments, so let's close this and gh-16147, for now at least. To summarize, I don't see a way this is can be implemented so that it both useful in special cases and is general enough. If there is a suggestion, we certainly can and will reconsider.

Thanks @bmcage and all reviewers!

@ev-br ev-br closed this Aug 4, 2022
scipy.interpolate automation moved this from 1D splines / polynomials to Abandoned / rejected / supeseded Aug 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-work Items that are pending response from the author scipy.interpolate
Projects
scipy.interpolate
Abandoned / rejected / supeseded
Development

Successfully merging this pull request may close these issues.

None yet

7 participants