ENH: add more interpolation methods to RegularGridInterpolator. #7903
This includes the addition of 5 new spline-based interpolators for scipy.interpolate.RegularGridInterpolator. These allow for smooth interpolation on a regular grid in an arbitrary number of dimensions. These also provide gradient calculations of the interpolated result with respect to an evaluation point or set of points.
[Updated] A bit more detail: this adds 5 new spline-based interpolation methods to scipy.interpolate.RegularGridInterpolator. Each allows for smooth interpolation on a regular grid, plus gradient evaluation:

```python
interp = RegularGridInterpolator(points, values, method='cubic')
result = interp(x)
deriv = interp.gradient(x)
```

Note that gradients cannot be computed for the two currently existing interpolation methods ('linear' and 'nearest'). These additional 5 methods are examples of tensor spline interpolation: they work by applying 1-D spline interpolants along each dimension of the grid in turn. This is useful to my work for various surrogate modeling purposes, but is fairly general. Though I've implemented and tested the functionality and API as I'm currently using it in my own projects, I am certainly open to comments and changes for consideration for inclusion into scipy. Proposed updated docs: https://thearn.github.io/docs/generated/scipy.interpolate.RegularGridInterpolator.html#scipy.interpolate.RegularGridInterpolator
Quick comment: please don't do unrelated whitespace cleanup in a PR that
does something else.
I'd also prefer not using UnivariateSpline in the implementation. That
is a semi-deprecated interface to FITPACK. If possible, consider using
make_interp_spline so that FITPACK is not used at all (if you want to
use FITPACK, please use splrep+BSpline or splrep+splev). The issue with
FITPACK is that some people may expect that the knots of the spline
coincide with grid points --- IIRC this is not guaranteed with FITPACK
with its automatic grid selection.
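For illustration, a minimal sketch of the suggested alternative (data made up): `make_interp_spline` builds a `BSpline` directly, with no FITPACK involved, and interpolates the data sites exactly:

```python
import numpy as np
from scipy.interpolate import make_interp_spline

x = np.linspace(0.0, 2 * np.pi, 8)
y = np.sin(x)
spl = make_interp_spline(x, y, k=3)        # returns a scipy BSpline object
exact_at_sites = np.allclose(spl(x), y)    # interpolates the data exactly
```

By contrast, FITPACK's smoothing-spline routines with automatic knot selection are not guaranteed to place knots at the data sites.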
Sorry, that was an auto-pep8 plugin issue with the whitespace; I can revert it. UPDATE: I'll update the PR shortly, but I'm suddenly getting an unrelated test failure. I'm not sure how that's coming about; my changes are only within the interpolation module.
…rpolation (#1)
* FIX: switched from UnivariateSpline to make_interp_spline for 1D interpolation
* FIX: can vectorize inner loop with _bsplines calls
* FIX: revert auto-pep8 whitespace edits
* FIX: removed FITPACK specific wording and transforms
* FIX: updated docstrings to reflect method changes
* FIX: removed unnecessary reshaping when using `make_interp_spline`
* FIX: grid is now an unneeded parameter to _evaluate_seperable()
@thearn that test failure looks unrelated to this PR.
* FIX: Vectorized entire first dimension gradient call
Had a cursory look, left a few comments, mostly on cosmetics. Am intending to give it a proper code review, but definitely not in time for 1.2.0, sadly.
Overall I'm a big +1 for the functionality.
Not sure how to deal with gradients: having them for some interpolation kinds and not others feels clunky.
A bigger picture question is whether to separate evaluations into an NdBSpline class (or TensorProductSpline or some such) --- I don't have a firm opinion though.
is used, then if any dimension has fewer points than `k` + 1, an error
will be raised. If spline_dim_error=False, then the spline interpolant
order will be reduced as needed on a per-dimension basis. Default
is True (raise an exception).
This option feels borderline extraneous. Maybe better raise unconditionally if there are not enough points?
Methods
-------
__call__
gradient
methods
This method should not be public (API surface).
strategy, and are best used when the fitted data (the points and values
arrays) are large relative to the number of points to be interpolated.
As such, they are not an efficient choice for some tasks, such as
re-sampling of image data.
Not sure I understand the comment about the values of the fitted data being larger than the number of data points. Rephrase?
Also for resampling, the See Also section could list relevant signal/ndimage routines.
return an array of `nan` values. Nearest-neighbor interpolation will work
as usual in this case.

The 'slinear', 'cubic', 'quintic' methods are spline-based interpolators.
Use of the spline interpolations allows for getting gradient
values via the ``gradient`` method.
These could use the .. versionadded:: directive (otherwise it looks like they are available from scipy 0.14.0).
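For reference, the directive would look like this in the docstring (the version number is a placeholder, to be replaced by whichever release these methods land in):

```rst
.. versionadded:: X.Y.Z
   The 'slinear', 'cubic' and 'quintic' interpolation methods.
```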
value4 = interp(x)

# values from different methods should be different
assert_raises(AssertionError, assert_equal, value1, value2)
numpy.testing.assert_(value1 != value2)
values = values - 2j * values
sample = np.asarray([[0.1, 0.1, 1., .9]])

# spline methods dont support complex values
Typo: either do not or with an apostrophe, don't.
result1 = interp(x)
gradient1 = interp.gradient(x)
result_actual_1 = 0.2630309995970872
result_gradient_1 = np.array([0.22505535, -0.46465198, 0.02523666])
What are these values and why are they correct? Maybe this test is trying to do two things at once: check against independently calculated values (this is good! only please add at least a comment on how to reproduce them), and somehow validate a random input. I'd say these should be two separate tests.
result_gradient_1 = np.array([0.22505535, -0.46465198, 0.02523666])

assert_almost_equal(result1, result_actual_1)
assert_almost_equal(gradient1, result_gradient_1)
Bikeshedding: maybe use assert_allclose with appropriate atol and rtol?
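A sketch of the suggested idiom, with made-up numbers: `assert_allclose` makes the tolerance explicit, instead of `assert_almost_equal`'s decimal-places count:

```python
import numpy as np
from numpy.testing import assert_allclose

expected = np.array([0.22505535, -0.46465198, 0.02523666])
computed = expected * (1 + 1e-9)        # stand-in for the interpolator output
assert_allclose(computed, expected, rtol=1e-7, atol=1e-12)
```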
[1, 3.3, 1.2, 4.0, 5.0, 1.0, 3]]).T
points, z, f, df = self._sample_2d_data_smooth()
x, y = points
np.random.seed(3)
A very minor nitpick, optional: a recommended idiom is to create a RandomState instance instead of seeding the global generator. We do not really follow this ourselves, and there is no need for code churn on this account alone, but in new code maybe better follow the recommendation.
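A sketch of the recommended idiom (shapes are made up): a seeded local generator leaves the global numpy random state untouched, so the test stays reproducible in isolation:

```python
import numpy as np

rng = np.random.RandomState(3)       # local, seeded generator
sample = rng.uniform(size=(7, 2))    # e.g. random evaluation points
```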
Needs a rebase.

This PR has been silent for some time now, any ETA on the new methods?

No, it has stalled - if you'd be interested in helping move it forward, that would be useful.

I'll take a look, if help is still needed, some time in the future after I am done with my current projects.
Now that gh-15357 is in, we can carry on looking at further enhancements from this PR. As I see it, there are two major things.

Also, from a consistency POV it seems better to have it compute gradients for all methods. Thoughts?
@AtsushiSakai @thearn would you be interested in continuing this? I'll gladly review --- or I can take over if you prefer.

Thank you. I can continue these, but I need time to implement a big feature like gradients, and I don't have enough spare time these days. So it is OK with me if you take over. If you have something with higher priority, you can leave these; I will try them when I have time.
This pseudo-review is a way to record my tracing through this code (which is quite ingenious!), trying to understand the details of what is what.
To avoid rebasing this branch, I simply dropped the full RegularGridInterpolator class from this PR into _rgi.py, renamed it RGI2, and dropped a breakpoint into the __call__ method.
Comments below use the following toy example:
>>> X, Y = np.arange(5), np.arange(6)
>>> values = (X**3 + 2*X) * (Y**3)[:, None]
>>> rgi2 = RGI2((X, Y), values.T, method='cubic')
>>> x, y = 1.5, 0.5 # the evaluation point
>>> rgi2((x, y))
Then the result is (x**3 + 2*x) * y**3 and the gradient is [ (3*x**2 + 2) * y**3, (x**3 + 2*x) * 3*y**2 ].
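For what it's worth, this toy example can be checked against the analytic value using the cubic method that eventually landed in RegularGridInterpolator (assumes scipy >= 1.9; the match is exact because a not-a-knot cubic spline reproduces per-axis cubics):

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

X, Y = np.arange(5), np.arange(6)
values = (X**3 + 2*X) * (Y**3)[:, None]           # shape (6, 5)
rgi = RegularGridInterpolator((X, Y), values.T, method='cubic')
x, y = 1.5, 0.5                                   # the evaluation point
val = rgi((x, y))                                 # should equal (x**3 + 2*x) * y**3
```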
data_values,
xi[:, i],
ki[i],
compute_gradients)
This line produces first_values = (X**3 + 2*X) * y**3 and first_derivs = (X**3 + 2*X) * 3*y**2 --- the 0th dimension is still a vector, and the second dimension (last in the array, first in the iteration) has y instead of Y.
ki,
compute_gradients=False,
first_dim_gradient=True)
all_gradients[:, -1] = top_grad[-1]
This if clause seems unreachable: the default value of first_dim_gradient is False, and the only way to call it with first_dim_gradient=True is guarded by if first_dim_gradient. Removing this clause completely makes things run fine.
x[i],
ki[i],
compute_gradients)
At this point, values = (X**3 + 2*X) * 0.5**3 and local_derivs = (X**3 + 2*X) * 3*(0.5)**2: these are 1-D arrays where the first dimension is an array and the second dimension is a scalar, the 2nd element of the input xi array.
ki,
compute_gradients=False)
else:
gradient[i] = local_derivs
This incantation interpolates the local_derivs --- a 1D array of derivatives at x[-1], already computed! --- along the X array, i.e. with respect to the first dimension. The result is evaluated at x[0], which gives the derivative w.r.t. the first dimension at x[0].
So, the current status of porting the gradient computations from this PR: the approach in this PR is quite ingenious indeed, and it can be used in user code today by just copy-pasting the RGI class from this PR, which is pure Python. This is probably very slow though, since it loops over dimensions and points in pure Python at evaluation time. And the code gets significantly more complicated compared to the current RGI class, which does evaluations but not gradients. The long-term plan for scipy.interpolate is to finally implement an NdBSpline class to do evaluations of tensor product splines. This would allow computing derivatives much more easily and efficiently. I'm going to leave this PR open though, to make it more discoverable and to boost the "we want this" signal.
A belated thank you @thearn for starting this project. While it took us an embarrassingly long time to act, and not all of your code made it, the idea did strike a chord; the lengths you had to go to to work around limitations of then-current scipy.interpolate served as helpful guidance for improving the library. Thank you!

Thank you, @ev-br !