ENH: Efficient interpolation on regular grids in arbitrary dimensions #3323

Closed
wants to merge 7 commits into
from

Conversation

Projects
None yet
4 participants
@andreas-h
Contributor

andreas-h commented Feb 13, 2014

Based on Johannes Buchner's regulargrid package (see https://github.com/JohannesBuchner/regulargrid), I implemented a new class RegularGridInterpolator for efficient linear and nearest-neighbour interpolation on regular (possibly unevenly spaced) grids in arbitrary dimensions.

I believe this is a significant addition to the scipy.interpolate package because the existing ND-interpolators perform a triangulation of the input points with becomes unfeasible with medium (~8) dimensions. There, by taking advantage of the regular grid structure, the interpolation can be sped up quite a bit.

Another advantage of this class is that it accepts anthing that can be appropriately indexed as values parameter, meaning that it is possible to use this interpolation object for interpolation of disk-based data sets (netCDF, pytables, ...).

I asked for permission about inclusion in Scipy; the (positive) answer is here: JohannesBuchner/regulargrid#2.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 13, 2014

Member

Yes, this is a very commonly needed feature, and currently missing. What is needed is an interpolation method that does not make a copy of the data or do any expensive preprocessing, and this seems to fit the bill.

However, I suggest at the same time also adding a convenience function interpn. This should be similar in spirit to griddata i.e. a front-end to various interpolation methods, but for data on a rectangular grid, and having a method= keyword arg. The other methods (in addition to "nearest" and "linear" you add here) currently available is "splinef2d" (for RectBivariateSpline) and that's restricted to 2D.

The point of interpn would be to guide newbies away from interp2d (which probably should be deprecated at some point, due to its wonkiness and the fact that it's FITPACK-based).

The interface probably should mimic griddata by being interpn((x, y, z, ...), values, xi).

Member

pv commented Feb 13, 2014

Yes, this is a very commonly needed feature, and currently missing. What is needed is an interpolation method that does not make a copy of the data or do any expensive preprocessing, and this seems to fit the bill.

However, I suggest at the same time also adding a convenience function interpn. This should be similar in spirit to griddata i.e. a front-end to various interpolation methods, but for data on a rectangular grid, and having a method= keyword arg. The other methods (in addition to "nearest" and "linear" you add here) currently available is "splinef2d" (for RectBivariateSpline) and that's restricted to 2D.

The point of interpn would be to guide newbies away from interp2d (which probably should be deprecated at some point, due to its wonkiness and the fact that it's FITPACK-based).

The interface probably should mimic griddata by being interpn((x, y, z, ...), values, xi).

@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 13, 2014

Contributor

Which other interpolation methods do you have in mind? Otherwise,
|interpn| would just be a wrapper around |RegularGridInterpolator|, correct?

Contributor

andreas-h commented Feb 13, 2014

Which other interpolation methods do you have in mind? Otherwise,
|interpn| would just be a wrapper around |RegularGridInterpolator|, correct?

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 13, 2014

Coverage Status

Changes Unknown when pulling e23aa50 on andreas-h:regulargrid into * on scipy:master*.

Coverage Status

Changes Unknown when pulling e23aa50 on andreas-h:regulargrid into * on scipy:master*.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 13, 2014

Coverage Status

Changes Unknown when pulling e23aa50 on andreas-h:regulargrid into * on scipy:master*.

Coverage Status

Changes Unknown when pulling e23aa50 on andreas-h:regulargrid into * on scipy:master*.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 13, 2014

Member

@andreas-h: RectBivariateSpline, and any other methods that may be eventually added

Member

pv commented Feb 13, 2014

@andreas-h: RectBivariateSpline, and any other methods that may be eventually added

scipy/interpolate/interpolate.py
+
+ References
+ ----------
+ - Python package *regulargrid* by Johannes Buchner, see https://pypi.python.org/pypi/regulargrid/

This comment has been minimized.

@pv

pv Feb 13, 2014

Member

RST reference format

.. [1] Python package ...
.. [2] Foo bar. Lorem ipsum.
@pv

pv Feb 13, 2014

Member

RST reference format

.. [1] Python package ...
.. [2] Foo bar. Lorem ipsum.
@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 13, 2014

Contributor

@pv: Doesn't |RectBivariateSpline| only support the 2d case?

All the optional parameters to |RectBivariateSpline| should then be
available in |interpn| as |**kwargs|, right?

I should be able to do that this weekend.

Do you think there's any chance this will make it into 0.14?

Contributor

andreas-h commented Feb 13, 2014

@pv: Doesn't |RectBivariateSpline| only support the 2d case?

All the optional parameters to |RectBivariateSpline| should then be
available in |interpn| as |**kwargs|, right?

I should be able to do that this weekend.

Do you think there's any chance this will make it into 0.14?

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 13, 2014

Member

Yes, it's 2D only (as I noted above). It needs to be documented and an error raised for other dimensions.

The optional parameters shouldn't be available in interpn, I believe. This is intended to be a simple interface, for interpolation only. People who want more control, can use the interpolators directly.

Member

pv commented Feb 13, 2014

Yes, it's 2D only (as I noted above). It needs to be documented and an error raised for other dimensions.

The optional parameters shouldn't be available in interpn, I believe. This is intended to be a simple interface, for interpolation only. People who want more control, can use the interpolators directly.

@pv pv added this to the 0.14.0 milestone Feb 13, 2014

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 13, 2014

Member

We can try to get this (and interpn) to 0.14.0.

Member

pv commented Feb 13, 2014

We can try to get this (and interpn) to 0.14.0.

@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 13, 2014

Contributor

Currently in the nearest neighbor implementation, I'm using yi <= .5 as condition to determine if a given coordinate "belongs" to the next upper or lower gridpoint. Is there any convention on how to handle this? Should I explicitly mention this behaviour in the docstring?

Contributor

andreas-h commented Feb 13, 2014

Currently in the nearest neighbor implementation, I'm using yi <= .5 as condition to determine if a given coordinate "belongs" to the next upper or lower gridpoint. Is there any convention on how to handle this? Should I explicitly mention this behaviour in the docstring?

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 13, 2014

Coverage Status

Changes Unknown when pulling a68a922 on andreas-h:regulargrid into * on scipy:master*.

Coverage Status

Changes Unknown when pulling a68a922 on andreas-h:regulargrid into * on scipy:master*.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 14, 2014

Coverage Status

Changes Unknown when pulling 8fbf99b on andreas-h:regulargrid into * on scipy:master*.

Coverage Status

Changes Unknown when pulling 8fbf99b on andreas-h:regulargrid into * on scipy:master*.

@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 14, 2014

Contributor

@pv Do you think it's a good idea to add bounds_error and fill_value to the interpn interface?

Contributor

andreas-h commented Feb 14, 2014

@pv Do you think it's a good idea to add bounds_error and fill_value to the interpn interface?

scipy/interpolate/interpolate.py
+ if method not in ["linear", "nearest", "splinef2d"]:
+ raise ValueError("interpn only understands the methods 'linear', "
+ "'nearest', and 'splinef2d'. You provided "
+ "{}".format(method))

This comment has been minimized.

@pv

pv Feb 14, 2014

Member

This is not valid in Python 2.6 which we still support. Better use the % string formatting.

@pv

pv Feb 14, 2014

Member

This is not valid in Python 2.6 which we still support. Better use the % string formatting.

+
+ RectBivariateSpline : Bivariate spline approximation over a rectangular mesh
+
+ """

This comment has been minimized.

@pv

pv Feb 14, 2014

Member

I think an example would be very useful to add.

@pv

pv Feb 14, 2014

Member

I think an example would be very useful to add.

scipy/interpolate/interpolate.py
+ if not len(points) == ndim:
+ raise ValueError("There are {} point arrays, but values has {} "
+ "dimensions".format(len(points), ndim))
+ # sanity check input grid

This comment has been minimized.

@pv

pv Feb 14, 2014

Member

Style nitpick: empty line before comment (also helps to make the code more readable)

@pv

pv Feb 14, 2014

Member

Style nitpick: empty line before comment (also helps to make the code more readable)

scipy/interpolate/interpolate.py
+ "dimension {}".format(len(p), values.shape[i], i))
+ grid = tuple([np.asarray(p) for p in points])
+ # sanity check requested xi
+ xi = np.atleast_2d(xi)

This comment has been minimized.

@pv

pv Feb 14, 2014

Member

This should support broadcasting --- there's a _ndim_coords_from_arrays helper function that's used in griddata.

@pv

pv Feb 14, 2014

Member

This should support broadcasting --- there's a _ndim_coords_from_arrays helper function that's used in griddata.

This comment has been minimized.

@andreas-h

andreas-h Feb 19, 2014

Contributor

griddata expects xi to be of shape (M, D). My intention was to allow 1D xi of shape (D, ) as well. And I don't see how in that case _ndim_coords_from_arrays helps.

What exactly is the problem with atleast_2d? Is it better to drop support for 1D xi in favor of the solution used in griddata?

@andreas-h

andreas-h Feb 19, 2014

Contributor

griddata expects xi to be of shape (M, D). My intention was to allow 1D xi of shape (D, ) as well. And I don't see how in that case _ndim_coords_from_arrays helps.

What exactly is the problem with atleast_2d? Is it better to drop support for 1D xi in favor of the solution used in griddata?

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 14, 2014

Member

Not sure about bounds_error and fill_value. griddata has fill_value.
RectBivariateSpline doesn't seem to support either.

Member

pv commented Feb 14, 2014

Not sure about bounds_error and fill_value. griddata has fill_value.
RectBivariateSpline doesn't seem to support either.

@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 14, 2014

Contributor

The only problem would be extrapolation, which wouldn't be supported for
splinef2d. bounds_error with a given fill_value could be handled in the
wrapper by checking for out_of_bounds before the call to
RectBivariateSpline and filling the return array before returning. I
think I'd be in favor of adding both to interpn. Will do so over the
weekend, if you don't object.

Contributor

andreas-h commented Feb 14, 2014

The only problem would be extrapolation, which wouldn't be supported for
splinef2d. bounds_error with a given fill_value could be handled in the
wrapper by checking for out_of_bounds before the call to
RectBivariateSpline and filling the return array before returning. I
think I'd be in favor of adding both to interpn. Will do so over the
weekend, if you don't object.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 14, 2014

Member

I don't see problems with that.

Member

pv commented Feb 14, 2014

I don't see problems with that.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 14, 2014

Coverage Status

Changes Unknown when pulling e5918ef on andreas-h:regulargrid into * on scipy:master*.

Coverage Status

Changes Unknown when pulling e5918ef on andreas-h:regulargrid into * on scipy:master*.

@ev-br

This comment has been minimized.

Show comment
Hide comment
@ev-br

ev-br Feb 14, 2014

Member

Re nearest neighbor implementation (y <= 0.5 etc). While I don't have any intelligent suggestions here, I just note that there's some discussion on data model vs pixel model in the roadmap under ndimage.

Member

ev-br commented Feb 14, 2014

Re nearest neighbor implementation (y <= 0.5 etc). While I don't have any intelligent suggestions here, I just note that there's some discussion on data model vs pixel model in the roadmap under ndimage.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 14, 2014

Member

I think we want here nearest-neighbor semantics, ie. voronoi cells. I.e. 0.5 seems fine to me.

Member

pv commented Feb 14, 2014

I think we want here nearest-neighbor semantics, ie. voronoi cells. I.e. 0.5 seems fine to me.

@pv pv added needs-work labels Feb 19, 2014

@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 19, 2014

Contributor

one further question is about the kwarg method or kind. interp1d and interp2d use kind, griddata uses method. I personally prefer method, so would go with this for RegularGridInterpolator and interpn.

What do you think?

Contributor

andreas-h commented Feb 19, 2014

one further question is about the kwarg method or kind. interp1d and interp2d use kind, griddata uses method. I personally prefer method, so would go with this for RegularGridInterpolator and interpn.

What do you think?

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 19, 2014

Member

I'd go with method, as it's used a bit more often in scipy than kind.

Member

pv commented Feb 19, 2014

I'd go with method, as it's used a bit more often in scipy than kind.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 19, 2014

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 19, 2014

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Feb 19, 2014

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

scipy/interpolate/interpolate.py
+ grid = tuple([np.asarray(p) for p in points])
+
+ # sanity check requested xi
+ xi = np.atleast_2d(xi)

This comment has been minimized.

@pv

pv Feb 19, 2014

Member

I do not fully understand your point """griddata expects xi to be of shape (M, D). My intention was to allow 1D xi of shape (D, ) as well""". You mean you want to evaluate at a single point?

However, I think uniform interface with griddata is of high priority here. The xi parameter has exactly the same meaning, so it should behave the same way.

The problem with atleast_2d is that it does not broadcast. If you want to evaluate it on a grid, you need to do a meshgrid+ravel dance on the user side.

@pv

pv Feb 19, 2014

Member

I do not fully understand your point """griddata expects xi to be of shape (M, D). My intention was to allow 1D xi of shape (D, ) as well""". You mean you want to evaluate at a single point?

However, I think uniform interface with griddata is of high priority here. The xi parameter has exactly the same meaning, so it should behave the same way.

The problem with atleast_2d is that it does not broadcast. If you want to evaluate it on a grid, you need to do a meshgrid+ravel dance on the user side.

This comment has been minimized.

@pv

pv Feb 19, 2014

Member

The other option is to add an "expected dimension" argument to _ndim_coords_from_arrays, and transpose 1D the array if >1D is expected.
Can be done if it doesn't cause some inputs to be ambiguous. (It probably doesn't cause ambiguity.)

@pv

pv Feb 19, 2014

Member

The other option is to add an "expected dimension" argument to _ndim_coords_from_arrays, and transpose 1D the array if >1D is expected.
Can be done if it doesn't cause some inputs to be ambiguous. (It probably doesn't cause ambiguity.)

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 19, 2014

Member

I think this is almost ready now, and should make it to 0.14.0.

Member

pv commented Feb 19, 2014

I think this is almost ready now, and should make it to 0.14.0.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 23, 2014

Member

Some missing stuff is added in: pv/scipy-work@andreas-h:regulargrid...pr-3323

  • uniform xi argument handling in interpn and griddata (and it does as suggested above for 1D inputs)
  • support for non-scalar valued values in regular grid interpolation
  • improve tests and fix up array_like handling at some points
  • minor doc improvements
Member

pv commented Feb 23, 2014

Some missing stuff is added in: pv/scipy-work@andreas-h:regulargrid...pr-3323

  • uniform xi argument handling in interpn and griddata (and it does as suggested above for 1D inputs)
  • support for non-scalar valued values in regular grid interpolation
  • improve tests and fix up array_like handling at some points
  • minor doc improvements
@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 23, 2014

Contributor

Looks great! I wanted to add the lower-dim xi case last night but couldn't figure out how to do the proper slicing ... So thanks for reading my thoughts and implementing them =)

Do I need to do anything with this PR or do you do the necessary git magic before merging?

Contributor

andreas-h commented Feb 23, 2014

Looks great! I wanted to add the lower-dim xi case last night but couldn't figure out how to do the proper slicing ... So thanks for reading my thoughts and implementing them =)

Do I need to do anything with this PR or do you do the necessary git magic before merging?

@andreas-h

This comment has been minimized.

Show comment
Hide comment
@andreas-h

andreas-h Feb 23, 2014

Contributor

I hope I can still write an example for interpn tonight or tomorrow, but don't want to promise that. So probably you go ahead merging and I can submit the example PR against master then.

Contributor

andreas-h commented Feb 23, 2014

I hope I can still write an example for interpn tonight or tomorrow, but don't want to promise that. So probably you go ahead merging and I can submit the example PR against master then.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 23, 2014

Member

@andreas-h: comments? I'd be ready to merge to 0.14.0 with those changes.

Member

pv commented Feb 23, 2014

@andreas-h: comments? I'd be ready to merge to 0.14.0 with those changes.

@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 23, 2014

Member

Ok, I'll merge this and any remaining enhancements can then be made later.

Member

pv commented Feb 23, 2014

Ok, I'll merge this and any remaining enhancements can then be made later.

pv added a commit that referenced this pull request Feb 23, 2014

Merge pull request #3323 from pr-3323
ENH: interpolate: add interpn and RegularGridInterpolator

Implemented a new class RegularGridInterpolator for efficient linear and
nearest-neighbour interpolation on regular (possibly unevenly spaced)
grids in arbitrary dimensions.

The existing ND-interpolators perform a triangulation of the input
points with becomes unfeasible with medium (~8) dimensions.  There, by
taking advantage of the regular grid structure, the interpolation can be
sped up quite a bit.

Another advantage of this class is that it accepts anthing that can be
appropriately indexed as values parameter, meaning that it is possible
to use this interpolation object for interpolation of disk-based data
sets (netCDF, pytables, ...).

This PR also adds a simpler function `interpn` for regular grid data
interpolation.
@pv

This comment has been minimized.

Show comment
Hide comment
@pv

pv Feb 23, 2014

Member

Thanks, merged in a90dc28

Member

pv commented Feb 23, 2014

Thanks, merged in a90dc28

@pv pv closed this Feb 23, 2014

@andreas-h andreas-h deleted the andreas-h:regulargrid branch Feb 24, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment