ENH: Efficient interpolation on regular grids in arbitrary dimensions #3323

Closed
wants to merge 7 commits into from

Conversation

andreas-h
Contributor

Based on Johannes Buchner's regulargrid package (see https://github.com/JohannesBuchner/regulargrid), I implemented a new class RegularGridInterpolator for efficient linear and nearest-neighbour interpolation on regular (possibly unevenly spaced) grids in arbitrary dimensions.

I believe this is a significant addition to the scipy.interpolate package because the existing ND-interpolators perform a triangulation of the input points, which becomes infeasible at moderate dimensionality (~8 dimensions). By taking advantage of the regular grid structure, the interpolation can be sped up considerably.

Another advantage of this class is that it accepts anything that can be appropriately indexed as the values parameter, meaning that it is possible to use this interpolation object for interpolation of disk-based data sets (netCDF, pytables, ...).
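
For readers skimming this thread, here is a minimal usage sketch of the class described above. It assumes the interface as it ended up in scipy.interpolate (a tuple of 1-D axes, a values array, and a method keyword); the grid, the sampled function, and the query points are made up for illustration.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Regular (rectilinear) grid whose axes are unevenly spaced.
x = np.array([0.0, 1.0, 3.0])
y = np.array([0.0, 2.0, 5.0, 6.0])

# Sample f(x, y) = 2x + y on the grid; values has shape (len(x), len(y)).
X, Y = np.meshgrid(x, y, indexing="ij")
values = 2.0 * X + Y

linear = RegularGridInterpolator((x, y), values, method="linear")
nearest = RegularGridInterpolator((x, y), values, method="nearest")

# Query points, one per row, each with one coordinate per grid dimension.
xi = np.array([[0.4, 0.5], [2.5, 5.8]])
linear_result = linear(xi)    # exact here, because f is linear in x and y
nearest_result = nearest(xi)  # value at the closest grid node
```

No triangulation is built here; the interpolator only needs the axes and indexed access to values.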

I asked for permission about inclusion in Scipy; the (positive) answer is here: JohannesBuchner/regulargrid#2.

@pv
Member

pv commented Feb 13, 2014

Yes, this is a very commonly needed feature, and currently missing. What is needed is an interpolation method that does not make a copy of the data or do any expensive preprocessing, and this seems to fit the bill.

However, I suggest at the same time also adding a convenience function interpn. This should be similar in spirit to griddata, i.e. a front-end to various interpolation methods, but for data on a rectangular grid, and having a method= keyword arg. The only other method currently available (in addition to the "nearest" and "linear" you add here) is "splinef2d" (for RectBivariateSpline), and that's restricted to 2D.

The point of interpn would be to guide newbies away from interp2d (which probably should be deprecated at some point, due to its wonkiness and the fact that it's FITPACK-based).

The interface probably should mimic griddata by being interpn((x, y, z, ...), values, xi).
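
A short sketch of what a call with this interface looks like, assuming the signature interpn(points, values, xi, method=...) as later merged into scipy.interpolate. The axes and the sampled function are illustrative only.

```python
import numpy as np
from scipy.interpolate import interpn

# Three 1-D axes defining a regular 3-D grid.
x = np.linspace(0.0, 1.0, 3)
y = np.linspace(0.0, 2.0, 5)
z = np.linspace(0.0, 3.0, 4)

# Sample f(x, y, z) = x + 10y + 100z on the grid.
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
values = X + 10.0 * Y + 100.0 * Z

# Two query points, one per row, in (x, y, z) order.
xi = np.array([[0.25, 0.5, 1.5], [0.9, 1.9, 2.9]])
result = interpn((x, y, z), values, xi, method="linear")
```

As with griddata, the grid description comes first, then the data, then the points to evaluate at.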

@andreas-h
Contributor Author

Which other interpolation methods do you have in mind? Otherwise,
interpn would just be a wrapper around RegularGridInterpolator, correct?

@coveralls

Coverage Status

Changes Unknown when pulling e23aa50 on andreas-h:regulargrid into * on scipy:master*.

@pv
Member

pv commented Feb 13, 2014

@andreas-h: RectBivariateSpline, and any other methods that may be eventually added


References
----------
- Python package *regulargrid* by Johannes Buchner, see https://pypi.python.org/pypi/regulargrid/
Member

RST reference format

.. [1] Python package ...
.. [2] Foo bar. Lorem ipsum.

@andreas-h
Contributor Author

@pv: Doesn't RectBivariateSpline only support the 2d case?

All the optional parameters to RectBivariateSpline should then be
available in interpn as **kwargs, right?

I should be able to do that this weekend.

Do you think there's any chance this will make it into 0.14?

@pv
Member

pv commented Feb 13, 2014

Yes, it's 2D only (as I noted above). It needs to be documented and an error raised for other dimensions.

The optional parameters shouldn't be available in interpn, I believe. This is intended to be a simple interface, for interpolation only. People who want more control, can use the interpolators directly.
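
To illustrate the 2D-only restriction, the following sketch calls interpn with method="splinef2d" on 2-D data and then on 3-D data. It assumes the behaviour requested here is what scipy.interpolate.interpn does, namely raising ValueError for input with more than two dimensions; the data are made up.

```python
import numpy as np
from scipy.interpolate import interpn

x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 2.0, 5)
X, Y = np.meshgrid(x, y, indexing="ij")
values_2d = X + 2.0 * Y

# 2-D data: "splinef2d" is accepted.
spline_result = interpn((x, y), values_2d, np.array([[0.3, 1.1]]),
                        method="splinef2d")

# 3-D data: the same method is expected to be rejected.
z = np.linspace(0.0, 1.0, 5)
values_3d = np.zeros((5, 5, 5))
try:
    interpn((x, y, z), values_3d, np.array([[0.3, 1.1, 0.5]]),
            method="splinef2d")
    rejected = False
except ValueError:
    rejected = True
```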

@pv pv added this to the 0.14.0 milestone Feb 13, 2014
@pv
Member

pv commented Feb 13, 2014

We can try to get this (and interpn) to 0.14.0.

@andreas-h
Contributor Author

Currently in the nearest neighbor implementation, I'm using yi <= .5 as condition to determine if a given coordinate "belongs" to the next upper or lower gridpoint. Is there any convention on how to handle this? Should I explicitly mention this behaviour in the docstring?
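
To make the question concrete, here is a one-dimensional sketch of the rule. It is not the code from this PR: nearest_index is a hypothetical helper, and yi is taken to mean the normalized distance from the lower grid point, which is an assumption about the notation.

```python
from bisect import bisect_right

def nearest_index(grid, x):
    """Return the index of the grid point nearest to x on a sorted 1-D axis.

    A point exactly halfway between two grid points goes to the lower one,
    which is the `yi <= 0.5` condition.
    """
    # Find the interval [grid[i], grid[i + 1]] that contains x.
    i = bisect_right(grid, x) - 1
    i = max(0, min(i, len(grid) - 2))  # clamp to a valid interval
    # Normalized distance from the lower grid point, in [0, 1] inside the grid.
    yi = (x - grid[i]) / (grid[i + 1] - grid[i])
    return i if yi <= 0.5 else i + 1
```

With grid = [0.0, 1.0, 3.0], the midpoint 2.0 maps to index 1 (the lower neighbour), while 2.1 maps to index 2.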

@coveralls

Coverage Status

Changes Unknown when pulling a68a922 on andreas-h:regulargrid into * on scipy:master*.

@coveralls

Coverage Status

Changes Unknown when pulling 8fbf99b on andreas-h:regulargrid into * on scipy:master*.

@andreas-h
Contributor Author

@pv Do you think it's a good idea to add bounds_error and fill_value to the interpn interface?

if method not in ["linear", "nearest", "splinef2d"]:
    raise ValueError("interpn only understands the methods 'linear', "
                     "'nearest', and 'splinef2d'. You provided "
                     "{}".format(method))
Member

This is not valid in Python 2.6 which we still support. Better use the % string formatting.
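
For reference, auto-numbered "{}" fields in str.format only arrived in Python 2.7; on 2.6 the field has to be numbered ("{0}"). A sketch of the check rewritten with % formatting follows. The wrapper function check_method is hypothetical and only there to make the snippet self-contained.

```python
def check_method(method):
    """Raise ValueError for an unsupported method name (hypothetical helper)."""
    if method not in ["linear", "nearest", "splinef2d"]:
        # % formatting works on Python 2.6 as well as on later versions.
        raise ValueError("interpn only understands the methods 'linear', "
                         "'nearest', and 'splinef2d'. You provided %s."
                         % method)
```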

@pv
Member

pv commented Feb 14, 2014

Not sure about bounds_error and fill_value. griddata has fill_value.
RectBivariateSpline doesn't seem to support either.

@andreas-h
Contributor Author

The only problem would be extrapolation, which wouldn't be supported for
splinef2d. bounds_error with a given fill_value could be handled in the
wrapper by checking for out_of_bounds before the call to
RectBivariateSpline and filling the return array before returning. I
think I'd be in favor of adding both to interpn. Will do so over the
weekend, if you don't object.
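
The wrapper idea can be sketched roughly as follows. This is an illustration under assumptions, not the code that was merged: spline2d_with_fill is a hypothetical name, xi is assumed to have shape (M, 2), and only RectBivariateSpline and its ev method are real scipy API.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

def spline2d_with_fill(x, y, values, xi, fill_value=np.nan, bounds_error=False):
    """Evaluate a 2-D spline at xi, handling out-of-bounds points up front."""
    xi = np.atleast_2d(np.asarray(xi, dtype=float))
    # Flag query points outside the grid before touching the spline.
    out_of_bounds = (
        (xi[:, 0] < x[0]) | (xi[:, 0] > x[-1])
        | (xi[:, 1] < y[0]) | (xi[:, 1] > y[-1])
    )
    if bounds_error and out_of_bounds.any():
        raise ValueError("One of the requested xi is out of bounds")
    spline = RectBivariateSpline(x, y, values)
    # Pre-fill the result, then overwrite the in-bounds entries only.
    result = np.full(xi.shape[0], fill_value, dtype=float)
    inside = ~out_of_bounds
    result[inside] = spline.ev(xi[inside, 0], xi[inside, 1])
    return result
```

Because the spline is only evaluated at in-bounds points, no extrapolation takes place, which matches the limitation noted above for splinef2d.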

@pv
Member

pv commented Feb 14, 2014

I don't see problems with that.

@coveralls

Coverage Status

Changes Unknown when pulling e5918ef on andreas-h:regulargrid into * on scipy:master*.

@ev-br
Member

ev-br commented Feb 14, 2014

Re nearest neighbor implementation (y <= 0.5 etc). While I don't have any intelligent suggestions here, I just note that there's some discussion on data model vs pixel model in the roadmap under ndimage.

@pv
Member

pv commented Feb 14, 2014

I think we want nearest-neighbor semantics here, i.e. Voronoi cells, so 0.5 seems fine to me.

@pv pv added the needs-work label Feb 19, 2014
@pv pv added the PR label Feb 19, 2014
@andreas-h
Contributor Author

One further question is about the keyword argument name: method or kind. interp1d and interp2d use kind, griddata uses method. I personally prefer method, so I would go with that for RegularGridInterpolator and interpn.

What do you think?

@pv
Member

pv commented Feb 19, 2014

I'd go with method, as it's used a bit more often in scipy than kind.

@coveralls

Coverage Status

Coverage remained the same when pulling c5632f7 on andreas-h:regulargrid into 32cd96d on scipy:master.

grid = tuple([np.asarray(p) for p in points])

# sanity check requested xi
xi = np.atleast_2d(xi)
Member

I do not fully understand your point """griddata expects xi to be of shape (M, D). My intention was to allow 1D xi of shape (D, ) as well""". You mean you want to evaluate at a single point?

However, I think uniform interface with griddata is of high priority here. The xi parameter has exactly the same meaning, so it should behave the same way.

The problem with atleast_2d is that it does not broadcast. If you want to evaluate it on a grid, you need to do a meshgrid+ravel dance on the user side.
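
The "meshgrid+ravel dance" mentioned here looks roughly like this on the user side, when the interpolator only accepts xi of shape (M, D). The grids and the sampled function are illustrative.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Coarse data grid, sampling f(x, y) = x + 10y.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.5, 3.0])
X, Y = np.meshgrid(x, y, indexing="ij")
interp = RegularGridInterpolator((x, y), X + 10.0 * Y)

# Finer target grid on which the interpolant should be evaluated.
xn = np.linspace(0.0, 2.0, 5)
yn = np.linspace(0.0, 3.0, 4)
XN, YN = np.meshgrid(xn, yn, indexing="ij")

# Flatten the target grid into an (M, D) array of points ...
xi = np.column_stack([XN.ravel(), YN.ravel()])
# ... evaluate, and restore the grid shape afterwards.
result = interp(xi).reshape(XN.shape)
```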

Member

The other option is to add an "expected dimension" argument to _ndim_coords_from_arrays, and transpose the 1D array if >1D is expected.
Can be done if it doesn't cause some inputs to be ambiguous. (It probably doesn't cause ambiguity.)

@pv
Member

pv commented Feb 19, 2014

I think this is almost ready now, and should make it to 0.14.0.

@pv
Member

pv commented Feb 23, 2014

Some missing stuff is added in: pv/scipy-work@andreas-h:regulargrid...pr-3323

  • uniform xi argument handling in interpn and griddata (and it does as suggested above for 1D inputs)
  • support for non-scalar valued values in regular grid interpolation
  • improve tests and fix up array_like handling at some points
  • minor doc improvements
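
A sketch of two of the additions listed above, as they behave in scipy.interpolate today: values with trailing (non-grid) dimensions, and a single query point given as a 1-D array. The data are made up, and the exact output shape for the 1-D case is deliberately not asserted here.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0])
X, Y = np.meshgrid(x, y, indexing="ij")

# Two components per grid node: values has shape (3, 2, 2).
values = np.stack([X + Y, X - Y], axis=-1)
interp = RegularGridInterpolator((x, y), values)

# A batch of two points returns one 2-component vector per point.
batch = interp(np.array([[0.5, 0.5], [1.5, 0.25]]))

# A single point passed as a 1-D array of length D is accepted as well.
single = interp(np.array([0.5, 0.5]))
```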

@andreas-h
Contributor Author

Looks great! I wanted to add the lower-dim xi case last night but couldn't figure out how to do the proper slicing ... So thanks for reading my thoughts and implementing them =)

Do I need to do anything with this PR or do you do the necessary git magic before merging?

@andreas-h
Contributor Author

I hope I can still write an example for interpn tonight or tomorrow, but don't want to promise that. So probably you go ahead merging and I can submit the example PR against master then.

@pv
Member

pv commented Feb 23, 2014

@andreas-h: comments? I'd be ready to merge to 0.14.0 with those changes.

@pv
Member

pv commented Feb 23, 2014

Ok, I'll merge this and any remaining enhancements can then be made later.

pv added a commit that referenced this pull request Feb 23, 2014
ENH: interpolate: add interpn and RegularGridInterpolator

Implemented a new class RegularGridInterpolator for efficient linear and
nearest-neighbour interpolation on regular (possibly unevenly spaced)
grids in arbitrary dimensions.

The existing ND-interpolators perform a triangulation of the input
points, which becomes infeasible with medium (~8) dimensions.  There, by
taking advantage of the regular grid structure, the interpolation can be
sped up quite a bit.

Another advantage of this class is that it accepts anything that can be
appropriately indexed as the values parameter, meaning that it is possible
to use this interpolation object for interpolation of disk-based data
sets (netCDF, pytables, ...).

This PR also adds a simpler function `interpn` for regular grid data
interpolation.
@pv
Member

pv commented Feb 23, 2014

Thanks, merged in a90dc28

Labels
needs-work Items that are pending response from the author

4 participants