python
import numpy as np import pandas as pd import xarray as xr
np.random.seed(123456)
Xarray offers flexible interpolation routines, which have a similar interface to our indexing <indexing>
.
Note
interp
requires scipy installed.
Interpolating a :py~xarray.DataArray
works mostly like labeled indexing of a :py~xarray.DataArray
,
python
- da = xr.DataArray(
np.sin(0.3 * np.arange(12).reshape(4, 3)), [("time", np.arange(4)), ("space", [0.1, 0.2, 0.3])],
) # label lookup da.sel(time=3)
# interpolation da.interp(time=2.5)
Similar to the indexing, :py~xarray.DataArray.interp
also accepts an array-like, which gives the interpolated result as an array.
python
# label lookup da.sel(time=[2, 3])
# interpolation da.interp(time=[2.5, 3.5])
To interpolate data with a :pynumpy.datetime64 <reference/arrays.datetime>
coordinate you can pass a string.
python
- da_dt64 = xr.DataArray(
[1, 3], [("time", pd.date_range("1/1/2000", "1/3/2000", periods=2))]
) da_dt64.interp(time="2000-01-02")
The interpolated data can be merged into the original :py~xarray.DataArray
by specifying the time periods required.
python
da_dt64.interp(time=pd.date_range("1/1/2000", "1/3/2000", periods=3))
Interpolation of data indexed by a :py~xarray.CFTimeIndex
is also allowed. See CFTimeIndex
for examples.
Note
Currently, our interpolation only works for regular grids. Therefore, similarly to :py~xarray.DataArray.sel
, only 1D coordinates along a dimension can be used as the original coordinate to be interpolated.
Like :py~xarray.DataArray.sel
, :py~xarray.DataArray.interp
accepts multiple coordinates. In this case, multidimensional interpolation is carried out.
python
# label lookup da.sel(time=2, space=0.1)
# interpolation da.interp(time=2.5, space=0.15)
Array-like coordinates are also accepted:
python
# label lookup da.sel(time=[2, 3], space=[0.1, 0.2])
# interpolation da.interp(time=[1.5, 2.5], space=[0.15, 0.25])
:py~xarray.DataArray.interp_like
method is a useful shortcut. This method interpolates an xarray object onto the coordinates of another xarray object. For example, if we want to compute the difference between two :py~xarray.DataArray
s (da
and other
) staying on slightly different coordinates,
python
- other = xr.DataArray(
np.sin(0.4 * np.arange(9).reshape(3, 3)), [("time", [0.9, 1.9, 2.9]), ("space", [0.15, 0.25, 0.35])],
)
it might be a good idea to first interpolate da
so that it will stay on the same coordinates of other
, and then subtract it. :py~xarray.DataArray.interp_like
can be used for such a case,
python
# interpolate da along other's coordinates interpolated = da.interp_like(other) interpolated
It is now possible to safely compute the difference other - interpolated
.
We use :pyscipy.interpolate.interp1d
for 1-dimensional interpolation and :pyscipy.interpolate.interpn
for multi-dimensional interpolation.
The interpolation method can be specified by the optional method
argument.
python
- da = xr.DataArray(
np.sin(np.linspace(0, 2 * np.pi, 10)), dims="x", coords={"x": np.linspace(0, 1, 10)},
)
da.plot.line("o", label="original") da.interp(x=np.linspace(0, 1, 100)).plot.line(label="linear (default)") da.interp(x=np.linspace(0, 1, 100), method="cubic").plot.line(label="cubic") @savefig interpolation_sample1.png width=4in plt.legend()
Additional keyword arguments can be passed to scipy's functions.
python
# fill 0 for the outside of the original coordinates. da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": 0.0}) # 1-dimensional extrapolation da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": "extrapolate"}) # multi-dimensional extrapolation da = xr.DataArray( np.sin(0.3 * np.arange(12).reshape(4, 3)), [("time", np.arange(4)), ("space", [0.1, 0.2, 0.3])], )
da.interp(time=4, space=np.linspace(-0.1, 0.5, 10), kwargs={"fill_value": None})
:py~xarray.DataArray.interp
accepts :py~xarray.DataArray
as similar to :py~xarray.DataArray.sel
, which enables us more advanced interpolation. Based on the dimension of the new coordinate passed to :py~xarray.DataArray.interp
, the dimension of the result are determined.
For example, if you want to interpolate a two dimensional array along a particular dimension, as illustrated below, you can pass two 1-dimensional :py~xarray.DataArray
s with a common dimension as new coordinate.
For example:
python
- da = xr.DataArray(
np.sin(0.3 * np.arange(20).reshape(5, 4)), [("x", np.arange(5)), ("y", [0.1, 0.2, 0.3, 0.4])],
) # advanced indexing x = xr.DataArray([0, 2, 4], dims="z") y = xr.DataArray([0.1, 0.2, 0.3], dims="z") da.sel(x=x, y=y)
# advanced interpolation x = xr.DataArray([0.5, 1.5, 2.5], dims="z") y = xr.DataArray([0.15, 0.25, 0.35], dims="z") da.interp(x=x, y=y)
where values on the original coordinates (x, y) = ((0.5, 0.15), (1.5, 0.25), (2.5, 0.35))
are obtained by the 2-dimensional interpolation and mapped along a new dimension z
.
If you want to add a coordinate to the new dimension z
, you can supply :py~xarray.DataArray
s with a coordinate,
python
x = xr.DataArray([0.5, 1.5, 2.5], dims="z", coords={"z": ["a", "b", "c"]}) y = xr.DataArray([0.15, 0.25, 0.35], dims="z", coords={"z": ["a", "b", "c"]}) da.interp(x=x, y=y)
For the details of the advanced indexing, see more advanced indexing <more_advanced_indexing>
.
Our :py~xarray.DataArray.interp
works with arrays with NaN the same way that scipy.interpolate.interp1d and scipy.interpolate.interpn do. linear
and nearest
methods return arrays including NaN, while other methods such as cubic
or quadratic
return all NaN arrays.
python
da = xr.DataArray([0, 2, np.nan, 3, 3.25], dims="x", coords={"x": range(5)}) da.interp(x=[0.5, 1.5, 2.5]) da.interp(x=[0.5, 1.5, 2.5], method="cubic")
To avoid this, you can drop NaN by :py~xarray.DataArray.dropna
, and then make the interpolation
python
dropped = da.dropna("x") dropped dropped.interp(x=[0.5, 1.5, 2.5], method="cubic")
If NaNs are distributed randomly in your multidimensional array, dropping all the columns containing more than one NaNs by :py~xarray.DataArray.dropna
may lose a significant amount of information. In such a case, you can fill NaN by :py~xarray.DataArray.interpolate_na
, which is similar to :pypandas.Series.interpolate
.
python
filled = da.interpolate_na(dim="x") filled
This fills NaN by interpolating along the specified dimension. After filling NaNs, you can interpolate:
python
filled.interp(x=[0.5, 1.5, 2.5], method="cubic")
For the details of :py~xarray.DataArray.interpolate_na
, see Missing values <missing_values>
.
Let's see how :py~xarray.DataArray.interp
works on real data.
python
# Raw data ds = xr.tutorial.open_dataset("air_temperature").isel(time=0) fig, axes = plt.subplots(ncols=2, figsize=(10, 4)) ds.air.plot(ax=axes[0]) axes[0].set_title("Raw data")
# Interpolated data new_lon = np.linspace(ds.lon[0], ds.lon[-1], ds.dims["lon"] * 4) new_lat = np.linspace(ds.lat[0], ds.lat[-1], ds.dims["lat"] * 4) dsi = ds.interp(lat=new_lat, lon=new_lon) dsi.air.plot(ax=axes[1]) @savefig interpolation_sample3.png width=8in axes[1].set_title("Interpolated data")
Our advanced interpolation can be used to remap the data to the new coordinate. Consider the new coordinates x and z on the two dimensional plane. The remapping can be done as follows
python
# new coordinate x = np.linspace(240, 300, 100) z = np.linspace(20, 70, 100) # relation between new and original coordinates lat = xr.DataArray(z, dims=["z"], coords={"z": z}) lon = xr.DataArray( (x[:, np.newaxis] - 270) / np.cos(z * np.pi / 180) + 270, dims=["x", "z"], coords={"x": x, "z": z}, )
fig, axes = plt.subplots(ncols=2, figsize=(10, 4)) ds.air.plot(ax=axes[0]) # draw the new coordinate on the original coordinates. for idx in [0, 33, 66, 99]: axes[0].plot(lon.isel(x=idx), lat, "--k") for idx in [0, 33, 66, 99]: axes[0].plot(*xr.broadcast(lon.isel(z=idx), lat.isel(z=idx)), "--k") axes[0].set_title("Raw data")
dsi = ds.interp(lon=lon, lat=lat) dsi.air.plot(ax=axes[1]) @savefig interpolation_sample4.png width=8in axes[1].set_title("Remapped data")