Skip to content

Latest commit

 

History

History
320 lines (224 loc) · 9.65 KB

interpolation.rst

File metadata and controls

320 lines (224 loc) · 9.65 KB

Interpolating data

python

import numpy as np import pandas as pd import xarray as xr

np.random.seed(123456)

Xarray offers flexible interpolation routines, which have a similar interface to our indexing <indexing>.

Note

interp requires scipy installed.

Scalar and 1-dimensional interpolation

Interpolating a :py~xarray.DataArray works mostly like labeled indexing of a :py~xarray.DataArray,

python

da = xr.DataArray(

np.sin(0.3 * np.arange(12).reshape(4, 3)), [("time", np.arange(4)), ("space", [0.1, 0.2, 0.3])],

) # label lookup da.sel(time=3)

# interpolation da.interp(time=2.5)

Similar to the indexing, :py~xarray.DataArray.interp also accepts an array-like, which gives the interpolated result as an array.

python

# label lookup da.sel(time=[2, 3])

# interpolation da.interp(time=[2.5, 3.5])

To interpolate data with a :pynumpy.datetime64 <reference/arrays.datetime> coordinate you can pass a string.

python

da_dt64 = xr.DataArray(

[1, 3], [("time", pd.date_range("1/1/2000", "1/3/2000", periods=2))]

) da_dt64.interp(time="2000-01-02")

The interpolated data can be merged into the original :py~xarray.DataArray by specifying the time periods required.

python

da_dt64.interp(time=pd.date_range("1/1/2000", "1/3/2000", periods=3))

Interpolation of data indexed by a :py~xarray.CFTimeIndex is also allowed. See CFTimeIndex for examples.

Note

Currently, our interpolation only works for regular grids. Therefore, similarly to :py~xarray.DataArray.sel, only 1D coordinates along a dimension can be used as the original coordinate to be interpolated.

Multi-dimensional Interpolation

Like :py~xarray.DataArray.sel, :py~xarray.DataArray.interp accepts multiple coordinates. In this case, multidimensional interpolation is carried out.

python

# label lookup da.sel(time=2, space=0.1)

# interpolation da.interp(time=2.5, space=0.15)

Array-like coordinates are also accepted:

python

# label lookup da.sel(time=[2, 3], space=[0.1, 0.2])

# interpolation da.interp(time=[1.5, 2.5], space=[0.15, 0.25])

:py~xarray.DataArray.interp_like method is a useful shortcut. This method interpolates an xarray object onto the coordinates of another xarray object. For example, if we want to compute the difference between two :py~xarray.DataArray s (da and other) staying on slightly different coordinates,

python

other = xr.DataArray(

np.sin(0.4 * np.arange(9).reshape(3, 3)), [("time", [0.9, 1.9, 2.9]), ("space", [0.15, 0.25, 0.35])],

)

it might be a good idea to first interpolate da so that it will stay on the same coordinates of other, and then subtract it. :py~xarray.DataArray.interp_like can be used for such a case,

python

# interpolate da along other's coordinates interpolated = da.interp_like(other) interpolated

It is now possible to safely compute the difference other - interpolated.

Interpolation methods

We use :pyscipy.interpolate.interp1d for 1-dimensional interpolation and :pyscipy.interpolate.interpn for multi-dimensional interpolation.

The interpolation method can be specified by the optional method argument.

python

da = xr.DataArray(

np.sin(np.linspace(0, 2 * np.pi, 10)), dims="x", coords={"x": np.linspace(0, 1, 10)},

)

da.plot.line("o", label="original") da.interp(x=np.linspace(0, 1, 100)).plot.line(label="linear (default)") da.interp(x=np.linspace(0, 1, 100), method="cubic").plot.line(label="cubic") @savefig interpolation_sample1.png width=4in plt.legend()

Additional keyword arguments can be passed to scipy's functions.

python

# fill 0 for the outside of the original coordinates. da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": 0.0}) # 1-dimensional extrapolation da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": "extrapolate"}) # multi-dimensional extrapolation da = xr.DataArray( np.sin(0.3 * np.arange(12).reshape(4, 3)), [("time", np.arange(4)), ("space", [0.1, 0.2, 0.3])], )

da.interp(time=4, space=np.linspace(-0.1, 0.5, 10), kwargs={"fill_value": None})

Advanced Interpolation

:py~xarray.DataArray.interp accepts :py~xarray.DataArray as similar to :py~xarray.DataArray.sel, which enables us more advanced interpolation. Based on the dimension of the new coordinate passed to :py~xarray.DataArray.interp, the dimension of the result are determined.

For example, if you want to interpolate a two dimensional array along a particular dimension, as illustrated below, you can pass two 1-dimensional :py~xarray.DataArray s with a common dimension as new coordinate.

advanced indexing and interpolation

For example:

python

da = xr.DataArray(

np.sin(0.3 * np.arange(20).reshape(5, 4)), [("x", np.arange(5)), ("y", [0.1, 0.2, 0.3, 0.4])],

) # advanced indexing x = xr.DataArray([0, 2, 4], dims="z") y = xr.DataArray([0.1, 0.2, 0.3], dims="z") da.sel(x=x, y=y)

# advanced interpolation x = xr.DataArray([0.5, 1.5, 2.5], dims="z") y = xr.DataArray([0.15, 0.25, 0.35], dims="z") da.interp(x=x, y=y)

where values on the original coordinates (x, y) = ((0.5, 0.15), (1.5, 0.25), (2.5, 0.35)) are obtained by the 2-dimensional interpolation and mapped along a new dimension z.

If you want to add a coordinate to the new dimension z, you can supply :py~xarray.DataArray s with a coordinate,

python

x = xr.DataArray([0.5, 1.5, 2.5], dims="z", coords={"z": ["a", "b", "c"]}) y = xr.DataArray([0.15, 0.25, 0.35], dims="z", coords={"z": ["a", "b", "c"]}) da.interp(x=x, y=y)

For the details of the advanced indexing, see more advanced indexing <more_advanced_indexing>.

Interpolating arrays with NaN

Our :py~xarray.DataArray.interp works with arrays with NaN the same way that scipy.interpolate.interp1d and scipy.interpolate.interpn do. linear and nearest methods return arrays including NaN, while other methods such as cubic or quadratic return all NaN arrays.

python

da = xr.DataArray([0, 2, np.nan, 3, 3.25], dims="x", coords={"x": range(5)}) da.interp(x=[0.5, 1.5, 2.5]) da.interp(x=[0.5, 1.5, 2.5], method="cubic")

To avoid this, you can drop NaN by :py~xarray.DataArray.dropna, and then make the interpolation

python

dropped = da.dropna("x") dropped dropped.interp(x=[0.5, 1.5, 2.5], method="cubic")

If NaNs are distributed randomly in your multidimensional array, dropping all the columns containing more than one NaNs by :py~xarray.DataArray.dropna may lose a significant amount of information. In such a case, you can fill NaN by :py~xarray.DataArray.interpolate_na, which is similar to :pypandas.Series.interpolate.

python

filled = da.interpolate_na(dim="x") filled

This fills NaN by interpolating along the specified dimension. After filling NaNs, you can interpolate:

python

filled.interp(x=[0.5, 1.5, 2.5], method="cubic")

For the details of :py~xarray.DataArray.interpolate_na, see Missing values <missing_values>.

Example

Let's see how :py~xarray.DataArray.interp works on real data.

python

# Raw data ds = xr.tutorial.open_dataset("air_temperature").isel(time=0) fig, axes = plt.subplots(ncols=2, figsize=(10, 4)) ds.air.plot(ax=axes[0]) axes[0].set_title("Raw data")

# Interpolated data new_lon = np.linspace(ds.lon[0], ds.lon[-1], ds.dims["lon"] * 4) new_lat = np.linspace(ds.lat[0], ds.lat[-1], ds.dims["lat"] * 4) dsi = ds.interp(lat=new_lat, lon=new_lon) dsi.air.plot(ax=axes[1]) @savefig interpolation_sample3.png width=8in axes[1].set_title("Interpolated data")

Our advanced interpolation can be used to remap the data to the new coordinate. Consider the new coordinates x and z on the two dimensional plane. The remapping can be done as follows

python

# new coordinate x = np.linspace(240, 300, 100) z = np.linspace(20, 70, 100) # relation between new and original coordinates lat = xr.DataArray(z, dims=["z"], coords={"z": z}) lon = xr.DataArray( (x[:, np.newaxis] - 270) / np.cos(z * np.pi / 180) + 270, dims=["x", "z"], coords={"x": x, "z": z}, )

fig, axes = plt.subplots(ncols=2, figsize=(10, 4)) ds.air.plot(ax=axes[0]) # draw the new coordinate on the original coordinates. for idx in [0, 33, 66, 99]: axes[0].plot(lon.isel(x=idx), lat, "--k") for idx in [0, 33, 66, 99]: axes[0].plot(*xr.broadcast(lon.isel(z=idx), lat.isel(z=idx)), "--k") axes[0].set_title("Raw data")

dsi = ds.interp(lon=lon, lat=lat) dsi.air.plot(ax=axes[1]) @savefig interpolation_sample4.png width=8in axes[1].set_title("Remapped data")