.. currentmodule:: csaps
csaps provides object-oriented API for computing and evaluating univariate, multivariate and nd-gridded splines, but in most cases we recommend to use a shortcut function :func:`csaps` for smoothing data and computing splines.
Firstly, we import :func:`csaps` function and other modules for our examples and also define the function that will produce univariate data:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from csaps import csaps
def univariate_data(n=25, seed=1234):
np.random.seed(seed)
x = np.linspace(-5., 5., n)
y = np.exp(-(x/2.5)**2) + (np.random.rand(n) - 0.2) * 0.3
return x, y
Univariate data are two vectors: X and Y with the same size. X is data sites, Y is data values. X-values must satisfy the condition: x1 < x2 < ... < xN
It is a simple example how to smooth univariate data:
.. plot:: x, y = univariate_data() xi = np.linspace(x[0], x[-1], 150) yi = csaps(x, y, xi, smooth=0.85) plt.plot(x, y, 'o', xi, yi, '-')
Also we can smooth multivariate (n-dimensional) data using the same function.
The algorithm supports computing and evaluating spline for multivariate data with vectorization. You can compute smoothing splines for X, Y data where X is data site vector and Y is ND-array of data value vectors.
The example of data:
# data sites
x = [1, 2, 3, 4]
# 3 data vectors (3-D)
y = [
(2, 4, 6, 8),
(1, 3, 5, 7),
(5, 1, 3, 9),
]
By default, the shape of Y array must be: (d0, d1, ..., dN)
where dN
must equal to X vector size. Also you can use :ref:`axis parameter <tutorial-using-axis>` to
set the data values axis for Y array.
In this case the smoothing spline will be computed for all Y data vectors at a time. The same weights vector and the same smoothing parameter will be used for all Y data.
2-D data example:
.. plot:: np.random.seed(1234) theta = np.linspace(0, 2*np.pi, 35) x = np.cos(theta) + np.random.randn(35) * 0.1 y = np.sin(theta) + np.random.randn(35) * 0.1 data = [x, y] theta_i = np.linspace(0, 2*np.pi, 200) data_i = csaps(theta, data, theta_i, smooth=0.95) xi = data_i[0, :] yi = data_i[1, :] plt.plot(x, y, ':o', xi, yi, '-')
3-D data example:
.. plot:: np.random.seed(1234) n = 100 theta = np.linspace(-4 * np.pi, 4 * np.pi, n) z = np.linspace(-2, 2, n) r = z ** 2 + 1 x = r * np.sin(theta) + np.random.randn(n) * 0.3 y = r * np.cos(theta) + np.random.randn(n) * 0.3 data = [x, y, z] theta_i = np.linspace(-4 * np.pi, 4 * np.pi, 250) data_i = csaps(theta, data, theta_i, smooth=0.95) xi = data_i[0, :] yi = data_i[1, :] zi = data_i[2, :] fig = plt.figure(figsize=(7, 4.5)) ax = fig.add_subplot(111, projection='3d') ax.set_facecolor('none') ax.plot(x, y, z, '.:') ax.plot(xi, yi, zi, '-')
Finally, using the same function we can smooth n-d gridded data.
The algorithm can make smoothing splines for n-d gridded data smoothing. In this case the algorithm makes coordinatewise smoothing (tensor-product of univariate splines coefficients).
X-data must be a sequence of vectors for each dimension. Y-data must be n-d array.
The example of data:
x = [
(-2, -1, 0, 1, 2), # X-grid data sites
(-2, -1, 0, 1, 2), # Y-grid data sites
(-2, -1, 0, 1, 2), # Z-grid data sites
]
y = np.random.rand(5, 5, 5) # 5x5x5 3-D grid data values
Also you can set the smoothing parameter for each dimension:
smooth = [
0.95, # the smoothing parameter for X
0.83, # the smoothing parameter for Y
0.51, # the smoothing parameter for Z
]
Surface data example:
.. plot:: np.random.seed(1234) xdata = [np.linspace(-3, 3, 41), np.linspace(-3.5, 3.5, 31)] i, j = np.meshgrid(*xdata, indexing='ij') ydata = (3 * (1 - j)**2. * np.exp(-(j**2) - (i + 1)**2) - 10 * (j / 5 - j**3 - i**5) * np.exp(-j**2 - i**2) - 1 / 3 * np.exp(-(j + 1)**2 - i**2)) ydata = ydata + (np.random.randn(*ydata.shape) * 0.75) ydata_s = csaps(xdata, ydata, xdata, smooth=0.988) fig = plt.figure(figsize=(7, 4.5)) ax = fig.add_subplot(111, projection='3d') ax.set_facecolor('none') c = [s['color'] for s in plt.rcParams['axes.prop_cycle']] ax.plot_wireframe(j, i, ydata, linewidths=0.5, color=c[0], alpha=0.5) ax.scatter(j, i, ydata, s=10, c=c[0], alpha=0.5) ax.plot_surface(j, i, ydata_s, color=c[1], linewidth=0, alpha=1.0) ax.view_init(elev=9., azim=290)
In all the smoothing examples above we are used the following :func:`csaps` signature:
yi = csaps(x, y, xi, smooth)
where
x
-- the data sites vector for univariate/multivariate data and a sequence of vectors for nd-gridded data.x
-values must satisfy the condition:x1 < x2 < ... < xN
y
-- the data values. For univariate case it is vector with the same size asx
, for multivariate case it is a sequence of vectors or nd-array, and for nd-gridded data it is nd-arrayxi
-- the data sites for smoothed data (output mesh). It is shape-likex
and in the same range asx
, but usually it has more interpolated pointssmooth
-- the smoothing parameter in the range[0, 1]
If we want to smooth the data without specifying the smoothing parameter we can use the following signature:
yi, smooth = csaps(x, y, xi)
In this case the smoothing parameter will be computed automatically and will be returned in the
function result. In this case the function will return :class:`AutoSmoothingResult` named
tuple: AutoSmoothingResult(values, smooth)
.
The example of auto smoothing univariate data:
.. plot:: x, y = univariate_data() xi = np.linspace(x[0], x[-1], 51) smoothing_result = csaps(x, y, xi) yi = smoothing_result.values smooth = smoothing_result.smooth plt.plot(x, y, 'o') plt.plot(xi, yi, '-', label=f'smooth={smooth:.3f}') plt.legend()
In ND-gridded data case we can use auto smoothing for all dimensions or the particular dimensions:
smooth = [
0.95,
None, # auto smoothing only for Y
0.85,
]
...
smoothing_result = csaps(x, y, xi, smooth=smooth)
- The smoothing parameter p should be in range [0, 1] where bounds are:
- 0: The smoothing spline is the least-squares straight line fit to the data
- 1: The cubic spline interpolant with natural boundary condition
The following example demonstartes these two boundary cases:
.. plot:: x = np.linspace(-5., 5., 7) y = 1 / (1 + x**2) xi = np.linspace(x[0], x[-1], 150) yi_0 = csaps(x, y, xi, smooth=0) yi_1 = csaps(x, y, xi, smooth=1) plt.plot(x, y, 'o:') plt.plot(xi, yi_0, '-', label='smooth=0') plt.plot(xi, yi_1, '-', label='smooth=1') plt.legend()
If we want to use error measure weights while computing spline, we can use the following signatures:
yi = csaps(x, y, xi, weights, smooth) yi, smooth = csaps(x, y, xi, weights) spline = csaps(x, y, weights) spline = csaps(x, y, weights, smooth)
The example of weighted smoothing univariate data:
.. plot:: x, y = univariate_data() xi = np.linspace(x[0], x[-1], 150) w = np.ones_like(x) * 0.5 w[-7:] = 0.1 w[:7] = 0.1 w[[10,13]] = 1.0 w[[11,12]] = 0.1 yi = csaps(x, y, xi, smooth=0.85) yi_w = csaps(x, y, xi, weights=w, smooth=0.85) plt.plot(x, y, 'o') plt.plot(xi, yi, '-', label='unweighted') plt.plot(xi, yi_w, '-', label='weighted') plt.legend()
In n-d gridded data case we can use the same weights for all dimensions or different weights for each dimension.
axis parameter specifies Y-data axis for computing spline in multivariate/vectorize data cases
(axis along which Y-data is assumed to be varying).
By default axis is equal to -1 (the last axis). In other words, y.shape[axis]
must be equal to x.size
.
For example, the following code will raise ValueError
without axis
parameter:
.. plot:: x, y1 = univariate_data(seed=1327) x, y2 = univariate_data(seed=2451) # We stack y-data as MxN array y = np.stack((y1, y2), axis=1) print('x.size:', x.size) print('y.shape:', y.shape) xi = np.linspace(x[0], x[-1], 150) # yi = csaps(x, y, xi, smooth=0.8) # --> ValueError: invalid "ydata" shape for given "xdata" yi = csaps(x, y, xi, smooth=0.8, axis=0) plt.plot(x, y, 'o', xi, yi, '-')
Note
axis
parameter is ignored in ND-gridded data cases.
.. versionadded:: 1.1.0
We can use normalizedsmooth
option to normalize smooth parameter.
If normalizedsmooth
is True, the smooth parameter is normalized such that results are invariant to xdata range
and less sensitive to nonuniformity of weights and xdata clumping.
Let's show it on a simple example.
.. plot:: x1, y = univariate_data(seed=1327) x2 = x1 * 10 xi1 = np.linspace(x1[0], x1[-1], 150) xi2 = np.linspace(x2[0], x2[-1], 150) yi1 = csaps(x1, y, xi1, smooth=0.8) yi2 = csaps(x2, y, xi2, smooth=0.8) yi1_n = csaps(x1, y, xi1, smooth=0.8, normalizedsmooth=True) yi2_n = csaps(x2, y, xi2, smooth=0.8, normalizedsmooth=True) f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(8, 6)) ax1.plot(x1, y, 'o', xi1, yi1, '-') ax2.plot(x2, y, 'o', xi2, yi2, '-') ax3.plot(x1, y, 'o', xi1, yi1_n, '-') ax4.plot(x2, y, 'o', xi2, yi2_n, '-')
If we want to compute spline only without evaluating (smoothing data), we can use the following signatures:
spline = csaps(x, y) spline = csaps(x, y, smooth)
In this case the smoothing spline will be computed for given data and returned as an instance of :class:`ISmoothingSpline` based class. After we can use the computed spline to evaluate (smoothing) data for given data sites repeatedly.
The example for univariate data:
.. plot:: x, y = univariate_data(n=11) spline = csaps(x, y) xi1 = np.linspace(x[0], x[-1], 20) xi2 = np.linspace(x[0], x[-1], 50) yi1 = spline(xi1) yi2 = spline(xi2) f, (ax1, ax2) = plt.subplots(2, 1, figsize=(5, 6)) ax1.plot(x, y, 'o', xi1, yi1, '.-') ax2.plot(x, y, 'o', xi2, yi2, '.-')
We can use spline analysis functionality that is provided by :class:`scipy.interpolate.PPoly`/:class:`scipy.interpolate.NdPPoly` classes.
For example, let's try to use :func:`scipy.interpolate.PPoly.derivative` method to compute spline 1st and 2nd derivatives.
.. plot:: from csaps import CubicSmoothingSpline x, y = univariate_data() xi = np.linspace(x[0], x[-1], 300) s = CubicSmoothingSpline(x, y, smooth=0.85).spline ds1 = s.derivative(nu=1) ds2 = s.derivative(nu=2) _, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(5, 6)) ax1.plot(x, y, '.', xi, s(xi), '-') ax2.plot(xi, ds1(xi), '-') ax3.plot(xi, ds2(xi), '-') ax1.set_title('spline') ax2.set_title('1st derivative') ax3.set_title('2nd derivative')