ENH, WIP: Add 2-D fitting function for polynomials #14151

AWehrhahn · 2019-07-29T13:12:45Z

ENH: add 2d polynomial fit function to numpy.polynomial.

charris · 2019-07-29T13:56:32Z

New functions need to be discussed on the mailing list.

In practice, I've found it useful to have a way to limit which coefficients are allowed to vary. The case where deg(x) + deg(y) <= N for some N turns up fairly often. We also like to keep functions common between all the polynomial types when possible; look at how the current 1-D fitting is implemented.
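To illustrate the deg(x) + deg(y) <= N restriction, here is a minimal sketch that builds a boolean selection array over the 2-D coefficient grid. The helper name `total_degree_mask` is hypothetical and is not part of NumPy or of this PR.

```python
import numpy as np

def total_degree_mask(deg_x, deg_y, max_degree):
    """Hypothetical helper: True where the term x**i * y**j has i + j <= max_degree."""
    i = np.arange(deg_x + 1)
    j = np.arange(deg_y + 1)
    # np.add.outer gives the (deg_x + 1, deg_y + 1) table of i + j.
    return np.add.outer(i, j) <= max_degree

mask = total_degree_mask(2, 2, 2)
print(mask.astype(int))
# [[1 1 1]
#  [1 1 0]
#  [1 0 0]]
```

Only the six terms in the upper-left triangle would be allowed to vary; the rest would stay at zero.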

The style changes should be in a separate PR and aren't required by the current NumPy standard; spaces around * and / aren't needed.

charris · 2019-07-29T14:03:37Z

If you are feeling more ambitious, it would be nice to design a 2-D class to complement the current 1-D class. It would have partial derivative methods, maybe integration too. It will not be a trivial effort; I believe SciPy (@rgommers) is also looking into multidimensional methods for interpolation, and this might be related.

AWehrhahn · 2019-07-29T14:11:30Z

Where can I find the mailing list? I just followed the guide and it didn't mention anything.

I included a case for deg(x) + deg(y) <= N with the max_degree parameter.
What do you mean with keeping functions common between polynomial types? polyfit1d is implemented more or less the same way?

The style changes were unintentional; that was just the auto-formatter doing its work. Will reverse them in the next commit.

I don't think I will have time to do a proper 2d polynomial class, maybe at some later time.

charris · 2019-07-29T14:46:26Z

> What do you mean with keeping functions common between polynomial types?

That all six polynomial types have the same basic set of functions. See numpy/polynomial/polyutils.py for an example of abstracting the common fit functionality in the _fit function. That function is then used to implement 1-D fitting for all the polynomial types.
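As a rough illustration of that pattern, the sketch below passes the basis-specific Vandermonde function into one shared least-squares helper. `fit_sketch` is a hypothetical, stripped-down stand-in; the real `_fit` in polyutils also handles weights, `rcond`, column scaling and the `full` output, none of which are reproduced here.

```python
import numpy as np
from numpy.polynomial import polynomial as P, chebyshev as C

def fit_sketch(vander_f, x, y, deg):
    """Hypothetical shared helper: the basis enters only through vander_f."""
    A = vander_f(x, deg)                      # design matrix, shape (len(x), deg + 1)
    c, *_ = np.linalg.lstsq(A, y, rcond=None)
    return c

x = np.linspace(-1, 1, 50)
y = 1 + 2 * x + 3 * x**2

c_poly = fit_sketch(P.polyvander, x, y, 2)    # power-basis coefficients
c_cheb = fit_sketch(C.chebvander, x, y, 2)    # Chebyshev coefficients of the same polynomial
```

The two calls differ only in the function handed to the helper, which is what lets each polynomial module expose the same fitting API with a one-line wrapper.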

The mailing list is numpy-discussion@python.org, but if you initiate a discussion it is best to subscribe at https://mail.python.org/mailman/listinfo/numpy-discussion.

grlee77 · 2019-07-29T15:52:59Z

I am not an expert in this area, but have wanted to do this previously (but in 3D rather than 2D) and some searching turned up scikit-learn as an existing solution. It is possible to fit n-dimensional polynomials using scikit-learn's PolynomialFeatures combined with a linear model like LinearRegression or Ridge. The examples given in the docs are for 1D, but the underlying PolynomialFeatures can take multiple coordinates for 2D or higher dimensional data.

A benefit of the polyfit2d API proposed here is that it should be easier to understand quickly for a user who just wants to do 2D interpolation without the overhead of learning about scikit-learn classes / pipelines.
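For context, a 2-D least-squares fit can already be assembled from existing NumPy pieces. The sketch below uses `polyvander2d` and `np.linalg.lstsq`; it is not the implementation from this PR, and the sample data and coefficient values are made up for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = rng.uniform(-1, 1, 200)

# Made-up "true" coefficients; true_c[i, j] multiplies x**i * y**j.
true_c = np.array([[1.0, 2.0, 0.0],
                   [0.5, -1.0, 0.0],
                   [3.0, 0.0, 0.0]])
z = P.polyval2d(x, y, true_c)

deg = (2, 2)
A = P.polyvander2d(x, y, deg)                 # shape (200, 9), columns ordered c00, c01, ...
flat, *_ = np.linalg.lstsq(A, z, rcond=None)
c = flat.reshape(deg[0] + 1, deg[1] + 1)      # back to the 2-D coefficient layout
```

A dedicated fit function would mostly wrap these three steps and add weights, scaling and coefficient selection.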

AWehrhahn · 2019-08-05T16:15:23Z

I moved the 2d fit procedure into polyutils and added functions in the other polynomial classes for the 2d fit. I had to disable the scaling though as the shift works differently (I assume).

charris · 2019-08-16T21:30:26Z

numpy/polynomial/polynomial.py

+        If given the maximum combined degree of the coefficients is limited
+        to this value, i.e. all terms with `n` + `m` > max_degree are set to 0.
+        The default is None.
+    scale : bool, optional


Scaling is even more important for the other polynomial bases, whose weights have compact support, which is why I'd like to see a class for 2-d objects at some point so that the extra information, also including the selection of coefficients, could be carried along. I don't think it belongs in the base fitting function, however. It is also possible to scale the 2d domain to [-1, 1] in each direction and get good results in most cases, images and such.
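Scaling each axis to [-1, 1] before fitting can be done with the existing `polyutils.mapdomain` helper. This is only a sketch of the idea; the coordinate ranges are invented, and taking the data's min and max as the domain is an assumption for the example.

```python
import numpy as np
from numpy.polynomial import polyutils as pu

x = np.linspace(100.0, 900.0, 5)   # e.g. pixel columns of an image
y = np.linspace(0.0, 50.0, 5)      # e.g. pixel rows

# Map each axis linearly from its data range onto [-1, 1].
xs = pu.mapdomain(x, [x.min(), x.max()], [-1, 1])
ys = pu.mapdomain(y, [y.min(), y.max()], [-1, 1])
```

The fit is then done on `xs` and `ys`, and the original ranges have to be stored somewhere so that later evaluations apply the same mapping, which is the extra information a 2-D class could carry.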

charris · 2019-08-16T21:50:41Z

Wow, nice job, just a couple of comments.

Code lines need to be < 80 characters in length, a few exceed that.
Docstring lines should be <= 75 characters in length.
There are a couple of typos; it would be good to run a spellcheck on everything.
Scale is only present in the power basis, and should probably be omitted on that count.

I'd really like to figure out a way to reduce the amount of repetition in the docstrings, but haven't solved that problem yet. The polynomial documentation is a significant part of the total NumPy documentation :) There is an argument to be made for the polynomials to be their own project.

If you have any ideas of how to make a 2d class, I'd like to hear them. Such a class would include partial derivatives, along with scaling and offsets.

I'm wondering if it would make sense to use a mask for selecting the fitted coefficients or would that be too general? The current setup covers the most common options.

charris · 2019-08-16T21:54:47Z

I still need to review the code in detail, the tests in particular.

eric-wieser · 2019-08-16T22:23:34Z

I have some ideas on ND versions, possibly with mixed polynomial types - will try to dump them in a hackmd document or issue at some point.

AWehrhahn · 2019-08-21T14:39:54Z

Adding a mask for the selection of the fitted polynomial degrees would be pretty simple, considering that is what I do for max_degree anyway.

eric-wieser · 2019-08-24T18:07:18Z

numpy/polynomial/chebyshev.py

+    --------
+
+    """
+    return pu._fit2d(chebvander2d, x, y, z, deg, rcond, full, w, max_degree, False)


Tempting to pass chebvander in directly here, and let pu._fit2d call pu._vander2d

I mean that is exactly what chebvander2d is doing. No need to repeat that here I think.

The reason I suggest this is that if you passed chebvander directly on to _vander_nd, that would allow fitting of mixed-axis polynomials, e.g. pu._fit2d([chebvander, polyvander], ...)

How about:

    if not callable(vander2d_f):
        # keep the list under its own name so the lambda does not refer to itself
        vander_fs = vander2d_f
        vander2d_f = lambda x, y, deg: _vander_nd_flat(vander_fs, (x, y), deg)

I.e. allow both?
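A small sketch of what a mixed-axis design matrix could look like, built from two 1-D vander functions. `vander2d_mixed` is a hypothetical name and only mimics the idea behind `_vander_nd_flat`; it assumes 1-D sample arrays of equal length.

```python
import numpy as np
from numpy.polynomial import polynomial as P, chebyshev as C

def vander2d_mixed(vander_fs, points, degrees):
    """Hypothetical helper: row-wise outer product of two 1-D vander matrices."""
    vx = vander_fs[0](points[0], degrees[0])   # shape (n, dx + 1)
    vy = vander_fs[1](points[1], degrees[1])   # shape (n, dy + 1)
    # Column i * (dy + 1) + j holds basis_x_i(x) * basis_y_j(y).
    return (vx[:, :, None] * vy[:, None, :]).reshape(len(vx), -1)

x = np.linspace(-1, 1, 7)
y = np.linspace(-1, 1, 7) ** 2

same = vander2d_mixed([P.polyvander, P.polyvander], (x, y), (2, 3))
mixed = vander2d_mixed([C.chebvander, P.polyvander], (x, y), (2, 3))
```

With the same basis on both axes this reproduces `polyvander2d`; with different bases it gives a Chebyshev-in-x, power-in-y design matrix at no extra cost.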

AWehrhahn · 2019-09-09T14:11:20Z

I added the degree handling, using a mask. One can then pass a 2d array with the same size as the output coefficient matrix, where degrees that are to be used are set to 1 (or more), and the others set to 0.
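One way such a mask could drive the fit is to drop the unselected columns of the design matrix before solving. The sketch below is an assumption about the approach, not the code in this PR, and `fit2d_masked` is a hypothetical name.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit2d_masked(x, y, z, mask):
    """Hypothetical sketch: fit only the coefficients where mask is nonzero."""
    mask = np.asarray(mask, dtype=bool)
    deg = (mask.shape[0] - 1, mask.shape[1] - 1)
    # Keep only the design-matrix columns of the selected terms.
    A = P.polyvander2d(x, y, deg)[:, mask.ravel()]
    sol, *_ = np.linalg.lstsq(A, z, rcond=None)
    c = np.zeros(mask.shape)
    c[mask] = sol            # unselected coefficients stay exactly 0
    return c

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = rng.uniform(-1, 1, 100)
z = 1 + 2 * y + 3 * x + 4 * x * y          # made-up data

mask = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
c = fit2d_masked(x, y, z, mask)
```

The max_degree option then becomes a special case: a mask that is nonzero wherever i + j <= max_degree.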

eric-wieser · 2019-10-28T09:35:40Z

numpy/polynomial/polyutils.py

+    for i, j in idx:
+        coeff[i, j] /= scale_x ** i * scale_y ** j
+    return coeff


I think this can be vectorized, will come back to it...
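A possible vectorized form, sketched under the assumption that dividing every entry is acceptable (unselected coefficients are zero, so dividing them changes nothing). Both function names are hypothetical; the loop version is included only to compare results.

```python
import numpy as np

def unscale_loop(coeff, scale_x, scale_y):
    """Reference version: element-by-element division, as in the diff above."""
    out = np.array(coeff, dtype=float)      # np.array copies by default
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] /= scale_x ** i * scale_y ** j
    return out

def unscale_vectorized(coeff, scale_x, scale_y):
    """Same result using an outer product of the per-axis scale powers."""
    coeff = np.asarray(coeff, dtype=float)
    i = np.arange(coeff.shape[0])
    j = np.arange(coeff.shape[1])
    return coeff / np.multiply.outer(float(scale_x) ** i, float(scale_y) ** j)

coeff = np.arange(12, dtype=float).reshape(3, 4)
a = unscale_loop(coeff, 2.0, 0.5)
b = unscale_vectorized(coeff, 2.0, 0.5)
```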

eric-wieser · 2019-10-28T10:11:57Z

numpy/polynomial/chebyshev.py

@@ -1647,6 +1647,129 @@ def chebfit(x, y, deg, rcond=None, full=False, w=None):
    return pu._fit(chebvander, x, y, deg, rcond, full, w)


+def chebfit2d(x, y, z, deg, rcond=None, full=False, w=None, max_degree=None):
+    """
+    2d Least squares fit of Chebyshev series to data.


I wonder if we should construct this docstring from a template; it irritates me to see such a large docstring for such a small function body.
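One way a templated docstring could work is a small decorator that fills in the basis name. This is purely a sketch; `_FIT2D_DOC`, `with_fit2d_doc` and `chebfit2d_stub` are hypothetical names, and the template text is abbreviated.

```python
_FIT2D_DOC = """2-D least squares fit of a {name} series to data.

Parameters
----------
x, y : array_like
    Coordinates of the sample points.
z : array_like
    Values at the sample points.
deg : sequence of int
    Degrees of the fit in x and y.
"""

def with_fit2d_doc(name):
    """Hypothetical decorator: attach the shared template, filled in for one basis."""
    def decorator(func):
        func.__doc__ = _FIT2D_DOC.format(name=name)
        return func
    return decorator

@with_fit2d_doc("Chebyshev")
def chebfit2d_stub(x, y, z, deg):
    # Stand-in body; a real wrapper would delegate to a shared fit helper.
    raise NotImplementedError
```

The trade-off is that the docstring is no longer visible next to the function in the source, which matters for readers who browse the code rather than the rendered docs.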

Fix some of the documentation

Fix the tests by using meshgrid

1D case should be consistent with _fit

AWehrhahn

I fixed this in a later version

eric-wieser · 2020-05-13T12:34:03Z

numpy/polynomial/polyutils.py

@@ -725,6 +726,182 @@ def _fit(vander_f, x, y, deg, rcond=None, full=False, w=None):
    else:
        return c

+def _fitnd(vandernd_f, coords, data, deg=1, rcond=None, full=False, w=None,


w is not documented

eric-wieser · 2020-05-13T12:35:40Z

numpy/polynomial/polyutils.py

+    vander2d_f : {function(array_like, ..., int) -> ndarray, list of function(array_like, int) -> ndarray}
+        The 2d vander function, such as ``polyvander2d``,
+        or a list of 1d vander functions for each dimension


Can we remove the complexity here and only allow the list of 1d vander functions?

Sure, will change that soon

eric-wieser · 2020-05-13T12:38:10Z

numpy/polynomial/chebyshev.py

@@ -1663,6 +1663,130 @@ def chebfit(x, y, deg, rcond=None, full=False, w=None):
    return pu._fit(chebvander, x, y, deg, rcond, full, w)


+def chebfit2d(x, y, z, deg, rcond=None, full=False, w=None, max_degree=None):


Any reason not to immediately go to nd?

Suggested change

-def chebfit2d(x, y, z, deg, rcond=None, full=False, w=None, max_degree=None):
+def chebfitnd(xs, fxs, deg, *, rcond=None, full=False, w=None, max_degree=None):

The function call is slightly different. The 2d version has separate parameters for each dimension, x and y, while the nd version has only one parameter, (x, y), with all coordinates in a tuple.
The 2d version is more consistent with existing functions such as polyval2d.

So, I'd argue that in hindsight the 2d functions were a mistake - we should have gone straight from 1d to nd, rather than blowing up our API with intermediate integers.

Since what you're adding is a new API, what I'm arguing is that we should learn from our mistake, and go straight to writing the nd function (and the slight argument difference this entails).

If you feel strongly about the calling convention, you could use instead:

Suggested change

-def chebfit2d(x, y, z, deg, rcond=None, full=False, w=None, max_degree=None):
+def chebfitnd(*args, deg, rcond=None, full=False, w=None, max_degree=None):
+    *xs, fxs = args

I have no particular attachment to the calling convention.
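To make the variadic calling convention concrete, here is a toy function with that signature. `fitnd_signature_demo` is hypothetical and does no fitting; it only shows how the positional arguments split. Note that parameters after `*args` are keyword-only already, so a separate bare `*` is not needed (and is a syntax error in Python).

```python
def fitnd_signature_demo(*args, deg, rcond=None):
    """Hypothetical stub: all coordinates first, data last, options by keyword."""
    *xs, fxs = args          # everything but the last positional argument is a coordinate
    return len(xs), fxs, deg

ndim, data, deg = fitnd_signature_demo([0.0, 1.0], [2.0, 3.0], [4.0, 5.0], deg=(1, 1))
print(ndim)  # 2
```

A call keeps the familiar `(x, y, z, deg=...)` shape for 2-D data while the same function accepts any number of coordinate arrays.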

Why not deprecate all the XXX2d and XXX3d functions (like chebfit2d, chebfit3d, chebval2d, chebval3d, chebgrid2d, chebgrid3d, chebvander2d, chebvander3d, etc.), since they are all obsolete once you have the XXXnd functions (like chebfitnd, chebvalnd, chebgridnd, chebvandernd, ...)?

eric-wieser · 2020-05-13T12:38:59Z

numpy/polynomial/chebyshev.py

+    --------
+
+    """
+    return pu._fitnd(chebvander2d, (x, y), z, deg, rcond, full, w, max_degree)


With my above suggestion,

Suggested change

-return pu._fitnd(chebvander2d, (x, y), z, deg, rcond, full, w, max_degree)
+return pu._fitnd([chebvander] * len(xs), xs, fxs, deg, rcond, full, w, max_degree)
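A minimal sketch of how an N-d fit could be built from a list of 1-D vander functions, along the lines of the suggestion above. `vander_nd_flat_sketch` and `chebfitnd_sketch` are hypothetical names; weights, `rcond`, `full` and `max_degree` are left out, and the 3-D test data is invented.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def vander_nd_flat_sketch(vander_fs, points, degrees):
    """Hypothetical helper: row-wise tensor product of 1-D vander matrices."""
    mats = [f(np.asarray(p, dtype=float), d)
            for f, p, d in zip(vander_fs, points, degrees)]
    n = mats[0].shape[0]
    out = mats[0]
    for v in mats[1:]:
        # Append one more axis of basis functions, then flatten again (C order).
        out = (out[:, :, None] * v[:, None, :]).reshape(n, -1)
    return out

def chebfitnd_sketch(xs, fxs, deg):
    """Hypothetical N-d Chebyshev least-squares fit; deg has one entry per axis."""
    A = vander_nd_flat_sketch([C.chebvander] * len(xs), xs, deg)
    sol, *_ = np.linalg.lstsq(A, fxs, rcond=None)
    return sol.reshape([d + 1 for d in deg])

rng = np.random.default_rng(2)
x, y, z = rng.uniform(-1, 1, (3, 100))
true_c = np.arange(1.0, 9.0).reshape(2, 2, 2)   # made-up coefficients
f = C.chebval3d(x, y, z, true_c)
c = chebfitnd_sketch((x, y, z), f, (1, 1, 1))
```

The same two helpers cover 1-D, 2-D and 3-D without separate functions per dimension, which is the argument for going straight to an nd API.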

Fix documentation

Currently there are no functions in chebyshev.py that do N-dimensional fitting for N > 1. This submission is an attempt to fill that gap and complements the efforts in numpy#14151. The multidimensional interpolation method implemented here uses the 1D Discrete Cosine Transform (DCT) instead of the Vandermonde method. This is a reimplementation of the chebfitfunc function found in numpy#6071. The accuracy of the new interpolation method is better than the old one, especially for polynomials of high degree. Also added domain as input to chebinterpolate in order to interpolate the function over any input domain. Added test_2d_approximation for testing interpolation of a 2D function. Replaced all references to the missing chebfromfunction with chebinterpolate. Made doctest of chebinterpolate more robust.
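For reference, interpolation over an arbitrary 1-D domain can be sketched with the existing `chebinterpolate(func, deg)` by mapping the domain onto [-1, 1] by hand. This is not the DCT-based code from the referenced commit; `interpolate_on_domain` is a hypothetical helper, and it assumes `chebinterpolate` without a `domain` argument.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def interpolate_on_domain(func, deg, domain):
    """Hypothetical helper: Chebyshev interpolant of func on [a, b]."""
    a, b = domain
    # Affine map from t in [-1, 1] to x in [a, b].
    to_x = lambda t: 0.5 * (b - a) * t + 0.5 * (b + a)
    coef = C.chebinterpolate(lambda t: func(to_x(t)), deg)
    # The Chebyshev class applies the inverse map when the series is evaluated.
    return C.Chebyshev(coef, domain=[a, b])

p = interpolate_on_domain(np.exp, 12, (0.0, 2.0))
xs = np.linspace(0.0, 2.0, 50)
max_err = np.max(np.abs(p(xs) - np.exp(xs)))
```

An N-d version would apply the same idea axis by axis on a tensor grid of Chebyshev points.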

charris changed the title ~~Polyfit2d~~ ENH, WIP: Add 2-D fitting function for polynomials Jul 29, 2019

charris added 01 - Enhancement component: numpy.polynomial labels Jul 29, 2019

AWehrhahn force-pushed the polyfit2d branch from 484ec0e to 14569cb Compare July 29, 2019 14:28

charris reviewed Aug 16, 2019

eric-wieser reviewed Aug 16, 2019

eric-wieser self-requested a review August 16, 2019 22:27

eric-wieser reviewed Aug 24, 2019
