Skip to content

Conversation

@ev-br
Copy link
Member

@ev-br ev-br commented May 20, 2025

This PR adds a minimal Array API interface to BSpline and make_interp_spline.

There are several minor details worth spelling out. First, BSpline uses a C extension, therefore non-numpy array inputs need to go through the now-standard np.asarray(input); result = c_function(input); return xp.asarray(result) dance.

Second, BSpline instances used to store numpy arrays. Namely, for array-like t and c, and spl = BSpline(t, c, k), then spl.t and spl.c are numpy array attributes derived from the __init__ arguments.
This PR changes these attributes into properties, so that if t and c are torch tensors, and spl = BSpline(t, c, k),
a new private attribute spl._t is a numpy array, and spl.t is a propery which constructs a torch tensor from spl._t.

The advantage of this is that at the call time, we can use spl._t without paying the price of np.asarray (which is not negligible here, actually). An unpleasant consequence is that the property abstraction leaks for mutable properties: spl.t.fill(0) goes through the getter not the setter. There's nothing to be done about it though, and this does not lead to new failure modes AFAICS.

Third, if t, c and x are torch tensors, then spl = BSpline(t, c, k); spl(x) should produce a torch tensor, too. To achieve this, we store the namespace on an instance, via self.xp = array_namespace(t, c) and the __call__ is

def __call__(self, x):
     x_np = np.asarray(x)
     out = heavy_lifting_in_c(x_np)
     return self.xp.asarray(out)

At the moment, the new attribute xp is public, since I don't see a particularly strong need to hide it. If anything, it can be useful to a user: otherwise there's no easy way for a user to tell what is the type of spl(1) without calling it.
If this turns out to be problematic, we can always hide the attribute behind a property. Or store xp.asarray as a private method.

One last small thing: since module objects are not picklable, we need to add a custom pickler to correctly handle the xp attribute.

@ev-br ev-br added scipy.interpolate array types Items related to array API support and input array validation (see gh-18286) labels May 20, 2025
@lucascolley
Copy link
Member

Sounds like a good idea from a high level!

@lucascolley lucascolley added the enhancement A new feature or improvement label May 20, 2025
@lucascolley lucascolley changed the title Add Array API interface to BSpline ENH: interpolate.BSpline: add array API standard interface May 20, 2025
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, thanks Evgeni.

At the moment, the new attribute xp is public, since I don't see a particularly strong need to hide it.

Private please! There is no need to leak this, even after array API support becomes fully public.

If anything, it can be useful to a user: otherwise there's no easy way for a user to tell what is the type of spl(1) without calling it.

If this info is needed - which should be rare - isinstance on one of the array attributes is still quite cheap, and more idiomatic.

If this turns out to be problematic, we can always hide the attribute behind a property. Or store xp.asarray as a private method.

Renaming it from .xp to ._xp seems most straightforward here.

@ev-br ev-br requested review from ilayn and larsoner as code owners May 21, 2025 08:41
@lucascolley lucascolley removed request for ilayn and larsoner May 21, 2025 09:18
@ev-br
Copy link
Member Author

ev-br commented May 21, 2025

Comments addressed.

Moving concat_1d into _lib runs into some sort of import cycle between array-api-compat and array-api-extra. Using a local import fixes the immediate problem, but I suppose it is brittle either way. The multivendoring contraption there is frankly rather crazy IMO. Bringing it back to sanity is out of scope for this PR, so if we'd rather avoid local imports, I can move concat_1d back to scipy.interpolate.

@lucascolley
Copy link
Member

lucascolley commented May 21, 2025

Good point. I think the simplest fix may be to move _array_api.array_namespace to a different file which doesn't import from _array_api. But yes, that is out of scope.

@ev-br
Copy link
Member Author

ev-br commented May 21, 2025

One last small thing: since module objects are not picklable, we need to add a custom pickler to correctly handle the xp attribute.

Actually, we don't if we store a private function instead of a private namespace: instead of self._xp.asarray we store self._asarray = array_namespace(t, c).asarray and all is well. The need to call other namespaced functions should be rare, and here it only happens in derivative and antiderivative, which are essentially constructors. So no performance considerations here, and the code is simpler, I'd say it's worth it.
The last commit makes this change.

Copy link
Member

@j-bowhay j-bowhay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, not looked at the tests yet

c = np.zeros_like(t)
c[k] = 1.

t, c = xp.asarray(t), xp.asarray(c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It slightly bugs me that we have to call xp.asarray on these for the sole purpose of being able to get the array namespace later but I guess the cost of doing so is pretty low

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugs me too, a bit. This method is not meant to be in a tight loop though, and the parts that are, have numpy arrays to work on.

c = self.c.copy()
c = self._asarray(self.c, copy=True)
t = self.t
xp = array_namespace(t, c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not store xp when the object is initialized?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's deliberate, actually:

#23020 (comment)

Copy link
Contributor

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if c and t are different arrays like BSpline(t=torch_array, c=numpy_array, 2)? (Not that it makes sense, but in case a user does it.)

@lucascolley
Copy link
Member

What happens if c and t are different arrays like BSpline(t=torch_array, c=numpy_array, 2)? (Not that it makes sense, but in case a user does it.)

raise TypeError(f"Multiple namespaces for array inputs: {namespaces}")

https://github.com/data-apis/array-api-compat/blob/6c708d13e826fb850161babf34f306fe80cae875/array_api_compat/common/_helpers.py#L665

ev-br added 8 commits July 11, 2025 17:41
convert BSpline (minus design_matrix, which returns a csr_array),
splder, splantider, make_interp_spline

Since heavy lifting is in C, only do array_namespace on inputs,
convert to numpy, do the work, convert to self.xp on exit.
xp.concat does not accept python scalars or 0D arrays,
so we need a helper for thing like `np.r_[0, x]` and
`np.r_[x[0], x]`.
Store a private `_asarray` function on a BSpline instance instead
of the namespace. This way, we do not need a custom getstate/setstate
or reduce or any other custom pickle support.
@ev-br ev-br force-pushed the bspline_array_api branch from 0d7e256 to 1bcb647 Compare July 11, 2025 16:02
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Co-authored-by: Jake Bowhay <60778417+j-bowhay@users.noreply.github.com>
@ev-br ev-br force-pushed the bspline_array_api branch from 1bcb647 to 94b83b2 Compare July 11, 2025 16:09
@ev-br
Copy link
Member Author

ev-br commented Jul 11, 2025

What happens if c and t are different arrays like BSpline(t=torch_array, c=numpy_array, 2)? (Not that it makes sense, but in case a user does it.)

What Lucas said: mixing array types in the constructor is an automatic TypeError from array_namespace. What is somewhat less clear-cut is mixing array types between __init__ and __call__ (in the scikit-learn language: .fit and .predict): what is BSpline(t=np_array, c=np_array, k)(torch_tensor)?

In this PR --- and I think this is correct in general --- the logic is that the array namespace is fixed by the arguments of the constructor; at call time, the arguments are cast to the array type of that fixed namespace:

In [1]: import torch

In [2]: import numpy as np

In [3]: from scipy.interpolate import BSpline, make_interp_spline

In [4]: spl = make_interp_spline(np.arange(10), np.arange(10)**2)

In [5]: spl_ = BSpline(torch.asarray(spl.t), torch.asarray(spl.c), spl.k)

In [7]: spl_(1)
Out[7]: tensor(1., dtype=torch.float64)

In [8]: spl_(np.asarray(1))
Out[8]: tensor(1., dtype=torch.float64)

@ev-br
Copy link
Member Author

ev-br commented Jul 11, 2025

Addressed all review comments.

@lorentzenchr
Copy link
Contributor

In this PR --- and I think this is correct in general --- the logic is that the array namespace is fixed by the arguments of the constructor; at call time, the arguments are cast to the array type of that fixed namespace

Maybe it does not matter much and consistency within scipy is more import, but as a user I would expect to get the same array back as I pass into __call__.

@lucascolley lucascolley self-requested a review July 11, 2025 18:37
@lucascolley lucascolley added this to the 1.17.0 milestone Jul 11, 2025
Copy link
Member

@lucascolley lucascolley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Evgeni, very close!

Comment on lines +357 to 360
xp = array_namespace(t)
t = np.asarray(t)
k = t.shape[0] - 2
t = _as_float_array(t)
Copy link
Member

@lucascolley lucascolley Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe consider renaming _as_float_array? It was not clear to me that it was something NumPy-specific. perhaps include 'c_contiguous'?


What is the precise reason we need to convert t to and from np in this method? Can the np.r_ call not be replaced with concat_1d?

If we only need to worry about C-contiguity for NumPy arrays, could we not make _as_float_array operate on xp arrays by using _array_api._asarray(..., order='C')?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No opinion on renaming _as_float_array. Send a patch if you have feel strongly about it?


Conversions to/from numpy: there are of course multiple ways to skin this one. It's used ten times,

$ git grep -n "_as_float_array" scipy/interpolate/ | wc -l
11

and IIRC other uses are after the conversion to numpy. So it felt easier to just do it like it's done here and be done with it. Care to send a patch if you see a way to make it nicer across the module?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, the only reason we are moving to np here is to use _as_float_array? If so, that seems bad.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ATM all alternative constructors, including basis_element, use this pattern. If a clean-up is desired, we can do it in a sequel, after #23052 which converts other users of _as_float_array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, if you add a TODO comment here to avoid the namespace conversion I am happy to merge!

Alternatively, I could try to come up with a patch in the coming days.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A patch would be great indeed @lucascolley! Meanwhile I'll then add a TODO comment to gh-23052 to avoid flushing the CI here.

check_splev(b1, j, der, 1e-12, 1e-12)


### stolen from @pv, verbatim
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

Copy link
Member

@lucascolley lucascolley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Evgeni, and for the review @lorentzenchr !

@ev-br
Copy link
Member Author

ev-br commented Jul 13, 2025

Merging as approved. Thanks to all reviewers!

@ev-br ev-br merged commit a055975 into scipy:main Jul 13, 2025
38 checks passed
@lucascolley lucascolley added the needs-release-note a maintainer should add a release note written by a reviewer/author to the wiki label Nov 21, 2025
@tylerjereddy tylerjereddy removed the needs-release-note a maintainer should add a release note written by a reviewer/author to the wiki label Dec 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

array types Items related to array API support and input array validation (see gh-18286) enhancement A new feature or improvement scipy.interpolate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants