
WIP, ENH: signal: test_spectral array API support #19001

Draft · wants to merge 22 commits into base: main
Conversation

@tylerjereddy (Contributor) commented Aug 1, 2023

Towards gh-20678

TODO:

signal array API test module checklist (probably to be moved to an issue, since this PR now focuses on a more limited scope than all of signal):
  • test_result_type.py
  • test_dltisys.py
  • test_waveforms.py
  • test_fir_filter_design.py
  • test_windows.py
  • test_czt.py
  • test_max_len_seq.py
  • test_wavelets.py (deprecated?)
  • test_spectral.py
  • test_ltisys.py
  • test_upfirdn.py
  • test_filter_design.py
  • test_savitzky_golay.py
  • test_array_tools.py
  • test_signaltools.py
  • test_bsplines.py
  • test_peak_finding.py
  • test_cont2discrete.py
  • test_short_time_fft.py

@ilayn (Member) left a comment:

Some comments; my ignorance about the array API notwithstanding, it seems we have a lot of work ahead to adopt this API. Unfortunately, the same seems to be true for array vendors. This is exactly what I was hoping to get out of my comments in #18915, so thanks a lot for this. Very illustrative.

A[:, 0] = xp.arange(1, Npts + 1, dtype=dtype) / Npts
sl = slice(int(bp[m]), int(bp[m + 1]))
# NOTE: lstsq isn't in the array API standard
if "cupy" in xp.__name__ or "torch" in xp.__name__:

Isn't this library-specific conditional an anti-pattern already?

new_bp[:size(bp)] = bp
new_bp[size(bp)] = N
bp = xp.sort(xp.unique(xp.asarray([0] + new_bp)))
if xp.any(bp > N):

We sorted bp already, so you can compare just the last element instead to save some pennies.

The error message is also a bit cryptic, though that's not directly relevant here.

new_bp = xp.empty(size(bp) + 1)
new_bp[:size(bp)] = bp
new_bp[size(bp)] = N
bp = xp.sort(xp.unique(xp.asarray([0] + new_bp)))

Isn't there some way we can avoid doing this kind of back and forth?

I found this https://data-apis.org/array-api/latest/API_specification/generated/array_api.unique_values.html?highlight=unique#array_api.unique_values but it doesn't have the NumPy functionality.

For example, numpy.unique has a return_index keyword that removes the need for sorting, but I don't know whether the array API supports it. That is my biggest concern about adopting this API: potential code divergence later in different packages. Mini demo:

>>> import numpy as np
>>> arr = np.array([1.0, np.sqrt(2), np.sqrt(2), 1/3, 1/3, np.pi])
>>> arrun, ind = np.unique(arr, return_index=True)
>>> arr[ind]
array([0.33333333, 1.        , 1.41421356, 3.14159265])

So indeed you can embed bp into a larger array:

new_bp = xp.empty(size(bp)+2, dtype=dtype)
new_bp[0] = 0
new_bp[-1] = N  # -1 is not working anymore in array_api?
new_bp[1:-1] = bp 

I know bp is typically a small array, but the code still gets really complicated for a relatively simple operation.

Reply (Member):

Yes, this code should not get this complicated. np.unique(..., return_index=True) can be replaced by xp.unique_all. I'm not sure whether that gets rid of the sorting operation here, because sorting is already in the original code; but if that original code can be improved first, there is a matching unique_* function that can be used.
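For illustration, the embedding suggested above can be written without the list round-trip. This is a sketch only, with NumPy standing in for the `xp` namespace; `normalize_breakpoints` is a hypothetical helper name, not code from this PR:

```python
import numpy as np  # stand-in for an array API namespace ``xp``

def normalize_breakpoints(bp, N):
    # Embed bp between the mandatory endpoints 0 and N instead of
    # concatenating Python lists, then deduplicate. Under the array
    # API standard, np.unique would become xp.unique_values; NumPy's
    # unique also returns the values sorted, which the caller relies on.
    bp = np.asarray(bp)
    new_bp = np.empty(bp.size + 2, dtype=bp.dtype)
    new_bp[0] = 0
    new_bp[-1] = N
    new_bp[1:-1] = bp
    new_bp = np.unique(new_bp)  # xp.unique_values(new_bp), plus sorting
    # new_bp is now sorted, so checking the last element suffices
    if new_bp[-1] > N:
        raise ValueError(f"breakpoints must not exceed data length N = {N}")
    return new_bp
```

With `bp = [3, 1, 3]` and `N = 10` this yields `[0, 1, 3, 10]`.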

Pxy = np.median(Pxy, axis=-1)
Pxy /= bias
Pxy = xp.median(Pxy, axis=-1)
# for PyTorch, Pxy is torch.return_types.median

Same anti-pattern comment here and a bit below. Isn't the array API unapologetically expecting all arrays to uniformly have the same methods and properties?

try:
is_complex = xp.iscomplexobj(x)
except AttributeError:
# torch shim

Same troubling line

if "torch" in xp.__name__:
strides = x.stride()[:-1]+(step*x.stride()[-1], x.stride()[-1])
result = xp.as_strided(x, size=shape, stride=strides)
elif "cupy" in xp.__name__:

This is more significant to me than the performance hit below. I would instead keep the NumPy code branch, to preserve fair-weather SciPy performance first, and deal with other libraries later, whether or not they support this.

if type in ['constant', 'c']:
ret = data - np.mean(data, axis, keepdims=True)
ret = data - xp.mean(xp.asarray(data, dtype=dtype), axis=axis, keepdims=True)

Can't we just do the conversion with the right dtype once at the top? It seems like a lot of asarray calls are creeping in unintentionally.
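A sketch of what that could look like, with the dtype conversion done once up front (NumPy stands in for `xp`; `detrend_constant` is a hypothetical name, not SciPy's actual implementation):

```python
import numpy as np  # stand-in for ``xp``

def detrend_constant(data, axis=-1):
    data = np.asarray(data)
    # Convert to a floating dtype once, at the top, instead of
    # sprinkling asarray calls through the function body.
    if data.dtype not in (np.float32, np.float64,
                          np.complex64, np.complex128):
        data = data.astype(np.float64)  # xp.astype(data, xp.float64)
    # 'constant' detrending: subtract the mean along ``axis``
    return data - np.mean(data, axis=axis, keepdims=True)
```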

@ev-br (Member) commented Aug 1, 2023

One other, potentially tangentially related thing while we're on the subject of supporting other array types: there is an ongoing effort to provide a clone of the scipy.signal API in CuPy, as cupyx.scipy.signal. Some of these functions might be implemented using dedicated CUDA kernels, etc.

So there might be value in trying to detect upfront whether all arguments are CuPy-compatible and delegating to the matching cupyx.scipy.signal function. This does not, of course, preclude transforming the internal implementations to the array API so that other implementors can benefit; just something to keep in mind for the future. Maybe we'd want to invent a way to register an alternative implementation and its preconditions.

@h-vetinari (Member) commented Aug 2, 2023

So there might be value in trying to detect upfront if all arguments are cupy-compatible and delegate to the matching cupyx.scipy.signal function.

I'm not sure that's such a good thing. It would give us an optional dependency on CuPy, which itself optionally depends on SciPy. That's a kind of weak dependency cycle (though solvable), but mainly it would complicate our testing here (at least one job each with and without CuPy; and to really test the GPU path, we'd need a GPU in our CI), and for, IMO, very little gain.

If it stays only in cupy, the message is clear: use the cupy compat layer to utilize CUDA. If we start optionally dispatching to cupy, it becomes much harder to explain to users what they can expect.

I'm not saying it's a dealbreaker, but a priori I'm not enamoured with the idea.

@rgommers rgommers added the array types Items related to array API support and input array validation (see gh-18286) label Aug 16, 2023
@lucascolley (Member) left a comment:

Hi Tyler, looks like the progress we have made elsewhere will put this PR in much better shape after a few changes!

newdata_shape = newdata.shape
newdata = newdata.reshape(N, -1)

if not overwrite_data:
newdata = newdata.copy() # make sure we have a copy
if newdata.dtype.char not in 'dfDF':
newdata = xp.asarray(newdata, copy=True) # make sure we have a copy

We can use copy from _array_api.py now.

newdata = newdata.copy() # make sure we have a copy
if newdata.dtype.char not in 'dfDF':
newdata = xp.asarray(newdata, copy=True) # make sure we have a copy
if newdata.dtype not in [xp.float64, xp.float32, xp.complex128, xp.complex64]:
newdata = newdata.astype(dtype)

this should be xp.astype(newdata, dtype).

if type in ['constant', 'c']:
ret = data - np.mean(data, axis, keepdims=True)
ret = data - xp.mean(xp.asarray(data, dtype=dtype), axis=axis, keepdims=True)

xp.astype(data, dtype) is clearer

A[:, 0] = xp.arange(1, Npts + 1, dtype=dtype) / Npts
sl = slice(int(bp[m]), int(bp[m + 1]))
# NOTE: lstsq isn't in the array API standard
if "cupy" in xp.__name__ or "torch" in xp.__name__:

Perhaps it's best to just use scipy.linalg in this PR and let it fail? Then, once lstsq works with these libraries, no changes will be needed here.

A separate PR could address this library-specific special-casing for now, like Matt's gh-19023. I think it makes sense to separate the array API work, which is here to stay, and this sort of "dispatching" which will hopefully be replaced by something more robust eventually.

Pxy = np.median(Pxy, axis=-1)
Pxy /= bias
Pxy = xp.median(Pxy, axis=-1)
# for PyTorch, Pxy is torch.return_types.median

Looks like we need to write an array-agnostic median à la cov (or push for it to be added to the standard). Again, I would remove the special-casing for now and let this fail, so that it's clear where work needs to be done.

x = np.asarray(x)
# Ensure we have xp.arrays, get outdtype
x = xp.asarray(x)
# https://github.com/data-apis/array-api-compat/issues/43

This should be possible to clean up now, after data-apis/array-api-compat#55.


if return_onesided:
if np.iscomplexobj(x):
try:
is_complex = xp.iscomplexobj(x)

Looks like we should add an is_complex to _array_api.py, which I think would be x.dtype in {xp.complex64, xp.complex128}.
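A minimal sketch of such a helper (hypothetical, written against NumPy as the `xp` stand-in; the eventual `_lib._array_api` version may differ):

```python
import numpy as np  # stand-in for ``xp``

def is_complex(x, xp=np):
    # Portable replacement for np.iscomplexobj: the array API standard
    # guarantees exactly these two complex dtypes exist in a namespace.
    return x.dtype == xp.complex64 or x.dtype == xp.complex128
```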


# Detrend each data segment individually
result = detrend_func(result)

# Apply window by multiplication
# NOTE: torch device shim -- needs

This should be fixed after pytorch/pytorch#106773 I think.

x = np.zeros(16)
assert_allclose(array_api_compat.to_device(p, "cpu"),
q, atol=1e-7, rtol=1e-7)
assert_allclose(array_api_compat.to_device(f, "cpu"),
@lucascolley (Member) commented Sep 7, 2023:

Quite a few changes to be made here to use the new xp_assert_close.

f, p = welch([])
@array_api_compatible
def test_empty_input(self, xp):
val = xp.asarray([]) if SCIPY_ARRAY_API else []

Suggested change
val = xp.asarray([]) if SCIPY_ARRAY_API else []
val = xp.asarray([])

@tylerjereddy (Contributor, Author):

I rebased on the latest main and made a few fixups, but CI should be skipped entirely for now because I still need to address tons of comments and modernizations following all the recent array API infra progress.

For SCIPY_DEVICE=cuda python dev.py test -j 32 -b all on this branch, I get 39 failures, vs. 9 on main locally.

Most of the failures are of the form AttributeError: module 'numpy.array_api' has no attribute 'fft'. There are also some threading failures I saw with CUDA + high concurrency that Ralf and Lucas did not; in any case, this branch still needs more work.

@ilayn (Member) commented Oct 28, 2023

You can ignore most of my comments, as we discussed bits and pieces of this in different issues (mainly the linalg one), and my understanding is now better than it was when I was punching out those comments above. Hence, all good from my side.

freqs, Pxx = csd(x, x, fs=fs, window=window, nperseg=nperseg,
noverlap=noverlap, nfft=nfft, detrend=detrend,
return_onesided=return_onesided, scaling=scaling,
axis=axis, average=average)

return freqs, Pxx.real
if Pxx.dtype in {xp.complex64, xp.complex128}:

We have an is_complex helper in _lib._array_api.py

Comment on lines 1919 to 1921
is_complex = x.dtype in {xp.complex64, xp.complex128}

if is_complex:

This is in _lib now.

Comment on lines 1936 to 1945
if hasattr(xp, 'fft'):
freqs = xp.fft.fftfreq(nfft, 1/fs)
else:
freqs = xp.asarray(np.fft.fftfreq(nfft, 1/fs))

elif sides == 'onesided':
freqs = sp_fft.rfftfreq(nfft, 1/fs)
if hasattr(xp, 'fft'):
freqs = xp.fft.rfftfreq(nfft, 1/fs)
else:
freqs = xp.asarray(np.fft.rfftfreq(nfft, 1/fs))

I think that freqs = scipy.fft.fftfreq(..., xp=xp) will work here, with negligible overhead.

else:
result = result.real
func = sp_fft.rfft
if result.dtype in {xp.complex64, xp.complex128}:

Can use is_complex here too.

Comment on lines 2056 to 2059
if hasattr(xp, "fft"):
func = xp.fft.fft
else:
func = np.fft.fft

scipy.fft.fft?

Comment on lines 2063 to 2066
if hasattr(xp, "fft"):
func = xp.fft.rfft
else:
func = np.fft.rfft

scipy.fft.rfft?

newdata = newdata.copy() # make sure we have a copy
if newdata.dtype.char not in 'dfDF':
newdata = xp.asarray(newdata, copy=True) # make sure we have a copy
if newdata.dtype not in [xp.float64, xp.float32, xp.complex128, xp.complex64]:

Maybe worth a new is_floating helper?

if np.iscomplexobj(Pxy):
Pxy = (np.median(np.real(Pxy), axis=-1)
+ 1j * np.median(np.imag(Pxy), axis=-1))
if Pxy.dtype in [xp.complex64, xp.complex128]:

is_complex here too.

sides = 'twosided'
warnings.warn('Input data is complex, switching to '
'return_onesided=False')
else:
sides = 'onesided'
if not same_data:
if np.iscomplexobj(y):
if xp.iscomplexobj(y):

Looks like this isn't in the standard? Do we just need is_complex here too?

nperseg - noverlap)/float(fs)
if boundary is not None:
time -= (nperseg/2) / fs

result = result.astype(outdtype)
result = xp.asarray(result, dtype=outdtype)

xp.astype(...) is clearer

@tylerjereddy (Contributor, Author) replied:

This one was intentional because astype fails with E AttributeError: 'numpy.ndarray' object has no attribute '_array'. I didn't look into it any deeper, but it is for the numpy.array_api backend.

Reply (Member):

hmm maybe result is arriving as a regular np array? IIUC it should be a np.array_api array so astype should work.

result = xp.as_strided(x, size=shape, stride=strides)
elif "cupy" in xp.__name__:
strides = x.strides[:-1]+(step*x.strides[-1], x.strides[-1])
result = xp.lib.stride_tricks.as_strided(x, shape=shape,

As mentioned above, I think the special-casing should be saved for a separate PR.

@tylerjereddy (Contributor, Author):

I addressed a few more comments (not all, yet). Some notes on where I'm seeing challenges:

  • some of scipy.fft needed shims for device keyword handling with cupy and NumPy it seems, at least in my hands
  • xp.mean() strictly requires "real" type input, though most backends let us cheat on that for now
  • the xp.unique reorganization isn't widely adopted yet, I don't think? Added a shim for that (unique_all, unique_values, ...)
  • some test skips added for numpy.array_api because of much stricter casting rules in some cases
  • perhaps most importantly, almost all current failures related to TestWelch come from the lack of moveaxis in the array API standard (numpy.array_api; with other backends we can cheat because they have it). Any suggestion for a graceful substitute?
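One possible graceful substitute, since `permute_dims` is in the standard: a single-axis `moveaxis` built from a transposition (a sketch only, with NumPy standing in for `xp`; this is not SciPy's eventual shim):

```python
import numpy as np  # stand-in for ``xp``

def moveaxis(x, source, destination):
    # Build the permutation that moves one axis, then apply it with
    # transpose (xp.permute_dims under the array API standard).
    ndim = x.ndim
    source %= ndim        # normalize negative axis indices
    destination %= ndim
    order = [n for n in range(ndim) if n != source]
    order.insert(destination, source)
    return np.transpose(x, order)  # xp.permute_dims(x, tuple(order))
```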

if hasattr(xp, 'fft') and xp.__name__ != 'numpy':
if (hasattr(xp, 'fft') and xp.__name__ != 'numpy' and
"array_api_compat.numpy" not in xp.__name__ and
"cupy" not in xp.__name__):
@lucascolley (Member) commented Oct 29, 2023:

The compat numpy shim looks like a correct bug fix - you can use is_numpy from _lib to make this cleaner. We didn't catch this since we only test this function with 'raw' np arrays.

Edit: you may want is_numpy etc. to accept the raw versions as well as the compat versions. IIRC this shouldn't break anything in fft, I left it as just the compat versions as that was all we needed at the time.

For CuPy, I think we should remove this shim and skip any tests on CuPy which fail because of it. Then once array-api-compat implements fft, we can just remove the skip. The temporary shims seem unnecessary since there is no rush to get CuPy working partially in a release, but that's just my opinion.

We could add a comment into the code here to state that array-api-compat hasn't yet implemented fft if you want to; there will be a few issues like this until it has.

@@ -205,7 +207,9 @@ def rfftfreq(n, d=1.0, *, xp=None, device=None):
xp = np if xp is None else xp
# numpy does not yet support the `device` keyword
# `xp.__name__ != 'numpy'` should be removed when numpy is compatible
if hasattr(xp, 'fft') and xp.__name__ != 'numpy':
if (hasattr(xp, 'fft') and xp.__name__ != 'numpy' and
"array_api_compat.numpy" not in xp.__name__ and

is_numpy here too.

dtype = 'd'
data = xp.asarray(data)
dtype = data.dtype
if data.dtype not in [xp.float32, xp.float64, xp.complex64, xp.complex128]:

A new is_floating helper could be used here too, as not is_floating(...).

@tylerjereddy (Contributor, Author):

From the latest commit:

    MAINT: PR 19001 revisions
    
    * replace an occurrence of `xp.is_complex()` with
    `is_complex()` (our internal shim) in `_spectral_helper()`;
    `is_complex` is not standard
    
    * many tests in `TestWelch` are now skipped with
    `numpy.array_api` backend because of the lack of the
    non-standard `moveaxis`
    
    * use our internal `is_complex()` helper in `csd()` as
    well, based on reviewer feedback
    
    * based on reviewer feedback, use `is_numpy(xp)` in the
    adjusted `fft._helper` filters, and don't filter `cupy`
    here; as a result, a number of `TestWelch` tests have
    a conditional skip added for `cupy` backend until `array_api_compat`
    shims around `fft`
    
    * clean up some unused imports based on linter whining
    
    * this patch appears to place this branch on parity with `main`
    for CUDA-backend array API full suite testing, though major
    tasks remain to be addressed, including switching to newer
    API testing machinery and probably covering more of `signal`
    before we'd seriously consider moving forward
    
    [skip ci] [skip circle]

@tylerjereddy (Contributor, Author):

I think I've pushed in most of the modernization to new xp_assert.. testing machinery now. Some things that came up while doing that:

  • assert_array_almost_equal_nulp doesn't have an equivalent array API testing shim yet?
  • in some cases I have to turn off namespace or dtype checking since these new tests are stricter (I bet some of those we could enforce with careful design, others probably need to stay for now...)
  • a small number of cases I couldn't replace because of non-compliant ops in the assertion lines themselves (or, at least, it may require more thought... like np.trapz and so on)

@tylerjereddy (Contributor, Author):

The draft conversion of TestCSD to array API testing was pushed, though it is slightly depressing because a huge number of skips were needed for moveaxis, np.pad, and array-api-compat fft support. That said, I believe there is still some value in clearly labeling the active blockers for each test, since it makes it clearer which things are waiting on the ecosystem, and makes it easier for reviewers to identify anything they don't agree should be a blocker/skipped.

Perhaps I'll try migrating TestPeriodogram next.

@tylerjereddy tylerjereddy changed the title WIP, ENH: signal (welch) array API support WIP, ENH: signal array API support Nov 8, 2023
@lucascolley (Member) commented Nov 14, 2023

Want to move from numpy.testing.assert_* to bare assert while we're here? (Maybe just for what is already diffed, in which case line 1545.)

@tylerjereddy (Contributor, Author):

All tests in test_spectral.py should now have draft array API testing implementations/conversions, along with skip markers where I needed them. I've also updated the original PR comment above to contain a checklist of the signal test module migration progress. So, that's just the first one.

@@ -18,120 +17,200 @@
from scipy.signal.tests._scipy_spectral_test_shim import stft_compare as stft
from scipy.signal.tests._scipy_spectral_test_shim import istft_compare as istft
from scipy.signal.tests._scipy_spectral_test_shim import csd_compare as csd
from scipy.conftest import array_api_compatible, skip_if_array_api_backend
from scipy._lib.array_api_compat import array_api_compat

This import path was cleaned up recently.

Suggested change
from scipy._lib.array_api_compat import array_api_compat
from scipy._lib import array_api_compat

@lucascolley lucascolley added the enhancement A new feature or improvement label Dec 23, 2023
@lucascolley lucascolley marked this pull request as draft March 14, 2024 21:33
@tylerjereddy tylerjereddy changed the title WIP, ENH: signal array API support WIP, ENH: signal array API support (test_spectral) Mar 27, 2024
* use `xp.real(x)` instead of `x.real` for API compat
(similar shim for `mean`)

* guard use of `fft` optional extension

[skip ci] [skip circle]
* handle "dispatch" to FFT functionality via `scipy.fft`,
based on reviewer feedback; this also allowed usage of
`xp.astype()` in at least one additional place because
my manual FFT shims did not convert back to the appropriate
array type

* I did need some `scipy.fft` shims for `device` keyword
handling though...

* add a shim for `unique` vs. `unique_values`

* some test skips for `numpy.array_api` dtype/casting
issues

* use `is_complex()` in several places, based on reviewer
feedback

[skip ci] [skip circle]
* replace an occurrence of `xp.is_complex()` with
`is_complex()` (our internal shim) in `_spectral_helper()`;
`is_complex` is not standard

* many tests in `TestWelch` are now skipped with
`numpy.array_api` backend because of the lack of the
non-standard `moveaxis`

* use our internal `is_complex()` helper in `csd()` as
well, based on reviewer feedback

* based on reviewer feedback, use `is_numpy(xp)` in the
adjusted `fft._helper` filters, and don't filter `cupy`
here; as a result, a number of `TestWelch` tests have
a conditional skip added for `cupy` backend until `array_api_compat`
shims around `fft`

* clean up some unused imports based on linter whining

* this patch appears to place this branch on parity with `main`
for CUDA-backend array API full suite testing, though major
tasks remain to be addressed, including switching to newer
API testing machinery and probably covering more of `signal`
before we'd seriously consider moving forward

[skip ci] [skip circle]
* start modernizing the array API tests to use the new
`xp_assert..` functions

[skip ci] [skip circle]
* continue (finish?) modernizing the array API tests to use the new
`xp_assert..` functions

[skip ci] [skip circle]
* start converting `TestCSD` to array API testing; most of the tests
I've converted require substantial backend skipping because of i.e.,
`moveaxis` and `pad` usage, or lack of `fft` coverage by
array-api-compat

[skip ci] [skip circle]
* finish the draft conversion of `TestCSD` to array
API testing; unfortunately, a large number of backend
skips were needed across the board (many of these are
known issues like `moveaxis` and array_api_compat fft support)

[skip ci] [skip circle]
* convert 8 tests in `TestPeriodogram` to array API testing approach,
along with appropriate skips where needed

[skip ci] [skip circle]
* convert 4 tests in `TestPeriodogram` to array API testing approach,
along with appropriate skips where needed

[skip ci] [skip circle]
* convert remaining tests in `TestPeriodogram` to array API testing approach,
along with appropriate skips where needed

[skip ci] [skip circle]
* convert tests in `TestSpectrogram` to array API testing approach,
along with appropriate skips where needed

[skip ci] [skip circle]
* convert tests in `TestLombscargle` to array API testing approach,
along with appropriate skips where needed

[skip ci] [skip circle]
* convert `TestSTFT::test_input_validation` to array API testing approach,
along with appropriate skips where needed

* this required a fairly substantial number of changes, and we still
end up having to skip the usual three backends, for now

[skip ci] [skip circle]
* convert the next two `TestSTFT` tests to array API testing approach

[skip ci] [skip circle]
* convert the remaining `test_spectral.py` tests to array API testing approach

[skip ci] [skip circle]
* deal with large rebasing mess and the need
to adopt modern array API testing procedures
that have changed a few times in SciPy now...

* there are still some array API test failures
in `signal` to deal with...

[skip ci] [skip circle] [skip cirrus]
* fix failing `test_window_correction`, `test_window_external`,
`test_roundtrip_real`

* add some missing imports to `test_spectral.py` (these were
causing test failures in the regular suite)

* restore some custom/fast paths for `sliding_window_view`

[ci skip] [skip ci]
* haven't flushed the CI on this branch in months

[skip cirrus] [skip circle]
* fix linter issues reported in CI

* clean up a test failure related to `np.trapz`
usage detected by the CI

[skip cirrus] [skip circle]
* fix a few testing issues: NumPy `trapezoid` usage
is now conditional on NumPy version, `test_roundtrip_not_nola`
skipping reasons are now properly formatted, and `_spectral_helper`
has a broader check on the array namespace

[skip cirrus] [skip circle]
@tylerjereddy (Contributor, Author) left a comment:

@lucascolley @mdhaber this is passing the array API tests in CI again, and locally with GPU, but clearly still has issues given some of my recent comments and the fact that things have evolved over the months since I started working on this.

I believe a firm decision on how we're going to handle the sliding window view would be useful. Should we special-case CuPy, torch, and NumPy so they are a few orders of magnitude faster with welch, but error out for other hypothetical array API backends? Or add the slow for loops for any other array API backends, so that they likely suffer a major performance hit but can at least execute?
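For concreteness, the slow-but-portable branch could look something like the sketch below (`sliding_windows_fallback` is a hypothetical name; the fast as_strided / sliding_window_view paths for NumPy, CuPy, and torch would stay special-cased). It gathers each segment with plain indexing and stacks them, using only operations the standard provides:

```python
import numpy as np  # stand-in for ``xp``

def sliding_windows_fallback(x, nperseg, step):
    # One slice per window; xp.stack is in the array API standard,
    # unlike as_strided / sliding_window_view. This makes O(n_windows)
    # copies, hence orders of magnitude slower than the strided view.
    starts = range(0, x.shape[-1] - nperseg + 1, step)
    return np.stack([x[..., s:s + nperseg] for s in starts], axis=-2)
```

For `x = np.arange(8)`, `nperseg=4`, `step=2`, this produces windows starting at 0, 2, and 4.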

# see: https://github.com/data-apis/array-api/issues/641#issuecomment-1604884351
if is_cupy(xp):
result = xp.lib.stride_tricks.sliding_window_view(
x, window_shape=nperseg, axis=-1, writeable=True
@tylerjereddy (Contributor, Author):

Trying to re-run the benchmarks from the SciPy conference array API proceedings paper (https://github.com/data-apis/scipy-2023-presentation) with CuPy 13.1.0 (which actually has sliding_window_view), I see:

  File "/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/cupy/lib/stride_tricks.py", line 119, in sliding_window_view
    raise NotImplementedError("Writeable views are not supported.")
NotImplementedError: Writeable views are not supported.

@@ -2007,10 +2060,11 @@ def _fft_helper(x, win, detrend_func, nperseg, noverlap, nfft, sides):
if sides == 'twosided':
func = sp_fft.fft
else:
result = result.real
if is_complex(result, xp=xp):
@tylerjereddy (Contributor, Author):

Just above here, torch on GPU fails the proceedings paper benchmarks as well with:

Running:  python bench.py scipy torch_gpu
Traceback (most recent call last):
  File "/home/treddy/github_projects/scipy-2023-presentation/benchmarks/bench.py", line 130, in <module>
    main(benchmark, namespace)
  File "/home/treddy/github_projects/scipy-2023-presentation/benchmarks/bench.py", line 78, in main
    bench(namespace, print_times=False)
  File "/home/treddy/github_projects/scipy-2023-presentation/benchmarks/bench.py", line 119, in bench_scipy
    f, p = welch(x, nperseg=8)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/scipy/signal/_spectral_py.py", line 467, in welch
    freqs, Pxx = csd(x, x, fs=fs, window=window, nperseg=nperseg,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/scipy/signal/_spectral_py.py", line 610, in csd
    freqs, _, Pxy = _spectral_helper(x, y, fs, window, nperseg, noverlap,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/scipy/signal/_spectral_py.py", line 1963, in _spectral_helper
    result = _fft_helper(x, win, detrend_func, nperseg, noverlap, nfft, sides)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/scipy/signal/_spectral_py.py", line 2057, in _fft_helper
    result = win * result
             ~~~~^~~~~~~~
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

# violating the array API standard gives us 4 orders of magnitude
# better performance via views/striding, so preserving some special
# casing
# see: https://github.com/data-apis/array-api/issues/641#issuecomment-1604884351
Contributor Author

I believe this branch used to have shims for array API support for libraries other than CuPy and torch, along the lines of the suggestion at #18286 (comment).

The underlying code in main also switched from as_strided to sliding_window_view a few months ago, so this code path has mutated for the Nth time. Either way, this passes the array API tests locally with my current skips, but it doesn't really support the array API standard unless we restore a codepath that, per the linked comment above, is a few orders of magnitude slower.
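A standard-compliant fallback for other backends — the orders-of-magnitude-slower codepath mentioned above — could look something like this sketch (`_segment_fallback` is a hypothetical name, not the branch's actual code):

```python
import numpy as np

def _segment_fallback(x, nperseg, noverlap, xp=np):
    # Array-API-friendly fallback: build each overlapping segment
    # by slicing along the last axis and stack the copies, instead
    # of relying on non-standard strided views.
    step = nperseg - noverlap
    starts = range(0, x.shape[-1] - nperseg + 1, step)
    return xp.stack([x[..., s:s + nperseg] for s in starts], axis=-2)

segs = _segment_fallback(np.arange(10.0), nperseg=4, noverlap=2)
```

This uses only standard array API operations (indexing and `stack`), so any conforming namespace could execute it — at the cost of one copy per segment.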

Member

This looks pretty good. I'd only suggest moving the sliding_window_view function to scipy/_lib/_array_api.py. Then it will work with a one-line change here, and later in any other place where we may want to use sliding_window_view.

Failing for now on non-numpy/cupy/torch and just adding a TODO comment seems okay.
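A sketch of what such a shared helper in scipy/_lib/_array_api.py might look like (hypothetical; the cupy and torch branches assume cupy.lib.stride_tricks.sliding_window_view and Tensor.unfold respectively, and other namespaces fail with a TODO as suggested):

```python
import numpy as np

def sliding_window_view(x, window_shape, xp=np):
    # Hypothetical shared helper: dispatch to a native zero-copy
    # sliding-window implementation where one is known to exist.
    name = xp.__name__
    if name == 'numpy':
        return np.lib.stride_tricks.sliding_window_view(x, window_shape)
    if 'cupy' in name:
        return xp.lib.stride_tricks.sliding_window_view(x, window_shape)
    if 'torch' in name:
        # torch spells the same view as Tensor.unfold(dim, size, step)
        return x.unfold(-1, window_shape, 1)
    # TODO: fall back to a slow copying implementation for other
    # array API namespaces instead of failing outright.
    raise NotImplementedError(f"no sliding_window_view for {name}")

out = sliding_window_view(np.arange(6.0), 3, xp=np)
```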

@tylerjereddy
Contributor Author

I guess I could also scope the relevant parts of signal into array API CI testing here, though there's still a fair bit of code work to do before it's worth churning the CI just yet.

@lucascolley
Member

> Should we special case CuPy, torch, and NumPy so they are a few orders of magnitude faster with welch?
>
> Or add the slow for loops for any other hypothetical array API backends, so they likely suffer a major performance hit but at least can execute?

'Both' sounds like the right answer to me.

@tylerjereddy
Contributor Author

One other question as I scan through the diff of test skips: any movement or guidance on assert_array_almost_equal_nulp-style tests for the array API?
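One possible backend-agnostic stand-in, assuming no official guidance yet: translate an n-ulp tolerance into a relative tolerance of roughly n * eps, usable with any assert_close-style helper (`nulp_as_rtol` is a hypothetical name, not an established scipy pattern):

```python
import numpy as np

def nulp_as_rtol(dtype, nulp=1):
    # Near 1.0 one ulp equals the machine epsilon, so an n-ulp
    # check maps roughly onto a relative tolerance of n * eps.
    return nulp * np.finfo(dtype).eps

rtol = nulp_as_rtol(np.float64, nulp=2)
np.testing.assert_allclose(1.0, 1.0 + np.finfo(np.float64).eps, rtol=rtol)
```

This is only an approximation: ulp spacing varies across the exponent range, so it is looser than a true nulp comparison away from 1.0.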

return ret
else:
dshape = data.shape
N = dshape[axis]
bp = np.sort(np.unique(np.concatenate(np.atleast_1d(0, bp, N))))
if np.any(bp > N):
if isinstance(bp, int):
Member

This if-else doesn't look like it should be here. bp is an array-like of ints; just coercing it to an array on its own line should be the way to go.
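The unconditional coercion being suggested could be a single line, mirroring the breakpoint handling already shown in the diff context above (sketch with made-up values for N and bp):

```python
import numpy as np

N = 10   # number of samples along the detrend axis
bp = 4   # breakpoints: a scalar or array-like of ints
# Single unconditional coercion instead of an int/array if-else;
# atleast_1d handles scalars and sequences alike.
bp = np.sort(np.unique(np.concatenate(([0], np.atleast_1d(bp), [N]))))
if np.any(bp > N):
    raise ValueError("Breakpoints must be less than length "
                     "of data along given axis.")
```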

@lucascolley lucascolley changed the title WIP, ENH: signal array API support (test_spectral) WIP, ENH: signal: test_spectral array API support May 17, 2024
@DietBru
Contributor

DietBru commented May 23, 2024

Sorry for being a bit late to the discussion. An alternative solution to the sliding_window_view problem would be to implement a _spectral_helper-free csd function. This would allow deprecating the following functions in file _spectral_py.py:

For testing feature parity between the existing _spectral_helper and the new ShortTimeFFT class, I had implemented _spect_helper_csd, which is a ShortTimeFFT-based _spectral_helper implementation for wrapping the csd function in unit testing. It could be used to replace the existing csd function (since it already passes all unit tests).

Until now, I did not really see the need, since the replacement would not improve the functionality of csd.
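The idea of computing csd without _spectral_helper — averaging the windowed per-segment FFT products directly — can be sketched in plain NumPy (a toy Welch-style estimator, not DietBru's _spect_helper_csd; no detrending, and only Hann/50%-overlap defaults):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def toy_csd(x, y, fs=1.0, nperseg=8):
    # Hann-windowed, 50%-overlap cross spectral density: average
    # the per-segment conj(X) * Y products over all segments.
    win = np.hanning(nperseg)
    step = nperseg // 2
    xs = sliding_window_view(x, nperseg)[::step] * win
    ys = sliding_window_view(y, nperseg)[::step] * win
    X, Y = np.fft.rfft(xs), np.fft.rfft(ys)
    scale = 1.0 / (fs * (win * win).sum())
    Pxy = scale * (np.conj(X) * Y).mean(axis=0)
    Pxy[1:-1] *= 2  # one-sided spectrum: fold in negative frequencies
    return np.fft.rfftfreq(nperseg, 1 / fs), Pxy
```

Passing the same signal for x and y reduces this to a Welch power spectral density estimate, which is how welch wraps csd in scipy.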

Labels
- array types: items related to array API support and input array validation (see gh-18286)
- enhancement: a new feature or improvement
- scipy.signal