Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: signal: further refactor spline filters #20519

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

ev-br
Copy link
Member

@ev-br ev-br commented Apr 18, 2024

Reference issue

Based off #18926, will need a rebase when that one lands.
Supersedes and closes #14558
closes #13643

I initially meant to work out #9209, but turns out it is better addressed separately.
#12691 is related, but no change to it in this PR.

What does this implement/fix?

  • refactor _splinemodule from tempita-templated C into C++;
  • use standard C++ complex arithmetics instead of GNU extensions, remove #ifdef GNUC guards
  • consolidate the python side: merge _splines.py and _bsplines.py. Previously it was not clear what is where and why; The resulting python module is under 1kLOC, so it's OK, I think;
  • incrementally improve test coverage of sepfir2d and spline_filter.

Additional information

This is pure maintenance in the end. I think it makes sense to land this separately from gh-18926: the latter does non-trivial algorithmic improvements, and this one just shuffles code around.

The main improvement is probably the complex-valued arithmetics: #13643 was due to the implementation probably predating C99; this PR uses standard tools instead; the python <-> C++ glue parrots what scipy.special does (thanks @steppi for the explanation!).

Here's a bigger picture, as far as I understand it:

  • signal.spline_filter is IIUC used for image denoising
  • spline_filter itself is a thin wrapper over signal.symirrorder{1,2} and signal.sepfir2d
  • @andfoy refactored symiirorder filters in ENH: Move symiirorder1/2, cspline2d, qspline2d and spline_filter to use sosfilt/lfilter #18926 to use lfilter / sosfilt instead of... uhm... bespoke manual implementations from the multipack era.
  • this PR does a technical refactor of the internals on top of Edgar's PR, plus some small cleanups of the python side.
    sepfir2d is still using C code from 2003, only minimally modernized.

A larger clean-up of sepfir2d could be to take a look at what CuPy does: it delegates a large chunk of signal convolution/filtering to ndimage (https://github.com/cupy/cupy/blob/v13.0.0/cupyx/scipy/signal/_bsplines.py#L72). Maybe SciPy can follow suit and remove a large amount of C code in scipy.signal.
Note that this is where gh-13643 originates.
This will require figuring out what CuPy actually does, and a careful testing, benchmarking etc etc. So it's a somewhat sizeable project, and is best done separately.

@ev-br ev-br added scipy.signal maintenance Items related to regular maintenance tasks labels Apr 18, 2024
@github-actions github-actions bot added C/C++ Items related to the internal C/C++ code base Meson Items related to the introduction of Meson as the new build system for SciPy labels Apr 18, 2024
@andfoy
Copy link
Contributor

andfoy commented Apr 18, 2024

It looks good to me! Thanks @ev-br

@lucascolley lucascolley changed the title Further refactor spline filters MAINT: signal: further refactor spline filters May 15, 2024
@ev-br
Copy link
Member Author

ev-br commented May 19, 2024

The CI failure in FAILED scipy\optimize\tests\test__basinhopping.py::TestBasinHopping::test_all_nograd_minimizers is surely unrelated.
Now that gh-18926 is in, I think it might make sense to merge this one in the same release cycle.

@ev-br ev-br added this to the 1.14.0 milestone May 22, 2024
@h-vetinari
Copy link
Member

1.14 is imminent (<24h), I think this might be getting a bit tight.

@tylerjereddy
Copy link
Contributor

This is pure maintenance in the end

... I think it might make sense to merge this one in the same release cycle.

The diff is pretty big, and I'm a bit nervous about merging this just before branching. If it is really important to include we can do it during RC stage, but I think I'd be pretty nervous with only 1 "LGTM" style review and no inline review comments yet.

@ev-br ev-br modified the milestones: 1.14.0, 1.15.0 May 29, 2024
@rgommers
Copy link
Member

No need to backport this, better to land at the beginning of a release cycle I'd say.

It's quite nice to see _bspline_util.c.in gone - this makes things way easier to understand. If there is no other reviewer, I'll try to get to this PR soon.

Copy link
Member

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started reviewing this (after @ev-br asked me nicely), but I must admit I'm a fair amount out of my depths here. I don't know the signal module or the (perhaps trivial) standard techniques, so some of the things just appear like voodoo to me. My comments are thus mostly of a janitorial nature.

scipy/signal/_spline_filters.py Outdated Show resolved Hide resolved
scipy/signal/_spline_filters.py Outdated Show resolved Hide resolved
scipy/signal/_splinemodule.cc Outdated Show resolved Hide resolved
scipy/signal/_splinemodule.cc Outdated Show resolved Hide resolved
scipy/signal/_splinemodule.h Show resolved Hide resolved
scipy/signal/meson.build Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the benefit of renaming test_bsplines.py to test_spline_filters.py. In this case it just makes the diff unreviewable, so I don't know what actual changes happened there. I did a quick check locally with

git diff tags/v1.14.0rc1:scipy/signal/tests/test_bsplines.py HEAD:scipy/signal/tests/test_spline_filters.py -b

(this on the branch where I backported this PR for use in conda-forge/scipy-feedstock#277), and it looks reasonable.

I haven't checked the individual commits, perhaps this ends up being easier to follow on that more granular level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just that the split between test_bsplines.py and test_splines.py is/was really arbitrary.

scipy/signal/_spline_filters.py Show resolved Hide resolved
@ev-br
Copy link
Member Author

ev-br commented Jun 15, 2024

Thank you for the review @h-vetinari ! Agreed on this stuff is indeed borderline voodoo. This whole PR is mostly janitorial too, so a janitorial review is great indeed.
I think I addressed most of your comments, and you said you started reviewing this, so there will be more, right? Or are you happy with the current state.

Copy link
Member

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I reviewed commit-by-commit now, and understood better what's happening.

I'm not in love with the amount of code movement, which often moves 2, 3, 4 times (instead of once), for reasons that don't seem self-evidently beneficial to me. Same for the move&merge of test code into different files, which would IMO be cleaner as a NFC1 follow-up (including the various fix-up commits required for the file-renames).

That said, the rest looks reasonable to me, even if I didn't actually validate the the moved code stays the same; but aside from trusting you on that, the test suite also validates that this works as intended.

Footnotes

  1. no functional change

scipy/signal/tests/test_splines.py Outdated Show resolved Hide resolved
scipy/signal/tests/test_splines.py Outdated Show resolved Hide resolved
scipy/signal/_splinemodule.cc Outdated Show resolved Hide resolved
scipy/signal/meson.build Outdated Show resolved Hide resolved
scipy/signal/_bspline_util.cc.in Outdated Show resolved Hide resolved
scipy/signal/_splinemodule.cc Outdated Show resolved Hide resolved
Comment on lines 103 to +94
switch (thetype) {
case NPY_FLOAT:
ret = S_separable_2Dconvolve_mirror((float *)PyArray_DATA(a_image),

ret = _separable_2Dconvolve_mirror((float *)PyArray_DATA(a_image),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we keep the switch? Can we not cast to thetype for the arguments?

scipy/signal/_bspline_util.h Outdated Show resolved Hide resolved
scipy/signal/_splinemodule.h Show resolved Hide resolved
scipy/signal/_splinemodule.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@andfoy andfoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the comments to apply on symiir1, and the docstring for _sym_iir1_initial. Please let me know if it requires any modification

@ev-br
Copy link
Member Author

ev-br commented Jun 18, 2024

Thanks Edgar! Merged in your edits.

@ev-br ev-br force-pushed the splinemodule_cpp branch 2 times, most recently from b640edf to b5708a1 Compare July 6, 2024 13:59
@ev-br
Copy link
Member Author

ev-br commented Jul 6, 2024

A gentle ping @h-vetinari ; been a while, would you be able to take a look again at some point?

@h-vetinari
Copy link
Member

I am sorry to keep beating that bush, but I find it very hard to review such large changes if there's such a mix between irrelevant formatting changes and the actual "meat" of the refactor. For example, _splinemodule.c changes to _splinemodule.cc (itself of only marginal benefit); but checking out this branch and running

git diff origin/main:scipy/signal/_splinemodule.c HEAD:scipy/signal/_splinemodule.cc -b

shows that the vast majority of the changes there are just formatting issues. The same issue applies to other modules being merged/split/renamed here, and so I don't feel I can review this with reasonable effort.

I don't want to stand in the way of this (perhaps my commit ideals are not shared by everyone), but from my POV, large refactors (at least on a commit level) need to be shorn of any formatting (and ideally: test) changes. How I tackle such situations would be:

* MAINT: fix code formatting in X (absolutely no functional changes)
* MAINT: clean up test modules (merge/split/rename; no changes in implementation or coverage)
* MAINT: do the refactor (no unnecessary changes to formatting; minimal if any changes to test code)
* MAINT: potential clean-ups that are only possible after refactor

That way, there's only one commit to review in detail, the rest is assured by the semantic separation of the changes (and in a perfect world: passing CI for each commit). It doesn't really matter if clean-ups come before and/or after, that depends on the situation.

It's possible that a single PR might need to do this several times for related areas; how exactly the commits are interleaved is irrelevant (as long as they don't gratuitously undo big code changes already introduced earlier). Of course, it's always possible to split off the "irrelevant" changes into a separate precursor PR, but that's IMO not necessary as long as the commits are individually sensible.

I'm aware that often the final state of a PR that passes CI is not knowable in advance, but with a bit of git-foo, it's possible to "construct" the kind of cleaned-up history I'm speaking of even after the fact. Yes, that represents extra work, but makes a huge difference for reviewability (and the overall quality of the git history).


Right now I'm not getting a coherent picture of what's happening here, neither the "changes from all commits" view, nor from stepping through ~20 commits individually, on top of not feeling really qualified in the subject-matter, so I need bow out. 😑

ev-br and others added 6 commits July 7, 2024 11:42
For symiirorder{1, 2}, compare to what scipy 1.9.1 / numpy 1.20.x returns
The test exercises spline_filter of complex argument, and has useful
info about spline_filter working in single precision for complex inputs
(the reason is still unclear).
@ev-br
Copy link
Member Author

ev-br commented Jul 7, 2024

Thank you @h-vetinari ! Let me try a bit harder: I've taken out the clang-format reindent commit and squashed it down to 9 commits in total to

  • keep C/C++ changes to a single commit
  • keep authorship for Edgar's and Smit's patches
  • keep python refactors separate from test additions.
  • rolled back merging test_splines.py and test_bsplines.py

Sadly github still does not detect that code in _splinemodule.h is basically the one from _bspline_util.c.in, only without Tempita. This I don't know how to explain to GH.

Does it look any better now?

@@ -644,3 +647,162 @@ def qspline1d_eval(cj, newx, dx=1.0, x0=0):
result += cj[indj] * _quadratic(newx - thisj)
res[cond3] = result
return res


def symiirorder1(signal, c0, z1, precision=-1.0):
Copy link
Member Author

@ev-br ev-br Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python functions symiirorder1 and symiirorder2 are shown as new here but are in fact just moved from _splines.py, where they were introduced in #18926 :
compare to https://github.com/scipy/scipy/pull/18926/files#diff-80d62cb9e2daea8b1e9d439cfcadeaf56bbf4dc8e41435f07a1d69089f5f875cR11

ret = _separable_2Dconvolve_mirror(reinterpret_cast<std::complex<float> *>(PyArray_DATA(a_image)),
reinterpret_cast<std::complex<float> *>(PyArray_DATA(out)), M, N,
reinterpret_cast<std::complex<float> *>(PyArray_DATA(a_hrow)),
reinterpret_cast<std::complex<float> *>(PyArray_DATA(a_hcol)),
Copy link
Member Author

@ev-br ev-br Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the meat of the matter, parroted from scipy.special and suggested by @steppi (thank you Albert)

@ev-br ev-br force-pushed the splinemodule_cpp branch 5 times, most recently from ab333d9 to 8f27cb6 Compare July 7, 2024 15:10
ev-br and others added 3 commits July 7, 2024 19:23
- merge _splines.py and _bsplines.py
- rename the merged implementation file into _spline_filters.py

None of this is user visible.
- improve the layout of symiirorder docstrings
- fix forwarding from deprecated signal.bspline module
- appease mypy
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C/C++ Items related to the internal C/C++ code base maintenance Items related to regular maintenance tasks Meson Items related to the introduction of Meson as the new build system for SciPy scipy.signal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scipy.signal.sepfir2d fails with complex input on Windows
6 participants