
Implementation of ndimage filters #3184

Merged
merged 35 commits into cupy:master on Jun 25, 2020

Conversation

@coderforlife (Contributor) commented Mar 11, 2020

So far this PR includes improved correlate and convolve functions (in terms of speed, using a technique similar to #3179), along with implementations of correlate1d and convolve1d and tests for them. The underlying kernel creation has been generalized so that it can be used to implement all of the other filters; those will be progressively added to this PR (I have implementations of them but haven't tested them yet).

This works to address #2099 and #3111.

Current status of tests: 7120 tests pass (in test_filters.py) and 8 fail (all of them with the 1D functions along axis 0 using mode='mirror' on 4D images; no other axis, mode, or lower-dimensional image fails).

This is the first step towards a complete ndimage.filters suite. The functions here are designed to be flexible so that all other ndimage.filters functions can be created.
All other filters will be based on this core set of code.
@emcastillo emcastillo self-assigned this Mar 11, 2020
@jakirkham (Member):

I wonder if we should expose the 1-D convolve as cupy.convolve (analogous to numpy.convolve). Thoughts? 🙂

@coderforlife (Contributor, Author):

@jakirkham It is possible; a quick sketch of how it could be done is below. Note that I have never used numpy.convolve or numpy.correlate, so I don't know details like what they do with >1D data or how some modes center the arrays. However, it gives the basic idea.

def convolve(a, v, mode='full'):
    if mode not in ('full', 'same', 'valid'): raise ValueError('mode') 

    # Make both 1D arrays and make sure `a` is the larger array
    a, v = a.ravel(), v.ravel()
    if len(a) == 0: raise ValueError('a cannot be empty')
    if len(v) == 0: raise ValueError('v cannot be empty')
    if len(v) > len(a): a, v = v, a

    # Deal with full padding of data
    if mode == 'full':
        pad = len(v) - 1
        a = cupy.pad(a, (pad // 2, (pad + 1) // 2))
    
    # Perform convolution
    out = cupyx.scipy.ndimage.convolve1d(a, v, mode='constant')

    # Return result
    if mode == 'valid':
        remove = 1 - len(v)
        out = out[remove//2 : -(remove+1)//2]
    return out

The downsides of this compared to numpy.convolve():

  • The default mode of 'full' requires padding the data (and thus copying the data).
  • The ndimage.convolve1d() function assumes that one of a or v is relatively small. If you were to use this for a cross-correlation of a large dataset (and thus len(a) == len(v)), it would likely have low performance.

@coderforlife (Contributor, Author):

The 8 tests that are not passing are actually due to a bug in SciPy. I have submitted a bug report over there: scipy/scipy#11661. SciPy does not handle mirror correctly for length-1 dimensions with the 1D functions.

First change (the w[] to weights[] I forgot to do earlier) was the most major.
Bad copy-paste naming of the tests along with avoiding a scipy bug.
@awthomp commented Mar 12, 2020

@coderforlife -- I'm interested in CuPy's support/implementation of numpy.convolve and numpy.correlate for cusignal. I'd be more than happy to test and guide your implementation.

> I don't know details like what do they do with >1D data or how some mode center the arrays
For NumPy's convolve/correlate functions, only 1D arrays are supported. This is actually probably the motivation for creating 2D support in ndimage. As you mentioned in your post, the key component here is how padding is handled in the implementation; and I see that as the big difference between doing a 1D ndimage.convolve or ndimage.correlate as well.

I took a quick-ish spin of ndimage.correlate and noticed that it doesn't do the 'sliding window' thing that numpy.convolve and numpy.correlate do. This 'sliding dot product' is the key difference between the ndimage and standard numpy implementation, IMO.

Anyway, as you may know, numpy.correlate and numpy.convolve are basically the same algorithm. For convolve, the filter is mirrored left-to-right before beginning the sliding dot product. To step through an example:

a = [1, 2, 3]
b = [0, 1, 0.5]
y = np.convolve(a, b, mode="full")

Yields:
array([0. , 1. , 2.5, 4. , 1.5])

Convolution/correlation is a point by point overlapping and sliding dot product that starts with a mirrored version of the filter (b) and the data (a). The last element in the mirrored filter is overlapped with the first element of the data, and the dot product is taken as the filter moves element by element over the data. To be explicit:

mirrored_b (denoted by m_b), used for calculation = [0.5, 1, 0]

The convolution works like:

y[0] = m_b[2] * a[0] = 0 * 1 = 0
y[1] = m_b[2] * a[1] + m_b[1] * a[0] = 0 * 2 + 1 * 1 = 1
y[2] = m_b[2] * a[2] + m_b[1] * a[1] + m_b[0] * a[0] = 0 * 3 + 1 * 2 + 0.5 * 1 = 2.5
y[3] = m_b[1] * a[2] + m_b[0] * a[1] = 1 * 3 + 0.5 * 2 = 4
y[4] = m_b[0] * a[2] = 0.5 * 3 = 1.5

Here's a visualization of what's going on too: https://en.wikipedia.org/wiki/Convolution#/media/File:Convolution_of_box_signal_with_itself2.gif

Again, correlation is very similar, but you don't have to flip the filter component (b, in this case).
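The sliding dot product described here can be written out directly in NumPy (a small illustration of the same arithmetic, not code from the PR):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([0.0, 1.0, 0.5])

m_b = b[::-1]               # mirrored filter: [0.5, 1.0, 0.0]
pad = len(b) - 1
padded = np.pad(a, pad)     # zero-pad so every overlap position exists
# slide the mirrored filter across the data, taking a dot product at
# each position -- this is 'full'-mode convolution
y = np.array([padded[i:i + len(b)] @ m_b for i in range(len(a) + pad)])

print(y)                    # matches np.convolve(a, b, mode='full')
assert np.allclose(y, np.convolve(a, b, mode='full'))
```

The same loop with b (not m_b) would compute the correlation instead.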

@coderforlife (Contributor, Author):

@awthomp I know that convolution and correlation are essentially the same thing with mirrored 'kernels', 'weights', or 'filters' (depending on what you call the smaller of the two arrays); ndimage uses that fact. Also, the ndimage algorithm does use overlap-add. The ElementwiseKernel does the following for each element: sum every neighbourhood value multiplied by the corresponding weight. This is exactly what you describe. It may not look like it is doing this since it isn't "sliding" across but instead doing it per-element.

For the 1D case, it does this along an axis instead of in all dimensions. It handles the edges differently from numpy.convolve/correlate.

The following code demonstrates the equality:

a = np.arange(24)
b = np.ones(3)
(np.convolve(a, b, mode="same") == ndi.convolve1d(a, b, mode="constant")).all()

where ndi is scipy.ndimage. The only other reasonable way to do a convolution is through FFT, and ndimage definitely doesn't do that.

I have corrected the code from above to properly align the output with what numpy.convolve generates and to forbid non-1D arrays (and fixed one other bug I had), so cupy.convolve could be implemented as:

def convolve(a, v, mode='full'):
    if mode not in ('full', 'same', 'valid'): raise ValueError('mode') 

    # Check both arrays and make sure `a` is the larger array
    if a.ndim != 1: raise ValueError('a must be 1D')
    if v.ndim != 1: raise ValueError('v must be 1D')
    if len(a) == 0: raise ValueError('a cannot be empty')
    if len(v) == 0: raise ValueError('v cannot be empty')
    if len(v) > len(a): a, v = v, a

    # Deal with full padding of data
    if mode == 'full':
        pad = len(v) - 1
        a = cupy.pad(a, ((pad+1)//2, pad//2), 'constant')
    
    # Perform convolution
    out = cupyx.scipy.ndimage.convolve1d(a, v, mode='constant')

    # Return result
    if mode == 'valid':
        remove = len(v) - 1
        out = out[(remove + 1) // 2 : len(out) - remove // 2]
    return out

This should handle all cases but may not be the most efficient for all of them. In all likelihood, correlate will be implemented like the above and then convolve will be implemented in terms of correlate.
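For reference, the sketch above can be sanity-checked on the CPU by substituting numpy/scipy calls for the cupy/cupyx ones (convolve_cpu is a hypothetical stand-in with the same logic; checked with an odd-length kernel):

```python
import numpy as np
from scipy.ndimage import convolve1d

def convolve_cpu(a, v, mode='full'):
    # CPU stand-in for the cupy sketch: np.pad for cupy.pad and
    # scipy.ndimage.convolve1d for cupyx.scipy.ndimage.convolve1d
    if mode not in ('full', 'same', 'valid'):
        raise ValueError('mode')
    if a.ndim != 1 or v.ndim != 1:
        raise ValueError('arrays must be 1D')
    if len(a) == 0 or len(v) == 0:
        raise ValueError('arrays cannot be empty')
    if len(v) > len(a):
        a, v = v, a
    if mode == 'full':
        pad = len(v) - 1
        a = np.pad(a, ((pad + 1) // 2, pad // 2), 'constant')
    out = convolve1d(a, v, mode='constant')
    if mode == 'valid':
        remove = len(v) - 1
        out = out[(remove + 1) // 2 : len(out) - remove // 2]
    return out

a = np.arange(8, dtype=float)
v = np.array([1.0, 2.0, 3.0])
for mode in ('full', 'same', 'valid'):
    # agrees with numpy.convolve for all three modes
    assert np.allclose(convolve_cpu(a, v, mode), np.convolve(a, v, mode))
```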

Implementing scipy.signal.convolve/correlate is a whole different can of worms. There are also fftconvolve and oaconvolve (OA = 'overlap-add', which is what we are using here). Both of those functions say they are much faster than convolve, but convolve just uses those functions... they do confirm that the OA method is only efficient when len(a) >> len(b) or vice-versa. There is also choose_conv_method for determining which to call.
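For reference, choose_conv_method can be queried directly to see which algorithm SciPy would pick for given sizes (a small sketch; the exact choice depends on the SciPy version and the input sizes):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
data = rng.standard_normal(10_000)
small_filt = rng.standard_normal(8)
large_filt = rng.standard_normal(10_000)

# choose_conv_method inspects the input sizes and returns either
# 'direct' (sliding dot product) or 'fft' (FFT-based convolution)
print(signal.choose_conv_method(data, small_filt))
print(signal.choose_conv_method(data, large_filt))
```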

@awthomp commented Mar 12, 2020

Hi @coderforlife -- We've actually taken care of correlate/convolve in cusignal. For large arrays/filters, we're getting a 1-2 order of magnitude perf gain over SciPy Signal, but -- as you can see -- we default to NumPy (rather than invoking an FFT) for small array sizes. This isn't ideal.

Much of cusignal is based on swapping SciPy Signal's NumPy calls for CuPy; where there's no support (like correlate2d or upfirdn), we wrote custom Numba CUDA kernels and are now exploring raw CuPy modules.

Usage examples are here: https://github.com/rapidsai/cusignal/blob/d64df2fc2c40bbdba4aad34641460176552cd5f6/notebooks/api_guide/signaltools_examples.ipynb

@grlee77 (Contributor) commented Mar 12, 2020

Hey @awthomp, I had implemented scipy.signal.upfirdn a little over a year ago (prior to discovering your version in cusignal). At that time I had also reused it to implement numpy.convolve and numpy.correlate (I had added a couple arguments to control the extent and origin in upfirdn). The implementation of upfirdn differs from cusignal's in that it uses RawKernel rather than Numba. I have finally also made this version public (see #2099 (comment)).

@awthomp commented Mar 12, 2020

@grlee77 Awesome! Looking forward to seeing your work. In the meantime, we also implemented the 1D double-precision raw kernel for upfirdn in cusignal earlier this week (rapidsai/cusignal#25). @mnicely is currently looking at optimizing it in this WIP PR: rapidsai/cusignal#31. We'd love your eyes on it, if you have the opportunity and desire to contribute.

This results in a single style of kernel to maintain while also maintaining the speed of 1D kernels.
            format(j=j, type=int_type))
    else:
        boundary = _generate_boundary_condition_ops(
            mode, 'ix_{}'.format(j), 'xsize_{}'.format(j))
        loops.append('''

Review comment (Contributor):

Suggested change: fix the indentation of the loops.append(''' line.

@grlee77 (Contributor) commented Mar 15, 2020

Thanks for refactoring to avoid duplicate 1d and nd code paths!

I can confirm good performance of the refactored kernel. The only problem I encountered was the indentation error above.
The 1D case now performs approximately the same as what I had in cupyimg; the 3D cases I tested are up to 25% faster with this kernel.

Review comment (Member):

please address this!
thanks 🙂

Review comment (Contributor):

Completed in #3390

if not hasattr(origin, '__getitem__'):
    origin = [origin, ] * input.ndim
def _fix_sequence_arg(arg, ndim, name, conv=lambda x: x):
    if hasattr(arg, '__iter__') and not isinstance(arg, str):
Review comment (Member):

we favor try/except AttributeError over hasattr

Review comment (Contributor):

@emcastillo Can you point me to some example code you want me to use?

Just curious why y'all prefer try/except over hasattr?
https://stackoverflow.com/questions/903130/hasattr-vs-try-except-block-to-deal-with-non-existent-attributes

Review comment (Member):

Sorry, yes, we use try/except AttributeError because it is faster.
If you think that in this routine speed doesn't matter you can leave it as is 🙂
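For what it's worth, the usual argument can be illustrated with a micro-benchmark (hypothetical names; absolute timings vary by machine and Python version):

```python
import timeit

class Config:
    # hypothetical object with the attribute present
    x = 1

obj = Config()

def via_hasattr(o):
    # two lookups: hasattr probes the attribute, then we fetch it
    if hasattr(o, 'x'):
        return o.x
    return None

def via_try(o):
    # one lookup: fetch it and catch the failure if it is missing
    try:
        return o.x
    except AttributeError:
        return None

assert via_hasattr(obj) == via_try(obj) == 1
print('hasattr:', timeit.timeit(lambda: via_hasattr(obj), number=100_000))
print('try:    ', timeit.timeit(lambda: via_try(obj), number=100_000))
```

try/except wins when the attribute usually exists; hasattr avoids exception overhead when it is usually missing.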

Review comment (Member):

I wonder if it would be cleaner and faster to check if arg is not a str and then just do arg = iter(arg).

@mnicely (Contributor) commented Jun 2, 2020

So I tried swapping

if hasattr(arg, '__iter__') and not isinstance(arg, str):

with

if not isinstance(arg, str):
    try:
        arg = iter(arg)
    except AttributeError:
        return NotImplemented

Now most/all tests are failing. Are these not equivalent?

Review comment (Member):

The exception raised by iter() for an invalid (non-iterable) argument is

TypeError: 'object' object is not iterable

that is, a TypeError, not the AttributeError being caught. 😅 this might be confusing
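A minimal demonstration of the mismatch (coerce is a hypothetical stand-in for the proposed swap):

```python
# Why the swap is not equivalent: iter() raises TypeError for a
# non-iterable, not AttributeError, so the except clause never fires.
# And when arg IS iterable, it is silently replaced by a one-shot
# iterator, which downstream code expecting a sequence cannot index
# or iterate twice.
def coerce(arg):
    if not isinstance(arg, str):
        try:
            arg = iter(arg)
        except AttributeError:  # iter() never raises this
            return NotImplemented
    return arg

try:
    coerce(3)  # an int is not iterable
except TypeError as exc:
    print(exc)  # 'int' object is not iterable

result = coerce([1, 2, 3])
print(type(result))  # a list_iterator, no longer a list
```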

@grlee77 (Contributor) commented Mar 23, 2020

> The 8 tests that are not passing are actually apparently a bug in Scipy. I have submitted a bug report over there: scipy/scipy#11661. It does not handle mirror correctly for length-1 dimensions with the 1D functions.

A fix for this will be included in SciPy 1.5 (scipy/scipy#11683). You can avoid the failing test cases by decorating those cases with:

@testing.with_requires('scipy>=1.5.0')

@emcastillo added the cat:feature (New features/APIs) and st:awaiting-author (Awaiting response from author) labels Mar 25, 2020
@chainer-ci (Member):

Jenkins CI test (for commit 693a2cc, target branch master) failed with status FAILURE.

@emcastillo (Member):

Jenkins, test this please

@pfn-ci-bot (Collaborator):

Successfully created a job for commit 04e6ab3:

@coderforlife (Contributor, Author):

It looks like I forgot to decorate the TestFortranOrder tests with @testing.with_requires('scipy'). Sorry about that. I just committed that fix.

However, the vast majority of the failures were from the morphology code.

@chainer-ci (Member):

Jenkins CI test (for commit 04e6ab3, target branch master) failed with status FAILURE.

@coderforlife (Contributor, Author):

Looks good (except the morphology tests, which someone else needs to fix). I am curious about the coverage test: is it being run without scipy? It says none of my code (except the function declarations and imports) is covered...

@emcastillo (Member) commented Jun 24, 2020

Do you happen to know why the morphology tests are failing?

    _min_or_max_filter() takes 8 positional arguments but 9 were given
    module 'cupyx.scipy.ndimage.filters' has no attribute '_normalize_sequence'

_normalize_sequence was deleted in this PR; let's just restore it and clean it up in another PR. I can't merge this until the CIs are ok.

Don't care too much about the coverage test :)

@coderforlife (Contributor, Author):

_normalize_sequence was renamed to _fix_sequence_arg - I had no idea someone else was using these "protected" functions... it just needs a third argument now for the name of the argument (used in the exception message).

_min_or_max_filter doesn't have the structure argument anymore, I will have to figure that one out (it seems to be specifically created for the morphology functions). Additionally, the final argument is not a bool anymore but one of two strings ('min' or 'max').

So 2 easy things to fix and 1 thing I will have to look into, could be medium or hard.

@emcastillo (Member):

If you want you can restore the old functions and someone can fix them later in a follow-up PR

@asi1024 modified the milestones: v8.0.0b4, v8.0.0b5 Jun 24, 2020
@coderforlife (Contributor, Author):

I have fixed all of the issues. One other issue that didn't come up in the tests is that their tests used pytest.skip(), which caused problems when running the tests with unittest, as recommended in the contribution guide. So I switched them to raise unittest.SkipTest(...).
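For illustration, the change amounts to something like this (hypothetical test case; runnable under both the plain unittest runner and pytest):

```python
import unittest

class TestFilters(unittest.TestCase):
    # hypothetical test illustrating the skip mechanism
    def test_needs_scipy(self):
        scipy_available = False  # stand-in for the real import check
        if not scipy_available:
            # unittest.SkipTest is honoured by both the plain unittest
            # runner and pytest, whereas pytest.skip() only works when
            # the tests are run under pytest
            raise unittest.SkipTest('scipy is not available')
        # ... test body would go here ...

result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(TestFilters).run(result)
print(result.skipped)  # the test is reported as skipped, not failed
```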

@emcastillo (Member):

Awesome work and thanks once again for patiently addressing our comments!🙇‍♂️

@emcastillo (Member):

Jenkins, test this please

@pfn-ci-bot (Collaborator):

Successfully created a job for commit 97c1c94:

@coderforlife (Contributor, Author):

It's okay - I already have the next PR planned and it will hopefully go smoother. It adds all of the other basic filters on top of this core set (uniform_filter1d, uniform_filter, gaussian_filter1d, gaussian_filter, prewitt, sobel, generic_laplace, laplace, gaussian_laplace, generic_gradient_magnitude, gaussian_gradient_magnitude, rank_filter, median_filter, and percentile_filter). I have them mostly written (I just need to update them for recent changes to this core code). However, there are no tests for them yet.

I am thinking of doing them in two batches: the non-linear filters (the last three in that list), since they actually have a bit of new CUDA code with them, then all of the linear filters (which just call the convolve or convolve1d functions).

I have also started working on generic_filter, but that is going to be a bit different since it has to parse another reduction kernel to build it. I have it working with a few simple cases of reduction kernels and fused functions, but it needs a bit more work. Of course, it can't quite match the scipy function (which takes any Python function, as well as LowLevelCallable objects wrapping Cython or C functions), but I think taking fuseable/fused Python functions and ReductionKernels would match as closely as possible.

@chainer-ci (Member):

Jenkins CI test (for commit 97c1c94, target branch master) succeeded!

@emcastillo (Member):

wow, that's awesome!

This PR will be merged after the release today; right now the code is frozen.

@emcastillo added the st:test-and-merge (deprecated; Ready to merge after test pass) label Jun 25, 2020
@mergify bot merged commit 60c620d into cupy:master Jun 25, 2020
@jakirkham (Member):

This is great! Thanks everyone for pushing this forward 😄

@asi1024 removed the st:awaiting-author (Awaiting response from author) label Jan 19, 2022