Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add all binary morphology functions to cupyx.scipy.ndimage #3907

Merged
merged 10 commits into from
Sep 16, 2020

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Sep 1, 2020

This PR adds binary_erosion, binary_dilation, binary_opening, binary_closing, binary_hit_or_miss, binary_propagation, binary_fill_holes, generate_binary_structure and iterate_structure.

The only new kernel is the one corresponding to binary erosion. I was able to mostly reuse the existing utility _filters_core._generate_nd_kernel, but did have to add a couple of arguments to get it to work for this case. @coderforlife, please take a look if you have a chance.

All other binary morphological operations are built on top of binary_erosion. Some of the specific data arrays used in the tests were borrowed from SciPy's test suite.

The only feature in SciPy that is not implemented here is the case where iterations > 1 and brute_force=False. SciPy uses a different algorithm in that case, but I have only implemented the brute force kernel here.

@grlee77
Copy link
Contributor Author

grlee77 commented Sep 1, 2020

I put some benchmarks for a (4096, 4096) binary image below. I only benchmarked binary erosion as all of the other functions also use it under the hood.

Benchmark Script

import numpy as np
import cupy as cp
from cupyx.time import repeat
from scipy import ndimage as ndi
from cupyx.scipy import ndimage as cndi
shape = (4096, 4096)
x = np.random.randn(*shape) > 0.6
s = ndi.generate_binary_structure(x.ndim, 1)

# run 1 iteration on the CPU
perf = repeat(ndi.binary_erosion, (x, s), n_warmup=1, n_repeat=10)
print(perf)

# run 1 iteration on the GPU
xg = cp.asarray(x)
sg = cndi.generate_binary_structure(x.ndim, 1)
perf_gpu = repeat(cndi.binary_erosion, (xg, sg), n_warmup=20, n_repeat=100)
print(perf_gpu)

# compare with 5 iterations on the CPU
kwargs = dict(iterations=5, brute_force=True)
perf_cpu_5iter_bf = repeat(ndi.binary_erosion, (x, s), kwargs, n_warmup=1, n_repeat=10)
print(perf_cpu_5iter_bf)

kwargs = dict(iterations=5, brute_force=False)
perf_cpu_5iter = repeat(ndi.binary_erosion, (x, s), kwargs, n_warmup=1, n_repeat=10)
print(perf_cpu_5iter)

# compare with 5 iterations on the GPU (only brute_force=True is implemented)
xg = cp.asarray(x)
sg = cndi.generate_binary_structure(x.ndim, 1)
kwargs = dict(iterations=5, brute_force=True)
perf_gpu_5iter = repeat(cndi.binary_erosion, (xg, sg), kwargs, n_warmup=20, n_repeat=100)
print(perf_gpu_5iter)

Benchmark Results

single iteration CPU

binary_erosion      :    CPU:161958.266 us   +/-1280.388 (min:159108.258 / max:163741.605) us     GPU-0:161991.273 us   +/-1279.958 (min:159142.853 / max:163778.748) us

single iteration GPU

binary_erosion      :    CPU:   57.767 us   +/-36.566 (min:   46.448 / max:  407.474) us     GPU-0: 1952.998 us   +/-81.631 (min: 1900.544 / max: 2227.200) us

CPU, 5 iterations, brute_force=True

binary_erosion      :    CPU:492891.860 us   +/-3904.152 (min:486823.349 / max:499552.778) us     GPU-0:492924.872 us   +/-3903.662 (min:486854.309 / max:499583.923) us

CPU, 5 iterations, brute_force=False

binary_erosion      :    CPU:263275.548 us   +/-8081.263 (min:257600.988 / max:281175.795) us     GPU-0:263310.513 us   +/-8081.219 (min:257634.888 / max:281208.618) us

GPU, 5 iterations, brute_force=True

binary_erosion      :    CPU: 4045.870 us   +/-225.893 (min: 3803.620 / max: 4516.481) us     GPU-0: 4051.458 us   +/-226.087 (min: 3808.384 / max: 4531.968) us

@grlee77
Copy link
Contributor Author

grlee77 commented Sep 1, 2020

I think the call to _center_is_true has to synchronize as it checks if the central pixel of the structuring element is True or False.

For cases with iterations > 1 there is also device synchronization because of the call: changed = not (tmp_in == tmp_out).all(). This check is used to detect if any of the values have changed since the prior iteration.

cupyx/scipy/ndimage/morphology.py Outdated Show resolved Hide resolved
cupyx/scipy/ndimage/morphology.py Outdated Show resolved Hide resolved
cupyx/scipy/ndimage/morphology.py Outdated Show resolved Hide resolved
single to double backticks

remove unneeded on_cpu argument

use astype rather than asarray to disallow array-like structure input
@jakirkham
Copy link
Member

Jenkins, test this please.

@chainer-ci
Copy link
Member

Jenkins CI test (for commit 45ab6e3, target branch master) failed with status FAILURE.

@grlee77
Copy link
Contributor Author

grlee77 commented Sep 11, 2020

It seems very odd that only a single parameterized case out of 1728 failed for BinaryErosionAndDilation (and that this only occurred on one of the runs)

@jakirkham
Copy link
Member

Sure, let's restart CI and see if it clears 🙂

Jenkins, test this please.

@chainer-ci
Copy link
Member

Jenkins CI test (for commit 45ab6e3, target branch master) succeeded!

@jakirkham
Copy link
Member

Would you be able to take another look @asi1024? 🙂

@asi1024
Copy link
Member

asi1024 commented Sep 16, 2020

Jenkins, test this please.

@chainer-ci
Copy link
Member

Jenkins CI test (for commit 45ab6e3, target branch master) succeeded!

@asi1024
Copy link
Member

asi1024 commented Sep 16, 2020

LGTM!

@asi1024 asi1024 merged commit cd8b6eb into cupy:master Sep 16, 2020
@asi1024 asi1024 added this to the v9.0.0a1 milestone Sep 16, 2020
@jakirkham
Copy link
Member

Thank you both! 😄

@grlee77 grlee77 deleted the ndimage_binary_morphology branch December 18, 2020 00:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants