bugfix: prevent integer overflow in EllipseModel #5179

mark-boer · 2021-01-10T01:02:49Z

Description

This is a bugfix for issue #5042. It makes sure that EllipseModel and CircleModel do not cause integer overflows.

Checklist

I have a question: while trying to implement a unit test, I ran into another bug. The test is still implemented, but it is marked with pytest.mark.skip. How would you like to handle this?

Also, does the code need to pass any kind of static analysis, such as black or flake8?

For reviewers

Check that the PR title is short, concise, and will make sense 1 year
later.
Check that new functions are imported in corresponding __init__.py.
Check that new features, API changes, and deprecations are mentioned in
doc/release/release_dev.rst.

pep8speaks · 2021-01-10T01:02:55Z

Hello @mark-boer! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-01-21 02:12:32 UTC

mark-boer · 2021-01-15T21:54:01Z

I found the reason the test I wrote is failing: the chosen input data leads to a ridiculously small eigenvalue, which in turn leads to significant floating-point errors in the matrix inverse. A probable solution is to first subtract the centroid of the input data and then add this centroid back to the center of the circle. I also found that the algebraic method used is not actually a proper least-squares fit. I'll create a new issue.

mark-boer · 2021-01-15T22:34:32Z

The behaviour could be improved by using something like this:

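import numpy as np
import scipy.spatial

# xy is an (N, 2) array of points; solve A @ C = f in the least-squares sense
A = np.append(xy * 2, np.ones((xy.shape[0], 1)), axis=1)
f = np.sum(xy ** 2, axis=1)
C, _, _, _ = np.linalg.lstsq(A, f, rcond=None)
center = C[0:2]
# Radius as the RMS distance from the fitted center to the points
distances = scipy.spatial.minkowski_distance(center, xy)
r = np.sqrt(np.mean(distances ** 2))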

See, e.g., the sphere fit I wrote a year ago, which contains two fits, a fast algebraic one and an iterative one:
https://github.com/mark-boer/geomfitty/blob/4f24071f524302a95144dbd91bc84fa33c5dc26b/geomfitty/fit3d.py#L42
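For readers who want to try the idea, here is a minimal self-contained sketch that wraps the snippet above in a function and combines it with the centroid shift suggested in the previous comment. The function name `fit_circle_algebraic` and the test points are hypothetical, and `np.linalg.norm` is swapped in for `scipy.spatial.minkowski_distance` to keep the example NumPy-only. It is not the code that ended up in scikit-image.

```python
import numpy as np


def fit_circle_algebraic(xy):
    """Hypothetical helper: algebraic circle fit on centroid-shifted data."""
    xy = np.asarray(xy, dtype=float)
    # Shift the data to its centroid so the linear system is well conditioned
    centroid = xy.mean(axis=0)
    shifted = xy - centroid
    # Solve 2*xc*x + 2*yc*y + c = x**2 + y**2 in the least-squares sense
    A = np.append(shifted * 2, np.ones((shifted.shape[0], 1)), axis=1)
    f = np.sum(shifted ** 2, axis=1)
    C, _, _, _ = np.linalg.lstsq(A, f, rcond=None)
    # Add the centroid back to express the center in the original coordinates
    center = C[0:2] + centroid
    # Radius as the RMS distance from the fitted center to the points
    distances = np.linalg.norm(xy - center, axis=1)
    r = np.sqrt(np.mean(distances ** 2))
    return center, r


# Made-up points on a circle of radius 3 that lies far from the origin
t = np.linspace(0, 2 * np.pi, 50, endpoint=False)
points = np.column_stack([1e6 + 3 * np.cos(t), -2e6 + 3 * np.sin(t)])
center, r = fit_circle_algebraic(points)
print(center, r)
```

Shifting by the centroid keeps the entries of `A` and `f` small even when the circle lies far from the origin, which is the situation in which the rounding errors described above become noticeable.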

stefanv · 2021-01-17T02:44:26Z

Thanks for the fix, @mark-boer. I did a quick check and it looks like vstack does type promotion already. Could you explain the rationale behind the fix (casting the data to float)?

Before merging, we should remove the failing test and address that as part of your other issue.

mark-boer · 2021-01-17T13:15:28Z

Which line's vstack do you mean? The EllipseModel overflows in the following line:

S1 = D1.T @ D1

For more info on what goes wrong see issue #5042.
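The following sketch illustrates the point of this exchange with made-up coordinates. `np.vstack` promotes `D2` to float because one of its rows comes from `np.ones`, but `D1` is built from integer arrays only, so the product `D1.T @ D1` is computed in integer arithmetic and can wrap around silently. The dtype `int32` is an assumption chosen so that the overflow appears with small numbers; `int64` input overflows the same way for larger coordinates.

```python
import numpy as np

# Hypothetical integer pixel coordinates
x = np.array([300, 400, 500], dtype=np.int32)
y = np.array([100, 200, 300], dtype=np.int32)

# D2 contains a float row (np.ones), so vstack promotes it to float64
D2 = np.vstack([x, y, np.ones(len(x))]).T
# D1 is built from integer arrays only, so it keeps the integer dtype
D1 = np.vstack([x ** 2, x * y, y ** 2]).T
print(D1.dtype, D2.dtype)

# Integer matrix product: sums of fourth powers exceed the int32 range
S1_int = D1.T @ D1

# Casting the input to float first avoids the wrap-around
xf, yf = x.astype(np.float64), y.astype(np.float64)
D1_float = np.vstack([xf ** 2, xf * yf, yf ** 2]).T
S1_float = D1_float.T @ D1_float

# Exact reference for the top-left entry, computed with Python integers
exact = sum(int(v) ** 4 for v in x)
print(int(S1_int[0, 0]), S1_float[0, 0], exact)
```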

jni · 2021-01-19T12:10:35Z

@stefanv I think the point of the cast might be to keep things at the same float bit depth for both D1 and D2 (i.e., both float32 or both float64, depending on the input dtype).

jni · 2021-01-19T12:12:05Z

I agree with @stefanv that ideally the skipped test should instead get added in a separate PR that fixes the issue, but I'm not that fussed if we instead just merge this as-is and remove the skip marker in a follow-up PR.

@scikit-image/core any others up for review?

Thank you @mark-boer!

mark-boer · 2021-01-19T12:18:13Z

@jni, I changed the input data so that it is cast to a float of at least 32 bits. If the user chooses to call the function with float64 data, it is not cast down to float32. If that doesn't make sense, we can also just forcibly cast it to float64.

I will remove the failing test this evening and create a new PR with the fix to this failing test.
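One way to express the casting rule described in this comment is `np.promote_types`, sketched below. The helper name `to_float_at_least_32` is hypothetical, and whether the merged code uses exactly this call is not shown in this thread.

```python
import numpy as np


def to_float_at_least_32(data):
    """Hypothetical helper: cast to a float dtype of at least 32 bits."""
    data = np.asarray(data)
    # promote_types picks the smallest dtype that safely holds both inputs,
    # so float64 stays float64 and integers become float32 or float64
    float_type = np.promote_types(data.dtype, np.float32)
    return data.astype(float_type, copy=False)


for dtype in (np.uint8, np.int64, np.float32, np.float64):
    result = to_float_at_least_32(np.zeros(3, dtype=dtype))
    print(np.dtype(dtype).name, "->", result.dtype)
```

With `copy=False`, input that already has the target dtype is returned without an extra copy.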

jni · 2021-01-19T12:23:39Z

> I changed the input data so that it is cast to a float of at least 32 bits. If the user chooses to call the function with float64 data, it is not cast down to float32. If that doesn't make sense, we can also just forcibly cast it to float64.

Makes perfect sense, actually! I think the fact that dtype was set for the second vstack and not the first confused @stefanv upon reading the code. Took me a while as well after reading his comment. Perhaps adding a dtype= to the vstack calls, even if redundant, will make the code clearer? Or maybe just add a comment to the effect of what you just wrote?

mark-boer · 2021-01-19T12:31:28Z

The dtype is actually an argument of np.ones. Even though it is not strictly necessary, I thought it would save an additional cast.

Edit: I will add a comment to the code that explains why 😉

rfezzani · 2021-01-19T16:04:12Z

skimage/measure/fit.py

        x = data[:, 0]
        y = data[:, 1]

        # Quadratic part of design matrix [eqn. 15] from [1]
        D1 = np.vstack([x ** 2, x * y, y ** 2]).T
        # Linear part of design matrix [eqn. 16] from [1]
-        D2 = np.vstack([x, y, np.ones(len(x))]).T
+        D2 = np.vstack([x, y, np.ones(len(x), dtype=float_type)]).T


What about

Suggested change

-        D2 = np.vstack([x, y, np.ones(len(x), dtype=float_type)]).T
+        D2 = np.vstack([x, y, np.ones_like(x)]).T

mark-boer · Jan 19, 2021

That's even better :-)
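A short sketch of why the suggested `np.ones_like(x)` works here: it copies the dtype of `x`, so once the input has been cast to float, the row of ones matches without an explicit `dtype=` argument. The arrays below are made-up data. Note the assumption that `x` is already floating point; with integer `x`, `np.ones_like(x)` would be an integer array as well.

```python
import numpy as np

# Hypothetical coordinates, already cast to float32 as in the fix
x = np.array([1, 2, 3], dtype=np.float32)
y = np.array([4, 5, 6], dtype=np.float32)
float_type = x.dtype

# Both spellings give the same design matrix with the same dtype
D2_explicit = np.vstack([x, y, np.ones(len(x), dtype=float_type)]).T
D2_like = np.vstack([x, y, np.ones_like(x)]).T
print(D2_explicit.dtype, D2_like.dtype)

# Caveat: ones_like follows the input dtype, so integer input stays integer
ones_int = np.ones_like(np.array([1, 2, 3]))
print(ones_int.dtype)
```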

stefanv

Thanks, this is very readable now @mark-boer. I left some minor suggestions, but they're mostly stylistic.

skimage/measure/tests/test_fit.py

jni

Just gonna approve this some more for good measure. =P

stefanv · 2021-01-21T02:35:35Z

Thank you very much, @mark-boer

Change dtype of CircleModel and EllipseModel to float, to avoid integ…

63c00a5

…er overflows

grlee77 mentioned this pull request Jan 11, 2021

2021's calendar of community management #5169

Closed

mark-boer mentioned this pull request Jan 15, 2021

CircleModel does not minimize squared distances #5186

Open

jni approved these changes Jan 19, 2021

rfezzani reviewed Jan 19, 2021

remove failing test and add code review notes

eeae846

mark-boer mentioned this pull request Jan 19, 2021

Fix floating point rounding in CircleModel.estimate #5190

Merged

stefanv reviewed Jan 20, 2021

mark-boer and others added 2 commits January 21, 2021 01:02

Apply suggestions from code review

59a5b31

Co-authored-by: Stefan van der Walt <sjvdwalt@gmail.com>

Remove unused pytest import

ae249de

Co-authored-by: Stefan van der Walt <sjvdwalt@gmail.com>

jni approved these changes Jan 21, 2021

stefanv merged commit 9b05cac into scikit-image:master Jan 21, 2021

rfezzani mentioned this pull request Oct 18, 2021

Issues with EllipseModel #5042

Closed
