Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix edgecase bugs in segmentation.relabel_sequential #4465

Merged

Conversation

uschmidt83
Copy link
Contributor

@uschmidt83 uschmidt83 commented Feb 18, 2020

Description

  • Fix edgecase bugs in segmentation.relabel_sequential (see modified tests).

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.
  • Consider backporting the PR with @meeseeksdev backport to v0.14.x

@pep8speaks
Copy link

pep8speaks commented Feb 18, 2020

Hello @uschmidt83! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 124:39: E261 at least two spaces before inline comment

Comment last updated at 2020-02-19 16:40:02 UTC

Comment on lines 135 to 138
if offset == 1 and np.all(labels0 == new_labels0):
if not (labels == 0).any():
labels = np.concatenate(([0], labels))
return label_field, labels, labels
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be best to not handle this special case separately and simply remove these lines of code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschmidt83 I'm actually confused about what's happening in this edge case. The test example has a lot going on so it's still not clear to me. Could you elaborate on the problem you're fixing and why it goes wrong in the current code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jni, these lines intended to check if label_field already contains sequential labels, hence there would be nothing to do. Unfortunately, the logic only works if offset == 1 and label_field contains the 0 label. Hence, I added these things.

PS: I was actually encountering this bug in a real example, I didn't go on some esoteric bug hunt ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check was probably intended to avoid some computation, but it is unnecessary.
I just pushed a commit that (IMO) simplifies the function and makes it easier to understand and maintain.

Copy link
Member

@jni jni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschmidt83 I'm a bit confused about this code, though I don't doubt it's necessary. Would you mind explaining the error and the fix? And why removing those lines altogether would help?

@@ -30,9 +30,15 @@ def test_join_segmentations():
join_segmentations(s1, s3)


def _check_maps(ar, ar_relab, fw, inv):
assert_array_equal(fw[ar], ar_relab)
assert_array_equal(inv[ar_relab], ar)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh I like this. =)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines 135 to 138
if offset == 1 and np.all(labels0 == new_labels0):
if not (labels == 0).any():
labels = np.concatenate(([0], labels))
return label_field, labels, labels
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschmidt83 I'm actually confused about what's happening in this edge case. The test example has a lot going on so it's still not clear to me. Could you elaborate on the problem you're fixing and why it goes wrong in the current code?

Copy link
Member

@jni jni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschmidt83 great, you're absolutely right! I've made a minor change suggestion but I'm approving without it. Thanks!

new_labels0 = np.arange(offset, offset + len(labels0))
if np.all(labels0 == new_labels0):
return label_field, labels, labels
new_m = offset - 1 + len(labels0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this short for new_maximum or new_max_label? If so, I wouldn't mind this being the new name rather than new_m.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this short for new_maximum or new_max_label?

Yes. After relabeling, relabeled.max() == new_m.

If so, I wouldn't mind this being the new name rather than new_m.

I personally don't have an opinion on how to call this variable. I just chose the name because m is already used:

m = label_field.max()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschmidt83 who would write such code??? 😂 😂 😂

Well, our preference has evolved to more self-explanatory variable names, so I would support a renaming of m to max_label and new_m to new_max_label. Feel free to leave that one to us, though, as indeed it is an orthogonal change to your contribution. Thank you for this fix!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just changed the variable names as requested.

required_type = np.min_scalar_type(offset + len(labels0))
new_max_label = offset - 1 + len(labels0)
new_labels0 = np.arange(offset, int(new_max_label + 1))
required_type = np.min_scalar_type(new_max_label)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also changed another subtlety regarding required_type.
(It is amazing how much complexity can be in such a small function.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It is amazing how much complexity can be in such a small function.)

I know! Thank you again for your persistence here! It's a much better function thanks to your contributions!

Copy link
Member

@rfezzani rfezzani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, thank you @uschmidt83 ;-)

- Implicit (reasonable) assumption: the values for max_label and new_max_label fit into the 'int' data type.
- Avoid making a copy of 'label_field' if the output type has to be "upgraded".
Copy link
Member

@rfezzani rfezzani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again @uschmidt83 👏

@rfezzani rfezzani merged commit 88af67f into scikit-image:master Mar 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants