New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix edgecase bugs in segmentation.relabel_sequential #4465
Fix edgecase bugs in segmentation.relabel_sequential #4465
Conversation
Hello @uschmidt83! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2020-02-19 16:40:02 UTC |
skimage/segmentation/_join.py
Outdated
if offset == 1 and np.all(labels0 == new_labels0): | ||
if not (labels == 0).any(): | ||
labels = np.concatenate(([0], labels)) | ||
return label_field, labels, labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be best to not handle this special case separately and simply remove these lines of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschmidt83 I'm actually confused about what's happening in this edge case. The test example has a lot going on so it's still not clear to me. Could you elaborate on the problem you're fixing and why it goes wrong in the current code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jni, these lines intended to check if label_field
already contains sequential labels, hence there would be nothing to do. Unfortunately, the logic only works if offset == 1
and label_field
contains the 0 label. Hence, I added these things.
PS: I was actually encountering this bug in a real example, I didn't go on some esoteric bug hunt ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check was probably intended to avoid some computation, but it is unnecessary.
I just pushed a commit that (IMO) simplifies the function and makes it easier to understand and maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschmidt83 I'm a bit confused about this code, though I don't doubt it's necessary. Would you mind explaining the error and the fix? And why removing those lines altogether would help?
@@ -30,9 +30,15 @@ def test_join_segmentations(): | |||
join_segmentations(s1, s3) | |||
|
|||
|
|||
def _check_maps(ar, ar_relab, fw, inv): | |||
assert_array_equal(fw[ar], ar_relab) | |||
assert_array_equal(inv[ar_relab], ar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh I like this. =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
skimage/segmentation/_join.py
Outdated
if offset == 1 and np.all(labels0 == new_labels0): | ||
if not (labels == 0).any(): | ||
labels = np.concatenate(([0], labels)) | ||
return label_field, labels, labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschmidt83 I'm actually confused about what's happening in this edge case. The test example has a lot going on so it's still not clear to me. Could you elaborate on the problem you're fixing and why it goes wrong in the current code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschmidt83 great, you're absolutely right! I've made a minor change suggestion but I'm approving without it. Thanks!
skimage/segmentation/_join.py
Outdated
new_labels0 = np.arange(offset, offset + len(labels0)) | ||
if np.all(labels0 == new_labels0): | ||
return label_field, labels, labels | ||
new_m = offset - 1 + len(labels0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this short for new_maximum
or new_max_label
? If so, I wouldn't mind this being the new name rather than new_m
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this short for
new_maximum
ornew_max_label
?
Yes. After relabeling, relabeled.max() == new_m
.
If so, I wouldn't mind this being the new name rather than
new_m
.
I personally don't have an opinion on how to call this variable. I just chose the name because m
is already used:
scikit-image/skimage/segmentation/_join.py
Line 124 in 7d70b5e
m = label_field.max() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uschmidt83 who would write such code??? 😂 😂 😂
Well, our preference has evolved to more self-explanatory variable names, so I would support a renaming of m
to max_label
and new_m
to new_max_label
. Feel free to leave that one to us, though, as indeed it is an orthogonal change to your contribution. Thank you for this fix!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just changed the variable names as requested.
required_type = np.min_scalar_type(offset + len(labels0)) | ||
new_max_label = offset - 1 + len(labels0) | ||
new_labels0 = np.arange(offset, int(new_max_label + 1)) | ||
required_type = np.min_scalar_type(new_max_label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also changed another subtlety regarding required_type
.
(It is amazing how much complexity can be in such a small function.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It is amazing how much complexity can be in such a small function.)
I know! Thank you again for your persistence here! It's a much better function thanks to your contributions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job, thank you @uschmidt83 ;-)
- Implicit (reasonable) assumption: the values for max_label and new_max_label fit into the 'int' data type. - Avoid making a copy of 'label_field' if the output type has to be "upgraded".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you again @uschmidt83 👏
Description
segmentation.relabel_sequential
(see modified tests).Checklist
./doc/examples
(new features only)./benchmarks
, if your changes aren't covered by anexisting benchmark
For reviewers
later.
__init__.py
.doc/release/release_dev.rst
.@meeseeksdev backport to v0.14.x