Fix several issues with relabel_sequential #3740
Conversation
Hello @uschmidt83! Thanks for updating the PR. Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on February 11, 2019 at 08:29 UTC
```python
new_type = np.min_scalar_type(int(m))
label_field = label_field.astype(new_type)
m = m.astype(new_type)  # Ensures m is an integer
labels = np.unique(label_field)
labels0 = labels[labels != 0]
if m == len(labels0):  # nothing to do, already 1...n labels
    required_type = np.min_scalar_type(offset + len(labels0))
```
`min_scalar_type` is a little annoying: it can return unsigned types. I think I've personally come around to not liking the use of unsigned types unless you need specifically unsigned behavior, that is, unless `255 + 2 == 1` is something you want to be true in your math.

Thoughts?
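A quick illustration of both behaviours being discussed (a sketch with made-up values, not code from the PR):

```python
import numpy as np

# min_scalar_type prefers unsigned types for non-negative values
print(np.min_scalar_type(200))   # uint8
print(np.min_scalar_type(-200))  # int16

# unsigned arithmetic wraps around modulo 2**bits, so 255 + 2 == 1
a = np.array([255], dtype=np.uint8)
print(a + 2)  # [1]
```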
The whole `required_type` thing is really covering an edge case that I don't think will happen much in practice, unless someone purposefully uses restricted data types (especially `uint8`).

I frequently save label masks to disk as `uint16` TIFF files, hence they have this type when loaded again from disk. I like working with 16-bit integers for space reasons, which can be important for large 3D arrays.
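For a sense of scale (illustrative volume shape, not from the discussion), the per-voxel dtype makes a large difference for 3D label arrays:

```python
import numpy as np

shape = (1024, 1024, 256)  # a large-ish 3D label volume
voxels = np.prod(shape)

# memory footprint in GiB for uint16 vs int64 labels
print(voxels * np.dtype(np.uint16).itemsize / 2**30)  # 0.5
print(voxels * np.dtype(np.int64).itemsize / 2**30)   # 2.0
```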
Want to allow a `dtype` parameter to ensure `uint16`?
I think the new logic (use the larger of min_dtype and input dtype) is sufficient here.
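That "larger of the two" rule can be sketched as a small helper (hypothetical name, mirroring the itemsize check in the diff above):

```python
import numpy as np

def required_label_dtype(input_dtype, max_label_needed):
    # keep the input dtype unless the largest needed label doesn't fit,
    # in which case promote to the minimal dtype that can hold it
    required = np.min_scalar_type(max_label_needed)
    if np.dtype(required).itemsize > np.dtype(input_dtype).itemsize:
        return required
    return np.dtype(input_dtype)

print(required_label_dtype(np.uint16, 300))  # uint16 (already big enough)
print(required_label_dtype(np.uint8, 300))   # uint16 (promoted)
```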
Sure.
Co-Authored-By: uschmidt83 <uschmidt83@users.noreply.github.com>
```diff
@@ -114,20 +116,29 @@ def relabel_sequential(label_field, offset=1):
     >>> relab
     array([5, 5, 6, 6, 7, 9, 8])
     """
+    offset = int(offset)
+    if offset <= 0:
+        raise ValueError("Offset must be strictly positive.")
```
Nice one, yeah, the error message was nonsensical before:

```python
In [14]: relabel_sequential(np.asarray([2, 3, 4]), -1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-e5025e3e4264> in <module>
----> 1 relabel_sequential(np.asarray([2, 3, 4]), -1)

~/miniconda3/envs/owl/lib/python3.7/site-packages/skimage/segmentation/_join.py in relabel_sequential(label_field, offset)
    129     labels = np.concatenate(([0], labels))
    130     inverse_map = np.zeros(offset - 1 + len(labels), dtype=np.intp)
--> 131     inverse_map[(offset - 1):] = labels
    132     relabeled = forward_map[label_field]
    133     return relabeled, forward_map, inverse_map

ValueError: could not broadcast input array from shape (4) into shape (2)
```
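With the validation added in this PR, the failure becomes an explicit, readable error instead of a broadcasting mishap. A minimal sketch of the added check in isolation:

```python
def validate_offset(offset):
    # the guard added at the top of relabel_sequential in this PR
    offset = int(offset)
    if offset <= 0:
        raise ValueError("Offset must be strictly positive.")
    return offset

try:
    validate_offset(-1)
except ValueError as e:
    print(e)  # Offset must be strictly positive.
```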
👍
@uschmidt83 thank you, very nice! I have one nitpicky suggestion and one bigger one regarding negative values, but imho that can happen in a separate PR if necessary. This is already a significant improvement.
```python
ar = np.array([1, 3, 2, 5, 4])
ar_relab, fw, inv = relabel_sequential(ar, offset=offset)
ar_relab_ref = ar.copy()
ar_relab_ref[ar_relab_ref > 0] += offset - 1
```
Alternatively: `ar_relab_ref = np.where(ar > 0, ar + offset - 1, 0)`
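The two formulations agree (quick check with made-up values, including a background pixel):

```python
import numpy as np

ar = np.array([1, 3, 2, 5, 4, 0])
offset = 5

# in-place approach from the test
in_place = ar.copy()
in_place[in_place > 0] += offset - 1

# suggested np.where one-liner: shift foreground labels, keep background 0
one_liner = np.where(ar > 0, ar + offset - 1, 0)

print(np.array_equal(in_place, one_liner))  # True
```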
Sure, that's better.
Should I change and commit this?
Yeah go for it!
Done.
```diff
@@ -114,20 +116,29 @@ def relabel_sequential(label_field, offset=1):
     >>> relab
     array([5, 5, 6, 6, 7, 9, 8])
     """
+    offset = int(offset)
+    if offset <= 0:
+        raise ValueError("Offset must be strictly positive.")
```
👍
```python
if m == len(labels0):  # nothing to do, already 1...n labels
    required_type = np.min_scalar_type(offset + len(labels0))
    if np.dtype(required_type).itemsize > np.dtype(label_field.dtype).itemsize:
        label_field = label_field.astype(required_type)
```
Very nice.
```diff
     return label_field, labels, labels
-    forward_map = np.zeros(m + 1, int)
+    forward_map = np.zeros(int(m + 1), dtype=label_field.dtype)
     forward_map[labels0] = np.arange(offset, offset + len(labels0))
```
Btw, I replaced `m + 1` with `int(m + 1)` to fix an issue when `m` is of type `np.uint64`. Is this intended behavior or a NumPy bug?

```python
>>> (np.uint64(5) + 1).dtype
dtype('float64')
```
Yikes! I think I actually came across this, and the answer was that it is indeed intended, because (a) NumPy's type promotion rules do not depend on the input values, only on the types, and (b) they are supposed to be safe in the sense of being able to represent the result, and `float64` is the only thing that can represent anything alongside a Python int, since Python ints are unbounded.

At least, that's my memory of it. Perhaps @stefanv has more details. But, either way, thanks for the fix!
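Whatever the rationale, converting to a Python int first sidesteps the promotion entirely and keeps the arithmetic exact. (As an aside: the `float64` result reflects NumPy's legacy promotion rules; NumPy 2.0's NEP 50 rules later changed `uint64 + int` to stay `uint64`, so the float behavior is version-dependent.)

```python
import numpy as np

m = np.uint64(5)

# int(m) + 1 is an exact Python int regardless of NumPy version,
# which is why the PR uses int(m + 1)-style conversions
size = int(m) + 1
print(size)        # 6
print(type(size))  # <class 'int'>
```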
What is this sorcery!!!!
Looks like the argument Juan outlines is being followed here. Counter-intuitive, but probably correct? Same as `(np.uint8(3)/2).dtype`.
Co-Authored-By: uschmidt83 <uschmidt83@users.noreply.github.com>
```python
new_type = np.min_scalar_type(int(m))
label_field = label_field.astype(new_type)
m = m.astype(new_type)  # Ensures m is an integer
labels = np.unique(label_field)
labels0 = labels[labels != 0]
if m == len(labels0):  # nothing to do, already 1...n labels
```
Btw, the previous line

```python
if m == len(labels0):  # nothing to do, already 1...n labels
```

was only valid for `offset = 1`. This was only mentioned in the comment but not checked, i.e. it should've been

```python
if offset == 1 and m == len(labels0):  # nothing to do, already 1...n labels
```

Anyway, I guess hardly anyone uses `offset != 1`. Why was the previous function `relabel_from_one` replaced with `relabel_sequential`? (Since the `offset` introduces quite a bit of subtle complexity.)

Also, `relabel_sequential` assumes that the background has value 0, but I noticed `color.label2rgb` assumes (by default) that the background label has value -1.
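The offset pitfall can be seen in a minimal pure-NumPy sketch of sequential relabelling (not the actual skimage implementation; background assumed to be label 0):

```python
import numpy as np

def relabel_with_offset(label_field, offset=1):
    # map each nonzero label to offset, offset+1, ... via a lookup table
    labels = np.unique(label_field)
    labels0 = labels[labels != 0]
    forward = np.zeros(int(label_field.max()) + 1, dtype=np.intp)
    forward[labels0] = np.arange(offset, offset + len(labels0))
    return forward[label_field]

ar = np.array([0, 1, 2, 3])  # already sequential 1...n
print(relabel_with_offset(ar, offset=1))  # [0 1 2 3]
print(relabel_with_offset(ar, offset=5))  # [0 5 6 7]
```

With `offset=5` the output changes even though the input is already sequential, so an "already 1...n, nothing to do" shortcut that ignores `offset` would return the wrong result.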
> Why was the previous function `relabel_from_one` replaced with `relabel_sequential`?

Someone requested the functionality and it seemed like the right thing to do and an easy fix. Oops. =)

> Also, `relabel_sequential` assumes that the background has value 0, but I noticed `color.label2rgb` assumes (by default) that the background label has value -1.

Yes, historically, skimage used -1 as the background label, but we have slowly started homogenising to 0. But this will take time.
Good to know, thanks!
@scikit-image/core anyone else want to review this? The Travis failures are due to the Qt 5.12 bug that was fixed on master.

Anything else I can do to get this merged?
Thanks for the ping @uschmidt83!
Description

Fixes several issues with `skimage.segmentation.relabel_sequential` (see added tests).

Checklist

- Gallery example in `./doc/examples` (new features only)
- Benchmark in `./benchmarks`, if your changes aren't covered by an existing benchmark

For reviewers

- Check that the PR title is short, concise, and will make sense 1 year later.
- Check that new functions are imported in the corresponding `__init__.py`.
- Check that new features, API changes, and deprecations are mentioned in `doc/release/release_dev.rst`.

@meeseeksdev backport to v0.14.x