Fix several issues with relabel_sequential #3740
Changes from all commits: cd934f0, 5307a3c, 995cfca, 16c771e, eb2d299, f1c05f2
```diff
@@ -62,7 +62,7 @@ def relabel_sequential(label_field, offset=1):
     Parameters
     ----------
     label_field : numpy array of int, arbitrary shape
-        An array of labels.
+        An array of labels, which must be non-negative integers.
     offset : int, optional
         The return labels will start at `offset`, which should be
         strictly positive.
```
```diff
@@ -72,14 +72,16 @@ def relabel_sequential(label_field, offset=1):
     relabeled : numpy array of int, same shape as `label_field`
         The input label field with labels mapped to
         {offset, ..., number_of_labels + offset - 1}.
+        The data type will be the same as `label_field`, except when
+        offset + number_of_labels causes overflow of the current data type.
     forward_map : numpy array of int, shape ``(label_field.max() + 1,)``
         The map from the original label space to the returned label
         space. Can be used to re-apply the same mapping. See examples
-        for usage.
+        for usage. The data type will be the same as `relabeled`.
     inverse_map : 1D numpy array of int, of length offset + number of labels
         The map from the new label space to the original space. This
         can be used to reconstruct the original label field from the
-        relabeled one.
+        relabeled one. The data type will be the same as `relabeled`.

     Notes
     -----
```
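To illustrate the overflow caveat added to the docstring, here is a small standalone sketch (hypothetical numbers, plain NumPy rather than skimage code) of when the output must be widened beyond the input's dtype:

```python
import numpy as np

# Hypothetical example: a uint8 label field where offset + number_of_labels
# exceeds 255, so the relabeled output cannot keep the input dtype.
label_field_dtype = np.uint8
offset, n_labels = 200, 100
required_type = np.min_scalar_type(offset + n_labels)  # 300 needs uint16
if np.dtype(required_type).itemsize > np.dtype(label_field_dtype).itemsize:
    out_dtype = np.dtype(required_type)   # widen to fit the new labels
else:
    out_dtype = np.dtype(label_field_dtype)
print(out_dtype)  # uint16
```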
```diff
@@ -114,20 +116,29 @@ def relabel_sequential(label_field, offset=1):
     >>> relab
     array([5, 5, 6, 6, 7, 9, 8])
     """
     offset = int(offset)
     if offset <= 0:
         raise ValueError("Offset must be strictly positive.")
     if np.min(label_field) < 0:
         raise ValueError("Cannot relabel array that contains negative values.")
     m = label_field.max()
-    if not np.issubdtype(label_field.dtype, np.signedinteger):
+    if not np.issubdtype(label_field.dtype, np.integer):
         new_type = np.min_scalar_type(int(m))
         label_field = label_field.astype(new_type)
         m = m.astype(new_type)  # Ensures m is an integer
     labels = np.unique(label_field)
     labels0 = labels[labels != 0]
-    if m == len(labels0):  # nothing to do, already 1...n labels
```

> **Review comment:** Btw, the previous line `if m == len(labels0):  # nothing to do, already 1...n labels` was only valid as `if offset == 1 and m == len(labels0):`. Anyway, I guess hardly anyone uses … Also, …
>
> **Reply:** Someone requested the functionality and it seemed like the right thing to do and an easy fix. Oops. =)
>
> **Reply:** Yes, historically, skimage used -1 as the background label, but we have slowly started homogenising to 0. But this will take time.
>
> **Reply:** Good to know, thanks!
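The reviewer's point about the old shortcut can be seen in a small NumPy sketch (hypothetical values, not skimage code): labels that are already `1...n` still need relabeling when `offset != 1`, so `m == len(labels0)` alone is not a safe early exit.

```python
import numpy as np

# Labels are already the sequential set {1, 2, 3}, so m == len(labels0)
# holds, but with offset=2 the desired output labels are {2, 3, 4}:
# returning the input unchanged would be wrong.
label_field = np.array([1, 1, 2, 3])
m = label_field.max()
labels = np.unique(label_field)
labels0 = labels[labels != 0]

offset = 2
already_one_to_n = bool(m == len(labels0))
matches_offset = np.array_equal(labels0, np.arange(offset, offset + len(labels0)))
print(already_one_to_n, matches_offset)  # True False
```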
```diff
+    required_type = np.min_scalar_type(offset + len(labels0))
```

> **Review comment:** `min_scalar_type` is a little annoying. It can return unsigned types. I think I've personally come around to not liking the use of "unsigned" unless you need specific "unsigned" behavior, that is, … Thoughts?
>
> **Reply:** The whole … I frequently save label masks to disk as …
>
> **Reply:** Want to allow a dtype parameter to ensure uint16???
>
> **Reply:** I think the new logic (use the larger of min_dtype and input dtype) is sufficient here.
>
> **Reply:** Sure.
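For reference, a quick sketch of the `np.min_scalar_type` behavior the thread is discussing: it returns the smallest dtype able to hold the value, which is an unsigned type for non-negative inputs and a signed type only for negative inputs.

```python
import numpy as np

# min_scalar_type picks the narrowest fitting dtype, preferring unsigned
# for non-negative scalars.
print(np.min_scalar_type(255))  # uint8
print(np.min_scalar_type(256))  # uint16
print(np.min_scalar_type(-1))   # int8
```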
```diff
+    if np.dtype(required_type).itemsize > np.dtype(label_field.dtype).itemsize:
+        label_field = label_field.astype(required_type)
```

> **Review comment:** Very nice.
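The "larger of the two dtypes" rule from this hunk can be sketched in isolation (hypothetical values, plain NumPy): widen only when the required dtype is strictly larger than the input's, so an already-wide input is never downcast.

```python
import numpy as np

# An int64 input stays int64 even though uint16 would suffice for the
# new labels: the itemsize comparison only ever widens, never narrows.
label_field = np.zeros(4, dtype=np.int64)
required_type = np.min_scalar_type(300)  # uint16
if np.dtype(required_type).itemsize > np.dtype(label_field.dtype).itemsize:
    label_field = label_field.astype(required_type)
print(label_field.dtype)  # int64
```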
```diff
+    new_labels0 = np.arange(offset, offset + len(labels0))
+    if np.all(labels0 == new_labels0):
+        return label_field, labels, labels
-    forward_map = np.zeros(m + 1, int)
+    forward_map = np.zeros(int(m + 1), dtype=label_field.dtype)
```

> **Review comment:** Btw, I replaced `m + 1` with `int(m + 1)`. Is this intended behavior or a numpy bug?
>
> ```python
> >>> (np.uint64(5) + 1).dtype
> dtype('float64')
> ```
>
> **Reply:** Yikes! I think I actually came across this and the answer was that it is indeed intended, because (a) NumPy's type promotion rules do not depend on the input values, only on the types, and (b) they are supposed to be safe in the sense of being able to represent the result, and … At least, that's my memory of it. Perhaps @stefanv has more details. But, either way, thanks for the fix!
>
> **Reply:** What is this sorcery!!!!
>
> **Reply:** Looks like the argument Juan outlines is being followed here. Counter-intuitive, but probably correct? Same as …
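A sketch of the promotion rule behind the surprise above: combining `uint64` with a signed integer type promotes to `float64`, since no integer dtype can represent both full ranges. Converting to a plain Python `int` first, as the patch does with `int(m + 1)`, sidesteps the issue.

```python
import numpy as np

# Type promotion depends only on the dtypes involved, not the values:
# uint64 mixed with int64 has no common integer dtype, so NumPy falls
# back to float64.
print(np.result_type(np.uint64, np.int64))  # float64

m = np.uint64(5)
size = int(m) + 1  # a plain Python int (6), safe as an array length
print(type(size))
```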
```diff
+    forward_map[labels0] = new_labels0
     if not (labels == 0).any():
         labels = np.concatenate(([0], labels))
-    inverse_map = np.zeros(offset - 1 + len(labels), dtype=np.intp)
+    inverse_map = np.zeros(offset - 1 + len(labels), dtype=label_field.dtype)
     inverse_map[(offset - 1):] = labels
     relabeled = forward_map[label_field]
     return relabeled, forward_map, inverse_map
```
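A pure-NumPy round-trip sketch of the maps built in this hunk (mirroring, not reusing, skimage's implementation), using the label field from the docstring example:

```python
import numpy as np

# Build forward and inverse lookup tables for sequential relabeling,
# then verify the inverse map recovers the original labels.
label_field = np.array([1, 1, 5, 5, 8, 99, 42])
offset = 5
labels = np.unique(label_field)
labels0 = labels[labels != 0]
if not (labels == 0).any():
    labels = np.concatenate(([0], labels))

forward_map = np.zeros(label_field.max() + 1, dtype=np.intp)
forward_map[labels0] = np.arange(offset, offset + len(labels0))
inverse_map = np.zeros(offset - 1 + len(labels), dtype=np.intp)
inverse_map[(offset - 1):] = labels

relabeled = forward_map[label_field]   # apply the mapping by fancy indexing
restored = inverse_map[relabeled]      # map back to the original labels
print(relabeled)                       # [5 5 6 6 7 9 8]
print(np.array_equal(restored, label_field))  # True
```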
> **Review comment:** Nice one, yeah the error message was nonsensical before.
>
> **Reply:** 👍