Fix several issues with relabel_sequential #3740

Merged
merged 6 commits into from Apr 4, 2019

uschmidt83
Contributor

Description

  • Fixing several issues with skimage.segmentation.relabel_sequential (see added tests).
  • Slightly changing the semantics of the function by providing data type stability (if possible).

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.
  • Consider backporting the PR with @meeseeksdev backport to v0.14.x

@pep8speaks

pep8speaks commented Feb 10, 2019

Hello @uschmidt83! Thanks for updating the PR.

Cheers! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 11, 2019 at 08:29 Hours UTC

@uschmidt83 uschmidt83 changed the title Fixing several issues with relabel_sequential Fix several issues with relabel_sequential Feb 10, 2019
new_type = np.min_scalar_type(int(m))
label_field = label_field.astype(new_type)
m = m.astype(new_type) # Ensures m is an integer
labels = np.unique(label_field)
labels0 = labels[labels != 0]
if m == len(labels0): # nothing to do, already 1...n labels
required_type = np.min_scalar_type(offset + len(labels0))
Member

min_scalar_type is a little annoying. It can return unsigned types. I think I've personally come around to not liking the use of "unsigned" unless you need specific "unsigned" behavior, that is, 255+2 == 1 is something you want to be true in your math.

Thoughts?
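
A minimal sketch of the concern raised here, assuming only NumPy (the variable names are illustrative and not taken from the PR): `np.min_scalar_type` picks the smallest dtype that can hold a value, which is an unsigned type for non-negative values, and unsigned array arithmetic wraps around silently.

```python
import numpy as np

# min_scalar_type returns the smallest dtype able to hold the value;
# for non-negative values that is an unsigned type.
small = np.min_scalar_type(255)    # uint8
larger = np.min_scalar_type(256)   # uint16
signed = np.min_scalar_type(-1)    # int8

# Unsigned array arithmetic wraps around modulo 2**8 for uint8,
# so 255 + 2 becomes 1.
a = np.array([255], dtype=np.uint8)
b = np.array([2], dtype=np.uint8)
wrapped = a + b

print(small, larger, signed, wrapped)
```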

Contributor Author

The whole required_type thing is really covering an edge case that I don't think will happen much in practice, unless someone purposefully uses restricted data types (especially uint8).

I frequently save label masks to disk as uint16 TIFF files. Hence, they have this type when loaded again from disk. I like working with 16-bit integers for space reasons, which can be important for large 3D arrays.

Member

Want to allow a dtype parameter to ensure uint16???

Member

I think the new logic (use the larger of min_dtype and input dtype) is sufficient here.
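
The "larger of min_dtype and input dtype" rule could be sketched roughly as follows. This mirrors the item-size comparison in the diff under review; the helper name `widen_if_needed` is hypothetical and exists only for illustration.

```python
import numpy as np

def widen_if_needed(label_field, offset=1):
    """Hypothetical helper: keep the input dtype unless it is too small."""
    # Count the distinct nonzero labels (0 is treated as background).
    n_labels = np.count_nonzero(np.unique(label_field))
    # Smallest dtype that can hold the largest value the relabeling may need.
    required = np.min_scalar_type(offset + n_labels)
    if np.dtype(required).itemsize > label_field.dtype.itemsize:
        return label_field.astype(required)
    return label_field

# A uint16 mask loaded from a TIFF keeps its type when the labels fit.
mask16 = np.array([0, 1, 2], dtype=np.uint16)
kept = widen_if_needed(mask16)

# A uint8 mask is widened only when the offset pushes labels past 255.
mask8 = np.array([0, 1, 2], dtype=np.uint8)
widened = widen_if_needed(mask8, offset=300)
```

Under this rule, data loaded as uint16 stays uint16 unless the relabeled values need more room, which appears to be the dtype stability the PR description refers to.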

Member

Sure.

Co-Authored-By: uschmidt83 <uschmidt83@users.noreply.github.com>
@@ -114,20 +116,29 @@ def relabel_sequential(label_field, offset=1):
>>> relab
array([5, 5, 6, 6, 7, 9, 8])
"""
offset = int(offset)
if offset <= 0:
raise ValueError("Offset must be strictly positive.")
Member

Nice one, yeah, the error message was nonsensical before:

In [14]: relabel_sequential(np.asarray([2, 3, 4]), -1)                                                  
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-e5025e3e4264> in <module>
----> 1 relabel_sequential(np.asarray([2, 3, 4]), -1)

~/miniconda3/envs/owl/lib/python3.7/site-packages/skimage/segmentation/_join.py in relabel_sequential(label_field, offset)
    129         labels = np.concatenate(([0], labels))
    130     inverse_map = np.zeros(offset - 1 + len(labels), dtype=np.intp)
--> 131     inverse_map[(offset - 1):] = labels
    132     relabeled = forward_map[label_field]
    133     return relabeled, forward_map, inverse_map

ValueError: could not broadcast input array from shape (4) into shape (2)
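
A short sketch of why the old code failed with a shape mismatch, and what the new check does instead. It assumes only NumPy; `check_offset` is a hypothetical stand-in for the validation lines in the diff, not a function in scikit-image.

```python
import numpy as np

def check_offset(offset):
    """Hypothetical stand-in for the validation added in this PR."""
    offset = int(offset)
    if offset <= 0:
        raise ValueError("Offset must be strictly positive.")
    return offset

# Reproduce the shapes from the traceback with offset = -1.
labels = np.array([0, 2, 3, 4])   # unique labels with 0 prepended
offset = -1
inverse_map = np.zeros(offset - 1 + len(labels), dtype=np.intp)  # length 2
# The slice [-2:] has shape (2,), but labels has shape (4,).
target_shape = inverse_map[(offset - 1):].shape

# With the explicit check, the caller gets a meaningful message instead.
try:
    check_offset(offset)
    message = None
except ValueError as exc:
    message = str(exc)
```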

Member

👍

Member

@jni jni left a comment

@uschmidt83 thank you, very nice! I have one nitpicky suggestion and one bigger one regarding negative values, but imho that can happen in a separate PR if necessary. This is already a significant improvement.

skimage/segmentation/tests/test_join.py
ar = np.array([1, 3, 2, 5, 4])
ar_relab, fw, inv = relabel_sequential(ar, offset=offset)
ar_relab_ref = ar.copy()
ar_relab_ref[ar_relab_ref > 0] += offset - 1
Member

alternatively, ar_relab_ref = np.where(ar > 0, ar + offset - 1, 0)
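
For comparison, a small sketch showing that the two ways of building the reference array agree. It assumes only NumPy, and the value `offset = 5` is chosen here for illustration.

```python
import numpy as np

offset = 5
ar = np.array([1, 3, 2, 5, 4])

# In-place version from the test under review.
ref_inplace = ar.copy()
ref_inplace[ref_inplace > 0] += offset - 1

# Suggested one-liner: shift positive labels, leave everything else at 0.
ref_where = np.where(ar > 0, ar + offset - 1, 0)

print(ref_inplace, ref_where)
```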

Contributor Author

Sure, that's better.

Contributor Author

Should I change and commit this?

Member

Yeah go for it!

Contributor Author

Done.

skimage/segmentation/_join.py
if m == len(labels0): # nothing to do, already 1...n labels
required_type = np.min_scalar_type(offset + len(labels0))
if np.dtype(required_type).itemsize > np.dtype(label_field.dtype).itemsize:
label_field = label_field.astype(required_type)
Member

Very nice.

return label_field, labels, labels
forward_map = np.zeros(m + 1, int)
forward_map[labels0] = np.arange(offset, offset + len(labels0))
forward_map = np.zeros(int(m + 1), dtype=label_field.dtype)
Contributor Author

Btw, I replaced m + 1 with int(m + 1) to fix an issue when m is of type np.uint64.

Is this intended behavior or a numpy bug?

>>> (np.uint64(5) + 1).dtype
dtype('float64')

Member

yikes! I think actually I came across this and the answer was that it is indeed intended, because (a) NumPy's type promotion rules do not depend on the input values, only on the types, and (b) they are supposed to be safe in the sense of being able to represent the result, and float64 is the only thing that can represent anything with a Python int, since they are unbounded.

At least, that's my memory of it. Perhaps @stefanv has more details. But, either way, thanks for the fix!

Member

What is this sorcery!!!!

Member

Looks like the argument Juan outlines is being followed here. Counter-intuitive, but probably correct? Same as (np.uint8(3)/2).dtype.
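
A hedged sketch of the promotion rule discussed in this thread, assuming only NumPy. No integer dtype covers the full range of both uint64 and int64, so their common type is float64. What a scalar expression such as `np.uint64(5) + 1` returns may depend on the NumPy version, so the sketch converts to a Python int before adding. That is a variant of the `int(m + 1)` fix in the diff, not the PR's exact code.

```python
import numpy as np

# Type-based promotion: uint64 combined with int64 falls back to float64.
common = np.promote_types(np.uint64, np.int64)

m = np.uint64(5)
# Convert first, then add: plain Python int arithmetic, never a float.
size = int(m) + 1
forward_map = np.zeros(size, dtype=np.uint64)

print(common, size, forward_map.shape)
```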

uschmidt83 and others added 2 commits February 11, 2019 09:28
Co-Authored-By: uschmidt83 <uschmidt83@users.noreply.github.com>
new_type = np.min_scalar_type(int(m))
label_field = label_field.astype(new_type)
m = m.astype(new_type) # Ensures m is an integer
labels = np.unique(label_field)
labels0 = labels[labels != 0]
if m == len(labels0): # nothing to do, already 1...n labels
Contributor Author

Btw, the previous line

if m == len(labels0):  # nothing to do, already 1...n labels

was only valid for offset = 1. This was only mentioned in the comment but not checked, i.e. it should've been

if offset == 1 and m == len(labels0):  # nothing to do, already 1...n labels

Anyway, I guess hardly anyone uses offset != 1. Why was the previous function relabel_from_one replaced with relabel_sequential? (Since the offset introduces quite a bit of (subtle) complexity.)

Also, relabel_sequential assumes that the background has value 0, but I noticed color.label2rgb assumes (by default) that the background label has value -1.
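
The early-exit problem described in this comment can be illustrated with a small sketch. It assumes only NumPy; `old_shortcut` and `fixed_shortcut` are hypothetical names that isolate the two conditions quoted above, not functions in scikit-image.

```python
import numpy as np

def old_shortcut(label_field):
    """Old condition: max label equals the number of nonzero labels."""
    labels = np.unique(label_field)
    labels0 = labels[labels != 0]
    return int(label_field.max()) == len(labels0)

def fixed_shortcut(label_field, offset):
    """Condition suggested in the comment: only valid when offset is 1."""
    return offset == 1 and old_shortcut(label_field)

ar = np.array([1, 1, 2, 3])

# The old check ignores the offset, so with offset=5 the input would be
# returned unchanged instead of being relabeled to start at 5.
old_result = old_shortcut(ar)
fixed_with_offset_5 = fixed_shortcut(ar, offset=5)
fixed_with_offset_1 = fixed_shortcut(ar, offset=1)
```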

Member

> Why was the previous function relabel_from_one replaced with relabel_sequential?

Someone requested the functionality and it seemed like the right thing to do and an easy fix. Oops. =)

> Also, relabel_sequential assumes that the background has value 0, but I noticed color.label2rgb assumes (by default) that the background label has value -1.

Yes, historically, skimage used -1 as the background label, but we have slowly started homogenising to 0. But this will take time.

Contributor Author

Good to know, thanks!

@jni
Member

jni commented Feb 15, 2019

@scikit-image/core anyone else want to review this? The Travis failures are due to the Qt 5.12 bug that was fixed on master.

@uschmidt83
Contributor Author

Anything else I can do to get this merged?

@stefanv stefanv merged commit fe94ee9 into scikit-image:master Apr 4, 2019
@jni
Member

jni commented Apr 4, 2019

Thanks for the ping @uschmidt83!
