Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add a boundary tracing algorithm #4853

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

fbidu
Copy link

@fbidu fbidu commented Jul 18, 2020

Description

This is a work in progress implementation of the boundary tracing algorithm that started at SciPy sprints. It is not ready for review yet

Thread on Zulip

Might close #1131

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

@pep8speaks
Copy link

pep8speaks commented Jul 18, 2020

Hello @fbidu! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 131:80: E501 line too long (106 > 79 characters)

Line 4:80: E501 line too long (132 > 79 characters)
Line 40:1: W293 blank line contains whitespace
Line 45:1: W293 blank line contains whitespace
Line 93:80: E501 line too long (90 > 79 characters)
Line 102:80: E501 line too long (91 > 79 characters)
Line 106:1: W293 blank line contains whitespace
Line 135:1: E305 expected 2 blank lines after class or function definition, found 1
Line 138:80: E501 line too long (80 > 79 characters)

Line 7:80: E501 line too long (86 > 79 characters)
Line 26:80: E501 line too long (88 > 79 characters)
Line 53:75: E231 missing whitespace after ','

Line 25:80: E501 line too long (80 > 79 characters)

Comment last updated at 2020-08-22 12:57:08 UTC

@emmanuelle
Copy link
Member

Hi @fbidu great to see that you've made a lot of progress! Please tell us if you need some feedback now. You'll need to add a gallery example too since this is a new feature, probably in doc/examples/segmentation.

fbidu and others added 2 commits July 19, 2020 10:27
Co-authored-by: Emmanuelle Gouillart <emma@plot.ly>
Co-authored-by: Emmanuelle Gouillart <emma@plot.ly>
@fbidu
Copy link
Author

fbidu commented Jul 19, 2020

Hi @emmanuelle, I've just commited your suggestions. Thanks!

Currently, I'm still working on a new implementation of that function. The current one was a nice start and allowed me to write some tests but it needs some work to make it better. I'll be sure to add the examples later on and let you all know when I have something reviewable. Right know I'm still working on grokking the current behavior

At this point I'm just breaking apart some pieces of the original
function to add my notes after analyzing the current behavior
Copy link
Member

@rfezzani rfezzani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @fbidu, I left some comments. Is there any reference that we can cite to describe the used method?

skimage/measure/_boundary.py Show resolved Hide resolved
skimage/measure/_boundary.py Show resolved Hide resolved

return backtrack_start, start

backtrack_start, start = find_boundary_start()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as for the starting_point function, the find_boundary_start is called only once...

current = neighbors_ids[idx]
counter += 1

if np.all(current == start) and np.all(backtrack == backtrack_start):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not directly use this condition to stop the while loop instead of the while True?..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. This looks like a bottom controlled loop to me; moving it to the top would mean we never enter the body given the current initialization.

Copy link
Member

@rfezzani rfezzani Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You simply need to correctly initialize you variables to enter the loop 😉
I see, it's more complicated then it looks =p

# initilization
# starting point is the most upper left point
idx_start = 0
while True: # asserting that the starting point is not isolated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while True: # asserting that the starting point is not isolated
for start in zip(y, x):

Comment on lines +117 to +118
y = neighbors_ids[:, 0]
x = neighbors_ids[:, 1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
y = neighbors_ids[:, 0]
x = neighbors_ids[:, 1]
y, x = neighbors_ids.T

boundary.append(current)
backtrack = neighbors_ids[idx - 1]
current = neighbors_ids[idx]
counter += 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see counter used elsewhere...

Suggested change
counter += 1

raise RuntimeError("The backtrack is not on the neighborhood")


def trace_boundary(coords):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the function is not used for tracing, can we find an other name (region_boundary?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... or even object_boundary. "region" brings regionprops to mind while "object" fit better within the morphological context.

@fbidu
Copy link
Author

fbidu commented Aug 8, 2020

Hello everybody! Sorry for the delay, I didn't have much time in the last couple of weeks to work on this PR but I'll do more work today.

@rfezzani thanks for your review and you're right on the 'functions for single calls' and so on. What happened is that the code you're seeing right now was mostly copied from another project. The idea right now is to completely rewrite it in order to become more optimizable. Given the fact that I'm not very familiar with this particular algorithm, I did a careful line by line debugging of its behavior while taking notes. Since I knew it would be a few days before I could lay my hands in this code again, I did a small refactoring in order to isolate the main parts I identified and put my (paper-based) notes on the code so it wouldn't get lost.

Thank you all for the comments! Have a great weekend 😄

@lagru
Copy link
Member

lagru commented Aug 24, 2020

@fbidu Great that you are still working on this! I haven't forgotten you nevertheless this PR dropped a bit from my radar. I'll try to catch up till Friday.

Copy link
Contributor

@FirefoxMetzger FirefoxMetzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job so far! I think Moore's Boundary Tracing will be a nice addition to skimage. I left some comments on the current implementation.

import numpy as np

"""
Code copied from https://github.com/machine-shop/deepwings/blob/master/deepwings/method_features_extraction/image_processing.py#L185
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly checked out their repo; while it is open-source there is no LICENSE file and I didn't find any other information on the license. Are we allowed to simply copy-paste the code?

License questions always confuse the heck out of me 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this code was written by one of @stefanv's students so I expect that this won't be a problem. Nevertheless you're right, we should make sure that everyone involved is in agreement and happy. @theobdt is it alright if we reuse your code here? It would probably be easiest if you could license it with a BSD-compatible license or if you explicitly stated your consent here (in case we have it). 🙏

See this comment on zulip by @stefanv.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @lagru, yes I personally agree with this code being reused, as long as @stefanv is also ok with it. We also worked on an initial Cython version, which might be relevant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good with me!

Comment on lines +38 to +39
"""Coordinates of the region's boundary. The region must not have isolated
points.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Coordinates of the region's boundary. The region must not have isolated
points.
"""Coordinates of the region's boundary. The region must be simply connected.

This algorithm will get problems with more than merely having isolated points. We also can't have separate groups of points; i.e., for each two points in the region there must exist a path that connects them for which each point along the path is within the region. There is a mathematical term for this, but if forgot xD

On top, we will get issues if the region "has holes". For example, if you put in a donut, it can find the outer boundary, but will not touch the inner one. Not sure if we implicitly exclude such shapes. If not, we need simply connected regions (regions that aren't "split" and that have no holes).

boundary : 2D array
List of coordinates of pixels in the boundary
The first element is the most upper left pixel of the region.
The following coordinates are in clockwise order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now it is fine to default to clock-wise. Long-term, I think an additional parameter clockwise=True would be a nice addition.

Comment on lines +86 to +99
# initilization
# starting point is the most upper left point
idx_start = 0
while True: # asserting that the starting point is not isolated
start = [y[idx_start], x[idx_start]]

# Focus Start is a 3x3 matrix centered around start
focus_start = binary[start[0] - 1 : start[0] + 2, start[1] - 1 : start[1] + 2]

# If the current pixel is not the only one set to one in its
# 3x3 region, we stop
if np.sum(focus_start) > 1:
break
idx_start += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit unnecessary. We require that the region doesn't contain isolated points, or - if you agree with my suggestion - that it is simply connected. In this case we are guaranteed to find at least one neighbor that is also part of the region.

An exception (edge case) is if the region only has a single pixel, in which case your current implementation will throw an error because it runs out of bounds (either because of y[idx_start] or further down when using idx_start which now is idx_start=len(coords).

I would replace this with simply grabbing the coordinate with the lowest indices.

current = start
backtrack = backtrack_start
boundary = []
counter = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
counter = 0

Looks like counter isn't used anymore.

Comment on lines +117 to +119
y = neighbors_ids[:, 0]
x = neighbors_ids[:, 1]
neighbors = binary[tuple([y, x])]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You control _moore_neighborhood; wouldn't it make sense to return neighbors in a format you can directly use instead of having to first do conversion so that you can use it?

current = neighbors_ids[idx]
counter += 1

if np.all(current == start) and np.all(backtrack == backtrack_start):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. This looks like a bottom controlled loop to me; moving it to the top would mean we never enter the body given the current initialization.

if np.all(current == start) and np.all(backtrack == backtrack_start):
break

return np.array(boundary)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a quite opinion based comment: I would prefer the output would be of the same format as np.nonzero(). The same goes for measures.regionprops().coords, but I don't know what the members think about this (@lagru )?

@emmanuelle
Copy link
Member

Hi @fbidu how can we help you with this PR? I'd love to get this feature merged :-).

Base automatically changed from master to main February 18, 2021 18:23
@stefanv
Copy link
Member

stefanv commented Sep 27, 2022

Agreed that this would be a great addition! Please let us know if we can help you complete it.

@fbidu
Copy link
Author

fbidu commented Sep 28, 2022

Hi team! Oh, I forgot about this, sorry :( So I don't have much context here anymore but I can try to make some progress over the weekend 🤔

Any tips on how it would be best to move forward is appreciated 🙏

@lagru
Copy link
Member

lagru commented Sep 28, 2022

Hey @fbidu, no worries. We all did. 😅

Any tips on how it would be best to move forward is appreciated

@theobdt mentioned a initial Cython version in #4853 (comment). That might be a good place to compare. Otherwise a good place to start would probably be resolving the merge conflicts. After that get the test's working again if that's not the case. Otherwise, it looks like there are a good number of open comments that should probably be addressed. Let us know if you need help with that.

@@ -128,6 +128,7 @@
"registration/tests/data/TransformedX75Y75.png": "a1e9ead5f8e4a0f604271e1f9c50e89baf53f068f1d19fab2876af4938e695ea",
"data/cells.tif": "2120cfe08e0396324793a10a905c9bbcb64b117215eb63b2c24b643e1600c8c9",
"data/mitosis.tif": "2751ba667c4067c5d30817cff004aa06f6f6287f1cdbb5b8c9c6a500308cb456",
"data/boundary_tracing_tests.npz": "1b37154113ac3964d43ef4f1fe93f7377e85a29750a84487f8e101be388b017f",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we try to limit / reduce the number of data files we have in the repository itself. Especially if they are highly specific to a few tests. In case we decide this is necessary https://gitlab.com/scikit-image/data is probably a better place for this file. 🙂

Copy link
Member

@stefanv stefanv Sep 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, even better, re-use an existing image for testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracing the boundary curve of an image region
9 participants