What's the meaning of index list returned? #13

Closed
wenshuaizhao opened this issue Jun 9, 2019 · 4 comments
Labels
feature New feature or request

Comments

@wenshuaizhao

Hi, thank you for your great contribution!
I have a question about the index list that the function 'connected_components(data)' returns. Why is it not contiguous, such as [0, 1, 2, 3]? In practice it returns a list like [0, 210, 213, 220].

Also, in the multi-label case, say the input has labels [0, 1, 2] and the returned connected component labels are [0, 210, 213, 220]. How can we tell which original label each connected component belongs to?

Lastly, does this repository have any function to find out whether two connected components are neighbors?

@william-silversmith
Contributor

Hi Wenshuai!

It's possible to recover the labels using numpy.unique. The reason the ids are not contiguous is that the ids are assembled via merges in the union find algorithm. To get contiguous ids I'd have to renumber the array.
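As a minimal sketch of the numpy.unique approach: every component lies inside a single input label, so one sample position per component is enough to look up its original label. The arrays below are hand-written stand-ins (`labels` for the input, `cc_labels` for a component labeling with non-contiguous ids), not actual cc3d output, and the variable names are assumptions for illustration.

```python
import numpy as np

# Hypothetical multi-label input (0 treated as background).
labels = np.array([
    [1, 1, 0, 2],
    [0, 0, 0, 2],
    [1, 0, 2, 2],
])

# Hand-written stand-in for a component labeling with non-contiguous ids.
cc_labels = np.array([
    [210, 210, 0, 213],
    [0,   0,   0, 213],
    [220, 0, 213, 213],
])

# return_index gives the first flat position at which each id occurs.
component_ids, first_index = np.unique(cc_labels, return_index=True)

# Read the original label at that position for every component id.
flat_labels = labels.ravel()
component_to_label = {
    int(cid): int(flat_labels[idx])
    for cid, idx in zip(component_ids, first_index)
}
print(component_to_label)
```

Here the two separate regions of input label 1 end up as two distinct components (210 and 220) that both map back to 1.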

I didn't do this because I was stripping everything out to be as fast as possible, but that is probably a false economy, since performing the renumbering step is likely faster than calling np.unique. The fast way of doing it might use excessive memory, but I think I see a way of doing it with less on average.
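For older versions that return non-contiguous ids, a user-side renumbering can be sketched with np.unique and `return_inverse`. This is not the library's internal renumbering step (which, per the comment above, should be cheaper than np.unique); it only shows the equivalent result on a hand-written example array.

```python
import numpy as np

# Hand-written stand-in for a labeling with non-contiguous ids.
cc_labels = np.array([
    [210, 210, 0, 213],
    [0,   0,   0, 213],
    [220, 0, 213, 213],
], dtype=np.uint32)

# return_inverse maps every element to the rank of its id among the
# sorted unique ids, which is exactly a contiguous renumbering.
unique_ids, inverse = np.unique(cc_labels, return_inverse=True)

# reshape keeps this robust across NumPy versions that return the
# inverse either flattened or in the input's shape.
renumbered = inverse.reshape(cc_labels.shape).astype(cc_labels.dtype)
print(renumbered)
```

Because 0 is the smallest id, the background stays 0 and the components become 1, 2, 3 in order of their original ids.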

I like your idea of assembling a region graph from the image, my lab might find that useful too. I'll try googling around first but if there's nothing convenient out there I'm happy to accept pull requests or write it myself, but I'm pretty busy so it's hard to say when it will be available.

@william-silversmith william-silversmith added the feature New feature or request label Jun 10, 2019
william-silversmith added a commit that referenced this issue Jun 10, 2019
* wip: return renumbered array

Addresses #13

* test+fix: change diagonal test to reflect new numbering scheme

* refactor: remove unnecessary fastremap in testing
@william-silversmith
Contributor

@wenshuaizhao I just released version 1.2.0, which renumbers the array. There should be no memory penalty and no significant runtime penalty from this change. You can now use np.max(cc_labels) to get the number of labels in the array, which is significantly faster and uses less memory than np.unique.

It would be possible to return the max from the function and, for example, downsize the output datatype, but that requires me to change my C++ function to return multiple values and make a backwards-incompatible change to cc3d's return value. I might do it in the future with a major version increment.
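A short sketch of what this looks like on the caller's side, assuming the labeling is already contiguous (ids 1..N with 0 as background). The array is a hand-written example rather than real cc3d output, and the dtype downsizing is done here by the caller with np.min_scalar_type, not by the library.

```python
import numpy as np

# Hand-written contiguous labeling: background 0, components 1..3.
cc_labels = np.array([
    [1, 1, 0, 2],
    [0, 0, 0, 2],
    [3, 0, 2, 2],
], dtype=np.uint32)

# With contiguous ids, the maximum is the number of components;
# no sorting or extra array of unique values is needed.
num_components = int(np.max(cc_labels))

# Smallest unsigned integer type that can hold the largest id.
small_dtype = np.min_scalar_type(num_components)
compact = cc_labels.astype(small_dtype)

print(num_components, compact.dtype)
```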

@wenshuaizhao
Author

Thank you for your fast reply! I think I understand the indexing now. It is really great.

@william-silversmith
Contributor

william-silversmith commented Jun 11, 2019 via email

william-silversmith added a commit that referenced this issue Jun 13, 2019
* feat: very slow region graph function

Per #18 and #13, and with some internet research, I found that
options are somewhat limited for deriving a region graph from
a labeled image.

Here is a very slow initial implementation (600s for a 512 cube
vs 1s for connected_components). This can be improved, but at least
there's something.

* perf: 4x faster using cdef on neighbors

* docs: add docstring to region_graph and add declaration to connected_components

* test: basic test for region_graph

* docs: mentioned region_graph function
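For readers who want to see what deriving a region graph from a labeled image involves, here is a rough NumPy sketch. It is not the `region_graph` function added in the commits above, whose signature and connectivity are not shown in this thread. The function name `region_graph_sketch`, the face-adjacency-only rule, and the treatment of 0 as background are all assumptions made for illustration.

```python
import numpy as np

def region_graph_sketch(labels):
    """Return a set of (a, b) pairs with a < b for nonzero labels
    that touch face-to-face along any axis (hypothetical helper)."""
    edges = set()
    for axis in range(labels.ndim):
        # Pair every element with its next neighbor along this axis.
        moved = np.moveaxis(labels, axis, 0)
        lo = moved[:-1].ravel()
        hi = moved[1:].ravel()

        # Keep only pairs of different, non-background labels.
        mask = (lo != hi) & (lo != 0) & (hi != 0)
        if not mask.any():
            continue

        # Order each pair so (a, b) and (b, a) collapse to one edge.
        pairs = np.stack([lo[mask], hi[mask]], axis=1)
        pairs.sort(axis=1)
        for a, b in np.unique(pairs, axis=0):
            edges.add((int(a), int(b)))
    return edges

# Hand-written example labeling (0 is background).
labels = np.array([
    [1, 1, 2],
    [1, 0, 2],
    [3, 3, 2],
])
edges = region_graph_sketch(labels)
print(sorted(edges))
```

Two components are neighbors under this definition exactly when their ordered pair appears in the returned set. The same loop works unchanged on 3D arrays.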