consistently use uint32 in candidate detection #388
Merged
Description
What is this PR
Why is this PR needed?
Structure detection currently fails silently/crashes when there are more than 2**16-1 temporary connected components.
What does this PR do?
Ensures cell candidate detection uses unsigned 32-bit integers (note this may cause a minor increase in memory usage, but will avoid cellfinder hanging/crashing when there are more than 2**16-1 cells). I've documented some additional bits along the way.
References
Closes #383 and https://forum.image.sc/t/append-error-with-cellfinder/88634/9
Related to #389
There are a few improvements that could be made, see #387 and #386 but it's more critical that we fix this bug quickly now.
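To illustrate the 2**16-1 limit this PR addresses, here is a minimal sketch of how a label counter stored in a 16-bit unsigned integer wraps around, and why a 32-bit one does not at the same count. This is not cellfinder's code: `next_label` is a hypothetical helper, and the standard-library `ctypes` integer types stand in for the fixed-width label type used in candidate detection.

```python
import ctypes


def next_label(current, ctype):
    """Increment a label stored in a fixed-width unsigned integer.

    Hypothetical helper for illustration only. ctypes integer types do no
    overflow checking, so the value wraps modulo 2**bits, as a fixed-width
    counter would.
    """
    return ctype(current + 1).value


MAX_UINT16 = 2**16 - 1  # 65535, the largest value a uint16 can hold

# One more label than uint16 can represent: the counter wraps around.
wrapped = next_label(MAX_UINT16, ctypes.c_uint16)

# The same increment with uint32 keeps counting.
widened = next_label(MAX_UINT16, ctypes.c_uint32)

# Storage cost per label, in bytes, for each width.
bytes_uint16 = ctypes.sizeof(ctypes.c_uint16)
bytes_uint32 = ctypes.sizeof(ctypes.c_uint32)

print(wrapped, widened, bytes_uint16, bytes_uint32)
```

A wrapped counter would reuse values already assigned to earlier components, so distinct structures could no longer be told apart. The wider type costs twice the storage per label, which is the memory trade-off mentioned above.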
How has this PR been tested?
Tested on user data provided by https://forum.image.sc/t/append-error-with-cellfinder/88634/9 and an internal MSc student, on the internal HPC system. cellfinder times out with the old code, and completes successfully with reasonable-looking cell candidates when using the new code.
Two external users not associated with the core BrainGlobe team have tested this development branch and have confirmed this fixes their problems.
I have compared the "old" (pre-numba) code and this development branch on the internal MSc student data, and found the below
In summary, for real-life data, this development branch basically preserves all but 7 cell candidates (out of 50000+) found pre-numba, and finds 147 additional ones. I would say this is good enough.
I have added a test for the key function in #385.
Is this a breaking change?
No
Does this PR require an update to the documentation?
No
Checklist: