consistently use uint32 in candidate detection #388

alessandrofelder · 2024-02-16T14:32:39Z

Description

What is this PR

Bug fix
Addition of a new feature
Other

Why is this PR needed?
Structure detection currently fails silently/crashes when there are more than 2**16-1 temporary connected components.

What does this PR do?

Ensures cell candidate detection uses unsigned 32-bit integers (note this may cause a minor increase in memory usage, but will avoid cellfinder hanging/crashing when there are more than 2**16-1 cells). I've documented some additional bits along the way.
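To illustrate the failure mode described above, here is a minimal NumPy sketch. It assumes the old limit came from storing component IDs in a 16-bit unsigned type; the variable names are hypothetical and this is not cellfinder's actual code.

```python
import numpy as np

# Largest value a 16-bit unsigned ID can hold: 2**16 - 1
max_uint16 = np.iinfo(np.uint16).max

# Hypothetical ID counter stored as uint16: incrementing past the
# maximum wraps around to 0 without raising an error.
ids16 = np.array([max_uint16], dtype=np.uint16)
ids16 += np.uint16(1)
wrapped = int(ids16[0])

# The same counter stored as uint32 keeps counting.
ids32 = np.array([max_uint16], dtype=np.uint32)
ids32 += np.uint32(1)
not_wrapped = int(ids32[0])

# The cost: each stored ID takes 4 bytes instead of 2.
shape = (64, 64)
bytes16 = np.zeros(shape, dtype=np.uint16).nbytes
bytes32 = np.zeros(shape, dtype=np.uint32).nbytes

print(wrapped, not_wrapped, bytes16, bytes32)
```

The silent wrap-around is why an overflow of this kind can show up as a hang or a confusing downstream error rather than an immediate exception.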

References

Closes #383 and https://forum.image.sc/t/append-error-with-cellfinder/88634/9
Related to #389
There are a few improvements that could be made (see #387 and #386), but it's more critical that we fix this bug quickly now.

How has this PR been tested?

Tested on user data provided by https://forum.image.sc/t/append-error-with-cellfinder/88634/9 and an internal MSc student, on the internal HPC system. cellfinder times out with the old code, and completes successfully with reasonable-looking cell candidates when using the new code.

Two external users not associated with the core BrainGlobe team have tested this development branch and have confirmed this fixes their problems.

I have compared the "old" (pre-numba) code and this development branch on the internal MSc student data, and found the following:

| | pre-numba | this dev branch |
|---|---|---|
| total candidates found | 52502 | 52649 |
| candidates not present in other | 5977 | 6124 |
| candidates with nearest candidate in other > 2 microns away | 147 | 7 |

In summary, for real-life data, this development branch basically preserves all but 7 cell candidates (out of 50000+) found pre-numba, and finds 147 additional ones. I would say this is good enough.
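For readers who want to reproduce this kind of comparison, the "nearest candidate in other > 2 microns away" count could be computed roughly as follows. This is a sketch under assumptions: the function name and toy coordinates are hypothetical, coordinates are taken to be in microns already, and the PR does not state how the original comparison was done.

```python
import numpy as np


def count_unmatched(candidates, reference, threshold):
    """Count points in `candidates` whose nearest point in `reference`
    is farther away than `threshold` (same units as the coordinates)."""
    candidates = np.asarray(candidates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Brute-force pairwise distances, shape (n_candidates, n_reference)
    diffs = candidates[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    nearest = dists.min(axis=1)
    return int(np.count_nonzero(nearest > threshold))


# Toy data standing in for the two sets of detected candidates
old = np.array([[0, 0, 0], [10, 10, 10], [50, 50, 50]])
new = np.array([[0, 0, 1], [10, 10, 10], [20, 20, 20], [30, 30, 30]])

unmatched_old = count_unmatched(old, new, threshold=2.0)
unmatched_new = count_unmatched(new, old, threshold=2.0)
print(unmatched_old, unmatched_new)
```

The brute-force distance matrix grows with the product of the two set sizes, so for tens of thousands of candidates a spatial index such as `scipy.spatial.cKDTree` would likely be a better fit.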

I have added a test for the key function in #385 .

Is this a breaking change?

No

Does this PR require an update to the documentation?

No

Checklist:

The code has been tested locally (manually)
Tests have been added to cover all new functionality (unit & integration)
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

willGraham01

🥳 Fingers crossed we never encounter any data with 2**32-1 cells in it 😓 Though I did think of some ways that we can adapt the struct_id memory usage given the number of candidates.

Also, we should release a new version after merging this right?
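One possible shape for the idea of adapting the ID memory usage to the number of candidates, as a minimal sketch only: the helper name `smallest_id_dtype` is hypothetical, and this assumes an upper bound on the number of IDs is known before the array is allocated, which may not hold in the actual detection code.

```python
import numpy as np


def smallest_id_dtype(max_id):
    """Return the smallest unsigned integer dtype able to hold `max_id`."""
    if max_id < 0:
        raise ValueError("max_id must be non-negative")
    return np.min_scalar_type(max_id)


# Up to 2**16 - 1 IDs fit in 16 bits; one more needs 32 bits.
small = smallest_id_dtype(2**16 - 1)
large = smallest_id_dtype(2**16)
print(small, large)

# Allocate an ID array using the chosen dtype
ids = np.zeros(10, dtype=large)
```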

adamltyson · 2024-02-27T10:04:53Z

I'll release a new version.

alessandrofelder self-assigned this Feb 16, 2024

alessandrofelder added 2 commits February 16, 2024 16:03

consistently use uint32 in candidate detection

4fbd49a

adapt tests to uint32 usage

ff724d5

alessandrofelder force-pushed the switch-detection-to-uint32 branch from bb44066 to ff724d5 February 16, 2024 16:04

alessandrofelder added 2 commits February 21, 2024 11:39

fixed a missing update to uint32

129f0d8

improve/add docstrings

619d8cb

alessandrofelder marked this pull request as ready for review February 26, 2024 17:43

alessandrofelder requested a review from a team February 26, 2024 17:46

willGraham01 approved these changes Feb 27, 2024

willGraham01 merged commit 97e28b7 into main Feb 27, 2024
14 checks passed

willGraham01 deleted the switch-detection-to-uint32 branch February 27, 2024 09:18
