Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to set an explicit skip flag for custom regionprops and marker extraction functionality in notebook 1 #897

Merged
merged 11 commits into from
Feb 5, 2023

Conversation

alex-l-kong
Copy link
Contributor

@alex-l-kong alex-l-kong commented Feb 1, 2023

What is the purpose of this PR?

Closes #875. Since PyTorch/Tensorflow optimizations to running these extractions will require more changes to the Docker than anticipated, for now we add the functionality to skip these computations.

How did you implement your changes

A flag, skip_extraction, can be set for generate_cell_table, which gets propagated to the internal extractions, allowing them to avoid running these expensive processes if so desired by the user.

Remaining issues

We do need to maintain the 'cell_size', 'label', 'coords', and 'centroid' features from regionprops. The first 3 are needed for indexing and 'centroid-0' and 'centroid-1' (generated from 'centroid') is needed for spatial LDA. This means there will still be the bottleneck of running regionprops_table, since I believe the way it works is it generates all the regionprops features and then subsets on them.

Please comment if there are additional base features we should keep regardless of if skip_extraction is set.

@alex-l-kong alex-l-kong self-assigned this Feb 1, 2023
@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented Feb 2, 2023

Tested with the skip_extraction flag and it runs about 70-80% times faster. The number of cells to assign is an unavoidable bottleneck, but without signal extraction on the example dataset it takes on our worst FOVs 40s (it's about 2-3x slower without skip_extraction).

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipping the extraction step makes sense for testing purposes to see where the bottleneck is, but for in terms of actual user setting, isn't relevant. The only reason to generate the cell table is to have the extracted intensity values.

The parameter that would be helpful for the user is the option to skip the custom regionprops functions, which are quite slow.

My understanding is that all the features in REGIONPROPS_BASE don't contribute materially to the runtime. Is that correct? If so, then having a skip flag which sets REGIONPROPS_SINGLE_COMP and REGIONPROPS_MULTI_COMP to None would be sufficient. If it's the case that certain features in REGIONPROPS_BASE are also slow to compute, then the skip flag should also remove any of those slow ones.

@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented Feb 3, 2023

@ngreenwald fixed up the pipeline to only remove regionprops_single_comp and regionprops_multi_comp and leave marker extraction intact.

By removing certain regionprops base calculations (all but label, centroid, and coord), we can trim the single compartment props retrieval time by several orders (up to 10x for our more expensive FOVs). So I'd say the skip_extraction step should also set regionprops_base to just these features in addition to setting regionprops_single_comp and regionprops_multi_comp to [].

@ngreenwald
Copy link
Member

Got it. And what if you remove centroid and coord as coord as well?

@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented Feb 3, 2023

Got it. And what if you remove centroid and coord as coord as well?

No significant difference, a couple tenths of a second difference at most.

@ngreenwald
Copy link
Member

great, sounds good

@alex-l-kong alex-l-kong requested review from ngreenwald and removed request for ngreenwald February 3, 2023 21:51
Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation needs to be updated to reflect the behavior of skip extraction. Will you also add a flag in notebook1 with a description where the user can toggle this on or off?

ark/segmentation/marker_quantification.py Outdated Show resolved Hide resolved
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@ngreenwald ngreenwald merged commit a4b2841 into main Feb 5, 2023
@ngreenwald ngreenwald deleted the cell_table_slow branch February 5, 2023 17:03
@srivarra srivarra added the enhancement New feature or request label Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize cell table generation code
3 participants