Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nuclear fix #422

Merged
merged 8 commits into from
Jun 4, 2021
Merged

Nuclear fix #422

merged 8 commits into from
Jun 4, 2021

Conversation

alex-l-kong
Copy link
Contributor

What is the purpose of this PR?

Closes #419. nc_ratio is currently taking a very long time to compute. We address this bottleneck in this PR.

How did you implement your changes

The original method worked well for small datasets but didn't scale well due to the inefficiencies of looping through hundreds to thousands of cells. To avoid the bottleneck of looping through each cell, we instead broadcast this operation.

Remaining issues

The actual accuracy of the values still needs to be checked. In addition to the test data provided in ark, it would help to test on maybe a couple other datasets cause it looked like @awedwards was having issues of his own on his dataset.

@alex-l-kong alex-l-kong self-assigned this May 26, 2021
@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented May 26, 2021

Sorry for the delay, but this version should run much faster. I'm still verifying accuracy on the test dataset in ark, @awedwards if you wouldn't mind testing on your own dataset that would be very helpful.

@ngreenwald
Copy link
Member

Okay, I see the problem. For generating the figures for the Mesmer paper, I computed NC ratio the same way you now have it, outside of the loop and all at once. However, the first pass at the refactor to include these morphology metrics in ark computed it the same way the other stuff, cell-by-cell, leading to the slowdown.

The next thing to check is why the wrong results are being returned, the indexing errors seems like the likely culprit.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@alex-l-kong alex-l-kong requested review from ngreenwald and removed request for awedwards and ngreenwald June 2, 2021 21:36
" delimiter='_feature_0',\n",
" force_ints=True)\n",
"\n",
"segmentation_labels_nuc = load_utils.load_imgs_from_dir(data_dir=deepcell_output_dir,\n",
" xr_dim_name='compartments',\n",
" xr_channel_names=['nuclear'],\n",
" files=[fov + '_feature_1.tif' for fov in fovs],\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to address the underlying issue with the load_imgs function, otherwise this is just going to happen again.

No error message was thrown when the delimiter didn't match the file name. Was it you or @ackagel who added the delimiter_optional flag to the extract_delimited_names function? What was the reasoning behind that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds my (apparently not so handi)-work. I'll look into that, quickly rework the functionality, and rewrite/reimplement the test functions which should've caught this failure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-l-kong I'll work on a PR for that and merge it into this branch once it's approved. Sounds good?

@alex-l-kong
Copy link
Contributor Author

@awedwards for the time being, we've implemented a temporary fix on this branch addressing the nc_ratios always equalling 1 bug, so you can test the new process on this branch. @ackagel is working on a more permanent fix to this issue.

* separates matching and trimming arguments for filename suffixes

* adds test for matching file suffixes

* updates `load_imgs_from_dir` usage in segmentation notebook

* updates notebook test to include nuclear segmentations

* changes filename matching from suffix based to substring based

* corrects notebook test error and adds nuclear check for mibitiff notebook test

* updates args in notebook tests

* fixes typo in a sample tiff generating function

* removes .DS_Store
@ngreenwald ngreenwald merged commit 8cf67b1 into master Jun 4, 2021
@ngreenwald ngreenwald deleted the nuclear_fix branch June 4, 2021 03:17
y2kbugger pushed a commit that referenced this pull request Jul 29, 2021
* Initial commit of nuclear parameter fixing

* Broadcast compute nc_ratio

* Accidentally switched nuclear and whole cell

* Add comments to new nc_ratio function

* Add delimiter check to load_imgs_from_dir so both compartments don't get loaded

* Add files parameter to load_imgs_from_dir so only _feature_0 loaded for whole cell (and _feature_1 for nuclear)

* Add _feature_0 and _feature_1 file support for notebook

* File delimiter fix (#424)

* separates matching and trimming arguments for filename suffixes

* adds test for matching file suffixes

* updates `load_imgs_from_dir` usage in segmentation notebook

* updates notebook test to include nuclear segmentations

* changes filename matching from suffix based to substring based

* corrects notebook test error and adds nuclear check for mibitiff notebook test

* updates args in notebook tests

* fixes typo in a sample tiff generating function

* removes .DS_Store

Co-authored-by: vacuous_planet <36260163+ackagel@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

exporting multi-compartment morphological measurements
3 participants