Drop cell labels before every neighborhood clustering step #319

alex-l-kong · 2020-11-03T22:46:20Z

What is the purpose of this PR?

Addresses and closes #316. We were not generating the correct silhouette score visualization because compute_cluster_metrics didn't receive neighborhood_counts with the label column dropped. This PR patches that up.
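To illustrate why a leftover label column matters, here is a minimal sketch. It assumes the column is named `label` and uses made-up phenotype columns (`pheno_a`, `pheno_b`); it is not the project's actual data or metric code. Silhouette scores are built from pairwise distances, so a cell ID treated as a feature shifts those distances by an amount that has nothing to do with neighborhood composition.

```python
import numpy as np
import pandas as pd

# Hypothetical neighborhood counts: "label" is a cell ID, not a feature
neighborhood_counts = pd.DataFrame({
    "label": [1, 2, 3, 4, 5, 6],
    "pheno_a": [9, 8, 9, 1, 0, 1],
    "pheno_b": [0, 1, 1, 9, 8, 9],
})


def pairwise_distances(df):
    """Euclidean distance between every pair of rows, using all columns."""
    values = df.to_numpy(dtype=float)
    diff = values[:, None, :] - values[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Distances with the cell ID mistakenly treated as a feature
with_label = pairwise_distances(neighborhood_counts)

# Distances over the phenotype counts only
without_label = pairwise_distances(neighborhood_counts.drop(columns="label"))

# Cells 0 and 2 have nearly identical composition, yet the label column
# pushes them further apart
print(with_label[0, 2], without_label[0, 2])
```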

How did you implement your changes

We move the .drop calls on neighborhood_counts into the arguments passed to the clustering functions.
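A rough sketch of the call-site approach, assuming a label column named `label`. `compute_metrics_sketch` is a hypothetical stand-in, not the real `compute_cluster_metrics`; it only reports which columns it would treat as features.

```python
import pandas as pd


def compute_metrics_sketch(feature_mat):
    """Hypothetical stand-in for a clustering function: every column is a feature."""
    return list(feature_mat.columns)


neighborhood_counts = pd.DataFrame({
    "label": [1, 2],      # assumed name of the cell label column
    "pheno_a": [3, 0],
    "pheno_b": [1, 4],
})

# Drop the label column in the argument itself, so the function only sees features
used_columns = compute_metrics_sketch(neighborhood_counts.drop(columns="label"))
print(used_columns)
```

Because `DataFrame.drop` returns a new frame by default, `neighborhood_counts` itself still carries the label column after the call.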

review-notebook-app · 2020-11-03T22:46:23Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

ngreenwald

Is there any reason we can't move the drop calls inside compute_cluster_metrics and generate_cluster_matrix_results? Given that we'll always want to drop that column for any cluster data that's passed in, I think it actually does make sense to do it in the function.
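One way this suggestion could look, as a sketch only: the helper name, the `label_col` parameter, and its default of `"label"` are assumptions for illustration, and the metric computation is replaced by a placeholder.

```python
import pandas as pd


def select_cluster_features(neighbor_mat, label_col="label"):
    """Hypothetical helper: return only the columns that should be clustered on."""
    # errors="ignore" keeps the call safe if the label column was already removed
    return neighbor_mat.drop(columns=[label_col], errors="ignore")


def compute_cluster_metrics_sketch(neighbor_mat, label_col="label"):
    """Hypothetical stand-in that drops the label column before doing any work."""
    features = select_cluster_features(neighbor_mat, label_col)
    # Placeholder for the real metric computation: report the feature count
    return features.shape[1]


neighbor_counts = pd.DataFrame({
    "label": [1, 2, 3],
    "pheno_a": [3, 0, 2],
    "pheno_b": [1, 4, 2],
})

n_features = compute_cluster_metrics_sketch(neighbor_counts)
print(n_features)
```

With the drop inside the function, callers pass the matrix unchanged and cannot forget the step, at the cost of an extra parameter for the label column name.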

alex-l-kong · 2020-11-04T03:40:32Z

@ngreenwald to me, it's a little awkward to pass the explicit label name into both functions when its sole purpose is to drop that column. That said, it's not much different from having to call .drop every time we pass the matrix in.

How about dropping the label column at the end of create_neighborhood_matrix, so that both neighbor_counts and neighbor_freqs already have the label column dropped? I think that would save us a lot of trouble.

ackagel · 2020-11-04T20:17:45Z

So the output of create_neighborhood_matrix is basically a cell table, yeah? Any reason we can't append (or add in some standardized way) the phenotype data onto the existing cell table, and just configure settings.py to do the proper trimming?

alex-l-kong · 2020-11-09T19:10:09Z

@ngreenwald so looking back at this problem, I made an explicit call to drop the label columns before running the clustering steps in the notebook. The reason is that the user can simply refer to the corresponding label column in all_data if they really need label information for neighbor_counts/neighbor_freqs. I don't think there's really a need to keep a duplicate copy of label in the neighbor matrix.

alex-l-kong · 2020-11-09T19:35:25Z

@ngreenwald @ackagel while Will works on a fix for the database at Van Valen Lab (which affects Segment_Image_Data.ipynb), I'll request a review on this PR since I only changed a few things.

ngreenwald · 2020-11-09T20:29:29Z

okay, so neighbor_counts and neighbor_freqs are created by create_neighborhood_matrix. The only thing we use either of these for is plotting, correct? We're never going to return these to the user? So can we instead modify create_neighborhood_matrix to produce a version of these two arrays that already has the label column dropped? Is there a reason they ever need to be included?

alex-l-kong · 2020-11-09T21:34:51Z

@ngreenwald yeah, I can do it in the function as well, will update on my next commit and let you know when that's done.

alex-l-kong · 2020-11-09T22:49:53Z

@ngreenwald moved the dropping of the cell label column to inside spatial_analysis, this PR is ready to be reviewed!

ngreenwald · 2020-11-09T23:10:42Z

Great, looks good. Can you add a new issue with Adam's suggestion? I think we can probably do some additional refactoring to consolidate neighbor_counts, neighbor_freqs, and all_data, but given that we haven't finished all the steps in this notebook yet I think it's fine to hold off.

* Make sure cell labels are dropped before every clustering step
* Drop label column before running clustering
* Minor comment fix, mostly to see if DeepCell server is back up
* Move label dropping logic into neighborhood clustering step to hide from user
* Don't need to drop cell label column anymore in tests

Make sure cell labels are dropped before every clustering step

900295f

alex-l-kong self-assigned this Nov 3, 2020

alex-l-kong requested review from ngreenwald and ackagel November 3, 2020 22:53

ngreenwald requested changes Nov 4, 2020

Merge branch 'master' into cluster_fix

70e0345

alex-l-kong and others added 4 commits November 4, 2020 15:29

Merge branch 'master' into cluster_fix

0516c10

Merge branch 'master' into cluster_fix

12f2ab9

Merge branch 'master' into cluster_fix

42208dd

Drop label column before running clustering

508ea4c

alex-l-kong requested a review from ngreenwald November 9, 2020 19:35

Minor comment fix, mostly to see if DeepCell server is back up

abb164e

alex-l-kong added 2 commits November 9, 2020 14:36

Move label dropping logic into neighborhood clustering step to hide from user

b5eb477

Don't need to drop cell label column anymore in tests

4af9d8f

ngreenwald approved these changes Nov 9, 2020

ngreenwald merged commit c3b700b into master Nov 9, 2020

ngreenwald deleted the cluster_fix branch November 9, 2020 23:11
