Drop cell labels before every neighborhood clustering step #319
Is there any reason we can't move the drop calls inside of compute_cluster_metrics and generate_cluster_matrix_results? Given that we're always going to want to perform that action for any cluster data that's passed in, I think it actually does make sense to do it in the function.
@ngreenwald to me, it's a little awkward having to pass in the explicit label name into both functions when the sole purpose of doing so is simply to drop that column. That being said, that's kind of the same thing having to do How about dropping the label column at the end of |
so, the output of the |
@ngreenwald so looking back at this problem, I made an explicit call to drop label columns before running the clustering steps in the notebook. The reason is because the user can simply refer to the corresponding |
@ngreenwald @ackagel while Will works on a fix for the database at Van Valen Lab (which affects |
okay, so |
@ngreenwald yeah, I can do it in the function as well, will update on my next commit and let you know when that's done. |
@ngreenwald moved the dropping of the cell label column to inside |
Great, looks good. Can you add a new issue with Adam's suggestion? I think we can probably do some additional refactoring to consolidate neighbor_counts, neighbor_freqs, and all_data, but given that we haven't finished all the steps in this notebook yet I think it's fine to hold off. |
* Make sure cell labels are dropped before every clustering step
* Drop label column before running clustering
* Minor comment fix, mostly to see if DeepCell server is back up
* Move label dropping logic into neighborhood clustering step to hide from user
* Don't need to drop cell label column anymore in tests
What is the purpose of this PR?
Addresses and closes #316. We were not generating the correct silhouette score visualization because compute_cluster_metrics didn't receive neighborhood_counts with the label column dropped. This PR patches that up.
How did you implement your changes
We move the .drop calls on neighborhood_counts into the arguments passed to the clustering functions.
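For readers unfamiliar with the bug, here is a minimal sketch of why leaving the label column in skews distance-based metrics such as the silhouette score. It is not the project's actual code: the column name cell_label, the helper drop_label_column, the pairwise_distance function, and the toy data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical label column name; the project's real column name may differ.
LABEL_COL = "cell_label"


def drop_label_column(neighbor_counts: pd.DataFrame,
                      label_col: str = LABEL_COL) -> pd.DataFrame:
    """Return a copy of the table without the label column (illustrative helper)."""
    # errors="ignore" makes the call safe if the column was already dropped.
    return neighbor_counts.drop(columns=label_col, errors="ignore")


def pairwise_distance(features: pd.DataFrame, i: int, j: int) -> float:
    """Euclidean distance between rows i and j of a feature table."""
    values = features.to_numpy(dtype=float)
    return float(np.linalg.norm(values[i] - values[j]))


# Toy data: cells 0 and 1 have identical neighborhoods but very different label IDs.
neighbor_counts = pd.DataFrame({
    "cell_label": [1, 500, 2],
    "tumor": [3, 3, 0],
    "immune": [0, 0, 4],
})

# With the label column present, the arbitrary ID dominates the distance.
with_label = pairwise_distance(neighbor_counts, 0, 1)

# With the label column dropped, identical neighborhoods are at distance zero.
without_label = pairwise_distance(drop_label_column(neighbor_counts), 0, 1)
```

Because DataFrame.drop returns a new object by default, the caller's table keeps its label column, which is one reason doing the drop inside the clustering function can be convenient.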