Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add context-dependent spatial randomization #215

Closed
wants to merge 76 commits into from
Closed

Conversation

alex-l-kong
Copy link
Contributor

@alex-l-kong alex-l-kong commented Sep 10, 2020

What is the purpose of this PR?

Addresses and closes #207. Now that we've built a foundation for context-dependent spatial randomization, we can now begin to work on the actual randomization process. Naive randomization would assume that we do not care about the cell types (aka the FlowSOM IDs) associated with the marker list we randomize over. We no longer make that assumption in a context-dependent environment.

How did you implement your changes

We will be working primarily out of a new function: compute_close_cell_num_random_context. Good news is that compute_close_cell_num does not need to be changed, only the randomization process we're comparing against. For channel enrichment, we allow the user to specify a list of FlowSOM IDs they wish to specifically facet over, with all non-specified IDs getting grouped into an 'else' category. Erin would also like for the user to be able to specify parameters to tune the randomization process. I'm not exactly sure how this would be done yet, but this is something we'll look into after we've gotten the basic logic written in stone.

Remaining issues

Non-optimal code is the biggest issue right now. To make everything clear, I've brute-forced my way through the initial implementation. This will need to be optimized to increase efficiency.

Some code had to be duplicated from compute_close_cell_num to properly ensure markers by type were indexed properly. If we can remove some or all of that code, that'd be mighty fine.

@alex-l-kong
Copy link
Contributor Author

Still need to add testing, this code review is mainly so you can kind of see what's going on.

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the logic makes sense. I'm sure there is more optimization to be had, two main things I noticed. The first is that I believe the first two nested for loops can be combined; see my comment. The second is that I don't think the code actually runs as written; thresh isn't defined, for example.

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
@ngreenwald
Copy link
Member

ngreenwald commented Sep 10, 2020 via email

Copy link
Contributor

@ackagel ackagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a good start, but taking a look into optimizing the loops would probs be a good idea

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented Sep 14, 2020

Added very basic testing for context spatial analysis, the thing to look at now is the updated logic which is now (more) optimized. I'm unsure about one aspect which I've added in a long TODO: if anyone happens knows anything about that please feel free to comment!

I will be working on integrating this with calculate_channel_spatial_enrichment, which would include a flag indicating whether we want context-randomization, as well as another argument to specify the FlowSOM ID's we want to randomize over.

Copy link
Contributor

@ackagel ackagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking like it's getting there; I mostly have a few clarification questions

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with all of Adam's comments

@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented Oct 1, 2020

The logic is sound now, at least when comparing Erin's data between her script and our optimized version. However, we will probably need to wait until next week until we can get full verification.

For the record, the context-randomized close_num_rand Erin provided me to test against is completely wrong. It doesn't maintain the same symmetry as would be expected, making it completely wacko. I suspect Felix ran a much older version of the context-based randomization which caused this. I don't want to bother Erin too much right now, but this is something that will have to be double checked.

I do know for a fact that when running Erin's non-optimized MATLAB context-randomization script on her data to produce the real close_num_rand, the ranges check out with our optimized version in spatial_analysis_utils. So I think we're pretty close to finishing this up, after verifying our process just need to add testing and notebook integration.

alex-l-kong and others added 26 commits September 30, 2020 19:27
@ngreenwald
Copy link
Member

Resolved by #451

@ngreenwald ngreenwald closed this Aug 26, 2022
@ngreenwald ngreenwald deleted the context_spatial branch August 26, 2022 05:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add context-dependent spatial randomization
3 participants