## Coding session 3: Multipopulation analysis

In this session we will focus on three spatial metrics that account for more than two populations simultaneously:

1. Quadrat Correlation Matrix

2. Adjacency Permutation Test

3. Neighbourhood Clustering

This is a free-play session, and we provide a list of questions for each metric to guide your analysis. The metrics are in no particular order, so you can start with whichever one you like.

For all three sections, we'll use the `Mouse-Colon-Carcinoma` dataset stored in the muspan `datasets` module. Here's a reminder on how to load in this dataset:

```python
# Import the muspan module
import muspan as ms

# Load the 'Mouse-Colon-Carcinoma' example domain from the muspan datasets
a_domain = ms.datasets.load_example_domain('Mouse-Colon-Carcinoma')
```

```
MuSpAn domain loaded successfully. Domain summary:
Domain name: Mouse cells
Number of objects: 6676
Collections: ['Cell centres']
Labels: ['Celltype', 'CD4']
Networks: []
Distance matrices: []
```


### Quadrat Correlation Matrix

Using our [documentation](https://docs.muspan.co.uk/latest/generated/muspan.region_based.quadrat_correlation_matrix.html#muspan.region_based.quadrat_correlation_matrix) and [tutorial](https://docs.muspan.co.uk/latest/_collections/region_based_analysis/quadrat_correlation.html) on the Quadrat Correlation Matrix (QCM), try to answer the following questions about the 'Mouse-Colon-Carcinoma' dataset:

1. What does the Quadrat Correlation Matrix reveal about the spatial relationships between different cell populations in the 'Mouse-Colon-Carcinoma' dataset?

2. Are there any specific cell populations that exhibit strong positive or negative correlations? What might these correlations indicate about their spatial interactions?

3. How does changing the region type impact the QCM results?

4. Is there a characteristic length scale of co-localisation in the dataset? How can this be identified?

5. How do the observed correlations between cell populations compare to what would be expected under spatial randomness? What biological insights can be drawn from any deviations?

6. Can you identify any patterns or clusters in the QCM that suggest higher-order spatial dependencies among multiple cell populations? How might these patterns relate to tumour microenvironment dynamics?

7. How does the QCM change when analysing different regions of the domain (e.g., sample core vs. periphery)? What does this tell us about the heterogeneity of spatial interactions in the dataset?
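
If it helps to see the idea behind a quadrat correlation before opening the documentation, here is a minimal sketch in plain Python. It does not call MuSpAn. The function names (`quadrat_counts`, `pearson`, `quadrat_correlation`), the square quadrats and the toy points are all assumptions made for illustration, and it computes only a raw Pearson correlation of per-quadrat counts, with no comparison against a null model.

```python
import math
import random
from collections import Counter

def quadrat_counts(points, labels, side):
    """Count objects of each label inside square quadrats of the given side length."""
    counts = {}  # quadrat index (column, row) -> Counter of labels
    for (x, y), lab in zip(points, labels):
        key = (int(x // side), int(y // side))
        counts.setdefault(key, Counter())[lab] += 1
    return counts

def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def quadrat_correlation(points, labels, side):
    """Correlate per-quadrat counts for every pair of labels.

    Only quadrats containing at least one point are included (a simplification).
    """
    counts = quadrat_counts(points, labels, side)
    names = sorted(set(labels))
    columns = {lab: [c[lab] for c in counts.values()] for lab in names}
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Toy data (hypothetical): 'A' and 'B' share the left half of a 100 x 100 domain,
# while 'C' occupies the right half.
rng = random.Random(0)
points, labels = [], []
for _ in range(300):
    points.append((rng.uniform(0, 50), rng.uniform(0, 100)))
    labels.append(rng.choice("AB"))
for _ in range(300):
    points.append((rng.uniform(50, 100), rng.uniform(0, 100)))
    labels.append("C")

qcm = quadrat_correlation(points, labels, side=25)
print(round(qcm[("A", "B")], 2), round(qcm[("A", "C")], 2))
```

In this toy example, A and B come out positively correlated and A and C negatively correlated. Re-running the last two lines with different `side` values is one simple way to start thinking about question 4.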


### Adjacency Permutation Test

Using our documentation on [generating networks](https://docs.muspan.co.uk/latest/generated/muspan.networks.generate_network.html#muspan.networks.generate_network) (and the accompanying tutorials) and on the [Adjacency Permutation Test](https://docs.muspan.co.uk/latest/generated/muspan.networks.adjacency_permutation_test.html#muspan.networks.adjacency_permutation_test), try to answer the following questions about the 'Mouse-Colon-Carcinoma' dataset:

1. Which pairs of cell types are significantly adjacent up to a distance of 40 µm?

2. What is the impact of switching between the 'Delaunay' and 'proximity' network types? How should each of these spatial networks be interpreted?

3. How does the choice of distance threshold impact the results of the adjacency permutation test? 

4. Can you identify any clusters or regions where adjacency patterns deviate significantly from randomness? How might these relate to tumour microenvironment dynamics?
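
As a conceptual aid for the questions above, the following plain-Python sketch shows one common way to set up a permutation test on network adjacency: build a distance-thresholded network, count the edges joining two labels, then compare that count with counts obtained after shuffling the labels over the same nodes. It does not use MuSpAn, and the function names, the one-sided p-value and the toy data are assumptions for illustration; check the documentation linked above for what MuSpAn actually computes and returns.

```python
import math
import random
from itertools import combinations

def proximity_edges(points, max_distance):
    """Connect every pair of points that lie no further apart than max_distance."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if math.dist(points[i], points[j]) <= max_distance]

def count_pair_edges(edges, labels, a, b):
    """Number of edges joining a node labelled a to a node labelled b."""
    return sum(1 for i, j in edges if {labels[i], labels[j]} == {a, b})

def adjacency_permutation_test(edges, labels, a, b, n_permutations=200, seed=0):
    """Compare the observed a-b edge count with counts under shuffled labels."""
    rng = random.Random(seed)
    observed = count_pair_edges(edges, labels, a, b)
    shuffled = list(labels)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # the network stays fixed, only the labels move
        null.append(count_pair_edges(edges, shuffled, a, b))
    mean = sum(null) / len(null)
    sd = math.sqrt(sum((x - mean) ** 2 for x in null) / len(null))
    z = (observed - mean) / sd if sd > 0 else 0.0
    # One-sided empirical p-value for "more adjacent than expected"
    p = (1 + sum(x >= observed for x in null)) / (1 + len(null))
    return observed, z, p

# Toy data (hypothetical): 'A' and 'B' are intermixed in one patch,
# while 'C' sits in a separate patch more than 40 units away.
rng = random.Random(1)
points, labels = [], []
for _ in range(60):
    points.append((rng.uniform(0, 100), rng.uniform(0, 100)))
    labels.append(rng.choice("AB"))
for _ in range(60):
    points.append((rng.uniform(200, 300), rng.uniform(0, 100)))
    labels.append("C")

edges = proximity_edges(points, max_distance=40)
obs_ab, z_ab, p_ab = adjacency_permutation_test(edges, labels, "A", "B")
obs_ac, z_ac, p_ac = adjacency_permutation_test(edges, labels, "A", "C")
print(obs_ab, round(z_ab, 1), obs_ac, round(z_ac, 1))
```

Here the A-B pair gets a positive z-score (more adjacent than under shuffling) and the A-C pair a negative one. Changing `max_distance` changes which edges exist, which is the effect question 3 asks you to explore.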



### Neighbourhood Clustering

Using our [documentation](https://docs.muspan.co.uk/latest/generated/muspan.networks.cluster_neighbourhoods.html#muspan.networks.cluster_neighbourhoods) and [tutorial](https://docs.muspan.co.uk/latest/_collections/network_analysis/Network%20methods%20-%201%20-%20neighbourhood_analysis.html) on neighbourhood clustering, try to answer the following questions about the 'Mouse-Colon-Carcinoma' dataset:

1. Which characteristic neighbourhoods are present when using a Delaunay network with k-means clustering? How is a neighbourhood defined? What happens when you increase or decrease k for the clustering?

2. What happens when you change the network to a proximity network? Be careful with the neighbourhood definition: what does `k_hops` control? What is the interpretation of this network?

3. Compare k-means, where you fix the number of clusters in advance, with HDBSCAN, which infers the number of clusters from the data. What are the resulting neighbourhoods? Our tutorials show how to alter the clustering parameters.

4. What does the immune profile look like around only the epithelial cells in the domain? For this, see the neighbourhood source option.
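
To make the terms in these questions concrete, here is a small plain-Python sketch of one way neighbourhood clustering can work: for each node, collect the nodes reachable within `k_hops` edges, summarise their labels as a composition vector, then group the vectors with a basic k-means. It does not use MuSpAn. The helper names, the exclusion of the source node from its own neighbourhood, the toy chain network and the bare-bones k-means are all assumptions for illustration, so treat it as a mental model rather than a description of `cluster_neighbourhoods`.

```python
import random
from collections import deque

def k_hop_neighbours(adjacency, source, k_hops):
    """Nodes reachable from source in at most k_hops edges (source excluded)."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == k_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    del depth[source]
    return list(depth)

def composition(adjacency, labels, names, source, k_hops):
    """Fraction of each label among the k-hop neighbours of source."""
    nbrs = k_hop_neighbours(adjacency, source, k_hops)
    if not nbrs:
        return [0.0] * len(names)
    return [sum(labels[n] == name for n in nbrs) / len(nbrs) for name in names]

def kmeans(vectors, k, n_iter=20, seed=0):
    """Very small k-means; returns a cluster index for every vector."""
    rng = random.Random(seed)
    # Start from distinct vectors so that two centres never coincide
    centres = rng.sample(sorted(set(map(tuple, vectors))), k)
    assignment = [0] * len(vectors)
    for _ in range(n_iter):
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centres[c])))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Toy network (hypothetical): a chain of ten nodes, five labelled 'A' then five 'B'
labels = ["A"] * 5 + ["B"] * 5
adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
names = sorted(set(labels))

vectors = [composition(adjacency, labels, names, i, k_hops=1) for i in range(10)]
clusters = kmeans(vectors, k=2)
print(vectors[0], vectors[4], clusters)
```

Nodes deep inside the A or B stretch end up in different clusters, while the two boundary nodes have a mixed composition and join one side or the other. To restrict the analysis to particular source cells, as in question 4, you could build `vectors` only for nodes with a chosen label; raising `k_hops` widens each neighbourhood.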