### Notebook for the cell proportion analysis of Healthy_vs_COPD CTRL_vs_IAV data

- **Developed by**: Carlos Talavera-López Ph.D
- **Würzburg Institute for Systems Immunology - Faculty of Medicine - Julius Maximilian Universität Würzburg**
- **Created on**: 231204
- **Last modified**: 231204

### Load required packages

In [1]:
import logging
import anndata
import anndata2ri
import numpy as np
import pandas as pd
import scanpy as sc
import matplotlib.pyplot as plt
import rpy2.rinterface_lib.callbacks

### Set up working environment

In [2]:
sc.settings.verbosity = 3
sc.logging.print_versions()
sc.settings.set_figure_params(dpi = 180, color_map = 'magma_r', dpi_save = 300, vector_friendly = True, format = 'svg')

-----
anndata     0.10.3
scanpy      1.9.6
-----
PIL                 10.1.0
anndata2ri          1.3.1
appnope             0.1.3
asttokens           NA
cffi                1.16.0
comm                0.2.0
cycler              0.12.1
cython_runtime      NA
dateutil            2.8.2
debugpy             1.8.0
decorator           5.1.1
exceptiongroup      1.2.0
executing           2.0.1
get_annotations     NA
h5py                3.10.0
igraph              0.10.8
importlib_resources NA
ipykernel           6.27.1
ipywidgets          8.1.1
jedi                0.19.1
jinja2              3.1.2
joblib              1.3.2
kiwisolver          1.4.5
leidenalg           0.10.1
llvmlite            0.41.1
markupsafe          2.1.3
matplotlib          3.8.2
mpl_toolkits        NA
mpmath              1.3.0
natsort             8.4.0
numba               0.58.1
numpy               1.24.4
packaging           23.2
pandas              2.1.3
parso               0.8.3
pexpect             4.9.0
platformdirs        

In [3]:
rpy2.rinterface_lib.callbacks.logger.setLevel(logging.ERROR)

In [4]:
anndata2ri.activate()


In [5]:
%load_ext rpy2.ipython

### Set up `milo` for the underlying analysis

In [6]:
%%R
library(miloR)
library(igraph)

Loading required package: edgeR
Loading required package: limma

Attaching package: ‘igraph’

The following object is masked from ‘package:miloR’:

    graph

The following objects are masked from ‘package:stats’:

    decompose, spectrum

The following object is masked from ‘package:base’:

    union



### Load working object

In [7]:
adata = sc.read_h5ad('../../../data/Marburg_cell_states_locked_ctl231212.raw.h5ad')
adata

AnnData object with n_obs × n_vars = 97573 × 27208
    obs: 'sex', 'age', 'ethnicity', 'PaCO2', 'donor', 'infection', 'disease', 'SMK', 'illumina_stimunr', 'bd_rhapsody', 'n_genes', 'doublet_scores', 'predicted_doublets', 'batch', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'pct_counts_ribo', 'percent_mt2', 'n_counts', 'percent_chrY', 'XIST-counts', 'S_score', 'G2M_score', 'condition', 'sample_group', 'IAV_score', 'group', 'Viral_score', 'cell_type', 'cell_states', 'leiden', 'cell_compartment', '_scvi_batch', '_scvi_labels', 'C_scANVI', 'viral_counts', 'infected_status'
    var: 'mt', 'ribo'
    uns: 'cell_states_colors', 'disease_colors', 'group_colors', 'infected_status_colors', 'infection_colors'
    obsm: 'X_scANVI', 'X_scVI', 'X_umap'

### Build the KNN graph used by `milo`

In [8]:
sc.pp.neighbors(adata, n_neighbors = 50, random_state = 1712, use_rep = 'X_scANVI')

computing neighbors


OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.


In [None]:
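# X_scANVI is the scANVI latent space, not a PCA; the `pca_` names are kept
# because the matrix is later stored in the "PCA" reducedDim slot of the Milo object
pca_matrix = adata.obsm['X_scANVI']
pca_df = pd.DataFrame(pca_matrix)
pca_df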


### Differential abundance (DA) analysis with `milo`

In [None]:
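# Copy the object and drop the KNN graph (`obsp` and `uns['neighbors']`)
# before the transfer to R; the graph is passed to R separately
adata_no_knn = adata.copy()
adata_no_knn.obsp = None
adata_no_knn.uns.pop("neighbors")
adata_no_knn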

- Passing this copy to R with `%%R -i` converts it to a SingleCellExperiment (the conversion is handled by `anndata2ri`, activated above)

In [None]:
%%R -i adata_no_knn
adata_no_knn

- Make a Milo class object for DA analysis

In [None]:
%%R 
milo <- Milo(adata_no_knn)
milo

- Add KNN graph

In [None]:
knn_adjacency = adata.obsp["connectivities"]

In [None]:
%%R -i knn_adjacency

milo_graph <- buildFromAdjacency(knn_adjacency, k = 50, is.binary = TRUE)
graph(milo) <- miloR::graph(milo_graph)
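
The call to `buildFromAdjacency` passes `is.binary = TRUE`, while the `connectivities` matrix computed by scanpy typically holds weighted values rather than 0/1 entries. The following is a minimal sketch, on a small hypothetical dense matrix rather than the real `adata.obsp["connectivities"]`, of how one might check whether an adjacency matrix is binary and binarise it before handing it to R. Whether binarising changes the `miloR` result is not established here; `is_binary` and `binarise` are illustrative helper names, not part of any library.

```python
import numpy as np


def is_binary(adjacency: np.ndarray) -> bool:
    """Return True if every entry is exactly 0 or 1."""
    return bool(np.isin(adjacency, (0, 1)).all())


def binarise(adjacency: np.ndarray) -> np.ndarray:
    """Map every non-zero weight to 1 and keep zeros as 0."""
    return (adjacency > 0).astype(int)


# Hypothetical weighted connectivities for four cells (toy values)
weighted = np.array([
    [0.0, 0.8, 0.0, 0.3],
    [0.8, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 1.0],
    [0.3, 0.0, 1.0, 0.0],
])

binary = binarise(weighted)
print(is_binary(weighted), is_binary(binary))
```

For a real sparse matrix the same check would be applied to its stored non-zero values instead of a dense array.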

- Add the scANVI latent space (`X_scANVI`) as the reduced dimension, stored under the slot name `PCA`

In [None]:
%%R -i pca_matrix

reducedDims(milo)$PCA <- as.matrix(pca_matrix)

### Run `milo` analysis 

In [None]:
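# One row per sample: `batch` is the sample identifier passed to countCells
# Chaining drop_duplicates() avoids modifying a slice of adata.obs in place
design_df = adata.obs[["batch", "donor", "group"]].drop_duplicates()
design_df.index = design_df['batch']
design_df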
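
The design table built here is indexed by sample (`batch`), so it should contain exactly one row per sample. The sketch below uses a hypothetical per-cell metadata table (the sample, donor and group labels are invented for illustration) to show the same `drop_duplicates` pattern together with a check that no sample maps to more than one donor/group combination.

```python
import pandas as pd

# Hypothetical per-cell metadata: five cells from three samples
toy_obs = pd.DataFrame({
    "batch": ["s1", "s1", "s2", "s2", "s3"],
    "donor": ["d1", "d1", "d2", "d2", "d3"],
    "group": ["healthy_ctrl", "healthy_ctrl", "copd_ctrl", "copd_ctrl", "copd_iav"],
})

# Collapse cells to unique (batch, donor, group) combinations
toy_design = toy_obs[["batch", "donor", "group"]].drop_duplicates()

# A sample appearing twice would mean conflicting donor or group labels
assert toy_design["batch"].is_unique, "each sample must map to one donor and group"

# Index by sample while keeping `batch` as a column
toy_design = toy_design.set_index("batch", drop=False)
print(toy_design)
```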

In [None]:
%%R -i design_df -o DA_results

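## Define neighbourhoods
## Note: the KNN graph was built with k = 50, whereas k = 20 is used here;
## d should not exceed the number of columns stored in the PCA slot
milo <- makeNhoods(milo, prop = 0.1, k = 20, d = 30, refined = TRUE)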

## Count cells in neighbourhoods
milo <- countCells(milo, meta.data = data.frame(colData(milo)), sample = "batch")

## Calculate distances between cells in neighbourhoods for spatial FDR correction
milo <- calcNhoodDistance(milo, d = 30)

## Test for differential abundance
DA_results <- testNhoods(milo, design = ~ group, design.df = design_df)
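
The `SpatialFDR` column returned by `testNhoods` is a p-value corrected for multiple testing. As a rough illustration of what such a correction does, the sketch below implements the plain Benjamini-Hochberg procedure in NumPy. This is a deliberately simplified stand-in: Milo's spatial FDR is a weighted variant that accounts for the overlap between neighbourhoods, and that weighting is not reproduced here. The function name `benjamini_hochberg` and the p-values are made up for illustration.

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity by taking the running minimum from the largest rank down
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted


# Hypothetical p-values for four neighbourhoods
toy_pvalues = [0.01, 0.04, 0.03, 0.2]
adjusted = benjamini_hochberg(toy_pvalues)
print(adjusted)
```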

### Explore neighbourhoods using a volcano plot

- In the `DA_results` dataframe shown below, each row represents a neighbourhood (not a cell) and reports the log-fold change and the spatial-FDR-adjusted p-value for differential abundance between groups. A volcano plot is a convenient first look at the test results.

In [None]:
DA_results

In [None]:
plt.plot(DA_results.logFC, -np.log10(DA_results.SpatialFDR), '.')
plt.xlabel("log-Fold Change")
plt.ylabel("- log10(Spatial FDR)")
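
The volcano plot can be complemented with a simple count of significant neighbourhoods. The following is a minimal sketch, assuming a results table with the `logFC` and `SpatialFDR` columns used in the plot. The toy values, the 0.05 threshold and the `direction` labels are illustrative assumptions, not results from this dataset.

```python
import pandas as pd

# Hypothetical DA results for five neighbourhoods
toy_results = pd.DataFrame({
    "logFC": [2.1, -1.7, 0.3, -0.2, 3.0],
    "SpatialFDR": [0.001, 0.02, 0.60, 0.04, 0.20],
})

alpha = 0.05  # assumed significance threshold
significant = toy_results["SpatialFDR"] < alpha

# Label each neighbourhood by significance and the sign of its log-fold change;
# "enriched" and "depleted" are relative to whichever contrast was tested
toy_results["direction"] = "not significant"
toy_results.loc[significant & (toy_results["logFC"] > 0), "direction"] = "enriched"
toy_results.loc[significant & (toy_results["logFC"] < 0), "direction"] = "depleted"

summary = toy_results["direction"].value_counts()
print(summary)
```

The same labelling could be applied to `DA_results` directly, since it is returned to Python as a dataframe by the `-o DA_results` flag.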

### Visualizing results from Milo analysis

In [None]:
%%R
milo <- buildNhoodGraph(milo)

In [None]:
%%R -w 1000 -h 800
plotNhoodGraphDA(milo, DA_results, alpha = 0.05)