**<span style="color:darkred; font-size:22px;">11_3. Steady-state Ligand-Receptor inference</span>**



## Background

Cell-cell communication (CCC) events play a critical role in disease, where they are often deregulated. To identify differential expression of CCC events between conditions, we can build upon standard differential expression analysis (DEA) approaches, such as [DESeq2](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8?ref=https://githubhelp.com). Dimensionality reduction methods, such as extracting intercellular programmes with [MOFA+](https://liana-py.readthedocs.io/en/latest/notebooks/mofatalk.html) and [Tensor-cell2cell](https://liana-py.readthedocs.io/en/latest/notebooks/liana_c2c.html), reduce CCC into sets of loadings. Hypothesis-driven DEA tests instead focus on changes in individual genes, which makes them easier to understand and interpret.

In this tutorial, we perform DEA at the pseudobulk level to assess differential expression of genes between conditions. We then translate the results into deregulated complex-informed ligand-receptor interactions and analyze their connections to downstream signaling events.

For further information on pseudobulk DEA, please refer to the [Differential Gene Expression chapter](https://www.sc-best-practices.org/conditions/differential_gene_expression.html) in the Single-cell Best Practices book, as well as [Decoupler's pseudobulk vignette](https://decoupler-py.readthedocs.io/en/latest/notebooks/pseudobulk.html). These resources provide more comprehensive details on the subject.


Install decoupler and PyDESeq2 via pip with the following commands:

```python
pip install "decoupler>=0.1.4"
pip install "pydeseq2>=0.4.0"
```

### Load Packages

In [None]:
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# import liana
import liana as li
# needed for visualization and toy data
import scanpy as sc

### Set working directory for analysis

In [None]:
cwd = '/media/bio/Disk/Research Data/EBV/omicverse'
os.chdir(cwd)
updated_dir = os.getcwd()
print("Updated working directory: ", updated_dir)

#### Load & Prep Data

In [None]:
adata = sc.read_h5ad("Processed Data/scRNA_Annotation.h5ad")
adata

In [None]:
adata_Epi = sc.read_h5ad("Processed Data/scRNA_Epi_CNV.h5ad")
adata_Epi

In [None]:
adata.obs['Cell_type'] = adata.obs['Cell_type'].astype(str)
adata.obs.loc[adata_Epi.obs.index, 'Cell_type'] = adata_Epi.obs['cnv_status']


In [None]:
adata.obs['Cell_type'] = adata.obs['Cell_type'].astype('category')
adata.obs['Cell_type'] = adata.obs['Cell_type'].cat.reorder_categories(['Normal', 'Tumor', 'Fibroblasts', 'T','NK','B','Myeloid','Plasma','Mast','pDC','Neutrophils'])

In [None]:
# Select samples for downstream analysis
ebv_groups = ['Negative', 'Positive']
adata = adata[adata.obs['EBV_status'].isin(ebv_groups)].copy()

# Select cell types for downstream analysis
cell_types = ['Tumor','T','NK','B','Myeloid','Plasma','Mast','pDC','Neutrophils']
adata = adata[adata.obs['Cell_type'].isin(cell_types)].copy()
adata

In [None]:
for i in adata.obs['Cell_type'].cat.categories:
  number = len(adata.obs[adata.obs['Cell_type']==i])
  print('the number of category {} is {}'.format(i,number))

In [None]:
adata.raw = adata

The anndata object contains counts that have been normalized (per cell) and log-transformed.

In [None]:
print(np.min(adata.X), np.max(adata.X))
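
As a point of reference for the range printed above, here is a minimal sketch of per-cell normalization followed by a log1p transform on a toy count matrix. The counts and the scaling factor `target_sum = 1e4` are assumptions made for illustration; the values actually used to process this dataset are not stated in the notebook.

```python
import numpy as np

# Toy raw count matrix: 3 cells (rows) x 4 genes (columns); values are made up.
counts = np.array([[10, 0, 5, 5],
                   [2, 2, 0, 0],
                   [0, 1, 1, 8]], dtype=float)

target_sum = 1e4  # assumed scaling factor, not taken from the notebook

# Scale every cell to the same total, then apply log(1 + x).
per_cell_total = counts.sum(axis=1, keepdims=True)
normalized = counts / per_cell_total * target_sum
log_norm = np.log1p(normalized)

print(log_norm.min(), log_norm.max())
```

With this kind of preprocessing the minimum is 0 (genes with no counts) and the maximum cannot exceed log1p(target_sum), so a printed range far outside those bounds would suggest the matrix holds raw or scaled values instead.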

##### Background
liana typically works with the log1p-transformed counts matrix; in this object, the normalized counts are stored in raw: adata.raw.X

Preferably, one would use liana with all features (genes) for which we have enough counts, but for the sake of this tutorial we are working with a matrix pre-filtered to the variable features alone.

In the background, liana aggregates the counts matrix and generates statistics, typically related to cell identities. These statistics are then utilized by each of the methods in liana.
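
To make the idea of per-identity statistics concrete, the following sketch computes two such summaries with pandas on toy data: the mean expression of each gene per cell group, and the proportion of cells in each group that express it. The gene names `LIG` and `REC`, the group labels, and the values are hypothetical, and this is not liana's internal code.

```python
import pandas as pd

# Toy log-normalized expression: 6 cells x 2 genes (values are made up).
expr = pd.DataFrame(
    {"LIG": [2.0, 0.0, 1.0, 0.0, 0.0, 0.0],
     "REC": [0.0, 0.0, 0.0, 3.0, 1.0, 2.0]},
    index=[f"cell{i}" for i in range(6)],
)
labels = pd.Series(["Macro"] * 3 + ["T"] * 3, index=expr.index, name="Cell_type")

# Mean expression of each gene within each cell group.
group_mean = expr.groupby(labels).mean()

# Fraction of cells in each group with non-zero expression of each gene.
group_prop = (expr > 0).groupby(labels).mean()

# A threshold in the spirit of expr_prop=0.1: keep genes expressed in >= 10% of a group.
passes = group_prop >= 0.1

print(group_mean)
print(group_prop)
```

The proportion table is the kind of quantity an expression-proportion threshold would act on: a ligand or receptor detected in too few cells of a group would be dropped for that group.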



In [None]:
li.mt.show_methods()

In [None]:
li.rs.show_resources()

Each method infers relevant ligand-receptor interactions under different assumptions and returns its own ligand-receptor scores, typically a pair per method: one score reflects the magnitude (strength) of the interaction, and the other reflects how specific the interaction is to a given pair of cell identities.
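
The distinction between magnitude and specificity can be illustrated with a small sketch. Here magnitude is the product of the mean ligand expression in the source and the mean receptor expression in the target, and specificity is the share of total ligand and receptor expression attributable to that pair. This is loosely modeled on product-style scores and is an assumption for illustration, not the implementation of any particular liana method; all names and values are made up.

```python
import pandas as pd

# Made-up mean expression per cell group for one ligand and one receptor.
means = pd.DataFrame(
    {"LIG": [4.0, 1.0, 0.0], "REC": [0.5, 3.0, 1.5]},
    index=["Macro", "CD8 T", "Treg"],
)

rows = []
for source in means.index:
    for target in means.index:
        lig = means.loc[source, "LIG"]
        rec = means.loc[target, "REC"]
        rows.append({
            "source": source,
            "target": target,
            # Magnitude: how strongly the pair is expressed.
            "magnitude": lig * rec,
            # Specificity: how much of the total signal belongs to this pair.
            "specificity": (lig / means["LIG"].sum()) * (rec / means["REC"].sum()),
        })

scores = pd.DataFrame(rows)
best = scores.sort_values("magnitude", ascending=False).iloc[0]
print(scores)
```

An interaction can have a high magnitude yet low specificity when the ligand or receptor is broadly expressed across groups, which is why the two scores are reported separately.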


##### Method Class

Methods in liana are callable instances of the Method class. To obtain further information about a method, the user can consult its documentation with ?method_name or ?method.__call__. Alternatively, the method.describe function prints a short summary of each method.

For example, if the user wishes to learn more about liana’s rank_aggregate implementation, where we combine the scores of multiple methods, they could do the following:

In [None]:
# import liana's rank_aggregate
from liana.mt import rank_aggregate

In [None]:
rank_aggregate.describe()

In [None]:
# import all individual methods
from liana.method import singlecellsignalr, connectome, cellphonedb, natmi, logfc, cellchat, geometric_mean

### Steady-state Ligand-Receptor inference between immune cells

In [None]:
adata_myeloid = sc.read_h5ad("Processed Data/scRNA_Myeloid.h5ad")
adata_myeloid

# Select samples for downstream analysis
ebv_groups = ['Negative', 'Positive']  
adata_myeloid = adata_myeloid[adata_myeloid.obs['EBV_status'].isin(ebv_groups)].copy()

adata.obs['Cell_type'] = adata.obs['Cell_type'].astype(str) 
adata.obs.loc[adata_myeloid.obs.index, 'Cell_type'] = adata_myeloid.obs['Myeloid_subtype']

In [None]:
adata_TCell = sc.read_h5ad("Processed Data/scRNA_TCell.h5ad")
adata_TCell

# Select samples for downstream analysis
ebv_groups = ['Negative', 'Positive']  
adata_TCell = adata_TCell[adata_TCell.obs['EBV_status'].isin(ebv_groups)].copy()

adata.obs['Cell_type'] = adata.obs['Cell_type'].astype(str)
adata.obs.loc[adata_TCell.obs.index, 'Cell_type'] = adata_TCell.obs['T_subtype']
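
The two cells above overwrite the coarse labels with subtype labels by matching cell barcodes. A minimal sketch of the same pattern on toy tables follows; restricting the assignment to barcodes present in both tables is an added safeguard that is not part of the original cells, and all barcodes and labels here are hypothetical.

```python
import pandas as pd

# Toy stand-ins for adata.obs and the obs table of a subtype-annotated subset.
full_obs = pd.DataFrame(
    {"Cell_type": ["T", "T", "Myeloid", "B"]},
    index=["c1", "c2", "c3", "c4"],
)
sub_obs = pd.DataFrame(
    {"T_subtype": ["Treg", "CD8 T", "CD4 T"]},
    index=["c1", "c2", "c9"],  # c9 is absent from full_obs
)

# Only barcodes found in both tables can be relabeled safely.
shared = full_obs.index.intersection(sub_obs.index)
full_obs.loc[shared, "Cell_type"] = sub_obs.loc[shared, "T_subtype"]

print(full_obs)
```

This matters when the full object has been filtered after the subset was created, since the subset may then contain barcodes that no longer exist in the full object.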

In [None]:
adata.obs['Cell_type'] = adata.obs['Cell_type'].astype('category')
for i in adata.obs['Cell_type'].cat.categories:
  number = len(adata.obs[adata.obs['Cell_type']==i])
  print('the number of category {} is {}'.format(i,number))

In [None]:
cell_type_mapping = {
    # Monocyte
    'CD14+ Mono': 'Monocyte',
    'CD16+ Mono': 'Monocyte',

    # CD4⁺ T
    'CD4⁺ IL21⁺ Tfh': 'CD4⁺ T',
    'CD4⁺ ISG⁺ T': 'CD4⁺ T',
    'CD4⁺ Tcm': 'CD4⁺ T',

    # Treg
    'TNFRSF9⁺ Treg': 'Treg',
    'TNFRSF9⁻ Treg': 'Treg',

    # CD8⁺ T
    'CD8⁺ GZMB⁺ Tem': 'CD8⁺ T',
    'CD8⁺ GZMB⁺ Tex': 'CD8⁺ T',
    'CD8⁺ GZMB⁺ early Tem': 'CD8⁺ T',
    'CD8⁺ GZMK⁺ Tpex': 'CD8⁺ T',
    'CD8⁺ ISG⁺ T': 'CD8⁺ T',
    'CD8⁺ ZNF683⁺ Trm': 'CD8⁺ T',
    'CD8⁺ activated-stress Tem': 'CD8⁺ T',
}

adata.obs['Cell_type_merged'] = adata.obs['Cell_type'].replace(cell_type_mapping)
adata.obs['Cell_type_merged'].value_counts()
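
Labels that do not appear as keys in cell_type_mapping, such as the macrophage subtypes selected in the next cell, pass through replace unchanged. The sketch below shows this behavior on a toy Series using a two-entry subset of the mapping; the label counts are made up.

```python
import pandas as pd

# Two entries taken from the mapping above; the Series values are toy data.
mapping = {"CD14+ Mono": "Monocyte", "CD16+ Mono": "Monocyte"}
labels = pd.Series(["CD14+ Mono", "CD16+ Mono", "SPP1+ Macro", "SPP1+ Macro"])

# Mapped labels are collapsed; unmapped labels are kept as they are.
merged = labels.replace(mapping)
counts = merged.value_counts()

print(counts)
```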

In [None]:
# Select samples for downstream analysis
cell_types = ['C1QC+ Macro','IL1B+ Macro','SPP1+ Macro', 'CD8⁺ T', 'CD4⁺ T', 'Treg']  
adata = adata[adata.obs['Cell_type_merged'].isin(cell_types)].copy()
adata

In [None]:
adata.obs['Cell_type_merged'] = adata.obs['Cell_type_merged'].astype('category')
for i in adata.obs['Cell_type_merged'].cat.categories:
  number = len(adata.obs[adata.obs['Cell_type_merged']==i])
  print('the number of category {} is {}'.format(i,number))

#### Rank Aggregate
In addition to the individual methods, LIANA also provides a consensus that integrates their predictions. The consensus is obtained by ranking the ligand-receptor interactions within each method and aggregating those rankings across methods with robust rank aggregation (RRA).
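
The general idea of a rank-based consensus can be sketched as follows: convert each method's scores to normalized ranks, taking into account whether higher or lower values are better, and then combine the ranks. This sketch uses a plain mean of normalized ranks as a simplified stand-in; RRA itself relies on a different statistic, so the numbers here would not match liana's output. The method names, interaction names, and scores are hypothetical.

```python
import pandas as pd

# Made-up scores from three hypothetical methods for four interactions.
scores = pd.DataFrame(
    {"method_a": [0.9, 0.2, 0.5, 0.1],        # higher = stronger
     "method_b": [10.0, 3.0, 8.0, 1.0],       # higher = stronger
     "method_c": [0.01, 0.40, 0.05, 0.90]},   # p-value-like: lower = stronger
    index=["L1^R1", "L2^R2", "L3^R3", "L4^R4"],
)
# Whether a smaller value should receive a better (lower) rank.
ascending = {"method_a": False, "method_b": False, "method_c": True}

# Normalized rank in (0, 1] per method; 1/n marks the top interaction.
norm_ranks = pd.DataFrame({
    m: scores[m].rank(ascending=ascending[m]) / len(scores)
    for m in scores.columns
})

# Simplified consensus: mean normalized rank, best interactions first.
consensus = norm_ranks.mean(axis=1).sort_values()
print(consensus)
```

Working on ranks rather than raw scores is what allows methods with very different score scales and directions to be combined.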

In [None]:
# Run rank_aggregate
li.mt.rank_aggregate(adata,
                     groupby='Cell_type_merged',
                     resource_name='consensus',
                     expr_prop=0.1, 
                     verbose=True)

In [None]:
adata.uns['liana_res'].head()

Most of the plots in this section are drawn with plotnine. For more plot modification options we refer the user to plotnine’s tutorials and to the following link for a quick intro: https://datacarpentry.org/python-ecology-lesson/07-visualization-ggplot-python/index.html.

#### Circle Plot
While the majority of liana’s plots are in plotnine, thanks to @WeipengMo, we also provide a circle plot (drawn in networkx):

In [None]:
li.pl.circle_plot(adata,
                  groupby='Cell_type_merged',
                  score_key='magnitude_rank',
                  inverse_score=True,
                  source_labels='SPP1+ Macro',
                  filter_fun=lambda x: x['specificity_rank'] <= 0.05,
                  pivot_mode='counts', # NOTE: this will simply count the interactions, 'mean' is also available
                  figure_size=(5, 5),
                  edge_alpha=0.5,
                  edge_arrow_size=10,
                  edge_width_scale=(1, 5),
                  node_alpha=1,
                  node_size_scale=(100, 400),
                  node_label_offset=(-0.1, -0.2),
                  node_label_size=8,
                  node_label_alpha= 0.7,
                  )
plt.tight_layout()
plt.savefig("Results/11.TCell/11.circle_plot_T_B.pdf", format='pdf', dpi=300, bbox_inches='tight')
plt.show()
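
As a rough illustration of what pivot_mode='counts' summarizes, the sketch below filters a toy results table with the same specificity threshold and counts the remaining interactions per source-target pair. It assumes that source_labels restricts edges to the named sender; the table is made up and this is not the plotting function's internal code.

```python
import pandas as pd

# Toy table with the columns used by the filter; values are made up.
liana_like = pd.DataFrame({
    "source": ["SPP1+ Macro"] * 4 + ["Treg"],
    "target": ["CD8 T", "CD8 T", "Treg", "Treg", "CD8 T"],
    "specificity_rank": [0.01, 0.20, 0.03, 0.04, 0.01],
})

# Same threshold as filter_fun in the circle plot call.
kept = liana_like[liana_like["specificity_rank"] <= 0.05]
# Assumed effect of source_labels: keep edges sent by this group only.
kept = kept[kept["source"] == "SPP1+ Macro"]

# Number of retained interactions per edge.
edge_counts = kept.groupby(["source", "target"]).size()
print(edge_counts)
```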

The remaining columns in adata.uns['liana_res'] come from the individual methods included in rank_aggregate; see li.mt.show_methods() to map methods to scores.

#### Dotplot
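We will now plot the most ‘relevant’ interactions from the rank_aggregate results for selected source-target pairs. The first two dotplots keep interactions with magnitude_rank <= 0.5 and order them by specificity_rank; the last two keep interactions with specificity_rank <= 0.5 and order them by magnitude_rank.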
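
The combination of filter_fun, orderby, and top_n can be read as three table operations: filter, sort, and take the first rows. The pandas sketch below applies them to a toy results table; the column names follow those used in the plotting calls, while the interaction names and values are made up.

```python
import pandas as pd

# Toy aggregated results; values are made up.
res = pd.DataFrame({
    "ligand_complex": ["A", "B", "C", "D", "E"],
    "receptor_complex": ["a", "b", "c", "d", "e"],
    "magnitude_rank": [0.10, 0.80, 0.30, 0.05, 0.45],
    "specificity_rank": [0.20, 0.01, 0.02, 0.50, 0.10],
})


def filter_fun(x):
    """Keep interactions whose magnitude_rank is at most 0.5."""
    return x["magnitude_rank"] <= 0.5


top_n = 3

# 1) filter, 2) order by specificity_rank (ascending), 3) keep the top_n rows.
kept = res[filter_fun(res)]
top = kept.sort_values("specificity_rank", ascending=True).head(top_n)
print(top)
```

Note that interaction B has the best specificity_rank but is removed by the magnitude filter before ordering, so the order of the filter and the sort affects which interactions end up in the plot.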



In [None]:
my_plot = li.pl.dotplot(adata = adata,
              colour='specificity_rank',
              size='magnitude_rank',
              inverse_size=True,
              inverse_colour=True,
              source_labels=['SPP1+ Macro'],
              target_labels=['CD8⁺ T'],
              top_n=10,
              orderby='specificity_rank',
              filter_fun=lambda x: x['magnitude_rank'] <=0.5,
              orderby_ascending=True,
              figure_size=(5, 8)
             )
my_plot

In [None]:
my_plot.save('Results/08.CCC/08.dotplot_SPP1Macro_CD8T.pdf')

In [None]:
my_plot = li.pl.dotplot(adata = adata,
              colour='specificity_rank',
              size='magnitude_rank',
              inverse_size=True,
              inverse_colour=True,
              source_labels=['SPP1+ Macro'],
              target_labels=['Treg'],
              top_n=10,
              orderby='specificity_rank',
              filter_fun=lambda x: x['magnitude_rank'] <=0.5,
              orderby_ascending=True,
              figure_size=(5, 8)
             )
my_plot

In [None]:
my_plot.save('Results/08.CCC/08.dotplot_SPP1Macro_Treg.pdf')

In [None]:
my_plot = li.pl.dotplot(adata = adata,
              colour='specificity_rank',
              size='magnitude_rank',
              inverse_size=True,
              inverse_colour=True,
              source_labels=['CD8⁺ T'],
              target_labels=['Treg'],
              #ligand_complex = 'MIF',
              top_n=10,
              orderby='magnitude_rank',
              filter_fun=lambda x: x['specificity_rank'] <= 0.5,
              orderby_ascending=True,
              figure_size=(5, 8)
             )
my_plot

In [None]:
my_plot.save('Results/08.CCC/08.dotplot_CD8T_Treg.pdf')

In [None]:
my_plot = li.pl.dotplot(adata = adata,
              colour='specificity_rank',
              size='magnitude_rank',
              inverse_size=True,
              inverse_colour=True,
              source_labels=['Treg'],
              target_labels=['CD8⁺ T'],
              #ligand_complex = 'MIF',
              top_n=10,
              orderby='magnitude_rank',
              filter_fun=lambda x: x['specificity_rank'] <= 0.5,
              orderby_ascending=True,
              figure_size=(5, 8)
             )
my_plot

In [None]:
my_plot.save('Results/08.CCC/08.dotplot_Treg_CD8T.pdf')

In [None]:
my_plot = li.pl.tileplot(adata = adata,
                         # NOTE: fill & label need to exist for both
                         # ligand_ and receptor_ columns
                         fill='specificity_rank',
                         label='magnitude_rank',
                         label_fun=lambda x: f'{x:.2f}',
                         top_n=10,
                         orderby='cellphone_pvals',
                         orderby_ascending=True,
                         source_labels=['SPP1+ Macro'],
                         target_labels=['CD8⁺ T','Treg'],
                         uns_key='liana_res', # NOTE: default is 'liana_res'
                         source_title='Ligand',
                         target_title='Receptor',
                         figure_size=(8, 7)
                         )
my_plot

#### Circle Plot
The circle plot can also be drawn with the default edge and node styling and a narrower figure:

In [None]:
li.pl.circle_plot(adata,
                  groupby='Cell_type_merged',
                  score_key='magnitude_rank',
                  inverse_score=True,
                  source_labels='SPP1+ Macro',
                  filter_fun=lambda x: x['specificity_rank'] <= 0.05,
                  pivot_mode='counts', # NOTE: this will simply count the interactions, 'mean' is also available
                  node_label_offset=(-0.1, -0.2),
                  figure_size=(4, 5),
                  )
plt.tight_layout()
plt.savefig("Results/08.CCC/08.circle_plot_Tumor_Macrophage_NEW.pdf", format='pdf', dpi=300, bbox_inches='tight')
plt.show()