 ![CellphoneDB Logo](https://www.cellphonedb.org/images/cellphonedb_logo_33.png) | CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions. ||
 :------------- | :------------- | :-------------

CellPhoneDB records the subunit architecture of both ligands and receptors, so heteromeric complexes are represented accurately. This matters because cell-cell communication often relies on multi-subunit protein complexes, which the binary (one ligand, one receptor) representation used in most databases and studies cannot capture.
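
To make the idea of subunit architecture concrete, the sketch below models a heteromeric complex as a named group of subunit genes and scores it by its least-expressed subunit. The `Complex` class, the `complex_expression` helper, the minimum rule and the expression values are illustrative assumptions, not CellPhoneDB's internal data model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Complex:
    """Hypothetical container: a complex name plus the genes of its subunits."""
    name: str
    subunits: Tuple[str, ...]


def complex_expression(cplx: Complex, mean_expr: Dict[str, float]) -> float:
    """Score a complex by its least-expressed subunit (assumed rule).

    A subunit missing from mean_expr is treated as not expressed (0.0).
    """
    return min(mean_expr.get(gene, 0.0) for gene in cplx.subunits)


# Integrin a10b1 is a heterodimer of ITGA10 and ITGB1; the values are made up.
a10b1 = Complex("integrin_a10b1_complex", ("ITGA10", "ITGB1"))
toy_expression = {"ITGA10": 0.2, "ITGB1": 3.5}
level = complex_expression(a10b1, toy_expression)
print(level)
```

A binary ligand-receptor table would record only one gene per partner; under the rule assumed here, a highly expressed ITGB1 cannot compensate for a weakly expressed ITGA10.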

CellPhoneDB integrates existing datasets that pertain to cellular communication with new, manually reviewed information. It draws on the following databases: [UniProt](https://www.uniprot.org/), [Ensembl](https://www.ensembl.org/), [PDB](https://www.ebi.ac.uk/pdbe/), [the IMEx consortium](https://www.imexconsortium.org/) and [IUPHAR](https://www.guidetopharmacology.org/).

CellPhoneDB can be used to search for a particular ligand/receptor or interrogate your own single-cell transcriptomics data.



## Install CellphoneDB package

In [None]:
%%capture
%pip install --force-reinstall "git+https://github.com/ventolab/CellphoneDB.git"

## List CellphoneDB data releases

In [None]:
from IPython.display import HTML, display
from cellphonedb.utils import db_releases_utils
display(HTML(db_releases_utils.get_remote_database_versions_html()['db_releases_html_table']))

## Set CellphoneDB version and local directories for the database and user data

In [4]:
import os
# The default version of CellphoneDB data is the latest one, but you can change it to a previous version 
# at any point in this notebook (by re-setting the value of cpdb_version variable). 
# Please note that the format of the database from version v4.1.0 is incompatible with that of previous 
# versions, hence the lowest version number you may choose in this notebook is v4.1.0
cpdb_version = "v4.1.0"
# N.B. At the very least, please replace the example user directory below (/Users/rp23) with your own
cpdb_dir = os.path.join("/Users/rp23/.cpdb/releases", cpdb_version)
# If you generated your own CellphoneDB database file, please replace the default path below with your file's path
cpdb_file_path = os.path.join(cpdb_dir, "cellphonedb.zip")
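
The cell above hard-codes one user's home directory. As a hedged alternative, the sketch below derives the same layout from the current user's home directory. The helper name `cpdb_paths` and its `base_dir` parameter are hypothetical; only the `.cpdb/releases/<version>/cellphonedb.zip` layout is taken from the cell above.

```python
import os
from pathlib import Path


def cpdb_paths(version, base_dir=None):
    """Return (release_dir, zip_path) for a CellphoneDB data release.

    base_dir defaults to ~/.cpdb/releases, mirroring the layout used above.
    """
    if base_dir is None:
        base_dir = Path.home() / ".cpdb" / "releases"
    release_dir = os.path.join(str(base_dir), version)
    return release_dir, os.path.join(release_dir, "cellphonedb.zip")


# An explicit base_dir keeps the example independent of the machine it runs on.
example_dir, example_zip = cpdb_paths("v4.1.0", base_dir="/tmp/cpdb/releases")
print(example_dir)
print(example_zip)
```

Calling `cpdb_paths(cpdb_version)` without `base_dir` would give values that could be assigned to `cpdb_dir` and `cpdb_file_path`.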

## Download CellphoneDB database from https://github.com/ventolab/cellphonedb-data/

In [None]:
import os
from cellphonedb.utils import db_utils
db_utils.download_database(cpdb_dir, cpdb_version)

## Search CellphoneDB Interactions
#### Search CellphoneDB interactions by (a comma- or space-separated list of): 
* Ensembl ID (e.g. ENSG00000165029), 
* Gene name (e.g. ABCA1), 
* UniProt ID (e.g. KLRG2_HUMAN), 
* UniProt Accession (e.g. A4D1S0) or 
* Complex name (e.g. 12oxoLeukotrieneB4_byPTGR1)

In [None]:
import os
from cellphonedb.utils import file_utils, search_utils
from IPython.display import HTML, display
# Search CellphoneDB interactions by (a comma- or space-separated list of):
# Ensembl ID (e.g. ENSG00000165029), Gene name (e.g. ABCA1), UniProt ID (e.g. KLRG2_HUMAN), 
# UniProt Accession (e.g. A4D1S0) or Complex name (e.g. 12oxoLeukotrieneB4_byPTGR1)
(results, complex_name2proteins_text) = \
    search_utils.search(query_str = 'D17S1718,ENSG00000134780,integrin_a10b1_complex', 
                        cpdb_file_path = cpdb_file_path)
# Display results in a html table
# Note: Mouse over complex names to see constituent proteins
display(HTML(search_utils.get_html_table(results, complex_name2proteins_text)))
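
The query string accepts a comma- or space-separated list of identifiers. The sketch below shows one way such a string could be split into individual terms before lookup. `split_query` is a hypothetical helper written for illustration and is not the parser used inside `search_utils.search`.

```python
import re


def split_query(query_str):
    """Split a query on commas and/or whitespace, dropping empty terms."""
    return [term for term in re.split(r"[,\s]+", query_str.strip()) if term]


# Mixed separators are handled the same way as a purely comma-separated list.
tokens = split_query("D17S1718,ENSG00000134780 integrin_a10b1_complex")
print(tokens)
```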

## Run Basic Analysis

In [8]:
# Please populate the following variables before executing the analysis.
# The example paths below point to the endometrium_v1 test data; replace them with your own.
output_path = "/Users/rp23/.cpdb/user_files/out"
counts_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/counts.h5ad"
meta_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/meta.tsv"
# degs_file_path and microenvs_file_path are not used by the call below (it passes microenvs_file_path = None)
degs_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/degs_in_epithelials.tsv"
microenvs_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/microenviroments.tsv"
cpdb_file_path = os.path.join(cpdb_dir, "cellphonedb.zip")
# Execute basic analysis
from cellphonedb.src.core.methods import cpdb_analysis_method
means, deconvoluted, deconvoluted_percents, interaction_scores = cpdb_analysis_method.call(
    cpdb_file_path = cpdb_file_path, 
    meta_file_path = meta_file_path, 
    counts_file_path = counts_file_path,
    counts_data = 'hgnc_symbol',
    output_path = output_path,
    microenvs_file_path = None,
    separator = "|",
    threshold = 0.1,
    result_precision = 3,
    debug = False,
    output_suffix = None,
    score_interactions = True,
    threads = 4)
# Uncomment to print a summary of each returned DataFrame
# means.info()
# deconvoluted.info()
# deconvoluted_percents.info()
# interaction_scores.info()
example_table = interaction_scores[['id_cp_interaction','partner_a','partner_b','Lymphoid|SOX9_prolif', 
                    'SOX9_prolif|Lymphoid']].sort_values('Lymphoid|SOX9_prolif', ascending = False)
example_table

Unnamed: 0,id_cp_interaction,partner_a,partner_b,Lymphoid|SOX9_prolif,SOX9_prolif|Lymphoid
952,CPI-SS05632AB58,simple:P13591,simple:P11362,100.0,0.000
1095,CPI-SC02DEB4893,simple:P01579,complex:Type_II_IFNR,100.0,0.000
1658,CPI-SS0D168412F,simple:P48023,simple:P25445,100.0,0.000
1165,CPI-SS0E292C126,simple:Q12918,simple:Q9UHP7,100.0,0.000
1651,CPI-SS02C509183,simple:Q06643,simple:P36941,100.0,0.000
...,...,...,...,...,...
970,CPI-SC0E029AB31,simple:Q03405,complex:integrin_a4b1_complex,0.0,22.137
971,CPI-SC0B2557C9D,simple:P19320,complex:integrin_a4b1_complex,0.0,0.000
972,CPI-SC034EE8AAA,simple:P02751,complex:integrin_a4b7_complex,0.0,0.000
973,CPI-SC056FC1F11,simple:Q13477,complex:integrin_a4b7_complex,0.0,0.000
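
The call above passes `threshold = 0.1`. Assuming this is the minimum fraction of cells in a cell type that must express a gene for it to be considered, the sketch below shows how such a filter could be computed with pandas. The toy counts, gene names and cell type labels are made up, and `fraction_expressing` is a hypothetical helper rather than part of the CellphoneDB API.

```python
import pandas as pd

# Toy counts: rows are cells, columns are genes (values are made up).
toy_counts = pd.DataFrame(
    {"GENE_A": [0, 2, 0, 0, 1, 3], "GENE_B": [0, 0, 0, 1, 0, 0]},
    index=["c1", "c2", "c3", "c4", "c5", "c6"],
)
toy_cell_types = pd.Series(["T", "T", "T", "B", "B", "B"], index=toy_counts.index)


def fraction_expressing(counts, cell_types):
    """Fraction of cells per cell type with a non-zero count for each gene."""
    return (counts > 0).groupby(cell_types).mean()


threshold = 0.1
fractions = fraction_expressing(toy_counts, toy_cell_types)
passes = fractions >= threshold  # True where a gene would be kept for that cell type
print(fractions)
print(passes)
```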


## Run Statistical Analysis

In [None]:
# Please populate the following variables before executing the analysis.
# The example paths below point to the endometrium_v1 test data; replace them with your own.
output_path = "/Users/rp23/.cpdb/user_files/out"
counts_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/counts.h5ad"
meta_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/meta.tsv"
# degs_file_path and microenvs_file_path are not used by the call below (it passes microenvs_file_path = None)
degs_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/degs_in_epithelials.tsv"
microenvs_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/microenviroments.tsv"
cpdb_file_path = os.path.join(cpdb_dir, "cellphonedb.zip")
# Execute statistical analysis
from cellphonedb.src.core.methods import cpdb_statistical_analysis_method
deconvoluted, deconvoluted_percents, means, pvalues, significant_means, interaction_scores = \
    cpdb_statistical_analysis_method.call(
        cpdb_file_path = cpdb_file_path, 
        meta_file_path = meta_file_path, 
        counts_file_path = counts_file_path,
        counts_data = 'hgnc_symbol',
        output_path = output_path,
        microenvs_file_path = None,
        iterations = 1000,
        threshold = 0.1,
        threads = 4,
        debug_seed = -1,
        result_precision = 3,
        pvalue = 0.05,
        subsampling = False,
        subsampling_log = False,
        subsampling_num_pc = 100,
        subsampling_num_cells = None,
        separator = '|',
        debug = False,
        output_suffix = None,
        score_interactions = True)
# Uncomment to print a summary of each returned DataFrame
# deconvoluted.info()
# deconvoluted_percents.info()
# means.info()
# pvalues.info()
# significant_means.info()
# interaction_scores.info()
example_table = interaction_scores[['id_cp_interaction','partner_a','partner_b','Lymphoid|SOX9_prolif', 
                    'SOX9_prolif|Lymphoid']].sort_values('Lymphoid|SOX9_prolif', ascending = False)
example_table

Reading user files...
The following user files were loaded successfully:
/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/counts.h5ad
/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/meta.tsv
[ ][CORE][03/05/23-10:39:11][INFO] [Cluster Statistical Analysis] Threshold:0.1 Iterations:1000 Debug-seed:-1 Threads:4 Precision:3
[ ][CORE][03/05/23-10:39:18][INFO] Running Real Analysis
[ ][CORE][03/05/23-10:39:18][INFO] Running Statistical Analysis


  2%|██▊                                                                                                                    | 24/1000 [00:14<07:41,  2.11it/s]
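
The `iterations = 1000` argument and the progress bar above suggest a permutation-based test. The sketch below illustrates the general technique on toy data: cell type labels are shuffled repeatedly, and the p-value is the fraction of shuffles whose ligand-receptor mean is at least as high as the observed one. It is a simplified illustration under those assumptions, not the implementation in `cpdb_statistical_analysis_method`; the function name and all values are made up.

```python
import numpy as np


def permutation_pvalue(ligand, receptor, labels, cluster_a, cluster_b,
                       iterations=1000, seed=0):
    """Permutation p-value for one ligand-receptor pair and one cluster pair."""
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    def pair_mean(lbls):
        # Average of the ligand mean in cluster_a and the receptor mean in cluster_b.
        return (ligand[lbls == cluster_a].mean()
                + receptor[lbls == cluster_b].mean()) / 2.0

    observed = pair_mean(labels)
    # Shuffling keeps cluster sizes fixed, so no cluster is ever empty.
    null = np.array([pair_mean(rng.permutation(labels)) for _ in range(iterations)])
    return observed, float(np.mean(null >= observed))


# Toy data: the ligand is expressed only in cluster A, the receptor only in B.
toy_labels = ["A"] * 20 + ["B"] * 20
toy_ligand = [5.0] * 20 + [0.0] * 20
toy_receptor = [0.0] * 20 + [5.0] * 20
observed, p_value = permutation_pvalue(toy_ligand, toy_receptor, toy_labels, "A", "B")
print(observed, p_value)
```

With expression this strongly tied to the labels, almost no shuffle reaches the observed mean, so the p-value falls well below the `pvalue = 0.05` cut-off used above.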

## Run Differential Analysis

In [12]:
# Please populate the following variables before executing the analysis.
# The example paths below point to the endometrium_v1 test data; replace them with your own.
output_path = "/Users/rp23/.cpdb/user_files/out"
counts_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/counts.h5ad"
meta_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/meta.tsv"
degs_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/degs_in_epithelials.tsv"
# microenvs_file_path is not used by the call below (it passes microenvs_file_path = None)
microenvs_file_path = "/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/microenviroments.tsv"
cpdb_file_path = os.path.join(cpdb_dir, "cellphonedb.zip")
# Execute differential analysis
from cellphonedb.src.core.methods import cpdb_degs_analysis_method
deconvoluted, deconvoluted_percents, means, relevant_interactions, significant_means, interaction_scores = \
    cpdb_degs_analysis_method.call(
        cpdb_file_path = cpdb_file_path, 
        meta_file_path = meta_file_path, 
        counts_file_path = counts_file_path,
        degs_file_path = degs_file_path,
        counts_data = 'hgnc_symbol',
        microenvs_file_path=None,
        threshold = 0.1,
        result_precision = 3,
        separator = '|',
        debug = False,
        output_path = output_path,
        output_suffix = None,
        score_interactions = True,
        threads = 4)
# Uncomment to print a summary of each returned DataFrame
# deconvoluted.info()
# deconvoluted_percents.info()
# means.info()
# relevant_interactions.info()
# significant_means.info()
# interaction_scores.info()
example_table = interaction_scores[['id_cp_interaction','partner_a','partner_b','Lymphoid|SOX9_prolif', 
                    'SOX9_prolif|Lymphoid']].sort_values('Lymphoid|SOX9_prolif', ascending = False)
example_table

[ ][CORE][03/05/23-10:37:11][INFO] [Cluster DEGs Analysis] Threshold:0.1 Precision:3
Reading user files...
The following user files were loaded successfully:
/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/counts.h5ad
/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/meta.tsv
/Users/rp23/.cpdb/tests/data/examples/endometrium_v1/degs_in_epithelials.tsv
[ ][CORE][03/05/23-10:37:33][INFO] Running Real Analysis
[ ][CORE][03/05/23-10:37:33][INFO] Running DEGs-based Analysis
[ ][CORE][03/05/23-10:37:34][INFO] Building results
[ ][CORE][03/05/23-10:37:35][INFO] Scoring interactions: Filtering genes per cell type..
[ ][CORE][03/05/23-10:37:35][INFO] Scoring interactions: Calculating mean expression of each gene per group/cell type..
[ ][CORE][03/05/23-10:37:36][INFO] Scoring interactions: Calculating scores for all interactions and cell types..

Saved deconvoluted_result to /Users/rp23/.cpdb/user_files/out/degs_analysis_deconvoluted_result_05_03_2023_103817.txt
Saved deconvoluted_percents to /Users/rp23/.cpdb/user_files/out/degs_analysis_deconvoluted_percents_05_03_2023_103817.txt
Saved means_result to /Users/rp23/.cpdb/user_files/out/degs_analysis_means_result_05_03_2023_103817.txt
Saved relevant_interactions_result to /Users/rp23/.cpdb/user_files/out/degs_analysis_relevant_interactions_result_05_03_2023_103817.txt
Saved significant_means to /Users/rp23/.cpdb/user_files/out/degs_analysis_significant_means_05_03_2023_103817.txt
Saved interaction_scores to /Users/rp23/.cpdb/user_files/out/degs_analysis_interaction_scores_05_03_2023_103817.txt


Unnamed: 0,id_cp_interaction,partner_a,partner_b,Lymphoid|SOX9_prolif,SOX9_prolif|Lymphoid
952,CPI-SS05632AB58,simple:P13591,simple:P11362,100.0,0.000
1095,CPI-SC02DEB4893,simple:P01579,complex:Type_II_IFNR,100.0,0.000
1658,CPI-SS0D168412F,simple:P48023,simple:P25445,100.0,0.000
1165,CPI-SS0E292C126,simple:Q12918,simple:Q9UHP7,100.0,0.000
1651,CPI-SS02C509183,simple:Q06643,simple:P36941,100.0,0.000
...,...,...,...,...,...
970,CPI-SC0E029AB31,simple:Q03405,complex:integrin_a4b1_complex,0.0,22.137
971,CPI-SC0B2557C9D,simple:P19320,complex:integrin_a4b1_complex,0.0,0.000
972,CPI-SC034EE8AAA,simple:P02751,complex:integrin_a4b7_complex,0.0,0.000
973,CPI-SC056FC1F11,simple:Q13477,complex:integrin_a4b7_complex,0.0,0.000
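
Each score column is named after a pair of cell types joined by the `separator` (here `|`). The sketch below rebuilds three rows of the table above by hand and shows one way to keep only interactions with a non-zero score and report the cell type pair where each scores highest. The variable names are illustrative assumptions; in the notebook the same operations could be applied to `interaction_scores` directly.

```python
import pandas as pd

# Three rows copied from the output table above.
toy_scores = pd.DataFrame(
    {
        "id_cp_interaction": ["CPI-SS05632AB58", "CPI-SC0E029AB31", "CPI-SC0B2557C9D"],
        "partner_a": ["simple:P13591", "simple:Q03405", "simple:P19320"],
        "partner_b": ["simple:P11362", "complex:integrin_a4b1_complex",
                      "complex:integrin_a4b1_complex"],
        "Lymphoid|SOX9_prolif": [100.0, 0.0, 0.0],
        "SOX9_prolif|Lymphoid": [0.0, 22.137, 0.0],
    },
    index=[952, 970, 971],
)

separator = "|"
# Score columns are the ones whose name contains the separator.
pair_columns = [col for col in toy_scores.columns if separator in col]

# Keep interactions scoring above zero in at least one cell type pair.
nonzero = toy_scores[(toy_scores[pair_columns] > 0).any(axis=1)].copy()
nonzero["best_pair"] = nonzero[pair_columns].idxmax(axis=1)
nonzero["best_score"] = nonzero[pair_columns].max(axis=1)
print(nonzero[["id_cp_interaction", "best_pair", "best_score"]])
```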


## Plot Statistical Analysis results

In [None]:
import warnings
warnings.filterwarnings('ignore')
from ktplotspy.plot import plot_cpdb, plot_cpdb_heatmap
from cellphonedb.utils import file_utils
import os

# Replace the file names below with the paths to your own metadata and counts files
meta_fp='test_meta.txt'
# counts_fp='test_counts.txt'
counts_fp='test.h5ad'
# Create an AnnData object whose obs is a DataFrame holding the data from meta_fp
adata = file_utils.get_counts_meta_adata(counts_fp, meta_fp)

# Example dot plot
g1 = plot_cpdb(
        adata=adata,
        cell_type1="Myeloid",
        # '.' means any cell type
        cell_type2=".",
        means=means,
        pvals=pvalues,
        celltype_key="cell_type",
        genes=["FN1", "integrin-a5b1-complex","COLEC12"],
        title="Example dot plot"
    )

# Example heatmap
g2 = plot_cpdb_heatmap(
        adata=adata,
        pvals=pvalues,
        celltype_key="cell_type",
        log1p_transform=True,
        title="Example heatmap"
    )
g1, g2
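
To give a sense of what a heatmap of p-values can summarise, the sketch below counts the interactions below a significance cut-off for each cell type pair and then applies a `log1p` transform, which is what the `log1p_transform=True` argument name suggests. This is an assumption about the general idea only; `plot_cpdb_heatmap` may aggregate the counts differently (for example by combining both directions of a pair). The table and its values are made up.

```python
import numpy as np
import pandas as pd

# Toy p-value table in wide format: one column per cell type pair (values made up).
toy_pvalues = pd.DataFrame(
    {
        "interacting_pair": ["L1_R1", "L2_R2", "L3_R3"],
        "Myeloid|Lymphoid": [0.01, 0.20, 0.03],
        "Lymphoid|Myeloid": [0.50, 0.04, 1.00],
    }
)

alpha = 0.05
pair_cols = [col for col in toy_pvalues.columns if "|" in col]

# Number of interactions with p < alpha for each cell type pair.
n_significant = (toy_pvalues[pair_cols] < alpha).sum()
# log(1 + count) compresses large counts while keeping zero at zero.
log_counts = np.log1p(n_significant)
print(n_significant)
print(log_counts)
```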


## Search for interactions in Statistical or Differential Analysis Results

In [None]:
"""
Search results of either statistical or DEG analysis for relevant interactions matching any of:
    1. A gene in genes
    2. A complex containing a gene in genes
    3. An interaction name in interactions (e.g. 12oxoLeukotrieneB4_byPTGR1)
where at least one pair of cell types containing one cell type from cell_types_1
and one cell type from cell_types_2 has a significant mean.
NB. If genes and interactions are empty, and cell_types_1 and cell_types_2 are both set to "All",
then all relevant interactions are returned.
"""
from IPython.display import HTML, display
from cellphonedb.utils import search_utils

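# Defaults: no filtering, so all relevant interactions are returned
cell_types_1="All"
cell_types_2="All"
genes=None
interactions=None

# Example query: these assignments override the defaults above
cell_types_1=['Fibroblast dS','EVT_2']
cell_types_2=['epi_Ciliated','dS2']
genes=['DKK1']
interactions=['DKK1_LRP6','LPAR2_ADGRE5','TGFB1_TGFbeta_receptor2']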
separator="|"
# Set long_format to True to transpose the results table (so that cell type pairs are shown in a single column)
long_format = False

search_results = search_utils.search_analysis_results(
        query_cell_types_1=cell_types_1,
        query_cell_types_2=cell_types_2,
        query_genes=genes,
        query_interactions=interactions,
        significant_means=significant_means,
        deconvoluted=deconvoluted,
        separator=separator,
        long_format=long_format
)
display(HTML(search_results.to_html(index=False)))
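
The `long_format` flag is described above as moving the cell type pairs into a single column. The sketch below shows the same kind of wide-to-long reshaping with `pandas.DataFrame.melt` on a made-up table. The column names and values are illustrative assumptions and do not reproduce the exact output of `search_analysis_results`.

```python
import pandas as pd

# Toy wide table: one column per cell type pair (values are made up).
wide_table = pd.DataFrame(
    {
        "interacting_pair": ["DKK1_LRP6", "LPAR2_ADGRE5"],
        "Fibroblast dS|epi_Ciliated": [0.8, 0.0],
        "EVT_2|dS2": [0.0, 1.2],
    }
)

# Melt so that each row holds one interaction for one cell type pair.
long_table = wide_table.melt(
    id_vars="interacting_pair",
    var_name="interacting_cell_types",
    value_name="significant_mean",
)
print(long_table)
```

The long layout has one row per interaction and cell type pair, which is often easier to filter or plot than one column per pair.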
