 ![CellphoneDB Logo](https://www.cellphonedb.org/images/cellphonedb_logo_33.png) CellphoneDB

### Check python version

In [12]:
import pandas as pd
import numpy as np
import sys
import os

pd.set_option('display.max_columns', 100)
# Define our base directory for the analysis
os.chdir('/home/jovyan/RepTract/CELLPHONEDB/')

Check that the environment provides Python >= 3.8, as required by CellPhoneDB.

In [13]:
print(sys.version)

3.8.16 | packaged by conda-forge | (default, Feb  1 2023, 16:01:55) 
[GCC 11.3.0]
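
The same requirement can be checked programmatically instead of by reading the printed string. This is a minimal sketch; `MIN_PYTHON` and `python_ok` are names introduced here for illustration.

```python
import sys

# Minimum version stated above as required by CellPhoneDB.
MIN_PYTHON = (3, 8)

# sys.version_info supports tuple comparison on (major, minor).
python_ok = sys.version_info[:2] >= MIN_PYTHON
if not python_ok:
    raise RuntimeError(
        f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
        f"found {sys.version_info.major}.{sys.version_info.minor}"
    )
```

Failing early here avoids a less obvious import error later in the notebook.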


___
## Input files
The differential expression method accepts 5 input files (4 mandatory).
- **cpdb_file_path**: (mandatory) path to the database.
- **meta_file_path**: (mandatory) path to the meta file linking cell barcodes to cluster labels.
- **counts_file_path**: (mandatory) path to the normalized counts file (not z-transformed), either in text format or h5ad (recommended).
- **degs_file_path**: (mandatory) path to the DEG file indicating the differentially expressed genes in each cluster. Only differentially expressed genes that are significant should be included.
- **microenvs_file_path**: (optional) path to the microenvironment file that groups cell types/clusters by microenvironment. When a microenvironment file is provided, CellPhoneDB restricts the interactions to cells within the same microenvironment.

The content of both `degs_file_path` and `microenvs_file_path` depends on the biological question that the researcher wants to answer.


> In **this example** we are studying how cell-cell interactions change between endometrial cells as epithelial and stromal cells differentiate in response to hormones in the three endometrial layers (see image above). Therefore, the `degs_file_path` contains only the genes differentially expressed along the epithelial or stromal/fibroblast lineages, to capture the genes that change along their differentiation in response to progesterone. The `microenvs_file_path` specifies the cells present in each spatiotemporal compartment of the differentiating endometrium. The `meta_file_path` and `counts_file_path` contain all cells that we are interested in.

> CellPhoneDB will retrieve all the interactions occurring between epithelial or stromal cells and any other cell type in the meta/counts file where: (i) all the proteins are expressed in the corresponding cell type and (ii) at least one gene is differentially expressed by an epithelial/stromal subset.
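
The two conditions above can be sketched in plain Python. This is a simplified illustration, not CellPhoneDB's implementation: the function name, the data structures, the toy cell type labels and expression fractions are all hypothetical, and reading `threshold` as the minimum fraction of cells expressing a gene (with a `>=` comparison) is an assumption.

```python
def is_relevant(genes_a, genes_b, cell_a, cell_b, frac_expressing, degs, threshold=0.1):
    """Simplified version of conditions (i) and (ii).

    genes_a / genes_b: genes encoding partner A (in cell_a) and partner B (in cell_b).
    frac_expressing:   dict {(gene, cell_type): fraction of cells expressing the gene}.
    degs:              set of (cell_type, gene) pairs taken from the DEG file.
    """
    # (i) every gene of both partners is expressed in its own cell type
    all_expressed = (
        all(frac_expressing.get((g, cell_a), 0.0) >= threshold for g in genes_a)
        and all(frac_expressing.get((g, cell_b), 0.0) >= threshold for g in genes_b)
    )
    # (ii) at least one participating gene is a DEG of its own cell type
    any_deg = (
        any((cell_a, g) in degs for g in genes_a)
        or any((cell_b, g) in degs for g in genes_b)
    )
    return all_expressed and any_deg


# Made-up values for illustration only
frac = {('WNT5A', 'Mese'): 0.60, ('ROR2', 'Epi'): 0.30, ('KIT', 'Epi'): 0.02}
toy_degs = {('Mese', 'WNT5A')}

print(is_relevant(['WNT5A'], ['ROR2'], 'Mese', 'Epi', frac, toy_degs))  # True
print(is_relevant(['WNT5A'], ['KIT'], 'Mese', 'Epi', frac, toy_degs))   # False: KIT below threshold
```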

In [14]:
cpdb_file_path = '/nfs/team292/vl6/FetalReproductiveTract/v5.0.0/cellphonedb.zip' # this is the downloaded database
meta_file_path = '/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/meta.tsv'
counts_file_path = '/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/counts_normalised.h5ad'
microenvs_file_path = '/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/microenvironments.tsv'
degs_file_path = '/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/DEGs_upregulated_genes.tsv'
out_path = '/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/'
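
Before running the analysis it can help to confirm that every input path points to an existing file. The helper below is a minimal sketch; `missing_inputs` is a hypothetical name, and the example path is a placeholder rather than one of the paths above.

```python
import os


def missing_inputs(paths):
    """Return the names of the entries in `paths` whose file does not exist."""
    return [name for name, path in paths.items() if not os.path.isfile(path)]


# Placeholder path for illustration; in this notebook one would pass
# cpdb_file_path, meta_file_path, counts_file_path and degs_file_path.
example = {'cpdb_file_path': os.path.join('no_such_dir', 'cellphonedb.zip')}
print(missing_inputs(example))
```

An empty list means all listed files were found.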

### Inspect input files

<span style="color:green">**1)**</span> The **metadata** file is composed of two columns:
- **barcode_sample**: the barcode of each cell in the experiment (named `Cell` in the file used here).
- **celltype**: the label assigned to each cell (named `cell_type` in the file used here).

In [15]:
metadata = pd.read_csv(meta_file_path, sep = '\t')
metadata.head(3)

Unnamed: 0,Cell,cell_type
0,HD_F_GON13679794_CGGAGCTAGTTAAGTG,Müllerian Epi
1,HD_F_GON13679794_GTATCTTCAAGCGCTC,Müllerian Epi
2,HD_F_GON13679794_GTCTCGTTCCTGCCAT,Müllerian Epi


<span style="color:green">**2)**</span> The **counts** file is an h5ad object from scanpy. The dimensions and order of this object must match those of the metadata file, i.e. both files must have the same number of cells.

In [18]:
import anndata

adata = anndata.read_h5ad(counts_file_path)
adata.shape

(6284, 20582)

Check barcodes in metadata and counts are the same.

In [19]:
# list.sort() sorts in place and returns None, so compare sorted copies instead
sorted(adata.obs.index) == sorted(metadata['Cell'])

True
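
When the two sets of barcodes do not match, a single boolean does not say what differs. The sketch below reports the barcodes unique to each side and whether the order is identical; `compare_barcodes` is a hypothetical helper and the barcodes are toy values.

```python
def compare_barcodes(counts_barcodes, meta_barcodes):
    """Summarise how two collections of barcodes differ."""
    counts_barcodes = list(counts_barcodes)
    meta_barcodes = list(meta_barcodes)
    return {
        'only_in_counts': sorted(set(counts_barcodes) - set(meta_barcodes)),
        'only_in_meta': sorted(set(meta_barcodes) - set(counts_barcodes)),
        'same_order': counts_barcodes == meta_barcodes,
    }


# Toy barcodes for illustration only
report = compare_barcodes(['AAA', 'CCC', 'GGG'], ['AAA', 'GGG', 'TTT'])
print(report)
```

In this notebook the call would be `compare_barcodes(adata.obs.index, metadata['Cell'])`, since the helper accepts any iterable.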

<span style="color:green">**3)**</span> The **differentially expressed genes** file has two required columns indicating which genes are up-regulated (or specific) in each cell type. The first column is the cluster name (which must match the names in the metadata file) and the second column is the up-regulated gene. Any remaining columns are ignored by CellPhoneDB. All genes present in this file are taken into account, so the file should contain only genes considered up-regulated or relevant for the analysis.

In [20]:
degs = pd.read_csv(degs_file_path, sep = '\t')
degs.head(3)

Unnamed: 0,cluster,gene,p_val_adj,p_val,avg_log2FC,pct.1,pct.2
0,Müllerian Epi,PDLIM1,0.0,0.0,1.292841,0.872,0.103
1,Müllerian Epi,CDH2,0.0,0.0,1.202645,0.93,0.133
2,Müllerian Epi,PNOC,0.0,0.0,1.139028,0.781,0.084


<span style="color:green">**4)**</span> The **microenvironments** file defines the cell types that belong to a given microenvironment. CellPhoneDB will only calculate interactions between cells that belong to the same microenvironment. This file defines a single microenvironment, `Early`.

In [21]:
microenv = pd.read_csv(microenvs_file_path, sep = '\t')
microenv

Unnamed: 0,celltype,microenvironment
0,Müllerian Epi,Early
1,Wolffian Epi,Early
2,Müllerian Mese,Early
3,Wolffian/Mesonephros Mese,Early


Displaying cells grouped per microenvironment

In [22]:
microenv.groupby('microenvironment')['celltype'].apply(lambda x : list(x.value_counts().index))

microenvironment
Early    [Müllerian Epi, Wolffian Epi, Müllerian Mese, ...
Name: celltype, dtype: object

<span style="color:green">**5)**</span> **Check cell type names are the same** 

In [23]:

# all cells in microenv are in meta
[ item in set(metadata['cell_type']) for item in set(microenv['celltype']) ]

[True, True, True, True]

In [24]:
# all cells in microenv are in meta - who is not?
list(set(microenv['celltype']) - set(metadata['cell_type']) )

[]

In [25]:
# all cells in degs are in meta
[ item in set(metadata['cell_type']) for item in set(degs['cluster']) ]

[True, True, True, True]

In [26]:
# all cells in degs are in meta - who is not?
list(set(degs['cluster']) - set(metadata['cell_type']) )

[]

____
# Run CellphoneDB with differential analysis (method 3)
The output of this method will be saved in `out_path` and also assigned to the predefined variables.

In [27]:
from cellphonedb.src.core.methods import cpdb_degs_analysis_method
res = \
    cpdb_degs_analysis_method.call(
        cpdb_file_path = cpdb_file_path, 
        meta_file_path = meta_file_path, 
        counts_file_path = counts_file_path,
        degs_file_path = degs_file_path,
        counts_data = 'hgnc_symbol',
        microenvs_file_path=microenvs_file_path,
        threshold = 0.1,
        result_precision = 3,
        separator = '|',
        debug = False,
        output_path = out_path,
        output_suffix = None,
        score_interactions = True,
        threads = 4)

[ ][CORE][07/02/25-12:32:02][INFO] [Cluster DEGs Analysis] Threshold:0.1 Precision:3
Reading user files...
The following user files were loaded successfully:
/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/counts_normalised.h5ad
/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/meta.tsv
/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/microenvironments.tsv
/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/input/DEGs_upregulated_genes.tsv
[ ][CORE][07/02/25-12:32:13][INFO] Running Real Analysis
[ ][CORE][07/02/25-12:32:13][INFO] Limiting cluster combinations using microenvironments
[ ][CORE][07/02/25-12:32:13][INFO] Running DEGs-based Analysis
[ ][CORE][07/02/25-12:32:13][INFO] Building results
[ ][CORE][07/02/25-12:32:14][INFO] Scoring interactions: Filtering genes per cell type..


100%|██████████| 4/4 [00:00<00:00, 48.88it/s]

[ ][CORE][07/02/25-12:32:14][INFO] Scoring interactions: Calculating mean expression of each gene per group/cell type..



100%|██████████| 4/4 [00:00<00:00, 175.58it/s]


[ ][CORE][07/02/25-12:32:14][INFO] Scoring interactions: Calculating scores for all interactions and cell types..


100%|██████████| 16/16 [00:02<00:00,  6.57it/s]


Saved deconvoluted_result to /nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/degs_analysis_deconvoluted_result_02_07_2025_123217.txt
Saved deconvoluted_percents to /nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/degs_analysis_deconvoluted_percents_02_07_2025_123217.txt
Saved means_result to /nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/degs_analysis_means_result_02_07_2025_123217.txt
Saved relevant_interactions_result to /nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/degs_analysis_relevant_interactions_result_02_07_2025_123217.txt
Saved significant_means to /nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/degs_analysis_significant_means_02_07_2025_123217.txt
Saved interaction_scores to /nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/degs_analysis_i

In [28]:
res.keys()

dict_keys(['deconvoluted_result', 'deconvoluted_percents', 'means_result', 'relevant_interactions_result', 'significant_means', 'interaction_scores'])

___
### Description of output files

**Relevant interaction** fields:
- **id_cp_interaction**: Interaction identifier.
- **interacting_pair**: Name of the interacting pair.
- **partner A/B**: Identifier for the first interacting partner (A) or the second (B). Either a UniProt identifier (prefix `simple:`) or a complex name (prefix `complex:`).
- **gene A/B**: Gene identifier for the first interacting partner (A) or the second (B).
- **secreted**: True if one of the partners is secreted.
- **receptor A/B**: True if the first interacting partner (A) or the second (B) is annotated as a receptor in the CellPhoneDB database.
- **annotation_strategy**: Curated if the interaction was annotated by the CellPhoneDB developers. Any other value indicates that it was added by the user.
- **is_integrin**: True if one of the partners is an integrin.
- **cell_a|cell_b**: 1 if the interaction is detected as significant, 0 if not.

In [29]:
res['relevant_interactions_result']

Unnamed: 0,id_cp_interaction,interacting_pair,partner_a,partner_b,gene_a,gene_b,secreted,receptor_a,receptor_b,annotation_strategy,is_integrin,Müllerian Epi|Müllerian Epi,Müllerian Epi|Wolffian Epi,Müllerian Epi|Müllerian Mese,Müllerian Epi|Wolffian/Mesonephros Mese,Wolffian Epi|Müllerian Epi,Wolffian Epi|Wolffian Epi,Wolffian Epi|Müllerian Mese,Wolffian Epi|Wolffian/Mesonephros Mese,Müllerian Mese|Müllerian Epi,Müllerian Mese|Wolffian Epi,Müllerian Mese|Müllerian Mese,Müllerian Mese|Wolffian/Mesonephros Mese,Wolffian/Mesonephros Mese|Müllerian Epi,Wolffian/Mesonephros Mese|Wolffian Epi,Wolffian/Mesonephros Mese|Müllerian Mese,Wolffian/Mesonephros Mese|Wolffian/Mesonephros Mese
0,CPI-SC0A2DB962D,CDH1_integrin_a2b1_complex,simple:P12830,complex:integrin_a2b1_complex,CDH1,,False,False,False,curated,True,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0
2,CPI-SC0C8B7BCBB,COL11A1_integrin_a2b1_complex,simple:P12107,complex:integrin_a2b1_complex,COL11A1,,True,False,False,curated,True,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0
4,CPI-SC0B86B7CED,COL12A1_integrin_a2b1_complex,simple:Q99715,complex:integrin_a2b1_complex,COL12A1,,True,False,False,curated,True,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0
5,CPI-SC0FA343CEF,COL13A1_integrin_a2b1_complex,simple:Q5TAT6,complex:integrin_a2b1_complex,COL13A1,,False,False,False,curated,True,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0
6,CPI-SC0CCCF9A7F,COL14A1_integrin_a2b1_complex,simple:Q05707,complex:integrin_a2b1_complex,COL14A1,,True,False,False,curated,True,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2828,CPI-SS0E7CBFD03,WNT5A_ROR2,simple:P41221,simple:Q01974,WNT5A,ROR2,True,False,True,curated,False,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1
2869,CPI-SS0DBA22B92,CHL1_CHL1,simple:O00533,simple:O00533,CHL1,CHL1,True,False,False,curated,False,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0
2870,CPI-SC08497FAC9,CLCF1_CNTF_1R,simple:Q9UBD9,complex:CNTF_1R,CLCF1,,True,False,True,curated,False,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0
2883,CPI-SS0A612BCEF,KITLG_KIT,simple:P21583,simple:P10721,KITLG,KIT,True,False,True,curated,False,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0


In [30]:
res['relevant_interactions_result'].head(3)

Unnamed: 0,id_cp_interaction,interacting_pair,partner_a,partner_b,gene_a,gene_b,secreted,receptor_a,receptor_b,annotation_strategy,is_integrin,Müllerian Epi|Müllerian Epi,Müllerian Epi|Wolffian Epi,Müllerian Epi|Müllerian Mese,Müllerian Epi|Wolffian/Mesonephros Mese,Wolffian Epi|Müllerian Epi,Wolffian Epi|Wolffian Epi,Wolffian Epi|Müllerian Mese,Wolffian Epi|Wolffian/Mesonephros Mese,Müllerian Mese|Müllerian Epi,Müllerian Mese|Wolffian Epi,Müllerian Mese|Müllerian Mese,Müllerian Mese|Wolffian/Mesonephros Mese,Wolffian/Mesonephros Mese|Müllerian Epi,Wolffian/Mesonephros Mese|Wolffian Epi,Wolffian/Mesonephros Mese|Müllerian Mese,Wolffian/Mesonephros Mese|Wolffian/Mesonephros Mese
0,CPI-SC0A2DB962D,CDH1_integrin_a2b1_complex,simple:P12830,complex:integrin_a2b1_complex,CDH1,,False,False,False,curated,True,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0
2,CPI-SC0C8B7BCBB,COL11A1_integrin_a2b1_complex,simple:P12107,complex:integrin_a2b1_complex,COL11A1,,True,False,False,curated,True,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0
4,CPI-SC0B86B7CED,COL12A1_integrin_a2b1_complex,simple:Q99715,complex:integrin_a2b1_complex,COL12A1,,True,False,False,curated,True,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0


**Deconvoluted** fields:
- **gene_name**: Gene identifier for one of the subunits that are participating in the interaction defined in “means.csv” file. The identifier will depend on the input of the user list.
- **uniprot**: UniProt identifier for one of the subunits that are participating in the interaction defined in “means.csv” file.
- **is_complex**: True if the subunit is part of a complex, False if it is a single protein.
- **protein_name**: Protein name for one of the subunits that are participating in the interaction defined in “means.csv” file.
- **complex_name**: Complex name if the subunit is part of a complex. Empty if not.
- **id_cp_interaction**: Unique CellPhoneDB identifier for each of the interactions stored in the database.
- **mean**: Mean expression of the corresponding gene in each cluster.

In [31]:
res['deconvoluted_result'].head(3)

Unnamed: 0_level_0,gene_name,uniprot,is_complex,protein_name,complex_name,id_cp_interaction,Müllerian Epi,Müllerian Mese,Wolffian Epi,Wolffian/Mesonephros Mese
gene,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
UBASH3B,UBASH3B,Q8TF42,True,UBS3B_HUMAN,Dehydroepiandrosterone_bySTS,CPI-CS09B8977D7,0.015,0.153,0.008,0.055
UBASH3B,UBASH3B,Q8TF42,True,UBS3B_HUMAN,Dehydroepiandrosterone_bySTS,CPI-CS05760BB78,0.015,0.153,0.008,0.055
UBASH3B,UBASH3B,Q8TF42,True,UBS3B_HUMAN,Dehydroepiandrosterone_bySTS,CPI-CS0259A0EB4,0.015,0.153,0.008,0.055


**Means** fields:
- **id_cp_interaction**: Unique CellPhoneDB identifier for each interaction stored in the database.
- **interacting_pair**: Name of the interacting pair.
- **partner A or B**: Identifier for the first interacting partner (A) or the second (B). Either a UniProt identifier (prefix `simple:`) or a complex name (prefix `complex:`).
- **gene A or B**: Gene identifier for the first interacting partner (A) or the second (B). The identifier type depends on the user's input list.
- **secreted**: True if one of the partners is secreted.
- **Receptor A or B**: True if the first interacting partner (A) or the second (B) is annotated as a receptor in the CellPhoneDB database.
- **annotation_strategy**: Curated if the interaction was annotated by the CellPhoneDB developers. Otherwise, the name of the database from which the interaction was downloaded.
- **is_integrin**: True if one of the partners is an integrin.
- **means**: Mean value for each interacting pair in each pair of cell types: the mean of the two partners' average expression values in the corresponding cell types. If one of the two values is 0, the total mean is set to 0.
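
The rule for the **means** field can be written as a small function. This is a sketch of the rule as described above for two single-value partners, with toy numbers; `interaction_mean` is a hypothetical name, and how a multi-subunit complex is summarised into one value is not covered here.

```python
def interaction_mean(expr_a, expr_b):
    """Mean of the two partners' average expression; 0 if either partner is 0."""
    if expr_a == 0 or expr_b == 0:
        return 0.0
    return (expr_a + expr_b) / 2


# Toy average expression values for partner A in cell type 1 and partner B in cell type 2
print(interaction_mean(0.5, 1.5))  # 1.0
print(interaction_mean(0.0, 1.5))  # 0.0, because partner A is not expressed
```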

In [32]:
res['means_result'].head(3)

Unnamed: 0,id_cp_interaction,interacting_pair,partner_a,partner_b,gene_a,gene_b,secreted,receptor_a,receptor_b,annotation_strategy,is_integrin,Müllerian Epi|Müllerian Epi,Müllerian Epi|Wolffian Epi,Müllerian Epi|Müllerian Mese,Müllerian Epi|Wolffian/Mesonephros Mese,Wolffian Epi|Müllerian Epi,Wolffian Epi|Wolffian Epi,Wolffian Epi|Müllerian Mese,Wolffian Epi|Wolffian/Mesonephros Mese,Müllerian Mese|Müllerian Epi,Müllerian Mese|Wolffian Epi,Müllerian Mese|Müllerian Mese,Müllerian Mese|Wolffian/Mesonephros Mese,Wolffian/Mesonephros Mese|Müllerian Epi,Wolffian/Mesonephros Mese|Wolffian Epi,Wolffian/Mesonephros Mese|Müllerian Mese,Wolffian/Mesonephros Mese|Wolffian/Mesonephros Mese
0,CPI-SC0A2DB962D,CDH1_integrin_a2b1_complex,simple:P12830,complex:integrin_a2b1_complex,CDH1,,False,False,False,curated,True,0.183,0.548,0.114,0.108,0.783,1.148,0.714,0.709,0.083,0.447,0.014,0.008,0.08,0.445,0.011,0.005
2,CPI-SC0C8B7BCBB,COL11A1_integrin_a2b1_complex,simple:P12107,complex:integrin_a2b1_complex,COL11A1,,True,False,False,curated,True,0.199,0.564,0.13,0.125,0.108,0.473,0.039,0.034,0.678,1.043,0.609,0.604,0.765,1.13,0.696,0.691
3,CPI-SC0D3C12C3F,COL11A2_integrin_a2b1_complex,simple:P13942,complex:integrin_a2b1_complex,COL11A2,,True,False,False,curated,True,0.084,0.449,0.015,0.01,0.106,0.471,0.037,0.032,0.104,0.468,0.035,0.029,0.086,0.45,0.017,0.011


**Significant means** fields:
- **id_cp_interaction**: Unique CellPhoneDB identifier for each interaction stored in the database.
- **interacting_pair**: Name of the interacting pair.
- **partner A or B**: Identifier for the first interacting partner (A) or the second (B). Either a UniProt identifier (prefix `simple:`) or a complex name (prefix `complex:`).
- **gene A or B**: Gene identifier for the first interacting partner (A) or the second (B). The identifier type depends on the user's input list.
- **secreted**: True if one of the partners is secreted.
- **Receptor A or B**: True if the first interacting partner (A) or the second (B) is annotated as a receptor in the CellPhoneDB database.
- **annotation_strategy**: Curated if the interaction was annotated by the CellPhoneDB developers. Otherwise, the name of the database from which the interaction was downloaded.
- **is_integrin**: True if one of the partners is an integrin.
- **significant_mean**: Significant mean for each interacting pair in each pair of cell types. If the interaction was found relevant, the value is the mean; otherwise it is left empty (NaN in the table below).

In [33]:
res['significant_means'].head(3)

Unnamed: 0,id_cp_interaction,interacting_pair,partner_a,partner_b,gene_a,gene_b,secreted,receptor_a,receptor_b,annotation_strategy,is_integrin,rank,Müllerian Epi|Müllerian Epi,Müllerian Epi|Wolffian Epi,Müllerian Epi|Müllerian Mese,Müllerian Epi|Wolffian/Mesonephros Mese,Wolffian Epi|Müllerian Epi,Wolffian Epi|Wolffian Epi,Wolffian Epi|Müllerian Mese,Wolffian Epi|Wolffian/Mesonephros Mese,Müllerian Mese|Müllerian Epi,Müllerian Mese|Wolffian Epi,Müllerian Mese|Müllerian Mese,Müllerian Mese|Wolffian/Mesonephros Mese,Wolffian/Mesonephros Mese|Müllerian Epi,Wolffian/Mesonephros Mese|Wolffian Epi,Wolffian/Mesonephros Mese|Müllerian Mese,Wolffian/Mesonephros Mese|Wolffian/Mesonephros Mese
168,CPI-SC092A2325E,COL14A1_integrin_a11b1_complex,simple:Q05707,complex:integrin_a11b1_complex,COL14A1,,True,False,False,curated,True,0.062,,,,,,,,,,,,,,,,1.304
37,CPI-SC05640277E,COL7A1_integrin_a2b1_complex,simple:Q02388,complex:integrin_a2b1_complex,COL7A1,,True,False,False,curated,True,0.062,,,,,,,,,,,,,,0.558,,
2818,CPI-SS03C93A8FE,WNT5A_WIF1,simple:P41221,simple:Q9Y5W5,WNT5A,WIF1,True,False,False,curated,False,0.062,,,,,,,,,,,,,,,1.593,


In [34]:
res['interaction_scores'].head(3)

Unnamed: 0,id_cp_interaction,interacting_pair,partner_a,partner_b,gene_a,gene_b,secreted,receptor_a,receptor_b,annotation_strategy,is_integrin,Müllerian Epi|Müllerian Epi,Müllerian Epi|Wolffian Epi,Müllerian Epi|Müllerian Mese,Müllerian Epi|Wolffian/Mesonephros Mese,Wolffian Epi|Müllerian Epi,Wolffian Epi|Wolffian Epi,Wolffian Epi|Müllerian Mese,Wolffian Epi|Wolffian/Mesonephros Mese,Müllerian Mese|Müllerian Epi,Müllerian Mese|Wolffian Epi,Müllerian Mese|Müllerian Mese,Müllerian Mese|Wolffian/Mesonephros Mese,Wolffian/Mesonephros Mese|Müllerian Epi,Wolffian/Mesonephros Mese|Wolffian Epi,Wolffian/Mesonephros Mese|Müllerian Mese,Wolffian/Mesonephros Mese|Wolffian/Mesonephros Mese
0,CPI-SC0A2DB962D,CDH1_integrin_a2b1_complex,simple:P12830,complex:integrin_a2b1_complex,CDH1,,False,False,False,curated,True,6.127,14.873,0.0,0.0,41.199,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,CPI-SC0C8B7BCBB,COL11A1_integrin_a2b1_complex,simple:P12107,complex:integrin_a2b1_complex,COL11A1,,True,False,False,curated,True,7.274,17.657,0.0,0.0,0.0,0.0,0.0,0.0,35.98,87.332,0.0,0.0,41.199,100.0,0.0,0.0
3,CPI-SC0D3C12C3F,COL11A2_integrin_a2b1_complex,simple:P13942,complex:integrin_a2b1_complex,COL11A2,,True,False,False,curated,True,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


# Explore CellPhoneDB results
This module filters CellPhoneDB results by specifying any of the following:
- **cell type pairs**: two lists of cell types (`query_cell_types_1` and `query_cell_types_2`); the method subsets the interactions to those pairs of cells. Cell types within each list are not paired with each other. To retrieve interactions between every cell type in the dataset and a given set of cell types, set `query_cell_types_1 = 'All'` and `query_cell_types_2 = ['cellA', 'cellB', ...]`.
- **genes**: the `query_genes` argument keeps the interactions in which a given gene participates.
- **specific interactions**: the `query_interactions` argument selects interactions by name.

This method filters the rows and columns of the significant_means file; NaN values correspond to interacting pairs that were not found significant, and non-NaN values to those found relevant.
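
When the results are returned in long format, each row carries an `interacting_cells` label of the form `cellA|cellB`. The sketch below builds the directed labels for every pair drawn from two lists, which can then be passed to `.isin(...)` on that column. `pair_labels` is a hypothetical helper, not part of CellPhoneDB.

```python
from itertools import product


def pair_labels(cell_types_1, cell_types_2, separator='|'):
    """Directed 'cellA|cellB' labels pairing every entry of list 1 with every entry of list 2."""
    return [f"{a}{separator}{b}" for a, b in product(cell_types_1, cell_types_2)]


labels = pair_labels(['Wolffian Epi'], ['Müllerian Epi', 'Müllerian Mese'])
print(labels)
# ['Wolffian Epi|Müllerian Epi', 'Wolffian Epi|Müllerian Mese']
```

Generating the labels avoids typing each combination by hand when the lists grow.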

In [35]:
from cellphonedb.utils import search_utils

search_results = search_utils.search_analysis_results(
    query_cell_types_1 = ['Wolffian Epi'],  # List of cells 1, will be paired to cells 2 (list or 'All').
    query_cell_types_2 = ['Müllerian Epi', 'Müllerian Mese'],     # List of cells 2, will be paired to cells 1 (list or 'All').
#     query_genes = ['LRP5'],                                       # filter interactions based on the genes participating (list).
#     query_interactions = ['CSF1_CSF1R'],                          # filter interactions based on their name (list).
    significant_means = res['significant_means'],                   # significant_means file generated by CellPhoneDB.
    deconvoluted = res['deconvoluted_result'],                      # deconvoluted file generated by CellPhoneDB.
    separator = '|',                                                # separator (default: |) employed to split cells (cellA|cellB).
    long_format = True                                              # converts the output into a long table, removing non-significant interactions
)

# search_results.head(60)
search_results = search_results[search_results['interacting_cells'].isin(['Wolffian Epi|Müllerian Epi',
                                                                         'Wolffian Epi|Müllerian Mese'])]
print(search_results.shape)

search_results.sort_values('significant_mean', ascending=False).head(50)

(96, 7)


Unnamed: 0,interacting_pair,partner_a,partner_b,gene_a,gene_b,interacting_cells,significant_mean
376,WNT9B_SFRP1,simple:O14905,simple:Q8N474,WNT9B,SFRP1,Wolffian Epi|Müllerian Mese,14.244
444,WNT2B_SFRP1,simple:Q93097,simple:Q8N474,WNT2B,SFRP1,Wolffian Epi|Müllerian Mese,13.437
346,WNT7B_SFRP1,simple:P56706,simple:Q8N474,WNT7B,SFRP1,Wolffian Epi|Müllerian Mese,13.403
468,atRetinoicAcid_byALDH1A1_RAreceptor_RARB,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RARB,,,Wolffian Epi|Müllerian Mese,3.988
463,atRetinoicAcid_byALDH1A1_RAreceptor_RARA,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RARA,,,Wolffian Epi|Müllerian Mese,3.862
459,atRetinoicAcid_byALDH1A1_RAreceptor_RARG,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RARG,,,Wolffian Epi|Müllerian Mese,3.718
300,atRetinoicAcid_byALDH1A1_RAreceptor_RARA,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RARA,,,Wolffian Epi|Müllerian Epi,3.7
461,atRetinoicAcid_byALDH1A1_RAreceptor_RXRB,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RXRB,,,Wolffian Epi|Müllerian Mese,3.674
467,atRetinoicAcid_byALDH1A1_RAreceptor_RARB_RXRB,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RARB_RXRB,,,Wolffian Epi|Müllerian Mese,3.674
298,atRetinoicAcid_byALDH1A1_RAreceptor_RXRB,complex:atRetinoicAcid_byALDH1A1,complex:RAreceptor_RXRB,,,Wolffian Epi|Müllerian Epi,3.621


In [36]:
from cellphonedb.utils import search_utils

search_results = search_utils.search_analysis_results(
    query_cell_types_1 = ['Wolffian/Mesonephros Mese'],  # List of cells 1, will be paired to cells 2 (list or 'All').
    query_cell_types_2 = ['Müllerian Epi', 'Müllerian Mese'],     # List of cells 2, will be paired to cells 1 (list or 'All').
#     query_genes = ['LRP5'],                                       # filter interactions based on the genes participating (list).
#     query_interactions = ['CSF1_CSF1R'],                          # filter interactions based on their name (list).
    significant_means = res['significant_means'],                   # significant_means file generated by CellPhoneDB.
    deconvoluted = res['deconvoluted_result'],                      # deconvoluted file generated by CellPhoneDB.
    separator = '|',                                                # separator (default: |) employed to split cells (cellA|cellB).
    long_format = True                                              # converts the output into a long table, removing non-significant interactions
)

# search_results.head(60)
search_results = search_results[search_results['interacting_cells'].isin(['Wolffian/Mesonephros Mese|Müllerian Epi',
                                                                         'Wolffian/Mesonephros Mese|Müllerian Mese'])]
print(search_results.shape)
search_results.sort_values('significant_mean', ascending=False).tail(50)

(112, 7)


Unnamed: 0,interacting_pair,partner_a,partner_b,gene_a,gene_b,interacting_cells,significant_mean
411,WNT5A_WIF1,simple:P41221,simple:Q9Y5W5,WNT5A,WIF1,Wolffian/Mesonephros Mese|Müllerian Mese,1.593
482,WNT5A_ROR2,simple:P41221,simple:Q01974,WNT5A,ROR2,Wolffian/Mesonephros Mese|Müllerian Mese,1.586
285,WNT5A_FZD6_LRP6,simple:P41221,complex:FZD6_LRP6,WNT5A,,Wolffian/Mesonephros Mese|Müllerian Epi,1.578
365,WNT5A_FZD2_LRP6,simple:P41221,complex:FZD2_LRP6,WNT5A,,Wolffian/Mesonephros Mese|Müllerian Epi,1.578
315,WNT5A_FZD10_LRP6,simple:P41221,complex:FZD10_LRP6,WNT5A,,Wolffian/Mesonephros Mese|Müllerian Epi,1.578
361,WNT5A_FZD3_LRP6,simple:P41221,complex:FZD3_LRP6,WNT5A,,Wolffian/Mesonephros Mese|Müllerian Epi,1.578
442,CXCL12_CXCR4,simple:P48061,simple:P61073,CXCL12,CXCR4,Wolffian/Mesonephros Mese|Müllerian Mese,1.574
314,WNT4_FZD10_LRP5,simple:P56705,complex:FZD10_LRP5,WNT4,,Wolffian/Mesonephros Mese|Müllerian Epi,1.515
519,WNT5A_FZD4_LRP6,simple:P41221,complex:FZD4_LRP6,WNT5A,,Wolffian/Mesonephros Mese|Müllerian Mese,1.479
349,WNT5A_FZD7_LRP6,simple:P41221,complex:FZD7_LRP6,WNT5A,,Wolffian/Mesonephros Mese|Müllerian Epi,1.474


### Save interactions for supplementary table

In [39]:
from cellphonedb.utils import search_utils

search_results = search_utils.search_analysis_results(
    query_cell_types_1 = ['Wolffian Epi', 'Wolffian/Mesonephros Mese'],  # List of cells 1, will be paired to cells 2 (list or 'All').
    query_cell_types_2 = ['Müllerian Mese', 'Müllerian Epi'],     # List of cells 2, will be paired to cells 1 (list or 'All').
#     query_genes = ['LRP5'],                                       # filter interactions based on the genes participating (list).
#     query_interactions = ['CSF1_CSF1R'],                          # filter interactions based on their name (list).
    significant_means = res['significant_means'],                   # significant_means file generated by CellPhoneDB.
    deconvoluted = res['deconvoluted_result'],                      # deconvoluted file generated by CellPhoneDB.
    separator = '|',                                                # separator (default: |) employed to split cells (cellA|cellB).
    long_format = True                                              # converts the output into a long table, removing non-significant interactions
)

# search_results.head(60)
search_results = search_results[search_results['interacting_cells'].isin(['Wolffian/Mesonephros Mese|Müllerian Mese',
                                                                         'Wolffian/Mesonephros Mese|Müllerian Epi', 
                                                                         'Wolffian Epi|Müllerian Mese', 
                                                                         'Wolffian Epi|Müllerian Epi'])]
print(search_results.shape)
# search_results.sort_values('significant_mean', ascending=False).tail(50)

(208, 7)


In [40]:
search_results.to_csv('/nfs/team292/vl6/FetalReproductiveTract/CellPhoneDB/Mullerian_and_Wolffian_early/output/wolffian_to_mullerian_signalling.csv')